Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Mon, Jun 23, 2025 at 10:29:03AM -0700, Jakub Kicinski wrote:
> On Mon, 23 Jun 2025 02:16:12 -0700 Breno Leitao wrote:
> > So, the selftest for netpoll is already on the mailing list[1], so we
> > have two options now:
> >
> > 1) Steal your patch and make [1] depend on it.
> > 2) Merge the selftest [1] and then steal your patch by adding the
> >    bpftrace support to it.
> >
> > What is your recommendation?
>
> Let's see if [1] gets merged as is, if we need a v2 let's add the
> bpftrace patch?

Sounds like a plan. Thanks!
--breno
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Mon, 23 Jun 2025 02:16:12 -0700 Breno Leitao wrote:
> So, the selftest for netpoll is already on the mailing list[1], so we
> have two options now:
>
> 1) Steal your patch and make [1] depend on it.
> 2) Merge the selftest [1] and then steal your patch by adding the
>    bpftrace support to it.
>
> What is your recommendation?

Let's see if [1] gets merged as is, if we need a v2 let's add the
bpftrace patch?
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Sat, Jun 21, 2025 at 06:51:21AM -0700, Jakub Kicinski wrote:
> On Fri, 20 Jun 2025 01:39:43 -0700 Breno Leitao wrote:
> > > FWIW you can steal bpftrace integration from this series:
> > > https://lore.kernel.org/all/[email protected]/
> >
> > Yes, that would be great. I think we can iterate until we hit the poll
> > path, otherwise we skip the test at timeout. Something like:
> >
> > while (true):
> >     send msg
> >     if netpoll_poll_dev() was invoked:
> >         ksft_exit
> >
> >     if timeout:
> >         raise KsftSkipEx
> >
> > As soon as your code lands, I will adapt the test to do so. Meanwhile,
> > I will send the v1 for the netpoll, and later we can iterate.
> >
> > Thanks for working on this bpftrace helper. This will be useful in other
> > use cases as well.
>
> Right, you're the second person I pointed that patch out to. Would be
> great if someone could steal that patch and make it a part of their
> series so that it gets merged

I can do that. I was expecting your patches to land, and then I would
reuse them. I was not expecting to ship it as part of my patchset.

So, the selftest for netpoll is already on the mailing list[1], so we
have two options now:

1) Steal your patch and make [1] depend on it.
2) Merge the selftest [1] and then steal your patch by adding the
   bpftrace support to it.

What is your recommendation?

Link: https://lore.kernel.org/all/[email protected]/ [1]

Thanks,
--breno
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Fri, 20 Jun 2025 01:39:43 -0700 Breno Leitao wrote:
> > FWIW you can steal bpftrace integration from this series:
> > https://lore.kernel.org/all/[email protected]/
>
> Yes, that would be great. I think we can iterate until we hit the poll
> path, otherwise we skip the test at timeout. Something like:
>
> while (true):
>     send msg
>     if netpoll_poll_dev() was invoked:
>         ksft_exit
>
>     if timeout:
>         raise KsftSkipEx
>
> As soon as your code lands, I will adapt the test to do so. Meanwhile,
> I will send the v1 for the netpoll, and later we can iterate.
>
> Thanks for working on this bpftrace helper. This will be useful in other
> use cases as well.

Right, you're the second person I pointed that patch out to. Would be
great if someone could steal that patch and make it a part of their
series so that it gets merged :S But it's alright if you prefer to
stick to non-bpftrace testing for now..
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Fri, Jun 13, 2025 at 05:42:33PM -0700, Jakub Kicinski wrote:
> On Fri, 13 Jun 2025 05:47:50 -0700 Breno Leitao wrote:
> > > Or is there another way that the packets could be observed, e.g.,
> > > counters.
> >
> > Unfortunately netpoll doesn't expose any data, so it is hard to get
> > it.
> >
> > I have plans to create a configfs for netpoll, so we can check for
> > these numbers (and also configure some values that are pre-defined
> > today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6,
> > ip6h->priority = 0, etc.).
> >
> > In fact, I have a private PoC for this, but I am modernizing the code
> > first, and creating some selftests to help me with those changes later
> > (given we have very few tests for netpoll, and I aim to improve this,
> > given how critical it is for some datacenter designs).
>
> FWIW you can steal bpftrace integration from this series:
> https://lore.kernel.org/all/[email protected]/

Yes, that would be great. I think we can iterate until we hit the poll
path, otherwise we skip the test at timeout. Something like:

while (true):
    send msg
    if netpoll_poll_dev() was invoked:
        ksft_exit

    if timeout:
        raise KsftSkipEx

As soon as your code lands, I will adapt the test to do so. Meanwhile,
I will send the v1 for the netpoll, and later we can iterate.

Thanks for working on this bpftrace helper. This will be useful in other
use cases as well.

--breno
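The loop outlined in this message could look roughly like the Python
below. It is only a sketch under stated assumptions: send_msg and
poll_was_invoked are hypothetical callables (for example, a netconsole
write and a bpftrace-backed check for netpoll_poll_dev()), and SkipTest
stands in for KsftSkipEx so the snippet runs outside the kernel tree.
None of these names come from the patch.

```python
import time
from typing import Callable


class SkipTest(Exception):
    """Stand-in for KsftSkipEx (assumption: used only so the sketch is standalone)."""


def send_until_polled(
    send_msg: Callable[[], None],
    poll_was_invoked: Callable[[], bool],
    timeout_s: float = 5.0,
    interval_s: float = 0.01,
) -> int:
    """Send messages until the poll path is observed.

    Returns the number of messages sent. Raises SkipTest if the poll
    path was not observed before the timeout expired.
    """
    deadline = time.monotonic() + timeout_s
    sent = 0
    while True:
        send_msg()  # hypothetical: e.g. write one line to /dev/kmsg
        sent += 1
        if poll_was_invoked():  # hypothetical: e.g. read a bpftrace counter
            return sent
        if time.monotonic() >= deadline:
            raise SkipTest(f"poll path not observed within {timeout_s}s")
        time.sleep(interval_s)
```

Skipping rather than failing on timeout matches the intent stated in
the message: not hitting the poll path means the scenario was not
reproduced, not that netpoll is broken.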
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Fri, 13 Jun 2025 05:47:50 -0700 Breno Leitao wrote:
> > Or is there another way that the packets could be observed, e.g.,
> > counters.
>
> Unfortunately netpoll doesn't expose any data, so it is hard to get
> it.
>
> I have plans to create a configfs for netpoll, so we can check for
> these numbers (and also configure some values that are pre-defined
> today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6,
> ip6h->priority = 0, etc.).
>
> In fact, I have a private PoC for this, but I am modernizing the code
> first, and creating some selftests to help me with those changes later
> (given we have very few tests for netpoll, and I aim to improve this,
> given how critical it is for some datacenter designs).

FWIW you can steal bpftrace integration from this series:
https://lore.kernel.org/all/[email protected]/
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
On Fri, Jun 13, 2025 at 09:43:35AM -0400, Willem de Bruijn wrote:
> Breno Leitao wrote:
> > > > +def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
> > > > +    """Check if traffic is flowing on the interface"""
> > > > +    stat1 = get_stats(cfg, netdevnl)
> > > > +    time.sleep(1)
> > >
> > > Can the same be learned with sufficient precision when sleeping
> > > for only 100 msec? As tests are added, it's worth trying to keep
> > > their runtime short.
> >
> > 100%. In fact, I don't need to wait for 1 second. In fact, we don't
> > even need to check for traffic flowing after the traffic started. I've
> > just added it to help me develop the test.
> >
> > We can either reduce it to 100ms or just remove it from the loop,
> > without prejudice to the test itself. Maybe reducing it to 100 ms might
> > help someone else who debugs this in the future, while only slowing
> > the test down by ITERATIONS * 0.1 seconds?
>
> That makes sense. Or only keep it in DEBUG mode?

Even better, I will move it to DEBUG mode.

Thanks!
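One way the agreed change might look is sketched below: the traffic
check samples over 100 ms and runs only when DEBUG is set. This is an
illustration, not the code from a later revision. read_tx_packets is a
hypothetical callable standing in for the qstats lookup, and
maybe_check_traffic is a made-up helper name.

```python
import time
from typing import Callable, Optional

DEBUG = False  # mirrors the DEBUG flag in the patch


def check_traffic_flowing(
    read_tx_packets: Callable[[], int], interval_s: float = 0.1
) -> int:
    """Return how many packets were transmitted during interval_s."""
    before = read_tx_packets()
    time.sleep(interval_s)
    return read_tx_packets() - before


def maybe_check_traffic(
    read_tx_packets: Callable[[], int],
    debug: bool = DEBUG,
    interval_s: float = 0.1,
) -> Optional[int]:
    """Run the traffic check only in debug mode; otherwise cost nothing."""
    if not debug:
        return None
    return check_traffic_flowing(read_tx_packets, interval_s)
```

With the check gated this way, a normal run pays no per-iteration
sleep, while a debug run costs about ITERATIONS * 0.1 seconds.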
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
Breno Leitao wrote:
> Hello Willem,
>
> On Thu, Jun 12, 2025 at 10:35:54PM -0400, Willem de Bruijn wrote:
> > Breno Leitao wrote:
> > > Add a basic selftest for the netpoll polling mechanism, specifically
> > > targeting the netpoll poll() side.
> > >
> > > The test creates a scenario where network transmission is running at
> > > maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
> >
> > minor typo: sppend/speed
>
> Thanks! I will update.
>
> > > 1. Configuring a single RX/TX queue to create contention
> > > 2. Generating background traffic to saturate the interface
> > > 3. Sending netconsole messages to trigger netpoll polling
> > > 4. Using dynamic netconsole targets via configfs
> > >
> > > The test validates a critical netpoll code path by monitoring traffic
> > > flow and ensuring netpoll_poll_dev() is called when the normal TX path
> > > is blocked. Perf probing confirms this test successfully triggers
> > > netpoll_poll_dev() in typical test runs.
> >
> > So the test needs profiling to make it a pass/fail regression test?
> > Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless
> > exercising the code on its own is valuable enough.
>
> Sorry for not being clear. This test doesn't depend on any profiling
> data. Basically I just ran `perf probe` to confirm that
> netpoll_poll_dev() was being called (as that was the goal of the test).
>
> This test is self-contained and should run as part of the
> `make run_tests` targets.
>
> > Or is there another way that the packets could be observed, e.g.,
> > counters.
>
> Unfortunately netpoll doesn't expose any data, so it is hard to get
> it.
>
> I have plans to create a configfs for netpoll, so we can check for
> these numbers (and also configure some values that are pre-defined
> today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6,
> ip6h->priority = 0, etc.).
>
> In fact, I have a private PoC for this, but I am modernizing the code
> first, and creating some selftests to help me with those changes later
> (given we have very few tests for netpoll, and I aim to improve this,
> given how critical it is for some datacenter designs).
>
> > > +NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole"
> > > +REMOTE_PORT =
> > > +LOCAL_PORT = 1514
> > > +# Number of netcons messages to send. I usually see netpoll_poll_dev()
> > > +# being called at least once in 10 iterations.
> > > +ITERATIONS = 10
> >
> > Is "usually" sufficient to avoid flakiness, or should this be cranked
> > up?
>
> 10 was the minimum number with which I was able to trigger it in my dev
> environment, both with the default configuration and with a debug-heavy
> configuration, but the higher the number, the more chance to trigger it.
> I can crank it up a bit more. Maybe 20?
>
> > > +def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
> > > +    """Check if traffic is flowing on the interface"""
> > > +    stat1 = get_stats(cfg, netdevnl)
> > > +    time.sleep(1)
> >
> > Can the same be learned with sufficient precision when sleeping
> > for only 100 msec? As tests are added, it's worth trying to keep
> > their runtime short.
>
> 100%. In fact, I don't need to wait for 1 second. In fact, we don't
> even need to check for traffic flowing after the traffic started. I've
> just added it to help me develop the test.
>
> We can either reduce it to 100ms or just remove it from the loop,
> without prejudice to the test itself. Maybe reducing it to 100 ms might
> help someone else who debugs this in the future, while only slowing
> the test down by ITERATIONS * 0.1 seconds?

That makes sense. Or only keep it in DEBUG mode?
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
Hello Willem,

On Thu, Jun 12, 2025 at 10:35:54PM -0400, Willem de Bruijn wrote:
> Breno Leitao wrote:
> > Add a basic selftest for the netpoll polling mechanism, specifically
> > targeting the netpoll poll() side.
> >
> > The test creates a scenario where network transmission is running at
> > maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
>
> minor typo: sppend/speed

Thanks! I will update.

> > 1. Configuring a single RX/TX queue to create contention
> > 2. Generating background traffic to saturate the interface
> > 3. Sending netconsole messages to trigger netpoll polling
> > 4. Using dynamic netconsole targets via configfs
> >
> > The test validates a critical netpoll code path by monitoring traffic
> > flow and ensuring netpoll_poll_dev() is called when the normal TX path
> > is blocked. Perf probing confirms this test successfully triggers
> > netpoll_poll_dev() in typical test runs.
>
> So the test needs profiling to make it a pass/fail regression test?
> Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless
> exercising the code on its own is valuable enough.

Sorry for not being clear. This test doesn't depend on any profiling
data. Basically I just ran `perf probe` to confirm that
netpoll_poll_dev() was being called (as that was the goal of the test).

This test is self-contained and should run as part of the
`make run_tests` targets.

> Or is there another way that the packets could be observed, e.g.,
> counters.

Unfortunately netpoll doesn't expose any data, so it is hard to get
it.

I have plans to create a configfs for netpoll, so we can check for
these numbers (and also configure some values that are pre-defined
today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6,
ip6h->priority = 0, etc.).

In fact, I have a private PoC for this, but I am modernizing the code
first, and creating some selftests to help me with those changes later
(given we have very few tests for netpoll, and I aim to improve this,
given how critical it is for some datacenter designs).

> > +NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole"
> > +REMOTE_PORT =
> > +LOCAL_PORT = 1514
> > +# Number of netcons messages to send. I usually see netpoll_poll_dev()
> > +# being called at least once in 10 iterations.
> > +ITERATIONS = 10
>
> Is "usually" sufficient to avoid flakiness, or should this be cranked
> up?

10 was the minimum number with which I was able to trigger it in my dev
environment, both with the default configuration and with a debug-heavy
configuration, but the higher the number, the more chance to trigger it.
I can crank it up a bit more. Maybe 20?

> > +def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
> > +    """Check if traffic is flowing on the interface"""
> > +    stat1 = get_stats(cfg, netdevnl)
> > +    time.sleep(1)
>
> Can the same be learned with sufficient precision when sleeping
> for only 100 msec? As tests are added, it's worth trying to keep
> their runtime short.

100%. In fact, I don't need to wait for 1 second. In fact, we don't
even need to check for traffic flowing after the traffic started. I've
just added it to help me develop the test.

We can either reduce it to 100ms or just remove it from the loop,
without prejudice to the test itself. Maybe reducing it to 100 ms might
help someone else who debugs this in the future, while only slowing
the test down by ITERATIONS * 0.1 seconds?

Thanks for the review!
--breno
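The trade-off behind raising ITERATIONS from 10 to 20 can be estimated
with a small calculation. This is a rough sketch that assumes each
message triggers netpoll_poll_dev() independently with the same
probability, which is a simplification and not something measured in
this thread. The function name and the example probability are
illustrative only.

```python
def chance_of_at_least_one_poll(p_per_message: float, iterations: int) -> float:
    """Probability that at least one of `iterations` messages hits the poll path.

    Assumes independent messages with identical hit probability
    (a modelling assumption, not a property of netpoll).
    """
    if not 0.0 <= p_per_message <= 1.0:
        raise ValueError("p_per_message must be within [0, 1]")
    if iterations < 0:
        raise ValueError("iterations must be non-negative")
    return 1.0 - (1.0 - p_per_message) ** iterations


if __name__ == "__main__":
    # 0.15 is an illustrative guess loosely based on "once or twice in
    # 10 iterations"; it is not a measured value.
    for n in (10, 20, 40):
        print(n, round(chance_of_at_least_one_poll(0.15, n), 3))
```

Under this model the miss rate shrinks geometrically with the number
of iterations, so doubling ITERATIONS squares the probability of never
reaching the poll path.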
Re: [PATCH net-next RFC] selftests: net: add netpoll basic functionality test
Breno Leitao wrote:
> Add a basic selftest for the netpoll polling mechanism, specifically
> targeting the netpoll poll() side.
>
> The test creates a scenario where network transmission is running at
> maximum sppend, and netpoll needs to poll the NIC. This is achieved by:

minor typo: sppend/speed

> 1. Configuring a single RX/TX queue to create contention
> 2. Generating background traffic to saturate the interface
> 3. Sending netconsole messages to trigger netpoll polling
> 4. Using dynamic netconsole targets via configfs
>
> The test validates a critical netpoll code path by monitoring traffic
> flow and ensuring netpoll_poll_dev() is called when the normal TX path
> is blocked. Perf probing confirms this test successfully triggers
> netpoll_poll_dev() in typical test runs.

So the test needs profiling to make it a pass/fail regression test?
Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless
exercising the code on its own is valuable enough.

Or is there another way that the packets could be observed, e.g.,
counters.

> This addresses a gap in netpoll test coverage for a path that is
> tricky for the network stack.
>
> Signed-off-by: Breno Leitao
> ---
> Sending as an RFC for your appreciation, but it depends on [1] which is
> still under review. Once [1] lands, I will send this officially.
>
> Link: https://lore.kernel.org/all/[email protected]/ [1]
> ---
>  tools/testing/selftests/drivers/net/Makefile       |   1 +
>  .../testing/selftests/drivers/net/netpoll_basic.py | 201 +
>  2 files changed, 202 insertions(+)
>
> diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile
> index be780bcb73a3b..70d6e3a920b7f 100644
> --- a/tools/testing/selftests/drivers/net/Makefile
> +++ b/tools/testing/selftests/drivers/net/Makefile
> @@ -15,6 +15,7 @@ TEST_PROGS := \
>      netcons_fragmented_msg.sh \
>      netcons_overflow.sh \
>      netcons_sysdata.sh \
> +    netpoll_basic.py \
>      ping.py \
>      queues.py \
>      stats.py \
> diff --git a/tools/testing/selftests/drivers/net/netpoll_basic.py b/tools/testing/selftests/drivers/net/netpoll_basic.py
> new file mode 100755
> index 0..8abdfb2b1eb6e
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/netpoll_basic.py
> @@ -0,0 +1,201 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# This test aims to evaluate the netpoll polling mechanism (as in netpoll_poll_dev()).
> +# It presents a complex scenario where the network attempts to send a packet but fails,
> +# prompting it to poll the NIC from within the netpoll TX side.
> +#
> +# This has been a crucial path in netpoll that was previously untested. Jakub
> +# suggested using a single RX/TX queue, pushing traffic to the NIC, and then sending
> +# netpoll messages (via netconsole) to trigger the poll. `perf` probing of netpoll_poll_dev()
> +# showed that this test indeed triggers netpoll_poll_dev() once or twice in 10 iterations.
> +
> +# Author: Breno Leitao
> +
> +import errno
> +import os
> +import random
> +import string
> +import time
> +
> +from lib.py import (
> +    ethtool,
> +    GenerateTraffic,
> +    ksft_exit,
> +    ksft_pr,
> +    ksft_run,
> +    KsftFailEx,
> +    KsftSkipEx,
> +    NetdevFamily,
> +    NetDrvEpEnv,
> +)
> +
> +NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole"
> +REMOTE_PORT =
> +LOCAL_PORT = 1514
> +# Number of netcons messages to send. I usually see netpoll_poll_dev()
> +# being called at least once in 10 iterations.
> +ITERATIONS = 10

Is "usually" sufficient to avoid flakiness, or should this be cranked
up?

> +DEBUG = False
> +
> +
> +def generate_random_netcons_name() -> str:
> +    """Generate a random name starting with 'netcons'"""
> +    random_suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=8))
> +    return f"netcons_{random_suffix}"
> +
> +
> +def get_stats(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> dict[str, int]:
> +    """Get the statistics for the interface"""
> +    return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
> +
> +
> +def set_single_rx_tx_queue(interface_name: str) -> None:
> +    """Set the number of RX and TX queues to 1 using ethtool"""
> +    try:
> +        # This doesn't need to be reverted, since interfaces will be deleted after test
> +        ethtool(f"-G {interface_name} rx 1 tx 1")
> +    except Exception as e:
> +        raise KsftSkipEx(
> +            f"Failed to configure RX/TX queues: {e}. Ethtool not available?"
> +        )
> +
> +
> +def create_netconsole_target(
> +    config_data: dict[str, str],
> +    target_name: str,
> +) -> None:
> +    """Create a netconsole dynamic target against the interfaces"""
> +    ksft_pr(f"Using netconsole name: {target_name}")
> +    try:
> +        ksft_pr(f"Created target directory: {NETCONSOLE_CONFIGFS_PATH}/{target_nam

