Hello Willem,

On Thu, Jun 12, 2025 at 10:35:54PM -0400, Willem de Bruijn wrote:
> Breno Leitao wrote:
> > Add a basic selftest for the netpoll polling mechanism, specifically
> > targeting the netpoll poll() side.
> > 
> > The test creates a scenario where network transmission is running at
> > maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
> 
> minor type: sppend/speed

Thanks! I will update.

> >   1. Configuring a single RX/TX queue to create contention
> >   2. Generating background traffic to saturate the interface
> >   3. Sending netconsole messages to trigger netpoll polling
> >   4. Using dynamic netconsole targets via configfs
> > 
> > The test validates a critical netpoll code path by monitoring traffic
> > flow and ensuring netpoll_poll_dev() is called when the normal TX path
> > is blocked. Perf probing confirms this test successfully triggers
> > netpoll_poll_dev() in typical test runs.
> 
> So the test needs profiling to make it a pass/fail regression test?
> Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless
> exercising the code on its own is valuable enough.

Sorry for not being clear. This test doesn't depend on any profiling
data. I only ran `perf probe` during development to confirm that
netpoll_poll_dev() was being called (as that was the goal of the test).

This test is self-contained and should run as part of the
`make run_tests` targets.

> Or is there another way that the packets could be observed, e.g.,
> counters.

Unfortunately, netpoll doesn't expose any data, so it is hard to
observe this directly.

I plan to create a configfs interface for netpoll, so we can check
these numbers (and also configure some values that are hard-coded
today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6,
ip6h->priority = 0, etc.).

In fact, I have a private PoC for this, but I am modernizing the code
first and creating some selftests to help me with those changes later
(we have very few tests for netpoll, and I aim to improve this, given
how critical it is for some datacenter designs).

> > +NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole"
> > +REMOTE_PORT = 6666
> > +LOCAL_PORT = 1514
> > +# Number of netcons messages to send. I usually see netpoll_poll_dev()
> > +# being called at least once in 10 iterations.
> > +ITERATIONS = 10
> 
> Is usually sufficient to avoid flakiness, or should this be cranked
> up?

10 was the minimum number with which I was able to trigger it in my dev
environment, both with the default configuration and with a debug-heavy
configuration, but the higher the number, the higher the chance of
triggering it. I can crank it up a bit more. Maybe 20?
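
As a rough illustration of why more iterations help (not something the
test does): if each netconsole message had an independent chance of
hitting netpoll_poll_dev(), the chance of at least one hit would grow
with the iteration count as below. The function name and the 0.3
per-iteration probability are hypothetical, not measured values.

```python
def chance_of_trigger(p_single: float, iterations: int) -> float:
    """Chance of at least one hit in `iterations` tries.

    Assumes every iteration is independent and has the same probability
    `p_single` of triggering the poll path (a simplifying assumption).
    """
    return 1.0 - (1.0 - p_single) ** iterations


if __name__ == "__main__":
    # 0.3 is a made-up per-iteration probability, for illustration only.
    for n in (10, 20):
        print(f"ITERATIONS={n}: {chance_of_trigger(0.3, n):.4f}")
```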

> > +def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
> > +    """Check if traffic is flowing on the interface"""
> > +    stat1 = get_stats(cfg, netdevnl)
> > +    time.sleep(1)
> 
> Can the same be learned with sufficient precision when sleeping
> for only 100 msec? As tests are added, it's worth trying to keep
> their runtime short.

100%. I don't need to wait for 1 second. In fact, we don't even need
to check that traffic is flowing once the traffic has started. I only
added it to help me develop the test.

We can either reduce it to 100 ms or just remove it from the loop,
without any harm to the test itself. Maybe keeping it at 100 ms would
help someone debugging this in the future, at a cost of only
ITERATIONS * 0.1 seconds?
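
A minimal sketch of what the shorter check could look like. This is not
the code from the patch: traffic_is_flowing() and the read_tx_packets
callable are hypothetical stand-ins for check_traffic_flowing() and
get_stats(), so that the snippet runs without the selftest library.

```python
import time
from typing import Callable


def traffic_is_flowing(read_tx_packets: Callable[[], int],
                       interval: float = 0.1) -> bool:
    """Return True if the TX packet counter advanced within `interval`.

    `read_tx_packets` is a hypothetical callable returning the current
    TX packet count; in the real test it would wrap the stats lookup.
    """
    before = read_tx_packets()
    time.sleep(interval)  # 100 ms by default instead of 1 second
    after = read_tx_packets()
    return after > before


def added_runtime(iterations: int, interval: float = 0.1) -> float:
    """Extra wall-clock time, in seconds, spent sleeping in the loop."""
    return iterations * interval
```

With ITERATIONS = 10 and a 100 ms sleep, the loop spends about one
second sleeping in total; with 20 iterations, about two.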

Thanks for the review!
--breno
