It is well known that the APU2/4 underperforms when used as a router
with OpenBSD, but I found that the throughput fluctuates quite a bit,
and I think it has to do with CPU allocation and interrupts. My
trivial setup simulating a home router/gateway:

hostname.em0:  dhcp
hostname.em1:  inet 10.3.2.1 255.255.255.0
pf.conf:
  pass
  match out on em0 inet from !(em0:network) to any nat-to (em0)
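
For completeness, IP forwarding is of course enabled, the only other
non-default setting needed for the box to route at all:

sysctl.conf:
  net.inet.ip.forwarding=1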

Nothing else is running on the router, and throughput is tested with a
simple iperf3 TCP benchmark between Linux hosts on each side of the
router, capped at 600 Mbit/s to get a stable baseline: iperf3 -b600M
(that's just under the maximum throughput I saw, around 620 Mbit/s)
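
For reference, the test itself is nothing fancier than this, with the
server address and duration as placeholder examples:

  server$ iperf3 -s
  client$ iperf3 -c 192.0.2.10 -b 600M -t 120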

I noticed that the speed always starts at a clean 600 Mbit/s, then
eventually drops to 400-500 Mbit/s, and later climbs back up again.
The cycle is on the order of a minute, but varies greatly.

Looking at "systat cpu" during the transfer I noticed that during the
fast speed, CPU 1 and 2 were busy at 45% each, while CPU 0 was
handling interrupts at 25%.

CPU            User         Nice      System        Spin   Interrupt        Idle
0              0.0%         0.0%        0.0%        0.4%       25.0%       74.7%
1              0.0%         0.0%       45.3%        1.0%        0.0%       53.7%
2              0.0%         0.0%       44.7%        0.8%        0.0%       54.5%
3              0.0%         0.0%        0.0%        0.0%        0.0%        100%

This could go on for seconds up to minutes.
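
(For anyone wanting to reproduce this: a one-second refresh makes the
swings easy to follow, e.g. "systat cpu 1", and "vmstat -i" should
confirm that the interrupt load is essentially all em(4).)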

Eventually, whatever was running on CPUs 1 and 2 migrated onto CPU 0,
causing the throughput to drop to 400-500 Mbit/s:

CPU            User         Nice      System        Spin   Interrupt        Idle
0              0.0%         0.0%       76.8%        1.0%       22.2%        0.0%
1              0.0%         0.0%        0.0%        0.0%        0.0%        100%
2              0.0%         0.0%        0.0%        0.0%        0.0%        100%
3              0.0%         0.0%        0.0%        0.0%        0.0%        100%


Now, waiting even longer, I saw that the "System" load could sometimes
move back to an idle core, and then the speed would climb back to
600 Mbit/s:

CPU            User         Nice      System        Spin   Interrupt        Idle
0              0.0%         0.0%        0.2%        0.2%       23.0%       76.6%
1              0.0%         0.0%       99.0%        1.0%        0.0%        0.0%
2              0.0%         0.0%        0.0%        0.0%        0.0%        100%
3              0.0%         0.0%        0.0%        0.0%        0.0%        100%


I'm guessing that the interrupts are all tied to CPU 0 in hardware,
and that whatever handles the network processing initially picks one
or more random idle cores. Then the scheduler decides "aha, we should
run these on the same core that handles the interrupts", moves them
over, and starves that core.
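
If I read top(1) right, running it with -S (system processes) and -H
(threads) should show which CPU the kernel thread doing that "System"
work (presumably softnet) is sitting on, since the STATE column
includes the CPU number:

  top -SH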

This tells me that the rumour that OpenBSD can't use more than one
core on this little device is not entirely true. It works well for
long stretches, with the load shared between two cores while a third
handles the interrupts.

Does this make sense? Is there a way to enforce the "shared cores" behaviour?

// Anders
