On 10/31/25 7:45 PM, Ilya Maximets wrote:
> This change attempts to track the actual available stack and stop
> early if the flow translation logic is about to overflow it.
> 
> Unlike the recursion depth counter, this approach tracks the actual
> stack usage and bails early even if the recursion depth limit is not
> reached yet.  This is important because different actions have vastly
> different stack requirements and different systems allocate different
> amounts of stack per thread by default.
> 
> Should work with both GCC and Clang, will likely not work on Windows.
> The change should have no effect on platforms / compilers that do not
> support checking the current stack frame address via builtins.
> 
> The main thread is not treated fairly.  At least on Linux the main
> thread can grow its stack if the limit is dynamically increased.
> That is not normally true for other threads.  However, this patch
> sticks to the initial stack size even for the main thread.  This
> should not be a problem for OVS though, as the vast majority of packet
> processing is normally done by handlers, revalidators or PMD threads.
> 
> Unlike the previous RFC version of this change [1], we're not trying
> to work around the stack limits by recirculating packets through the
> datapath.  Practice shows that such techniques may lead to self-DoS,
> overwhelming OVS with upcalls.  See the ovn-controller self-DoS issue
> fixed recently in OVN:
>   https://mail.openvswitch.org/pipermail/ovs-dev/2025-October/426746.html
> So, it's safer to just drop the packets and only execute actions that
> we can translate safely.  All in all, stack exhaustion usually means
> a loop or otherwise a very inefficient OpenFlow pipeline that should
> be fixed by the users.  Attempts to process the whole thing would only
> mask the problem.
> 
> It seems hard to create a unit test for this, as support for measuring
> the actual stack depth as well as the amount of space each stack frame
> takes and the ways to limit the stack largely depend on the platform
> and the compiler.  Can be tested manually with an infinite resubmit
> case:
> 
>   make -j8
>   (ulimit -s 386; make sandbox)
>   ovs-vsctl add-br br0
>   ovs-vsctl add-port br0 p1
>   ovs-ofctl del-flows br0
>   (for i in $(seq 0 64); do
>       j=$(expr $i + 1);
>       echo "table=$i, actions=local,resubmit(,$j),local,resubmit(,$j),local";
>    done;
>    echo "table=65, actions=resubmit(,0)") > ./resubmits.txt
>   ovs-ofctl add-flows br0 ./resubmits.txt
>   ovs-appctl ofproto/trace br0 'in_port=1' > ./trace.txt
> 
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2024-February/412048.html
> 
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  include/linux/openvswitch.h    |  1 +
>  include/openvswitch/compiler.h | 11 ++++++++
>  lib/odp-execute.c              |  4 +++
>  lib/ovs-thread.c               | 47 ++++++++++++++++++++++++++++++++++
>  lib/ovs-thread.h               |  8 ++++++
>  lib/sat-math.h                 |  5 +---
>  ofproto/ofproto-dpif-xlate.c   | 10 ++++++--
>  vswitchd/ovs-vswitchd.c        |  1 +
>  8 files changed, 81 insertions(+), 6 deletions(-)
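
For readers without the patch at hand, the idea described in the commit
message can be sketched roughly as below.  This is not the code from the
patch: the names stack_budget, stack_used(), translate() and demo_depth()
are made up for illustration, and the 1 KiB scratch buffer only stands in
for the locals a real action translation would need.  It assumes GCC or
Clang for __builtin_frame_address(); on other compilers only the plain
depth counter applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC and Clang expose the current frame address via a builtin.  On other
 * compilers the macro yields 0, so the stack check never triggers and only
 * the depth limit below applies. */
#if defined(__GNUC__) || defined(__clang__)
#define CURRENT_FRAME() ((uintptr_t) __builtin_frame_address(0))
#else
#define CURRENT_FRAME() ((uintptr_t) 0)
#endif

/* Hypothetical bookkeeping: where we started and how much we may use. */
struct stack_budget {
    uintptr_t base;     /* Frame address recorded at the entry point. */
    size_t limit;       /* Bytes of stack we allow ourselves to consume. */
};

/* Distance between the recorded base and the current frame.  Taking the
 * absolute difference avoids assuming a stack growth direction. */
static inline size_t
stack_used(const struct stack_budget *b)
{
    uintptr_t now = CURRENT_FRAME();

    return b->base > now ? b->base - now : now - b->base;
}

/* Stand-in for a recursive translation step.  Returns the depth at which
 * the recursion stopped. */
static int
translate(const struct stack_budget *b, int depth, int max_depth)
{
    volatile char scratch[1024];    /* Simulates per-action locals. */
    int reached;

    /* Bail out on either condition: classic depth limit or stack budget. */
    if (depth >= max_depth || stack_used(b) > b->limit) {
        return depth;
    }

    scratch[0] = (char) depth;
    reached = translate(b, depth + 1, max_depth);
    /* Touch the buffer after the call so it stays live across it and the
     * call is not turned into a tail call. */
    scratch[sizeof scratch - 1] = scratch[0];
    return reached;
}

/* Entry point: records the base frame and starts the recursion. */
int
demo_depth(size_t limit, int max_depth)
{
    struct stack_budget b;

    assert(limit > 0);
    b.base = CURRENT_FRAME();
    b.limit = limit;
    return translate(&b, 0, max_depth);
}
```

With a 64 KiB budget and a depth limit of 1000, demo_depth(64 * 1024, 1000)
stops well before the depth limit on GCC and Clang, because each frame
consumes at least 1 KiB.  The exact depth depends on the compiler and
optimization level, which is the same reason the commit message gives for
the lack of a unit test.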

Flaky IPsec NxN test.

Recheck-request: github-robot