The crash has not been seen since.

Does this fix make sense? If it does, I will submit a patch later.
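
To make the question concrete, here is a minimal, self-contained model of the sequence quoted below. It is only a sketch and not VPP code: every toy_* name, MAX_IF and N_FIBS are made up for illustration, and the range check in toy_lookup_fib stands in for the point where the real lookup would crash.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IF 8              /* hypothetical interface table size */
#define N_FIBS 2              /* hypothetical number of FIB tables */
#define INVALID_INDEX (~0u)   /* the ~0 marker discussed in the thread */

/* Toy stand-in for the per-interface FIB index table. */
static uint32_t fib_index_by_sw_if_index[MAX_IF];

/* Toy stand-in for a packet sitting in a pending frame. */
typedef struct {
    uint32_t sw_if_index_rx;
} toy_pkt_t;

/* Interface creation: bind the interface to a FIB. */
void toy_if_add(uint32_t sw_if_index, uint32_t fib_index)
{
    assert(sw_if_index < MAX_IF && fib_index < N_FIBS);
    fib_index_by_sw_if_index[sw_if_index] = fib_index;
}

/* Interface deletion.
 * reset_to_default == 0: entry becomes ~0 (behaviour described in the thread).
 * reset_to_default != 0: entry becomes 0, the default FIB (proposed change). */
void toy_if_del(uint32_t sw_if_index, int reset_to_default)
{
    assert(sw_if_index < MAX_IF);
    fib_index_by_sw_if_index[sw_if_index] =
        reset_to_default ? 0 : INVALID_INDEX;
}

/* Lookup step: returns the FIB index the packet would use, or -1 when the
 * stored index is out of range. The thread reports that the real lookup has
 * no such check, so that case corresponds to the crash. */
int toy_lookup_fib(const toy_pkt_t *p)
{
    uint32_t fib_index = fib_index_by_sw_if_index[p->sw_if_index_rx];
    return fib_index < N_FIBS ? (int) fib_index : -1;
}

/* Replays the three steps: enqueue a packet, delete the interface,
 * then run the lookup on the still-pending packet. */
int toy_replay(int reset_to_default)
{
    toy_if_add(3, 1);                            /* interface 3 lives in FIB 1 */
    toy_pkt_t pending = { .sw_if_index_rx = 3 }; /* step 1: packet enqueued */
    toy_if_del(3, reset_to_default);             /* step 2: interface deleted */
    return toy_lookup_fib(&pending);             /* step 3: lookup scheduled */
}
```

In this model toy_replay(0) returns -1 (the crash case), while toy_replay(1) returns 0: the pending packet is looked up in the default FIB rather than in FIB 1, where it was originally bound. So the change avoids the invalid index, but the packet may be forwarded in a different FIB than intended.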

On Tue, Nov 29, 2022 at 22:51, Zhang Dongya via lists.fd.io
<fortitude.zhang=gmail....@lists.fd.io> wrote:

> Hi Ben,
>
> At first I also thought it was a barrier issue, but it turned out not to
> be the case.
>
> The pkt whose sw_if_index[VLIB_RX] is set to the to-be-deleted interface
> is actually put to the ip4-lookup node by my process node, which adds
> pkts in a timer-driven way.
>
> Since the pkt is added by my process node, I think it is not affected by
> the worker barrier. In my case the sub-interface is deleted through the
> API, which is processed in the linux_epoll_input PRE_INPUT node. Consider
> the following sequence:
>
>
>    1. My process node adds a pkt to ip4-lookup, and the pkt refers to a
>    valid sw_if_index.
>    2. linux_epoll_input processes an API request to delete that
>    sw_if_index.
>    3. VPP schedules the ip4-lookup node, which then crashes: the
>    sw_if_index has been deleted, so ip4-lookup cannot use
>    sw_if_index[VLIB_RX] to get a valid fib index (it is now ~0).
>
>
> Some existing code works this way (ikev2_send_ike and others), and I think
> it is not feasible to update the pending frames when the interface is
> deleted.
>
> On Tue, Nov 29, 2022 at 22:22, Benoit Ganne (bganne) via lists.fd.io
> <bganne=cisco....@lists.fd.io> wrote:
>
>> Hi Zhang,
>>
>> I'd expect the interface deletion to happen under the worker barrier. VPP
>> workers should drain all their in-flight packets before entering the
>> barrier, so it should not be possible for the interface to disappear
>> between your node and ip4-lookup. Or am I missing something?
>> What I have seen happening is you'd have some data structure where you
>> keep the interface index that you use in your node, and this data is not
>> updated when the interface is removed.
>> Regarding your proposal, I suspect an issue could be when we reuse the
>> sw_if_index: if you del a sw_interface and then add a new one, chances are
>> you'll be reusing the same index, but fib_index might be different.
>>
>> Best
>> ben
>>
>> > -----Original Message-----
>> > From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Zhang Dongya
>> > Sent: Tuesday, November 29, 2022 3:45
>> > To: vpp-dev@lists.fd.io
>> > Subject: Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash
>> >
>> >
>> > I have found a solution that fixes the crash.
>> >
>> > In ip4_sw_interface_add_del, which is a callback for interface deletion,
>> > we can set the fib index of the removed interface to 0 (the default fib)
>> > instead of ~0. This is the same behavior as on interface creation.
>> >
>> >
>> >
>> > On Mon, Nov 28, 2022 at 19:41, Zhang Dongya via lists.fd.io
>> > <fortitude.zhang=gmail....@lists.fd.io> wrote:
>> >
>> >
>> >       Hi list,
>> >
>> >       Recently I encountered a VPP crash with my plugin enabled. After
>> > some investigation I found it may be related to an l3 sub-interface
>> > being deleted while my process node adds work to the ip4-lookup node.
>> >
>> >
>> >       Intuitively I thought it might be related to barrier usage, so I
>> > tried to fix it by adding a check in my process node to guard against
>> > the l3 sub-interface being deleted. However, the crash still occurs.
>> >
>> >       Finally I think it is related to a pattern like this:
>> >
>> >       1. My process node adds a pkt to ip4-lookup directly using
>> > put_frame_to_node, with the rx interface set to the l3 sub-interface
>> > created earlier.
>> >
>> >       2. My control plane agent (using govpp) deletes the l3
>> > sub-interface (this should be handled in the vpp api-process node).
>> >
>> >       3. VPP schedules the pending nodes. Since the rx interface has
>> > been deleted, VPP can't get a valid fib index, and there is no check in
>> > the subsequent ip4_fib_forwarding_lookup, so it crashes with an abort.
>> >
>> >       I think VPP may schedule my process node (timeout driven) and the
>> > api-process node one after the other, and then schedule the pending
>> > nodes.
>> >
>> >       Should I add a check in ip4-lookup, or is this way of sending pkts
>> > from a ctrl process incorrect, and is there a better one?
>> >
>> >       Thanks a lot.
>> >
View/Reply Online (#22295): https://lists.fd.io/g/vpp-dev/message/22295