Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-12-13 Thread Zhang Dongya
Adding the following code right after process dispatch in the main loop
fixes the crash.

So I think the condition described above is a rare but valid case.

A control process node, when scheduled, adds a packet (pending frame) to a
node, and the packet refers to an interface that is about to be deleted.

The interface is then deleted in the unix_epoll_input PRE_INPUT node, which
handles API input, and the graph scheduling that follows triggers various
assert failures.


  {
    /* Ctrl nodes may have added work to the pending vector too.
       Process pending vector until there is nothing left.
       All pending vectors will be processed from input -> output. */
    for (i = 0; i < _vec_len (nm->pending_frames); i++)
      cpu_time_now = dispatch_pending_node (vm, i, cpu_time_now);
    /* Reset pending vector for next iteration. */
    vec_set_len (nm->pending_frames, 0);

    if (is_main)
      {
        /* A barrier is also needed here for worker threads that
           have had packets handed off to them. */
        vlib_worker_thread_barrier_sync (vm);
        vlib_worker_thread_barrier_release (vm);
      }
  }
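
The effect of draining the pending vector before API input runs can be
sketched with a toy, single-threaded model. This is not VPP code: toy_main_t,
toy_process_enqueue, toy_drain_pending and toy_iteration are hypothetical
names, and the model assumes the ordering described in this thread (process
node first, then API input, then pending-node dispatch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_IF 16
#define TOY_MAX_PENDING 8

/* Hypothetical stand-in for main-loop state; not a VPP type. */
typedef struct
{
  bool if_alive[TOY_MAX_IF];         /* true while the interface exists */
  unsigned pending[TOY_MAX_PENDING]; /* RX sw_if_index of each queued packet */
  size_t n_pending;
  unsigned stale_dispatches; /* packets dispatched after interface deletion */
} toy_main_t;

/* A process node queues one packet that refers to sw_if_index. */
static void
toy_process_enqueue (toy_main_t *tm, unsigned sw_if_index)
{
  assert (tm->n_pending < TOY_MAX_PENDING);
  tm->pending[tm->n_pending++] = sw_if_index;
}

/* Dispatch every pending packet, count stale ones, reset the vector. */
static void
toy_drain_pending (toy_main_t *tm)
{
  for (size_t i = 0; i < tm->n_pending; i++)
    if (!tm->if_alive[tm->pending[i]])
      tm->stale_dispatches++;
  tm->n_pending = 0;
}

/* Process node enqueues, API input deletes the interface, pending nodes
   are dispatched. With drain_after_process set, the pending vector is
   also drained before the deletion can happen. */
unsigned
toy_iteration (bool drain_after_process)
{
  toy_main_t tm = { 0 };
  tm.if_alive[14] = true;

  toy_process_enqueue (&tm, 14);
  if (drain_after_process)
    toy_drain_pending (&tm);

  tm.if_alive[14] = false; /* API input deletes the interface */
  toy_drain_pending (&tm);
  return tm.stale_dispatches;
}
```

In this model, toy_iteration (false) dispatches one packet whose interface is
already gone, while toy_iteration (true) dispatches none. The model says
nothing about worker threads or handoff, which the barrier in the patch is
meant to cover.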


Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-12-13 Thread Zhang Dongya
Hi list,

During testing, when an L3 sub-interface is deleted, I got a new abort in
the interface drop node. It seems the packet refers to a deleted interface.

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7face8d17859 in __GI_abort () at abort.c:79
#2  0x00407397 in os_exit (code=1)
    at /home/fortitude/glx/vpp/src/vpp/vnet/main.c:440
#3  0x7face922dd57 in unix_signal_handler (signum=6,
    si=0x7faca2891170, uc=0x7faca2891040)
    at /home/fortitude/glx/vpp/src/vlib/unix/main.c:188
#4  <signal handler called>
#5  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#6  0x7face8d17859 in __GI_abort () at abort.c:79
#7  0x00407333 in os_panic ()
    at /home/fortitude/glx/vpp/src/vpp/vnet/main.c:416
#8  0x7face9067039 in debugger ()
    at /home/fortitude/glx/vpp/src/vppinfra/error.c:84
#9  0x7face9066dfa in _clib_error (how_to_die=2, function_name=0x0,
    line_number=0, fmt=0x7face9f7a208 "%s:%d (%s) assertion `%s' fails")
    at /home/fortitude/glx/vpp/src/vppinfra/error.c:143
#10 0x7face9b28358 in vnet_get_sw_interface (vnm=0x7facea243f38,
    sw_if_index=14)
    at /home/fortitude/glx/vpp/src/vnet/interface_funcs.h:60
#11 0x7face9b2a4ba in interface_drop_punt (vm=0x7facac8e5b00,
    node=0x7faca95c8840, frame=0x7facc2004a40,
    disposition=VNET_ERROR_DISPOSITION_DROP)
    at /home/fortitude/glx/vpp/src/vnet/interface_output.c:1061
#12 0x7face9b29a96 in interface_drop_fn_hsw (vm=0x7facac8e5b00,
    node=0x7faca95c8840, frame=0x7facc2004a40)
    at /home/fortitude/glx/vpp/src/vnet/interface_output.c:1215
#13 0x7face91cd50d in dispatch_node (vm=0x7facac8e5b00,
    node=0x7faca95c8840, type=VLIB_NODE_TYPE_INTERNAL,
    dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7facc2004a40,
    last_time_stamp=404307411779413)
    at /home/fortitude/glx/vpp/src/vlib/main.c:961
#14 0x7face91cdfb0 in dispatch_pending_node (vm=0x7facac8e5b00,
    pending_frame_index=3, last_time_stamp=404307411779413)
    at /home/fortitude/glx/vpp/src/vlib/main.c:1120
#15 0x7face91c921f in vlib_main_or_worker_loop (vm=0x7facac8e5b00,
    is_main=0)
    at /home/fortitude/glx/vpp/src/vlib/main.c:1589
#16 0x7face91c8947 in vlib_worker_loop (vm=0x7facac8e5b00)
    at /home/fortitude/glx/vpp/src/vlib/main.c:1723
#17 0x7face92080a4 in vlib_worker_thread_fn (arg=0x7facaa227d00)
    at /home/fortitude/glx/vpp/src/vlib/threads.c:1579
#18 0x7face9203195 in vlib_worker_thread_bootstrap_fn (arg=0x7facaa227d00)
    at /home/fortitude/glx/vpp/src/vlib/threads.c:418
#19 0x7face9121609 in start_thread (arg=) at pthread_create.c:477
#20 0x7face8e14133 in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Following up on my first mail, I would like to know whether this sequence
can actually happen:

1. My process node adds a packet to ip4-lookup directly using
   put_frame_to_node, with the RX interface set to the L3 sub-interface
   created earlier.
2. My control plane agent (using govpp) deletes the L3 sub-interface. (This
   should be handled in the VPP api-process node.)
3. VPP schedules the pending nodes. Since the RX interface has been deleted,
   VPP cannot get a valid FIB index, and there is no check in the
   ip4_fib_forwarding_lookup that follows, so it crashes with an abort.

I don't think an API barrier in step 2 can solve this, since the packet is
already in the pending frame.


Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-12-07 Thread Zhang Dongya
The crash has not occurred again since.

Does this fix make sense? If it does, I will submit a patch later.


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22295): https://lists.fd.io/g/vpp-dev/message/22295
Mute This Topic: https://lists.fd.io/mt/95307938/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-11-29 Thread Zhang Dongya
Hi Ben,

At first I also thought it was a barrier issue, but that turned out not to
be the case.

The packet whose sw_if_index[VLIB_RX] is set to the to-be-deleted interface
is actually put to the ip4-lookup node by my process node, which adds
packets in a timer-driven way.

Since the packet is added by my process node, I think it is not affected by
the worker barrier. In my case the sub-interface is deleted via the API,
which is processed in the linux_epoll_input PRE_INPUT node. Consider the
following sequence:


   1. My process node adds a packet to the ip4-lookup node, and the packet
   refers to a valid sw_if_index.
   2. linux_epoll_input processes an API request to delete that
   sw_if_index.
   3. VPP schedules the ip4-lookup node, which then crashes: the interface
   has been deleted, so ip4-lookup cannot use sw_if_index[VLIB_RX] to get a
   valid FIB index (the entry is now ~0).


Some existing code works the same way (ikev2_send_ike and others), and I
don't think it is feasible to update the pending frames when the interface
is deleted.


View/Reply Online (#22256): https://lists.fd.io/g/vpp-dev/message/22256



Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-11-29 Thread Benoit Ganne (bganne) via lists.fd.io
Hi Zhang,

I'd expect the interface deletion to happen under the worker barrier. VPP 
workers should drain all their in-flight packets before entering the barrier, 
so it should not be possible for the interface to disappear between your node 
and ip4-lookup. Or am I missing something?
What I have seen happening is you'd have some data structure where you keep the 
interface index that you use in your node, and this data is not updated when 
the interface is removed.
Regarding your proposal, I suspect an issue could be when we reuse the 
sw_if_index: if you del a sw_interface and then add a new one, chances are 
you'll be reusing the same index, but fib_index might be different.

Best
ben
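
The index-reuse concern above can be illustrated with a small sketch. It is a
toy model, not VPP code: toy_if_table_t and the toy_* functions are
hypothetical, and the lowest-free-slot allocation is only an assumption that
mimics "chances are you'll be reusing the same index".

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N_IF 4
#define TOY_INVALID (~(uint32_t) 0)

/* Hypothetical interface table; not a VPP structure. */
typedef struct
{
  uint32_t fib_index_by_sw_if_index[TOY_N_IF];
  uint8_t in_use[TOY_N_IF];
} toy_if_table_t;

static void
toy_if_table_init (toy_if_table_t *t)
{
  for (uint32_t i = 0; i < TOY_N_IF; i++)
    {
      t->fib_index_by_sw_if_index[i] = TOY_INVALID;
      t->in_use[i] = 0;
    }
}

/* Allocate the lowest free index (assumed allocation policy). */
static uint32_t
toy_if_add (toy_if_table_t *t, uint32_t fib_index)
{
  for (uint32_t i = 0; i < TOY_N_IF; i++)
    if (!t->in_use[i])
      {
        t->in_use[i] = 1;
        t->fib_index_by_sw_if_index[i] = fib_index;
        return i;
      }
  return TOY_INVALID;
}

static void
toy_if_del (toy_if_table_t *t, uint32_t sw_if_index)
{
  assert (sw_if_index < TOY_N_IF);
  t->in_use[sw_if_index] = 0;
  t->fib_index_by_sw_if_index[sw_if_index] = TOY_INVALID;
}

/* A packet records its RX index, the interface is deleted, and a new
   interface bound to a different FIB reuses the index. Returns the FIB
   index the stale packet would resolve to. */
uint32_t
toy_reuse_scenario (void)
{
  toy_if_table_t t;
  toy_if_table_init (&t);

  uint32_t stale_rx = toy_if_add (&t, 1); /* old interface, FIB 1 */
  toy_if_del (&t, stale_rx);
  uint32_t new_if = toy_if_add (&t, 7); /* new interface, FIB 7 */
  assert (new_if == stale_rx);          /* the index was reused */

  return t.fib_index_by_sw_if_index[stale_rx];
}
```

Here the stale packet resolves to FIB 7, the table of the new interface,
rather than the FIB it was queued for. No validity check on the index alone
can detect this, because the index is valid again.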



View/Reply Online (#22255): https://lists.fd.io/g/vpp-dev/message/22255



Re: [vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-11-28 Thread Zhang Dongya
I have found a fix that resolves the crash.

In ip4_sw_interface_add_del, which is the callback invoked on interface
deletion, we can set the FIB index of the removed interface to 0 (the
default FIB) instead of ~0. This matches the behavior on interface creation.
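
The difference between the two deletion policies can be sketched as follows.
This is a toy model under stated assumptions, not the actual
ip4_sw_interface_add_del: the table, the TOY_N_FIB bound and all toy_* names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N_IF 16
#define TOY_N_FIB 4 /* assumed number of existing FIB tables */
#define TOY_INVALID (~(uint32_t) 0)

typedef enum
{
  TOY_DEL_SET_INVALID, /* leave ~0 behind on deletion */
  TOY_DEL_SET_DEFAULT  /* fall back to FIB 0 on deletion */
} toy_del_policy_t;

/* Hypothetical per-interface FIB binding table. */
static uint32_t toy_fib_index_by_sw_if_index[TOY_N_IF];

void
toy_ip4_if_add (uint32_t sw_if_index, uint32_t fib_index)
{
  assert (sw_if_index < TOY_N_IF && fib_index < TOY_N_FIB);
  toy_fib_index_by_sw_if_index[sw_if_index] = fib_index;
}

void
toy_ip4_if_del (uint32_t sw_if_index, toy_del_policy_t policy)
{
  assert (sw_if_index < TOY_N_IF);
  toy_fib_index_by_sw_if_index[sw_if_index] =
    (policy == TOY_DEL_SET_DEFAULT) ? 0 : TOY_INVALID;
}

uint32_t
toy_fib_index_of (uint32_t sw_if_index)
{
  assert (sw_if_index < TOY_N_IF);
  return toy_fib_index_by_sw_if_index[sw_if_index];
}

/* Returns 1 when a packet still referring to sw_if_index would index an
   existing FIB table, 0 when the lookup would go out of bounds. */
int
toy_lookup_is_safe (uint32_t sw_if_index)
{
  return toy_fib_index_of (sw_if_index) < TOY_N_FIB;
}
```

With TOY_DEL_SET_INVALID a late packet has no usable FIB index. With
TOY_DEL_SET_DEFAULT it is looked up in FIB 0, which avoids the out-of-bounds
access but forwards the packet in the default table.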



View/Reply Online (#22251): https://lists.fd.io/g/vpp-dev/message/22251



[vpp-dev] possible use deleted sw if index in ip4-lookup and cause crash

2022-11-28 Thread Zhang Dongya
Hi list,

Recently I encountered a VPP crash with my plugin enabled. After some
investigation, I found it may be related to an L3 sub-interface being
deleted while my process node adds work to the ip4-lookup node.

At first I thought it might be related to barrier usage, so I tried to fix
it by adding checks in my process node to guard against the L3
sub-interface having been deleted. However, the crash still occurs.

Finally, I think it is related to a pattern like this:

1. My process node adds a packet to ip4-lookup directly using
   put_frame_to_node, with the RX interface set to the L3 sub-interface
   created earlier.
2. My control plane agent (using govpp) deletes the L3 sub-interface. (This
   should be handled in the VPP api-process node.)
3. VPP schedules the pending nodes. Since the RX interface has been deleted,
   VPP cannot get a valid FIB index, and there is no check in the
   ip4_fib_forwarding_lookup that follows, so it crashes with an abort.

I think VPP may schedule my process node (timeout driven) and the
api-process node one after the other, and only then schedule the pending
nodes.

Should I add a check in ip4-lookup, or is my way of sending packets from a
control process incorrect? Is there a better way?

Thanks a lot.
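
One shape the "check in ip4-lookup" asked about above could take is sketched
below. It is a toy model only: toy_if_table_t, toy_if_set and
toy_ip4_lookup_guarded are hypothetical names, and a real check would use
whatever validity test the interface pool provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_N_IF 16

typedef enum
{
  TOY_NEXT_LOOKUP, /* continue to the FIB lookup */
  TOY_NEXT_DROP    /* RX interface is gone, drop the packet */
} toy_next_t;

/* Hypothetical interface table; not a VPP structure. */
typedef struct
{
  bool live[TOY_N_IF];
  uint32_t fib_index[TOY_N_IF];
} toy_if_table_t;

void
toy_if_set (toy_if_table_t *t, uint32_t sw_if_index, bool live,
            uint32_t fib_index)
{
  assert (sw_if_index < TOY_N_IF);
  t->live[sw_if_index] = live;
  t->fib_index[sw_if_index] = fib_index;
}

/* Decide the next node for one packet. *fib_index is written only when
   the RX interface is still live. */
toy_next_t
toy_ip4_lookup_guarded (const toy_if_table_t *t, uint32_t rx_sw_if_index,
                        uint32_t *fib_index)
{
  if (rx_sw_if_index >= TOY_N_IF || !t->live[rx_sw_if_index])
    return TOY_NEXT_DROP;

  *fib_index = t->fib_index[rx_sw_if_index];
  return TOY_NEXT_LOOKUP;
}
```

A per-packet check like this adds work to the forwarding path, which is a
trade-off to weigh against handling the problem where the packet is queued.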

View/Reply Online (#22245): https://lists.fd.io/g/vpp-dev/message/22245