Re: [vpp-dev] process node suspended indefinitely

2023-03-12 Thread Sudhir CR via lists.fd.io
Hi Dave,
we are using VPP version *21.10*.

Thanks and regards,
Sudhir

On Fri, Mar 10, 2023 at 5:31 PM Dave Barach  wrote:

> I should have had the sense to ask this earlier: which version of vpp are
> you using?
>
>
>
> The line number in your debug snippet is more than 100 lines off from
> master/latest. The timer wheel code has been relatively untouched, but
> there have been several important fixes over the years...
>
>
>
> D.
>
>
>
> diff --git a/src/vlib/main.c b/src/vlib/main.c
> index af0fcd1cb..55c231d8b 100644
> --- a/src/vlib/main.c
> +++ b/src/vlib/main.c
> @@ -1490,6 +1490,9 @@ dispatch_suspended_process (vlib_main_t * vm,
>  }
>else
>  {
> +   if (strcmp((char *)node->name, "rtb-vpp-epoll-process") == 0) {
> +       ASSERT(0);
> +   }
>
>
>
> *From:* vpp-dev@lists.fd.io  *On Behalf Of *Sudhir
> CR via lists.fd.io
> *Sent:* Thursday, March 9, 2023 4:00 AM
> *To:* vpp-dev@lists.fd.io
> *Cc:* rtbrick@lists.fd.io
> *Subject:* Re: [vpp-dev] process node suspended indefinitely
>
>
>
> Hi Dave,
>
> Please excuse my delayed response. It took some time to recreate this
> issue.
>
> I made changes to our process node as per your suggestion. Now our process
> node code looks like this:
>
>
>
> while (1) {
>
> vlib_process_wait_for_event_or_clock (vm,
> RTB_VPP_EPOLL_PROCESS_NODE_TIMER);
> event_type = vlib_process_get_events (vm, &event_data);
> vec_reset_length(event_data);
>
> switch (event_type) {
> case ~0: /* handle timer expirations */
> rtb_event_loop_run_once ();
> break;
>
> default: /* bug! */
> ASSERT (0);
> }
> }
>
> After these changes we didn't observe any assertions, but we still hit the
> process node suspend issue. With this it is clear that, other than the
> timeout, we are not getting any other events.
>
>
>
> In the issue state I have collected vlib_process node
> (rtb_vpp_epoll_process) flags value and it seems to be correct (flags = 11).
>
>
>
> Please find the vlib_process_t and vlib_node_t data structure values
> collected in the issue state below.
>
>
>
> vlib_process_t:
>
> 
>
> $38 = {
>   cacheline0 = 0x7f9b2da50380 "\200~\274+\233\177",
>   node_runtime = {
> cacheline0 = 0x7f9b2da50380 "\200~\274+\233\177",
> function = 0x7f9b2bbc7e80 ,
> errors = 0x7f9b3076a560,
> clocks_since_last_overflow = 0,
> max_clock = 3785970526,
> max_clock_n = 0,
> calls_since_last_overflow = 0,
> vectors_since_last_overflow = 0,
> next_frame_index = 1668,
> node_index = 437,
> input_main_loops_per_call = 0,
> main_loop_count_last_dispatch = 4147405645,
> main_loop_vector_stats = {0, 0},
> flags = 0,
> state = 0,
> n_next_nodes = 0,
> cached_next_index = 0,
> thread_index = 0,
> runtime_data = 0x7f9b2da503c6 ""
>   },
>   return_longjmp = {
> regs = {94502584873984, 140304430422064, 140306731463680,
> 94502584874048, 94502640552512, 0, 140304430422032, 140306703608766}
>   },
>   resume_longjmp = {
> regs = {94502584873984, 140304161734368, 140306731463680,
> 94502584874048, 94502640552512, 0, 140304161734272, 140304430441787}
>   },
>   flags = 11,
>   log2_n_stack_bytes = 16,
>   suspended_process_frame_index = 0,
>   n_suspends = 0,
>   pending_event_data_by_type_index = 0x7f9b307b8310,
>   non_empty_event_type_bitmap = 0x7f9b307b8390,
>   one_time_event_type_bitmap = 0x0,
>   event_type_index_by_type_opaque = 0x7f9b2dab8bd8,
>   event_type_pool = 0x7f9b2dcb5978,
>   resume_clock_interval = 1000,
>   stop_timer_handle = 3098,
>   output_function = 0x0,
>   output_function_arg = 0,
>   stack = 0x7f9b1bb78000
> }
>
>
>
> vlib_node_t
>
> =
>
>  (gdb) p *n
>
> $17 = {
>   function = 0x7f9b2bbc7e80 ,
>   name = 0x7f9b3076a3f0 "rtb-vpp-epoll-process",
>   name_elog_string = 11783,
>   stats_total = {
> calls = 0,
> vectors = 0,
> clocks = 1971244932732,
> suspends = 6847366,
> max_clock = 3785970526,
> max_clock_n = 0
>   },
>   stats_last_clear = {
> calls = 0,
> vectors = 0,
> clocks = 0,
> suspends = 0,
> max_clock = 0,
> max_clock_n = 0
>   },
>   type = VLIB_NODE_TYPE_PROCESS,
>   index = 437,
>   runtime_index = 40,
>   runtime_data = 0x0,
>   flags = 0,
>   state = 0 '\000',
>   runtime_data_bytes = 0 '\000',
>   protocol_hint = 0 '\000',
>   n

Re: [vpp-dev] process node suspended indefinitely

2023-03-10 Thread Sudhir CR via lists.fd.io
Hi jinsh,
Thanks for the help.
I placed an assert statement in the *vlib_process_signal_event_helper* function,
but the assert statement didn't hit there.
When I debugged further I found that my process node is not present in
the *data_from_advancing_timing_wheel* vector.
I believe that is why the process node is not getting called. Now I am checking
why the *rtb-vpp-epoll-process* node entry is not present in the
*data_from_advancing_timing_wheel* vector.
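
A minimal sketch of that kind of check (using the vlib helpers as they exist
around 21.10; rtb_check_process_pending is an illustrative name, not code from
our tree):

```
#include <vlib/vlib.h>

/* Debug-only sketch: after the timing wheel has been advanced, check whether
 * a given suspended process made it into the pending-resume vector. */
static void
rtb_check_process_pending (vlib_node_main_t * nm, u32 runtime_index)
{
  u32 *d;
  int found = 0;

  vec_foreach (d, nm->data_from_advancing_timing_wheel)
    {
      if (!vlib_timing_wheel_data_is_timed_event (d[0])
          && vlib_timing_wheel_data_get_index (d[0]) == runtime_index)
        found = 1;
    }

  if (!found)
    clib_warning ("process runtime index %u not scheduled for resume",
                  runtime_index);
}
```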

Thanks and regards,
Sudhir

On Thu, Mar 9, 2023 at 9:08 PM jinsh11  wrote:

>
>I think you can query who stopped the current node's time wheel.
>
>always_inline void *
>vlib_process_signal_event_helper (vlib_node_main_t * nm,
>                                  vlib_node_t * n,
>                                  vlib_process_t * p,
>                                  uword t,
>                                  uword n_data_elts, uword n_data_elt_bytes)
>{
>  ...
>  if (add_to_pending)
>    {
>      u32 x = vlib_timing_wheel_data_set_suspended_process (n->runtime_index);
>
>      p->flags = p_flags | VLIB_PROCESS_RESUME_PENDING;
>      vec_add1 (nm->data_from_advancing_timing_wheel, x);
>
>      if (delete_from_wheel)
>        {
>          TW (tw_timer_stop) ((TWT (tw_timer_wheel) *) nm->timing_wheel,
>                              p->stop_timer_handle);
>
>          /* suggested debug check: assert if another process shares
>             this stop_timer_handle */
>          vlib_process_t *p1 = vec_elt (nm->processes,
>            vlib_get_node_by_name (vm, "rtb-vpp-epoll-process")->runtime_index);
>          if (p != p1 && (p->stop_timer_handle == p1->stop_timer_handle))
>            {
>              ASSERT (0);
>            }
>        }
>    }
>}
>
>
> 
>
>




Re: [vpp-dev] process node suspended indefinitely

2023-03-09 Thread Sudhir CR via lists.fd.io
Hi Dave,
Please excuse my delayed response. It took some time to recreate this issue.
I made changes to our process node as per your suggestion. Now our process
node code looks like this:

while (1) {

vlib_process_wait_for_event_or_clock (vm,
RTB_VPP_EPOLL_PROCESS_NODE_TIMER);
event_type = vlib_process_get_events (vm, &event_data);
vec_reset_length(event_data);

switch (event_type) {
case ~0: /* handle timer expirations */
rtb_event_loop_run_once ();
break;

default: /* bug! */
ASSERT (0);
}
}
After these changes we didn't observe any assertions, but we still hit the
process node suspend issue. With this it is clear that, other than the
timeout, we are not getting any other events.

In the issue state I collected the vlib_process node (rtb_vpp_epoll_process)
flags value, and it seems to be correct (flags = 11).

Please find the vlib_process_t and vlib_node_t data structure values
collected in the issue state below.

vlib_process_t:

$38 = {
  cacheline0 = 0x7f9b2da50380 "\200~\274+\233\177",
  node_runtime = {
cacheline0 = 0x7f9b2da50380 "\200~\274+\233\177",
function = 0x7f9b2bbc7e80 ,
errors = 0x7f9b3076a560,
clocks_since_last_overflow = 0,
max_clock = 3785970526,
max_clock_n = 0,
calls_since_last_overflow = 0,
vectors_since_last_overflow = 0,
next_frame_index = 1668,
node_index = 437,
input_main_loops_per_call = 0,
main_loop_count_last_dispatch = 4147405645,
main_loop_vector_stats = {0, 0},
flags = 0,
state = 0,
n_next_nodes = 0,
cached_next_index = 0,
thread_index = 0,
runtime_data = 0x7f9b2da503c6 ""
  },
  return_longjmp = {
regs = {94502584873984, 140304430422064, 140306731463680,
94502584874048, 94502640552512, 0, 140304430422032, 140306703608766}
  },
  resume_longjmp = {
regs = {94502584873984, 140304161734368, 140306731463680,
94502584874048, 94502640552512, 0, 140304161734272, 140304430441787}
  },
  flags = 11,
  log2_n_stack_bytes = 16,
  suspended_process_frame_index = 0,
  n_suspends = 0,
  pending_event_data_by_type_index = 0x7f9b307b8310,
  non_empty_event_type_bitmap = 0x7f9b307b8390,
  one_time_event_type_bitmap = 0x0,
  event_type_index_by_type_opaque = 0x7f9b2dab8bd8,
  event_type_pool = 0x7f9b2dcb5978,
  resume_clock_interval = 1000,
  stop_timer_handle = 3098,
  output_function = 0x0,
  output_function_arg = 0,
  stack = 0x7f9b1bb78000
}

vlib_node_t
=
 (gdb) p *n
$17 = {
  function = 0x7f9b2bbc7e80 ,
  name = 0x7f9b3076a3f0 "rtb-vpp-epoll-process",
  name_elog_string = 11783,
  stats_total = {
calls = 0,
vectors = 0,
clocks = 1971244932732,
suspends = 6847366,
max_clock = 3785970526,
max_clock_n = 0
  },
  stats_last_clear = {
calls = 0,
vectors = 0,
clocks = 0,
suspends = 0,
max_clock = 0,
max_clock_n = 0
  },
  type = VLIB_NODE_TYPE_PROCESS,
  index = 437,
  runtime_index = 40,
  runtime_data = 0x0,
  flags = 0,
  state = 0 '\000',
  runtime_data_bytes = 0 '\000',
  protocol_hint = 0 '\000',
  n_errors = 0,
  scalar_size = 0,
  vector_size = 0,
  error_heap_handle = 0,
  error_heap_index = 0,
  error_counters = 0x0,
  next_node_names = 0x7f9b3076a530,
  next_nodes = 0x0,
  sibling_of = 0x0,
  sibling_bitmap = 0x0,
  n_vectors_by_next_node = 0x0,
  next_slot_by_node = 0x0,
  prev_node_bitmap = 0x0,
  owner_node_index = 4294967295,
  owner_next_index = 4294967295,
  format_buffer = 0x0,
  unformat_buffer = 0x0,
  format_trace = 0x0,
  validate_frame = 0x0,
  state_string = 0x0,
  node_fn_registrations = 0x0
}

I added an assert statement before clearing *VLIB_PROCESS_IS_RUNNING* flag
in *dispatch_suspended_process* function.
But this assert statement is not hitting.

diff --git a/src/vlib/main.c b/src/vlib/main.c
index af0fcd1cb..55c231d8b 100644
--- a/src/vlib/main.c
+++ b/src/vlib/main.c
@@ -1490,6 +1490,9 @@ dispatch_suspended_process (vlib_main_t * vm,
 }
   else
 {
+   if (strcmp((char *)node->name, "rtb-vpp-epoll-process") == 0) {
+   ASSERT(0);
+   }
   p->flags &= ~VLIB_PROCESS_IS_RUNNING;
   pool_put_index (nm->suspended_process_frames,
  p->suspended_process_frame_index);

I am not able to figure out why this process node is suspended in some
scenarios. Can you please help me by providing some pointers to debug and
resolve this issue.

Hi Jinsh,
I applied your patch to my code. The issue is not solved with your patch.
Thank you for helping me out.

Thanks and Regards,
Sudhir


On Fri, Mar 3, 2023 at 12:53 PM Sudhir CR via lists.fd.io  wrote:

> Hi Chetan,
> In our case we are observing this issue occasionally; the exact steps to
> recreate the issue are not known.
> I made changes to our process node as suggested by Dave, and with these
> changes I am trying to recreate the issue.
>
>

Re: [vpp-dev] process node suspended indefinitely

2023-03-02 Thread Sudhir CR via lists.fd.io
Hi Chetan,
In our case we are observing this issue occasionally; the exact steps to
recreate the issue are not known.
I made changes to our process node as suggested by Dave, and with these
changes I am trying to recreate the issue.

Soon I will update my results and findings in this mail thread.

Thanks and Regards,
Sudhir

On Fri, Mar 3, 2023 at 12:37 PM chetan bhasin 
wrote:

> Hi Sudhir,
>
> Is your issue resolved?
>
> Actually we are facing the same issue on VPP 21.06.
> In our case "api-rx-ring" is not getting called.
> In our use case, workers are calling some functions in main-thread context,
> leading to RPC messages, and memory is allocated from the API section.
> This leads to the API-segment memory being used up fully, which leads to a crash.
>
> Thanks,
> Chetan
>
>
> On Mon, Feb 20, 2023, 18:24 Sudhir CR via lists.fd.io  rtbrick@lists.fd.io> wrote:
>
>> Hi Dave,
>> Thank you very much for your inputs. I will try this out and get back to
>> you with the results.
>>
>> Regards,
>> Sudhir
>>
>> On Mon, Feb 20, 2023 at 6:01 PM Dave Barach  wrote:
>>
>>> Please try something like this, to eliminate the possibility that some
>>> bit of code is sending this process an event. It’s not a good idea to skip
>>> the vec_reset_length (event_data) step.
>>>
>>>
>>>
>>> while (1)
>>>
>>> {
>>>
>>>uword event_type, * event_data = 0;
>>>
>>>int i;
>>>
>>>
>>>
>>>vlib_process_wait_for_event_or_clock (vm, 1e-2 /* 10 ms */);
>>>
>>>
>>>
>>>event_type = vlib_process_get_events (vm, &event_data);
>>>
>>>
>>>
>>>switch (event_type) {
>>>
>>>   case ~0: /* handle timer expirations */
>>>
>>>rtb_event_loop_run_once ();
>>>
>>>break;
>>>
>>>
>>>
>>>default: /* bug! */
>>>
>>>ASSERT (0);
>>>
>>>}
>>>
>>>
>>>
>>>vec_reset_length(event_data);
>>>
>>> }
>>>
>>>
>>>
>>> *From:* vpp-dev@lists.fd.io  *On Behalf Of *Sudhir
>>> CR via lists.fd.io
>>> *Sent:* Monday, February 20, 2023 4:02 AM
>>> *To:* vpp-dev@lists.fd.io
>>> *Subject:* Re: [vpp-dev] process node suspended indefinitely
>>>
>>>
>>>
>>> Hi Dave,
>>> Thank you for your response and help.
>>>
>>>
>>>
>>> Please find the additional details below.
>>>
>>> VPP Version *21.10*
>>>
>>>
>>> We are creating a process node* rtb-vpp-epoll-process *to handle
>>> control plane events like interface add/delete, route add/delete.
>>> This process node waits for *10ms* of time (Not Interested in any
>>> events ) once 10ms is expired it will process control plane events
>>> mentioned above.
>>>
>>> code snippet looks like below
>>>
>>>
>>>
>>> ```
>>>
>>> static uword
>>> rtb_vpp_epoll_process (vlib_main_t *vm,
>>>vlib_node_runtime_t  *rt,
>>>vlib_frame_t *f)
>>> {
>>>
>>> ...
>>> ...
>>> while (1) {
>>> vlib_process_wait_for_event_or_clock (vm, 10e-3);
>>> vlib_process_get_events (vm, NULL);
>>>
>>> rtb_event_loop_run_once();   /* <-- controlplane events handling */
>>> }
>>> }
>>> ```
>>>
>>> What we observed is that sometimes (when there is a high controlplane
>>> load like request to install more routes) "rtb-vpp-epoll-process" is
>>> suspended and not scheduled forever. This we found by using "show runtime
>>> rtb-vpp-epoll-process"*  (*in "show runtime rtb-vpp-epoll-process"
>>> command output suspends counter is not incrementing.)
>>>
>>> *show runtime output in working case :*
>>>
>>>
>>> ```
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>          Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
>>> rtb-vpp-epoll-process      any wait       0       0      192246    1.91e6      0.00
>>> DBGvpp#
>>>
>>> DBGvpp# show runtime rtb-vpp-epoll-process
>>>  Name  

Re: [vpp-dev] process node suspended indefinitely

2023-02-20 Thread Sudhir CR via lists.fd.io
Hi Dave,
Thank you very much for your inputs. I will try this out and get back to
you with the results.

Regards,
Sudhir

On Mon, Feb 20, 2023 at 6:01 PM Dave Barach  wrote:

> Please try something like this, to eliminate the possibility that some bit
> of code is sending this process an event. It’s not a good idea to skip the
> vec_reset_length (event_data) step.
>
>
>
> while (1)
>
> {
>
>uword event_type, * event_data = 0;
>
>int i;
>
>
>
>vlib_process_wait_for_event_or_clock (vm, 1e-2 /* 10 ms */);
>
>
>
>event_type = vlib_process_get_events (vm, &event_data);
>
>
>
>switch (event_type) {
>
>   case ~0: /* handle timer expirations */
>
>rtb_event_loop_run_once ();
>
>break;
>
>
>
>default: /* bug! */
>
>ASSERT (0);
>
>    }
>
>
>
>vec_reset_length(event_data);
>
> }
>
>
>
> *From:* vpp-dev@lists.fd.io  *On Behalf Of *Sudhir
> CR via lists.fd.io
> *Sent:* Monday, February 20, 2023 4:02 AM
> *To:* vpp-dev@lists.fd.io
> *Subject:* Re: [vpp-dev] process node suspended indefinitely
>
>
>
> Hi Dave,
> Thank you for your response and help.
>
>
>
> Please find the additional details below.
>
> VPP Version *21.10*
>
>
> We are creating a process node* rtb-vpp-epoll-process *to handle control
> plane events like interface add/delete, route add/delete.
> This process node waits for *10ms* of time (Not Interested in any events
> ) once 10ms is expired it will process control plane events mentioned above.
>
> code snippet looks like below
>
>
>
> ```
>
> static uword
> rtb_vpp_epoll_process (vlib_main_t *vm,
>vlib_node_runtime_t  *rt,
>vlib_frame_t *f)
> {
>
> ...
> ...
> while (1) {
> vlib_process_wait_for_event_or_clock (vm, 10e-3);
> vlib_process_get_events (vm, NULL);
>
> rtb_event_loop_run_once();   /* <-- controlplane events handling */
> }
> }
> ```
>
> What we observed is that sometimes (when there is a high controlplane load
> like request to install more routes) "rtb-vpp-epoll-process" is suspended
> and not scheduled forever. This we found by using "show runtime
> rtb-vpp-epoll-process"*  (*in "show runtime rtb-vpp-epoll-process"
> command output suspends counter is not incrementing.)
>
> *show runtime output in working case :*
>
>
> ```
> DBGvpp# show runtime rtb-vpp-epoll-process
>          Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
> rtb-vpp-epoll-process      any wait       0       0      192246    1.91e6      0.00
> DBGvpp#
>
> DBGvpp# show runtime rtb-vpp-epoll-process
>          Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
> rtb-vpp-epoll-process      any wait       0       0      193634    1.89e6      0.00
> DBGvpp#
>
> ```
>
>
> *show runtime output in issue case:*
>
> ```
> DBGvpp# show runtime rtb-vpp-epoll-process
>          Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
> rtb-vpp-epoll-process      any wait       0       0       81477    7.08e6      0.00
>
> DBGvpp# show runtime rtb-vpp-epoll-process
>          Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
> rtb-vpp-epoll-process      any wait       0       0       81477    7.08e6      0.00
> ```
>
> Other process nodes like lldp-process,
> ip4-neighbor-age-process, ip6-ra-process are running without any issue. Only
> the "rtb-vpp-epoll-process" process node is suspended forever.
>
>
>
> Please let me know if any additional information is required.
>
> Hi Jinsh,
> Thanks for pointing me to the issue you faced. The issue I am facing looks
> similar.
> I will verify with the given patch.
>
>
> Thanks and Regards,
>
> Sudhir
>
>
>
> On Sun, Feb 19, 2023 at 6:19 AM jinsh11  wrote:
>
> HI:
>
>
> I have the same problem: the bfd process node stops running. I raised this issue:
>
> https://lists.fd.io/g/vpp-dev/message/22380
> I think there is a problem with the process scheduling module when using
> the time wheel.
>
>
>
>
>

Re: [vpp-dev] process node suspended indefinitely

2023-02-20 Thread Sudhir CR via lists.fd.io
Hi Dave,
Thank you for your response and help.

Please find the additional details below.
VPP Version *21.10*

We are creating a process node *rtb-vpp-epoll-process* to handle control
plane events like interface add/delete and route add/delete.
This process node waits for *10 ms* (it is not interested in any events);
once the 10 ms expire, it processes the control plane events mentioned above.

The code snippet looks like below:

```
static uword
rtb_vpp_epoll_process (vlib_main_t *vm,
   vlib_node_runtime_t  *rt,
   vlib_frame_t *f)
{
...
...
while (1) {
vlib_process_wait_for_event_or_clock (vm, 10e-3);
vlib_process_get_events (vm, NULL);

rtb_event_loop_run_once();   /* <-- controlplane events handling */
}
}
```
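
For completeness, a sketch of how such a process node is typically registered
(the actual registration in our code may differ; only the node and function
names below are taken from this thread):

```
VLIB_REGISTER_NODE (rtb_vpp_epoll_process_node) = {
  .function = rtb_vpp_epoll_process,
  .type = VLIB_NODE_TYPE_PROCESS,
  .name = "rtb-vpp-epoll-process",
};
```
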
What we observed is that sometimes (when there is a high control plane load,
such as a request to install more routes) "rtb-vpp-epoll-process" is suspended
and never scheduled again. We found this by using "show runtime
rtb-vpp-epoll-process" (in the "show runtime rtb-vpp-epoll-process" command
output the suspends counter is not incrementing).

*show runtime output in working case:*

```
DBGvpp# show runtime rtb-vpp-epoll-process
         Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
rtb-vpp-epoll-process      any wait       0       0      192246    1.91e6      0.00
DBGvpp#

DBGvpp# show runtime rtb-vpp-epoll-process
         Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
rtb-vpp-epoll-process      any wait       0       0      193634    1.89e6      0.00
DBGvpp#
```

*show runtime output in issue case:*

```
DBGvpp# show runtime rtb-vpp-epoll-process
         Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
rtb-vpp-epoll-process      any wait       0       0       81477    7.08e6      0.00

DBGvpp# show runtime rtb-vpp-epoll-process
         Name                State      Calls  Vectors  Suspends   Clocks  Vectors/Call
rtb-vpp-epoll-process      any wait       0       0       81477    7.08e6      0.00
```

Other process nodes like lldp-process, ip4-neighbor-age-process and
ip6-ra-process are running without any issue. Only the "rtb-vpp-epoll-process"
process node is suspended forever.

Please let me know if any additional information is required.

Hi Jinsh,
Thanks for pointing me to the issue you faced. The issue I am facing looks
similar.
I will verify with the given patch.

Thanks and Regards,
Sudhir

On Sun, Feb 19, 2023 at 6:19 AM jinsh11  wrote:

> HI:
>
>I have the same problem: the bfd process node stops running. I raised this issue:
>
> https://lists.fd.io/g/vpp-dev/message/22380
> I think there is a problem with the process scheduling module when using
> the time wheel.
>
> 
>
>




[vpp-dev] process node suspended indefinitely

2023-02-17 Thread Sudhir CR via lists.fd.io
Hi Team,
We have a process node which we use to do some control plane related
activity. Sometimes we observe that this *process node is
suspended indefinitely*.

I know that if any process node takes an *unreasonably long time*, such
nodes will not be scheduled further, but I am not able to figure out where
in the code this is done.

Can anyone point me to the code where we track the time taken by each
process node and suspend it indefinitely if it consumes too much time?

Thanks and regards,
Sudhir




Re: [vpp-dev] VPP crash while create ipip tunnel after a certain limit

2022-12-08 Thread Sudhir CR via lists.fd.io
To my knowledge there is no such ratio between heapsize and statseg;
based on your application's needs you can tune these values.
In your case, as the number of ipip tunnels is large, you may need to
increase the statseg size to accommodate the counters for those tunnel
interfaces.
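
For example, something like the below in startup.conf (the values here are
only placeholders; size them according to your scale):

```
heapsize 4G

statseg {
  size 2G
}
```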

Thanks and Regards,
Sudhir

On Fri, Dec 9, 2022 at 12:05 AM Chinmaya Aggarwal 
wrote:

> On Wed, Dec 7, 2022 at 08:04 PM, Sudhir CR wrote:
>
>  heapsize and statseg
>
> Thanks for your response. What should be the ideal ratio between heapsize
> and statseg?
>
> Thanks and Regards,
> Chinmaya Agarwal.
> 
>
>




Re: [vpp-dev] VPP crash while create ipip tunnel after a certain limit

2022-12-07 Thread Sudhir CR via lists.fd.io
Hi Chinmaya Aggarwal,
I can see the "vec_resize_allocate_memory" API in the above stack, and you
mention that the issue is seen after about 7k tunnels.
I suspect this issue could be due to memory exhaustion in the system.

Can you please increase the heapsize and statseg size in the startup.conf
file and check once?

Thanks and regards,
Sudhir



On Thu, Dec 8, 2022 at 4:25 AM Chinmaya Aggarwal 
wrote:

> Hi,
>
> As per our use case, we need to have a large number of ipip tunnels in VPP
> (approx 1). When we try to configure that many tunnels inside VPP,
> after a certain limit it crashes with below core dump:-
>
> Dec 07 20:01:27 j3norvmstm01 vpp[2053130]: ipipCouldn't create
> /tmp/api_post_mortem.2053130
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: received signal SIGABRT, PC
> 0x7f019f8e537f
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #0  0x7f01a0b3ef0b
> 0x7f01a0b3ef0b
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #1  0x7f01a0478c20
> 0x7f01a0478c20
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #2  0x7f019f8e537f gsignal
> + 0x10f
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #3  0x7f019f8cfdb5 abort +
> 0x127
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #4  0x55ca1a5f60e3
> 0x55ca1a5f60e3
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #5  0x7f01a0006065
> vec_resize_allocate_memory + 0x285
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #6  0x55ca1a5f6cb0
> 0x55ca1a5f6cb0
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #7  0x55ca1a5f97f8
> 0x55ca1a5f97f8
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #8  0x55ca1a5faf09
> 0x55ca1a5faf09
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #9  0x7f01a10adefa
> 0x7f01a10adefa
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #10 0x7f01a10b263d
> vnet_register_interface + 0x6ed
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #11 0x7f01a1415094
> ipip_add_tunnel + 0x2c4
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #12 0x7f01a141a4f0
> 0x7f01a141a4f0
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #13 0x7f01a0acdb82
> 0x7f01a0acdb82
> Dec 07 20:01:27 j3norvmstm01 vnet[2053130]: #14 0x7f01a0acdce7
> 0x7f01a0acdce7
> Dec 07 20:01:27 j3norvmstm01 vpp[2053130]: Couldn't create
> /tmp/api_post_mortem.2053130
>
> It is able to create only 7362 tunnels and after that VPP crashes.
>
> What could be the possible reason for this crash? Also, is there any limit
> on the number of ipip tunnels (or interface created corresponding to ipip
> tunnels) in VPP?
>
> Thanks and Regards,
> Chinmaya Agarwal.
>
> 
>
>




Re: [vpp-dev] LACP issues w/ cdma/connectX 6

2022-12-05 Thread Sudhir CR via lists.fd.io
Hi Eyle,
We once faced an LACP issue with host-if interfaces. In our topology, host-if
interfaces between two containers are connected via a Linux bridge.
When one container sends an LACP PDU to the other container, those PDUs are
dropped by Linux. Please enable the packet trace on both sides and check
whether LACP PDUs are exchanged between the interfaces.
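
For example (assuming the host-if interfaces are af_packet based, so packets
arrive on the af-packet-input node; use the input node that matches your
interface type):

```
vpp# clear trace
vpp# trace add af-packet-input 50
vpp# show trace
```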

Below reference might be useful to you.

https://bugzilla.kernel.org/show_bug.cgi?id=202663

Thanks and regards,
Sudhir

On Mon, Dec 5, 2022 at 7:45 PM Eyle Brinkhuis 
wrote:

> Hi, thanks for your reply.
>
> That’s the weird thing.. we have two identical hosts connected to the same
> switch (sn2700), with same OS, same VPP version, same minx_ofed and same
> everything except for the NIC. The box with a CX5 works like a charm, the
> box with the CX6 doesn’t.. but also does, when I create the bond in
> netplan..
>
> Regards,
>
> Eyle
>
> > On 5 Dec 2022, at 15:07, najieb  wrote:
> >
> > I once experienced LACP not UP because the mode on vpp (lacp-static) did
> not match the mode on switch (lacp-dynamic). try changing the lacp mode on
> your switch.
> >
> >
>
>
> 
>
>




Re: [vpp-dev] VPP v22.02 not coming up

2022-05-25 Thread Sudhir CR via lists.fd.io
Hi Chinmaya,
The below errors come when vpp fails to parse startup.conf parameters
like "socket-mem" and "linux-cp".
It seems these options are deprecated in the 22.02 version.

vlib_call_all_config_functions: unknown input `dpdk  socket-mem 1024 dev
:00:08.0 dev :00:09.0 '

vlib_call_all_config_functions: unknown input `linux-cp  default netns
dataplane interface-auto-create '

To confirm this and unblock yourself, you can comment out the "socket-mem"
and "linux-cp" options in startup.conf and try.

Like below:

dpdk {
  # socket-mem 16384,16384
  dev :0b:00.0 {
  }
  dev :0b:00.1 {
  }
}
Later you can go through the vpp source code to see what new
options/changes have been made in this area.


Thanks and Regards,
Sudhir

On Wed, May 25, 2022 at 5:30 PM Chinmaya Aggarwal 
wrote:

> Hi,
>
> As per the suggestions, I modified /etc/vpp/startup.conf file but I am
> still facing the same issues. Below is the modified contents of the file:-
>
> dpdk {
> socket-mem 1024 dev :00:08.0 dev :00:09.0
> }
>
>  plugins {
> ## Adjusting the plugin path depending on where the VPP plugins are
> #   path
> /ws/vpp/build-root/install-vpp-native/vpp/lib/vpp_plugins
> ## Add additional directory to the plugin path
> #   add-path /tmp/vpp_plugins
>
> ## Disable all plugins by default and then selectively enable
> specific plugins
>  plugin default { disable }
>  plugin dpdk_plugin.so { enable }
> # plugin acl_plugin.so { enable }
> plugin linux_cp_plugin.so { enable }
> plugin linux_nl_plugin.so { enable }
> ## Enable all plugins by default and then selectively disable
> specific plugins
> # plugin dpdk_plugin.so { disable }
> # plugin acl_plugin.so { disable }
>  }
>
> linux-cp {
>   default netns dataplane
> }
>
> Same issue is seen:-
>
> vlib_call_all_config_functions: unknown input `dpdk  socket-mem 1024 dev
> :00:08.0 dev :00:09.0 '
>
> vlib_call_all_config_functions: unknown input `linux-cp  default netns
> dataplane interface-auto-create '
>
> Also, I followed below steps for compiling VPP v22.02:-
>
> a) Cloning the repository
>
> git clone -b stable/2202 https://gerrit.fd.io/r/vpp
>
> b) Cherry picking commits (after stable/2202) for linux-cp and linux-nl
> plugin from master branch
>
> git cherry-pick 616447c392311791e630a604a07a2c7e47dbb7d6
> git cherry-pick 307ff11acbe811b7834f58a5bd14dd3038c991cd
> git cherry-pick ffd7a9876e5038ad96af1d5dbbb0283c5fe6ab27
> git cherry-pick 09cdea643aa181d833df15b8c96c3a812320761a
> git cherry-pick 53f8a272a63444b61b700690a2f438fa5066f37b
> git cherry-pick adac308aa8033de28ec9e627af2ed517f37aba6a
> git cherry-pick 87e92c6586747a790ae514effb79b86a3e53958e
> git cherry-pick 3819205bdb5ac0217b54f074d7645efa5356b561
> git cherry-pick aebfc285a89be20f68e5599b8d67dda8f20888a5
> git cherry-pick bc91e86674d446e024a957318d42a3bbd3280bf1
> git cherry-pick f4795a9bd8f488c5d32f9b171aa1d195bb4b8186
> git cherry-pick 2286f937d9a805324a8e46ba5a17198c039ba91a
> git cherry-pick 7e721954d4ea31a26ad44872acc199c91b9595e6
> git cherry-pick 7e647358af812d207004be00eef1d0396ab9f138
> git cherry-pick d373ebef012b1fe94c3df0b92e8c27f90cf782f9
> git cherry-pick 30bb344ab6c82d742d2e5a79f62f8d4868db16f1
> git cherry-pick 7d6f7d0d67face9889e43bdb5f71f352294b918a
> git cherry-pick 1c5b127d2247b68f362b3caac8ff229406fab4d0
> git cherry-pick 851215a04ff53df2eb153133e3f47f514facde3a
> git cherry-pick fbc4ad5fd4a48c49c492912fe75e33a2dbb41dab
> git cherry-pick 6120441f9fbfd275e3993712b92eeb80da652767
> git cherry-pick 3bad8b62d87513c5f4004c3172551c8785c78e65
> git cherry-pick 8abbdf509bbd20c5325c8637f78f502aeeb77af3
>
> c) Enabling MLX5
>
> vim build/external/packages/dpdk.mk
>
> DPDK_MLX5_PMD?= y
> DPDK_MLX5_COMMON_PMD ?= y
>
> git add .
> git commit -m "MLX5 related configuration added"
>
> d) Install dependencies
>
> make wipe-release
> make install-dep
> make install-ext-deps
> dnf install libmnl-devel
>
> e) Build
>
> make build-release
>
> f) Memif compilation and copying.so files
> cd /opt/vpp
> make -C build-root PLATFORM=vpp TAG=vpp libmemif-install
> cp /opt/vpp/build-root/install-vpp-native/libmemif/lib/libmemif.so
> /usr/lib64
> cd /usr/lib/
> mkdir vpp_plugins
> cp
> /opt/vpp/build-root/build-vpp-native/vpp/lib64/vpp_plugins/dpdk_plugin.so
> /usr/lib/vpp_plugins/
> cp
> /opt/vpp/build-root/build-vpp-native/vpp/lib64/vpp_plugins/acl_plugin.so
> /usr/lib/vpp_plugins/
> cp
> /opt/vpp/build-root/build-vpp-native/vpp/lib64/vpp_plugins/hs_apps_plugin.so
> /usr/lib/vpp_plugins/
>
> g) Make rpm packages
> make pkg-rpm
>
> h) RPMS packages are created at path /opt/vpp/build-root/
>
> Are we missing out something here? We have not faced this issue with the
> previous VPP version we were using i.e. v21.06 (compilation steps were
> similar). How can we resolve this issue?
>
> Thanks and Regards,
> Chinmaya Agarwal.
>
> 
>
>


Re: [vpp-dev] VPP 2110 with AddressSanitizer enabled

2022-03-15 Thread Sudhir CR via lists.fd.io
Hi Ben,
Thank you very much for the response and useful information.

Regards,
Sudhir


On Tue, Mar 15, 2022 at 11:02 PM Benoit Ganne (bganne) 
wrote:

> Hi Sudhir,
>
> Yes Asan is pretty picky about compiler version (probably issue 1), how
> you link your plugins (issue 3 is ASan complaining about your private
> plugin librtbvpp.so redefining an already defined global symbol, issue 2
> might be linked to that too).
> Anyway, using GCC, LD_PRELOAD and ASAN_OPTIONS are valid workarounds.
> Regarding not detecting leaks, this is something we do not support
> unfortunately: only memory violations (use-after-free etc) should be
> detected.
>
> Best
> ben
>
> > -Original Message-----
> > From: vpp-dev@lists.fd.io  On Behalf Of Sudhir CR
> via
> > lists.fd.io
> > Sent: mardi 15 mars 2022 06:49
> > To: Sudhir CR 
> > Cc: Benoit Ganne (bganne) ; vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] VPP 2110 with AddressSanitizer enabled
> >
> > Hi Ben,
> >
> > I tried to run ASAN on vpp version 21.10 in debug mode.But It's not
> > working for me.
> >
> >
> > Issue 1:
> > I compiled vpp code with below build command (with clang)
> > sudo make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
> > After compilation starting vpp is failing with below error
> > sed: symbol lookup error: /home/supervisor/libvpp/build-root/install-
> > vpp_debug-native/vpp/lib/libvppinfra.so.1.0.1: undefined symbol:
> > __asan_option_detect_stack_use_after_return
> >
> >
> > Issue 2:
> > since compiling with clang is not working i tried compiling with GCC. i
> > used below command for compilation
> >
> >
> > sudo make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
> > CC=gcc-8
> >
> >
> > with this vpp startup is failing with below error
> > ==908==ASan runtime does not come first in initial library list; you
> > should either link runtime to your application or manually preload it
> with
> > LD_PRELOAD.
> > I resolved above issue with below configuration
> > LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/8/libasan.so
> > export LD_PRELOAD
> > Issue 3:
> > once issue 2 is resolved vpp crashed with below backtrace
> > ```
> > Thread 1 (Thread 0x77fce400 (LWP 678)):
> > #0  __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:51
> > #1  0x775917f1 in __GI_abort () at abort.c:79
> > #2  0x7fffe99f39ce in ?? () from /usr/lib/x86_64-linux-
> > gnu/libasan.so.5
> > #3  0x7fffe99fc088 in ?? () from /usr/lib/x86_64-linux-
> > gnu/libasan.so.5
> > #4  0x7fffe99d3a39 in ?? () from /usr/lib/x86_64-linux-
> > gnu/libasan.so.5
> > #5  0x7fffe99e1ee9 in ?? () from /usr/lib/x86_64-linux-
> > gnu/libasan.so.5
> > #6  0x77de38d3 in call_init (env=0x7fffe308,
> > argv=0x7fffe2e8, argc=3, l=) at dl-init.c:72
> > #7  _dl_init (main_map=main_map@entry=0x583cf670, argc=3,
> > argv=0x7fffe2e8, env=0x7fffe308) at dl-init.c:119
> > #8  0x77de839f in dl_open_worker (a=a@entry=0x7fffcf40) at
> dl-
> > open.c:522
> > #9  0x776b816f in __GI__dl_catch_exception
> > (exception=0x7fffcf20, operate=0x77de7f60 ,
> > args=0x7fffcf40)
> > at dl-error-skeleton.c:196
> > #10 0x77de796a in _dl_open (file=0x7fffd1b0
> > "/usr/local/lib/librtbvpp.so", mode=-2147483391,
> > caller_dlopen=0x7794b73d , nsid= > out>, argc=3, argv=, env=0x7fffe308) at dl-open.c:605
> > #11 0x770c9f96 in dlopen_doit (a=a@entry=0x7fffd170) at
> > dlopen.c:66
> > #12 0x776b816f in __GI__dl_catch_exception
> > (exception=exception@entry=0x7fffd110, operate=0x770c9f40
> > ,
> > args=0x7fffd170) at dl-error-skeleton.c:196
> > #13 0x776b81ff in __GI__dl_catch_error (objname=0x5779df30,
> > errstring=0x5779df38, mallocedp=0x5779df28, operate= > out>,
> > args=) at dl-error-skeleton.c:215
> > #14 0x770ca745 in _dlerror_run
> > (operate=operate@entry=0x770c9f40 ,
> > args=args@entry=0x7fffd170) at dlerror.c:162
> > #15 0x770ca051 in __dlopen (file=file@entry=0x7fffd1b0
> > "/usr/local/lib/librtbvpp.so", mode=mode@entry=257) at dlopen.c:87
> > ```
> > I resolved this issue by exporting below ASAN options
> > export ASAN_OPTIONS=verify_asan_link_order=0:detect_odr_violation=0
> > Once issue 3 is resolved vpp is up but it's not catching/reporting any
> > memory leaks
> > (I induced one l

[vpp-dev] Failed to start vpp debug image

2022-03-15 Thread Sudhir CR via lists.fd.io
Hi All,
I compiled the vpp code with "sudo make build", but when I started vpp with
the "sudo make debug" command it aborted.
However, I am able to start vpp with the release image successfully.

*vpp version : 21.10*

Please find full logs below

starting vpp with debug image:
===
supervisor@l2_sudhir>srv2:~/libvpp $ sudo make debug
WARNING: STARTUP_CONF not defined or file doesn't exist.
 Running with minimal startup config:  unix { interactive
cli-listen /run/vpp/cli.sock gid 0 } dpdk { no-pci } \n
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/supervisor/libvpp/build-root/
*install-vpp_debug-native*/vpp/bin/vpp...done.
SignalStop Print Pass to program Description
SIGUSR1   No No Yes User defined signal 1
(gdb) run -c /etc/vpp/startup.conf
Starting program:
/home/supervisor/libvpp/build-root/install-vpp_debug-native/vpp/bin/vpp -c
/etc/vpp/startup.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
unix_config:521: couldn't open log '/var/log/vpp/vpp.log'
/home/supervisor/libvpp/build-root/install-vpp_debug-native/vpp/bin/vpp[710]:
snort: initialized
/home/supervisor/libvpp/build-root/install-vpp_debug-native/vpp/bin/vpp[710]:
/home/supervisor/libvpp/src/vppinfra/ptclosure.c:25 (clib_ptclosure_alloc)
assertion `n > 0' fails

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x74e4d7f1 in __GI_abort () at abort.c:79
#2  0x00407ab3 in os_panic () at
/home/supervisor/libvpp/src/vpp/vnet/main.c:618
#3  0x755d6a19 in debugger () at
/home/supervisor/libvpp/src/vppinfra/error.c:84
#4  0x755d6797 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0, fmt=0x756b8705 "%s:%d (%s) assertion `%s' fails")
at /home/supervisor/libvpp/src/vppinfra/error.c:143
#5  0x7562ceb1 in clib_ptclosure_alloc (n=0) at
/home/supervisor/libvpp/src/vppinfra/ptclosure.c:25
#6  0x770a7144 in vnet_feature_arc_init (vm=0x7fffcf60d680,
vcm=0x7fffd1debc90, feature_start_nodes=0x77b40810 <.compoundliteral>,
num_feature_start_nodes=1, last_in_arc=0x0, first_reg=0x0,
first_const_set=0x0, in_feature_nodes=0x7fffd1dd2960)
at /home/supervisor/libvpp/src/vnet/feature/registration.c:253
#7  0x770a0e34 in vnet_feature_init (vm=0x7fffcf60d680) at
/home/supervisor/libvpp/src/vnet/feature/feature.c:151
#8  0x76a3e56f in ip4_lookup_init (vm=0x7fffcf60d680) at
/home/supervisor/libvpp/src/vnet/ip/ip4_forward.c:1131
#9  0x76197167 in call_init_exit_functions_internal
(vm=0x7fffcf60d680, headp=0x764ab270 ,
call_once=1, do_sort=1,
is_global=1) at /home/supervisor/libvpp/src/vlib/init.c:363
#10 0x76196fff in vlib_call_init_exit_functions (vm=0x7fffcf60d680,
headp=0x764ab270 , call_once=1, is_global=1)
at /home/supervisor/libvpp/src/vlib/init.c:377
#11 0x7619722c in vlib_call_all_init_functions (vm=0x7fffcf60d680)
at /home/supervisor/libvpp/src/vlib/init.c:400
#12 0x761c3e02 in vlib_main (vm=0x7fffcf60d680,
input=0x7fffb8d5efa8) at /home/supervisor/libvpp/src/vlib/main.c:2029
#13 0x7624d13e in thread0 (arg=140736672618112) at
/home/supervisor/libvpp/src/vlib/unix/main.c:716
#14 0x75604a58 in clib_calljmp () at
/home/supervisor/libvpp/src/vppinfra/longjmp.S:123
#15 0x7fffce00 in ?? ()
#16 0x7624cedb in vlib_unix_main (argc=45, argv=0x681500) at
/home/supervisor/libvpp/src/vlib/unix/main.c:797
#17 0x004067a0 in main (argc=45, argv=0x681500) at
/home/supervisor/libvpp/src/vpp/vnet/main.c:344
(gdb)

starting vpp with release image:

supervisor@l2_sudhir>srv2:~/libvpp $ make debug-release
WARNING: STARTUP_CONF not defined or file doesn't exist.
 Running with minimal startup config:  unix { interactive
cli-listen /run/vpp/cli.sock gid 1000 } dpdk { no-pci } \n
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free 

Re: [vpp-dev] VPP 2110 with AddressSanitizer enabled

2022-03-14 Thread Sudhir CR via lists.fd.io
Hi Ben,

I tried to run ASan on vpp version 21.10 in debug mode, but it's not working
for me.

Issue 1:
I compiled vpp code with below build command (with clang)

*sudo make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON*

After compilation starting vpp is failing with below error

*sed: symbol lookup error:
/home/supervisor/libvpp/build-root/install-vpp_debug-native/vpp/lib/libvppinfra.so.1.0.1:
undefined symbol: __asan_option_detect_stack_use_after_return*


Issue 2:
Since compiling with clang is not working, I tried compiling with GCC. I
used the below command for compilation:

*sudo make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
CC=gcc-8*

With this, vpp startup is failing with the below error:

*==908==ASan runtime does not come first in initial library list; you
should either link runtime to your application or manually preload it
with LD_PRELOAD.*

I resolved the above issue with the below configuration:

LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/8/libasan.so
export LD_PRELOAD

Issue 3:

Once issue 2 was resolved, vpp crashed with the below backtrace:

```

Thread 1 (Thread 0x77fce400 (LWP 678)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x775917f1 in __GI_abort () at abort.c:79
#2  0x7fffe99f39ce in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#3  0x7fffe99fc088 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#4  0x7fffe99d3a39 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#5  0x7fffe99e1ee9 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#6  0x77de38d3 in call_init (env=0x7fffe308,
argv=0x7fffe2e8, argc=3, l=) at dl-init.c:72
#7  _dl_init (main_map=main_map@entry=0x583cf670, argc=3,
argv=0x7fffe2e8, env=0x7fffe308) at dl-init.c:119
#8  0x77de839f in dl_open_worker (a=a@entry=0x7fffcf40) at
dl-open.c:522
#9  0x776b816f in __GI__dl_catch_exception
(exception=0x7fffcf20, operate=0x77de7f60 ,
args=0x7fffcf40)
at dl-error-skeleton.c:196
#10 0x77de796a in _dl_open (file=0x7fffd1b0
"/usr/local/lib/librtbvpp.so", mode=-2147483391,
caller_dlopen=0x7794b73d , nsid=, argc=3, argv=, env=0x7fffe308) at
dl-open.c:605
#11 0x770c9f96 in dlopen_doit (a=a@entry=0x7fffd170) at dlopen.c:66
#12 0x776b816f in __GI__dl_catch_exception
(exception=exception@entry=0x7fffd110, operate=0x770c9f40
,
args=0x7fffd170) at dl-error-skeleton.c:196
#13 0x776b81ff in __GI__dl_catch_error
(objname=0x5779df30, errstring=0x5779df38,
mallocedp=0x5779df28, operate=,
args=) at dl-error-skeleton.c:215
#14 0x770ca745 in _dlerror_run
(operate=operate@entry=0x770c9f40 ,
args=args@entry=0x7fffd170) at dlerror.c:162
#15 0x770ca051 in __dlopen (file=file@entry=0x7fffd1b0
"/usr/local/lib/librtbvpp.so", mode=mode@entry=257) at dlopen.c:87

```

I resolved this issue by exporting the below ASAN options:

*export ASAN_OPTIONS=verify_asan_link_order=0:detect_odr_violation=0*

Once issue 3 was resolved, vpp came up, but it's not catching/reporting any
memory leaks (I induced one leak in our code to verify this).

We are starting our fib process (vpp) by executing the below script:

supervisor@dev1_sudhir>srv2:~ $ cat fib.sh
LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/8/libasan.so
export LD_PRELOAD
export ASAN_OPTIONS=verify_asan_link_order=0:detect_odr_violation=0
sudo -E gdb --args bd -i /etc/rtbrick/bd/config/fibd.json

./fib.sh


Can you please let me know if you find any issue in the procedure I followed,
or give me some pointers to solve the issue I am facing.


Thanks and Regards,

Sudhir


On Thu, Feb 24, 2022 at 6:39 PM Sudhir CR via lists.fd.io  wrote:

> Hi Ben,
> Thanks for the update.
>
> Regards,
> Sudhir
>
> On Thu, Feb 24, 2022 at 6:37 PM Benoit Ganne (bganne) 
> wrote:
>
>> Hi Sudhir,
>>
>> I am working on a few bugfixes for ASan.
>> Right now, I'd recommend to use ASan in debug builds only, there are
>> several issues that need to be fixed in release mode.
>>
>> Best
>> ben
>>
>> > -Original Message-
>> > From: vpp-dev@lists.fd.io  On Behalf Of Sudhir CR
>> via
>> > lists.fd.io
>> > Sent: jeudi 24 février 2022 14:03
>> > To: vpp-dev@lists.fd.io
>> > Subject: [vpp-dev] VPP 2110 with AddressSanitizer enabled
>> >
>> > Hi Team,
>> >
>> >
>> > We compiled vpp code with AddressSanitizer enabled. We used the below
>> > command to compile code.
>> >
>> >
>> > sudo make rebuild-release VPP_EXTRA_CMAKE_ARGS=-
>> > DVPP_ENABLE_SANITIZE_ADDR=ON
>> >
>> >
>> >
>> > we are starting vpp with below command
>> >
>> >
>> > sudo ./vpp -c /etc/vpp/startup.conf
>> >
>> >
>> >

[vpp-dev] fragmentation issue with ttl 1

2022-02-24 Thread Sudhir CR via lists.fd.io
Hi Team,

During our testing we observed the below issue.

When a packet with TTL 1 and a size greater than the interface MTU is
received, it is getting dropped with the below error counter:

*23172 ip4-input ip4 ttl <= 1*

We added the below check in the ip4_frag_do_fragment function to resolve the
issue:

+  if (from_b->flags & VNET_BUFFER_F_LOCALLY_ORIGINATED)
+    {
+      to_b->flags |= VNET_BUFFER_F_LOCALLY_ORIGINATED;
+    }

Is this fix correct? If it is, can you please approve the below patch:
https://gerrit.fd.io/r/c/vpp/+/35367

Thanks & Regards,
Sudhir




Re: [vpp-dev] VPP 2110 with AddressSanitizer enabled

2022-02-24 Thread Sudhir CR via lists.fd.io
Hi Ben,
Thanks for the update.

Regards,
Sudhir

On Thu, Feb 24, 2022 at 6:37 PM Benoit Ganne (bganne) 
wrote:

> Hi Sudhir,
>
> I am working on a few bugfixes for ASan.
> Right now, I'd recommend to use ASan in debug builds only, there are
> several issues that need to be fixed in release mode.
>
> Best
> ben
>
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Sudhir CR
> via
> > lists.fd.io
> > Sent: jeudi 24 février 2022 14:03
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev] VPP 2110 with AddressSanitizer enabled
> >
> > Hi Team,
> >
> >
> > We compiled vpp code with AddressSanitizer enabled. We used the below
> > command to compile code.
> >
> >
> > sudo make rebuild-release VPP_EXTRA_CMAKE_ARGS=-
> > DVPP_ENABLE_SANITIZE_ADDR=ON
> >
> >
> >
> > we are starting vpp with below command
> >
> >
> > sudo ./vpp -c /etc/vpp/startup.conf
> >
> >
> >
> > But vpp startup is failed with below AddressSanitizer error
> >
> >
> > AddressSanitizer:DEADLYSIGNAL
> > =
> > ==1442028==ERROR: AddressSanitizer: SEGV on unknown address
> 0x0200255e3b3e
> > (pc 0x7fe59ae338f7 bp 0x7ffc128b1c40 sp 0x7ffc128b1480 T0)
> > ==1442028==The signal is caused by a READ memory access.
> > #0 0x7fe59ae338f6 in sanitizer_unpoison_push__
> > /home/supervisor/development/libvpp/src/vppinfra/sanitizer.h:54:17
> > #1 0x7fe59ae338f6 in hash_memory64
> > /home/supervisor/development/libvpp/src/vppinfra/hash.c:157:15
> > #2 0x7fe59ae338f6 in hash_memory
> > /home/supervisor/development/libvpp/src/vppinfra/hash.c:280:10
> > #3 0x7fe59ae34b4e in key_sum
> > /home/supervisor/development/libvpp/src/vppinfra/hash.c
> > #4 0x7fe59ae34b4e in lookup.llvm.8926505759877686271
> > /home/supervisor/development/libvpp/src/vppinfra/hash.c:557:7
> > #5 0x7fe59aeb203f in _hash_set3
> > /home/supervisor/development/libvpp/src/vppinfra/hash.c:848:10
> > #6 0x7fe59c4edc8e in config_one_plugin
> > /home/supervisor/development/libvpp/src/vlib/unix/plugin.c:710:3
> > #7 0x7fe59c4edc8e in vlib_plugin_config
> > /home/supervisor/development/libvpp/src/vlib/unix/plugin.c:775:12
> > #8 0x7fe59c49c3f6 in vlib_unix_main
> > /home/supervisor/development/libvpp/src/vlib/unix/main.c:764:12
> > #9 0x55a040ce2e40 in main
> > /home/supervisor/development/libvpp/src/vpp/vnet/main.c:344:14
> > #10 0x7fe59a29ebf6 in __libc_start_main /build/glibc-S9d2JN/glibc-
> > 2.27/csu/../csu/libc-start.c:310
> > #11 0x55a040c41479 in _start
> > (/home/supervisor/development/libvpp/build-root/install-vpp-
> > native/vpp/bin/vpp+0x41479)
> >
> > AddressSanitizer can not provide additional info.
> > SUMMARY: AddressSanitizer: SEGV
> > /home/supervisor/development/libvpp/src/vppinfra/sanitizer.h:54:17 in
> > sanitizer_unpoison_push__
> > ==1442028==ABORTING
> > Aborted
> >
> >
> >
> > contents of  startup.conf looks like below
> >
> >
> >
> > unix {
> >   nodaemon
> >   log /var/log/vpp/vpp.log
> >   full-coredump
> >   cli-listen localhost:5003
> >   runtime-dir /shm/run/vpp/
> > }
> >
> > api-trace {
> >   on
> > }
> >
> > cpu {
> >   main-core 2
> >   corelist-workers 3,4
> > }
> >
> > heapsize 600M
> >
> > statseg {
> >size 150M
> > }
> > plugin_path /usr/local/lib/vpp_plugins/
> > plugins {
> >   plugin rtbrick_bcm_plugin.so { disable }
> >   plugin dpdk_plugin.so { disable }
> > }
> >
> >
> >
> > Please let me know any suggestions on how to resolve this Error.
> >
> >
> > Thanks in Advance,
> >
> > Sudhir
> >
> >
>




[vpp-dev] VPP 2110 with AddressSanitizer enabled

2022-02-24 Thread Sudhir CR via lists.fd.io
Hi Team,

We compiled vpp code with AddressSanitizer enabled. We used the below
command to compile code.

*sudo make rebuild-release
VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON*

We are starting vpp with the below command:

*sudo ./vpp -c /etc/vpp/startup.conf*

But vpp startup failed with the below AddressSanitizer error:

AddressSanitizer:DEADLYSIGNAL
=
==1442028==ERROR: AddressSanitizer: SEGV on unknown address 0x0200255e3b3e
(pc 0x7fe59ae338f7 bp 0x7ffc128b1c40 sp 0x7ffc128b1480 T0)
==1442028==The signal is caused by a READ memory access.
#0 0x7fe59ae338f6 in sanitizer_unpoison_push__
/home/supervisor/development/libvpp/src/vppinfra/sanitizer.h:54:17
#1 0x7fe59ae338f6 in hash_memory64
/home/supervisor/development/libvpp/src/vppinfra/hash.c:157:15
#2 0x7fe59ae338f6 in hash_memory
/home/supervisor/development/libvpp/src/vppinfra/hash.c:280:10
#3 0x7fe59ae34b4e in key_sum
/home/supervisor/development/libvpp/src/vppinfra/hash.c
#4 0x7fe59ae34b4e in lookup.llvm.8926505759877686271
/home/supervisor/development/libvpp/src/vppinfra/hash.c:557:7
#5 0x7fe59aeb203f in _hash_set3
/home/supervisor/development/libvpp/src/vppinfra/hash.c:848:10
#6 0x7fe59c4edc8e in config_one_plugin
/home/supervisor/development/libvpp/src/vlib/unix/plugin.c:710:3
#7 0x7fe59c4edc8e in vlib_plugin_config
/home/supervisor/development/libvpp/src/vlib/unix/plugin.c:775:12
#8 0x7fe59c49c3f6 in vlib_unix_main
/home/supervisor/development/libvpp/src/vlib/unix/main.c:764:12
#9 0x55a040ce2e40 in main
/home/supervisor/development/libvpp/src/vpp/vnet/main.c:344:14
#10 0x7fe59a29ebf6 in __libc_start_main
/build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310
#11 0x55a040c41479 in _start
(/home/supervisor/development/libvpp/build-root/install-vpp-native/vpp/bin/vpp+0x41479)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV
/home/supervisor/development/libvpp/src/vppinfra/sanitizer.h:54:17 in
sanitizer_unpoison_push__
==1442028==ABORTING
Aborted


*The contents of startup.conf are shown below:*

unix {
  nodaemon
  log /var/log/vpp/vpp.log
  full-coredump
  cli-listen localhost:5003
  runtime-dir /shm/run/vpp/
}

api-trace {
  on
}

cpu {
  main-core 2
  corelist-workers 3,4
}

heapsize 600M

statseg {
   size 150M
}
plugin_path /usr/local/lib/vpp_plugins/
plugins {
  plugin rtbrick_bcm_plugin.so { disable }
  plugin dpdk_plugin.so { disable }
}

Please let me know if you have any suggestions on how to resolve this error.

Thanks in Advance,
Sudhir
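
For reference, one generic way to capture more detail from a failing AddressSanitizer run like the one above is the standard sanitizer runtime options. This is only a sketch: ASAN_OPTIONS is the stock LLVM/GCC sanitizer knob rather than anything VPP-specific, and the log path is just an example.

# hypothetical illustration: write the full ASan report to a file and allow a core dump
sudo ASAN_OPTIONS="log_path=/tmp/vpp-asan:abort_on_error=1:disable_coredump=0" \
    ./vpp -c /etc/vpp/startup.conf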


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20903): https://lists.fd.io/g/vpp-dev/message/20903
Mute This Topic: https://lists.fd.io/mt/89364107/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] vpp main thread crashed at mspace_put

2021-07-19 Thread Sudhir CR via lists.fd.io
Hi Murthy,
We observed this issue when memory was exhausted in our system (due to a
memory leak in our application).
After fixing that leak we have not observed this issue again.
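
For anyone hitting the same crash, heap pressure of this kind can usually be spotted from the debug CLI before the allocator fails hard. A minimal check, assuming the stock vppctl binary and the long-standing "show memory" debug CLI (available sub-options vary by release):

# rough heap-usage check; run periodically and watch for steady growth
vppctl show memory verbose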

Regards,
Sudhir

On Mon, Jul 19, 2021 at 4:46 PM Satya Murthy 
wrote:

> Hi Sudhir,
>
> Were you able to find a solution to this problem?
> We are also facing a similar issue.
>
> Any inputs would be helpful.
>
> --
> Thanks & Regards,
> Murthy
> 
>
>


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#19832): https://lists.fd.io/g/vpp-dev/message/19832
Mute This Topic: https://lists.fd.io/mt/81600282/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Infinite loop in fib_walk_sync

2021-06-17 Thread Sudhir CR via lists.fd.io
Hi All,
I further modified the configuration and observed that the issue is seen when
an MPLS route is present in the configuration.
Please find the simplified configuration to reproduce the issue below.
Here the MPLS route is used to pop the label and do IPv4 forwarding.

*Device configuration*
*Node 1:*
set interface ip address memif32321/32321 23.0.0.2/24

*mpls local-label add 2003 non-eos via 23.0.0.3 memif32321/32321 out-labels 3*
bfd udp session add interface memif32321/32321 local-addr 23.0.0.2 peer-addr 23.0.0.3 desired-min-tx 40 required-min-rx 40 detect-mult 3

*Node 2:*
set interface ip address memif32321/32321 23.0.0.3/24

*mpls local-label add 2002 non-eos via 23.0.0.2 memif32321/32321 out-labels 3*
bfd udp session add interface memif32321/32321 local-addr 23.0.0.3 peer-addr 23.0.0.2 desired-min-tx 40 required-min-rx 40 detect-mult 3

Thanks and Regards,
Sudhir

On Thu, Jun 17, 2021 at 2:32 PM Benoit Ganne (bganne) 
wrote:

> Hi Sudhir,
>
> It went through and this is the 3rd time now, but AFAICS you still did not
> address Neale's point:
> https://lists.fd.io/g/vpp-dev/message/19554?p=,,,20,0,0,0::Created,,infinite+loop,20,2,0,83418156
>
> Best
> ben
>
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Sudhir CR
> via
> > lists.fd.io
> > Sent: jeudi 17 juin 2021 10:59
> > To: vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Infinite loop in fib_walk_sync
> >
> > Hi All,
> >
> > I am resending the same again as I am not sure the former one reached
> > the forum.
> >
> > Regards,
> > Sudhir.
> >
> > On Thu, Jun 17, 2021 at 1:15 PM Sudhir CR via lists.fd.io wrote:
> >
> >
> >   Hi All,
> >   We have been using vpp with our stack for the 6PE solution for some
> > time.
> >   But when we recently enabled BFD in vpp we are observing an
> infinite
> > loop with the below call stack.
> >
> >   Any help in resolving this issue would be appreciated.
> >
> >   (gdb) thread apply all bt
> >
> >   Thread 3 (Thread 0x7f6d27bfe700 (LWP 449)):
> >   #0  0x7f6dc79d4007 in vlib_worker_thread_barrier_check () at
> > /home/supervisor/libvpp/src/vlib/threads.h:438
> >   #1  0x7f6dc79ce52e in vlib_main_or_worker_loop
> > (vm=0x7f6da5f9b6c0, is_main=0) at
> > /home/supervisor/libvpp/src/vlib/main.c:1788
> >   #2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9b6c0) at
> > /home/supervisor/libvpp/src/vlib/main.c:2008
> >   #3  0x7f6dc7a2592a in vlib_worker_thread_fn
> (arg=0x7f6da3593180)
> > at /home/supervisor/libvpp/src/vlib/threads.c:1862
> >   #4  0x7f6dc724bc34 in clib_calljmp () at
> > /home/supervisor/libvpp/src/vppinfra/longjmp.S:123
> >   #5  0x7f6d27bfdec0 in ?? ()
> >   #6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
> > (arg=0x7f6da3593180) at /home/supervisor/libvpp/src/vlib/threads.c:585
> >   Backtrace stopped: previous frame inner to this frame (corrupt
> > stack?)
> >
> >   Thread 2 (Thread 0x7f6d283ff700 (LWP 448)):
> >   #0  0x7f6dc79d3ffe in vlib_worker_thread_barrier_check () at
> > /home/supervisor/libvpp/src/vlib/threads.h:438
> >   #1  0x7f6dc79ce52e in vlib_main_or_worker_loop
> > (vm=0x7f6da5f9a200, is_main=0) at
> > /home/supervisor/libvpp/src/vlib/main.c:1788
> >   #2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9a200) at
> > /home/supervisor/libvpp/src/vlib/main.c:2008
> >   #3  0x7f6dc7a2592a in vlib_worker_thread_fn
> (arg=0x7f6da3593080)
> > at /home/supervisor/libvpp/src/vlib/threads.c:1862
> >   #4  0x7f6dc724bc34 in clib_calljmp () at
> > /home/supervisor/libvpp/src/vppinfra/longjmp.S:123
> >   #5  0x7f6d283feec0 in ?? ()
> >   #6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
> > (arg=0x7f6da3593080) at /home/supervisor/libvpp/src/vlib/threads.c:585
> >   Backtrace stopped: previous frame inner to this frame (corrupt
> > stack?)
> >
> >   Thread 1 (Thread 0x7f6dd47c2240 (LWP 226)):
> >   #0  0x7f6dc723c2dc in hash_header (v=0x7f6da6870e18) at
> > /home/supervisor/libvpp/src/vppinfra/hash.h:113
> >   #1  0x7f6dc723d329 in get_pair (v=0x7f6da6870e18, i=55) at
> > /home/supervisor/libvpp/src/vppinfra/hash.c:58
> >   #2  0x7f6dc723c372 in lookup (v=0x7f6da6870e18,
> > key=140108524924744, op=GET, new_value=0x0, old_value=0x0)
> >   at /home/supervisor/libvpp/src/vppinfra/hash.c:557
> >  

Re: [vpp-dev] vpp hangs with bfd configuration

2021-06-17 Thread Sudhir CR via lists.fd.io
Hi Neale,
Sorry for the late reply.
After your comment I made a slight change in the configuration, but the issue
is still seen.
Since it looks like a FIB issue, I started a new thread with a different
subject.
Please find full details in below url
https://lists.fd.io/g/vpp-dev/message/19589

Thanks and Regards,
Sudhir


On Thu, Jun 10, 2021 at 9:07 PM Neale Ranns  wrote:

>
>
>
>
> *From: *vpp-dev@lists.fd.io  on behalf of Sudhir CR
> via lists.fd.io 
> *Date: *Thursday, 10 June 2021 at 08:50
> *To: *vpp-dev@lists.fd.io 
> *Subject: *[vpp-dev] vpp hangs with bfd configuration
>
> Hi All,
>
> While trying to establish a BFD session between two containers, *VPP went
> into an infinite loop and hung* in one of the containers while processing
> "adj_bfd_notify", and this issue is reproducible every time with the below
> topology and configuration.
>
>
>
> Any help in fixing the issue would be appreciated.
>
>
>
> *Topology:*
>
>
>
>   Container1 (memif32321/32321) ----- (memif32321/32321) Container2
>
>
>
> *Configuration:*
>
> Container1
>
> 
>
> set interface ip address memif32321/32321 4.4.4.4/24
> ip table add 100
> ip route add 4.4.4.0/24 table 100 via 4.4.4.5 memif32321/32321 out-labels
> 
> ip route add 4.4.4.5/32 table 100 via 4.4.4.5 memif32321/32321 out-labels
> 
>
> set interface mpls memif32321/32321 enable
> mpls local-label add  eos via 4.4.4.5 memif32321/32321
> ip4-lookup-in-table 100
>
>
>
> What’s the intent here? Do you want to forward via the memif or do a
> lookup? You can’t do both.
>
> Fix that and see if it helps.
>
>
>
> /neale
>
>
>
> bfd udp session add interface memif32321/32321 local-addr 4.4.4.4
> peer-addr 4.4.4.5 desired-min-tx 40 required-min-rx 40 detect-mult 3
>
>
>
> Container2
>
> 
>
> set interface ip address memif32321/32321 4.4.4.5/24
> ip table add 100
> ip route add 4.4.4.0/24 table 100 via 4.4.4.4 memif32321/32321 out-labels
> 
> ip route add 4.4.4.4/32 table 100 via 4.4.4.4 memif32321/32321 out-labels
> 
> set interface mpls memif32321/32321 enable
> mpls local-label add   eos via 4.4.4.4 memif32321/32321
> ip4-lookup-in-table 100
> bfd udp session add interface memif32321/32321 local-addr 4.4.4.5
> peer-addr 4.4.4.4 desired-min-tx 40 required-min-rx 40 detect-mult 3
>
>
>
> *VPP version: *20.09
>
>
>
> (gdb) thread apply all bt
>
> Thread 3 (Thread 0x7f7ac6ffe700 (LWP 422)):
> #0  0x7f7b67036ffe in vlib_worker_thread_barrier_check () at
> /home/supervisor/development/libvpp/src/vlib/threads.h:438
> #1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b46cf3240,
> is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
> #2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b46cf3240) at
> /home/supervisor/development/libvpp/src/vlib/main.c:2008
> #3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14540) at
> /home/supervisor/development/libvpp/src/vlib/threads.c:1862
> #4  0x7f7b668adc44 in clib_calljmp () at
> /home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f7ac6ffdec0 in ?? ()
> #6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
> (arg=0x7f7b41f14540) at
> /home/supervisor/development/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 2 (Thread 0x7f7ac77ff700 (LWP 421)):
> #0  0x7f7b67036fef in vlib_worker_thread_barrier_check () at
> /home/supervisor/development/libvpp/src/vlib/threads.h:437
> #1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b45fe8b80,
> is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
> #2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b45fe8b80) at
> /home/supervisor/development/libvpp/src/vlib/main.c:2008
> #3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14440) at
> /home/supervisor/development/libvpp/src/vlib/threads.c:1862
> #4  0x7f7b668adc44 in clib_calljmp () at
> /home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f7ac77feec0 in ?? ()
> #6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
> (arg=0x7f7b41f14440) at
> /home/supervisor/development/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 1 (Thread 0x7f7b739b7740 (LWP 226)):
> #0  0x7f7b681c952b in fib_node_list_remove (list=54, sibling=63) at
> /home/supervisor/development/libvpp/src/vnet/fib/fib_node_list.c:246
> #1  0x7f7b681c7695 in fib_node_child_remove
> (parent_type=FIB_NODE_TYPE_ADJ, parent_index=1, sibling_index=63)
> 
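
To make Neale's point above concrete: the label route in the quoted configuration names two next-hop actions at once (an adjacency over the memif and a lookup in table 100). A sketch of the two mutually exclusive forms, reusing the addresses and table id from that configuration; the label value is left as a placeholder and the lines are illustrative rather than re-tested:

mpls local-label add <label> eos via 4.4.4.5 memif32321/32321
mpls local-label add <label> eos via ip4-lookup-in-table 100

Either form on its own should parse; combining both actions in a single path is what prompted the question above.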

Re: [vpp-dev] Infinite loop in fib_walk_sync

2021-06-17 Thread Sudhir CR via lists.fd.io
Hi All,
I am resending the same again as I am not sure the former one reached
the forum.

Regards,
Sudhir.

On Thu, Jun 17, 2021 at 1:15 PM Sudhir CR via lists.fd.io  wrote:

> Hi All,
> We have been using VPP with our stack for the 6PE solution for some time.
> But when we recently enabled BFD in VPP we observed an infinite loop with
> the call stack below.
>
> Any help in resolving this issue would be appreciated.
>
> (gdb) thread apply all bt
>
> Thread 3 (Thread 0x7f6d27bfe700 (LWP 449)):
> #0  0x7f6dc79d4007 in vlib_worker_thread_barrier_check () at
> /home/supervisor/libvpp/src/vlib/threads.h:438
> #1  0x7f6dc79ce52e in vlib_main_or_worker_loop (vm=0x7f6da5f9b6c0,
> is_main=0) at /home/supervisor/libvpp/src/vlib/main.c:1788
> #2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9b6c0) at
> /home/supervisor/libvpp/src/vlib/main.c:2008
> #3  0x7f6dc7a2592a in vlib_worker_thread_fn (arg=0x7f6da3593180) at
> /home/supervisor/libvpp/src/vlib/threads.c:1862
> #4  0x7f6dc724bc34 in clib_calljmp () at
> /home/supervisor/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f6d27bfdec0 in ?? ()
> #6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
> (arg=0x7f6da3593180) at /home/supervisor/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 2 (Thread 0x7f6d283ff700 (LWP 448)):
> #0  0x7f6dc79d3ffe in vlib_worker_thread_barrier_check () at
> /home/supervisor/libvpp/src/vlib/threads.h:438
> #1  0x7f6dc79ce52e in vlib_main_or_worker_loop (vm=0x7f6da5f9a200,
> is_main=0) at /home/supervisor/libvpp/src/vlib/main.c:1788
> #2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9a200) at
> /home/supervisor/libvpp/src/vlib/main.c:2008
> #3  0x7f6dc7a2592a in vlib_worker_thread_fn (arg=0x7f6da3593080) at
> /home/supervisor/libvpp/src/vlib/threads.c:1862
> #4  0x7f6dc724bc34 in clib_calljmp () at
> /home/supervisor/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f6d283feec0 in ?? ()
> #6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
> (arg=0x7f6da3593080) at /home/supervisor/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 1 (Thread 0x7f6dd47c2240 (LWP 226)):
> #0  0x7f6dc723c2dc in hash_header (v=0x7f6da6870e18) at
> /home/supervisor/libvpp/src/vppinfra/hash.h:113
> #1  0x7f6dc723d329 in get_pair (v=0x7f6da6870e18, i=55) at
> /home/supervisor/libvpp/src/vppinfra/hash.c:58
> #2  0x7f6dc723c372 in lookup (v=0x7f6da6870e18, key=140108524924744,
> op=GET, new_value=0x0, old_value=0x0)
> at /home/supervisor/libvpp/src/vppinfra/hash.c:557
> #3  0x7f6dc723c261 in _hash_get (v=0x7f6da6870e18,
> key=140108524924744) at /home/supervisor/libvpp/src/vppinfra/hash.c:641
> #4  0x7f6dc8bbb5f4 in adj_nbr_find (nh_proto=FIB_PROTOCOL_IP4,
> link_type=VNET_LINK_MPLS, nh_addr=0x7f6da6866c30, sw_if_index=8)
> at /home/supervisor/libvpp/src/vnet/adj/adj_nbr.c:124
> #5  0x7f6dc8bbb661 in adj_nbr_add_or_lock (nh_proto=FIB_PROTOCOL_IP4,
> link_type=VNET_LINK_MPLS, nh_addr=0x7f6da6866c30, sw_if_index=8)
> at /home/supervisor/libvpp/src/vnet/adj/adj_nbr.c:243
> #6  0x7f6dc8b904db in fib_path_attached_next_hop_get_adj
> (path=0x7f6da6866c18, link=VNET_LINK_MPLS, dpo=0x7f6d8edbb168)
> at /home/supervisor/libvpp/src/vnet/fib/fib_path.c:674
> #7  0x7f6dc8b8ffb0 in fib_path_contribute_forwarding (path_index=58,
> fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS, dpo=0x7f6d8edbb168)
> at /home/supervisor/libvpp/src/vnet/fib/fib_path.c:2475
> #8  0x7f6dc8b98399 in fib_path_ext_stack (path_ext=0x7f6da42ab220,
> child_fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS,
> imp_null_fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS, nhs=0x7f6da8718a80) at
> /home/supervisor/libvpp/src/vnet/fib/fib_path_ext.c:241
> #9  0x7f6dc8b6e293 in fib_entry_src_collect_forwarding (pl_index=50,
> path_index=58, arg=0x7f6d8edbb380)
> at /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:476
> #10 0x7f6dc8b8926d in fib_path_list_walk (path_list_index=50,
> func=0x7f6dc8b6e100 , ctx=0x7f6d8edbb380)
> at /home/supervisor/libvpp/src/vnet/fib/fib_path_list.c:1408
> #11 0x7f6dc8b6da44 in fib_entry_src_mk_lb (fib_entry=0x7f6da6868730,
> esrc=0x7f6da75b11c0, fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS,
> dpo_lb=0x7f6da6868758) at
> /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:576
> #12 0x7f6dc8b6e6d3 in fib_entry_src_action_install
> (fib_entry=0x7f6da6868730, source=FIB_SOURCE_CLI)
> at /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:706
> #13 0x7f6dc8b6f5ff in fib_entry_src_action_reactivate
> (fib_entry=0x7f6da6868730, source=FIB_SOURCE_CLI)
>

[vpp-dev] Infinite loop in fib_walk_sync

2021-06-17 Thread Sudhir CR via lists.fd.io
Hi All,
We have been using VPP with our stack for the 6PE solution for some time.
But when we recently enabled BFD in VPP we observed an infinite loop with
the call stack below.

Any help in resolving this issue would be appreciated.

(gdb) thread apply all bt

Thread 3 (Thread 0x7f6d27bfe700 (LWP 449)):
#0  0x7f6dc79d4007 in vlib_worker_thread_barrier_check () at
/home/supervisor/libvpp/src/vlib/threads.h:438
#1  0x7f6dc79ce52e in vlib_main_or_worker_loop (vm=0x7f6da5f9b6c0,
is_main=0) at /home/supervisor/libvpp/src/vlib/main.c:1788
#2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9b6c0) at
/home/supervisor/libvpp/src/vlib/main.c:2008
#3  0x7f6dc7a2592a in vlib_worker_thread_fn (arg=0x7f6da3593180) at
/home/supervisor/libvpp/src/vlib/threads.c:1862
#4  0x7f6dc724bc34 in clib_calljmp () at
/home/supervisor/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f6d27bfdec0 in ?? ()
#6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f6da3593180) at /home/supervisor/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 0x7f6d283ff700 (LWP 448)):
#0  0x7f6dc79d3ffe in vlib_worker_thread_barrier_check () at
/home/supervisor/libvpp/src/vlib/threads.h:438
#1  0x7f6dc79ce52e in vlib_main_or_worker_loop (vm=0x7f6da5f9a200,
is_main=0) at /home/supervisor/libvpp/src/vlib/main.c:1788
#2  0x7f6dc79cdd47 in vlib_worker_loop (vm=0x7f6da5f9a200) at
/home/supervisor/libvpp/src/vlib/main.c:2008
#3  0x7f6dc7a2592a in vlib_worker_thread_fn (arg=0x7f6da3593080) at
/home/supervisor/libvpp/src/vlib/threads.c:1862
#4  0x7f6dc724bc34 in clib_calljmp () at
/home/supervisor/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f6d283feec0 in ?? ()
#6  0x7f6dc7a1dad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f6da3593080) at /home/supervisor/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 0x7f6dd47c2240 (LWP 226)):
#0  0x7f6dc723c2dc in hash_header (v=0x7f6da6870e18) at
/home/supervisor/libvpp/src/vppinfra/hash.h:113
#1  0x7f6dc723d329 in get_pair (v=0x7f6da6870e18, i=55) at
/home/supervisor/libvpp/src/vppinfra/hash.c:58
#2  0x7f6dc723c372 in lookup (v=0x7f6da6870e18, key=140108524924744,
op=GET, new_value=0x0, old_value=0x0)
at /home/supervisor/libvpp/src/vppinfra/hash.c:557
#3  0x7f6dc723c261 in _hash_get (v=0x7f6da6870e18, key=140108524924744)
at /home/supervisor/libvpp/src/vppinfra/hash.c:641
#4  0x7f6dc8bbb5f4 in adj_nbr_find (nh_proto=FIB_PROTOCOL_IP4,
link_type=VNET_LINK_MPLS, nh_addr=0x7f6da6866c30, sw_if_index=8)
at /home/supervisor/libvpp/src/vnet/adj/adj_nbr.c:124
#5  0x7f6dc8bbb661 in adj_nbr_add_or_lock (nh_proto=FIB_PROTOCOL_IP4,
link_type=VNET_LINK_MPLS, nh_addr=0x7f6da6866c30, sw_if_index=8)
at /home/supervisor/libvpp/src/vnet/adj/adj_nbr.c:243
#6  0x7f6dc8b904db in fib_path_attached_next_hop_get_adj
(path=0x7f6da6866c18, link=VNET_LINK_MPLS, dpo=0x7f6d8edbb168)
at /home/supervisor/libvpp/src/vnet/fib/fib_path.c:674
#7  0x7f6dc8b8ffb0 in fib_path_contribute_forwarding (path_index=58,
fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS, dpo=0x7f6d8edbb168)
at /home/supervisor/libvpp/src/vnet/fib/fib_path.c:2475
#8  0x7f6dc8b98399 in fib_path_ext_stack (path_ext=0x7f6da42ab220,
child_fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS,
imp_null_fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS, nhs=0x7f6da8718a80) at
/home/supervisor/libvpp/src/vnet/fib/fib_path_ext.c:241
#9  0x7f6dc8b6e293 in fib_entry_src_collect_forwarding (pl_index=50,
path_index=58, arg=0x7f6d8edbb380)
at /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:476
#10 0x7f6dc8b8926d in fib_path_list_walk (path_list_index=50,
func=0x7f6dc8b6e100 , ctx=0x7f6d8edbb380)
at /home/supervisor/libvpp/src/vnet/fib/fib_path_list.c:1408
#11 0x7f6dc8b6da44 in fib_entry_src_mk_lb (fib_entry=0x7f6da6868730,
esrc=0x7f6da75b11c0, fct=FIB_FORW_CHAIN_TYPE_MPLS_NON_EOS,
dpo_lb=0x7f6da6868758) at
/home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:576
#12 0x7f6dc8b6e6d3 in fib_entry_src_action_install
(fib_entry=0x7f6da6868730, source=FIB_SOURCE_CLI)
at /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:706
#13 0x7f6dc8b6f5ff in fib_entry_src_action_reactivate
(fib_entry=0x7f6da6868730, source=FIB_SOURCE_CLI)
at /home/supervisor/libvpp/src/vnet/fib/fib_entry_src.c:1222
#14 0x7f6dc8b6c5c2 in fib_entry_back_walk_notify (node=0x7f6da6868730,
ctx=0x7f6d8edbb668)
at /home/supervisor/libvpp/src/vnet/fib/fib_entry.c:316
#15 0x7f6dc8b648c2 in fib_node_back_walk_one (ptr=0x7f6d8edbb688,
ctx=0x7f6d8edbb668)
at /home/supervisor/libvpp/src/vnet/fib/fib_node.c:161
#16 0x7f6dc8b4f36a in fib_walk_advance (fwi=1) at
/home/supervisor/libvpp/src/vnet/fib/fib_walk.c:368
#17 0x7f6dc8b4ff00 in* fib_walk_sync *(parent_type=FIB_NODE_TYPE_PATH_LIST,
parent_index=50, ctx=0x7f6d8edbb828)

[vpp-dev] vpp hangs with bfd configuration

2021-06-10 Thread Sudhir CR via lists.fd.io
Hi All,
While trying to establish a BFD session between two containers, VPP went into
an infinite loop and hung in one of the containers while processing
"adj_bfd_notify", and this issue is reproducible every time with the below
topology and configuration.

Any help in fixing the issue would be appreciated.

Topology:

  Container1 (memif32321/32321) ----- (memif32321/32321) Container2

Configuration:
Container1

set interface ip address memif32321/32321 4.4.4.4/24
ip table add 100
ip route add 4.4.4.0/24 table 100 via 4.4.4.5 memif32321/32321 out-labels

ip route add 4.4.4.5/32 table 100 via 4.4.4.5 memif32321/32321 out-labels


set interface mpls memif32321/32321 enable
mpls local-label add  eos via 4.4.4.5 memif32321/32321
ip4-lookup-in-table 100

bfd udp session add interface memif32321/32321 local-addr 4.4.4.4 peer-addr
4.4.4.5 desired-min-tx 40 required-min-rx 40 detect-mult 3

Container2

set interface ip address memif32321/32321 4.4.4.5/24
ip table add 100
ip route add 4.4.4.0/24 table 100 via 4.4.4.4 memif32321/32321 out-labels

ip route add 4.4.4.4/32 table 100 via 4.4.4.4 memif32321/32321 out-labels

set interface mpls memif32321/32321 enable
mpls local-label add   eos via 4.4.4.4 memif32321/32321
ip4-lookup-in-table 100
bfd udp session add interface memif32321/32321 local-addr 4.4.4.5 peer-addr
4.4.4.4 desired-min-tx 40 required-min-rx 40 detect-mult 3

VPP version: 20.09

(gdb) thread apply all bt

Thread 3 (Thread 0x7f7ac6ffe700 (LWP 422)):
#0  0x7f7b67036ffe in vlib_worker_thread_barrier_check () at
/home/supervisor/development/libvpp/src/vlib/threads.h:438
#1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b46cf3240,
is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
#2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b46cf3240) at
/home/supervisor/development/libvpp/src/vlib/main.c:2008
#3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14540) at
/home/supervisor/development/libvpp/src/vlib/threads.c:1862
#4  0x7f7b668adc44 in clib_calljmp () at
/home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f7ac6ffdec0 in ?? ()
#6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f7b41f14540) at
/home/supervisor/development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 0x7f7ac77ff700 (LWP 421)):
#0  0x7f7b67036fef in vlib_worker_thread_barrier_check () at
/home/supervisor/development/libvpp/src/vlib/threads.h:437
#1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b45fe8b80,
is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
#2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b45fe8b80) at
/home/supervisor/development/libvpp/src/vlib/main.c:2008
#3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14440) at
/home/supervisor/development/libvpp/src/vlib/threads.c:1862
#4  0x7f7b668adc44 in clib_calljmp () at
/home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f7ac77feec0 in ?? ()
#6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f7b41f14440) at
/home/supervisor/development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 0x7f7b739b7740 (LWP 226)):
#0  0x7f7b681c952b in fib_node_list_remove (list=54, sibling=63) at
/home/supervisor/development/libvpp/src/vnet/fib/fib_node_list.c:246
#1  0x7f7b681c7695 in fib_node_child_remove
(parent_type=FIB_NODE_TYPE_ADJ, parent_index=1, sibling_index=63)
at /home/supervisor/development/libvpp/src/vnet/fib/fib_node.c:131
#2  0x7f7b681b2395 in fib_walk_destroy (fwi=2) at
/home/supervisor/development/libvpp/src/vnet/fib/fib_walk.c:262
#3  0x7f7b681b2f13 in fib_walk_sync (parent_type=FIB_NODE_TYPE_ADJ,
parent_index=1, ctx=0x7f7b2e08dc90)
at /home/supervisor/development/libvpp/src/vnet/fib/fib_walk.c:818
#4  0x7f7b6821ed4d in adj_nbr_update_rewrite_internal
(adj=0x7f7b46e08c80, adj_next_index=IP_LOOKUP_NEXT_REWRITE, this_node=426,
next_node=682, rewrite=0x7f7b4a5c4b40
"z\001\277d\004\004zP\245d\004\004\210G")
at /home/supervisor/development/libvpp/src/vnet/adj/adj_nbr.c:472
#5  0x7f7b6821eb99 in adj_nbr_update_rewrite (adj_index=2,
flags=ADJ_NBR_REWRITE_FLAG_COMPLETE,
rewrite=0x7f7b4a5c4b40 "z\001\277d\004\004zP\245d\004\004\210G") at
/home/supervisor/development/libvpp/src/vnet/adj/adj_nbr.c:335
#6  0x7f7b67c1e02d in ip_neighbor_mk_complete (ai=2, ipn=0x7f7b476dd0d8)
at
/home/supervisor/development/libvpp/src/vnet/ip-neighbor/ip_neighbor.c:337
#7  0x7f7b67c11683 in ip_neighbor_mk_complete_walk (ai=2,
ctx=0x7f7b476dd0d8)
at
/home/supervisor/development/libvpp/src/vnet/ip-neighbor/ip_neighbor.c:364
#8  0x7f7b68220063 in adj_nbr_walk_nh4 (sw_if_index=5,
addr=0x7f7b476dcf60, cb=0x7f7b67c11660 ,

[vpp-dev] vpp hangs with bfd configuration

2021-06-09 Thread Sudhir CR via lists.fd.io
Hi Team,
While trying to establish a BFD session between two containers, *VPP went
into an infinite loop and hung* in one of the containers while processing
"adj_bfd_notify", and this issue is reproducible every time with the below
topology and configuration.

Any help in fixing the issue would be appreciated.

*Topology:*

  Container1 (memif32321/32321) ----- (memif32321/32321) Container2

*Configuration:*

Container1

set interface ip address memif32321/32321 4.4.4.4/24
ip table add 100
ip route add 4.4.4.0/24 table 100 via 4.4.4.5 memif32321/32321 out-labels

ip route add 4.4.4.5/32 table 100 via 4.4.4.5 memif32321/32321 out-labels


set interface mpls memif32321/32321 enable
mpls local-label add  eos via 4.4.4.5 memif32321/32321
ip4-lookup-in-table 100

bfd udp session add interface memif32321/32321 local-addr 4.4.4.4 peer-addr
4.4.4.5 desired-min-tx 40 required-min-rx 40 detect-mult 3

Container2

set interface ip address memif32321/32321 4.4.4.5/24
ip table add 100
ip route add 4.4.4.0/24 table 100 via 4.4.4.4 memif32321/32321 out-labels

ip route add 4.4.4.4/32 table 100 via 4.4.4.4 memif32321/32321 out-labels

set interface mpls memif32321/32321 enable
mpls local-label add   eos via 4.4.4.4 memif32321/32321
ip4-lookup-in-table 100
bfd udp session add interface memif32321/32321 local-addr 4.4.4.5 peer-addr
4.4.4.4 desired-min-tx 40 required-min-rx 40 detect-mult 3

*VPP version: *20.09

*Backtrace in issue state:*

(gdb) thread apply all bt

Thread 3 (Thread 0x7f7ac6ffe700 (LWP 422)):
#0  0x7f7b67036ffe in vlib_worker_thread_barrier_check () at
/home/supervisor/development/libvpp/src/vlib/threads.h:438
#1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b46cf3240,
is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
#2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b46cf3240) at
/home/supervisor/development/libvpp/src/vlib/main.c:2008
#3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14540) at
/home/supervisor/development/libvpp/src/vlib/threads.c:1862
#4  0x7f7b668adc44 in clib_calljmp () at
/home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f7ac6ffdec0 in ?? ()
#6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f7b41f14540) at
/home/supervisor/development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 0x7f7ac77ff700 (LWP 421)):
#0  0x7f7b67036fef in vlib_worker_thread_barrier_check () at
/home/supervisor/development/libvpp/src/vlib/threads.h:437
#1  0x7f7b6703152e in vlib_main_or_worker_loop (vm=0x7f7b45fe8b80,
is_main=0) at /home/supervisor/development/libvpp/src/vlib/main.c:1788
#2  0x7f7b67030d47 in vlib_worker_loop (vm=0x7f7b45fe8b80) at
/home/supervisor/development/libvpp/src/vlib/main.c:2008
#3  0x7f7b6708892a in vlib_worker_thread_fn (arg=0x7f7b41f14440) at
/home/supervisor/development/libvpp/src/vlib/threads.c:1862
#4  0x7f7b668adc44 in clib_calljmp () at
/home/supervisor/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f7ac77feec0 in ?? ()
#6  0x7f7b67080ad3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f7b41f14440) at
/home/supervisor/development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 0x7f7b739b7740 (LWP 226)):
#0  0x7f7b681c952b in fib_node_list_remove (list=54, sibling=63) at
/home/supervisor/development/libvpp/src/vnet/fib/fib_node_list.c:246
#1  0x7f7b681c7695 in fib_node_child_remove
(parent_type=FIB_NODE_TYPE_ADJ, parent_index=1, sibling_index=63)
at /home/supervisor/development/libvpp/src/vnet/fib/fib_node.c:131
#2  0x7f7b681b2395 in fib_walk_destroy (fwi=2) at
/home/supervisor/development/libvpp/src/vnet/fib/fib_walk.c:262
#3  0x7f7b681b2f13 in fib_walk_sync (parent_type=FIB_NODE_TYPE_ADJ,
parent_index=1, ctx=0x7f7b2e08dc90)
at /home/supervisor/development/libvpp/src/vnet/fib/fib_walk.c:818
#4  0x7f7b6821ed4d in adj_nbr_update_rewrite_internal
(adj=0x7f7b46e08c80, adj_next_index=IP_LOOKUP_NEXT_REWRITE, this_node=426,
next_node=682, rewrite=0x7f7b4a5c4b40
"z\001\277d\004\004zP\245d\004\004\210G")
at /home/supervisor/development/libvpp/src/vnet/adj/adj_nbr.c:472
#5  0x7f7b6821eb99 in adj_nbr_update_rewrite (adj_index=2,
flags=ADJ_NBR_REWRITE_FLAG_COMPLETE,
rewrite=0x7f7b4a5c4b40 "z\001\277d\004\004zP\245d\004\004\210G") at
/home/supervisor/development/libvpp/src/vnet/adj/adj_nbr.c:335
#6  0x7f7b67c1e02d in ip_neighbor_mk_complete (ai=2, ipn=0x7f7b476dd0d8)
at
/home/supervisor/development/libvpp/src/vnet/ip-neighbor/ip_neighbor.c:337
#7  0x7f7b67c11683 in ip_neighbor_mk_complete_walk (ai=2,
ctx=0x7f7b476dd0d8)
at
/home/supervisor/development/libvpp/src/vnet/ip-neighbor/ip_neighbor.c:364
#8  0x7f7b68220063 in adj_nbr_walk_nh4 (sw_if_index=5,

Re: [vpp-dev] observing issue with LACP port selection logic

2021-05-12 Thread Sudhir CR via lists.fd.io
Hi Steven,

Thanks for the patch.
I verified the patch and it is working fine.
After applying the patch, ports that have a different remote key than the
first member are not added to the bond interface.

DBGvpp# show bond
interface name   sw_if_index  mode  load balance  active members
members
BondEthernet07lacp  l23   2  3
DBGvpp#

DBGvpp# show bond details
BondEthernet0
  mode: lacp
  load balance: l23
  number of active members: 2
memif2/2
memif3/3
  number of members: 3
memif2/2
memif3/3
memif4/4
  device instance: 0
  interface id: 0
  sw_if_index: 7
  hw_if_index: 7
DBGvpp#

DBGvpp# show lacp
actor state
 partner state
interface namesw_if_index  bond interface
exp/def/dis/col/syn/agg/tim/act  exp/def/dis/col/syn/agg/tim/act
memif2/2  8BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0007,00ff,0001),
(,7a-37-f7-00-0c-02,000a,00ff,0001)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif3/3  9BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0007,00ff,0002),
(,7a-37-f7-00-0c-02,000a,00ff,0002)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif4/4  10   BondEthernet0  0   0   0   0   0
  1   1   10   0   0   0   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0007,00ff,0003),
(,7a-37-f7-00-0c-04,000b,00ff,0001)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state: DETACHED, PTX-state:
PERIODIC_TX
DBGvpp#

DBGvpp# show errors
   CountNode  Reason
11   lacp-input   good lacp packets --
consumed
58   lacp-input   good lacp packets --
cache hit
*72   lacp-input   Bad key*
 7   lldp-input   good lldp packets
(processed)
   148   bond-input   no error
   148   bond-input   pass through (CDP, LLDP,
slow protocols)
 6null-node   blackholed packets
DBGvpp#
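
For reference, a generic way to pull a Gerrit change such as the one above into a local tree for testing (standard Gerrit refs/changes convention; the trailing patchset number is a placeholder, check the change page for the current one):

# fetch the change and apply it on top of the local branch
git fetch https://gerrit.fd.io/r/vpp refs/changes/92/32292/<patchset>
git cherry-pick FETCH_HEAD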

Thanks and Regards, Sudhir

On Thu, May 13, 2021 at 3:32 AM Steven Luong (sluong) 
wrote:

> Sudhir,
>
>
>
> It is an error topology/configuration we don’t currently handle. Please
> try this and report back
>
>
>
> https://gerrit.fd.io/r/c/vpp/+/32292
>
>
>
> The behavior is container-1 will form one bonding group with container-2.
> It is with either BondEthernet0 or BondEthernet1.
>
>
>
> Steven
>
>
>
> *From: * on behalf of "Sudhir CR via lists.fd.io"
> 
> *Reply-To: *"sud...@rtbrick.com" 
> *Date: *Tuesday, May 11, 2021 at 7:30 PM
> *To: *"vpp-dev@lists.fd.io" 
> *Subject: *[vpp-dev] observing issue with LACP port selection logic
>
>
>
> Hi all,
>
> I am configuring LACP between two containers.
>
> VPP version used: *20.09*
>
> The topology looks like below:
>
> In the above topology, since the memif-4/4 interface is not part of the
> same bond interface on both containers (different partner system id),
> memif-4/4 should not be marked as an active interface and attached to
> BondEthernet0 in container1, but it is being attached to BondEthernet0.
>
>
>
> Any help in fixing the issue would be appreciated.
>
>
>
> Please find configuration in container1 :
>
> DBGvpp# show bond
> interface name   sw_if_index  mode  load balance  active members
> members
> BondEthernet09lacp  l23   3  3
>
>
>
> DBGvpp# show bond details
> BondEthernet0
>   mode: lacp
>   load balance: l23
>   number of active members: 3
> memif2/2
> memif3/3
> memif4/4
>   number of members: 3
> memif2/2
> memif3/3
> memif4/4
>   device instance: 0
>   interface id: 0
>   sw_if_index: 9
>   hw_if_index: 9
>
>
>
> DBGvpp# show lacp
> actor state
>partner state
> interface namesw_if_index  bond interface
> exp/def/dis/col/syn/agg/tim/act  exp/def/dis/col/syn/agg/tim/act
> memif2/2  2BondEthernet0  0   0   1   1
> 1   1   1   10   0   1   1   1   1   1   1
>   LAG ID: [(,7a-67-1e-01-0c-02,0009,00ff,0001),
> (,7a-37-f7-00-0c-02,000f,00ff,0001)]
>   RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
> COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
> memif3/3   

[vpp-dev] observing issue with LACP port selection logic

2021-05-11 Thread Sudhir CR via lists.fd.io
Hi all,
I am configuring LACP between two containers.
VPP version used: *20.09*
The topology looks like below:
[image: image.png]
In the above topology, since the memif-4/4 interface is not part of the same
bond interface on both containers (different partner system id), memif-4/4
should not be marked as an active interface and attached to BondEthernet0 in
container1, but it is being attached to BondEthernet0.

Any help in fixing the issue would be appreciated.

Please find configuration in container1 :

DBGvpp# show bond
interface name   sw_if_index  mode  load balance  active members
members
BondEthernet09lacp  l23   3  3

DBGvpp# show bond details
BondEthernet0
  mode: lacp
  load balance: l23
  number of active members: 3
memif2/2
memif3/3
memif4/4
  number of members: 3
memif2/2
memif3/3
memif4/4
  device instance: 0
  interface id: 0
  sw_if_index: 9
  hw_if_index: 9

DBGvpp# show lacp
actor state
 partner state
interface namesw_if_index  bond interface
exp/def/dis/col/syn/agg/tim/act  exp/def/dis/col/syn/agg/tim/act
memif2/2  2BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0009,00ff,0001),
(,7a-37-f7-00-0c-02,000f,00ff,0001)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif3/3  3BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0009,00ff,0002),
(,7a-37-f7-00-0c-02,000f,00ff,0002)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif4/4  4BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-67-1e-01-0c-02,0009,00ff,0003),
(,7a-37-f7-00-0c-04,0010,00ff,0001)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
DBGvpp#

Please find configuration in container2 :
DBGvpp# show bond
interface name   sw_if_index  mode  load balance  active members
members
BondEthernet015   lacp  l23   2  2
BondEthernet116   lacp  l23   1  1
DBGvpp#
DBGvpp#

DBGvpp# show bond details
BondEthernet0
  mode: lacp
  load balance: l23
  number of active members: 2
memif2/2
memif3/3
  number of members: 2
memif2/2
memif3/3
  device instance: 0
  interface id: 0
  sw_if_index: 15
  hw_if_index: 15
BondEthernet1
  mode: lacp
  load balance: l23
  number of active members: 1
memif4/4
  number of members: 1
memif4/4
  device instance: 1
  interface id: 1
  sw_if_index: 16
  hw_if_index: 16

DBGvpp# show lacp
actor state
 partner state
interface namesw_if_index  bond interface
exp/def/dis/col/syn/agg/tim/act  exp/def/dis/col/syn/agg/tim/act
memif2/2  8BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-37-f7-00-0c-02,000f,00ff,0001),
(,7a-67-1e-01-0c-02,0009,00ff,0001)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif3/3  9BondEthernet0  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-37-f7-00-0c-02,000f,00ff,0002),
(,7a-67-1e-01-0c-02,0009,00ff,0002)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
memif4/4  10   BondEthernet1  0   0   1   1   1
  1   1   10   0   1   1   1   1   1   1
  LAG ID: [(,7a-37-f7-00-0c-04,0010,00ff,0001),
(,7a-67-1e-01-0c-02,0009,00ff,0003)]
  RX-state: CURRENT, TX-state: TRANSMIT, MUX-state:
COLLECTING_DISTRIBUTING, PTX-state: PERIODIC_TX
DBGvpp#


Thanks and Regards,
Sudhir

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#19373): https://lists.fd.io/g/vpp-dev/message/19373
Mute This Topic: https://lists.fd.io/mt/82763645/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] vpp main thread crashed at mspace_put

2021-03-25 Thread Sudhir CR via lists.fd.io
Hi All,

The segmentation fault is happening at memset in the code below.

#if CLIB_DEBUG > 0 && !defined(CLIB_SANITIZE_ADDR)
  /* Poison the object */
  {
    size_t psize = mspace_usable_size (object_header);
    memset (object_header, 0x13, psize);
  }
#endif

I am not sure how to proceed further to root-cause the issue.
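
A generic next step here is to reproduce the crash under plain gdb and inspect the object being poisoned. This is only a sketch; the frame number follows the backtrace quoted below (frame 31 is mspace_put there), so adjust it if yours differs:

sudo gdb --args ./vpp -c /etc/vpp/startup.conf
(gdb) run
(gdb) bt full
(gdb) frame 31
(gdb) info locals
(gdb) print object_header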


Thanks,

Sudhir



On Thu, Mar 25, 2021 at 5:31 PM Sudhir CR  wrote:

> Hi All,
> We have loaded our box with internet feed routes. Initially everything is
> good.
> But after *three hours* we observed a *VPP main thread crash* due to a
> *segmentation error* in the mspace_put function.
>
> #28 0x7f0f802c0793 in unix_signal_handler (signum=11, si=0x7f0f33c086b0, 
> uc=0x7f0f33c08580)
> at /development/libvpp/src/vlib/unix/main.c:127
> #29 <signal handler called>
> #30 __memset_avx2_erms () at 
> ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:145
> #31 0x7f0f7fa67735 in mspace_put (msp=0x130044010, p_arg=0x130089a48) at 
> /development/libvpp/src/vppinfra/dlmalloc.c:4316
>
> We are using *version 20.09* and the complete *backtrace* is pasted below.
> Any help in fixing the issue would be appreciated.
>
> Thread 3 (Thread 0x7f0eccbfe700 (LWP 476)):
> #0  0x7f0f8023cff7 in vlib_worker_thread_barrier_check () at 
> /development/libvpp/src/vlib/threads.h:438
> #1  0x7f0f8023751e in vlib_main_or_worker_loop (vm=0x7f0f652dd300, 
> is_main=0) at /development/libvpp/src/vlib/main.c:1788
> #2  0x7f0f80236d37 in vlib_worker_loop (vm=0x7f0f652dd300) at 
> /development/libvpp/src/vlib/main.c:2008
> #3  0x7f0f8028e91a in vlib_worker_thread_fn (arg=0x7f0f4a974940) at 
> /development/libvpp/src/vlib/threads.c:1862
> #4  0x7f0f7fab5c34 in clib_calljmp () at 
> /development/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f0eccbfdec0 in ?? ()   
> #6  0x7f0f80286ac3 in vlib_worker_thread_bootstrap_fn 
> (arg=0x7f0f4a974940) at /development/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 2 (Thread 0x7f0ecd3ff700 (LWP 475)):
> #0  0x7f0f8023cfec in vlib_worker_thread_barrier_check () at 
> /development/libvpp/src/vlib/threads.h:438
> #1  0x7f0f8023751e in vlib_main_or_worker_loop (vm=0x7f0f64d99ec0, 
> is_main=0) at /development/libvpp/src/vlib/main.c:1788
> #2  0x7f0f80236d37 in vlib_worker_loop (vm=0x7f0f64d99ec0) at 
> /development/libvpp/src/vlib/main.c:2008
> #3  0x7f0f8028e91a in vlib_worker_thread_fn (arg=0x7f0f4a974840) at 
> /development/libvpp/src/vlib/threads.c:1862
> #4  0x7f0f7fab5c34 in clib_calljmp () at 
> /development/libvpp/src/vppinfra/longjmp.S:123
> #5  0x7f0ecd3feec0 in ?? ()
> #6  0x7f0f80286ac3 in vlib_worker_thread_bootstrap_fn 
> (arg=0x7f0f4a974840) at /development/libvpp/src/vlib/threads.c:585
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Thread 1 (Thread 0x7f0f8d076d00 (LWP 280)):
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x7f0f8c0d3921 in __GI_abort () at abort.c:79
> #2  0x7f0f81cbd253 in os_panic () at 
> /development/libvpp/src/vpp/vnet/main.c:572
> #3  0x7f0f7fa91aa9 in debugger () at 
> /development/libvpp/src/vppinfra/error.c:84
> #4  0x7f0f7fa91827 in _clib_error (how_to_die=2, function_name=0x0, 
> line_number=0, fmt=0x7f0f7fb613df "%s:%d (%s) assertion `%s' fails")
> at /development/libvpp/src/vppinfra/error.c:143
> #5  0x7f0f7fa98e61 in _vec_resize_inline (v=0x7f0f4a69fa90, 
> length_increment=16, data_bytes=16, header_bytes=0, data_align=1,
> numa_id=255) at /development/libvpp/src/vppinfra/vec.h:154
> #6  0x7f0f7fa98b8b in va_format (s=0x7f0f4a69fa90 "", fmt=0x7f0f802dbf1b 
> "received signal %U, PC %U", va=0x7f0f33c065f0)
> at /development/libvpp/src/vppinfra/format.c:403
> #7  0x7f0f7faa03c6 in format (s=0x7f0f4a69fa90 "", fmt=0x7f0f802dbf1b 
> "received signal %U, PC %U")
> at /development/libvpp/src/vppinfra/format.c:428
> #8  0x7f0f802c0793 in unix_signal_handler (signum=6, si=0x7f0f33c068b0, 
> uc=0x7f0f33c06780)
> at /development/libvpp/src/vlib/unix/main.c:127
> #9  <signal handler called>
> #10 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #11 0x7f0f8c0d3921 in __GI_abort () at abort.c:79
> #12 0x7f0f81cbd253 in os_panic () at 
> /development/libvpp/src/vpp/vnet/main.c:572
> #13 0x7f0f7fa91aa9 in debugger () at 
> /development/libvpp/src/vppinfra/error.c:84
> #14 0x7f0f7fa91827 in _clib_error (how_to_die=2, function_name=0x0, 
> line_number=0, fmt=0x7f0f7fb613df "%s:%d (%s) assertion `%s' fails")
> at /development/libvpp/src/vppinfra/error.c:143
> #15 0x7f0f7fa98e61 in 

[vpp-dev] vpp main thread crashed at mspace_put

2021-03-25 Thread Sudhir CR via lists.fd.io
Hi All,
We have loaded our box with internet feed routes. Initially everything is
good.
But after *three hours* we observed a *VPP main thread crash* due to a
*segmentation error* in the mspace_put function.

#28 0x7f0f802c0793 in unix_signal_handler (signum=11,
si=0x7f0f33c086b0, uc=0x7f0f33c08580)
at /development/libvpp/src/vlib/unix/main.c:127
#29 <signal handler called>
#30 __memset_avx2_erms () at
../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:145
#31 0x7f0f7fa67735 in mspace_put (msp=0x130044010,
p_arg=0x130089a48) at /development/libvpp/src/vppinfra/dlmalloc.c:4316

We are using *version 20.09* and the complete *backtrace* is pasted
below. Any help in fixing the issue would be appreciated.

Thread 3 (Thread 0x7f0eccbfe700 (LWP 476)):
#0  0x7f0f8023cff7 in vlib_worker_thread_barrier_check () at
/development/libvpp/src/vlib/threads.h:438
#1  0x7f0f8023751e in vlib_main_or_worker_loop (vm=0x7f0f652dd300,
is_main=0) at /development/libvpp/src/vlib/main.c:1788
#2  0x7f0f80236d37 in vlib_worker_loop (vm=0x7f0f652dd300) at
/development/libvpp/src/vlib/main.c:2008
#3  0x7f0f8028e91a in vlib_worker_thread_fn (arg=0x7f0f4a974940)
at /development/libvpp/src/vlib/threads.c:1862
#4  0x7f0f7fab5c34 in clib_calljmp () at
/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f0eccbfdec0 in ?? ()

#6  0x7f0f80286ac3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f0f4a974940) at /development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 0x7f0ecd3ff700 (LWP 475)):
#0  0x7f0f8023cfec in vlib_worker_thread_barrier_check () at
/development/libvpp/src/vlib/threads.h:438
#1  0x7f0f8023751e in vlib_main_or_worker_loop (vm=0x7f0f64d99ec0,
is_main=0) at /development/libvpp/src/vlib/main.c:1788
#2  0x7f0f80236d37 in vlib_worker_loop (vm=0x7f0f64d99ec0) at
/development/libvpp/src/vlib/main.c:2008
#3  0x7f0f8028e91a in vlib_worker_thread_fn (arg=0x7f0f4a974840)
at /development/libvpp/src/vlib/threads.c:1862
#4  0x7f0f7fab5c34 in clib_calljmp () at
/development/libvpp/src/vppinfra/longjmp.S:123
#5  0x7f0ecd3feec0 in ?? ()
#6  0x7f0f80286ac3 in vlib_worker_thread_bootstrap_fn
(arg=0x7f0f4a974840) at /development/libvpp/src/vlib/threads.c:585
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 0x7f0f8d076d00 (LWP 280)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7f0f8c0d3921 in __GI_abort () at abort.c:79
#2  0x7f0f81cbd253 in os_panic () at
/development/libvpp/src/vpp/vnet/main.c:572
#3  0x7f0f7fa91aa9 in debugger () at
/development/libvpp/src/vppinfra/error.c:84
#4  0x7f0f7fa91827 in _clib_error (how_to_die=2,
function_name=0x0, line_number=0, fmt=0x7f0f7fb613df "%s:%d (%s)
assertion `%s' fails")
at /development/libvpp/src/vppinfra/error.c:143
#5  0x7f0f7fa98e61 in _vec_resize_inline (v=0x7f0f4a69fa90,
length_increment=16, data_bytes=16, header_bytes=0, data_align=1,
numa_id=255) at /development/libvpp/src/vppinfra/vec.h:154
#6  0x7f0f7fa98b8b in va_format (s=0x7f0f4a69fa90 "",
fmt=0x7f0f802dbf1b "received signal %U, PC %U", va=0x7f0f33c065f0)
at /development/libvpp/src/vppinfra/format.c:403
#7  0x7f0f7faa03c6 in format (s=0x7f0f4a69fa90 "",
fmt=0x7f0f802dbf1b "received signal %U, PC %U")
at /development/libvpp/src/vppinfra/format.c:428
#8  0x7f0f802c0793 in unix_signal_handler (signum=6,
si=0x7f0f33c068b0, uc=0x7f0f33c06780)
at /development/libvpp/src/vlib/unix/main.c:127
#9  <signal handler called>
#10 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#11 0x7f0f8c0d3921 in __GI_abort () at abort.c:79
#12 0x7f0f81cbd253 in os_panic () at
/development/libvpp/src/vpp/vnet/main.c:572
#13 0x7f0f7fa91aa9 in debugger () at
/development/libvpp/src/vppinfra/error.c:84
#14 0x7f0f7fa91827 in _clib_error (how_to_die=2,
function_name=0x0, line_number=0, fmt=0x7f0f7fb613df "%s:%d (%s)
assertion `%s' fails")
at /development/libvpp/src/vppinfra/error.c:143
#15 0x7f0f7fa98e61 in _vec_resize_inline (v=0x7f0f4a69fa90,
length_increment=16, data_bytes=16, header_bytes=0, data_align=1,
numa_id=255) at /development/libvpp/src/vppinfra/vec.h:154
#16 0x7f0f7fa98b8b in va_format (s=0x7f0f4a69fa90 "",
fmt=0x7f0f802dbf1b "received signal %U, PC %U", va=0x7f0f33c074f0)
at /development/libvpp/src/vppinfra/format.c:403
#17 0x7f0f7faa03c6 in format (s=0x7f0f4a69fa90 "",
fmt=0x7f0f802dbf1b "received signal %U, PC %U")
at /development/libvpp/src/vppinfra/format.c:428
#18 0x7f0f802c0793 in unix_signal_handler (signum=6,
si=0x7f0f33c077b0, uc=0x7f0f33c07680)
at /development/libvpp/src/vlib/unix/main.c:127
#19 <signal handler called>
#20 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#21 0x7f0f8c0d3921 in __GI_abort () at abort.c:79
#22 0x7f0f81cbd253 in os_panic () at