Hi,

I've run into the same issue with different, but likewise external, code.

The calling sequence in my case looks very similar to the one from Hugo.
I'm also getting an invalid pointer back from vlib_get_frame_to_node.
It is crashing here:
https://github.com/travelping/vpp/blob/feature/master/upf%2Btdf/src/plugins/upf/upf_pfcp_server.c#L121

@Hugo: have you found the root cause of your problem?

Regards
Andreas

On Wed, Nov 28, 2018 at 12:53 PM Dave Barach via Lists.Fd.Io
<dbarach=cisco....@lists.fd.io> wrote:

> None of the routine names in the backtrace exist in master/latest (it's
> your code), so it will be challenging for the community to help you.
>
>
>
> See if you can repro the problem with a TAG=vpp_debug image (i.e. “make
> build”, not “make build-release”). If you’re lucky, one of the numerous
> ASSERTs will catch the problem early.
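>
> (The reason the debug image helps: ASSERT() is, roughly speaking, compiled
> out of release images and only enforced when CLIB_DEBUG is set. The sketch
> below shows the idea; it is not the exact vppinfra definition.)
>
>     /* Rough idea only, not the real vppinfra macro: in a release image
>        ASSERT() expands to nothing, in a debug image it aborts as soon
>        as the condition fails. */
>     #if CLIB_DEBUG > 0
>     #define ASSERT(truth) do { if (!(truth)) os_panic (); } while (0)
>     #else
>     #define ASSERT(truth)
>     #endif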
>
>
>
> vlib_get_frame_to_node(...) is not new code; it’s used all over the place,
> and it needs “help” to fail as shown below.
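>
> For reference, the usual pattern looks roughly like this (a minimal sketch
> with hypothetical names, not taken from your code):
>
>     #include <vlib/vlib.h>
>
>     static void
>     example_enqueue_one (vlib_main_t * vm, u32 next_node_index, u32 bi)
>     {
>         /* Allocate a fresh frame destined for the target node. */
>         vlib_frame_t *f = vlib_get_frame_to_node (vm, next_node_index);
>         u32 *to_next = vlib_frame_vector_args (f);
>
>         /* Append one buffer index and account for it. */
>         to_next[f->n_vectors] = bi;
>         f->n_vectors += 1;
>
>         /* Hand the filled frame to the target node. */
>         vlib_put_frame_to_node (vm, next_node_index, f);
>     }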
>
>
>
> D.
>
>
>
> *From:* vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> *On Behalf Of *Hugo
> Garza
> *Sent:* Tuesday, November 27, 2018 7:39 PM
> *To:* vpp-dev@lists.fd.io
> *Subject:* [vpp-dev] SIGSEGV after calling vlib_get_frame_to_node
>
>
>
> Hi vpp-dev,
>
> I'm seeing a crash when I enable our application with multiple workers:
> Nov 26 14:29:32  vnet[64035]: received signal SIGSEGV, PC 0x7f6979a12ce8,
> faulting address 0x7fa6cd0bd444
> Nov 26 14:29:32  vnet[64035]: #0  0x00007f6a812743d8 0x7f6a812743d8
> Nov 26 14:29:32  vnet[64035]: #1  0x00007f6a80bc56d0 0x7f6a80bc56d0
> Nov 26 14:29:32  vnet[64035]: #2  0x00007f6979a12ce8
> vlib_frame_vector_args + 0x10
> Nov 26 14:29:32  vnet[64035]: #3  0x00007f6979a16a2c
> tcpo_enqueue_to_output_i + 0xf4
> Nov 26 14:29:32  vnet[64035]: #4  0x00007f6979a16b23
> tcpo_enqueue_to_output + 0x25
> Nov 26 14:29:32  vnet[64035]: #5  0x00007f6979a33fba send_packets + 0x7f2
> Nov 26 14:29:32  vnet[64035]: #6  0x00007f6979a346f8 connection_tx + 0x17e
> Nov 26 14:29:32  vnet[64035]: #7  0x00007f6979a34f08 tcpo_dispatch_node_fn
> + 0x7fa
> Nov 26 14:29:32  vnet[64035]: #8  0x00007f6a81248cb6 vlib_worker_loop +
> 0x6a6
> Nov 26 14:29:32  vnet[64035]: #9  0x00007f6a8094f694 0x7f6a8094f694
>
> Running on CentOS 7.4 with kernel 3.10.0-693.el7.x86_64
> VPP
> Version:                  v18.10-13~g00adcce~b60
> Compiled by:              root
> Compile host:             b0f32e97e93a
> Compile date:             Mon Nov 26 09:09:42 UTC 2018
> Compile location:         /w/workspace/vpp-merge-1810-centos7
> Compiler:                 GCC 7.3.1 20180303 (Red Hat 7.3.1-5)
> Current PID:              9612
>
> This is running on a 2-socket Cisco server with Intel Xeon E5-2697Av4 @
> 2.60GHz CPUs and 2 Intel X520 NICs. A T-Rex traffic generator is hooked
> up on the other end to provide data at about 5 Gbps per NIC:
> ./t-rex-64 --astf -f astf/nginx_wget.py -c 14 -m 40000 -d 3000
>
> startup.conf
> unix {
>   nodaemon
>   interactive
>   log /opt/tcpo/logs/vpp.log
>   full-coredump
>   cli-no-banner
>   #startup-config /opt/tcpo/conf/local.conf
>   cli-listen /run/vpp/cli.sock
> }
> api-trace {
>   on
> }
> heapsize 3G
> cpu {
>   main-core 1
>   corelist-workers 2-5
> }
> tcpo {
>   runtime-config /opt/tcpo/conf/runtime.conf
>   session-pool-size 1024000
> }
> dpdk {
>   dev 0000:86:00.0 {
>     num-rx-queues 1
>   }
>   dev 0000:86:00.1 {
>     num-rx-queues 1
>   }
>   dev 0000:84:00.0 {
>     num-rx-queues 1
>   }
>   dev 0000:84:00.1 {
>     num-rx-queues 1
>   }
>   num-mbufs 1024000
>   socket-mem 4096,4096
> }
> plugin_path /usr/lib/vpp_plugins
> api-segment {
>   gid vpp
> }
>
> Here's the function where the SIGSEGV is happening:
>
>
>
> static void enqueue_to_output_i(tcpo_worker_ctx_t * wrk, u32 bi, u8 flush)
> {
>     u32 *to_next, next_index;
>     vlib_frame_t *f;
>
>     TRACE_FUNC_VAR(bi);
>
>     next_index = tcpo_output_node.index;
>
>     /* Get frame to output node */
>     f = wrk->tx_frame;
>     if (!f) {
>         f = vlib_get_frame_to_node(wrk->vm, next_index);
>         ASSERT (clib_mem_is_heap_object (f));
>         wrk->tx_frame = f;
>     }
>     ASSERT (clib_mem_is_heap_object (f));
>
>     to_next = vlib_frame_vector_args(f);
>     to_next[f->n_vectors] = bi;
>     f->n_vectors += 1;
>
>     if (flush || f->n_vectors == VLIB_FRAME_SIZE) {
>         TRACE_FUNC_VAR2(flush, f->n_vectors);
>         vlib_put_frame_to_node(wrk->vm, next_index, f);
>         wrk->tx_frame = 0;
>     }
> }
>
>
>
>
> I've observed that after a few Gbps of traffic have gone through, the
> pointer *f* returned by *vlib_get_frame_to_node* points to a chunk of
> memory that is not a valid heap object, as confirmed by the ASSERT
> (clib_mem_is_heap_object (f)) that I added right after the call.
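>
> With multiple workers enabled, one thing that may be worth ruling out is
> the frame being requested through a vlib_main_t that does not belong to
> the calling worker thread. A rough sketch of an extra check (it assumes
> vlib_get_main() returns the calling thread's vlib_main_t):
>
>     /* Sketch only: verify the worker context's vlib_main_t matches the
>        thread actually executing this code before allocating the frame. */
>     ASSERT (wrk->vm == vlib_get_main ());
>     f = vlib_get_frame_to_node (wrk->vm, next_index);
>     ASSERT (clib_mem_is_heap_object (f));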
>
> I'm not sure how to make further progress on tracking down this issue;
> any help or advice would be much appreciated.
>
> Thanks,
> Hugo
>


-- 

Andreas Schultz

Principal Engineer

t: +49 391 819099-224

------------------------------- enabling your networks -----------------------------

Travelping GmbH

Roentgenstraße 13

39108 Magdeburg

Germany

t: +49 391 819099-0

f: +49 391 819099-299

e: i...@travelping.com

w: https://www.travelping.com/

Company registration: Amtsgericht Stendal  Reg. No.: HRB 10578
Geschaeftsfuehrer: Holger Winkelmann VAT ID: DE236673780
