Dear Andrew,

Sorry for the delay. I pulled the latest revision of master (commit e695cb4dbdb6f9424ac5a567799e67f791fad328), and the segfault did not occur with the same environment and test scenario. I will try to reproduce the potential bug by running the test for a longer duration and with a more aggressive scenario.

Regards,
Khers

On Wed, Oct 25, 2017 at 1:45 PM, Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:

> Dear Khers,
>
> okay, cool! When testing the debug image, you could save the full dump
> and the .debs for all the artefacts so just in case I could grab the
> entire set of info and was able to look at it in my environment.
>
> Meantime, I had an idea for another potential failure mode, whereby
> the session would get checked while there is a session being freed,
> potentially resulting in a reallocation of the free bitmap in the
> pool.
>
> So before the reproduction in the debug build, give a shot to this
> one-line change
> in the release build and see if you still can reproduce the crash with it:
>
> --- a/src/plugins/acl/fa_node.c
> +++ b/src/plugins/acl/fa_node.c
> @@ -609,6 +609,8 @@ acl_fa_verify_init_sessions (acl_main_t * am)
>      for (wk = 0; wk < vec_len (am->per_worker_data); wk++) {
>        acl_fa_per_worker_data_t *pw = &am->per_worker_data[wk];
>        pool_alloc_aligned(pw->fa_sessions_pool,
> am->fa_conn_table_max_entries, CLIB_CACHE_LINE_BYTES);
> +      /* preallocate the free bitmap */
> +      clib_bitmap_validate(pool_header(pw->fa_sessions_pool)->
> free_bitmap,
> am->fa_conn_table_max_entries);
>      }
>
> --a
>
> On 10/24/17, khers <s3m2e1.6s...@gmail.com> wrote:
> > Dear Andrew
> >
> > I used latest version of master branch, I will replay the test with debug
> > build to make more debug info ASAP.
> > Vpp is running on Xeon E5-2600 series.
> > I did the tanother tests with two rx-queue and two worker, also with 4
> > rx-queue and 4 worker, I got segmentation fault on the same function.
> >
> > I will send more info in few days.
> >
> > Regards,
> > Khers
> >
> > On Oct 24, 2017 6:43 PM, "Andrew 👽 Yourtchenko" <ayour...@gmail.com> wrote:
> >
> >> Dear Khers,
> >>
> >> Thanks for the info!
> >>
> >> I tried with these configs in my local setup (I tried even to increase
> >> the multi-cpu contention by specifying 4 rx-queues instead of 2), but
> >> it works ok for me on the master. What is the version you are testing
> >> with ? I presume it is also the master, but just wanted to verify.
> >>
> >> To try to get more info about this happening: could you give a shot at
> >> reproducing this on the debug build ? There are a few asserts that
> >> would be handy to verify that they do hold true during your tests -
> >> the location of the crash points to either the pool header being
> >> corrupted by something (the asserts should catch that) or the pool
> >> itself reallocated and memory used by something else (which should not
> >> happen because the memory is preallocated during the initialisation
> >> time - unless you change the max number of sessions after
> >> initialisation).
> >>
> >> Also, could you tell a bit more about the hardware you are testing
> >> with ? (cat /proc/cpuinfo)
> >>
> >> --a
> >>
> >> On 10/24/17, khers <s3m2e1.6s...@gmail.com> wrote:
> >> > Dear Andrew
> >> >
> >> > Thanks for your attention.
> >> > Trex config file <https://paste.ubuntu.com/25807801/>
> >> > Trex scenario is default sfr.yaml.
> >> > vpp: startup.conf <https://paste.ubuntu.com/25807840/>
> >> > I changed size of acl_mheap to '(uword)2<<32' in acl.c
> >> > vpp config:
> >> > vppctl set interface l2 bridge TenGigabitEthernet86/0/0 1
> >> > vppctl set interface l2 bridge TenGigabitEthernet86/0/1 1
> >> >
> >> > vppctl set int state TenGigabitEthernet86/0/0 up
> >> > vppctl set int state TenGigabitEthernet86/0/1 up
> >> >
> >> > vppctl set acl-plugin session table hash-table-buckets 1000000
> >> > vppctl set acl-plugin session table hash-table-memory 2147483648
> >> >
> >> > vppctl set acl-plugin session timeout udp idle 5
> >> > vppctl set acl-plugin session timeout tcp idle 10
> >> > vppctl set acl-plugin session timeout tcp transient 5
> >> >
> >> > Regards,
> >> > Khers
> >> >
> >> > On Mon, Oct 23, 2017 at 7:52 PM, Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> could you share the exact TRex and VPP config files, so I could
> >> >> recreate it locally to investigate further ?
> >> >>
> >> >> Thanks a lot!
> >> >>
> >> >> --a
> >> >>
> >> >> On 10/23/17, khers <s3m2e1.6s...@gmail.com> wrote:
> >> >> > Dear folks
> >> >> >
> >> >> > I have bridged two interfaces and set permit+reflect acl on the
> >> >> > input of
> >> >> > interface one and deny rule on output of same interface as follow:
> >> >> >
> >> >> > acl_add_replace permit+reflect
> >> >> > acl_add_replace deny
> >> >> >
> >> >> > acl_interface_add_del sw_if_index 1 add input acl 0
> >> >> > acl_interface_add_del sw_if_index 1 add output acl 1
> >> >> >
> >> >> > after about 100 seconds of running Trex with sfr scenario I got
> >> >> > sigsegv.
> >> >> > this is gdb's backtrace <https://pastebin.com/VvZ9Z3Nf>.
> >> >> >
> >> >> > Trex :
> >> >> > ./t-rex-64 -f cap2/sfr.yaml -m 5 -c 4
> >> >> >
> >> >> > Regards,
> >> >> > Khers
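
The failure mode Andrew suggests in the quoted thread (a free bitmap that gets reallocated while a session is being freed, so that a concurrent reader may be left with a stale pointer) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `toy_bitmap_t`, `toy_bitmap_validate`, `toy_mark_free` and `toy_count_reallocs` are hypothetical names made up for this example and are not the vppinfra pool or bitmap code. It only counts how often the bitmap storage would be reallocated when slots are freed, with and without a one-time preallocation like the one in the proposed patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a pool's free bitmap. Hypothetical; NOT vppinfra code. */
typedef struct {
    uint64_t *words;   /* one bit per pool slot */
    size_t    n_words; /* number of 64-bit words currently allocated */
} toy_bitmap_t;

/* Make sure bit index `bit` is addressable.
 * Returns 1 if the storage had to be reallocated (it may have moved,
 * which is what could invalidate a pointer held by another thread),
 * and 0 if the existing storage was already large enough. */
static int toy_bitmap_validate(toy_bitmap_t *b, size_t bit)
{
    size_t need = bit / 64 + 1;
    if (need <= b->n_words)
        return 0;
    uint64_t *p = realloc(b->words, need * sizeof *p);
    assert(p != NULL);
    /* Zero only the newly added words. */
    memset(p + b->n_words, 0, (need - b->n_words) * sizeof *p);
    b->words = p;
    b->n_words = need;
    return 1;
}

/* Mark a slot as free, growing the bitmap on demand.
 * Returns 1 if this call triggered a reallocation. */
static int toy_mark_free(toy_bitmap_t *b, size_t slot)
{
    int moved = toy_bitmap_validate(b, slot);
    b->words[slot / 64] |= (uint64_t)1 << (slot % 64);
    return moved;
}

/* Free slots 0..n_slots-1 in order and count the reallocations that
 * happen during that loop. With `preallocate` set, the bitmap is sized
 * once up front (not counted), mirroring the idea of the patch. */
static int toy_count_reallocs(size_t n_slots, int preallocate)
{
    toy_bitmap_t b = { NULL, 0 };
    int reallocs = 0;
    assert(n_slots > 0);
    if (preallocate)
        toy_bitmap_validate(&b, n_slots - 1);
    for (size_t i = 0; i < n_slots; i++)
        reallocs += toy_mark_free(&b, i);
    free(b.words);
    return reallocs;
}
```

In this model, `toy_count_reallocs(1000, 0)` reallocates once per 64-slot word (16 times), while `toy_count_reallocs(1000, 1)` performs no reallocation during the freeing loop. Whether the real crash has this cause is still the open question of the thread.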
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev