Dear Andrew

Sorry for the delay. I pulled the latest revision of master (commit
e695cb4dbdb6f9424ac5a567799e67f791fad328), and the segfault did not occur
with the same environment and test scenario. I will try to reproduce the
potential bug by running the test for a longer duration and with a more
aggressive scenario.

Regards,
Khers

On Wed, Oct 25, 2017 at 1:45 PM, Andrew 👽 Yourtchenko <ayour...@gmail.com>
wrote:

> Dear Khers,
>
> okay, cool! When testing the debug image, please save the full dump
> and the .debs for all the artefacts, so that if needed I can grab the
> entire set of info and look at it in my environment.
>
> Meantime, I had an idea for another potential failure mode, whereby
> the session would get checked while there is a session being freed,
> potentially resulting in a reallocation of the free bitmap in the
> pool.
>
> So before reproducing it in the debug build, please give this one-line
> change a try in the release build and see if you can still reproduce the
> crash with it:
>
> --- a/src/plugins/acl/fa_node.c
> +++ b/src/plugins/acl/fa_node.c
> @@ -609,6 +609,8 @@ acl_fa_verify_init_sessions (acl_main_t * am)
>      for (wk = 0; wk < vec_len (am->per_worker_data); wk++) {
>        acl_fa_per_worker_data_t *pw = &am->per_worker_data[wk];
>        pool_alloc_aligned(pw->fa_sessions_pool, am->fa_conn_table_max_entries, CLIB_CACHE_LINE_BYTES);
> +      /* preallocate the free bitmap */
> +      clib_bitmap_validate(pool_header(pw->fa_sessions_pool)->free_bitmap, am->fa_conn_table_max_entries);
>      }
>
> --a
>
> On 10/24/17, khers <s3m2e1.6s...@gmail.com> wrote:
> > Dear Andrew
> >
> > I used the latest version of the master branch. I will rerun the test
> > with a debug build to collect more debug info ASAP.
> > VPP is running on a Xeon E5-2600 series CPU.
> > I also ran further tests with two rx-queues and two workers, and with 4
> > rx-queues and 4 workers; I got a segmentation fault in the same function.
> >
> > I will send more info in a few days.
> >
> > Regards,
> > Khers
> >
> > On Oct 24, 2017 6:43 PM, "Andrew 👽 Yourtchenko" <ayour...@gmail.com>
> > wrote:
> >
> >> Dear Khers,
> >>
> >> Thanks for the info!
> >>
> >> I tried with these configs in my local setup (I tried even to increase
> >> the multi-cpu contention by specifying 4 rx-queues instead of 2), but
> >> it works ok for me on the master. What is the version you are testing
> >> with ? I presume it is also the master, but just wanted to verify.
> >>
> >> To try to get more info about this happening: could you give a shot at
> >> reproducing this on the debug build ? There are a few asserts that
> >> would be handy to verify that they do hold true during your tests -
> >> the location of the crash points to either the pool header being
> >> corrupted by something (the asserts should catch that) or the pool
> >> itself reallocated and memory used by something else (which should not
> >> happen because the memory is preallocated during the initialisation
> >> time - unless you change the max number of sessions after
> >> initialisation).
> >>
> >> Also, could you tell a bit more about the hardware you are testing
> >> with ? (cat /proc/cpuinfo)
> >>
> >> --a
> >>
> >> On 10/24/17, khers <s3m2e1.6s...@gmail.com> wrote:
> >> > Dear Andrew
> >> >
> >> > Thanks for your attention.
> >> > TRex config file <https://paste.ubuntu.com/25807801/>
> >> > The TRex scenario is the default sfr.yaml.
> >> > VPP: startup.conf <https://paste.ubuntu.com/25807840/>
> >> > I changed the size of acl_mheap to '(uword)2<<32' in acl.c
> >> > VPP config:
> >> > vppctl set interface l2 bridge TenGigabitEthernet86/0/0 1
> >> > vppctl set interface l2 bridge TenGigabitEthernet86/0/1 1
> >> >
> >> > vppctl set int state TenGigabitEthernet86/0/0 up
> >> > vppctl set int state TenGigabitEthernet86/0/1 up
> >> >
> >> > vppctl set acl-plugin session table hash-table-buckets 1000000
> >> > vppctl set acl-plugin session table hash-table-memory 2147483648
> >> >
> >> > vppctl set acl-plugin session timeout udp idle 5
> >> > vppctl set acl-plugin session timeout tcp idle 10
> >> > vppctl set acl-plugin session timeout tcp transient 5
> >> >
> >> > Regards,
> >> > Khers
> >> >
> >> >
> >> > On Mon, Oct 23, 2017 at 7:52 PM, Andrew 👽 Yourtchenko <ayour...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> could you share the exact TRex and VPP config files, so I could
> >> >> recreate it locally to investigate further ?
> >> >>
> >> >> Thanks a lot!
> >> >>
> >> >> --a
> >> >>
> >> >> On 10/23/17, khers <s3m2e1.6s...@gmail.com> wrote:
> >> >> > Dear folks
> >> >> >
> >> >> > I have bridged two interfaces and set a permit+reflect ACL on the
> >> >> > input of interface one and a deny rule on the output of the same
> >> >> > interface, as follows:
> >> >> >
> >> >> > acl_add_replace permit+reflect
> >> >> > acl_add_replace deny
> >> >> >
> >> >> > acl_interface_add_del sw_if_index 1 add input acl 0
> >> >> > acl_interface_add_del sw_if_index 1 add output acl 1
> >> >> >
> >> >> >
> >> >> > After about 100 seconds of running TRex with the sfr scenario I got
> >> >> > a SIGSEGV. This is gdb's backtrace <https://pastebin.com/VvZ9Z3Nf>.
> >> >> >
> >> >> > TRex:
> >> >> > ./t-rex-64 -f cap2/sfr.yaml -m 5 -c 4
> >> >> >
> >> >> >
> >> >> > Regards,
> >> >> > Khers
> >> >> >
> >> >>
> >> >
> >>
> >
>
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
