Hi Ole,

I need roughly the number of tunnels I ended up testing with: ~300-500k.

I thought about extending the generic VPP API with a (blocking?) method
accepting a burst (array) of messages and returning a burst of responses.
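
To make the burst idea concrete, here is a minimal sketch in plain C. All
names (demo_request_t, demo_reply_t, demo_handle_one, demo_process_burst) are
hypothetical and only illustrate the shape of such a call; none of them are
existing VPP API functions or message layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request/reply pair; NOT actual VPP API message layouts. */
typedef struct {
    uint32_t context;   /* caller-chosen tag echoed back in the reply */
    uint32_t payload;   /* stand-in for the real message body */
} demo_request_t;

typedef struct {
    uint32_t context;   /* copied from the matching request */
    int32_t  retval;    /* 0 on success, negative on failure */
} demo_reply_t;

/* Hypothetical per-message handler: rejects a zero payload. */
static int32_t demo_handle_one(const demo_request_t *req)
{
    return req->payload != 0 ? 0 : -1;
}

/* Process a whole burst in one call: replies[i] answers requests[i].
 * Returns the number of requests that succeeded. */
static size_t demo_process_burst(const demo_request_t *requests,
                                 demo_reply_t *replies, size_t count)
{
    size_t ok = 0;

    assert(count == 0 || (requests != NULL && replies != NULL));
    for (size_t i = 0; i < count; i++) {
        replies[i].context = requests[i].context;
        replies[i].retval = demo_handle_one(&requests[i]);
        if (replies[i].retval == 0)
            ok++;
    }
    return ok;
}
```

The caller would fill an array of requests, make one call, and then walk the
reply array, matching replies to requests by position or by context.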

One of the challenges in MAP is that to configure a tunnel I need two
request messages, with the second one depending on the result (index) of the
first. It would be much smoother if I could just send a single message to do
both the domain and the rule(s) addition...
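
Something along these lines is what I have in mind for the single message.
This is only a sketch under my own assumptions: the structures, the in-memory
table, and the demo_* functions are hypothetical stand-ins, not the real MAP
API messages or handlers. The point is that the handler resolves the domain
index internally, so the client never has to wait for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DOMAINS 8
#define DEMO_MAX_RULES   4

/* Hypothetical combined request: one domain plus its rules.
 * Field names are illustrative, NOT the real MAP API messages. */
typedef struct {
    uint32_t domain_cfg;             /* stand-in for the domain parameters */
    uint32_t n_rules;                /* number of valid entries in psid[] */
    uint16_t psid[DEMO_MAX_RULES];   /* one PSID per rule */
} demo_add_domain_with_rules_t;

typedef struct {
    uint32_t cfg;
    uint32_t n_rules;
    uint16_t psid[DEMO_MAX_RULES];
} demo_domain_t;

static demo_domain_t demo_domains[DEMO_MAX_DOMAINS];
static uint32_t demo_n_domains;

/* First message of the two-step flow: create a domain, return its index
 * (or -1 if the table is full). */
static int32_t demo_add_domain(uint32_t cfg)
{
    if (demo_n_domains >= DEMO_MAX_DOMAINS)
        return -1;
    demo_domains[demo_n_domains].cfg = cfg;
    demo_domains[demo_n_domains].n_rules = 0;
    return (int32_t)demo_n_domains++;
}

/* Second message of the two-step flow: needs the index from the first. */
static int32_t demo_add_rule(uint32_t index, uint16_t psid)
{
    demo_domain_t *d;

    if (index >= demo_n_domains)
        return -1;
    d = &demo_domains[index];
    if (d->n_rules >= DEMO_MAX_RULES)
        return -1;
    d->psid[d->n_rules++] = psid;
    return 0;
}

/* Combined handler: validates first, then does both steps in one call. */
static int32_t demo_add_domain_with_rules(
    const demo_add_domain_with_rules_t *req, uint32_t *index_out)
{
    int32_t index;

    assert(req != NULL && index_out != NULL);
    if (req->n_rules > DEMO_MAX_RULES)
        return -1;                   /* reject before creating anything */
    index = demo_add_domain(req->domain_cfg);
    if (index < 0)
        return -1;
    for (uint32_t i = 0; i < req->n_rules; i++) {
        if (demo_add_rule((uint32_t)index, req->psid[i]) != 0)
            return -1;
    }
    *index_out = (uint32_t)index;
    return 0;
}
```

With one rule per domain this would halve the number of control messages
(300k instead of 600k in my test) and remove the round trip for the index.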

Best Regards,
Jacek.

2017-09-05 17:06 GMT+02:00 Ole Troan <otr...@employees.org>:

> Jacek,
>
> It's also been on my list for a while to add a better bulk add for MAP
> domains / rules.
> Any idea of the scale you are looking at here?
>
> Best regards,
> Ole
>
>
> > On 5 Sep 2017, at 15:07, Jacek Siuda <j...@semihalf.com> wrote:
> >
> > Hi,
> >
> > I'm conducting a tunnel test using VPP (vnet) map with the following
> > parameters:
> > ea_bits_len=0, psid_offset=16, psid=length, single rule for each domain;
> > total number of tunnels: 300000, total number of control messages: 600k.
> >
> > My problem is with simply adding tunnels. After adding more than
> > ~150k-200k, performance drops significantly: the first 100k are added in
> > ~3s (on an asynchronous C client), the next 100k in another ~5s, but the
> > last 100k take ~37s to add; in total: ~45s. The Python clients perform
> > even worse: 32 minutes(!) for 300k tunnels with the synchronous (blocking)
> > version and ~95s with the asynchronous one. The Python clients are
> > expected to perform a bit worse according to the VPP docs, but I was
> > worried by the non-linear time of a single tunnel addition, which is
> > visible even on the C client.
> >
> > While investigating this using perf, I found the culprit: it is the
> > memory allocation done for the IP address by the rule addition request.
> > The memory is allocated by clib, which uses the mheap library (~98% of
> > CPU consumption). I looked into mheap and it looks a bit complicated for
> > allocating a short object.
> > I've done a short experiment by replacing (in vnet/map/ only) the clib
> > allocation with DPDK rte_malloc() and achieved way better performance:
> > 300k tunnels in ~5-6s with the same C client, and ~70s and ~30-40s
> > respectively with the Python clients. Also, I haven't noticed any negative
> > impact on packet throughput with my experimental allocator.
> >
> > So, here are my questions:
> > 1) Has anyone else reported performance penalties when using the mheap
> > library? I've searched the list archive and could not find any related
> > questions.
> > 2) Why was the mheap library chosen for use in clib? Are there any
> > performance benefits in some scenarios?
> > 3) Are there any (long- or short-term) plans to replace memory
> > management in clib with some other library?
> > 4) If I'd like to upstream my solution, how should I approach
> > customization of memory allocation so that it would be accepted by the
> > community? Installable function pointers defaulting to clib?
> >
> > Best Regards,
> > Jacek Siuda.
> >
> >
> > _______________________________________________
> > vpp-dev mailing list
> > vpp-dev@lists.fd.io
> > https://lists.fd.io/mailman/listinfo/vpp-dev
>
>