[vpp-dev] Connection issue between container (slave) and host vpp (master) with memif

2018-06-28 Thread chakravarthy . arisetty
Hi,

How do we connect a memif on the host to a memif inside a container?  Somehow, the
container is not able to communicate with the host.
Can someone point out what I'm missing?

Thanks
Chakri

VPP inside Host
---
vpp# show memif
sockets
  id  listener    filename
  0   yes (1)     /run/vpp/memif.sock
  11  yes (1)     /tmp/memif1.sock

interface memif11/33
  socket-id 11 id 33 mode ethernet
  flags admin-up
  listener-fd 22 conn-fd 0
  num-s2m-rings 0 num-m2s-rings 0 buffer-size 0 num-regions 0
interface memif0/0
  socket-id 0 id 0 mode ethernet
  flags
  listener-fd 21 conn-fd 0
  num-s2m-rings 1 num-m2s-rings 1 buffer-size 0 num-regions 0
  local-disc-reason "disconnected"
vpp#

vpp# show int
              Name               Idx       State          Counter          Count
local0                            0         up
memif0/0                          6        up
memif11/33                        4         up

Container VPP configuration
-
vpp# show memif
sockets
  id  listener    filename
  0   no          /run/vpp/memif.sock
  11  no          /tmp/memif1.sock
 
interface memif11/33
  socket-id 11 id 33 mode ethernet
  flags admin-up slave zero-copy
  listener-fd 0 conn-fd 0
  num-s2m-rings 0 num-m2s-rings 0 buffer-size 0 num-regions 0
vpp#

vpp# sh int
              Name               Idx       State          Counter          Count
local0                            0         up       drops                      0
memif0/0                          2         up
memif11/33                        1         up       drops                      0
                                                     tx-error                   0
On host, these commands are used to create master socket
--
create memif socket id 11 filename  /tmp/memif1.sock
create interface memif id 33 socket-id 11 master
set int state memif11/33 up

Inside container, these commands are used to create slave socket
-
create memif socket id 11 filename  /tmp/memif1.sock
create interface memif id 33 socket-id 11 slave
set int state memif11/33 up

Interestingly, the host VPP is able to connect to the client (icmpr-epoll) on the
host. The issue is only with the client socket inside the container.
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#9736): https://lists.fd.io/g/vpp-dev/message/9736
Mute This Topic: https://lists.fd.io/mt/22892000/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] CPU Usage in VPP

2018-06-28 Thread John DeNisco via Lists.Fd.Io
Rubina,
Damjan,

I am working on a project to enhance the VPP documentation. I think these are 
great questions and answers.

Do you mind if I use them for my documents?

Thanks,

John

From:  on behalf of Damjan Marion 
Date: Tuesday, June 26, 2018 at 6:28 AM
To: Rubina Bianchi 
Cc: "vpp-dev@lists.fd.io" 
Subject: Re: [vpp-dev] CPU Usage in VPP


Yes, a good indicator is "average vectors/node" in "show run". A bigger number means
VPP is busier but also more efficient. The maximum value is 255 (unless you change
VLIB_FRAME_SIZE in the code). It is basically how many packets are processed in a
batch. If VPP is not loaded, it will likely poll so fast that it will just get one
or a few packets from the rx queue. As load goes up, VPP will have more work to do,
so it will poll less frequently, and that will result in more packets waiting in
the rx queue. More packets will result in more efficient execution of the code, so
the number of clock cycles / packet will go down.
When "average vectors/node" gets close to 255, you will likely start observing rx
queue tail drops.

Hope this explains...

On 26 Jun 2018, at 07:57, Rubina Bianchi <r_bian...@outlook.com> wrote:

Thanks Dear Damjan

So, is there any way to compute VPP load or power?
Actually, what I'm looking for is a metric to determine how loaded VPP is in
different cases.
In other words, I want to know how busy VPP is.

Thanks,
Sincerely


From: Damjan Marion <dmar...@me.com>
Sent: Monday, June 25, 2018 8:43 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] CPU Usage in VPP


With at least one interface in polling mode, VPP CPU is always 100% utilised.
For memory usage, you can use "show memory [verbose]" debug cli.



On 25 Jun 2018, at 13:51, Rubina Bianchi <r_bian...@outlook.com> wrote:

Hi dear VPP,

Is there any command or tool in VPP to show real CPU and memory usage at runtime?
I used top and htop in Linux, but they show pre-allocated amounts. I also tried
"vppctl show runtime", but its granularity is less than what I expected (under
high traffic and low traffic, its reported clocks are the same or differ little).

Thanks,
Sincerely

Links: You receive all messages sent to this group.

View/Reply Online (#9705) | Mute This Topic | Your Subscription | Contact Group
Owner | Unsubscribe [jdeni...@cisco.com]
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#9735): https://lists.fd.io/g/vpp-dev/message/9735
Mute This Topic: https://lists.fd.io/mt/22674849/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] mheap performance issue and fixup

2018-06-28 Thread Dave Barach via Lists.Fd.Io
Allocating a large number of 16-byte objects at 64-byte alignment will never
work very well. If you pad the object such that the mheap header plus the
object is exactly 64 bytes, the issue may go away.

With that hint, however, I'll go build a test vector. It sounds like the mheap
required-size calculation might be a brick shy of a load.

D.

From: vpp-dev@lists.fd.io  On Behalf Of Kingwel Xie
Sent: Thursday, June 28, 2018 2:25 AM
To: Dave Barach (dbarach) ; Damjan Marion 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] mheap performance issue and fixup

No problem. I’ll do that later.

Actually there has been a discussion about mheap performance, which describe 
the issue we are talking about. Please also check it again:

https://lists.fd.io/g/vpp-dev/topic/10642197#6399


From: Dave Barach (dbarach) <dbar...@cisco.com>
Sent: Thursday, June 28, 2018 3:38 AM
To: Damjan Marion <dmar...@me.com>; Kingwel Xie <kingwel@ericsson.com>
Cc: vpp-dev@lists.fd.io
Subject: RE: [vpp-dev] mheap performance issue and fixup

+1.

It would be super-helpful if you were to add test cases to 
.../src/vppinfra/test_mheap.c, and push a draft patch so we can reproduce / fix 
the problem(s).

Thanks... Dave

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Damjan Marion
Sent: Wednesday, June 27, 2018 3:27 PM
To: Kingwel Xie <kingwel@ericsson.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] mheap performance issue and fixup


Dear Kingwel,

We finally managed to look at your mheap patches, sorry for the delay.

Still, we are not 100% convinced that there are bugs in the mheap code.
Please note that the mheap code is stable, not changed frequently, and has been
used for years.

It will really help if you can provide test vectors for each issue you observed.
It will be much easier to understand the problem and confirm the fix if we are
able to reproduce it in a controlled environment.

thanks,

Damjan


From: <vpp-dev@lists.fd.io> on behalf of Kingwel Xie <kingwel@ericsson.com>
Date: Thursday, 19 April 2018 at 03:19
To: "Damjan Marion (damarion)" <damar...@cisco.com>
Cc: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] mheap performance issue and fixup

Hi Damjan,

We will do it asap. Actually, we are quite new to VPP and don't yet know how to
file a bug report or contribute code.

Regards,
Kingwel

From: vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] On Behalf Of Damjan Marion
Sent: Wednesday, April 18, 2018 11:30 PM
To: Kingwel Xie <kingwel@ericsson.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] mheap performance issue and fixup

Dear Kingwel,

Thank you for your email. It would be really appreciated if you could submit your
changes to Gerrit, preferably each point in a separate patch.
That will be the best place to discuss those changes...

Thanks in Advance,

--
Damjan

On 16 Apr 2018, at 10:13, Kingwel Xie <kingwel@ericsson.com> wrote:

Hi all,

We recently worked on GTPU tunnels, and our target is to create 2M tunnels. It is
not as easy as it looks, and it took us quite some time to figure it out. The
biggest problem we found is in mheap, which as you know is the low-level memory
management function of VPP. We believe it makes sense to share what we found and
what we've done to improve the performance of mheap.

First of all, mheap is fast. It has a well-designed small-object cache and
multi-level free lists to speed up get/put. However, as discussed on the mailing
list before, it has a performance issue when dealing with align/align_offset
allocation. We traced the problem to the pointer 'rewrite' in gtp_tunnel_t. This
rewrite is a vector and is required to be aligned to a 64B cache line, therefore
with a 4-byte align offset. We realized that the free list must be very long,
meaning many mheap_elts, but unfortunately it doesn't have an element which fits
all 3 prerequisites: size, align, and align offset. In this case, each allocation
has to traverse all elements until it reaches the end of the list. As a result,
you might observe that each allocation is greater than 10 clocks/call with 'show
memory verbose'. It indicates the allocation takes too long, while it should be
200~300 clocks/call in general. Also, you should have noticed that 'per-attempt'
is quite high, even more than 100.

The fix is straightforward and simple: as discussed on this mailing list before,
allocate 'rewrite' from a pool instead of from mheap. Frankly speaking, that looks
like a workaround, not a real fix, so we spent some time fixing the problem
thoroughly. The idea is to add a few more bytes to the original required block
size so that mheap will always look in a bigger free list, then most likely a
[vpp-dev] twamp

2018-06-28 Thread Avi Cohen (A)
Hi,
Is there any plan to implement/integrate TWAMP into VPP ?
Regards
Avi

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#9733): https://lists.fd.io/g/vpp-dev/message/9733
Mute This Topic: https://lists.fd.io/mt/22866386/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] arp request

2018-06-28 Thread Gulakh
Hi,
The problem is fixed, but I don't know how; maybe it was a configuration
problem or something else.

Just to let you know that it was not a problem to take care of.

On Mon, Jun 25, 2018 at 6:56 PM, Holoo Gulakh  wrote:

> Hi,
> I have configured VPP to have VPLS based on this link
>  somehow.
> Now, the first time, a computer in the first LAN sends an ARP request to
> find the MAC address of the other computer in the second LAN.
>
> By capturing received packets on the second computer, I can see that ARP
> requests are received, but an ARP reply is not sent in response.
> By statically setting the MAC address in the ARP table of both computers, the
> problem is fixed.
>
> Q: Why is the ARP reply not sent automatically? How can I fix it?
>
> Thanks
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#9732): https://lists.fd.io/g/vpp-dev/message/9732
Mute This Topic: https://lists.fd.io/mt/22676139/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Is VPP IPSec implementation thread safe?

2018-06-28 Thread Jim Thompson
All,

I don't know if any of the previously-raised issues occur in real life.
Goodness knows we've run billions of IPsec packets in the test harnesses
(harnessi?) here without seeing them.

There are a couple of issues with IPsec and multicore that haven't been
raised, however, so I'm gonna hijack the thread.

If multiple worker threads are configured in VPP, it seems like there's the
potential for problems with IPsec where the sequence number or replay
window for an SA could get stomped on by two threads trying to update them
at the same time. We assume that this issue is well known, since the following
comment occurs at line 173 in src/vnet/ipsec/esp.h

/* TODO seq increment should be atomic to be accessed by multiple
workers */

See: https://github.com/FDio/vpp/blob/master/src/vnet/ipsec/esp.h#L173

We've asked if anyone is working on this, and are willing to try and fix
it, but would need some direction on the best way to accomplish the same.

We could try to use locking, which would be straightforward but would add
overhead.  Maybe that overhead could be offset somewhat by requesting a block
of sequence numbers upfront for all of the packets being processed, instead
of getting a sequence number and incrementing as each packet is processed.

There is also the clib_smp_atomic_add() call, which invokes
__sync_fetch_and_add(addr, increment).  This is a GCC built-in that uses a
memory barrier to avoid obtaining a lock.  We're not sure if there are
drawbacks to using this.

See: http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Atomic-Builtins.html

GRE uses clib_smp_atomic_add() for sequence number processing, see
src/vnet/gre/gre.c#L409
and src/vnet/gre/gre.c#L421

Finally, there seem to be issues around AES-GCM nonce processing when
operating multi-threaded.  If it is nonce processing, it can probably
(also) be addressed via clib_smp_atomic_add(), but.. don't know yet.

We've raised these before, but haven't received much in the way of
response.  Again, we're willing to work on these, but would like a bit of
'guidance' from vpp-dev.

Thanks,

Jim (and the rest of Netgate)


On Thu, Jun 28, 2018 at 1:44 AM, Vamsi Krishna  wrote:

> Hi Damjan, Dave,
>
> Thanks for the quick reply.
>
> It is really helpful. So the barrier ensures that the IPSec data structure
> access is thread safe.
>
> Have a few more questions on the IPSec implementation.
> 1. The inbound SA lookup (in ipsec-input) is actually going through the
> inbound policies for the given spd id linearly and matching a policy. The
> SA is picked based on the matching policy.
>  This could have been an SAD hash table with key as (SPI, dst address,
> proto (ESP or AH) ), so that the SA can be looked up from the hash on
> receiving an ESP packet.
>  Is there a particular reason it is implemented using a linear policy
> match?
>
> 2. There is also an IKEv2 responder implementation that adds/deletes IPSec
> tunnel interfaces. How does this work? Is there any documentation that can
> be referred to?
>
> Thanks
> Krishna
>
> On Wed, Jun 27, 2018 at 6:23 PM, Dave Barach (dbarach) 
> wrote:
>
>> +1.
>>
>>
>>
>> To amplify a bit: *all* binary API messages are processed with worker
>> threads paused in a barrier sync, unless the API message has been
>> explicitly marked thread-safe.
>>
>>
>>
>> Here is the relevant code in .../src/vlibapi/api_shared.c:
>> vl_api_msg_handler_with_vm_node(...)
>>
>>
>>
>>   if (!am->is_mp_safe[id])
>>
>>  {
>>
>>vl_msg_api_barrier_trace_context (am->msg_names[id]);
>>
>>vl_msg_api_barrier_sync ();
>>
>>  }
>>
>>   (*handler) (the_msg, vm, node);
>>
>>
>>
>>   if (!am->is_mp_safe[id])
>>
>>vl_msg_api_barrier_release ();
>>
>>
>>
>> Typical example of marking a message mp-safe:
>>
>>
>>
>>   api_main_t *am = &api_main;
>>
>>   ...
>>
>>
>>
>>   am->is_mp_safe[VL_API_MEMCLNT_KEEPALIVE_REPLY] = 1;
>>
>>
>>
>> The debug CLI uses the same scheme. Unless otherwise marked mp-safe,
>> debug CLI commands are executed with worker threads paused in a barrier
>> sync.
>>
>>
>>
>> HTH... Dave
>>
>>
>>
>> -Original Message-
>> From: vpp-dev@lists.fd.io  On Behalf Of Damjan
>> Marion
>> Sent: Wednesday, June 27, 2018 6:59 AM
>> To: Vamsi Krishna 
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Is VPP IPSec implementation thread safe?
>>
>>
>>
>> ipsec data structures are updated during barrier sync, so there is not
>> packets in-flight...
>>
>>
>>
>>
>>
>> > On 27 Jun 2018, at 07:45, Vamsi Krishna  wrote:
>>
>> >
>>
>> > Hi ,
>>
>> >
>>
>> > I have looked at the ipsec code in VPP and trying to understand how it
>> works in a multi threaded environment. Noticed that the datastructures for
>> spd, sad and tunnel interface are pools and there are no locks to prevent
>> race conditions.
>>
>> >
>>
>> > For instance the ipsec-input node passes SA index to the esp-encrypt
>> node, and esp-encrypt node looks up the SA from sad pool. But during the
>> time in which the packet is passed from one node to 

Re: [vpp-dev] Is VPP IPSec implementation thread safe?

2018-06-28 Thread Vamsi Krishna
Hi Damjan, Dave,

Thanks for the quick reply.

It is really helpful. So the barrier ensures that the IPSec data structure
access is thread safe.

Have a few more questions on the IPSec implementation.
1. The inbound SA lookup (in ipsec-input) is actually going through the
inbound policies for the given spd id linearly and matching a policy. The
SA is picked based on the matching policy.
 This could have been an SAD hash table with key as (SPI, dst address,
proto (ESP or AH) ), so that the SA can be looked up from the hash on
receiving an ESP packet.
 Is there a particular reason it is implemented using a linear policy
match?

2. There is also an IKEv2 responder implementation that adds/deletes IPSec
tunnel interfaces. How does this work? Is there any documentation that can
be referred to?

Thanks
Krishna

On Wed, Jun 27, 2018 at 6:23 PM, Dave Barach (dbarach) 
wrote:

> +1.
>
>
>
> To amplify a bit: *all* binary API messages are processed with worker
> threads paused in a barrier sync, unless the API message has been
> explicitly marked thread-safe.
>
>
>
> Here is the relevant code in .../src/vlibapi/api_shared.c:
> vl_api_msg_handler_with_vm_node(...)
>
>
>
>   if (!am->is_mp_safe[id])
>
>  {
>
>vl_msg_api_barrier_trace_context (am->msg_names[id]);
>
>vl_msg_api_barrier_sync ();
>
>  }
>
>   (*handler) (the_msg, vm, node);
>
>
>
>   if (!am->is_mp_safe[id])
>
>vl_msg_api_barrier_release ();
>
>
>
> Typical example of marking a message mp-safe:
>
>
>
>   api_main_t *am = &api_main;
>
>   ...
>
>
>
>   am->is_mp_safe[VL_API_MEMCLNT_KEEPALIVE_REPLY] = 1;
>
>
>
> The debug CLI uses the same scheme. Unless otherwise marked mp-safe, debug
> CLI commands are executed with worker threads paused in a barrier sync.
>
>
>
> HTH... Dave
>
>
>
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Damjan Marion
> Sent: Wednesday, June 27, 2018 6:59 AM
> To: Vamsi Krishna 
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Is VPP IPSec implementation thread safe?
>
>
>
> ipsec data structures are updated during barrier sync, so there is not
> packets in-flight...
>
>
>
>
>
> > On 27 Jun 2018, at 07:45, Vamsi Krishna  wrote:
>
> >
>
> > Hi ,
>
> >
>
> > I have looked at the ipsec code in VPP and trying to understand how it
> works in a multi threaded environment. Noticed that the datastructures for
> spd, sad and tunnel interface are pools and there are no locks to prevent
> race conditions.
>
> >
>
> > For instance the ipsec-input node passes SA index to the esp-encrypt
> node, and esp-encrypt node looks up the SA from sad pool. But during the
> time in which the packet is passed from one node to another the entry at SA
> index may be changed or deleted. Same seems to be true for dpdk-esp-encrypt
> and dpdk-esp-decrypt. How are these cases handled? Can the implementation
> be used in multi-threaded environment?
>
> >
>
> > Please help understand the IPSec implementation.
>
> >
>
> > Thanks
>
> > Krishna
>
> > -=-=-=-=-=-=-=-=-=-=-=-
>
> > Links: You receive all messages sent to this group.
>
> >
>
> > View/Reply Online (#9709): https://lists.fd.io/g/vpp-dev/message/9709
>
> > Mute This Topic: https://lists.fd.io/mt/22720913/675642
>
> > Group Owner: vpp-dev+ow...@lists.fd.io
>
> > Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [dmar...@me.com]
>
> > -=-=-=-=-=-=-=-=-=-=-=-
>
>
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#9730): https://lists.fd.io/g/vpp-dev/message/9730
Mute This Topic: https://lists.fd.io/mt/22720913/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] mheap performance issue and fixup

2018-06-28 Thread Kingwel Xie
No problem. I’ll do that later.

Actually there has been a discussion about mheap performance, which describe 
the issue we are talking about. Please also check it again:

https://lists.fd.io/g/vpp-dev/topic/10642197#6399


From: Dave Barach (dbarach) 
Sent: Thursday, June 28, 2018 3:38 AM
To: Damjan Marion ; Kingwel Xie 
Cc: vpp-dev@lists.fd.io
Subject: RE: [vpp-dev] mheap performance issue and fixup

+1.

It would be super-helpful if you were to add test cases to 
.../src/vppinfra/test_mheap.c, and push a draft patch so we can reproduce / fix 
the problem(s).

Thanks... Dave

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Damjan Marion
Sent: Wednesday, June 27, 2018 3:27 PM
To: Kingwel Xie <kingwel@ericsson.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] mheap performance issue and fixup


Dear Kingwel,

We finally managed to look at your mheap patches, sorry for the delay.

Still, we are not 100% convinced that there are bugs in the mheap code.
Please note that the mheap code is stable, not changed frequently, and has been
used for years.

It will really help if you can provide test vectors for each issue you observed.
It will be much easier to understand the problem and confirm the fix if we are
able to reproduce it in a controlled environment.

thanks,

Damjan


From: <vpp-dev@lists.fd.io> on behalf of Kingwel Xie <kingwel@ericsson.com>
Date: Thursday, 19 April 2018 at 03:19
To: "Damjan Marion (damarion)" <damar...@cisco.com>
Cc: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] mheap performance issue and fixup

Hi Damjan,

We will do it asap. Actually, we are quite new to VPP and don't yet know how to
file a bug report or contribute code.

Regards,
Kingwel

From: vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] On Behalf Of Damjan Marion
Sent: Wednesday, April 18, 2018 11:30 PM
To: Kingwel Xie <kingwel@ericsson.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] mheap performance issue and fixup

Dear Kingwel,

Thank you for your email. It would be really appreciated if you could submit your
changes to Gerrit, preferably each point in a separate patch.
That will be the best place to discuss those changes...

Thanks in Advance,

--
Damjan

On 16 Apr 2018, at 10:13, Kingwel Xie <kingwel@ericsson.com> wrote:

Hi all,

We recently worked on GTPU tunnels, and our target is to create 2M tunnels. It is
not as easy as it looks, and it took us quite some time to figure it out. The
biggest problem we found is in mheap, which as you know is the low-level memory
management function of VPP. We believe it makes sense to share what we found and
what we've done to improve the performance of mheap.

First of all, mheap is fast. It has a well-designed small-object cache and
multi-level free lists to speed up get/put. However, as discussed on the mailing
list before, it has a performance issue when dealing with align/align_offset
allocation. We traced the problem to the pointer 'rewrite' in gtp_tunnel_t. This
rewrite is a vector and is required to be aligned to a 64B cache line, therefore
with a 4-byte align offset. We realized that the free list must be very long,
meaning many mheap_elts, but unfortunately it doesn't have an element which fits
all 3 prerequisites: size, align, and align offset. In this case, each allocation
has to traverse all elements until it reaches the end of the list. As a result,
you might observe that each allocation is greater than 10 clocks/call with 'show
memory verbose'. It indicates the allocation takes too long, while it should be
200~300 clocks/call in general. Also, you should have noticed that 'per-attempt'
is quite high, even more than 100.

The fix is straightforward and simple: as discussed on this mailing list before,
allocate 'rewrite' from a pool instead of from mheap. Frankly speaking, that looks
like a workaround, not a real fix, so we spent some time fixing the problem
thoroughly. The idea is to add a few more bytes to the original required block
size so that mheap will always look in a bigger free list; then most likely a
suitable block can be easily located. Well, now the problem becomes: how big is
this extra size? It should be at least align + align_offset, which is not hard to
understand. But after careful analysis we think it is better to be like this, see
the code below:

mheap.c:545
  word modifier = (align > MHEAP_USER_DATA_WORD_BYTES ?
                   align + align_offset + sizeof (mheap_elt_t) : 0);
  bin = user_data_size_to_bin_index (n_user_bytes + modifier);

The reason for the extra sizeof (mheap_elt_t) is to avoid lo_free_size being too
small to hold a complete free element. You will understand it if you really know
how mheap_get_search_free_bin works. I am not going to go through the details of
it. In