Re: [vpp-dev] n_vectors...

2020-03-31 Thread Elias Rudberg
Hi Chris and Dave,

Thanks for bringing this up, and thanks for explaining!

I agree with Chris that this is confusing; it makes it much more
difficult to understand the code.

Perhaps this is the kind of thing that doesn't matter much to those who
are already familiar with the code, while at the same time it matters a
lot for newcomers. If you want to lower the threshold for new people to
be able to come in and understand the code and possibly contribute,
then I think it would be a good idea to fix this even if it means
changing many lines of code. It could be argued that the fact that
"n_vectors" exists in so many places makes it even more important to
have a reasonable name for it. One way could be to start by renaming
things in some of the main data structures, like those in vlib/node.h
and vlib/threads.h and such places, and then making the changes the
compiler will force as a result of that.

Best regards,
Elias


On Tue, 2020-03-31 at 00:45 +, Dave Barach via Lists.Fd.Io wrote:
> Hmmm, yeah. Been at this for years, I can’t really remember when we
> settled on e.g. n_vectors vs. n_vector_elts or some such.
>  
> In new code, it’s perfectly fair to use whatever names seem fit for
> purpose.
>  
> Vlib would be happy doing image processing, or any other kind of
> vector processing. There’s no law which says that frames need to have
> 32-bit elements. Each node decides.
>  
> FWIW... Dave
>  
> From: vpp-dev@lists.fd.io  On Behalf Of
> Christian Hopps
> Sent: Monday, March 30, 2020 8:07 PM
> To: vpp-dev 
> Cc: Christian Hopps 
> Subject: [vpp-dev] n_vectors...
>  
> Something has always bothered me about my understanding of VPP's use
> of the term "vector" and "vectors". When I think of Vector Packet
> Processing I think of processing a vector (array) of packets in a
> single call to a node. The code, though, then seems to refer to the
> individual packets as "vectors" when it uses field names like
> "n_vectors" to refer to the number of buffers in a frame, or when
> "show runtime" talks about "vectors per call", when I think it's
> really talking about "packets/buffers per call" (and my mind wants to
> think that it's always *1* vector/frame of packets per call by
> design).
> 
> I find this confusing, and so I thought I'd ask if there was some
> meaning here I'm missing?
> 
> Thanks,
> Chris.



[vpp-dev] VPP nat ipfix logging problem, need to use thread-specific vlib_main_t?

2020-04-05 Thread Elias Rudberg
Hello VPP experts,

We have been using VPP for NAT44 for a while and it has been working
fine, but a few days ago when we tried turning on nat ipfix logging, vpp
crashed. It turned out that the problem went away if we used only a
single thread, so it seemed related to how threading was handled in the
ipfix logging code. The crash happened in different ways on different
runs but often seemed related to the snat_ipfix_send() function in
plugins/nat/nat_ipfix_logging.c.

Having looked at the code in nat_ipfix_logging.c I have the following
theory about what goes wrong (I might have misunderstood something, if
so please correct me):

In the snat_ipfix_send() function, a vlib_main_t data structure is
used; a pointer to it is fetched in the following way:

   vlib_main_t *vm = frm->vlib_main;

So the frm->vlib_main pointer comes from "frm", which has been set to
flow_report_main, a global data structure from vnet/ipfix-
export/flow_report.c that as far as I can tell only exists once in
memory (not once per thread). This means that different threads calling
the snat_ipfix_send() function are using the same vlib_main_t data
structure. That is not how it should be, I think; instead, each thread
should be using its own thread-specific vlib_main_t data structure.

A suggestion for how to fix this is to replace the line

   vlib_main_t *vm = frm->vlib_main;

with the following line

   vlib_main_t *vm = vlib_mains[thread_index];

in all places where worker threads are using such a vlib_main_t
pointer. Using vlib_mains[thread_index] means that we are picking the
thread-specific vlib_main_t data structure for the current thread,
instead of all threads using the same vlib_main_t. I pushed such a
change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/26359
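
For illustration, here are two equivalent ways to get the current
thread's vlib_main_t (a minimal sketch; vlib_get_main() reads the
thread index from thread-local storage):

   /* when the thread index is already at hand: */
   vlib_main_t *vm = vlib_mains[thread_index];
   /* otherwise, let vlib look up the index itself: */
   vlib_main_t *vm2 = vlib_get_main ();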

That fix seems to solve the issue in my tests, vpp does not crash
anymore after the change. Please have a look at it and let me know if
this seems reasonable or if I have misunderstood something.

Best regards,
Elias



Re: [vpp-dev] n_vectors...

2020-04-06 Thread Elias Rudberg
Hi Dave,

Thanks for your answer. I understand that there are many difficulties
and problems with renaming things in existing code, which I did not
realize before.

> P.S. mapping "n_vectors" to whatever it means to you seems like a
> pretty minimal entry barrier. It's not like the code is inconsistent.

Here however I disagree: I think it can be a significant entry barrier.

Imagine yourself in the position of someone who is a newcomer
starting to use and learn about VPP, perhaps someone with an
engineering background who has an understanding of the "vector" concept
from linear algebra courses and so on.
ideas of how VPP works for example here 
https://wiki.fd.io/view/VPP/What_is_VPP%3F where it says "the VPP
platform grabs all available packets from RX rings to form a vector of
packets" which seems fine according to the usual meaning of the word
"vector". Up to that point everything is fine and someone familiar with
the vector concept will feel that their knowledge about vectors can be
useful when working with VPP. But at the moment when this person starts
looking at the code and sees "n_vectors" there, that will be confusing.
Making the assumption that the VPP source code uses its own definition
of what a "vector" is, is actually a pretty big step to make.

Of course it's not the first time a word has different meanings
depending on the context, but in this case the concept of a "vector" is
quite well established and also seems to be used according to its usual
meaning in VPP documentation. Then it becomes confusing when the word
apparently has a different meaning in the source code.

So, while you are probably right that it's not practical to rename
things like that in the existing code, I still think this issue can be
a significant obstacle for new people coming in.

Anyway, thanks again for explaining the situation, for me personally
this helped my understanding a lot.

Best regards,
Elias



Re: [vpp-dev] n_vectors...

2020-04-06 Thread Elias Rudberg
Hi Burt,

Thanks, but then I think you mean the vectors as in src/vppinfra/vec.h,
while the discussion here was about how the name "n_vectors" is used in
for example src/vlib/node.h and such places. It's a different thing.

If we have a situation like this, now first described using a picture
without using the word "vector" for anything:

A : [ a1 a2 a3 ]
B : [ b1 b2 b3 b4 ]

Then the above can be described in different ways.

Alternative 1: we can say that A and B are vectors. A is a vector with
3 elements, B is a vector with 4 elements. The number of vectors is 2
(A and B). According to this view, if there was something called
n_vectors then we would say that n_vectors=2.

Alternative 2 (the VPP way): A consists of 3 vectors, and B consists of
4 vectors. The number of vectors for A is 3, and the number of vectors
for B is 4. A and B each have their own n_vectors values: A has
n_vectors=3 and B has n_vectors=4. At least this is how I think it is
in the VPP source code.

The VPP source code can be confusing if you assume the word "vector" is
used as in alternative 1.

I think the main scenario of interest in VPP is that there is a bunch
of packets that are processed together. You might think that this would
be described as a vector of packets, but the VPP source code instead
describes the individual packets as vectors, so that "number of
vectors" in effect means "number of packets". At least that is how I
think it is.

There is at least one comment in src/vlib/node.h that seems to say
this, it looks like this:

  /* Number of vector elements currently in frame. */
  u16 n_vectors;

So that variable is called n_vectors but according to the comment its
meaning is the number of vector elements rather than the number of
vectors.
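
To make this concrete, inside a node function (with the usual vm and
frame arguments) the frame is typically walked roughly like this -- a
minimal sketch using the standard vlib helpers, not code from any
particular node:

  u32 *from = vlib_frame_vector_args (frame); /* buffer indices */
  u32 n_left = frame->n_vectors;  /* number of packets/buffers in frame */
  while (n_left > 0)
    {
      vlib_buffer_t *b = vlib_get_buffer (vm, from[0]);
      /* ... process one packet ... */
      from += 1;
      n_left -= 1;
    }

So n_vectors counts the elements of the frame's one vector, one element
per packet.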

Best regards,
Elias



[vpp-dev] Deadlock between NAT threads when frame queues for handoff are congested

2020-04-15 Thread Elias Rudberg
Hello VPP experts,

We are using VPP for NAT44 and last week we encountered a problem where
some VPP threads stopped forwarding traffic. We saw the problem on two
separate VPP servers within a short time; apparently it was triggered
by some specific kind of out2in traffic that arrived at that time.

As far as I can tell, this issue exists in both the current master
branch and in the 1908 and 2001 branches.

After investigating and finally being able to reproduce the problem in
a lab setting, we came to the following conclusion about what happened:

The scenario where this happens is that several threads (8 threads in
our case) are used for NAT and the frame queues for handoff between
threads are being congested for some of the threads. This can be
triggered for example by "garbage" out2in traffic that comes in at some
port: if much of the out2in traffic has the same destination port, then
much of the traffic will be handed off to the same thread, since the
out2in handoff thread index is decided based on the destination port. It
doesn't matter if the traffic belongs to any existing NAT sessions or
not, since handoff must be done before checking that and the problem is
related to the handoff.

When a frame queue is congested, that is supposed to be detected by the
is_vlib_frame_queue_congested() call in
vlib_buffer_enqueue_to_thread(). However, that check is not completely
reliable since other threads may add things to the queue after the
check. For example, it can happen that two threads call
is_vlib_frame_queue_congested() simultaneously and both come to the
conclusion that the queue is not congested, when in fact it becomes
congested as soon as one of them adds to the queue, giving trouble for
the other thread. This problem is to some extent mitigated by the fact
that the check in is_vlib_frame_queue_congested() uses a
"queue_hi_thresh" value that is set slightly lower than the number of
elements in the queue, it is set like this:

fqm->queue_hi_thresh = frame_queue_nelts - 2;

The -2 there means that things are still OK if two threads call
is_vlib_frame_queue_congested() simultaneously, but if three or four
threads do it simultaneously we are in trouble anyway, and that seems
to be what happened on our VPP servers last week. This leads to one or
more threads being stuck in an infinite loop, in the loop that looks
like this in vlib_get_frame_queue_elt():

  /* Wait until a ring slot is available */
  while (new_tail >= fq->head_hint + fq->nelts)
vlib_worker_thread_barrier_check ();

The loop above is supposed to end when a different thread changes the
value of the volatile variable fq->head_hint but that will not happen
if the other thread is also stuck in this loop. We get a deadlock: A is
waiting for B and B is waiting for A. In the context of NAT, thread A
wants to handoff something to thread B at the same time as thread B
wants to handoff something to thread A, while at the same time their
frame queues are congested. This leads to those two threads being stuck
in the loop forever, each of them waiting for the other one.

To me it looks like the subtraction by 2 when setting queue_hi_thresh
is just an ad hoc choice; there is no reason why 2 would be enough. I
think that to make it safe, we need to subtract the number of threads.
Essentially, we need to ensure that there is room for each thread to
reserve one extra element in the queue so that no thread can get stuck
waiting in the loop above. I tested this by hard-coding -8 instead of
-2 and then the problem cannot be reproduced anymore, so that fix seems
to work. The frame_queue_nelts value is 64, so using -8 means that the
queue is considered congested already at 56 instead of 62 as it is now.

What do you think, is it a good solution to check the number of threads
and use that to set "fqm->queue_hi_thresh = frame_queue_nelts -
n_threads;"?

Best regards,
Elias


Re: [vpp-dev] Deadlock between NAT threads when frame queues for handoff are congested

2020-04-16 Thread Elias Rudberg
Hi Ole!

Thanks, here is a change doing that, please have a look: 
https://gerrit.fd.io/r/c/vpp/+/26544

With this change, an assertion will fail if the number of threads is
greater than 55 or something like that. To make things work for such
large thread counts it would be necessary to increase the queue size
as well; this change does not handle that.

Best regards,
Elias


On Thu, 2020-04-16 at 13:43 +0200, Ole Troan wrote:
> Hi Elias,
> 
> Thank you for the thorough analysis.
> I think the best approach for now is the one you propose. Reserve as
> many slots as you have workers.
> Potentially also increase the queue size > 64.
> 
> Damjan is looking at some further improvements in this space, but for
> now please go with what you propose.
> 
> Best regards,
> Ole



Re: [vpp-dev] VPP nat ipfix logging problem, need to use thread-specific vlib_main_t?

2020-04-23 Thread Elias Rudberg
Hello,

There was a merge conflict for my previous fix for this. Now I made a
new one; it's essentially the same thing, just avoiding the merge
conflict: https://gerrit.fd.io/r/c/vpp/+/26659

Please have a look at that one and merge if it seems ok. Based on our
experience from the past few weeks it seems good; we have seen no more
ipfix logging crashes after implementing this fix.

Best regards,
Elias


On Sun, 2020-04-05 at 12:08 +, Dave Barach via lists.fd.io wrote:
> If you have the thread index handy, that's OK. Otherwise, use
> vlib_get_main() which grabs the thread index from thread local
> storage. 
> 
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Elias
> Rudberg
> Sent: Sunday, April 5, 2020 4:58 AM
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] VPP nat ipfix logging problem, need to use thread-
> specific vlib_main_t?
> 
> Hello VPP experts,
> 
> We have been using VPP for NAT44 for a while and it has been working
> fine, but a few days ago when we tried turning on nat ipfix logging,
> vpp crashed. It turned out that the problem went away if we used only
> a single thread, so it seemed related to how threading was handled in
> the ipfix logging code. The crash happened in different ways on
> different runs but often seemed related to the snat_ipfix_send()
> function in plugins/nat/nat_ipfix_logging.c.
> 
> Having looked at the code in nat_ipfix_logging.c I have the following
> theory about what goes wrong (I might have misunderstood something,
> if so please correct me):
> 
> In the snat_ipfix_send() function, a vlib_main_t data structure
> is used, a pointer to it is fetched in the following way:
> 
>vlib_main_t *vm = frm->vlib_main;
> 
> So the frm->vlib_main pointer comes from "frm" which has been set to
> flow_report_main which is a global data structure from vnet/ipfix-
> export/flow_report.c that as far as I can tell only exists once in
> memory (not once per thread). This means that different threads
> calling the snat_ipfix_send() function are using the same vlib_main_t
> data structure. That is not how it should be, I think, instead each
> thread should be using its own thread-specific vlib_main_t data
> structure.
> 
> A suggestion for how to fix this is to replace the line
> 
>vlib_main_t *vm = frm->vlib_main;
> 
> with the following line
> 
>vlib_main_t *vm = vlib_mains[thread_index];
> 
> in all places where worker threads are using such a vlib_main_t
> pointer. Using vlib_mains[thread_index] means that we are picking the
> thread-specific vlib_main_t data structure for the current thread,
> instead of all threads using the same vlib_main_t. I pushed such a
> change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/26359
> 
> That fix seems to solve the issue in my tests, vpp does not crash
> anymore after the change. Please have a look at it and let me know if
> this seems reasonable or if I have misunderstood something.
> 
> Best regards,
> Elias


[vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler

2020-05-06 Thread Elias Rudberg
Hello VPP experts,

When trying to use the current master branch, we get a segmentation
fault error. Here is what it looks like in gdb:

Thread 3 "vpp_wk_0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fedf91fe700 (LWP 21309)]
rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0,
rxq=0x77edea80, is_mlx5dv=1)
at vpp/src/plugins/rdma/input.c:115
115   *(u64x4 *) (va + 4) = u64x4_byte_swap (*(u64x4 *) (va
+ 4));
(gdb) bt
#0  rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0,
rxq=0x77edea80, is_mlx5dv=1)
at vpp/src/plugins/rdma/input.c:115
#1  0x7fffa84d in rdma_device_input_inline (vm=0x7ff8a5d2f4c0,
node=0x7ff5ccdfee00, frame=0x0, rd=0x7fedd35ed5c0, qid=0, use_mlx5dv=1)
at vpp/src/plugins/rdma/input.c:622
#2  0x7fffabbbae44 in rdma_input_node_fn_skx (vm=0x7ff8a5d2f4c0,
node=0x7ff5ccdfee00, frame=0x0)
at vpp/src/plugins/rdma/input.c:647
#3  0x760e3155 in dispatch_node (vm=0x7ff8a5d2f4c0,
node=0x7ff5ccdfee00, type=VLIB_NODE_TYPE_INPUT,
dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, 
last_time_stamp=66486783453597600) at vpp/src/vlib/main.c:1235
#4  0x760ddbf5 in vlib_main_or_worker_loop (vm=0x7ff8a5d2f4c0,
is_main=0) at vpp/src/vlib/main.c:1815
#5  0x760dd227 in vlib_worker_loop (vm=0x7ff8a5d2f4c0) at
vpp/src/vlib/main.c:1996
#6  0x761345a1 in vlib_worker_thread_fn (arg=0x7fffb74ea980) at
vpp/src/vlib/threads.c:1795
#7  0x75531954 in clib_calljmp () at
vpp/src/vppinfra/longjmp.S:123
#8  0x7fedf91fdce0 in ?? ()
#9  0x7612cd53 in vlib_worker_thread_bootstrap_fn
(arg=0x7fffb74ea980) at vpp/src/vlib/threads.c:584
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

This segmentation fault happens the same way every time I try to start
VPP.

This is in Ubuntu 18.04.4 using the rdma plugin with Mellanox mlx5 NICs
and an Intel Xeon Gold 6126 CPU.

I have looked back at recent changes and found that this problem
started with the commit 4ba16a44 "misc: switch to clang-9" dated April
28. Before that we could use the master branch without this problem.

Changing back to gcc by removing clang in src/CMakeLists.txt makes the
error go away. However, there is then instead a problem with a "symbol
lookup error" for crypto_native_plugin.so: undefined symbol:
crypto_native_aes_cbc_init_avx512 (that problem disappears if the
crypto_native plugin is disabled)

So, two problems:

(1) The segmentation fault itself, perhaps indicating a bug somewhere
but seems to appear only with clang and not with gcc

(2) The "undefined symbol: crypto_native_aes_cbc_init_avx512" problem
when trying to use gcc instead of clang

What do you think about these?

As a short-term fix, is removing clang in src/CMakeLists.txt reasonable
or is there a better/easier workaround?

Does anyone else use the rdma plugin when compiling using clang --
perhaps that combination triggers this problem?

Best regards,
Elias


Re: [vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler

2020-05-06 Thread Elias Rudberg
Hi Dave and Damjan,

Here is instruction and register info:

(gdb) x/i $pc
=> 0x7fffabbbdd67 :   vmovdqa64
-0x30a0(%rbp),%ymm0
(gdb) info registers rbp ymm0
rbp0x7417daf0   0x7417daf0
ymm0   {v8_float = {0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0,
0xfffd}, v4_double = {0x0, 0x37, 0x0, 0xff85}, v32_int8
= {0x0, 0x0, 0x0, 0x10, 
0x3f, 0xf6, 0x41, 0x80, 0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x4b,
0x40, 0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x55, 0x0, 0x0, 0x0, 0x0, 0x10,
0x3f, 0xf6, 0x5e, 
0xc0}, v16_int16 = {0x0, 0x1000, 0xf63f, 0x8041, 0x0, 0x1000,
0xf63f, 0x404b, 0x0, 0x1000, 0xf63f, 0x55, 0x0, 0x1000, 0xf63f,
0xc05e}, v8_int32 = {
0x1000, 0x8041f63f, 0x1000, 0x404bf63f, 0x1000,
0x55f63f, 0x1000, 0xc05ef63f}, v4_int64 = {0x8041f63f1000,
0x404bf63f1000, 
0x55f63f1000, 0xc05ef63f1000}, v2_int128 =
{0x404bf63f10008041f63f1000,
0xc05ef63f1055f63f1000}}

Not sure if I understand all this, but perhaps it means that the value
in %rbp is used as a memory address, but that address 0x7417daf0 is
not 32-byte aligned as it needs to be.

Adding __attribute__((aligned(32))) as Damjan suggests indeed seems to
help. After that there was again a segfault in another place in the
same file, where the same trick of adding __attribute__((aligned(32)))
again helped.

So it seems the problem can be fixed by adding that alignment attribute
in two places, like this:

diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
index cf0b6bffe..324436f01 100644
--- a/src/plugins/rdma/input.c
+++ b/src/plugins/rdma/input.c
@@ -103,7 +103,7 @@ rdma_device_input_refill (vlib_main_t * vm,
rdma_device_t * rd,
 
   if (is_mlx5dv)
 {
-  u64 va[8];
+  u64 va[8] __attribute__((aligned(32)));
   mlx5dv_rwq_t *wqe = rxq->wqes + slot;
 
   while (n >= 1)
@@ -488,7 +488,7 @@ rdma_device_input_inline (vlib_main_t * vm,
vlib_node_runtime_t * node,
   rdma_rxq_t *rxq = vec_elt_at_index (rd->rxqs, qid);
   vlib_buffer_t *bufs[VLIB_FRAME_SIZE], **b = bufs;
   struct ibv_wc wc[VLIB_FRAME_SIZE];
-  u32 byte_cnts[VLIB_FRAME_SIZE];
+  u32 byte_cnts[VLIB_FRAME_SIZE] __attribute__((aligned(32)));
   vlib_buffer_t bt;
   u32 next_index, *to_next, n_left_to_next, n_rx_bytes = 0;
   int n_rx_packets, skip_ip4_cksum = 0;

Many thanks for your help!

Should I push the above as a patch to gerrit?

/ Elias



On Wed, 2020-05-06 at 20:38 +0200, Damjan Marion wrote:
> Can you try this:
> 
> diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
> index cf0b6bffe..b461ee27b 100644
> --- a/src/plugins/rdma/input.c
> +++ b/src/plugins/rdma/input.c
> @@ -103,7 +103,7 @@ rdma_device_input_refill (vlib_main_t * vm,
> rdma_device_t * rd,
> 
>if (is_mlx5dv)
>  {
> -  u64 va[8];
> +  u64 va[8] __attribute__((aligned(32)));
>mlx5dv_rwq_t *wqe = rxq->wqes + slot;
> 
>while (n >= 1)
> 
> 
> Thanks!
> 
> > On 6 May 2020, at 19:45, Elias Rudberg 
> > wrote:
> > 
> > Hello VPP experts,
> > 
> > When trying to use the current master branch, we get a segmentation
> > fault error. Here is what it looks like in gdb:
> > 
> > Thread 3 "vpp_wk_0" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fedf91fe700 (LWP 21309)]
> > rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0,
> > rxq=0x77edea80, is_mlx5dv=1)
> >at vpp/src/plugins/rdma/input.c:115
> > 115   *(u64x4 *) (va + 4) = u64x4_byte_swap (*(u64x4 *) (va
> > + 4));



Re: [vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler

2020-05-06 Thread Elias Rudberg
OK now I updated it (https://gerrit.fd.io/r/c/vpp/+/26934).
Thanks again for your help!
/ Elias


On Thu, 2020-05-07 at 01:58 +0200, Damjan Marion wrote:
> i already pushed one, can you update it instead?
> 
> Thanks
> 


[vpp-dev] Fix in LACP code to avoid assertion failure in vlib_time_now()

2020-05-07 Thread Elias Rudberg
Hello VPP experts,

When trying the current VPP master branch using a debug build we
encountered an assertion failure in vlib_time_now() here:

always_inline f64
vlib_time_now (vlib_main_t * vm)
{
#if CLIB_DEBUG > 0
  extern __thread uword __os_thread_index;
#endif
  /*
   * Make sure folks don't pass &vlib_global_main from a worker thread.
   */
  ASSERT (vm->thread_index == __os_thread_index);
  return clib_time_now (&vm->clib_time) + vm->time_offset;
}

The ASSERT there is triggered because the LACP code passes
&vlib_global_main when it should pass a thread-specific vlib_main_t. So
this looks like precisely the kind of issue that the assertion was made
to catch.

To reproduce the problem I think it should be enough to use LACP in a
multi-threaded scenario using a debug build; then the assertion
failure happens directly at startup, every time.

I pushed a fix, here: https://gerrit.fd.io/r/c/vpp/+/26943
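
The fix follows this simple pattern (a sketch, not the literal patch):

  /* wrong from a worker thread, vlib_global_main belongs to thread 0: */
  f64 t_bad = vlib_time_now (&vlib_global_main);

  /* correct, use the calling thread's own vlib_main_t: */
  f64 t_ok = vlib_time_now (vlib_get_main ());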

After that fix it seems to work; LACP then runs without the assertion
failure. Please have a look and merge if it seems okay.

Best regards,
Elias



[vpp-dev] Assertion failure in nat_get_vlib_main() in snat_init()

2020-05-07 Thread Elias Rudberg
Hello,

With the current master branch (def78344) we now get an assertion
failure on startup, here:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7462e801 in __GI_abort () at abort.c:79
#2  0x004071f3 in os_panic ()
at vpp/src/vpp/vnet/main.c:366
#3  0x7550d7d9 in debugger ()
at vpp/src/vppinfra/error.c:84
#4  0x7550d557 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0, 
fmt=0x7fffacbc0310 "%s:%d (%s) assertion `%s' fails")
at vpp/src/vppinfra/error.c:143
#5  0x7fffacac659e in nat_get_vlib_main (thread_index=4)
at vpp/src/plugins/nat/nat.c:2557
#6  0x7fffacabd7a5 in snat_init (vm=0x7639b980
)
at vpp/src/plugins/nat/nat.c:2685
#7  0x760b9f66 in call_init_exit_functions_internal
(vm=0x7639b980 , 
headp=0x7639bfa8 , call_once=1,
do_sort=1)
at vpp/src/vlib/init.c:350
#8  0x760b9e88 in vlib_call_init_exit_functions
(vm=0x7639b980 , 
headp=0x7639bfa8 , call_once=1)
at vpp/src/vlib/init.c:364
#9  0x760ba011 in vlib_call_all_init_functions
(vm=0x7639b980 )
at vpp/src/vlib/init.c:386
#10 0x760df1f8 in vlib_main (vm=0x7639b980
, input=0x7fffb4b2afa8)
at vpp/src/vlib/main.c:2171
#11 0x76166405 in thread0 (arg=140737324366208)
at vpp/src/vlib/unix/main.c:658
#12 0x75531954 in clib_calljmp ()
at vpp/src/vppinfra/longjmp.S:123
#13 0x7fffcf30 in ?? ()
#14 0x76165f97 in vlib_unix_main (argc=57, argv=0x71d520)
at vpp/src/vlib/unix/main.c:730
#15 0x004068d8 in main (argc=57, argv=0x71d520)
at vpp/src/vpp/vnet/main.c:291

The code looks like this (this part seems to have been added in a
recent commit):

always_inline vlib_main_t *
nat_get_vlib_main (u32 thread_index)
{
  vlib_main_t *vm;
  vm = vlib_mains[thread_index];
  ASSERT (vm);
  return vm;
}

So it is looking at vlib_mains[thread_index] but that is NULL,
apparently.

Since this happens at startup, could it be that vlib_mains has not been
initialized yet and it is too early to try to access it?

Is vlib_mains[thread_index] supposed to be initialized by the time when
vlib_call_all_init_functions() runs?

Best regards,
Elias


Re: [vpp-dev] Assertion failure in nat_get_vlib_main() in snat_init()

2020-05-08 Thread Elias Rudberg
Hi Ole,

Yes, that fixes it!
With that patch my NAT test works, no more assertion failures.

/ Elias


On Fri, 2020-05-08 at 10:06 +0200, Ole Troan wrote:
> Hi Elias,
> 
> Thanks for finding that one.
> Can you verify that this patch fixes it:
> https://gerrit.fd.io/r/c/vpp/+/26951 nat: fix per thread data
> vlib_main_t usage take 2 [NEW] 
> 
> Best regards,
> Ole
> 
> > On 7 May 2020, at 22:57, Elias Rudberg 
> > wrote:
> > 
> > Hello,
> > 
> > With the current master branch (def78344) we now get an assertion
> > failure on startup, here:
> > 
> > (gdb) bt
> > #0  __GI_raise (sig=sig@entry=6) at
> > ../sysdeps/unix/sysv/linux/raise.c:51
> > #1  0x7462e801 in __GI_abort () at abort.c:79
> > #2  0x004071f3 in os_panic ()
> >at vpp/src/vpp/vnet/main.c:366
> > #3  0x7550d7d9 in debugger ()
> >at vpp/src/vppinfra/error.c:84
> > #4  0x7550d557 in _clib_error (how_to_die=2,
> > function_name=0x0,
> > line_number=0, 
> >fmt=0x7fffacbc0310 "%s:%d (%s) assertion `%s' fails")
> >at vpp/src/vppinfra/error.c:143
> > #5  0x7fffacac659e in nat_get_vlib_main (thread_index=4)
> >at vpp/src/plugins/nat/nat.c:2557
> > #6  0x7fffacabd7a5 in snat_init (vm=0x7639b980
> > )
> >at vpp/src/plugins/nat/nat.c:2685
> > #7  0x760b9f66 in call_init_exit_functions_internal
> > (vm=0x7639b980 , 
> >headp=0x7639bfa8 , call_once=1,
> > do_sort=1)
> >at vpp/src/vlib/init.c:350
> > #8  0x760b9e88 in vlib_call_init_exit_functions
> > (vm=0x7639b980 , 
> >headp=0x7639bfa8 , call_once=1)
> >at vpp/src/vlib/init.c:364
> > #9  0x760ba011 in vlib_call_all_init_functions
> > (vm=0x7639b980 )
> >at vpp/src/vlib/init.c:386
> > #10 0x760df1f8 in vlib_main (vm=0x7639b980
> > , input=0x7fffb4b2afa8)
> >at vpp/src/vlib/main.c:2171
> > #11 0x76166405 in thread0 (arg=140737324366208)
> >at vpp/src/vlib/unix/main.c:658
> > #12 0x75531954 in clib_calljmp ()
> >at vpp/src/vppinfra/longjmp.S:123
> > #13 0x7fffcf30 in ?? ()
> > #14 0x76165f97 in vlib_unix_main (argc=57, argv=0x71d520)
> >at vpp/src/vlib/unix/main.c:730
> > #15 0x004068d8 in main (argc=57, argv=0x71d520)
> >at vpp/src/vpp/vnet/main.c:291
> > 
> > The code looks like this (this part was added in a recent commit it
> > seems):
> > 
> > always_inline vlib_main_t *
> > nat_get_vlib_main (u32 thread_index)
> > {
> >  vlib_main_t *vm;
> >  vm = vlib_mains[thread_index];
> >  ASSERT (vm);
> >  return vm;
> > }
> > 
> > So it is looking at vlib_mains[thread_index] but that is NULL,
> > apparently.
> > 
> > Since this happens at startup, could it be that vlib_mains has not
> > been
> > initialized yet, it is too early to try to access it?
> > 
> > Is vlib_mains[thread_index] supposed to be initialized by the time
> > when
> > vlib_call_all_init_functions() runs?
> > 
> > Best regards,
> > Elias
> > 



[vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-26 Thread Elias Rudberg
Hello VPP experts,

When testing the current master branch for NAT with ipfix logging
enabled we encountered a problem with a segmentation fault crash. It
seems like this was caused by a bug in set_ipfix_exporter_command_fn()
in vnet/ipfix-export/flow_report.c where the variable collector_port
is declared as u16:

u16 collector_port = UDP_DST_PORT_ipfix;

and then a few lines later the address of that variable is given as
argument to unformat() with %u like this:

else if (unformat (input, "port %u", &collector_port))

I think that is wrong because %u should correspond to a 32-bit
variable, so when passing the address of a 16-bit variable some data
next to it can get corrupted. In our case what happened was that the
"fib_index" variable that happened to be nearby on the stack got
corrupted, leading to a crash later on.
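
In other words, unformat writes a full 32-bit value through the pointer,
so with a u16 on the stack the two extra bytes land in whatever variable
happens to be adjacent. A safe pattern is to read into a u32 temporary
and bounds-check before narrowing (a sketch of the idea, not the exact
patch):

   u32 tmp_port = UDP_DST_PORT_ipfix;
   /* ... */
   else if (unformat (input, "port %u", &tmp_port))
     ;
   /* ... */
   if (tmp_port > 65535)
     return clib_error_return (0, "invalid port `%u'", tmp_port);
   collector_port = (u16) tmp_port;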

The problem only appears for a release build and not for a debug
build, perhaps because compiler optimization affects how variables are
stored on the stack. It could be that the compiler (clang or gcc) also
matters; that could explain why the problem was not seen earlier.

Here is a fix, please check it and merge if you agree:
https://gerrit.fd.io/r/c/vpp/+/27280

Best regards,
Elias


[vpp-dev] Another fix to avoid assertion failure related to vlib_time_now()

2020-05-26 Thread Elias Rudberg
Hello again,

Here is another fix to avoid an assertion failure due to vlib_time_now()
being called with a vm corresponding to a different thread, in
nat_ipfix_logging.c:

https://gerrit.fd.io/r/c/vpp/+/27281

Please have a look and merge if it seems okay. Maybe it could be done
more elegantly; this way required changes in several places to pass
along the thread_index value.

Best regards,
Elias



Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-27 Thread Elias Rudberg
Hi Ole,

OK, now I have changed the patch to include a bounds check. It still
uses an intermediate u32 variable, however: I tried making
collector_port a u32, but then one of the Gerrit tests failed. I wasn't
able to figure out why, as I could not reproduce that problem on my
end; it happened only in one of the Gerrit test cases. This way, with a
temporary u32 variable that is copied to the u16 collector_port after
the bounds check, the patch both solves the crash for me and passes the
Gerrit tests:

https://gerrit.fd.io/r/c/vpp/+/27280

What do you think, is this an acceptable solution?
(Otherwise it would be necessary to dig deeper into what went wrong in
the gerrit tests when collector_port was declared as u32.)

Best regards,
Elias


On Wed, 2020-05-27 at 09:15 +0200, Ole Troan wrote:
> Hi Elias,
> 
> Thanks for spotting that.
> Just make collector_port a u32 and add a boundary check?
> 
> Best regards,
> Ole
> 
> [...]
> > 
> > Here is a fix, please check it and merge if you agree:
> > https://gerrit.fd.io/r/c/vpp/+/27280
> > 
> > Best regards,
> > Elias
> > 


Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-27 Thread Elias Rudberg
Hi Andrew,

Yes, it was the Basic LISP test. It looked like this in the console.log.gz
for vpp-verify-master-ubuntu1804:

======================================================================
TEST RESULTS:
 Scheduled tests: 1177
  Executed tests: 1176
Passed tests: 1039
   Skipped tests: 137
  Not Executed tests: 1
  Errors: 1
FAILURES AND ERRORS IN TESTS:
  Testcase name: Basic LISP test
  ERROR: Test case for basic encapsulation
  [test_lisp.TestLisp.test_lisp_basic_encap]
TESTCASES WHERE NO TESTS WERE SUCCESSFULLY EXECUTED:
  Basic LISP test
======================================================================

/ Elias



On Wed, 2020-05-27 at 18:42 +0200, Andrew 👽 Yourtchenko wrote:
> Basic LISP test - was it the one that was failing for you ?
> 
> That particular test intermittently failed a couple of times for me
> as well, on a doc-only change, so we have an unrelated issue.
> 
> I am running it locally to see what is going on.
> 
> --a
> 
> > 


Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-28 Thread Elias Rudberg
Hi Andrew,

In my case it failed several times and appeared to be triggered by
seemingly harmless code changes, but it seemed like the problem was
reproducible for a given version of the code. What seemed to matter was
when I changed things related to local variables inside the
set_ipfix_exporter_command_fn() function. The test logs said "Core-file
exists", which I suppose means that vpp crashed. The testing framework
repeats the test several times, saying "3 attempt(s) left", then "2
attempt(s) left" and so on, all those repeated attempts seemed to crash
in the same way.

It could be something with uninitialized variables, e.g. something that
is assumed to be zero but is never explicitly initialized, so it works
when it happens to be zero, but depending on platform and compiler
details there could be some garbage there causing a problem. Then
unrelated code changes, like adding variables somewhere so that things
end up at slightly different memory locations, could make the error
come and go. This is just guessing, of course.

Is it possible to get login access to the machine where the
gerrit/jenkins tests are run, to debug it there where the issue can be
reproduced?

/ Elias


On Wed, 2020-05-27 at 19:03 +0200, Andrew 👽 Yourtchenko wrote:
> Yep, so it looks like we have an issue...
> 
> https://gerrit.fd.io/r/c/vpp/+/27305 has the same failures, I am
> rerunning it now to see how intermittent it is - as well as testing
> the latest master locally....
> 
> --a
> 
> > On 27 May 2020, at 18:56, Elias Rudberg 
> > wrote:
> > 
> > Hi Andrew,
> > 
> > Yes, it was Basic LISP test. It looked like this in the
> > console.log.gz
> > for vpp-verify-master-ubuntu1804:
> > 
> > ======================================================================
> > TEST RESULTS:
> > Scheduled tests: 1177
> >  Executed tests: 1176
> >Passed tests: 1039
> >   Skipped tests: 137
> >  Not Executed tests: 1
> >  Errors: 1
> > FAILURES AND ERRORS IN TESTS:
> >  Testcase name: Basic LISP test 
> >  ERROR: Test case for basic encapsulation
> > [test_lisp.TestLisp.test_lisp_basic_encap]
> > TESTCASES WHERE NO TESTS WERE SUCCESSFULLY EXECUTED:
> >  Basic LISP test 
> > ======================================================================
> > 
> > / Elias
> > 
> > 
> > 
> > On Wed, 2020-05-27 at 18:42 +0200, Andrew 👽 Yourtchenko wrote:
> > > Basic LISP test - was it the one that was failing for you ?
> > > That particular test intermittently failed a couple of times for
> > > me
> > > as well, on a doc-only change, so we have an unrelated issue.
> > > I am running it locally to see what is going on.
> > > --a


Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-28 Thread Elias Rudberg
Hi Andrew,

> Could you push as a separate change the code that reliably gives you
> the error in the LISP unit test

I tried, but today, whatever I do, I cannot reproduce the test failure
anymore. All tests pass now even when I try exactly the same code for
which the test failed yesterday.

For example, Patchset 4 for https://gerrit.fd.io/r/c/vpp/+/27280 failed
yesterday, but now I created Patchset 8 which is identical to Patchset
4, and Patchset 8 passes all tests.

I don't know, maybe something changed in the testing environment since
yesterday, or maybe the issue was never reproducible and it was just a
coincidence that made it seem that way yesterday.

The good news is that the fix I wanted to do now passes the tests also
when written as Ole suggested, with collector_port as u32 and a bounds
check added:

https://gerrit.fd.io/r/c/vpp/+/27280

It would be great if that could get merged.

Best regards,
Elias



Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-28 Thread Elias Rudberg
I changed the fix using %U and a new unformat_l3_port function, as
suggested by Paul:

https://gerrit.fd.io/r/c/vpp/+/27280

This works fine, but I wasn't sure where to put the unformat_l3_port
function. Now it's in vnet/udp/udp_format.c -- let me know if you have
a better idea about where it should be.
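
For the record, the function is essentially of this shape (a sketch
from memory; the gerrit change is the authoritative version):

uword
unformat_l3_port (unformat_input_t * input, va_list * args)
{
  u16 *result = va_arg (*args, u16 *);
  u32 port;
  if (!unformat (input, "%u", &port) || port > 65535)
    return 0;  /* not a number, or out of range for a port */
  *result = (u16) port;
  return 1;
}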

/ Elias



Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash

2020-05-28 Thread Elias Rudberg
Ah. OK, now it's changed to the hopefully better name
"unformat_udp_port".
/ Elias

On Fri, 2020-05-29 at 00:32 +0200, Andrew 👽 Yourtchenko wrote:
> > On 29 May 2020, at 00:02, Elias Rudberg 
> > wrote:
> > 
> > I changed the fix using %U and a new unformat_l3_port function, as
> > suggested by Paul:
> > 
> > https://gerrit.fd.io/r/c/vpp/+/27280
> 
> My opinion it’s an incorrect and unnecessary
> generalization/abstraction:
> 
> 1) port is a L4 concept, not L3. Cf name.
> 
> 2) no one said all L4 ports are/have to be a u16, or that the L4 has
> to have a concept of port. Don’t let TCP/UDP monoculture fool you.
> 
> But, 🤷‍♂️.
> 
> —a
> 



[vpp-dev] Request to include recent collector_port and vm fixes in stable/2005 branch

2020-05-29 Thread Elias Rudberg
Hello,

The following two fixes were recently merged to the master branch.
Could they please be included in the stable/2005 branch also?

https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat u16
collector_port fix)

https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for
vlib_time_now call)

We need them to avoid segmentation fault and assertion failure
problems.

Best regards,
Elias


[vpp-dev] worker thread deadlock for current master branch, started with commit "bonding: adjust link state based on active slaves"

2020-05-29 Thread Elias Rudberg
Hello,

We now get this kind of error for the current master branch (5bb3e81e):

vlib_worker_thread_barrier_sync_int: worker thread deadlock

Testing previous commits indicates the problem started with the recent
commit 9121c415 "bonding: adjust link state based on active slaves"
(AuthorDate May 18, CommitDate May 27).

We can reproduce the problem using the following config:

unix {
  nodaemon
  exec /etc/vpp/commands.txt
}
cpu {
  workers 10
}

where commands.txt looks like this:

create bond mode lacp load-balance l23
create int rdma host-if enp101s0f1 name Interface101
create int rdma host-if enp179s0f1 name Interface179
bond add BondEthernet0 Interface101
bond add BondEthernet0 Interface179
create sub-interfaces BondEthernet0 1012
create sub-interfaces BondEthernet0 1013
set int ip address BondEthernet0.1012 10.1.1.1/30
set int ip address BondEthernet0.1013 10.1.2.1/30
set int state BondEthernet0 up
set int state Interface101 up
set int state Interface179 up
set int state BondEthernet0.1012 up
set int state BondEthernet0.1013 up

Then we get the "worker thread deadlock" every time at startup, after
just a few seconds.

We get the following gdb backtrace (for a release build):

vlib_worker_thread_barrier_sync_int: worker thread deadlock
Thread 3 "vpp_wk_0" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffe027fe700 (LWP 12171)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:51
#1  0x742ff801 in __GI_abort () at abort.c:79
#2  0xc700 in os_panic () at vpp/src/vpp/vnet/main.c:371
#3  0x75dd03ab in vlib_worker_thread_barrier_sync_int
(vm=0x7fffb87c0300, func_name=<optimized out>) at
vpp/src/vlib/threads.c:1517
#4  0x777bfa9c in dpo_get_next_node (child_type=<optimized out>,
child_proto=<optimized out>, parent_dpo=0x7fffb9cebda0) at
vpp/src/vnet/dpo/dpo.c:430
#5  dpo_stack (child_type=<optimized out>, child_proto=<optimized out>,
dpo=<optimized out>, parent=0x7fffb9cebda0) at
vpp/src/vnet/dpo/dpo.c:521
#6  0x777c50ac in load_balance_set_bucket_i (lb=0x7fffb8e784c0,
bucket=<optimized out>, buckets=0x7fffb8e784e0, next=<optimized out>)
at vpp/src/vnet/dpo/load_balance.c:252
#7  load_balance_fill_buckets_norm (lb=0x7fffb8e784c0,
nhs=0x7fffb9cebda0, buckets=0x7fffb8e784e0, n_buckets=<optimized out>)
at vpp/src/vnet/dpo/load_balance.c:525
#8  load_balance_fill_buckets (lb=0x7fffb8e784c0, nhs=0x7fffb9cebda0,
buckets=0x7fffb8e784e0, n_buckets=<optimized out>, flags=<optimized out>)
at vpp/src/vnet/dpo/load_balance.c:589
#9  0x777c4d5f in load_balance_multipath_update (dpo=<optimized out>,
raw_nhs=<optimized out>, flags=<optimized out>) at
vpp/src/vnet/dpo/load_balance.c:88
#10 0x7778e0fc in fib_entry_src_mk_lb
(fib_entry=0x7fffb90dd770, esrc=0x7fffb8c60150,
fct=FIB_FORW_CHAIN_TYPE_UNICAST_IP4, dpo_lb=0x7fffb90dd798)
at vpp/src/vnet/fib/fib_entry_src.c:645
#11 0x7778e4b7 in fib_entry_src_action_install
(fib_entry=0x7fffb90dd770, source=FIB_SOURCE_INTERFACE) at
vpp/src/vnet/fib/fib_entry_src.c:705
#12 0x7778f0b0 in fib_entry_src_action_reactivate
(fib_entry=0x7fffb90dd770, source=FIB_SOURCE_INTERFACE) at
vpp/src/vnet/fib/fib_entry_src.c:1221
#13 0x7778d873 in fib_entry_back_walk_notify
(node=0x7fffb90dd770, ctx=0x7fffb89c21d0) at
vpp/src/vnet/fib/fib_entry.c:316
#14 0x7778343b in fib_walk_advance (fwi=<optimized out>) at
vpp/src/vnet/fib/fib_walk.c:368
#15 0x77784107 in fib_walk_sync (parent_type=<optimized out>,
parent_index=<optimized out>, ctx=0x7fffb89c22a0) at
vpp/src/vnet/fib/fib_walk.c:792
#16 0x7779a43b in fib_path_back_walk_notify (node=<optimized out>,
ctx=0x7fffb89c22a0) at vpp/src/vnet/fib/fib_path.c:1226
#17 0x7778343b in fib_walk_advance (fwi=<optimized out>) at
vpp/src/vnet/fib/fib_walk.c:368
#18 0x77784107 in fib_walk_sync (parent_type=<optimized out>,
parent_index=<optimized out>, ctx=0x7fffb89c2330) at
vpp/src/vnet/fib/fib_walk.c:792
#19 0x777a6dec in adj_glean_interface_state_change
(vnm=<optimized out>, sw_if_index=5, flags=<optimized out>) at
vpp/src/vnet/adj/adj_glean.c:166
#20 adj_nbr_hw_sw_interface_state_change (vnm=<optimized out>,
sw_if_index=5, arg=<optimized out>) at vpp/src/vnet/adj/adj_glean.c:183
#21 0x770e06cc in vnet_hw_interface_walk_sw (vnm=0x77b570f0
<vnet_main>, hw_if_index=<optimized out>, fn=0x777a6da0
<adj_nbr_hw_sw_interface_state_change>, ctx=0x1)
at vpp/src/vnet/interface.c:1062
#22 0x777a6b72 in adj_glean_hw_interface_state_change (vnm=0x2,
hw_if_index=3097238656, flags=<optimized out>) at
vpp/src/vnet/adj/adj_glean.c:205
#23 0x770df60c in call_elf_section_interface_callbacks
(vnm=0x77b570f0 <vnet_main>, if_index=1, flags=<optimized out>,
elts=0x77b571a0 )
at vpp/src/vnet/interface.c:251
#24 vnet_hw_interface_set_flags_helper (vnm=0x77b570f0 <vnet_main>,
hw_if_index=1, flags=VNET_HW_INTERFACE_FLAG_LINK_UP,
helper_flags=<optimized out>)
at vpp/src/vnet/interface.c:331
#25 0x771b300f in bond_enable_collecting_distributing
(vm=<optimized out>, sif=0x7fffb95de168) at
vpp/src/vnet/bonding/cli.c:178
#26 0x7fffad765636 in lacp_mux_action_collecting_distributing
(p1=0x7fffb87c0300, p2=0x7fffb95de168) at
vpp/src/plugins/lacp/mux_machine.c:173
#27 0x7fffad7654ff in lacp_mux_action_attached (p1=0x7ff

Re: [vpp-dev] ixge and rdma drivers

2020-06-02 Thread Elias Rudberg
Hi Chris,

About mlx5, we are using mlx5 cards with the VPP rdma plugin and it is
working fine for us, for VPP 19.08 and newer.

(I think there may be a problem with the rdma plugin for larger MTU
values but for MTU < 2000 or so, everything works fine.)

/ Elias


On Tue, 2020-06-02 at 03:40 -0400, Christian Hopps wrote:
> Hi vpp-dev,
> 
> I've been contemplating trying to use native drivers in place of DPDK
> with the understanding that I may be paying a ~20% penalty by using
> DPDK. So I went to try things out, but had some trouble. The systems
> in paticular I'm interested in have 10GE intel NICs in them which I
> believe would be supported by the ixge driver. I noticed that this
> driver has been marked deprecated in VPP though. Is there a
> replacement or is DPDK required for this NIC?
> 
> I also have systems that have mlx5 (and eventually will have
> connectx-6 cards). These cards appear to be supported by the rdma
> native driver. I was able to create the interfaces and saw TX packets
> but no RX.  Is this driver considered stable and usable in 19.08 (and
> if not which release would it be consider so)?
> 
> Thanks,
> Chris.
> 


Re: [vpp-dev] ixge and rdma drivers

2020-06-02 Thread Elias Rudberg
Hi Ben,

> > (I think there may be a problem with the rdma plugin for larger MTU
> > values but for MTU < 2000 or so, everything works fine.)
>  
> It should work, jumbo support was added in the last months. Or do you
> refer to something else?

I think I mean something else, a problem that I noticed a few weeks ago
but never had time to report it then. Now I tried again and it can
still be reproduced with the current master branch.

The setup is that I have one server running VPP doing NAT44 and then I
have two other servers on inside and outside. This works fine when the
MTU is 1500. Then I set the MTU to 3000 on all involved interfaces and
restart VPP. Now it works as long as only small packets are used, but as
soon as a packet larger than ~2048 bytes appears, VPP stops working.
(Doing e.g. ping -s 2100 is enough to trigger it.) After that VPP is
stuck in some kind of error state from which it does not recover; even
small packets are not forwarded after that.

I tried to investigate further, and it seemed like what happens is
that the RDMA_DEVICE_F_ERROR flag is set in src/plugins/rdma/input.c,
which causes the rdma plugin code to get stuck; the error flag is
never cleared, it seems.

The reason why the larger packet size caused an error seems to be that
the log2_cq_size value used in src/plugins/rdma/input.c is
log2_cq_size = 11, which corresponds to 2^11 = 2048 bytes, which is
roughly the packet size where the problem appears.

So I got the impression that the rdma plugin is limited to 2^11 = 2048
bytes MTU due to the log2_cq_size = 11 value. Maybe that can be
configured somehow? In any case, it seems bad that VPP gets stuck after
one such error appears; it would be better if it just increased an
error counter and dropped the packet.

Best regards,
Elias



[vpp-dev] Assertion failure triggered by "ip mroute add" command (master branch)

2020-06-03 Thread Elias Rudberg
Hello VPP experts,

There seems to be a problem with "ip mroute add" causing an assertion
failure. This happens for the current master branch and the stable/2005
branch, but not for stable/1908 and stable/2001.

Doing the following is enough to see the problem:

create int rdma host-if enp101s0f1 name Interface101
set int ip address Interface101 10.0.0.1/24
ip mroute add 224.0.0.1 via Interface101 Accept

The "ip mroute add" command there then causes an assertion failure.
Backtrace:

Thread 1 "vpp_main" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:51
#1  0x74629801 in __GI_abort () at abort.c:79
#2  0x004071a3 in os_panic () at vpp/src/vpp/vnet/main.c:371
#3  0x755085b9 in debugger () at vpp/src/vppinfra/error.c:84
#4  0x75508337 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0, fmt=0x776b04b0 "%s:%d (%s) assertion `%s' fails")
at vpp/src/vppinfra/error.c:143
#5  0x774d1ed8 in dpo_proto_to_fib (dpo_proto=255) at
vpp/src/vnet/fib/fib_types.c:353
#6  0x77504111 in fib_path_attached_get_adj
(path=0x7fffb602cda0, link=255, dpo=0x7fffa6f3c2e8) at
vpp/src/vnet/fib/fib_path.c:721
#7  0x775038fa in fib_path_resolve (path_index=15) at
vpp/src/vnet/fib/fib_path.c:1949
#8  0x774f6a18 in fib_path_list_paths_add (path_list_index=13,
rpaths=0x7fffb6523b40) at vpp/src/vnet/fib/fib_path_list.c:902
#9  0x775c795a in mfib_entry_src_paths_add
(msrc=0x7fffb6527c10, rpaths=0x7fffb6523b40) at
vpp/src/vnet/mfib/mfib_entry.c:754
#10 0x775c764e in mfib_entry_path_update (mfib_entry_index=1,
source=MFIB_SOURCE_CLI, rpaths=0x7fffb6523b40) at
vpp/src/vnet/mfib/mfib_entry.c:1009
#11 0x775ce98a in mfib_table_entry_paths_update_i (fib_index=0,
prefix=0x7fffa6f3c720, source=MFIB_SOURCE_CLI, rpaths=0x7fffb6523b40)
at vpp/src/vnet/mfib/mfib_table.c:318
#12 0x775ce643 in mfib_table_entry_path_update (fib_index=0,
prefix=0x7fffa6f3c720, source=MFIB_SOURCE_CLI, rpath=0x7fffb5ffa330)
at vpp/src/vnet/mfib/mfib_table.c:335
#13 0x76f18ce2 in vnet_ip_mroute_cmd (vm=0x763969c0
, main_input=0x7fffa6f3cf18, cmd=0x7fffb5efced0) at
vpp/src/vnet/ip/lookup.c:819
#14 0x76093139 in vlib_cli_dispatch_sub_commands
(vm=0x763969c0 , cm=0x76396bf0
, input=0x7fffa6f3cf18, parent_command_index=463)
at vpp/src/vlib/cli.c:568
#15 0x76092fdd in vlib_cli_dispatch_sub_commands
(vm=0x763969c0 , cm=0x76396bf0
, input=0x7fffa6f3cf18, parent_command_index=0)
at vpp/src/vlib/cli.c:528
#16 0x7609218f in vlib_cli_input (vm=0x763969c0
, input=0x7fffa6f3cf18, function=0x0, function_arg=0)
at vpp/src/vlib/cli.c:667
#17 0x7616180b in startup_config_process (vm=0x763969c0
, rt=0x7fffb4a9c480, f=0x0) at
vpp/src/vlib/unix/main.c:366
#18 0x760dd704 in vlib_process_bootstrap (_a=140736226945080)
at vpp/src/vlib/main.c:1502
#19 0x7552c744 in clib_calljmp () at
vpp/src/vppinfra/longjmp.S:123
#20 0x7fffb4d06830 in ?? ()
#21 0x760dd2a2 in vlib_process_startup (vm=0x288,
p=0xcd5b1d5112dc20, f=0xb4d069a0) at vpp/src/vlib/main.c:1524
#22 0x0030b6523520 in ?? ()
#23 0x002f in ?? ()
#24 0x0035b4d429c0 in ?? ()
#25 0x0034 in ?? ()
#26 0x77b775b4 in vlibapi_get_main () at
vpp/src/vlibapi/api_common.h:385
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 

The code at the assertion at fib_types.c:353 looks like this:

fib_protocol_t
dpo_proto_to_fib (dpo_proto_t dpo_proto)
{
  switch (dpo_proto)
    {
    case DPO_PROTO_IP6:
      return (FIB_PROTOCOL_IP6);
    case DPO_PROTO_IP4:
      return (FIB_PROTOCOL_IP4);
    case DPO_PROTO_MPLS:
      return (FIB_PROTOCOL_MPLS);
    default:
      break;
    }
  ASSERT(0);   <--- this assertion is triggered
  return (0);
}

so apparently dpo_proto does not have any of the allowed values.

Testing earlier commits in the git history pointed to the following
seemingly unrelated and harmless refactoring commit as the point when
this problem started:
30cca512c (build: remove valgrind leftovers, 2019-11-25)

What we are trying to do, which has worked for VPP 19.08, is to enable
receiving of multicast packets on a given interface using two commands
like this:

ip mroute add 224.0.0.1 via Interface101 Accept
ip mroute add 224.0.0.1 via local Forward

but now for the master branch the first of those "ip mroute add" lines
gives the assertion failure.

Has something changed regarding how the "ip mroute add" command is to
be used?
If not, could the assertion failure indicate a bug somewhere?

The problem seems easy to reproduce, at least for me the assertion
happens in the same way every time.

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-

Re: [vpp-dev] Assertion failure triggered by "ip mroute add" command (master branch)

2020-06-03 Thread Elias Rudberg
Hi Ben!

> It is probably a bug but I could not reproduce it.
> Note that commit 30cca512c (build: remove valgrind
> leftovers, 2019-11-25) is present in stable/2001
> so probably not the culprit...

Agreed.

> Can you share how you built VPP and your complete startup.conf?
> You seems to be running those commands from startup.conf directly.

Yes, I had those three commands in a file and then pointed to that file
as "exec /path/to/file" in the unix { } part of startup.conf.

Anyway, I got inspired and debugged the issue further myself: the
problem seems to be that the variable payload_proto in
vnet_ip_mroute_cmd() does not get set to anything; it ends up having
whatever value was on the stack, which could be any garbage.

My test works correctly after initializing it to zero, like this:

--- a/src/vnet/ip/lookup.c
+++ b/src/vnet/ip/lookup.c
@@ -661,7 +661,7 @@ vnet_ip_mroute_cmd (vlib_main_t * vm,
   unformat_input_t _line_input, *line_input = &_line_input;
   fib_route_path_t rpath, *rpaths = NULL;
   clib_error_t *error = NULL;
-  u32 table_id, is_del, payload_proto;
+  u32 table_id, is_del, payload_proto = 0;

If you want to reproduce the problem, you can simply set
payload_proto=77 (or whatever) instead of payload_proto=0 there, to
mimic garbage on the stack.

Just setting payload_proto=0 is probably not a good fix though; I guess
that just means hard-coding the FIB_PROTOCOL_IP4 value, which happens
to work in my case.

To fix it properly, I think payload_proto should be set to the
appropriate protocol in the different "else if" clauses: whenever
pfx.fp_proto is set, payload_proto should also be set, in the same way
as it is done in the vnet_ip_route_cmd() function.
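
Roughly like this in each clause (a sketch of the idea, not the exact
patch text):

/* Sketch: set payload_proto together with fp_proto, mirroring what
   vnet_ip_route_cmd() does. */
else if (unformat (line_input, "%U/%d",
                   unformat_ip4_address, &pfx.fp_grp_addr.ip4,
                   &pfx.fp_len))
  {
    pfx.fp_proto = FIB_PROTOCOL_IP4;
    payload_proto = DPO_PROTO_IP4; /* was left uninitialized before */
  }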

I pushed a fix like that to gerrit, please have a look: 
https://gerrit.fd.io/r/c/vpp/+/27416

Best regards,
Elias

P.S.
By the way, do you think address sanitizer could be used to find this
kind of bug?
(Or perhaps there is a compiler option to poison the stack at each
function call, or something like that. I think it's a common problem
that code relies on uninitialized things being zero, and that can go
undetected for a long time because memory often does happen to be zero;
forcing something nonzero could help detect such bugs.)

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16649): https://lists.fd.io/g/vpp-dev/message/16649
Mute This Topic: https://lists.fd.io/mt/74649468/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma

2020-06-26 Thread Elias Rudberg
Hello VPP experts,

There seems to be a problem with the RDMA driver in VPP when using
Mellanox ConnectX5 network interfaces. This problem appears for the
master branch and for the stable/2005 branch, while stable/2001 does
not have this problem.

The problem is that when a frame with 2 packets is to be sent, only the
first packet is sent directly while the second packet gets delayed. The
second packet seems to be sent only later: when some other frame with
other packets is to be sent, the delayed earlier packet is also sent.

Perhaps this can go undetected if there is lots of traffic all the
time, since there is then always new traffic to flush out any delayed
packets from earlier. So to reproduce it, it seems best to have a
testing setup with very little traffic, such that there are several
seconds without any traffic; then packets can get delayed for several
seconds. Note that the delay is not seen inside VPP: packet traces look
as if the packets were sent directly. VPP thinks they are sent, but it
seems some packets are held in the NIC and only sent later on.
Monitoring traffic arriving at the other end shows that there was a
delay.

The behavior seems reproducible, except when there is other traffic
being sent soon after since that causes the delayed packets to be sent.

The specific case when this came up for us was when using VPP for NAT
with ipfix logging turned on, and doing some ping tests. Then when a
single ping echo request packet is to be NATed, that usually works
fine, but sometimes there is also an ipfix logging packet to be sent
that ends up in the same frame, so that the frame has 2 packets. Then
the ipfix logging packet gets sent directly while the ICMP packet is
delayed, sometimes by so much that the ping fails with a timeout. I
don't think the problem has anything to do with NAT or ipfix logging;
it seems like a more general problem with the rdma plugin.

Testing previous commits indicates that the problem started with this
commit:

dc812d9a7 (rdma: introduce direct verb for Cx4/5 tx, 2019-12-16)

That commit exists in master and in stable/2005 but not in stable/2001
which fits with that this problem is seen for master and stable/2005
but not for stable/2001.

Tried updating to the latest Mellanox driver (v5.0-2.1.8) but that did
not help.

In the code in src/plugins/rdma/output.c it seems like the function
rdma_device_output_tx_mlx5() is handling the packets, but I was not
able to fully understand how it works. There is a concept of a
"doorbell" function call there; apparently the idea is that when
packets are to be sent, info about the packets is prepared and then the
"doorbell" is used to alert the NIC that there are things to send. From
my limited understanding, it seems like the doorbell currently results
in only the first packet actually being sent by the NIC directly, while
the remaining packets are somehow stored and sent later. So far I don't
understand exactly why that happens or how to fix it.
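
My mental model of the mechanism, as a sketch with invented names
(write_wqe() and ring_doorbell() are not the actual rdma plugin code):

/* Write one work-queue entry (WQE) per packet, make the WQEs visible
   with a store barrier, then write the new tail to the doorbell
   record so the NIC knows how far to process. */
for (int i = 0; i < n_pkts; i++)
  write_wqe (txq, tail + i, bufs[i]);
CLIB_MEMORY_STORE_BARRIER ();
ring_doorbell (txq, tail + n_pkts); /* should advertise ALL n_pkts */

If the doorbell write only advertises the first WQE, or the NIC only
fetches one, the remaining packets would sit in the queue until the
next doorbell, which would match what we observe.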

As a workaround, it seems to work to simply revert the entire rdma
plugin to the way it looks in the stable/2001 branch, then the problem
seems to disappear. But that probably means we lose performance gains
and other improvements in the newer code.

Can someone with insight in the rdma plugin please help try to fix
this?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16822): https://lists.fd.io/g/vpp-dev/message/16822
Mute This Topic: https://lists.fd.io/mt/75120690/21656
Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox
Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma

2020-06-26 Thread Elias Rudberg
Hi Ben,

Thanks, now I tried it (the Patchset 2 variant) but it seems to behave
like before; the delay is still happening.

Let me know if you have something more I could try.

/ Elias


On Fri, 2020-06-26 at 12:04 +, Benoit Ganne (bganne) via
lists.fd.io wrote:
> Hi Elias,
> 
> Thanks for the detailed report. I suspect you are correct, it seems
> to be related to the doorbell update to notify the NIC there are some
> work to do.
> Could you check https://gerrit.fd.io/r/c/vpp/+/27708 and report
> whether it fixes the issue?
> 
> Best
> ben

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16831): https://lists.fd.io/g/vpp-dev/message/16831
Mute This Topic: https://lists.fd.io/mt/75120690/21656
Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox
Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma

2020-06-26 Thread Elias Rudberg
Hi Ben,
Thanks, I tested that now but it did not help; it behaves the same also
with "MLX5_SHUT_UP_BF=1" set.
/ Elias


> Can you try to export "MLX5_SHUT_UP_BF=1" in your environment before
> starting VPP (ie, VPP environment must contain this)? This should
> disable the "BlueFlame" mechanism in Mellanox NIC. Otherwise I'll
> need to take a deeper look.
> 
> Best
> ben
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16833): https://lists.fd.io/g/vpp-dev/message/16833
Mute This Topic: https://lists.fd.io/mt/75120690/21656
Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox
Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions

2020-07-02 Thread Elias Rudberg
Hello VPP experts,

There seems to be a problem with the way port number is selected for
NAT: sometimes the selected port number leads to a different thread
index being selected for out2in packets, making that session useless.
This applies to the current master branch as well as the latest stable
branches, I think.

Here is the story as I understand it, please correct me if I have
misunderstood something. Each NAT thread has a range of port numbers
that it can use, and when a new session is created a port number is
picked at random from within that range. That happens when a in2out
packet is NATed. Then later when a response comes as a out2in packet,
VPP needs to make sure it is handled by the correct thread, the same
thread that created the session.

The port number to use for a new session is selected in
nat_alloc_addr_and_port_default() like this:

portnum = (port_per_thread * snat_thread_index) + snat_random_port(1,
port_per_thread) + 1024;

where port_per_thread is the number of ports each thread is allowed to
use, and snat_random_port() returns a random number in the given range.
This means that the smallest possible portnum is 1025, that can happen
when snat_thread_index is zero.

The corresponding calculation to get the thread index back based on the
port number is essentially this:

(portnum - 1024) / port_per_thread

This works most of the time, but not always. It works in all cases
except when snat_random_port() returns the largest possible value; in
that case we end up with the wrong thread index. That means that out2in
packets arriving for that session get handed off to another thread. The
other thread is unaware of that session so all out2in packets are then
dropped for that session.
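
A worked example with round numbers (assuming port_per_thread = 8000
and a session created by thread 0):

/* snat_random_port (1, 8000) returns its maximum value 8000: */
u32 portnum = (8000 * 0) + 8000 + 1024; /* = 9024 */
u32 thread_idx = (9024 - 1024) / 8000;  /* = 1, but should be 0 */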

Since each thread has thousands of port numbers to choose from and the
problem only appears for one particular choice, only a small fraction
of all sessions are affected by this. In my tests there were 8 NAT
threads; the port_per_thread value was then about 8000, so roughly
1/8000, or about 0.0125%, of all sessions failed.

The test I used was simply to try many separate ping commands with the
"-c 1" option, all should give the normal result "1 packets
transmitted, 1 received, 0% packet loss" but due to this problem some
of the pings fail. Note that it needs to be separate ping commands so
that VPP creates a new session for each of them. Provided that you test
a large enough number of sessions, it is straightforward to reproduce
the problem.

It could be fixed in different ways, one way is to simply shift the
arguments to snat_random_port() down by one:
snat_random_port(1, port_per_thread)
-->
snat_random_port(0, port_per_thread-1)

I pushed such a change to gerrit, here: 
https://gerrit.fd.io/r/c/vpp/+/27786

The smallest port number used then becomes 1024 instead of 1025 as it
has been so far. I suppose that should be OK since it is the "well-
known ports" from 0 to 1023 that should be avoided; port 1024 should be
okay to use. What do you think, does it make sense to fix it in this
way?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16880): https://lists.fd.io/g/vpp-dev/message/16880
Mute This Topic: https://lists.fd.io/mt/75267169/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP 20.05.1 tomorrow 15th July 2020

2020-07-14 Thread Elias Rudberg
Hello Andrew,

The following two fixes have been merged to the master branch, it would
be good to have them in stable/2005 also:

https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat u16
collector_port fix)

https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for
vlib_time_now call)

Best regards,
Elias


On Tue, 2020-07-14 at 19:04 +0200, Andrew Yourtchenko wrote:
> Hi all,
> 
> As agreed on the VPP community call today, we will declare the
> current stable/2005 branch as v20.05.1 tomorrow (15th July)
> 
> If you have any fixes that are already in master but not yet in
> stable/2005, that you want to get in there - please let  me know
> before noon UTC.
> 
> --a
> Your friendly release manager 
> 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16966): https://lists.fd.io/g/vpp-dev/message/16966
Mute This Topic: https://lists.fd.io/mt/75503386/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP 20.05.1 tomorrow 15th July 2020

2020-07-15 Thread Elias Rudberg
Hi Andrew,

I don't know how to cherry-pick. I was under the impression that only
the trusted committers were allowed to do that; maybe I misunderstood
that.

What I know so far about the gerrit system is what I read here: 
https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review

Is there a guide somewhere describing how to do cherry-picking?
Alternatively, could you do it for me?

/ Elias


On Wed, 2020-07-15 at 12:27 +0200, Andrew 👽 Yourtchenko wrote:
> Hi Elias, sure, feel free to cherry-pick to stable/2005 branch and
> add me as a reviewer, then I can merge when JJB gives thumbs up.
> 
> --a
> 
> > On 15 Jul 2020, at 07:25, Elias Rudberg 
> > wrote:
> > 
> > Hello Andrew,
> > 
> > The following two fixes have been merged to the master branch, it
> > would
> > be good to have them in stable/2005 also:
> > 
> > https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat
> > u16
> > collector_port fix)
> > 
> > https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for
> > vlib_time_now call)
> > 
> > Best regards,
> > Elias
> > 
> > 
> > > On Tue, 2020-07-14 at 19:04 +0200, Andrew Yourtchenko wrote:
> > > Hi all,
> > > 
> > > As agreed on the VPP community call today, we will declare the
> > > current stable/2005 branch as v20.05.1 tomorrow (15th July)
> > > 
> > > If you have any fixes that are already in master but not yet in
> > > stable/2005, that you want to get in there - please let  me know
> > > before noon UTC.
> > > 
> > > --a
> > > Your friendly release manager 
> > > -=-=-=-=-=-=-=-=-=-=-=-
> > > 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16970): https://lists.fd.io/g/vpp-dev/message/16970
Mute This Topic: https://lists.fd.io/mt/75503386/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions

2020-07-23 Thread Elias Rudberg
Hello,
Just a reminder about this, see below.
Best regards,
Elias

 Forwarded Message 
From: Elias Rudberg 
To: vpp-dev@lists.fd.io 
Subject: [vpp-dev] NAT port number selection problem, leads to wrong 
thread index for some sessions
Date: Thu, 02 Jul 2020 20:43:12 +

Hello VPP experts,

There seems to be a problem with the way port number is selected for
NAT: sometimes the selected port number leads to a different thread
index being selected for out2in packets, making that session useless.
This applies to the current master branch as well as the latest stable
branches, I think.

Here is the story as I understand it, please correct me if I have
misunderstood something. Each NAT thread has a range of port numbers
that it can use, and when a new session is created a port number is
picked at random from within that range. That happens when a in2out
packet is NATed. Then later when a response comes as a out2in packet,
VPP needs to make sure it is handled by the correct thread, the same
thread that created the session.

The port number to use for a new session is selected in
nat_alloc_addr_and_port_default() like this:

portnum = (port_per_thread * snat_thread_index) + snat_random_port(1,
port_per_thread) + 1024;

where port_per_thread is the number of ports each thread is allowed to
use, and snat_random_port() returns a random number in the given range.
This means that the smallest possible portnum is 1025, that can happen
when snat_thread_index is zero.

The corresponding calculation to get the thread index back based on the
port number is essentially this:

(portnum - 1024) / port_per_thread

This works most of the time, but not always. It works in all cases
except when snat_random_port() returns the largest possible value; in
that case we end up with the wrong thread index. That means that out2in
packets arriving for that session get handed off to another thread. The
other thread is unaware of that session so all out2in packets are then
dropped for that session.

Since each thread has thousands of port numbers to choose from and the
problem only appears for one particular choice, only a small fraction
of all sessions are affected by this. In my tests there were 8 NAT
threads; the port_per_thread value was then about 8000, so roughly
1/8000, or about 0.0125%, of all sessions failed.

The test I used was simply to try many separate ping commands with the
"-c 1" option, all should give the normal result "1 packets
transmitted, 1 received, 0% packet loss" but due to this problem some
of the pings fail. Note that it needs to be separate ping commands so
that VPP creates a new session for each of them. Provided that you test
a large enough number of sessions, it is straightforward to reproduce
the problem.

It could be fixed in different ways, one way is to simply shift the
arguments to snat_random_port() down by one:
snat_random_port(1, port_per_thread)
-->
snat_random_port(0, port_per_thread-1)

I pushed such a change to gerrit, here: 
https://gerrit.fd.io/r/c/vpp/+/27786

The smallest port number used then becomes 1024 instead of 1025 as it
has been so far. I suppose that should be OK since it is the "well-
known ports" from 0 to 1023 that should be avoided; port 1024 should be
okay to use. What do you think, does it make sense to fix it in this
way?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17052): https://lists.fd.io/g/vpp-dev/message/17052
Mute This Topic: https://lists.fd.io/mt/75267169/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP load estimation

2020-11-11 Thread Elias Rudberg
Hi Ben,

> Yes, it is the main way to quickly assess VPP load, see 
> https://fd.io/docs/vpp/master/troubleshooting/cpuusage.html#vpp-cpu-load
> My very crude rule-of-thumb looks like this (but your mileage may
> vary):
>  - between 0 and 50: VPP is not working too hard
>  - between 50 and 100: VPP is starting to be pushed hard
>  - above 100: you'll probably experience drops with bursts
>  - 250+: you're dropping traffic

Is it possible to get this information using the python API instead of
the vppctl "show runtime" command?

In our case we have some monitoring tools that fetch statistics from
VPP regularly, like several times each minute. So then we would like to
do it in a way that does not cause performance problems. Is it a bad
idea to use the vppctl "show runtime" command frequently (it causes a
thread barrier I think) and if so, is there a better way of getting the
corresponding information?

I also have another question related to load estimation: we are using
VPP for NAT44 and we are seeing a significant number (like 1000 per
second) of congestion drops (meaning that a NAT thread wants to handoff
packets to another thread but the handoff queue is full). Then we
looked at the "show runtime" output and expected to see some large
values for the vector rate there, but it just shows values like 7 and
similar, far below 50, which by your rule of thumb should indicate that
VPP is not working too hard. In this case, are there some other
statistics we could look at to figure out what is happening? One theory
is that there are some short bursts of more intense traffic causing our
drops, that we do not see with "show runtime" because statistics there
are smeared out over time. Are there some other statistics we could use
to understand if that is the case, or better ways to investigate this
kind of problem?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#17984): https://lists.fd.io/g/vpp-dev/message/17984
Mute This Topic: https://lists.fd.io/mt/78132591/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2020-11-13 Thread Elias Rudberg
Hello VPP experts,

We are using VPP for NAT44 and we get some "congestion drops", in a
situation where we think VPP is far from overloaded in general. Then
we started to investigate if it would help to use a larger handoff
frame queue size. In theory at least, allowing a longer queue could
help avoiding drops in case of short spikes of traffic, or if it
happens that some worker thread is temporarily busy for whatever
reason.

The NAT worker handoff frame queue size is hard-coded in the
NAT_FQ_NELTS macro in src/plugins/nat/nat.h where the current value is
64. The idea is that putting a larger value there could help.

We have run some tests where we changed the NAT_FQ_NELTS value from 64
to a range of other values, each time rebuilding VPP and running an
identical test, a test case that is to some extent trying to mimic our
real traffic, although of course it is simplified. The test runs many
iperf3 tests simultaneously using TCP, combined with some UDP traffic
chosen to trigger VPP to create more new sessions (to make the NAT
"slowpath" happen more).

The following NAT_FQ_NELTS values were tested:
16
32
64  <-- current value
128
256
512
1024
2048  <-- best performance in our tests
4096
8192
16384
32768
65536
131072

In those tests, performance was very bad for the smallest NAT_FQ_NELTS
values of 16 and 32, while values larger than 64 gave improved
performance. The best results in terms of throughput were seen for
NAT_FQ_NELTS=2048. For even larger values than that, we got reduced
performance compared to the 2048 case.

The tests were done for VPP 20.05 running on a Ubuntu 18.04 server
with a 12-core Intel Xeon CPU and two Mellanox mlx5 network cards. The
number of NAT threads was 8 in some of the tests and 4 in some of the
tests.

According to these tests, the effect of changing NAT_FQ_NELTS can be
quite large. For example, for one test case chosen such that
congestion drops were a significant problem, the throughput increased
from about 43 to 90 Gbit/second with the amount of congestion drops
per second reduced to about one third. In another kind of test,
throughput increased by about 20% with congestion drops reduced to
zero. Of course such results depend a lot on how the tests are
constructed. But anyway, it seems clear that the choice of
NAT_FQ_NELTS value can be important and that increasing it would be
good, at least for the kind of usage we have tested now.

Based on the above, we are considering changing NAT_FQ_NELTS from 64
to a larger value and start trying that in our production environment
(so far we have only tried it in a test environment).

Were there specific reasons for setting NAT_FQ_NELTS to 64?

Are there some potential drawbacks or dangers of changing it to a
larger value?

Would you consider changing to a larger value in the official VPP
code?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18012): https://lists.fd.io/g/vpp-dev/message/18012
Mute This Topic: https://lists.fd.io/mt/78230881/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2020-11-16 Thread Elias Rudberg
Hi Klement,

Thanks! I have now tested your patch (28980); it seems to work and it
does give some improvement. However, according to my tests, increasing
NAT_FQ_NELTS seems to have a bigger effect; it improves performance a
lot. When using the original NAT_FQ_NELTS value of 64, your patch
gives some improvement but I still get the best performance when
increasing NAT_FQ_NELTS.

For example, one of the tests behaves like this:

Without patch, NAT_FQ_NELTS=64  --> 129 Gbit/s and ~600k cong. drops
With patch, NAT_FQ_NELTS=64  --> 136 Gbit/s and ~400k cong. drops
Without patch, NAT_FQ_NELTS=1024  --> 151 Gbit/s and 0 cong. drops
With patch, NAT_FQ_NELTS=1024  --> 151 Gbit/s and 0 cong. drops

So it still looks like increasing NAT_FQ_NELTS would be good, which
brings me back to the same questions as before:

Were there specific reasons for setting NAT_FQ_NELTS to 64?

Are there some potential drawbacks or dangers of changing it to a
larger value?

I suppose everyone will agree that when there is a queue with a
maximum length, the choice of that maximum length can be important. Is
there some particular reason to believe that 64 would be enough? In
our case we are using 8 NAT threads. Suppose thread 8 is held up
briefly because something takes a little longer than usual, and
meanwhile threads 1-7 each hand off 10 frames to thread 8; that
situation would require a queue size of at least 70 (spelled out
below), unless I misunderstood how the handoff mechanism works. To me,
allowing a longer queue seems like a good thing because it allows us to
handle more difficult cases where threads are not always equally fast;
there can be spikes in traffic that affect some threads more than
others, things like that. But maybe there are strong reasons for
keeping the queue short, reasons I don't know about; that's why I'm
asking.
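
Spelled out:

/* Worst case from the scenario above: 8 NAT threads, thread 8 briefly
   stalled while each of the other 7 threads hands off 10 frames. */
int queue_demand = 7 * 10; /* = 70 > NAT_FQ_NELTS = 64 */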

Best regards,
Elias


On Fri, 2020-11-13 at 15:14 +, Klement Sekera -X (ksekera -
PANTHEON TECH SRO at Cisco) wrote:
> Hi Elias,
> 
> I’ve already debugged this and came to the conclusion that it’s the
> infra which is the weak link. I was seeing congestion drops at mild
> load, but not at full load. Issue is that with handoff, there is
> uneven workload. For simplicity’s sake, just consider thread 1
> handing off all the traffic to thread 2. What happens is that for
> thread 1, the job is much easier, it just does some ip4 parsing and
> then hands packet to thread 2, which actually does the heavy lifting
> of hash inserts/lookups/translation etc. 64 element queue can hold 64
> frames, one extreme is 64 1-packet frames, totalling 64 packets,
> other extreme is 64 255-packet frames, totalling ~16k packets. What
> happens is this: thread 1 is mostly idle and just picking a few
> packets from NIC and every one of these small frames creates an entry
> in the handoff queue. Now thread 2 picks one element from the handoff
> queue and deals with it before picking another one. If the queue has
> only 3-packet or 10-packet elements, then thread 2 can never really
> get into what VPP excels in - bulk processing.
> 
> Q: Why doesn’t it pick as many packets as possible from the handoff
> queue? 
> A: It’s not implemented.
> 
> I already wrote a patch for it, which made all congestion drops which
> I saw (in above synthetic test case) disappear. Mentioned patch 
> https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit.
> 
> Would you like to give it a try and see if it helps your issue? We
> shouldn’t need big queues under mild loads anyway …
> 
> Regards,
> Klement
> 

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18039): https://lists.fd.io/g/vpp-dev/message/18039
Mute This Topic: https://lists.fd.io/mt/78230881/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2020-11-17 Thread Elias Rudberg
Hi Klement,

> I see no reason why this shouldn’t be configurable.
> [...]
> Would you like to submit a patch?

Sure, I'll give that a try, adding it as a config option of the same
kind as other NAT options.

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18061): https://lists.fd.io/g/vpp-dev/message/18061
Mute This Topic: https://lists.fd.io/mt/78230881/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma

2020-11-24 Thread Elias Rudberg
Hi Ben,

Returning to this issue, last discussed in June.

> > Thanks, now I tried it (the Patchset 2 variant) but it seems to
> > behave like before; the delay is still happening.
> 
> Hmm thanks ☹
> Can you try to export "MLX5_SHUT_UP_BF=1" in your environment before
> starting VPP (ie, VPP environment must contain this)? This should
> disable the "BlueFlame" mechanism in Mellanox NIC. Otherwise I'll
> need to take a deeper look.

Unfortunately that did not help, it seemed to behave the same also
with "MLX5_SHUT_UP_BF=1" set.

We are still having this problem now, with the current master branch.
Like before, the behavior seems to be that when 2 packets are to be
sent, only the first one gets sent directly while the second packet
gets delayed. I have a test case now where the delay is more than 3
seconds. It seems the delay lasts until something else is to be sent;
then the old packet gets sent as well. So nothing gets lost, just
delayed. But things can still fail, for example some ping tests fail
because they time out.

I have looked a bit more at it and tried to understand what happens,
but I did not get much wiser; it still just seems to me like VPP rings
the "doorbell" and expects the packets to be sent, but somehow only one
packet is sent and the other is delayed.

Am I right to assume that the "doorbell" action is the last thing VPP
is doing that we can check in the VPP source code itself, and that we
would then need to poke around inside the underlying rdma-core driver to see
what is happening?
Can you help more with this?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18130): https://lists.fd.io/g/vpp-dev/message/18130
Mute This Topic: https://lists.fd.io/mt/75120690/21656
Mute #mellanox:https://lists.fd.io/g/vpp-dev/mutehashtag/mellanox
Mute #rdma:https://lists.fd.io/g/vpp-dev/mutehashtag/rdma
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] After recent "interface: improve logging" commit, "Secondary MAC Addresses not supported" message appears, does this mean something is wrong?

2020-11-25 Thread Elias Rudberg
Hello VPP experts,

Using the current master branch, we now get log messages like this
(shown by journalctl in red color):

Nov 25 15:10:29 vnet[...]: interface: hw_add_del_mac_address:
vnet_hw_interface_add_del_mac_address: Secondary MAC Addresses not
supported for interface index 0
Nov 25 15:10:29 vnet[...]: interface: hw_add_del_mac_address:
vnet_hw_interface_add_del_mac_address: Secondary MAC Addresses not
supported for interface index 0

This seems to have started with the commit d1bd5d26 "interface: improve
logging" on November 23. Even though the commit message says it was
only a logging change, I still wonder if the message is correct and if
so, if it means that something is wrong with the way we have configured
VPP.

Here is an example of VPP commands leading to those log messages:

create bond mode lacp load-balance l23
create int rdma host-if enp101s0f1 name i1
create int rdma host-if enp179s0f1 name i2
bond add BondEthernet0 i1
bond add BondEthernet0 i2
create sub-interfaces BondEthernet0 1
create sub-interfaces BondEthernet0 2
set int ip address BondEthernet0.1 10.0.0.1/30

The "set int ip address" command there triggers two such "Secondary MAC
Addresses not supported" messages -- what does that mean for the
config above?
Should we do something differently to avoid the error messages?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18135): https://lists.fd.io/g/vpp-dev/message/18135
Mute This Topic: https://lists.fd.io/mt/78500983/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] NAT memory usage problem for VPP 20.09 compared to 20.05 due to larger translation_buckets value

2020-11-26 Thread Elias Rudberg
Hello VPP experts,

We are using VPP for NAT44 and are currently looking at how to move
from VPP 20.05 to 20.09. There are some differences in the way the NAT
plugin is configured.

One difficulty for us is the maximum number of sessions allowed; we
need to handle large numbers of sessions, so that limit can be
important for us. For VPP 20.05 we have used "translation hash buckets
1048576" and then the maximum number of sessions per thread becomes 10
times that because of this line in the source code in snat_config():

sm->max_translations = 10 * translation_buckets;

So then we got a limit of about 10 million sessions per thread, which
we have been happy with so far.

With VPP 20.09 however, things have changed so that the maximum number
of sessions is now configured explicitly, and the relationship between
max_translations_per_thread and translation_buckets is no longer a
factor of 10 but instead given by the nat_calc_bihash_buckets()
function:

static u32
nat_calc_bihash_buckets (u32 n_elts)
{
  return 1 << (max_log2 (n_elts >> 1) + 1);
}

The above function allocates between 1 and 2 buckets per session,
whereas the old scheme used 1 bucket per 10 sessions. So, if I
understood this correctly, for a given maximum number of sessions the
corresponding translation_buckets value will be something like 10 to
20 times larger in VPP 20.09 compared to how it was in VPP 20.05,
leading to a significantly increased memory requirement given that we
want to have the same maximum number of sessions as before.
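
To make this concrete with our numbers (max_log2 being the ceiling
log2):

/* Old scheme (20.05), our config:
     translation_buckets = 1048576
     max sessions per thread = 10 * 1048576 = 10485760
   New scheme (20.09), same session limit:
     nat_calc_bihash_buckets (10485760)
       = 1 << (max_log2 (10485760 >> 1) + 1)
       = 1 << (23 + 1)
       = 16777216 buckets, i.e. 16x as many as before. */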

It seems a little strange that the translation_buckets value would
change so much between VPP versions; was that change intentional? The
old relationship "max_translations = 10 * translation_buckets" seems
to have worked well in practice, at least for our use case.

What could we do to get around this, if we want to switch to VPP 20.09
but without reducing the maximum number of sessions? If we were to
simply divide the nat_calc_bihash_buckets() value by 8 or so to make
it more similar to how it was earlier, would that lead to other
problems?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18160): https://lists.fd.io/g/vpp-dev/message/18160
Mute This Topic: https://lists.fd.io/mt/78533277/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] minor doc change

2020-11-29 Thread Elias Rudberg
On Sat, 2020-11-28 at 13:18 -0500, Paul Vinciguerra wrote:
> 
> We don't see pull requests.  Github is just a mirror of the gerrit
> repo.

I think it would be good if that could be clarified on the github page.
When people search for "vpp source code" or similar, I think they will
often end up on the github page and it's not immediately obvious from
there that it's only a mirror. (People might get the wrong idea about
some things, for example github shows a "contributors" list which I
guess is not accurate as it only shows authors who happen to have
github accounts that are linked to gerrit in some way?)

The github page says, under "About" to the right, "No description,
website, or topics provided." So there is apparently a possibility to
enter a "description", perhaps that could be used to indicate that it
is just a mirror?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18190): https://lists.fd.io/g/vpp-dev/message/18190
Mute This Topic: https://lists.fd.io/mt/78559913/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] minor doc change

2020-11-29 Thread Elias Rudberg
Hi Hemant,

> I agree with Elias.  Long term, maybe use of gerrit
> is deprecated and github is used.

Perhaps I should clarify that I did not mean to recommend moving VPP to
github. On the contrary, I think it is good that the VPP source code is
managed independently from github and I hope it will stay that way.

My point was just that in the current situation when there is a github
mirror, it would be good to make that more clear to avoid confusion.
Another way to avoid confusion would be to remove the code from github
(that would be fine as I see it but of course there will be different
opinions about that).

> Github is free for public repos.

That depends on what you mean by "free". It could be argued that there
is a cost in terms of control over the project and being able to do
what you want in the future. Moving something to github means partly
giving up control.

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18193): https://lists.fd.io/g/vpp-dev/message/18193
Mute This Topic: https://lists.fd.io/mt/78559913/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?

2020-11-29 Thread Elias Rudberg
Hello everyone,

For VPP 20.05 the following works to extract /sys/vector_rate
statistics:

#!/usr/bin/python3
from vpp_papi.vpp_stats import VPPStats
stat = VPPStats("/run/vpp/stats.sock")
dir = stat.ls(['^/sys/vector_rate'])
counters = stat.dump(dir)
vector_rate=counters.get('/sys/vector_rate')
print("vector_rate = ", vector_rate)

Unfortunately, with VPP 20.09 the stat client crashes when doing that.
Seems like a problem introduced by https://gerrit.fd.io/r/c/vpp/+/28017
 (stats: remove offsets on vpp side) and fixed in master by 
https://gerrit.fd.io/r/c/vpp/+/29569 (stats: missing dimension in
stat_set_simple_counter).

I was hoping this could be fixed by cherry-picking the fix into the
stable/2009 branch which I tried here: 
https://gerrit.fd.io/r/c/vpp/+/30161
However, that does not pass the jenkins tests, due to some problem
related to "vom", which was recently deprecated in the master branch;
that might explain why the fix builds in master but not in stable/2009.
Still, the fix does work for me in stable/2009; maybe a different
compiler version or other details matter and cause some of the jenkins
builds to fail.

How to get around this, to make the stat client work for stable/2009
also?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18196): https://lists.fd.io/g/vpp-dev/message/18196
Mute This Topic: https://lists.fd.io/mt/78601259/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?

2020-11-30 Thread Elias Rudberg
Hi Ole, thanks for your answer.

> /w/workspace/vpp-verify-2009-ubuntu1804-x86_64/build-root/install-
> vpp-native/vpp/include/vpp-api/client/stat_client.h:107:11: error:
> pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
>((p + sizeof (p)) < ((void *) sm->shared_header + sm-
> >memory_size)))
> 
>  ~~^~
> 
> Doing pointer arithmetic on an incomplete type (void) isn't entirely
> kosher.
> GCC supports it, and you could disable the warning.
> But the correct-est approach would be to cast it to a type with size
> 1.

After adding some (char *) casts in stat_segment_adjust() it passed the
tests; please have a look: https://gerrit.fd.io/r/c/vpp/+/30161
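
For reference, the adjusted bounds check looks essentially like this
(a sketch of the change):

/* Pointer arithmetic on (char *) instead of (void *), so the
   arithmetic is in bytes on a complete type: */
((char *) p + sizeof (p)) < ((char *) sm->shared_header + sm->memory_size)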

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18203): https://lists.fd.io/g/vpp-dev/message/18203
Mute This Topic: https://lists.fd.io/mt/78601259/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?

2020-12-01 Thread Elias Rudberg
Hi Ole,

> Thanks Elias, merged.

Great. Thanks!

> Would you mind fixing that in master too?

OK: https://gerrit.fd.io/r/c/vpp/+/30207

With that, the stat_client.h file becomes identical in master and
stable/2009.

Some pedantic part of me noticed that the same issue seems to exist
also in stat_client.c in the stat_vec_dup macro and maybe other places,
but the compiler does not complain about that in either of the branches
so I did not change it. Perhaps the reason why there were compilation
problems for stable/2009 is that stat_client.h is included from some
C++ code in extras/vom/ and the rules and/or compiler options are
different for C++.

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18205): https://lists.fd.io/g/vpp-dev/message/18205
Mute This Topic: https://lists.fd.io/mt/78601259/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning

2020-12-02 Thread Elias Rudberg
Hello VPP experts,

For our NAT44 usage of VPP we have encountered a problem with VPP
running out of memory, which now, after much headache and many out-of-
memory crashes over the past several months, has turned out to be
caused by an infinite loop where VPP gets stuck repeating the three
nodes ip4-lookup, ip4-local and nat44-hairpinning. A single packet gets
passed around and around between those three nodes, eating more and
more memory, which causes that worker thread to get stuck and VPP to run
out of memory after a few seconds. (Earlier we speculated that it was
due to a memory leak but now it seems it was not.)

This concerns the current master branch as well as the stable/2009
branches and earlier VPP versions as well.

One scenario when this happens is when a UDP (or TCP) packet is sent
from a client on the inside with a destination IP address that matches
an existing static NAT mapping that maps that IP address on the inside
to the same IP address on the outside.
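
Concretely, the mapping is of this shape (address made up):

nat44 add static mapping local 203.0.113.5 external 203.0.113.5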

Then, the problem can be triggered for example by doing this from a
client on the inside, where DESTINATION_IP is the IP address of such a
static mapping:

echo hello > /dev/udp/$DESTINATION_IP/3

Here is the packet trace for the thread that receives the packet at
rdma-input:

--

Packet 42

00:03:07:636840: rdma-input
  rdma: Interface179 (4) next-node bond-input l2-ok l3-ok l4-ok ip4 udp
00:03:07:636841: bond-input
  src d4:6a:35:52:30:db, dst 02:fe:8d:23:60:a7, Interface179 ->
BondEthernet0
00:03:07:636843: ethernet-input
  IP4: d4:6a:35:52:30:db -> 02:fe:8d:23:60:a7 802.1q vlan 1013
00:03:07:636844: ip4-input
  UDP: SOURCE_IP_INSIDE -> DESTINATION_IP
tos 0x00, ttl 63, length 34, checksum 0xe7e3 dscp CS0 ecn NON_ECN
fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 48824 -> 3
length 14, checksum 0x781e
00:03:07:636846: ip4-sv-reassembly-feature
  [not-fragmented]
00:03:07:636847: nat44-in2out-worker-handoff
  NAT44_IN2OUT_WORKER_HANDOFF : next-worker 8 trace index 41

--

So it is doing handoff to thread 8 with trace index 41. Nothing wrong
so far, I think.

Here is the beginning of the corresponding packet trace for the
receiving thread:

--

Packet 57

00:03:07:636850: handoff_trace
  HANDED-OFF: from thread 7 trace index 41
00:03:07:636850: nat44-in2out
  NAT44_IN2OUT_FAST_PATH: sw_if_index 6, next index 3, session -1
00:03:07:636855: nat44-in2out-slowpath
  NAT44_IN2OUT_SLOW_PATH: sw_if_index 6, next index 0, session 11
00:03:07:636927: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 3
length 14, checksum 0xb40b
00:03:07:636930: ip4-local
UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
  tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
  fragment id 0x50fe, flags DONT_FRAGMENT
UDP: 63957 -> 3
  length 14, checksum 0xb40b
00:03:07:636932: nat44-hairpinning
  new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping
00:03:07:636934: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 3
length 14, checksum 0xb40b
00:03:07:636936: ip4-local
UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
  tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
  fragment id 0x50fe, flags DONT_FRAGMENT
UDP: 63957 -> 3
  length 14, checksum 0xb40b
00:03:07:636937: nat44-hairpinning
  new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping
00:03:07:636937: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 3
length 14, checksum 0xb40b
00:03:07:636939: ip4-local
UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
  tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
  fragment id 0x50fe, flags DONT_FRAGMENT
UDP: 63957 -> 3
  length 14, checksum 0xb40b
00:03:07:636940: nat44-hairpinning
  new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping

...

... and so on. In principle it never ends. To get this trace I had
added a hack in nat44-hairpinning to stop when my added debug counter
exceeded a few thousand. Without that, it seems to loop forever; that
worker thread gets stuck.

What happens seems to be that the nat44-hairpinning node determines
that there is an existing session and then decides the packet should go
to the ip4-lookup node, followed by the ip4-local, followed by the
nat44-hairpinning node which makes the same decision again, so it just
goes round and round like that. Inside the snat_hairpinning() function
it always comes to the "Destinat

Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning

2020-12-02 Thread Elias Rudberg
Hi Klement,

> > an existing static NAT mapping that maps that IP address on the
> > inside to the same IP address on the outside.

> what is the point of such static mapping? What is the use case here?

We are using VPP for endpoint-independent NAT44. Traffic from outside
is normally translated by dynamic NAT sessions, but we have special
treatment of traffic to a certain IP address that carries our BGP
(Border Gateway Protocol) traffic, which should not be translated, so
we have such a static mapping for that. Without this static mapping,
VPP tries to translate our BGP packets and then BGP does not work
properly.

It may be possible to do things differently so that no such mapping
would be needed, but we have been using one until now and things have
worked fine apart from this infinite loop issue, which happens when a
client on the inside happens to send something to our special BGP IP
address, an address intended to be used from the outside. That address
is normally not used by traffic from clients; the normal thing is for
the router to communicate with the VPP server using that address, from
outside. This is why the out-of-memory problem appeared random and hard
to reproduce earlier: it only happened when a client behaved in an
unusual way, which was rare, but when it did, we got the out-of-memory
crash, and now we finally know why. Now that we know, we can easily
reproduce it; it was never really random, it just seemed that way.

Anyway, even if it would be unusual and possibly a bad idea to have
such a static mapping, do you agree that VPP should handle the
situation differently?
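
For illustration, I would expect a guard along these lines somewhere in
snat_hairpinning() to break the loop (a sketch only with made-up
variable names, not a tested patch):

/* If the matched mapping does not actually change the destination,
   there is nothing to hairpin, so do not feed the packet back into
   ip4-lookup again. */
if (new_dst_addr.as_u32 == ip->dst_address.as_u32
    && new_dst_port == old_dst_port)
  return 0; /* no rewrite; avoid the lookup/local/hairpin cycle */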

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18233): https://lists.fd.io/g/vpp-dev/message/18233
Mute This Topic: https://lists.fd.io/mt/78662322/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning

2020-12-04 Thread Elias Rudberg
Hi Klement,

> Would you mind pushing it to gerrit?

Here: https://gerrit.fd.io/r/c/vpp/+/30284

>  It would be super cool if the change also contained a test case ;-)

Coolness is always my goal. Have a look, see if the patch qualifies.
:-)

/ Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18252): https://lists.fd.io/g/vpp-dev/message/18252
Mute This Topic: https://lists.fd.io/mt/78662322/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning

2020-12-08 Thread Elias Rudberg
Hi Klement,

> > Would you mind pushing it to gerrit?
> 
> Here: https://gerrit.fd.io/r/c/vpp/+/30284

I see you added "code review +1" there, thanks!

What more is needed to get it merged? Do we need to add another
reviewer?

/ Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18282): https://lists.fd.io/g/vpp-dev/message/18282
Mute This Topic: https://lists.fd.io/mt/78662322/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning

2020-12-10 Thread Elias Rudberg
Hi Ole,

Thanks for merging 30284. I did the same change in the stable/2009
branch also, here: https://gerrit.fd.io/r/c/vpp/+/30340

If that could get merged as well, it would be much appreciated.

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18303): https://lists.fd.io/g/vpp-dev/message/18303
Mute This Topic: https://lists.fd.io/mt/78662322/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2020-12-15 Thread Elias Rudberg
Hi Klement,

> > I see no reason why this shouldn’t be configurable.
> > [...]
> > Would you like to submit a patch?

Here is a patch making it configurable: 
https://gerrit.fd.io/r/c/vpp/+/30433

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18349): https://lists.fd.io/g/vpp-dev/message/18349
Mute This Topic: https://lists.fd.io/mt/78230881/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [EXTERNAL] [vpp-dev] Check running status of vpp

2020-12-16 Thread Elias Rudberg
Hello,

On Wed, 2020-12-16 at 14:23 +, Chris Luke via lists.fd.io wrote:
> [...] I wonder if the filesystem entry is a remnant from a previous
> session that was not cleaned up.

FWIW we have had such problems earlier; maybe the issue is similar. In
our case we were mixing use of two different VPP versions that used
different conventions for naming those .sock files under the /run/
directory. We got into trouble because the program that tried to
communicate with VPP probed the paths in some specific order, and if it
found an existing .sock file it would try to connect using that. When
there was an old .sock file it tried and failed to connect using it,
not realizing that there was a new .sock file (with a slightly
different name or path) that it could have connected to. We were able
to resolve that situation either by removing the old .sock file or by
rebooting the machine, which had the same effect of cleaning up old
stuff under /run/.

Best regards,
Elias





Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2020-12-21 Thread Elias Rudberg
Hi Klement,

> > > I see no reason why this shouldn’t be configurable.
> > > [...]
> > > Would you like to submit a patch?
> 
> Here is a patch making it configurable: 
> [...]

New patch, including API support and a test case: 
https://gerrit.fd.io/r/c/vpp/+/30482

Please check that one instead, I think it's better.

Best regards,
Elias





Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2021-01-26 Thread Elias Rudberg
Hi Klement,

> > I see no reason why this shouldn’t be configurable.
> > [...]
> > Would you like to submit a patch?

I had a patch in December that was lying around for too long and ran
into merge conflicts, so now I have made a new one. Third time's the
charm, I hope. Here it is:

https://gerrit.fd.io/r/c/vpp/+/30933

It makes the frame queue size configurable and also adds API support
and a test verifying the API support. Please have a look!
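Assuming the option keeps the name used in the patch (frame-queue-nelts,
under the nat stanza; this is my assumption until it is merged),
startup.conf usage would look something like this:

  nat {
    frame-queue-nelts 512
  }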

/ Elias





Re: [vpp-dev] How to add in/out interfaces in NAT44 in vpp21

2021-02-07 Thread Elias Rudberg
I think that with the latest VPP versions you need to use the "nat44
enable" command first, for example like this:

nat44 enable sessions 100 users 1000

where the numbers are your choices for the maximum number of sessions
and users per thread.

Best regards,
Elias


On Sun, 2021-02-07 at 09:01 +, Юрий Иванов wrote:
> Hi,
> I'm trying to configure NAT44 feature on latest vpp:
> vpp# show version
> vpp v21.01-release built by root on fcb1bae62b24 at 2021-01-
> 27T16:06:22
> 
> 
> Can someone help to determine why adding interfaces is not working as
> it should
> vpp# set interface nat44 in GigabitEthernet0/5/0 out
> GigabitEthernet0/4/0
> set interface nat44: add GigabitEthernet0/5/0 failed
> 
> My config:
> set interface ip address GigabitEthernet0/4/0 1.0.0.1/24
> set interface ip address GigabitEthernet0/5/0 10.0.1.1/24
> 
> set interface state GigabitEthernet0/4/0 up
> set interface state GigabitEthernet0/5/0 up
> 
> nat44 forwarding enable
> nat44 add address 1.0.0.2-1.0.0.100
> vpp# show interface
>   Name   IdxState  MTU (L3/IP4/IP6/MPLS)
> Counter  Count
> GigabitEthernet0/4/0  1  up  9000/0/0/0
> rx packets 9
>
> rx bytes3339
>
> drops  9
> GigabitEthernet0/5/0  2  up  9000/0/0/0
> local00 down  0/0/0/0
> 
> Maybe something has changed once more, because versions 17-18 were
> working as expected?
> 
> 




[vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code

2021-02-19 Thread Elias Rudberg
Hello VPP experts,

We have a problem with VPP 20.09 crashing with SIGABRT, this
happened several times lately but we do not have an exact way of
reproducing it. Here is a backtrace from gdb:

Thread 10 "vpp_wk_7" received signal SIGABRT, Aborted.
[Switching to Thread 0x7feac47f8700 (LWP 6263)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:51
#1  0x74044921 in __GI_abort () at abort.c:79
#2  0xc640 in os_panic () at src/vpp/vnet/main.c:368
#3  0x77719229 in alloc_aligned_16_8 (h=0x77b79990
, nbytes=) at
src/vppinfra/bihash_template.c:34
#4  0x7771b650 in value_alloc_16_8 (h=0x77b79990
, log2_pages=4) at
src/vppinfra/bihash_template.c:356
#5  0x7771b43a in split_and_rehash_16_8 (h=0x77b79990
, old_values=0x7ff87c7b0d40, old_log2_pages=3,
new_log2_pages=4) at src/vppinfra/bihash_template.c:453
#6  0x77710f84 in clib_bihash_add_del_inline_with_hash_16_8
(h=0x77b79990 , add_v=0x7ffbf2088c60,
hash=, is_add=, is_stale_cb=0x0, arg=0x0)
at src/vppinfra/bihash_template.c:765
#7  clib_bihash_add_del_inline_16_8 (h=0x77b79990
, add_v=0x7ffbf2088c60, is_add=,
is_stale_cb=0x0, arg=0x0) at src/vppinfra/bihash_template.c:857
#8  clib_bihash_add_del_16_8 (h=0x77b79990 ,
add_v=0x7ffbf2088c60, is_add=) at
src/vppinfra/bihash_template.c:864
#9  0x766795ec in ip4_sv_reass_find_or_create (vm=, rm=, rt=, kv=,
do_handoff=) at src/vnet/ip/reass/ip4_sv_reass.c:364
#10 ip4_sv_reass_inline (vm=, node=,
frame=, is_feature=255, is_output_feature=false,
is_custom=false) at src/vnet/ip/reass/ip4_sv_reass.c:726
#11 ip4_sv_reass_node_feature_fn_skx (vm=,
node=, frame=) at
src/vnet/ip/reass/ip4_sv_reass.c:919
#12 0x75ac806e in dispatch_node (vm=0x7ffbf1e74400,
node=0x7ffbf2553fc0, type=VLIB_NODE_TYPE_INTERNAL,
dispatch_state=VLIB_NODE_STATE_POLLING, frame=,
last_time_stamp=) at src/vlib/main.c:1194
#13 dispatch_pending_node (vm=0x7ffbf1e74400,
pending_frame_index=, last_time_stamp=)
at src/vlib/main.c:1353
#14 vlib_main_or_worker_loop (vm=0x7ffbf1e74400, is_main=0) at
src/vlib/main.c:1846
#15 vlib_worker_loop (vm=0x7ffbf1e74400) at src/vlib/main.c:1980

The line at bihash_template.c:34 is "os_out_of_memory ()".

If VPP calls "os_out_of_memory()" at that point in the code, what
does that mean, is there some way we could configure VPP to allow
it to use more memory for this kind of allocations?

We have plenty of physical memory available and the main
heap ("heapsize" in startup.conf) has already been set to a large
value but maybe this part of the code is using some other kind of
memory allocation, not using the main heap? How do we know if
this particular allocation is using the main heap or not?
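For reference, if the "show memory main-heap" CLI exists in this
version (an assumption on my part), the main heap usage can be
inspected like this:

  vpp# show memory main-heap

If the reported usage were far below the configured heapsize at the
time of the crash, that would suggest the failing allocation does not
come from the main heap.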

Best regards,
Elias




Re: [vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code

2021-02-19 Thread Elias Rudberg
Thanks Dave, however it looks like BIHASH_USE_HEAP does not exist in
VPP 20.09 but was introduced later. Looks like it appeared with the
commit 2454de2d4 "vppinfra: use heap to store bihash data" which was
after 20.09 was released.

I guess this means that bihash data is not stored on the heap in VPP
20.09. Maybe switching to VPP 21.01 would help with this issue then, or
at least with 21.01 all of our main heap space would need to be
consumed before we get another os_out_of_memory() SIGABRT crash?

/ Elias


On Fri, 2021-02-19 at 09:56 -0500, v...@barachs.net wrote:
> See ../src/vppinfra/bihash_16_8.h:
> 
> #define BIHASH_USE_HEAP 1
> 
> The the sv reassembly bihash table configuration appears to be
> hardwired, and complex enough to satisfy the cash customers. If the
> number of buckets is way too low for your use-case, bihash is capable
> of wasting a considerable amount of memory.
> 
> Suggest that you ping Klement Sekera, it's his code...
> 
> D.
> 
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Elias
> Rudberg
> Sent: Friday, February 19, 2021 7:41 AM
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] VPP 20.09 os_out_of_memory() in
> clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code
> 
> Hello VPP experts,
> 
> We have a problem with VPP 20.09 crashing with SIGABRT, this happened
> several times lately but we do not have an exact way of reproducing
> it. Here is a backtrace from gdb:
> 
> Thread 10 "vpp_wk_7" received signal SIGABRT, Aborted.
> [Switching to Thread 0x7feac47f8700 (LWP 6263)] __GI_raise (
> sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #0  __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x74044921 in __GI_abort () at abort.c:79
> #2  0xc640 in os_panic () at src/vpp/vnet/main.c:368
> #3  0x77719229 in alloc_aligned_16_8 (h=0x77b79990
> , nbytes=) at
> src/vppinfra/bihash_template.c:34
> #4  0x7771b650 in value_alloc_16_8 (h=0x77b79990
> , log2_pages=4) at
> src/vppinfra/bihash_template.c:356
> #5  0x7771b43a in split_and_rehash_16_8 (h=0x77b79990
> , old_values=0x7ff87c7b0d40, old_log2_pages=3,
> new_log2_pages=4) at src/vppinfra/bihash_template.c:453
> #6  0x77710f84 in clib_bihash_add_del_inline_with_hash_16_8
> (h=0x77b79990 , add_v=0x7ffbf2088c60,
> hash=, is_add=, is_stale_cb=0x0,
> arg=0x0) at src/vppinfra/bihash_template.c:765
> #7  clib_bihash_add_del_inline_16_8 (h=0x77b79990
> , add_v=0x7ffbf2088c60, is_add=,
> is_stale_cb=0x0, arg=0x0) at src/vppinfra/bihash_template.c:857
> #8  clib_bihash_add_del_16_8 (h=0x77b79990
> , add_v=0x7ffbf2088c60, is_add=)
> at
> src/vppinfra/bihash_template.c:864
> #9  0x766795ec in ip4_sv_reass_find_or_create (vm= out>, rm=, rt=, kv=,
> do_handoff=) at src/vnet/ip/reass/ip4_sv_reass.c:364
> #10 ip4_sv_reass_inline (vm=, node=,
> frame=, is_feature=255, is_output_feature=false,
> is_custom=false) at src/vnet/ip/reass/ip4_sv_reass.c:726
> #11 ip4_sv_reass_node_feature_fn_skx (vm=,
> node=, frame=) at
> src/vnet/ip/reass/ip4_sv_reass.c:919
> #12 0x75ac806e in dispatch_node (vm=0x7ffbf1e74400,
> node=0x7ffbf2553fc0, type=VLIB_NODE_TYPE_INTERNAL,
> dispatch_state=VLIB_NODE_STATE_POLLING, frame=,
> last_time_stamp=) at src/vlib/main.c:1194
> #13 dispatch_pending_node (vm=0x7ffbf1e74400,
> pending_frame_index=, last_time_stamp=)
> at src/vlib/main.c:1353
> #14 vlib_main_or_worker_loop (vm=0x7ffbf1e74400, is_main=0) at
> src/vlib/main.c:1846
> #15 vlib_worker_loop (vm=0x7ffbf1e74400) at src/vlib/main.c:1980
> 
> The line at bihash_template.c:34 is "os_out_of_memory ()".
> 
> If VPP calls "os_out_of_memory()" at that point in the code, what
> does that mean, is there some way we could configure VPP to allow it
> to use more memory for this kind of allocations?
> 
> We have plenty of physical memory available and the main heap
> ("heapsize" in startup.conf) has already been set to a large value
> but maybe this part of the code is using some other kind of memory
> allocation, not using the main heap? How do we know if this
> particular allocation is using the main heap or not?
> 
> Best regards,
> Elias
> 




Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

2021-02-25 Thread Elias Rudberg
Hi Marcos,

If you are building VPP 20.05 from source then the easiest way is to
simply change the value at "#define NAT_FQ_NELTS 64"
in src/plugins/nat/nat.h from 64 to something larger; we have been
using 512, which seems to work fine in our case.
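In other words, it is a one-line change (shown against 20.05; the exact
location may differ in other versions):

  /* src/plugins/nat/nat.h */
  #define NAT_FQ_NELTS 512 /* default 64; a larger queue reduces
                              congestion drops at worker handoff */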

Note that this can help with one specific kind of packet drops in VPP
NAT called "congestion drops", if you have packet loss for other
reasons then a NAT_FQ_NELTS change will probably not help.

Best regards,
Elias


On Wed, 2021-02-24 at 13:45 -0300, Marcos - Mgiga wrote:
> Hi Elias, 
> 
> I have been following this discussion and finally I gave VPP a try
> implementing it as a CGN gateway. Unfortunattely some issues came up,
> like packets loss and I believe your patch can be helpful,
> 
> Would mind give me guidance to deploy it? I'm using VPP 20.05 as you
> did
> 
> Best Regards
> 
> -Mensagem original-
> De: vpp-dev@lists.fd.io  Em nome de Elias
> Rudberg
> Enviada em: terça-feira, 26 de janeiro de 2021 11:10
> Para: ksek...@cisco.com
> Cc: vpp-dev@lists.fd.io
> Assunto: Re: [vpp-dev] Increasing NAT worker handoff frame queue size
> NAT_FQ_NELTS to avoid congestion drops?
> 
> Hi Klement,
> 
> > > I see no reason why this shouldn’t be configurable.
> > > [...]
> > > Would you like to submit a patch?
> 
> I had a patch in December that was lying around too long so there
> were merge conflicts, so now I made a new one again. Third time's the
> charm, I hope. Here it is:
> 
> https://gerrit.fd.io/r/c/vpp/+/30933
> 
> It makes the frame queue size configurable and also adds API support
> and a test verifying the API support. Please have a look!
> 
> / Elias





[vpp-dev] Suggestion: clarify that github repo is only a mirror, add link to real repo?

2021-03-16 Thread Elias Rudberg
Hello,

Searching for "VPP source code" using my favourive web search engine
gives the github page https://github.com/FDio/vpp as top search result.
However that is not the real VPP repo, the github page is only a
mirror.

I think it would be good to clarify this in the "About" part for the
github project, to avoid confusion.

For comparison, look at how it is done for the Linux kernel source code
mirror here:

https://github.com/gregkh/linux

To the top right there it says "Linux kernel stable tree mirror" with a
link to the real repository which in that case is under git.kernel.org.

Could that be done in the same way for VPP also?

Best regards,
Elias





Re: [vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code

2021-03-25 Thread Elias Rudberg
Hello Dave,

Just to follow up on this, we switched from 20.09 to 21.01 and that
indeed seems to have solved the problem for us, having now run for
about a month without the issue coming back.

Thanks for your help!

Best regards,
Elias


On Sun, 2021-02-21 at 07:43 -0500, v...@barachs.net wrote:
> That's right. In 20.09, bihash did its own os-level memory
> allocation. You could (probably) pick up and port
> src/vppinfra/bihash*.[ch] to 20.09, or you could add some config
> knobs to the reassembly code. 
> 
> If switching to 21.01 is an option, that seems like the path of least
> resistance.
> 
> HTH... Dave
> 
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Elias
> Rudberg
> Sent: Friday, February 19, 2021 12:10 PM
> To: v...@barachs.net; vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] VPP 20.09 os_out_of_memory() in
> clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code
> 
> Thanks Dave, however it looks like BIHASH_USE_HEAP does not exist in
> VPP 20.09 but was introduced later. Looks like it appeared with the
> commit 2454de2d4 "vppinfra: use heap to store bihash data" which was
> after 20.09 was released.
> 
> I guess this means that bihash data is not stored on the heap in VPP
> 20.09. Maybe switching to VPP 21.01 would help with this issue then,
> or at least with 21.01 all of our main heap space would need to be
> consumed before we get another os_out_of_memory() SIGABRT crash?
> 
> / Elias





[vpp-dev] Thread safety issue in NAT plugin regarding counter for busy ports

2021-04-01 Thread Elias Rudberg
Hello VPP experts,

I think there is a thread safety issue in the NAT plugin regarding the
counter for busy ports.

I am looking at the master branch here; there has been some
refactoring lately, but the issue has been there for a long time, at
least several VPP versions back, although filenames and function names
have changed.

Here I will use the endpoint-independent code in nat44-ei/nat44_ei.c
because that is the part I am using, but it looks like a similar issue
is there for nat44-ed as well.

In the nat44_ei_alloc_default_cb() function in nat44_ei.c there is a
part that looks like this:

  --a->busy_##n##_port_refcounts[portnum];  \
  a->busy_##n##_ports_per_thread[thread_index]++;   \
  a->busy_##n##_ports++;\

where the variable "a" is an address (nat44_ei_address_t) that belongs
to the "addresses" in the global nat44_ei_main, so not thread-specific. 
As I understand it, different threads may be using the same "a" at the
same time.

At first sight it might seem like all those three lines are risky
because different threads can execute this code at the same time for
the same "a". However, the _port_refcounts[portnum] and
_ports_per_thread[thread_index] parts are actually okay to access,
because the [portnum] and [thread_index] indexing ensures that those
lines only touch the parts of those arrays that belong to the current
thread (that is how the port number is selected).

So the first two lines there are fine, I think, but the third line,
incrementing a->busy_##n##_ports, can give a race condition when
different threads execute it at the same time. The same issue is also
there in other places where the busy_##n##_ports values are updated.

I think this is not critical because the busy_##n##_ports information
(that can be wrong because of this thread safety issue) is not used
very much. However those values are used in nat44_ei_del_address()
where it looks like this:

  /* Delete sessions using address */
  if (a->busy_tcp_ports || a->busy_udp_ports || a->busy_icmp_ports)
{

and then inside that if-statement there is some code to delete those
sessions. If the busy_##n##_ports values are wrong it could in
principle happen that the session deletion is skipped when there were
actually some sessions that needed deleting. Perhaps rare and perhaps
resulting in nothing worse than a small memory leak, but anyway.

One effect of this is that there can be an inconsistency: if we were to
sum up the busy_##n##_ports_per_thread values for all threads, the sum
should equal busy_##n##_ports, but due to this issue there could be a
difference, because while the busy_##n##_ports_per_thread values are
correct, the busy_##n##_ports values may have been corrupted by the
race condition mentioned above.
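To make the race concrete: the problematic line is a plain
read-modify-write. A thread-safe variant would presumably replace it
with an atomic add, along these lines (just a sketch, I have not tried
it; clib_atomic_fetch_add from vppinfra/atomics.h is my guess at the
right helper):

  --a->busy_##n##_port_refcounts[portnum];                   \
  a->busy_##n##_ports_per_thread[thread_index]++;            \
  /* atomic increment so concurrent workers cannot lose updates */ \
  clib_atomic_fetch_add (&a->busy_##n##_ports, 1);           \

The decrement paths would need the same treatment (for example with
clib_atomic_fetch_sub, if that helper exists).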

I am not sure if the above is a problem in practice; my main
motivation for reporting this is that it confuses me when I am trying
to understand how the code works in order to do some modifications.
Either the code is not thread safe there, or I have misunderstood
things.

What do you think, is it an issue?
If not, what have I missed?

(This is not an April fools' joke, I really am this pedantic)

Best regards,
Elias




Re: [vpp-dev] Thread safety issue in NAT plugin regarding counter for busy ports

2021-04-05 Thread Elias Rudberg
Hi Klement,

> it’s spot on. I think all of it. Would you like to push an atomic-
> increment patch or should I?

Better if you do it; I don't really know how such atomic-increment
operations work, it's something new to me. If you have a way of fixing
it like that, I would be interested to see how you did it. Do you think
it could be done without too much performance cost, and still be
portable enough?

Best regards,
Elias


> Thanks for spotting this!!!
> Klement
> 





Re: [vpp-dev] Progressive_VPP_Tutorial show ip: unknown input arp

2021-05-02 Thread Elias Rudberg
Hi Farzad,

I noticed also that "show ip arp" does not seem to be available anymore
in recent VPP versions. Maybe the tutorial needs updating.

You can try using "show ip neighbors" instead, I think it shows about
the same info that "show ip arp" used to give.

Best regards,
Elias


On Sun, 2021-05-02 at 10:41 +0430, Farzad Sadeghi wrote:
> I'm very new to vpp so I decided to do the progressive vpp tutorial.
> At some point you are supposed to run "show ip arp". I get this in
> response:
> show ip: unknown input `arp'
> 
> I pulled the vagrant box mentioned in the tutorial so I'm on Ubuntu
> Xenial.
> Here's the output of "show version":
> vpp v20.01-release built by root on 4d189446a03d at 2020-01-
> 29T22:12:33
> 
> Do I need to enable a certain plugin for "show ip arp" to be
> available?
> 





[vpp-dev] How to use valgrind to check for memory errors in vpp?

2019-09-09 Thread Elias Rudberg
Hello,

I would like to use valgrind to check for memory errors in vpp.

I understand that running something through valgrind makes it very very slow so 
that it is not an option for real production usage of vpp. However, valgrind is 
still very useful for finding errors even if it's only for very limited test 
runs, so I would very much like to make that work.

I know that vpp has some built-in checking for memory leaks, but the reason I 
want to use valgrind is not primarily to check for memory leaks but to check 
for other kinds of memory-access-related errors, like the "invalid read" and 
"invalid write" errors that valgrind can detect.

So far, what I have done is to build vpp (debug configuration) according to the 
instructions here: 
https://fdio-vpp.readthedocs.io/en/latest/gettingstarted/developers/building.html
Then I stopped the vpp service since I want to run vpp from the command-line 
through valgrind, and finally I run it like this:

sudo valgrind vpp -c /etc/vpp/startup.conf

That gave warnings about "client switching stacks?" and suggested adding 
--max-stackframe=137286291952 so I did that:

sudo valgrind --max-stackframe=137286291936 vpp -c /etc/vpp/startup.conf

Then valgrind gives a warning "Warning: set address range perms: large range" 
followed by some error reports of the type "Conditional jump or move depends on 
uninitialised value(s)" inside the mspace_malloc routine in dlmalloc.c.

I think these issues are probably related to the fact that vpp uses its own 
malloc implementation (in dlmalloc.c) instead of the default malloc, possibly 
combined with the fact that vpp uses a very large amount of (virtual) memory.

Questions:

- Are there ways to configure vpp to allow it to work together with valgrind?

- Are there ways to make vpp use less memory? (currently "top" shows 0.205t 
VIRT memory usage for the vpp_main process)

- Is it possible to somehow configure vpp to use standard malloc instead of the 
dlmalloc.c implementation, perhaps sacrificing performance but making things 
work better with valgrind?

Best regards,
Elias


Re: [vpp-dev] How to use valgrind to check for memory errors in vpp?

2019-09-09 Thread Elias Rudberg
Thanks Dave and Ben for your kind replies.

Ben, the Address Sanitizer integration sounds very interesting, if you
could share your WIP patches that would be great!

Best regards,
Elias


On Mon, 2019-09-09 at 12:57 +, Benoit Ganne (bganne) via
Lists.Fd.Io wrote:
> Hi Elias,
> 
> As mentioned by Dave, running Valgrind on VPP is challenging because
> of speed and custom allocators.
> That being said, I am (slowly) working on integrating Address
> Sanitizer into VPP. I have some cleanup to do but I can share my WIP
> patches if interested.
> 
> Best
> ben
> 
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Dave
> > Barach
> > via Lists.Fd.Io
> > Sent: lundi 9 septembre 2019 14:20
> > To: Elias Rudberg ; vpp-dev@lists.fd.io
> > Cc: vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] How to use valgrind to check for memory
> > errors in
> > vpp?
> > 
> > Dlmalloc [aka "Doug Lea Malloc"] is a lightly modified copy of the
> > allocator described here: 
> > http://gee.cs.oswego.edu/dl/html/malloc.html. If
> > you've managed to find an issue in it, please share the details.
> > Until
> > proven otherwise, I suspect the report rather than dlmalloc itself.
> > 
> > Vpp does indeed manage its own thread stacks. The so-called vpp
> > process
> > model [in truth: cooperative multi-tasking threads] uses
> > setjmp/longjmp to
> > switch stacks. The scheme is fundamental, and won't be changed to
> > accomodate valgrind.
> > 
> > Dlmalloc does not support valgrind. It's a waste of a huge number
> > of
> > cycles to run valgrind unless the memory allocator supports it. My
> > experience making vpp's previous memory allocator support valgrind
> > might
> > be worth sharing: it never worked very well. After > 15 years
> > working on
> > the code base, I've not felt the need to go back and make it work
> > in
> > detail.
> > 
> > Vpp uses multiple, independent heaps - some in shared memory - so
> > switching to vanilla malloc() seems like a non-starter.
> > 
> > Vpp's virtual space is larger than one might like - note the
> > difference
> > with none of the plugins loaded - but in terms of real memory
> > consumption
> > we often see RSS sizes in the 20-30mb range. A decent fraction of
> > the
> > virtual space is used to avoid expensive computations in device
> > drivers:
> > to facilitate virtual <--> physical address translation.
> > 
> > Any issues accidentally introduced into the memory allocator would
> > be a
> > severe nuisance. Folks would be well-advised not to tinker with it.
> > 
> > HTH... Dave
> > 
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of Elias
> > Rudberg
> > Sent: Monday, September 9, 2019 4:43 AM
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev] How to use valgrind to check for memory errors
> > in vpp?
> > 
> > Hello,
> > 
> > I would like to use valgrind to check for memory errors in vpp.
> > 
> > I understand that running something through valgrind makes it very
> > very
> > slow so that it is not an option for real production usage of vpp.
> > However, valgrind is still very useful for finding errors even if
> > it's
> > only for very limited test runs, so I would very much like to make
> > that
> > work.
> > 
> > I know that vpp has some built-in checking for memory leaks, but
> > the
> > reason I want to use valgrind is not primarily to check for memory
> > leaks
> > but to check for other kinds of memory-access-related errors, like
> > the
> > "invalid read" and "invalid write" errors that valgrind can detect.
> > 
> > So far, what I have done is to build vpp (debug configuration)
> > according
> > to the instructions here: https://fdio-
> > vpp.readthedocs.io/en/latest/gettingstarted/developers/building.htm
> > l
> > Then I stopped the vpp service since I want to run vpp from the
> > command-
> > line through valgrind, and finally I run it like this:
> > 
> > sudo valgrind vpp -c /etc/vpp/startup.conf
> > 
> > That gave warnings about "client switching stacks?" and suggested
> > adding -
> > -max-stackframe=137286291952 so I did that:
> > 
> > sudo valgrind --max-stackframe=137286291936 vpp -c
> > /etc/vpp/startup.conf
> > 
> > Then valgrind gives a warning "Warning: set address range perms:
> > large range

[vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer

2019-09-11 Thread Elias Rudberg
Hello,

Thanks to the patches shared by Benoit Ganne on Monday, I was today
able to use AddressSanitizer for vpp. AddressSanitizer detected a
problem that I think is caused by a bug in plugins/dpdk/device/init.c
related to how the conf->eal_init_args vector is manipulated in the
dpdk_config function.

It appears that the code there uses two different kinds of strings,
both C-style null-terminated strings (char*) and vectors of type (u8*)
which are not necessarily null-terminated but instead have their length
stored in a different way (as described in vppinfra/vec.h).

In the dpdk_config function, various strings are added to the conf-
>eal_init_args vector. Those strings need to be null-terminated because
they are later used as input to the "format" function which expects
null-terminated strings for its later arguments. The strings are mostly
null-terminated but not all of them, which leads to the error detected
by AddressSanitizer.

I think what happens is that some string that was generated by the
"format" function and is thus not null-terminated is later given as
input to a function that needs null-terminated strings as input,
leading to illegal memory access.

I'm able to make AddressSanitizer happy by making the following two
changes:

(1) Null-terminate the tmp string for conf->nchannels in the same way
as it is done in other places in the code:

-  tmp = format (0, "%d", conf->nchannels);
+  tmp = format (0, "%d%c", conf->nchannels, 0);

(2) Null-terminate conf->eal_init_args_str before the call to
dpdk_log_warn:

+  vec_add1(conf->eal_init_args_str, 0);

After that, vpp starts without complaints from AddressSanitizer.

Should this be reported as a new bug in the Jira system for VPP (
https://jira.fd.io/browse/VPP)?

Should I push a fix myself (not sure if I have permission to do that)
or could someone more familiar with that part of the code do it?

Best regards,
Elias



Re: [vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer

2019-09-12 Thread Elias Rudberg
OK, now I created a Jira issue about it:
https://jira.fd.io/browse/VPP-1772

I would like to commit and push a fix also, but I'm not sure how to do
that properly. Looking at "git log" it looks like you are using some
special form of commit messages with special "Signed-off-by" and
"Change-Id" parts, I don't know what those mean. Are you using some
tool to generate those commit messages, rather than just doing "git
commit" at the command-line?

Best regards,
Elias


On Wed, 2019-09-11 at 15:03 -0400, Dave Wallace wrote:
> Elias,
> 
> Please open a Jira Ticket and push a patch with this fix.
> 
> BTW, there is a macro [0] that safely adds c-string termination to a
> vector which I would recommend using for your fix (2).
> 
> Thanks,
> -daw-
> [0] 
> https://docs.fd.io/vpp/19.08/db/d65/vec_8h.html#a2bc43313bc727b5453c3e5d7cc57a464
> 
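For reference, using that macro my fix (2) would presumably become the
following one-liner (assuming I have found the right macro at that
link and it behaves as its name suggests):

  vec_terminate_c_string (conf->eal_init_args_str);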


Re: [vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer

2019-09-13 Thread Elias Rudberg
Thanks!

What about the Jira ticket here https://jira.fd.io/browse/VPP-1772 --
now I set "Resolution: Done" there, should the "Fix Version/s" field be
changed also?

/ Elias

On Thu, 2019-09-12 at 12:00 -0400, Dave Wallace wrote:
> Elias,
> 
> Thanks for the patch -- I just merged it.
> 
> Welcome to the VPP community :)
> 
> Thanks,
> -daw-



[vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?

2019-10-03 Thread Elias Rudberg
While preparing to switch from VPP 19.01 to 19.08 we encountered a
problem with NAT performance. We are trying to use the same settings
(as far as possible) for 19.08 as we did for 19.01, on the same
computer.

In 19.01 we used 11 worker threads in total, combined with "set nat
workers 0-6" so that 7 of the worker threads were handling NAT work.
That worked fine in 19.01, but now that we try the same with 19.08 the
performance gets really bad. The problem seems related to the choice of
NAT treads.

Examples to illustrate the issue:

"set nat workers 0-1" --> works fine for both 19.01 and 19.08.

"set nat workers 2-3" --> works fine for 19.01, but gives bad
performance for 19.08.

It seems as if, for version 19.08, only threads 0 and 1 can do NAT work
with decent performance; as soon as any other threads are specified,
performance gets bad.
In contrast, for version 19.01, seemingly any of the threads can be
used for NAT without performance problems.

"Bad" performance here means that things work something like 10x
slower, e.g. VPP starts to drop packets already at only 10% of the
amount of traffic that it could handle otherwise. So it is really a big
difference.

Using gdb I was able to verify that the NAT functions are really
executed by those worker threads that were chosen using "set nat
workers", and as long as there is not too much traffic vpp still
processes the packets correctly, it is just that it gets really slow
when using other NAT threads than 0 and 1.

My best guess is that the problem has something to do with how threads
are bound (or not) to certain CPU cores and/or NUMA memory banks. But
we have not changed any configuration options related to such things.
Maybe if there has been a change in default behavior between 19.01 and
19.08 then that could explain it.

The behavior for the current master branch seems to be the same as for
19.08.

Questions:

Are there some new configuration options that we need to use to make
19.08 work with good performance using more than 2 NAT threads?

Has the default behavior regarding binding of threads to CPU cores
changed between VPP versions 19.01 and 19.08?

Other ideas of what could be causing this and/or how to troubleshoot
further?

(In case that matters, we are using Mellanox hardware interfaces that
required "make dpdk-install-dev DPDK_MLX5_PMD=y
DPDK_MLX5_PMD_DLOPEN_DEPS=n" when building for vpp 19.01, while for
19.08 the interfaces are setup using "create int rdma host-if ...".)

Best regards,
Elias


Re: [vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?

2019-10-03 Thread Elias Rudberg
More info after investigating further: the issue seems related to the
fact that the RDMA plugin is available in 19.08, which did not exist in
19.01. As a result, we no longer need the "make dpdk-install-dev
DPDK_MLX5_PMD=y DPDK_MLX5_PMD_DLOPEN_DEPS=n" complication when
building. The release notes for VPP 19.04 say "RDMA (ibverb) driver
plugin - MLX5 with multiqueue".

For 19.01 we had configured "num-rx-queues" for each of the two
interfaces used, in the dpdk dev part of the startup.conf file. After
testing different choices for that it turns out that if we set "num-rx-
queues 1" for each interface, then 19.01 gets the same performance
problem that we see for 19.08 (i.e. only threads 0 and 1 can be used
efficiently for NAT). So it appears that the reason why our 19.01
installation can use more NAT threads is that we have set larger "num-
rx-queues" values. For 19.08 however, the "num-rx-queues" values seem
to be ignored, presumably because the RDMA plugin is used.
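For reference, the 19.01-style configuration looked roughly like this
(the PCI addresses and queue count here are placeholders, not our
exact values):

  dpdk {
    dev 0000:5e:00.0 {
      num-rx-queues 8
    }
    dev 0000:5e:00.1 {
      num-rx-queues 8
    }
  }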

Is it correct that the dpdk dev num-rx-queues option is ignored when
the RDMA plugin is used?

How can we add more queues or polling threads to RDMA interfaces so
that we can use more NAT workers?

Best regards,
Elias


On Thu, 2019-10-03 at 07:28 +, Elias Rudberg wrote:
> As we are about to switch from VPP 19.01 to 19.08 we encountered a
> problem with NAT performance. We try to use the same settings (as far
> as possible) for 19.08 as we did for 19.01, on the same computer.
> 
> [...]



Re: [vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?

2019-10-03 Thread Elias Rudberg
Dear Chris and Ben,

This solved the issue for us. Many thanks for your help!

Best regards,
Elias


On Thu, 2019-10-03 at 11:55 +, Benoit Ganne (bganne) via
Lists.Fd.Io wrote:
> Chris is correct, rdma driver is independent from DPDK driver and as
> such is not aware of any DPDK config option.
> Here is an example to create 8 rx queues:
> ~# vppctl create int rdma host-if enp94s0f0 name rdma-0 num-rx-queues 
> 8
> 
> Best
> Ben
> 
> > -Original Message-
> > From: vpp-dev@lists.fd.io  On Behalf Of
> > Christian
> > Hopps
> > Sent: jeudi 3 octobre 2019 13:29
> > To: Elias Rudberg 
> > Cc: Christian Hopps ; vpp-dev@lists.fd.io
> > Subject: Re: [vpp-dev] Poor NAT performance with 19.08 compared to
> > 19.01,
> > problem related to thread placement?
> > 
> > "create interface rdma" CLI has an num-rx-queues config
> > 
> > VLIB_CLI_COMMAND (rdma_create_command, static) = {
> >   .path = "create interface rdma",
> >   .short_help = "create interface rdma  [name
> > ]"
> > " [rx-queue-size ] [tx-queue-size ]"
> > " [num-rx-queues ]",
> >   .function = rdma_create_command_fn,
> > };
> > 
> > is that were you are setting it? DPDK config will not apply when
> > you are
> > using the native driver.
> > 
> > Thanks,
> > Chris.
> > 



[vpp-dev] per-worker stat vector length fix needed in 19.08 also?

2019-10-03 Thread Elias Rudberg
I was just chasing a strange error that turned out to be related to
some code in src/vpp/stats/stat_segment.c where something went wrong
regarding statistics vectors for different threads (some kind of memory
corruption that ended up causing an infinite loop inside dlmalloc.c).

Then I saw the following commit by Ben in the master branch:
-
commit dba00cad1a2e41b4974911793cc76eab81a6e30e
Author: Benoît Ganne 
Date:   Mon Sep 30 12:39:55 2019 +0200

stats: fix per-worker stat vector length

Type: fix
-

The above commit in the master branch fixes the problem I was
struggling with for 19.08.

Can that commit be applied (cherry-picked?) also for the 19.08 branch?

Best regards,
Elias



[vpp-dev] prevent loopback of broadcast packets rdma fix needed in 19.08 also?

2019-10-08 Thread Elias Rudberg
We just had problems making LACP bonding work with RDMA and Mellanox
cards, using VPP 19.08. It turned out to be caused by a problem with
unintended loopback of some packets, something that is fixed by the
following commit by Ben in the master branch:

---
commit df213385d391f21d99eaeaf066f0130a20f7ccde
Author: Benoît Ganne 
Date:   Fri Oct 4 15:28:12 2019 +0200

rdma: prevent loopback of broadcast packets

TX queues must be created before RX queues on Mellanox cards in
order to
not receive our own broadcast packets.

Type: fix

Change-Id: I32ae25a47d819f715feda621a5ecddcf4efd71ba
Signed-off-by: Benoît Ganne 
---

Can that fix be applied also for the 19.08 branch?

Best regards,
Elias


[vpp-dev] Access to gerrit.fd.io port 29418 works for IPv4 but not for IPv6?

2019-10-30 Thread Elias Rudberg
Hello,

According to the instructions here 
https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pulling_code_via_ssh
 
pulling the code should be done like this:

git clone ssh://usern...@gerrit.fd.io:29418/vpp.git

However, from my computer that does not work (it hangs). First I
thought this was due to port 29418 being blocked for me locally but it
turns out that was not the issue.

Doing "host gerrit.fd.io" shows that it has both an IPv4 and an IPv6
address:
IPv4: 52.10.107.188
IPv6: 2600:1f14:9b3:3400:ee75:f90f:2247:905d

If I use the IPv4 address instead of the hostname, like this, then it
works:

git clone ssh://USERNAME@52.10.107.188:29418/vpp.git

Trying from another computer that only uses IPv4, it works as it should
using the hostname.

So, it seems like the ssh access to gerrit.fd.io:29418 works for IPv4
but not for IPv6. That would explain why I can get it to work by typing
the IPv4 address instead of the hostname, I guess that forces IPv4 to
be used.

As another way of verifying this, I tested disabling IPv6 completely
on my computer. Then things work, consistent with the hypothesis that
the problem is related to the IPv6 configuration of the gerrit.fd.io
server. (If I'm right, anyone trying to access gerrit.fd.io:29418 over
IPv6 should see this problem.)

For now, using the IPv4 address works as a workaround, but I guess this
is something that should be fixed in how the server is configured?
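Another client-side workaround that avoids hard-coding the IP address
is to force IPv4 for this host in ~/.ssh/config (standard OpenSSH
options, nothing VPP-specific):

  # ~/.ssh/config
  Host gerrit.fd.io
      AddressFamily inet   # use IPv4 only for this host

Git then picks this up automatically when cloning over ssh.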

Best regards,
Elias



[vpp-dev] RDMA fix needed in 19.08 also

2019-10-31 Thread Elias Rudberg
It seems like the rdma plugin is currently not working in the
stable/1908 branch. It stopped working after commit b4c5f16889.

In the master branch, the rdma plugin stopped working in commit
534de8b2a7 but started working again after the fix in commit 386ebb6e2b
with commit message "rdma: build: fix ibverb compilation test".

To make rdma work again in the stable/1908 branch, I think the fix
386ebb6e2b "rdma: build: fix ibverb compilation test" would be needed
in that branch also. The change is quite small, only a few lines in the
file src/plugins/rdma/CMakeLists.txt. Can that change be applied in the
stable/1908 branch?

Best regards,
Elias



Re: [vpp-dev] RDMA fix needed in 19.08 also

2019-11-01 Thread Elias Rudberg
Yes, now it works. Thank you!
/ Elias


On Fri, 2019-11-01 at 08:42 +0100, Andrew 👽 Yourtchenko wrote:
> It’s merged. Please let me know if all ok now.
> 
> --a
> 
> > On 31 Oct 2019, at 23:55, Andrew Yourtchenko via Lists.Fd.Io <
> > ayourtch=gmail@lists.fd.io> wrote:
> > 
> > Elias,
> > 
> > Thanks for telling! I have cherry-picked  
> > https://gerrit.fd.io/r/#/c/vpp/+/23164/ and will merge it tomorrow.
> > 
> > --a
> > 
> > > On 31 Oct 2019, at 19:18, Elias Rudberg <
> > > elias.rudb...@bahnhof.net> wrote:
> > > 
> > > 386ebb6e2b
> > 


[vpp-dev] NAT worker HANDOFF but no HANDED-OFF -- no worker picks up the handed-off work

2019-11-15 Thread Elias Rudberg
We are using VPP 19.08 for NAT (nat44) and are struggling with the
following problem: it first works seemingly fine for a while, like
several days or weeks, but then suddenly VPP stops forwarding traffic.
Even ping to the "outside" IP address fails.

The VPP process is still running so we try to investigate further using
vppctl, enabling packet trace as follows:

clear trace
trace add rdma-input 5

then doing ping to "outside" and then "show trace".

To see the normal behavior we have compared to another server running
VPP without the strange problem happening; there we can see that the
normal behavior is that one worker starts processing the packet and
then does NAT44_OUT2IN_WORKER_HANDOFF after which another worker takes
over: "handoff_trace" and then "HANDED-OFF: from thread..." and then
that worker continues processing the packet.
So the relevant parts of the trace look like this (abbreviated to show
only node names and handoff info) for a case when thread 8 hands off
work to thread 3:

--- Start of thread 3 vpp_wk_2 ---
Packet 1

08:15:10:781992: handoff_trace
  HANDED-OFF: from thread 8 trace index 0
08:15:10:781992: nat44-out2in
08:15:10:782008: ip4-lookup
08:15:10:782009: ip4-local
08:15:10:782010: ip4-icmp-input
08:15:10:782011: ip4-icmp-echo-request
08:15:10:782011: ip4-load-balance
08:15:10:782013: ip4-rewrite
08:15:10:782014: BondEthernet0-output

--- Start of thread 8 vpp_wk_7 ---
Packet 1

08:15:10:781986: rdma-input
08:15:10:781988: bond-input
08:15:10:781989: ethernet-input
08:15:10:781989: ip4-input
08:15:10:781990: nat44-out2in-worker-handoff
  NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0

The above is what it looks like normally. The problem is that
sometimes, for some reason, the handoff stops working so that we only
get the initial processing by a worker and that worker reporting
NAT44_OUT2IN_WORKER_HANDOFF, but the other worker does not pick up the
work; it is seemingly ignored.

Here is what it looks like then, when the problem has happened, thread
7 trying to handoff to thread 3:

--- Start of thread 3 vpp_wk_2 ---
No packets in trace buffer

--- Start of thread 7 vpp_wk_6 ---
Packet 1

08:38:41:904654: rdma-input
08:38:41:904656: bond-input
08:38:41:904658: ethernet-input
08:38:41:904660: ip4-input
08:38:41:904663: nat44-out2in-worker-handoff
  NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0

So, work is also in this case handed off to thread 3 but thread 3 does
not pick it up. There is no "HANDED-OFF" message in the trace at all,
not for any worker. It seems like the handed-off work was ignored. Then
of course it is understandable that the ping does not work and packet
forwarding does not work, the question is: why does that hand-off
procedure fail?

Are there some known reasons that can cause this behavior?

When there is a NAT44_OUT2IN_WORKER_HANDOFF message in the packet
trace, should there always be a corresponding "HANDED-OFF" message for
another thread picking it up?

One more question related to the above: sometimes when looking at trace
for ICMP packets to investigate this problem we have seen a worker
apparently handing off work to itself, which seems strange. Example:

--- Start of thread 3 vpp_wk_2 ---
Packet 1

08:31:23:871274: rdma-input
08:31:23:871279: bond-input
08:31:23:871282: ethernet-input
08:31:23:871285: ip4-input
08:31:23:871289: nat44-out2in-worker-handoff
  NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0

If the purpose of "handoff" is to let another thread take over, then
this seems strange by itself (even without considering that there is no
"HANDED-OFF" for any thread): why is thread 3 trying to handoff work to
itself? Does that indicate something wrong or are there legitimate
cases where a thread "hands off" something to itself?

We have encountered this problem several times, but unfortunately we
have not yet found a way to reproduce it in a lab environment; we do
not know exactly what triggers the problem. On previous occasions,
restarting vpp made it work normally again.

Any input on this or ideas for how to troubleshoot further would be
much appreciated.

Best regards,
Elias


Re: [vpp-dev] NAT worker HANDOFF but no HANDED-OFF -- no worker picks up the handed-off work

2019-11-15 Thread Elias Rudberg
Hi Andrew,

Thanks, that looks promising. The issue 
https://jira.fd.io/browse/VPP-1734 that the fix refers to seems like it
could be the same issue we are seeing.

We have just restarted vpp with the fix, it will be interesting to see
if it helps. Thanks again for your help!

/ Elias


On Fri, 2019-11-15 at 11:26 +0100, Andrew 👽 Yourtchenko wrote:
> Hi Elias,
> 
> Could you give a shot running a build with 
> https://gerrit.fd.io/r/#/c/vpp/+/23461/ in ?
> 
> I cherry-picked it from master today but it is not in 19.08 branch
> yet.
> 
> --a
> 
> > On 15 Nov 2019, at 11:05, Elias Rudberg 
> > wrote:
> > 
> > We are using VPP 19.08 for NAT (nat44) and are struggling with the
> > following problem: it first works seemingly fine for a while, like
> > several days or weeks, but then suddenly VPP stops forwarding
> > traffic.
> > Even ping to the "outside" IP address fails.
> > 
> > The VPP process is still running so we try to investigate further
> > using
> > vppctl, enabling packet trace as follows:
> > 
> > clear trace
> > trace add rdma-input 5
> > 
> > then doing ping to "outside" and then "show trace".
> > 
> > To see the normal behavior we have compared to another server
> > running
> > VPP without the strange problem happening; there we can see that
> > the
> > normal behavior is that one worker starts processing the packet and
> > then does NAT44_OUT2IN_WORKER_HANDOFF after which another worker
> > takes
> > over: "handoff_trace" and then "HANDED-OFF: from thread..." and
> > then
> > that worker continues processing the packet.
> > So the relevant parts of the trace look like this (abbreviated to
> > show
> > only node names and handoff info) for a case when thread 8 hands
> > off
> > work to thread 3:
> > 
> > --- Start of thread 3 vpp_wk_2 ---
> > Packet 1
> > 
> > 08:15:10:781992: handoff_trace
> >  HANDED-OFF: from thread 8 trace index 0
> > 08:15:10:781992: nat44-out2in
> > 08:15:10:782008: ip4-lookup
> > 08:15:10:782009: ip4-local
> > 08:15:10:782010: ip4-icmp-input
> > 08:15:10:782011: ip4-icmp-echo-request
> > 08:15:10:782011: ip4-load-balance
> > 08:15:10:782013: ip4-rewrite
> > 08:15:10:782014: BondEthernet0-output
> > 
> > --- Start of thread 8 vpp_wk_7 ---
> > Packet 1
> > 
> > 08:15:10:781986: rdma-input
> > 08:15:10:781988: bond-input
> > 08:15:10:781989: ethernet-input
> > 08:15:10:781989: ip4-input
> > 08:15:10:781990: nat44-out2in-worker-handoff
> >  NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0
> > 
> > The above is what it looks like normally. The problem is that
> > sometimes, for some reason, the handoff stops working so that we
> > only
> > get the initial processing by a worker and that worker saying
> > NAT44_OUT2IN_WORKER_HANDOFF but the other worker does not pick up
> > the
> > work, it is seemingly ignored.
> > 
> > Here is what it looks like then, when the problem has happened,
> > thread
> > 7 trying to handoff to thread 3:
> > 
> > --- Start of thread 3 vpp_wk_2 ---
> > No packets in trace buffer
> > 
> > --- Start of thread 7 vpp_wk_6 ---
> > Packet 1
> > 
> > 08:38:41:904654: rdma-input
> > 08:38:41:904656: bond-input
> > 08:38:41:904658: ethernet-input
> > 08:38:41:904660: ip4-input
> > 08:38:41:904663: nat44-out2in-worker-handoff
> >  NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0
> > 
> > So, work is also in this case handed off to thread 3 but thread 3
> > does
> > not pick it up. There is no "HANDED-OFF" message in the trace at
> > all,
> > not for any worker. It seems like the handed-off work was ignored.
> > Then
> > of course it is understandable that the ping does not work and
> > packet
> > forwarding does not work, the question is: why does that hand-off
> > procedure fail?
> > 
> > Are there some known reasons that can cause this behavior?
> > 
> > When there is a NAT44_OUT2IN_WORKER_HANDOFF message in the packet
> > trace, should there always be a corresponding "HANDED-OFF" message
> > for
> > another thread picking it up?
> > 
> > One more question related to the above: sometimes when looking at
> > trace
> > for ICMP packets to investigate this problem we have seen a worker
> > apparently

[vpp-dev] undefined symbol: nat_ha_resync (trying to use Active-Passive NAT HA)

2019-11-26 Thread Elias Rudberg
When trying to use the Active-Passive NAT HA functionality described
at https://docs.fd.io/vpp/20.01/dd/d2e/nat_ha_doc.html and trying the
"nat ha resync" command, VPP crashes with the following message:

symbol lookup error: [...] nat_plugin.so: undefined symbol:
nat_ha_resync

The attempted function call is in nat_ha_resync_command_fn in
plugins/nat/nat44_cli.c and looks like this:

  if (nat_ha_resync (0, 0, 0))
error = clib_error_return (0, "NAT HA resync already running");

The nat_ha_resync function is declared in plugins/nat/nat_ha.h like
this:

/**
 * @brief Resync HA (resend existing sessions to new failover)
 */
int nat_ha_resync (u32 client_index, u32 pid,
   nat_ha_resync_event_cb_t event_callback);

so the declaration lets the compiler accept the function call, but
apparently the function is not implemented anywhere, which leads to the
symbol lookup error at runtime.
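As a temporary local workaround (a sketch only; it does not implement
resync, it merely avoids the crash by taking the error path that the
CLI handler already has), one could add a stub implementation in
plugins/nat/nat_ha.c:

  /* stub only: report failure instead of crashing with a symbol
     lookup error; the real resync logic is still missing */
  int
  nat_ha_resync (u32 client_index, u32 pid,
                 nat_ha_resync_event_cb_t event_callback)
  {
    return 1; /* the CLI then prints "NAT HA resync already running" */
  }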

We tried this with 19.08 as well as the current master branch and
encountered the same problem with both.

Any ideas on how to make this work?

Also, any other advice regarding the NAT HA functionality or links to
further documentation or example usage (if there is more than 
https://docs.fd.io/vpp/20.01/dd/d2e/nat_ha_doc.html) would be much
appreciated.

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14696): https://lists.fd.io/g/vpp-dev/message/14696
Mute This Topic: https://lists.fd.io/mt/61957444/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] Good LACP packets giving "error-drop" statistics

2019-11-27 Thread Elias Rudberg
We are using LACP and it works fine except that the "error-drop"
statistics are increased for each LACP packet that arrives. We see this
behavior both for VPP 19.08 and for the current master branch.

Here is an example of a packet trace for a LACP packet:

00:00:16:717846: rdma-input
  rdma: Interface101 (3) next-node bond-input
00:00:16:717848: bond-input
  src [...], dst [...], Interface101 -> Interface101
00:00:16:717849: ethernet-input
  SLOW_PROTOCOLS: [...]
00:00:16:717850: lacp-input
  Interface101:
Length: 110
  LACPv1
  Actor Information TLV: length 20
[... LACP info here ...]
  Partner Information TLV: length 20
[... LACP info here ...]
00:00:16:717851: error-drop
  rx:Interface101
00:00:16:717852: drop
  lacp-input: good lacp packets -- cache hit

So it says "good lacp packets" but at the same time "error-drop" which
seems contradictory.

LACP is in fact working fine; the only issue we have is the
"error-drop" statistic, which we would like to avoid if there is in
fact nothing wrong.

Is there some reason why it is desirable to report error-drop for all
LACP packets, or is this something that can be fixed so that error-drop 
is only used when there is something wrong?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14716): https://lists.fd.io/g/vpp-dev/message/14716
Mute This Topic: https://lists.fd.io/mt/62549109/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] Status of VPP Active-Passive NAT HA code?

2019-11-28 Thread Elias Rudberg
Hello VPP experts,

I would like to ask about the status of the Active-Passive NAT HA (high
availability) code in src/plugins/nat/nat_ha.c and nat_ha.h. In the git
history it looks like it was added by Matus Fabian in February 2019,
with few changes since then.

Having looked at it and tested it, I think it is partly working: it can
indeed sync sessions from the active to the passive vpp server, but the
"resync" functionality needed to (re-)send all session data to a new
passive vpp server is, as far as I can tell, not fully implemented. In
particular, the function nat_ha_resync declared in nat_ha.h is not
implemented in nat_ha.c, which makes vpp crash when trying to use the
"nat ha resync" command in vppctl.

The "resync" functionality would be really good to have since that
would allow us to restore the primary server in a situation when the
secondary has taken over, if resync is supported then the secondary can
send the session data back again once the primary has been
fixed/upgraded and then the original setup with the redundancy can be
recovered, all without breaking existing user sessions.

Is anyone working on that part of the code now, or using it, or have
some idea about its status?

Any advice in case I were to try implementing the missing pieces
myself?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14727): https://lists.fd.io/g/vpp-dev/message/14727
Mute This Topic: https://lists.fd.io/mt/64117356/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Status of VPP Active-Passive NAT HA code?

2019-11-29 Thread Elias Rudberg
Hi Ole,

Thanks for explaining!

The "programmable flow NAT" solution you describe sounds very
interesting, it may be better for us to focus on that if it's not too
far off in the future. Please let me know if, when and how I can help
with that.

Best regards,
Elias


> The NAT HA code was something Matus ported across from another
> project.
> The other work, was an experiment with a split of the NAT fast-path
> and slow-path, with a protocol between them.
> The NAT fast-path (aka the NAT DP) used a flow cache, with
> instructions. On cache miss it would send a protocol packet to the
> NAT slow path / NAT CP, asking for instructions for this flow.
> The flow cache was uni-directional. And forward / reverse flow could
> be on different VPP instances, as could the NAT CP.
> 
> In our experiment the NAT CP also ran on VPP (although it doesn't
> have to). And while the NAT DP instances by design don't need HA
> functionality, the backend NAT CP would have to.
> The bits of NAT HA was upstreamed by Matus from that experiment.
> 
> It hasn't been worked on since as far as I know.
> Filip probably has a better understanding of the details of that
> code.
> Happy to help of course, and happy to declare you the owner of NAT HA
> from now on. ;-)
> 
> Upstreaming the "programmable flow NAT" is on my list. It will be in
> a separate plugin. Let me know if you are interested / want to
> contribute.
> 
> Best regards,
> Ole
> 
> PS: On a more personal rant. I dislike HA solutions in general. They
> tend, by their increased level of complexity, to decrease
> reliability.
> One benefit of IPv4 run-out is that applications/the transport layer
> has been trained to expect that the network holds session state, and
> that it's the application's responsibility to maintain that session
> state in the network. I wouldn't be so worried about dropping
> sessions in the case of an unexpected event. Applications will
> recreate sessions. At least something worth testing.
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> 
> View/Reply Online (#14728): 
> https://lists.fd.io/g/vpp-dev/message/14728
> Mute This Topic: https://lists.fd.io/mt/64117356/1968077
> Group Owner: vpp-dev+ow...@lists.fd.io
> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [elias.rudberg@bahn
> hof.net]
> -=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#14733): https://lists.fd.io/g/vpp-dev/message/14733
Mute This Topic: https://lists.fd.io/mt/64117356/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] How to receive broadcast messages in VPP?

2020-02-06 Thread Elias Rudberg
Hello everyone,

I am trying to figure out how to receive broadcast messages in VPP (vpp
version 19.08 in case that matters).

This is in the context of some changes we are considering in the VPP
NAT HA functionality. That code in e.g. plugins/nat/nat_ha.c uses UDP
messages to communicate information about NAT sessions between
different VPP servers. It is currently using unicast messages, but we
are considering the possibility of using broadcast messages instead,
hoping that could be more efficient in case there are more than two
servers involved. For example, when a new NAT session has been created,
we could send a broadcast message about the new session, that would
reach several other VPP servers, without need to send a separate
unicast message to each server.

The code in plugins/nat/nat_ha.c calls udp_register_dst_port() to
register that it wants to receive UDP traffic, like this:

  udp_register_dst_port (ha->vlib_main, port,
                         nat_ha_handoff_node.index, 1);

This works fine for unicast messages; when such packets arrive at the
given port, they get handled by the nat_ha_handoff_node as desired.

However, if broadcast packets arrive, those packets are dropped
instead, they do not arrive at the nat_ha_handoff_node.

For example, if the IP address of the relevant interface on the
receiving side is 10.10.50.1/24 then unicast UDP messages with
destination 10.10.50.1 are handled fine. However, if the destination is
10.10.50.255 (the broadcast address for that /24 subnet) then the
packets are dropped. Here is an example of a packet trace when such a
packet is received from 10.10.50.2:

02:41:19:250212: rdma-input
  rdma: Interface101 (3) next-node bond-input
02:41:19:250214: bond-input
  src 02:fe:ff:76:e4:5d, dst ff:ff:ff:ff:ff:ff, Interface101 ->
BondEthernet0
02:41:19:250214: ethernet-input
  IP4: 02:fe:ff:76:e4:5d -> ff:ff:ff:ff:ff:ff 802.1q vlan 1015
02:41:19:250215: ip4-input
  UDP: 10.10.50.2 -> 10.10.50.255
tos 0x80, ttl 254, length 92, checksum 0x02fa
fragment id 0x0002, flags DONT_FRAGMENT
  UDP: 1234 -> 2345
length 72, checksum 0x
02:41:19:250216: ip4-lookup
  fib 0 dpo-idx 0 flow hash: 0x
  UDP: 10.10.50.2 -> 10.10.50.255
tos 0x80, ttl 254, length 92, checksum 0x02fa
fragment id 0x0002, flags DONT_FRAGMENT
  UDP: 1234 -> 2345
length 72, checksum 0x
02:41:19:250217: ip4-drop
UDP: 10.10.50.2 -> 10.10.50.255
  tos 0x80, ttl 254, length 92, checksum 0x02fa
  fragment id 0x0002, flags DONT_FRAGMENT
UDP: 1234 -> 2345
  length 72, checksum 0x
02:41:19:250217: error-drop
  rx:BondEthernet0.1015
02:41:19:250217: drop
  ethernet-input: no error

So the packet ends up at ip4-drop when I would have liked it to come to
nat_ha_handoff_node.

Does anyone have a suggestion about how to make this work?
Is some special configuration of the receiving interface needed to tell
VPP that we want it to receive broadcast packets?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15352): https://lists.fd.io/g/vpp-dev/message/15352
Mute This Topic: https://lists.fd.io/mt/71020576/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] How to receive broadcast messages in VPP?

2020-02-14 Thread Elias Rudberg
Hi Neale and Dave,

Thanks for your answers!
I was able to make it work using multicast as Neale suggested.

Here is roughly what I did to make it work using multicast instead of
unicast:

On the sending side, to make it send multicast packets:

adj_index_t adj_index_for_multicast =
  adj_mcast_add_or_lock (FIB_PROTOCOL_IP4, VNET_LINK_IP4, sw_if_index);

and then when a message is to be sent, use the above created adj_index
before invoking ip4_rewrite_node (instead of ip4_lookup_node):

vnet_buffer (b)->ip.adj_index[VLIB_TX] = adj_index_for_multicast;
vlib_put_frame_to_node (vm, ip4_rewrite_node.index, f);

On the receiving side the following config was needed:

ip mroute add 224.0.0.1 via MyInterface Accept
ip mroute add 224.0.0.1 via local Forward

After that it works using multicast. Thanks for your help!
(Please let me know if the above is not the right way to do it)
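
For reference, here is a minimal sketch of how those sending-side
pieces fit together (assumptions: sw_if_index identifies the outgoing
interface, bi is a fully prepared buffer index, and the my_ha_mcast_*
names are hypothetical):

static adj_index_t adj_index_for_multicast;

/* Once, at setup time: create (or lock) a multicast adjacency for the
   outgoing interface. */
static void
my_ha_mcast_init (u32 sw_if_index)
{
  adj_index_for_multicast =
    adj_mcast_add_or_lock (FIB_PROTOCOL_IP4, VNET_LINK_IP4, sw_if_index);
}

/* Per message: stamp the adjacency on the buffer and hand the frame
   straight to ip4-rewrite, bypassing ip4-lookup. */
static void
my_ha_mcast_send (vlib_main_t * vm, u32 bi)
{
  vlib_buffer_t *b = vlib_get_buffer (vm, bi);
  vlib_frame_t *f = vlib_get_frame_to_node (vm, ip4_rewrite_node.index);
  u32 *to_next = vlib_frame_vector_args (f);

  vnet_buffer (b)->ip.adj_index[VLIB_TX] = adj_index_for_multicast;
  to_next[0] = bi;
  f->n_vectors = 1;
  vlib_put_frame_to_node (vm, ip4_rewrite_node.index, f);
}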

Best regards,
Elias


On Thu, 2020-02-06 at 13:45 +, Neale Ranns via Lists.Fd.Io wrote:
> Hi Elias,
> 
> Please see inline.
> 
> 
> On 06/02/2020 12:41, "vpp-dev@lists.fd.io on behalf of Elias
> Rudberg" 
> wrote:
> 
> Hello everyone,
> 
> I am trying to figure out how to receive broadcast messages in
> VPP (vpp
> version 19.08 in case that matters).
> 
> This is in the context of some changes we are considering in the
> VPP
> NAT HA functionality. That code in e.g. plugins/nat/nat_ha.c uses
> UDP
> messages to communicate information about NAT sessions between
> different VPP servers. It is currently using unicast messages,
> but we
> are considering the possibility of using broadcast messages
> instead,
> hoping that could be more efficient in case there are more than
> two
> servers involved. For example, when a new NAT session has been
> created,
> we could send a broadcast message about the new session, that
> would
> reach several other VPP servers, without need to send a separate
> unicast message to each server.
> 
> The code in plugins/nat/nat_ha.c calls udp_register_dst_port() to
> register that it wants to receive UDP traffic, like this:
> 
>   udp_register_dst_port (ha->vlib_main, port,
>  nat_ha_handoff_node.index, 1);
> 
> This works fine for unicast messages; when such packets arrive at
> the
> given port, they get handled by the nat_ha_handoff_node as
> desired.
> 
> However, if broadcast packets arrive, those packets are dropped
> instead, they do not arrive at the nat_ha_handoff_node.
> 
> For example, if the IP address of the relevant interface on the
> receiving side is 10.10.50.1/24 then unicast UDP messages with
> destination 10.10.50.1 are handled fine. However, if the
> destination is
> 10.10.50.255 (the broadcast address for that /24 subnet) then the
> packets are dropped. Here is an example of a packet trace when
> such a
> packet is received from 10.10.50.2:
> 
> 02:41:19:250212: rdma-input
>   rdma: Interface101 (3) next-node bond-input
> 02:41:19:250214: bond-input
>   src 02:fe:ff:76:e4:5d, dst ff:ff:ff:ff:ff:ff, Interface101 ->
> BondEthernet0
> 02:41:19:250214: ethernet-input
>   IP4: 02:fe:ff:76:e4:5d -> ff:ff:ff:ff:ff:ff 802.1q vlan 1015
> 02:41:19:250215: ip4-input
>   UDP: 10.10.50.2 -> 10.10.50.255
> tos 0x80, ttl 254, length 92, checksum 0x02fa
> fragment id 0x0002, flags DONT_FRAGMENT
>   UDP: 1234 -> 2345
> length 72, checksum 0x
> 02:41:19:250216: ip4-lookup
>   fib 0 dpo-idx 0 flow hash: 0x
>   UDP: 10.10.50.2 -> 10.10.50.255
> tos 0x80, ttl 254, length 92, checksum 0x02fa
> fragment id 0x0002, flags DONT_FRAGMENT
>   UDP: 1234 -> 2345
> length 72, checksum 0x
> 02:41:19:250217: ip4-drop
> UDP: 10.10.50.2 -> 10.10.50.255
>   tos 0x80, ttl 254, length 92, checksum 0x02fa
>   fragment id 0x0002, flags DONT_FRAGMENT
> UDP: 1234 -> 2345
>   length 72, checksum 0x
> 
> if you check:
>   sh ip fib 10.10.50.255/32
> you'll see an explicit entry to drop. You can't override this.
> 
> 
> 02:41:19:250217: error-drop
>   rx:BondEthernet0.1015
> 02:41:19:250217: drop
>   ethernet-input: no error
> 
> So the packet ends up at ip4-drop when I would have liked it to
> come to
> nat_ha_handoff_node.
> 
> Does anyone have a suggestion about how to make this work?
> Is some special configuration of the receiving interface needed
> to tell
> VPP that we want it to receive broadcast packets?
>

[vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards

2020-02-14 Thread Elias Rudberg
Hello VPP developers,

We have a problem with VPP used for NAT on Ubuntu 18.04 servers
equipped with Mellanox ConnectX-5 network cards (ConnectX-5 EN network
interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket;
ROHS R6).

VPP is dropping packets in the ip4-input node due to "ip4 length > l2
length" errors, when we use the RDMA plugin.

The interfaces are configured like this:

create int rdma host-if enp101s0f1 name Interface101 num-rx-queues 1
create int rdma host-if enp179s0f1 name Interface179 num-rx-queues 1

(we have set num-rx-queues 1 now to simplify while troubleshooting, in
production we use num-rx-queues 4)

We see some packets dropped due to "ip4 length > l2 length" for example
in TCP tests with around 100 Mbit/s -- running such a test for a few
seconds already gives some errors. More traffic gives more errors and
it seems to be unrelated to the contents of the packets, it seems to
happen quite randomly and already at such moderate amounts of traffic,
very far below what should be the capacity of the hardware.

Only a small fraction of packets are dropped: in tests at 100 Mbit/s
and packet size 500, for each million packets about 3 or 4 packets get
the "ip4 length > l2 length" drop problem. However, the effect appears
stronger for larger amounts of traffic and has impacted some of our end
users who observe decreased TCP speed as a result of these drops.

The "ip4 length > l2 length" errors can be seen using vppctl "show
errors":

   142            ip4-input               ip4 length > l2 length

To get more info about the "ip4 length > l2 length" error we printed
the involved sizes when the error happens (ip_len0 and cur_len0 in
src/vnet/ip/ip4_input.h), which shows that the actual packet size is
often much smaller than the ip_len0 value which is what the IP packet
size should be according to the IP header. For example, when
ip_len0=500 as is the case for many of our packets in the test runs,
the cur_len0 value is sometimes much smaller. The smallest case we have
seen was cur_len0 = 59 with ip_len0 = 500 -- the IP header said the IP
packet size was 500 bytes, but the actual size was only 59 bytes. So it
seems some data is lost, packets have been truncated, sometimes large
parts of the packets are missing.

The problems disappear if we skip using the RDMA plugin and use the
(old?) dpdk way of handling the interfaces, then there are no "ip4
length > l2 length" drops at all. That makes us think there is
something wrong with the rdma plugin, perhaps a bug or something wrong
with how it is configured.

We have tested this with both the current master branch and the
stable/1908 branch, we see the same problem for both.

We tried updating the Mellanox driver from v4.6 to v4.7 (latest
version) but that did not help.

After trying some different values of the rx-queue-size parameter to
the "create int rdma" command, it seems like the number of "ip4 length
> l2 length" errors becomes smaller as the rx-queue-size is increased,
perhaps indicating that the problem has to do with what happens when
the end of that queue is reached.

Do you agree that the above points to a problem with the RDMA plugin in
VPP?

Are there known bugs or other issues that could explain the "ip4 length
> l2 length" drops?

Does it seem like a good idea to set a very large value of the rx-
queue-size parameter if that alleviates the "ip4 length > l2 length"
problem, or are there big downsides of using a large rx-queue-size
value?

What else could we do to troubleshoot this further, are there
configuration options to the RDMA plugin that could be used to solve
this and/or get more information about what is happening?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15403): https://lists.fd.io/g/vpp-dev/message/15403
Mute This Topic: https://lists.fd.io/mt/71273976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards

2020-02-17 Thread Elias Rudberg
Hi Ben,

Thanks for your answer.

Now I think I found the problem: it looks like a bug in
plugins/rdma/input.c related to what happens when the list of input
packets wraps around to the beginning of the ring buffer.
To fix it, the following change is needed:

diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
index 30fae83e0..f9979545d 100644
--- a/src/plugins/rdma/input.c
+++ b/src/plugins/rdma/input.c
@@ -318,7 +318,7 @@ rdma_device_input_inline (vlib_main_t * vm, vlib_node_runtime_t * node,
                                             &bt);
   if (n_tail < n_rx_packets)
     n_rx_bytes +=
-      rdma_device_input_bufs (vm, rd, &to_next[n_tail], &rxq->bufs[0], wc,
+      rdma_device_input_bufs (vm, rd, &to_next[n_tail], &rxq->bufs[0], &wc[n_tail],
                               n_rx_packets - n_tail, &bt);
   rdma_device_input_ethernet (vm, node, rd, next_index);

At that point in the code, the rdma_device_input_bufs() function is
called twice to handle the n_rx_packets that have arrived. First it is
called for the part up to the end of the buffer, and then a second call
is made to handle the remaining part, starting from the beginning of
the buffer. The problem is that the same "wc" argument is passed both
times, when in fact that pointer needs to be moved forward for the
second call, so we need &wc[n_tail] instead of just wc for the second
call to rdma_device_input_bufs() -- n_tail is the number of packets
that were handled by the first rdma_device_input_bufs() call.
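
In other words, it is the usual split processing of a ring buffer,
where the wrapped second part must use offset pointers for every
per-packet array (a generic illustration with a hypothetical
process_bufs(), not the actual VPP code):

/* n_rx packets start at index 'first' in a ring of 'size' slots;
   'wc' holds one work completion entry per received packet. */
u32 n_tail = clib_min (size - first, n_rx);
process_bufs (&ring[first], &wc[0], n_tail);   /* part up to end of ring */
if (n_tail < n_rx)
  process_bufs (&ring[0], &wc[n_tail], n_rx - n_tail);   /* wrapped part */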

In my tests so far it looks like the above change fixes the problem
completely, after the fix there are no longer any "ip4 length > l2
length" errors.

This explanation fits with what we saw in our tests earlier, that the
problem with erroneous packets became smaller when the buffer size was
increased, since the second call to rdma_device_input_bufs() only comes
into play at the end of the ring buffer, which happens more rarely when
the buffer is larger. (But after the fix above there is no longer any
need to increase the buffer size.)

What do you think, does this seem right?

Best regards,
Elias



On Mon, 2020-02-17 at 15:38 +, Benoit Ganne (bganne) via
Lists.Fd.Io wrote:
> Hi Elias,
> 
> As the problem only arise with VPP rdma driver and not the DPDK
> driver, it is fair to say it is a VPP rdma driver issue.
> I'll try to reproduce the issue on my setup and keep you posted.
> In the meantime I do not see a big issue increasing the rx-queue-size 
> to mitigate it.
> 
> ben
> 
> > -----Original Message-----
> > From: vpp-dev@lists.fd.io  On Behalf Of Elias
> > Rudberg
> > Sent: Friday, February 14, 2020 16:56
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev] VPP ip4-input drops packets due to "ip4 length >
> > l2
> > length" errors when using rdma with Mellanox mlx5 cards
> > 
> > Hello VPP developers,
> > 
> > We have a problem with VPP used for NAT on Ubuntu 18.04 servers
> > equipped with Mellanox ConnectX-5 network cards (ConnectX-5 EN
> > network
> > interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket;
> > ROHS R6).
> > 
> > VPP is dropping packets in the ip4-input node due to "ip4 length >
> > l2
> > length" errors, when we use the RDMA plugin.
> > 
> > The interfaces are configured like this:
> > 
> > create int rdma host-if enp101s0f1 name Interface101 num-rx-queues
> > 1
> > create int rdma host-if enp179s0f1 name Interface179 num-rx-queues
> > 1
> > 
> > (we have set num-rx-queues 1 now to simplify while troubleshooting,
> > in
> > production we use num-rx-queues 4)
> > 
> > We see some packets dropped due to "ip4 length > l2 length" for
> > example
> > in TCP tests with around 100 Mbit/s -- running such a test for a
> > few
> > seconds already gives some errors. More traffic gives more errors
> > and
> > it seems to be unrelated to the contents of the packets, it seems
> > to
> > happen quite randomly and already at such moderate amounts of
> > traffic,
> > very far below what should be the capacity of the hardware.
> > 
> > Only a small fraction of packets are dropped: in tests at 100
> > Mbit/s
> > and packet size 500, for each million packets about 3 or 4 packets
> > get
> > the "ip4 length > l2 length" drop problem. However, the effect
> > appears
> > stronger for larger amounts of traffic and has impacted some of our
> > end
> > users who observe decresed TCP speed as a result of these drops.
> > 
> > The "ip4 length > l2 length" errors can be seen using vppctl "show
> > errors":
> > 
> > 

Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards

2020-02-18 Thread Elias Rudberg
Hi Ben,

Great! I tried submitting a patch myself, here it is:

https://gerrit.fd.io/r/c/vpp/+/25233

Let me know if something more is needed. I tried to follow the
instructions here: 
https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review

/ Elias


On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via
Lists.Fd.Io wrote:
> Hi Elias,
> 
> > Now I think I found the problem, looks like a bug in
> > plugins/rdma/input.c related to what happens when the list of input
> > packets wrap around to the beginning of the ring buffer.
> > To fix it, the following change is needed:
> 
> Indeed, your fix is correct, good catch. Do you want to submit a
> patch through gerrit or do you prefer me to do it?
> 
> Best
> ben

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15446): https://lists.fd.io/g/vpp-dev/message/15446
Mute This Topic: https://lists.fd.io/mt/71273976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards

2020-03-09 Thread Elias Rudberg
Hello,

Could this fix be applied in the stable/1908 (and maybe stable/2001)
branch also?

Best regards,
Elias



On Tue, 2020-02-18 at 11:48 +, Elias Rudberg wrote:
> Hi Ben,
> 
> Great! I tried submitting a patch myself, here it is:
> 
> https://gerrit.fd.io/r/c/vpp/+/25233
> 
> Let me know if something more is needed. I tried to follow the
> instructions here: 
> https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review
> 
> / Elias
> 
> 
> On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via
> Lists.Fd.Io wrote:
> > Hi Elias,
> > 
> > > Now I think I found the problem, looks like a bug in
> > > plugins/rdma/input.c related to what happens when the list of
> > > input
> > > packets wrap around to the beginning of the ring buffer.
> > > To fix it, the following change is needed:
> > 
> > Indeed, your fix is correct, good catch. Do you want to submit a
> > patch through gerrit or do you prefer me to do it?
> > 
> > Best
> > ben

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15702): https://lists.fd.io/g/vpp-dev/message/15702
Mute This Topic: https://lists.fd.io/mt/71273976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards

2020-03-11 Thread Elias Rudberg
Hello again,

Thanks for the help with getting this fix into the 1908 branch!

Could the same fix please be added in the stable/2001 branch also?

That would be very helpful for us: although we have been using 19.08
until now, we are about to move to 20.01 because we need the NAT
improvements in 20.01 that are not available in 19.08.

Best regards,
Elias



On Mon, 2020-03-09 at 09:21 +, Elias Rudberg wrote:
> Hello,
> 
> Could this fix be applied in the stable/1908 (and maybe stable/2001)
> branch also?
> 
> Best regards,
> Elias
> 
> 
> 
> On Tue, 2020-02-18 at 11:48 +, Elias Rudberg wrote:
> > Hi Ben,
> > 
> > Great! I tried submitting a patch myself, here it is:
> > 
> > https://gerrit.fd.io/r/c/vpp/+/25233
> > 
> > Let me know if something more is needed. I tried to follow the
> > instructions here: 
> > https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review
> > 
> > / Elias
> > 
> > 
> > On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via
> > Lists.Fd.Io wrote:
> > > Hi Elias,
> > > 
> > > > Now I think I found the problem, looks like a bug in
> > > > plugins/rdma/input.c related to what happens when the list of
> > > > input
> > > > packets wrap around to the beginning of the ring buffer.
> > > > To fix it, the following change is needed:
> > > 
> > > Indeed, your fix is correct, good catch. Do you want to submit a
> > > patch through gerrit or do you prefer me to do it?
> > > 
> > > Best
> > > ben

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15725): https://lists.fd.io/g/vpp-dev/message/15725
Mute This Topic: https://lists.fd.io/mt/71273976/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] impact of API requests on forwarding performance?

2020-03-12 Thread Elias Rudberg
Hi Andreas,

I think you are right about the stop-world way it works.

We have seen a performance impact, but that was for a command that was
quite slow, listing something with many lines of output (the "show
nat44 sessions" command). So then the worker threads were stopped
during that whole operation and we saw some packet drops each time.
Later we were able to extract the info we needed in other ways (like
getting the number of sessions directly as a single number per thread
via the Python API instead of fetching a large output and counting
lines in it), so we could avoid that performance problem.

For small things like checking the values of some counters, we have not
seen any performance impact. But then we only did those calls once
every 30 seconds or so. If you do it very often, like many times
per second, maybe there could be a performance impact also for small
things. I suppose you could test it by gradually increasing the
frequency of your API calls and seeing when drops start to happen.

Best regards,
Elias


On Wed, 2020-03-11 at 17:03 +0100, Andreas Schultz wrote:
> Hi,
> 
> Has anyone benchmarked the impact of VPP API invocations on the
> forwarding performance?
> 
> Background: most calls on the VPP API run in a stop-world manner. That
> means all graph node worker threads are stopped at a barrier, the API
> call is executed and then the workers are released from the barrier.
> Right?
> 
> My question is now, when I do 1k, 10k or even 100k API invocation per
> second, how does that impact the forwarding performance of VPP?
> 
> Does anyone have a use-case running that is actually doing that?
> 
> Many thanks,
> Andreas

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15738): https://lists.fd.io/g/vpp-dev/message/15738
Mute This Topic: https://lists.fd.io/mt/71882379/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] impact of API requests on forwarding performance?

2020-03-12 Thread Elias Rudberg
Hi Ole,

Thanks for explaining!
I'm sorry if what I wrote before was wrong or confusing.

> Checking counters values in the stats segment has _no_ impact on VPP.
> VPP writes those counters regardless of reader frequency.

That's great!

Just to be clear, to make sure I understand what this means: if we do
the following in Python:

from vpp_papi.vpp_stats import VPPStats
stat = VPPStats("/run/vpp/stats.sock")
dir = stat.ls(['^/nat44/total-users'])
counters = stat.dump(dir)
list_of_counters = counters.get('/nat44/total-users')

(followed by a loop in Python to sum up the counter values from the
different vpp threads), then what we are doing is just reading counter
values in the stats segment, so there should be no impact on VPP?
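
For completeness, the summing loop could look something like this (a
sketch; the exact shape of what dump() returns may vary between
vpp_papi versions, so the assumption that each entry is a single
per-thread number may need adjusting):

total_users = 0
for per_thread_value in list_of_counters:
    # assumption: one plain number per vpp thread; in some versions each
    # entry is itself a list of per-index values and needs its own sum()
    total_users += per_thread_value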

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15743): https://lists.fd.io/g/vpp-dev/message/15743
Mute This Topic: https://lists.fd.io/mt/71882379/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] NAT bugix related to in2out/out2in handoff node index

2020-03-13 Thread Elias Rudberg
Hello,

While working on moving from VPP 19.08 to 20.01 we found that NAT was
no longer working and it seems to be due to a bug in
src/plugins/nat/nat.c for the dynamic endpoint-independent case, here:

sm->handoff_out2in_index = snat_in2out_node.index;
sm->handoff_in2out_index = snat_out2in_node.index;

As I understand it, handoff_out2in_index is supposed to be the node
index of the out2in node, but it is set to the in2out node index
instead, and the other way around: in2out and out2in are mixed up in
those two lines.
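
For clarity, after the fix the two lines read as follows (this is
exactly the two-line swap described above, nothing else):

sm->handoff_out2in_index = snat_out2in_node.index;
sm->handoff_in2out_index = snat_in2out_node.index;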

I pushed a fix to gerrit; it's just those two lines that are changed:
https://gerrit.fd.io/r/c/vpp/+/25856

If you agree, can this fix please be accepted into master and also into
the stable/2001 branch?

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15772): https://lists.fd.io/g/vpp-dev/message/15772
Mute This Topic: https://lists.fd.io/mt/71926127/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] Approve NAT in2out/out2in handoff node index fix in stable/2001 branch also?

2020-03-17 Thread Elias Rudberg
Hello,

Can someone please approve this change so that we get the fix in the
stable/2001 branch also?

https://gerrit.fd.io/r/c/vpp/+/25861

(it was done in the master branch last week -- see 
https://gerrit.fd.io/r/c/vpp/+/25856 -- then it was cherry picked for
the stable/2001 branch)

Best regards,
Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15799): https://lists.fd.io/g/vpp-dev/message/15799
Mute This Topic: https://lists.fd.io/mt/72020681/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP 20.05 problems related to memory allocations -- possible memory leak?

2021-11-17 Thread Elias Rudberg
Hello Murthy,

I think that the problem with VPP 20.05 that I wrote about back in
October 2020 later turned out to be related to NAT44 hairpinning, see
the discussion here: https://lists.fd.io/g/vpp-dev/topic/78662322

A fix was merged so the exact same problem should not happen for the
21.06 version that you are using.

If you have a similar problem, in the sense that VPP runs out of
memory for some unknown reason, then my advice would be to gather as
much diagnostic information as you can (error counters and so on) from
around the time the problem happens, to see if that gives you any clue.
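
For example, something as simple as periodically capturing the output
of commands like these (available in recent VPP releases) can help
narrow down where the memory goes:

vppctl show errors
vppctl show memory main-heap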

Best regards,
Elias


On Tue, 2021-11-16 at 23:10 -0800, Satya Murthy wrote:
> Hi Klement/Elias/Vpp-Experts,
> 
> We are also seeing the same crash with fdio 21.06 version.
> 
> vec_resize_allocate_memory + 0x285
> vlib_put_next_frame + 0xbd
> 
> Our main-heap size is set to 2G.
> 
> Is this a known issue (or) any fix that is available for this.
> 
> Any inputs will be helpful.
> 
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20505): https://lists.fd.io/g/vpp-dev/message/20505
Mute This Topic: https://lists.fd.io/mt/77479819/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] Are some VPP releases considered LTS releases?

2022-10-31 Thread Elias Rudberg
Hello VPP experts,

Are some VPP releases considered LTS (long-term support) releases?
If so, which is the latest LTS version at this time?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22097): https://lists.fd.io/g/vpp-dev/message/22097
Mute This Topic: https://lists.fd.io/mt/94681424/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] How to make VPP work with Mellanox ConnectX-6 NICs?

2022-11-16 Thread Elias Rudberg
Hello VPP experts,

We have been using VPP with Mellanox ConnectX-5 cards for a while,
which has been working great.

Now we have a new server where we want to run VPP in the same way that
we are used to; the difference is that the new server has ConnectX-6
cards instead of ConnectX-5.

The lspci command shows each ConnectX-6 card as follows:

51:00.0 Infiniband controller: Mellanox Technologies MT28908 Family
[ConnectX-6]

Trying to create an interface using the following command:

create int rdma host-if ibs1f1 name if1 num-rx-queues 4

gives the following error:

DBGvpp# create int rdma host-if ibs1f1 name if1 num-rx-queues 4
create interface rdma: Queue Pair create failed: Invalid argument

and journalctl shows the following:

Nov 16 16:06:39 [...] vnet[3147]: rdma: rdma_txq_init: Queue Pair
create failed: Invalid argument
Nov 16 16:06:39 [...] vnet[3147]: create interface rdma: Queue Pair
create failed: Invalid argument
Nov 16 16:06:39 [...] kernel: infiniband mlx5_3: create_qp:3206:(pid
3147): Create QP type 8 failed

We are using Ubuntu 22.04 and the VPP version tested was vpp v22.10.

Do we need to do something different when using ConnectX-6 cards
compared to the ConnectX-5 case?

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22189): https://lists.fd.io/g/vpp-dev/message/22189
Mute This Topic: https://lists.fd.io/mt/95069595/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] How to make VPP work with Mellanox ConnectX-6 NICs?

2022-11-22 Thread Elias Rudberg
Hi Ben,

You were right that my issue had to do with IB/ETH mode. The card was
set to IB mode; after changing to ETH mode, things are now working. No
change was needed in how VPP is configured for ConnectX-6 compared to
ConnectX-5; everything is the same except that the interface names are
slightly different, as can be seen for example using the "ip link"
command.

In case it helps someone else: the command used to change the mode was
"mlxconfig", and the options changed were LINK_TYPE_P1 and
LINK_TYPE_P2, both of which were changed from IB(1) to ETH(2).
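
The exact invocation depends on the setup, but it was along these lines
(the PCI address here is just the example shown by lspci above; note
that the new link type takes effect only after a reboot or firmware
reset):

mlxconfig -d 51:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2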

Thanks!

/ Elias


On Tue, 2022-11-22 at 08:59 +, Benoit Ganne (bganne) via
lists.fd.io wrote:
> Hi Elias,
> 
> Sorry, this slipped through my mind!
> I do not have any Cx6 to test (I think we should receive some in CSIT
> at some point), but as it seems to complain about the QP type 8 which
> is supposed to be the ethernet queue type, can you check if your
> adapter supports Ethernet and if so, if it is set to Ethernet and not
> IB? You might need to use some mlx tools to query/change settings in
> the card fw.
> 
> Best
> ben
> 
> > -----Original Message-----
> > From: vpp-dev@lists.fd.io  On Behalf Of Elias
> > Rudberg
> > Sent: Wednesday, November 16, 2022 17:10
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev] How to make VPP work with Mellanox ConnectX-6
> > NICs?
> > 
> > Hello VPP experts,
> > 
> > We have been using VPP with Mellanox ConnectX-5 cards for a while,
> > which has been working great.
> > 
> > Now we have a new server where we want to run VPP in a similar way
> > that
> > we are used to, the difference is that the new server has ConnectX-
> > 6
> > cards instead of ConnectX-5.
> > 
> > The lspci command shows each ConnectX-6 card as follows:
> > 
> > 51:00.0 Infiniband controller: Mellanox Technologies MT28908 Family
> > [ConnectX-6]
> > 
> > Trying to create an interface using the following command:
> > 
> > create int rdma host-if ibs1f1 name if1 num-rx-queues 4
> > 
> > gives the following error:
> > 
> > DBGvpp# create int rdma host-if ibs1f1 name if1 num-rx-queues 4
> > create interface rdma: Queue Pair create failed: Invalid argument
> > 
> > and journalctl shows the following:
> > 
> > Nov 16 16:06:39 [...] vnet[3147]: rdma: rdma_txq_init: Queue Pair
> > create failed: Invalid argument
> > Nov 16 16:06:39 [...] vnet[3147]: create interface rdma: Queue Pair
> > create failed: Invalid argument
> > Nov 16 16:06:39 [...] kernel: infiniband mlx5_3:
> > create_qp:3206:(pid
> > 3147): Create QP type 8 failed
> > 
> > We are using Ubuntu 22.04 and the VPP version tested was vpp
> > v22.10.
> > 
> > Do we need to do something different when using ConnectX-6 cards
> > compared to the ConnectX-5 case?
> > 
> > Best regards,
> > Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#0): https://lists.fd.io/g/vpp-dev/message/0
Mute This Topic: https://lists.fd.io/mt/95069595/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] Traffic shaping functionality in VPP?

2023-03-23 Thread Elias Rudberg
Hello VPP experts,

We have been using VPP for NAT44 for a while, which has worked great.
We contributed some fixes a couple of years ago and have been using VPP
without issues since then.

Now we are considering the possibility of using VPP for a different
use case as well, related to "broadband network gateway" (BNG)
functionality. This would involve traffic shaping: something like
buffering packets for each user/subscriber when that user's rate of
traffic reaches a certain limit, allowing different limits for
different users. There would need to be a separate buffer for each
user, plus counters to keep track of each user's current traffic rate.
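
To make the idea concrete, the per-user state could be a classic token
bucket along these lines (a generic sketch using VPP-style u32/f64
typedefs; this is not existing VPP code):

typedef unsigned int u32;   /* as in vppinfra */
typedef double f64;

typedef struct
{
  f64 tokens;       /* bytes currently available to send */
  f64 rate;         /* allowed bytes per second for this subscriber */
  f64 burst;        /* bucket depth in bytes */
  f64 last_update;  /* timestamp of the last refill */
} subscriber_shaper_t;

/* Refill the bucket based on elapsed time; return 1 if the packet may
   be sent now, 0 if it should go into the per-user buffer instead. */
static int
shaper_allow (subscriber_shaper_t * s, u32 n_bytes, f64 now)
{
  s->tokens += (now - s->last_update) * s->rate;
  if (s->tokens > s->burst)
    s->tokens = s->burst;
  s->last_update = now;
  if (s->tokens < (f64) n_bytes)
    return 0;
  s->tokens -= (f64) n_bytes;
  return 1;
}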

Questions related to this:

- is there some already existing traffic shaping functionality in VPP
that could be used for this?

- otherwise, if we were to implement such functionality, would you say
it is feasible to do as a VPP plugin and do you have advice on how to
do it?

- are others on this list also interested in this, or even someone
already working on something like this?

I would also be interested in any other comments or thoughts you may
have about this.

Best regards,
Elias


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22757): https://lists.fd.io/g/vpp-dev/message/22757
Mute This Topic: https://lists.fd.io/mt/97800741/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-