Re: [vpp-dev] n_vectors...
Hi Chris and Dave,

Thanks for bringing this up, and thanks for explaining! I agree with Chris that this is confusing; it makes the code much more difficult to understand. Perhaps this is the kind of thing that doesn't matter much to those who are already familiar with the code, while at the same time mattering a lot for newcomers. If you want to lower the threshold for new people to come in, understand the code and possibly contribute, then I think it would be a good idea to fix this even if it means changing many lines of code. It could be argued that the fact that "n_vectors" exists in so many places makes it even more important to have a reasonable name for it. One way could be to start by renaming things in some of the main data structures, like those in vlib/node.h and vlib/threads.h and such places, and then make the changes the compiler will force as a result of that.

Best regards,
Elias

On Tue, 2020-03-31 at 00:45 +, Dave Barach via Lists.Fd.Io wrote:
> Hmmm, yeah. Been at this for years, I can't really remember when we
> settled on e.g. n_vectors vs. n_vector_elts or some such.
>
> In new code, it's perfectly fair to use whatever names seem fit for
> purpose.
>
> Vlib would be happy doing image processing, or any other kind of
> vector processing. There's no law which says that frames need to have
> 32-bit elements. Each node decides.
>
> FWIW... Dave
>
> From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps
> Sent: Monday, March 30, 2020 8:07 PM
> To: vpp-dev
> Cc: Christian Hopps
> Subject: [vpp-dev] n_vectors...
>
> Something has always bothered me about my understanding of VPP's use
> of the terms "vector" and "vectors". When I think of Vector Packet
> Processing, I think of processing a vector (array) of packets in a
> single call to a node.
> The code, though, seems to refer to the individual packets as
> "vectors" when it uses field names like "n_vectors" to refer to the
> number of buffers in a frame, or when "show runtime" talks about
> "vectors per call", when I think it's really talking about
> "packets/buffers per call" (and my mind wants to think that it's
> always *1* vector/frame of packets per call by design).
>
> I find this confusing, and so I thought I'd ask if there was some
> meaning here I'm missing?
>
> Thanks,
> Chris.

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#15956): https://lists.fd.io/g/vpp-dev/message/15956
Mute This Topic: https://lists.fd.io/mt/72667316/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] VPP nat ipfix logging problem, need to use thread-specific vlib_main_t?
Hello VPP experts,

We have been using VPP for NAT44 for a while and it has been working fine, but a few days ago when we tried turning on NAT ipfix logging, VPP crashed. It turned out that the problem went away if we used only a single thread, so it seemed related to how threading was handled in the ipfix logging code. The crash happened in different ways on different runs, but often seemed related to the snat_ipfix_send() function in plugins/nat/nat_ipfix_logging.c.

Having looked at the code in nat_ipfix_logging.c, I have the following theory about what goes wrong (I might have misunderstood something, if so please correct me):

In the snat_ipfix_send() function, a vlib_main_t data structure is used; a pointer to it is fetched in the following way:

   vlib_main_t *vm = frm->vlib_main;

So the frm->vlib_main pointer comes from "frm", which has been set to flow_report_main, a global data structure from vnet/ipfix-export/flow_report.c that as far as I can tell only exists once in memory (not once per thread). This means that different threads calling the snat_ipfix_send() function are using the same vlib_main_t data structure. That is not how it should be, I think; instead each thread should be using its own thread-specific vlib_main_t data structure.

A suggestion for how to fix this is to replace the line

   vlib_main_t *vm = frm->vlib_main;

with the following line

   vlib_main_t *vm = vlib_mains[thread_index];

in all places where worker threads are using such a vlib_main_t pointer. Using vlib_mains[thread_index] means that we are picking the thread-specific vlib_main_t data structure for the current thread, instead of all threads using the same vlib_main_t. I pushed such a change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/26359

That fix seems to solve the issue in my tests; VPP does not crash anymore after the change. Please have a look at it and let me know if this seems reasonable or if I have misunderstood something.
Best regards,
Elias

View/Reply Online (#15990): https://lists.fd.io/g/vpp-dev/message/15990
Re: [vpp-dev] n_vectors...
Hi Dave,

Thanks for your answer. I understand now that there are many difficulties and problems with renaming things in existing code which I did not realize before.

> P.S. mapping "n_vectors" to whatever it means to you seems like a
> pretty minimal entry barrier. It's not like the code is inconsistent.

Here, however, I disagree: I think it can be a significant entry barrier. Imagine yourself in the position of a newcomer starting to use and learn about VPP, perhaps someone with an engineering background who knows the "vector" concept from linear algebra courses and so on. This person has read about the ideas of how VPP works, for example here: https://wiki.fd.io/view/VPP/What_is_VPP%3F where it says "the VPP platform grabs all available packets from RX rings to form a vector of packets", which seems fine according to the usual meaning of the word "vector". Up to that point everything is fine, and someone familiar with the vector concept will feel that their knowledge about vectors can be useful when working with VPP. But the moment this person starts looking at the code and sees "n_vectors" there, that will be confusing. Making the assumption that the VPP source code uses its own definition of what a "vector" is, is actually a pretty big step to take. Of course it's not the first time a word has different meanings depending on the context, but in this case the concept of a "vector" is quite well established, and it also seems to be used according to its usual meaning in the VPP documentation. Then it becomes confusing when the word apparently has a different meaning in the source code.

So, while you are probably right that it's not practical to rename things like that in the existing code, I still think this issue can be a significant obstacle for new people coming in. Anyway, thanks again for explaining the situation; for me personally this helped my understanding a lot.
Best regards,
Elias

View/Reply Online (#16000): https://lists.fd.io/g/vpp-dev/message/16000
Re: [vpp-dev] n_vectors...
Hi Burt,

Thanks, but then I think you mean vectors as in src/vppinfra/vec.h, while the discussion here was about how the name "n_vectors" is used in, for example, src/vlib/node.h and such places. It's a different thing.

Suppose we have a situation like this, first described using a picture without using the word "vector" for anything:

A : [ a1 a2 a3 ]
B : [ b1 b2 b3 b4 ]

Then the above can be described in different ways.

Alternative 1: we can say that A and B are vectors. A is a vector with 3 elements, B is a vector with 4 elements. The number of vectors is 2 (A and B). According to this view, if there was something called n_vectors then we would say that n_vectors=2.

Alternative 2 (the VPP way): A consists of 3 vectors, and B consists of 4 vectors. The number of vectors for A is 3, and the number of vectors for B is 4. A and B each have their own n_vectors value: A has n_vectors=3 and B has n_vectors=4.

At least this is how I think it is in the VPP source code. The VPP source code can be confusing if you assume the word "vector" is used as in alternative 1. I think the main scenario of interest in VPP is that there is a bunch of packets that are processed together. You might think that this would be described as a vector of packets, but the VPP source code instead describes the individual packets as vectors, so that "number of vectors" in effect means "number of packets". At least that is how I think it is. There is at least one comment in src/vlib/node.h that seems to say this; it looks like this:

  /* Number of vector elements currently in frame. */
  u16 n_vectors;

So that variable is called n_vectors, but according to the comment its meaning is the number of vector elements rather than the number of vectors.

Best regards,
Elias

View/Reply Online (#16016): https://lists.fd.io/g/vpp-dev/message/16016
[vpp-dev] Deadlock between NAT threads when frame queues for handoff are congested
Hello VPP experts,

We are using VPP for NAT44, and last week we encountered a problem where some VPP threads stopped forwarding traffic. We saw the problem on two separate VPP servers within a short time; apparently it was triggered by some specific kind of out2in traffic that arrived at that time. As far as I can tell, this issue exists in both the current master branch and in the 1908 and 2001 branches.

After investigating and finally being able to reproduce the problem in a lab setting, we came to the following conclusion about what happened:

The scenario where this happens is that several threads (8 threads in our case) are used for NAT, and the frame queues for handoff between threads become congested for some of the threads. This can be triggered, for example, by "garbage" out2in traffic that comes in at some port: if much of the out2in traffic has the same destination port, then much of the traffic will be handed off to the same thread, since the out2in handoff thread index is decided based on the destination port. It doesn't matter whether the traffic belongs to any existing NAT sessions, since handoff must be done before checking that, and the problem is related to the handoff.

When a frame queue is congested, that is supposed to be detected by the is_vlib_frame_queue_congested() call in vlib_buffer_enqueue_to_thread(). However, that check is not completely reliable, since other threads may add things to the queue after the check. For example, two threads can call is_vlib_frame_queue_congested() simultaneously and both conclude that the queue is not congested, when in fact it will be congested once one of them has added to the queue, giving trouble for the other thread.
This problem is to some extent mitigated by the fact that the check in is_vlib_frame_queue_congested() uses a "queue_hi_thresh" value that is set slightly lower than the number of elements in the queue; it is set like this:

  fqm->queue_hi_thresh = frame_queue_nelts - 2;

The -2 there means that things are still OK if two threads call is_vlib_frame_queue_congested() simultaneously, but if three or four threads do it simultaneously we are anyway in trouble, and that seems to be what happened on our VPP servers last week. This leads to one or more threads being stuck in an infinite loop, the loop that looks like this in vlib_get_frame_queue_elt():

  /* Wait until a ring slot is available */
  while (new_tail >= fq->head_hint + fq->nelts)
    vlib_worker_thread_barrier_check ();

The loop above is supposed to end when a different thread changes the value of the volatile variable fq->head_hint, but that will not happen if the other thread is also stuck in this loop. We get a deadlock: A is waiting for B and B is waiting for A. In the context of NAT, thread A wants to hand off something to thread B at the same time as thread B wants to hand off something to thread A, while at the same time their frame queues are congested. This leads to those two threads being stuck in the loop forever, each of them waiting for the other one.

To me it looks like the subtraction of 2 when setting queue_hi_thresh is just an ad hoc choice; there is no reason why 2 would be enough. I think that to make it safe, we need to subtract the number of threads. Essentially, we need to ensure that there is room for each thread to reserve one extra element in the queue, so that no thread can get stuck waiting in the loop above. I tested this by hard-coding -8 instead of -2, and then the problem cannot be reproduced anymore, so that fix seems to work. The frame_queue_nelts value is 64, so using -8 means that the queue is considered congested already at 56 elements instead of 62 as it is now.
What do you think, is it a good solution to check the number of threads and use that to set "fqm->queue_hi_thresh = frame_queue_nelts - n_threads;"?

Best regards,
Elias

View/Reply Online (#16083): https://lists.fd.io/g/vpp-dev/message/16083
Re: [vpp-dev] Deadlock between NAT threads when frame queues for handoff are congested
Hi Ole!

Thanks, here is a change doing that, please have a look: https://gerrit.fd.io/r/c/vpp/+/26544

With this change, an assertion will fail if the number of threads is greater than 55 or something like that. To make things work for such large thread counts it would be necessary to also increase the queue size; this change does not handle that.

Best regards,
Elias

On Thu, 2020-04-16 at 13:43 +0200, Ole Troan wrote:
> Hi Elias,
>
> Thank you for the thorough analysis.
> I think the best approach for now is the one you propose. Reserve as
> many slots as you have workers.
> Potentially also increase the queue size beyond 64.
>
> Damjan is looking at some further improvements in this space, but for
> now please go with what you propose.
>
> Best regards,
> Ole

View/Reply Online (#16089): https://lists.fd.io/g/vpp-dev/message/16089
Re: [vpp-dev] VPP nat ipfix logging problem, need to use thread-specific vlib_main_t?
Hello,

There was a merge conflict for my previous fix for this. Now I made a new one; it's essentially the same thing, just avoiding the merge conflict: https://gerrit.fd.io/r/c/vpp/+/26659

Please have a look at that one and merge if it seems OK. Based on our experience from the past few weeks it seems good; we have seen no more ipfix logging crashes after implementing this fix.

Best regards,
Elias

On Sun, 2020-04-05 at 12:08 +, Dave Barach via lists.fd.io wrote:
> If you have the thread index handy, that's OK. Otherwise, use
> vlib_get_main() which grabs the thread index from thread local
> storage.
>
> -Original Message-
> From: vpp-dev@lists.fd.io On Behalf Of Elias Rudberg
> Sent: Sunday, April 5, 2020 4:58 AM
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] VPP nat ipfix logging problem, need to use
> thread-specific vlib_main_t?
>
> Hello VPP experts,
>
> We have been using VPP for NAT44 for a while and it has been working
> fine, but a few days ago when we tried turning on nat ipfix logging,
> vpp crashed. It turned out that the problem went away if we used only
> a single thread, so it seemed related to how threading was handled in
> the ipfix logging code. The crash happened in different ways on
> different runs but often seemed related to the snat_ipfix_send()
> function in plugins/nat/nat_ipfix_logging.c.
>
> Having looked at the code in nat_ipfix_logging.c I have the following
> theory about what goes wrong (I might have misunderstood something,
> if so please correct me):
>
> In the snat_ipfix_send() function, a vlib_main_t data structure
> is used; a pointer to it is fetched in the following way:
>
>    vlib_main_t *vm = frm->vlib_main;
>
> So the frm->vlib_main pointer comes from "frm" which has been set to
> flow_report_main which is a global data structure from
> vnet/ipfix-export/flow_report.c that as far as I can tell only exists
> once in memory (not once per thread). This means that different
> threads calling the snat_ipfix_send() function are using the same
> vlib_main_t data structure. That is not how it should be, I think;
> instead each thread should be using its own thread-specific
> vlib_main_t data structure.
>
> A suggestion for how to fix this is to replace the line
>
>    vlib_main_t *vm = frm->vlib_main;
>
> with the following line
>
>    vlib_main_t *vm = vlib_mains[thread_index];
>
> in all places where worker threads are using such a vlib_main_t
> pointer. Using vlib_mains[thread_index] means that we are picking the
> thread-specific vlib_main_t data structure for the current thread,
> instead of all threads using the same vlib_main_t. I pushed such a
> change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/26359
>
> That fix seems to solve the issue in my tests; vpp does not crash
> anymore after the change. Please have a look at it and let me know if
> this seems reasonable or if I have misunderstood something.
>
> Best regards,
> Elias

View/Reply Online (#16147): https://lists.fd.io/g/vpp-dev/message/16147
[vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler
Hello VPP experts,

When trying to use the current master branch, we get a segmentation fault error. Here is what it looks like in gdb:

Thread 3 "vpp_wk_0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fedf91fe700 (LWP 21309)]
rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0, rxq=0x77edea80, is_mlx5dv=1) at vpp/src/plugins/rdma/input.c:115
115       *(u64x4 *) (va + 4) = u64x4_byte_swap (*(u64x4 *) (va + 4));
(gdb) bt
#0  rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0, rxq=0x77edea80, is_mlx5dv=1) at vpp/src/plugins/rdma/input.c:115
#1  0x7fffa84d in rdma_device_input_inline (vm=0x7ff8a5d2f4c0, node=0x7ff5ccdfee00, frame=0x0, rd=0x7fedd35ed5c0, qid=0, use_mlx5dv=1) at vpp/src/plugins/rdma/input.c:622
#2  0x7fffabbbae44 in rdma_input_node_fn_skx (vm=0x7ff8a5d2f4c0, node=0x7ff5ccdfee00, frame=0x0) at vpp/src/plugins/rdma/input.c:647
#3  0x760e3155 in dispatch_node (vm=0x7ff8a5d2f4c0, node=0x7ff5ccdfee00, type=VLIB_NODE_TYPE_INPUT, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0, last_time_stamp=66486783453597600) at vpp/src/vlib/main.c:1235
#4  0x760ddbf5 in vlib_main_or_worker_loop (vm=0x7ff8a5d2f4c0, is_main=0) at vpp/src/vlib/main.c:1815
#5  0x760dd227 in vlib_worker_loop (vm=0x7ff8a5d2f4c0) at vpp/src/vlib/main.c:1996
#6  0x761345a1 in vlib_worker_thread_fn (arg=0x7fffb74ea980) at vpp/src/vlib/threads.c:1795
#7  0x75531954 in clib_calljmp () at vpp/src/vppinfra/longjmp.S:123
#8  0x7fedf91fdce0 in ?? ()
#9  0x7612cd53 in vlib_worker_thread_bootstrap_fn (arg=0x7fffb74ea980) at vpp/src/vlib/threads.c:584
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

This segmentation fault happens the same way every time I try to start VPP. This is on Ubuntu 18.04.4, using the rdma plugin with Mellanox mlx5 NICs and an Intel Xeon Gold 6126 CPU. I have looked back at recent changes and found that this problem started with the commit 4ba16a44 "misc: switch to clang-9" dated April 28. Before that we could use the master branch without this problem. Changing back to gcc by removing clang in src/CMakeLists.txt makes the error go away. However, there is then instead a problem with a "symbol lookup error" for crypto_native_plugin.so:

  undefined symbol: crypto_native_aes_cbc_init_avx512

(that problem disappears if disabling the crypto_native plugin)

So, two problems:

(1) The segmentation fault itself, perhaps indicating a bug somewhere, but it seems to appear only with clang and not with gcc

(2) The "undefined symbol: crypto_native_aes_cbc_init_avx512" problem when trying to use gcc instead of clang

What do you think about these? As a short-term fix, is removing clang in src/CMakeLists.txt reasonable, or is there a better/easier workaround? Does anyone else use the rdma plugin when compiling with clang -- perhaps that combination triggers this problem?

Best regards,
Elias

View/Reply Online (#16252): https://lists.fd.io/g/vpp-dev/message/16252
Re: [vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler
Hi Dave and Damjan,

Here is the instruction and register info:

(gdb) x/i $pc
=> 0x7fffabbbdd67 : vmovdqa64 -0x30a0(%rbp),%ymm0
(gdb) info registers rbp ymm0
rbp  0x7417daf0  0x7417daf0
ymm0 {v8_float = {0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0xfffd}, v4_double = {0x0, 0x37, 0x0, 0xff85}, v32_int8 = {0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x41, 0x80, 0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x4b, 0x40, 0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x55, 0x0, 0x0, 0x0, 0x0, 0x10, 0x3f, 0xf6, 0x5e, 0xc0}, v16_int16 = {0x0, 0x1000, 0xf63f, 0x8041, 0x0, 0x1000, 0xf63f, 0x404b, 0x0, 0x1000, 0xf63f, 0x55, 0x0, 0x1000, 0xf63f, 0xc05e}, v8_int32 = {0x1000, 0x8041f63f, 0x1000, 0x404bf63f, 0x1000, 0x55f63f, 0x1000, 0xc05ef63f}, v4_int64 = {0x8041f63f1000, 0x404bf63f1000, 0x55f63f1000, 0xc05ef63f1000}, v2_int128 = {0x404bf63f10008041f63f1000, 0xc05ef63f1055f63f1000}}

I'm not sure I understand all of this, but perhaps it means that the value in %rbp is used as a memory address, and that address 0x7417daf0 is not 32-byte aligned as it needs to be.

Adding __attribute__((aligned(32))) as Damjan suggests indeed seems to help. After that there was again a segfault in another place in the same file, where the same trick of adding __attribute__((aligned(32))) again helped. So it seems the problem can be fixed by adding that alignment attribute in two places, like this:

diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
index cf0b6bffe..324436f01 100644
--- a/src/plugins/rdma/input.c
+++ b/src/plugins/rdma/input.c
@@ -103,7 +103,7 @@ rdma_device_input_refill (vlib_main_t * vm, rdma_device_t * rd,
   if (is_mlx5dv)
     {
-      u64 va[8];
+      u64 va[8] __attribute__((aligned(32)));
       mlx5dv_rwq_t *wqe = rxq->wqes + slot;
       while (n >= 1)
@@ -488,7 +488,7 @@ rdma_device_input_inline (vlib_main_t * vm, vlib_node_runtime_t * node,
   rdma_rxq_t *rxq = vec_elt_at_index (rd->rxqs, qid);
   vlib_buffer_t *bufs[VLIB_FRAME_SIZE], **b = bufs;
   struct ibv_wc wc[VLIB_FRAME_SIZE];
-  u32 byte_cnts[VLIB_FRAME_SIZE];
+  u32 byte_cnts[VLIB_FRAME_SIZE] __attribute__((aligned(32)));
   vlib_buffer_t bt;
   u32 next_index, *to_next, n_left_to_next, n_rx_bytes = 0;
   int n_rx_packets, skip_ip4_cksum = 0;

Many thanks for your help! Should I push the above as a patch to gerrit?

/ Elias

On Wed, 2020-05-06 at 20:38 +0200, Damjan Marion wrote:
> Can you try this:
>
> diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
> index cf0b6bffe..b461ee27b 100644
> --- a/src/plugins/rdma/input.c
> +++ b/src/plugins/rdma/input.c
> @@ -103,7 +103,7 @@ rdma_device_input_refill (vlib_main_t * vm, rdma_device_t * rd,
>
>    if (is_mlx5dv)
>      {
> -      u64 va[8];
> +      u64 va[8] __attribute__((aligned(32)));
>        mlx5dv_rwq_t *wqe = rxq->wqes + slot;
>
>        while (n >= 1)
>
> Thanks!
>
> > On 6 May 2020, at 19:45, Elias Rudberg wrote:
> >
> > Hello VPP experts,
> >
> > When trying to use the current master branch, we get a segmentation
> > fault error. Here is what it looks like in gdb:
> >
> > Thread 3 "vpp_wk_0" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fedf91fe700 (LWP 21309)]
> > rdma_device_input_refill (vm=0x7ff8a5d2f4c0, rd=0x7fedd35ed5c0,
> > rxq=0x77edea80, is_mlx5dv=1)
> >    at vpp/src/plugins/rdma/input.c:115
> > 115       *(u64x4 *) (va + 4) = u64x4_byte_swap (*(u64x4 *) (va + 4));

View/Reply Online (#16257): https://lists.fd.io/g/vpp-dev/message/16257
Re: [vpp-dev] Segmentation fault in rdma_device_input_refill when using clang compiler
OK, now I updated it (https://gerrit.fd.io/r/c/vpp/+/26934). Thanks again for your help!

/ Elias

On Thu, 2020-05-07 at 01:58 +0200, Damjan Marion wrote:
> I already pushed one, can you update it instead?
>
> Thanks

View/Reply Online (#16259): https://lists.fd.io/g/vpp-dev/message/16259
[vpp-dev] Fix in LACP code to avoid assertion failure in vlib_time_now()
Hello VPP experts,

When trying the current VPP master branch using a debug build, we encountered an assertion failure in vlib_time_now(), here:

always_inline f64
vlib_time_now (vlib_main_t * vm)
{
#if CLIB_DEBUG > 0
  extern __thread uword __os_thread_index;
#endif
  /*
   * Make sure folks don't pass &vlib_global_main from a worker thread.
   */
  ASSERT (vm->thread_index == __os_thread_index);
  return clib_time_now (&vm->clib_time) + vm->time_offset;
}

The ASSERT there is triggered because the LACP code passes &vlib_global_main when it should pass a thread-specific vlib_main_t. So this looks like precisely the kind of issue that the assertion was made to catch. To reproduce the problem, I think it should be enough to use LACP in a multi-threaded scenario, using a debug build; then the assertion failure happens directly at startup, every time.

I pushed a fix, here: https://gerrit.fd.io/r/c/vpp/+/26943

After that fix LACP works without the assertion failure. Please have a look and merge if it seems okay.

Best regards,
Elias

View/Reply Online (#16270): https://lists.fd.io/g/vpp-dev/message/16270
[vpp-dev] Assertion failure in nat_get_vlib_main() in snat_init()
Hello,

With the current master branch (def78344) we now get an assertion failure on startup, here:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7462e801 in __GI_abort () at abort.c:79
#2  0x004071f3 in os_panic () at vpp/src/vpp/vnet/main.c:366
#3  0x7550d7d9 in debugger () at vpp/src/vppinfra/error.c:84
#4  0x7550d557 in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7fffacbc0310 "%s:%d (%s) assertion `%s' fails") at vpp/src/vppinfra/error.c:143
#5  0x7fffacac659e in nat_get_vlib_main (thread_index=4) at vpp/src/plugins/nat/nat.c:2557
#6  0x7fffacabd7a5 in snat_init (vm=0x7639b980 ) at vpp/src/plugins/nat/nat.c:2685
#7  0x760b9f66 in call_init_exit_functions_internal (vm=0x7639b980 , headp=0x7639bfa8 , call_once=1, do_sort=1) at vpp/src/vlib/init.c:350
#8  0x760b9e88 in vlib_call_init_exit_functions (vm=0x7639b980 , headp=0x7639bfa8 , call_once=1) at vpp/src/vlib/init.c:364
#9  0x760ba011 in vlib_call_all_init_functions (vm=0x7639b980 ) at vpp/src/vlib/init.c:386
#10 0x760df1f8 in vlib_main (vm=0x7639b980 , input=0x7fffb4b2afa8) at vpp/src/vlib/main.c:2171
#11 0x76166405 in thread0 (arg=140737324366208) at vpp/src/vlib/unix/main.c:658
#12 0x75531954 in clib_calljmp () at vpp/src/vppinfra/longjmp.S:123
#13 0x7fffcf30 in ?? ()
#14 0x76165f97 in vlib_unix_main (argc=57, argv=0x71d520) at vpp/src/vlib/unix/main.c:730
#15 0x004068d8 in main (argc=57, argv=0x71d520) at vpp/src/vpp/vnet/main.c:291

The code looks like this (this part was added in a recent commit, it seems):

always_inline vlib_main_t *
nat_get_vlib_main (u32 thread_index)
{
  vlib_main_t *vm;
  vm = vlib_mains[thread_index];
  ASSERT (vm);
  return vm;
}

So it is looking at vlib_mains[thread_index], but that is NULL, apparently.

Since this happens at startup, could it be that vlib_mains has not been initialized yet, so it is too early to try to access it?

Is vlib_mains[thread_index] supposed to be initialized by the time vlib_call_all_init_functions() runs?

Best regards,
Elias

View/Reply Online (#16276): https://lists.fd.io/g/vpp-dev/message/16276
Re: [vpp-dev] Assertion failure in nat_get_vlib_main() in snat_init()
Hi Ole,

Yes, that fixes it! With that patch my NAT test works, no more assertion failures.

/ Elias

On Fri, 2020-05-08 at 10:06 +0200, Ole Troan wrote:
> Hi Elias,
>
> Thanks for finding that one.
> Can you verify that this patch fixes it:
> https://gerrit.fd.io/r/c/vpp/+/26951 nat: fix per thread data
> vlib_main_t usage take 2 [NEW]
>
> Best regards,
> Ole
>
> > On 7 May 2020, at 22:57, Elias Rudberg wrote:
> >
> > Hello,
> >
> > With the current master branch (def78344) we now get an assertion
> > failure on startup, here:
> >
> > (gdb) bt
> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > #1  0x7462e801 in __GI_abort () at abort.c:79
> > #2  0x004071f3 in os_panic () at vpp/src/vpp/vnet/main.c:366
> > #3  0x7550d7d9 in debugger () at vpp/src/vppinfra/error.c:84
> > #4  0x7550d557 in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7fffacbc0310 "%s:%d (%s) assertion `%s' fails") at vpp/src/vppinfra/error.c:143
> > #5  0x7fffacac659e in nat_get_vlib_main (thread_index=4) at vpp/src/plugins/nat/nat.c:2557
> > #6  0x7fffacabd7a5 in snat_init (vm=0x7639b980 ) at vpp/src/plugins/nat/nat.c:2685
> > #7  0x760b9f66 in call_init_exit_functions_internal (vm=0x7639b980 , headp=0x7639bfa8 , call_once=1, do_sort=1) at vpp/src/vlib/init.c:350
> > #8  0x760b9e88 in vlib_call_init_exit_functions (vm=0x7639b980 , headp=0x7639bfa8 , call_once=1) at vpp/src/vlib/init.c:364
> > #9  0x760ba011 in vlib_call_all_init_functions (vm=0x7639b980 ) at vpp/src/vlib/init.c:386
> > #10 0x760df1f8 in vlib_main (vm=0x7639b980 , input=0x7fffb4b2afa8) at vpp/src/vlib/main.c:2171
> > #11 0x76166405 in thread0 (arg=140737324366208) at vpp/src/vlib/unix/main.c:658
> > #12 0x75531954 in clib_calljmp () at vpp/src/vppinfra/longjmp.S:123
> > #13 0x7fffcf30 in ?? ()
> > #14 0x76165f97 in vlib_unix_main (argc=57, argv=0x71d520) at vpp/src/vlib/unix/main.c:730
> > #15 0x004068d8 in main (argc=57, argv=0x71d520) at vpp/src/vpp/vnet/main.c:291
> >
> > The code looks like this (this part was added in a recent commit it seems):
> >
> > always_inline vlib_main_t *
> > nat_get_vlib_main (u32 thread_index)
> > {
> >   vlib_main_t *vm;
> >   vm = vlib_mains[thread_index];
> >   ASSERT (vm);
> >   return vm;
> > }
> >
> > So it is looking at vlib_mains[thread_index] but that is NULL, apparently.
> >
> > Since this happens at startup, could it be that vlib_mains has not been initialized yet, so it is too early to try to access it?
> >
> > Is vlib_mains[thread_index] supposed to be initialized by the time vlib_call_all_init_functions() runs?
> >
> > Best regards,
> > Elias

View/Reply Online (#16281): https://lists.fd.io/g/vpp-dev/message/16281
[vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Hello VPP experts, When testing the current master branch for NAT with ipfix logging enabled we encountered a problem with a segmentation fault crash. It seems like this was caused by a bug in set_ipfix_exporter_command_fn() in vnet/ipfix-export/flow_report.c where the variable collector_port is declared as u16: u16 collector_port = UDP_DST_PORT_ipfix; and then a few lines later the address of that variable is given as argument to unformat() with %u like this: else if (unformat (input, "port %u", &collector_port)) I think that is wrong because %u should correspond to a 32-bit variable, so when passing the address of a 16-bit variable some data next to it can get corrupted. In our case what happened was that the "fib_index" variable that happened to be nearby on the stack got corrupted, leading to a crash later on. The problem only appears for release build and not for debug, perhaps because compiler optimization affects how variables are stored on the stack. It could be that the compiler (clang or gcc) also matters, that could explain why the problem was not seen earlier. Here is a fix, please check it and merge if you agree: https://gerrit.fd.io/r/c/vpp/+/27280 Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16519): https://lists.fd.io/g/vpp-dev/message/16519 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Another fix to avoid assertion failure related to vlib_time_now()
Hello again, Here is another fix to avoid an assertion failure due to vlib_time_now() being called with a vm corresponding to a different thread, in nat_ipfix_logging.c: https://gerrit.fd.io/r/c/vpp/+/27281 Please have a look and merge if it seems okay. Maybe it could be done more elegantly; this approach required changes in several places to pass along the thread_index value. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16520): https://lists.fd.io/g/vpp-dev/message/16520 Mute This Topic: https://lists.fd.io/mt/74491949/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Hi Ole, OK, now I have changed the patch to include a bounds check. It is still using an intermediate u32 variable, however: I tried making collector_port a u32, but then one of the Gerrit tests failed; I wasn't able to figure out why, as I could not reproduce that problem on my end, it happened only in one of the Gerrit test cases. This approach, with a temporary u32 variable that is copied to the u16 collector_port after the bounds check, both solves the crash for me and passes the Gerrit tests: https://gerrit.fd.io/r/c/vpp/+/27280 What do you think, is this an acceptable solution? (Otherwise it would be necessary to dig deeper into what went wrong in the Gerrit tests when collector_port was declared as u32.) Best regards, Elias On Wed, 2020-05-27 at 09:15 +0200, Ole Troan wrote: > Hi Elias, > > Thanks for spotting that. > Just make collector_port a u32 and add a boundary check? > > Best regards, > Ole > > [...] > > > > Here is a fix, please check it and merge if you agree: > > https://gerrit.fd.io/r/c/vpp/+/27280 > > > > Best regards, > > Elias > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16532): https://lists.fd.io/g/vpp-dev/message/16532 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Hi Andrew, Yes, it was Basic LISP test. It looked like this in the console.log.gz for vpp-verify-master-ubuntu1804:

==============================================================
TEST RESULTS:
    Scheduled tests: 1177
     Executed tests: 1176
       Passed tests: 1039
      Skipped tests: 137
 Not Executed tests: 1
             Errors: 1
FAILURES AND ERRORS IN TESTS:
  Testcase name: Basic LISP test
    ERROR: Test case for basic encapsulation [test_lisp.TestLisp.test_lisp_basic_encap]
TESTCASES WHERE NO TESTS WERE SUCCESSFULLY EXECUTED:
  Basic LISP test
==============================================================

/ Elias On Wed, 2020-05-27 at 18:42 +0200, Andrew 👽 Yourtchenko wrote: > Basic LISP test - was it the one that was failing for you ? > > That particular test intermittently failed a couple of times for me > as well, on a doc-only change, so we have an unrelated issue. > > I am running it locally to see what is going on. > > --a > > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16535): https://lists.fd.io/g/vpp-dev/message/16535 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Hi Andrew, In my case it failed several times and appeared to be triggered by seemingly harmless code changes, but it seemed like the problem was reproducible for a given version of the code. What seemed to matter was when I changed things related to local variables inside the set_ipfix_exporter_command_fn() function. The test logs said "Core-file exists" which I suppose means that vpp crashed. The testing framework repeats the test several times, saying "3 attempt(s) left", then "2 attempt(s) left" and so on; all those repeated attempts seemed to crash in the same way. It could be something with uninitialized variables: something that is assumed to be zero but is never explicitly initialized can work as long as it happens to be zero, but depending on platform and compiler details there could be some garbage there causing a problem. Then unrelated code changes, like adding variables somewhere, could make things end up at slightly different memory locations and make the error come and go. This is just guessing of course. Is it possible to get login access to the machine where the gerrit/jenkins tests are run, to debug it there where the issue can be reproduced? / Elias On Wed, 2020-05-27 at 19:03 +0200, Andrew 👽 Yourtchenko wrote: > Yep, so it looks like we have an issue... > > https://gerrit.fd.io/r/c/vpp/+/27305 has the same failures, I am > rerunning it now to see how intermittent it is - as well as testing > the latest master locally.... > > --a > > > On 27 May 2020, at 18:56, Elias Rudberg > > wrote: > > > > Hi Andrew, > > > > Yes, it was Basic LISP test.
It looked like this in the > > console.log.gz > > for vpp-verify-master-ubuntu1804: > > > > === > > > > === > > TEST RESULTS: > > Scheduled tests: 1177 > > Executed tests: 1176 > >Passed tests: 1039 > > Skipped tests: 137 > > Not Executed tests: 1 > > Errors: 1 > > FAILURES AND ERRORS IN TESTS: > > Testcase name: Basic LISP test > > ERROR: Test case for basic encapsulation > > [test_lisp.TestLisp.test_lisp_basic_encap] > > TESTCASES WHERE NO TESTS WERE SUCCESSFULLY EXECUTED: > > Basic LISP test > > === > > > > === > > > > / Elias > > > > > > > > On Wed, 2020-05-27 at 18:42 +0200, Andrew 👽 Yourtchenko wrote: > > > Basic LISP test - was it the one that was failing for you ? > > > That particular test intermittently failed a couple of times for > > > me > > > as well, on a doc-only change, so we have an unrelated issue. > > > I am running it locally to see what is going on. > > > --a -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16549): https://lists.fd.io/g/vpp-dev/message/16549 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Hi Andrew, > Could you push as a separate change the code that reliably gives you > the error in the LISP unit test I tried but today, whatever I do, I cannot reproduce the test failure anymore. All tests pass now even when I try exactly the same code for which the test failed yesterday. For example, Patchset 4 for https://gerrit.fd.io/r/c/vpp/+/27280 failed yesterday, but now I created Patchset 8 which is identical to Patchset 4, and Patchset 8 passes all tests. I don't know, maybe something changed in the testing environment since yesterday, or maybe the issue was never reproducible, it was just a coincidence that made it seem that way yesterday. The good news is that the fix I wanted to do now passes the tests also when written as Ole suggested, with collector_port as u32 and a bounds check added: https://gerrit.fd.io/r/c/vpp/+/27280 It would be great if that could get merged. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16556): https://lists.fd.io/g/vpp-dev/message/16556 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
I changed the fix using %U and a new unformat_l3_port function, as suggested by Paul: https://gerrit.fd.io/r/c/vpp/+/27280 This works fine, but I wasn't sure where to put the unformat_l3_port function. Now it's in vnet/udp/udp_format.c -- let me know if you have a better idea about where it should be. / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16565): https://lists.fd.io/g/vpp-dev/message/16565 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Fix in set_ipfix_exporter_command_fn() to avoid segmentation fault crash
Ah. OK, now it's changed to the hopefully better name "unformat_udp_port". / Elias On Fri, 2020-05-29 at 00:32 +0200, Andrew 👽 Yourtchenko wrote: > > On 29 May 2020, at 00:02, Elias Rudberg > > wrote: > > > > I changed the fix using %U and a new unformat_l3_port function, as > > suggested by Paul: > > > > https://gerrit.fd.io/r/c/vpp/+/27280 > > My opinion it’s an incorrect and unnecessary > generalization/abstraction: > > 1) port is a L4 concept, not L3. Cf name. > > 2) no one said all L4 ports are/have to be a u16, or that the L4 has > to have a concept of port. Don’t let TCP/UDP monoculture fool you. > > But, 🤷♂️. > > —a > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16567): https://lists.fd.io/g/vpp-dev/message/16567 Mute This Topic: https://lists.fd.io/mt/74491544/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Request to include recent collector_port and vm fixes in stable/2005 branch
Hello, The following two fixes were recently merged to the master branch. Could they please be included in the stable/2005 branch also? https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat u16 collector_port fix) https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for vlib_time_now call) We need them to avoid segmentation fault and assertion failure problems. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16575): https://lists.fd.io/g/vpp-dev/message/16575 Mute This Topic: https://lists.fd.io/mt/74544789/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] worker thread deadlock for current master branch, started with commit "bonding: adjust link state based on active slaves"
Hello, We now get this kind of error for the current master branch (5bb3e81e):

vlib_worker_thread_barrier_sync_int: worker thread deadlock

Testing previous commits indicates the problem started with the recent commit 9121c415 "bonding: adjust link state based on active slaves" (AuthorDate May 18, CommitDate May 27). We can reproduce the problem using the following config:

unix { nodaemon exec /etc/vpp/commands.txt }
cpu { workers 10 }

where commands.txt looks like this:

create bond mode lacp load-balance l23
create int rdma host-if enp101s0f1 name Interface101
create int rdma host-if enp179s0f1 name Interface179
bond add BondEthernet0 Interface101
bond add BondEthernet0 Interface179
create sub-interfaces BondEthernet0 1012
create sub-interfaces BondEthernet0 1013
set int ip address BondEthernet0.1012 10.1.1.1/30
set int ip address BondEthernet0.1013 10.1.2.1/30
set int state BondEthernet0 up
set int state Interface101 up
set int state Interface179 up
set int state BondEthernet0.1012 up
set int state BondEthernet0.1013 up

Then we get the "worker thread deadlock" every time at startup, after just a few seconds. We get the following gdb backtrace (for a release build):

vlib_worker_thread_barrier_sync_int: worker thread deadlock
Thread 3 "vpp_wk_0" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffe027fe700 (LWP 12171)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x742ff801 in __GI_abort () at abort.c:79 #2 0xc700 in os_panic () at vpp/src/vpp/vnet/main.c:371 #3 0x75dd03ab in vlib_worker_thread_barrier_sync_int (vm=0x7fffb87c0300, func_name=) at vpp/src/vlib/threads.c:1517 #4 0x777bfa9c in dpo_get_next_node (child_type=, child_proto=, parent_dpo=0x7fffb9cebda0) at vpp/src/vnet/dpo/dpo.c:430 #5 dpo_stack (child_type=, child_proto=, dpo=, parent=0x7fffb9cebda0) at vpp/src/vnet/dpo/dpo.c:521 #6 0x777c50ac in load_balance_set_bucket_i (lb=0x7fffb8e784c0, bucket=, buckets=0x7fffb8e784e0, next=) at vpp/src/vnet/dpo/load_balance.c:252 #7 load_balance_fill_buckets_norm (lb=0x7fffb8e784c0, nhs=0x7fffb9cebda0, buckets=0x7fffb8e784e0, n_buckets=) at vpp/src/vnet/dpo/load_balance.c:525 #8 load_balance_fill_buckets (lb=0x7fffb8e784c0, nhs=0x7fffb9cebda0, buckets=0x7fffb8e784e0, n_buckets=, flags=) at vpp/src/vnet/dpo/load_balance.c:589 #9 0x777c4d5f in load_balance_multipath_update (dpo=, raw_nhs=, flags=) at vpp/src/vnet/dpo/load_balance.c:88 #10 0x7778e0fc in fib_entry_src_mk_lb (fib_entry=0x7fffb90dd770, esrc=0x7fffb8c60150, fct=FIB_FORW_CHAIN_TYPE_UNICAST_IP4, dpo_lb=0x7fffb90dd798) at vpp/src/vnet/fib/fib_entry_src.c:645 #11 0x7778e4b7 in fib_entry_src_action_install (fib_entry=0x7fffb90dd770, source=FIB_SOURCE_INTERFACE) at vpp/src/vnet/fib/fib_entry_src.c:705 #12 0x7778f0b0 in fib_entry_src_action_reactivate (fib_entry=0x7fffb90dd770, source=FIB_SOURCE_INTERFACE) at vpp/src/vnet/fib/fib_entry_src.c:1221 #13 0x7778d873 in fib_entry_back_walk_notify (node=0x7fffb90dd770, ctx=0x7fffb89c21d0) at vpp/src/vnet/fib/fib_entry.c:316 #14 0x7778343b in fib_walk_advance (fwi=) at vpp/src/vnet/fib/fib_walk.c:368 #15 0x77784107 in fib_walk_sync (parent_type=, parent_index=, ctx=0x7fffb89c22a0) at vpp/src/vnet/fib/fib_walk.c:792 #16 0x7779a43b in fib_path_back_walk_notify (node=, ctx=0x7fffb89c22a0) at vpp/src/vnet/fib/fib_path.c:1226 #17 0x7778343b in 
fib_walk_advance (fwi=) at vpp/src/vnet/fib/fib_walk.c:368 #18 0x77784107 in fib_walk_sync (parent_type=, parent_index=, ctx=0x7fffb89c2330) at vpp/src/vnet/fib/fib_walk.c:792 #19 0x777a6dec in adj_glean_interface_state_change (vnm=, sw_if_index=5, flags=) at vpp/src/vnet/adj/adj_glean.c:166 #20 adj_nbr_hw_sw_interface_state_change (vnm=, sw_if_index=5, arg=) at vpp/src/vnet/adj/adj_glean.c:183 #21 0x770e06cc in vnet_hw_interface_walk_sw (vnm=0x77b570f0 , hw_if_index=, fn=0x777a6da0 , ctx=0x1) at vpp/src/vnet/interface.c:1062 #22 0x777a6b72 in adj_glean_hw_interface_state_change (vnm=0x2, hw_if_index=3097238656, flags=) at vpp/src/vnet/adj/adj_glean.c:205 #23 0x770df60c in call_elf_section_interface_callbacks (vnm=0x77b570f0 , if_index=1, flags=, elts=0x77b571a0 ) at vpp/src/vnet/interface.c:251 #24 vnet_hw_interface_set_flags_helper (vnm=0x77b570f0 , hw_if_index=1, flags=VNET_HW_INTERFACE_FLAG_LINK_UP, helper_flags=) at vpp/src/vnet/interface.c:331 #25 0x771b300f in bond_enable_collecting_distributing (vm=, sif=0x7fffb95de168) at vpp/src/vnet/bonding/cli.c:178 #26 0x7fffad765636 in lacp_mux_action_collecting_distributing (p1=0x7fffb87c0300, p2=0x7fffb95de168) at vpp/src/plugins/lacp/mux_machine.c:173 #27 0x7fffad7654ff in lacp_mux_action_attached (p1=0x7ff
Re: [vpp-dev] ixge and rdma drivers
Hi Chris, About mlx5, we are using mlx5 cards with the VPP rdma plugin and it is working fine for us, for VPP 19.08 and newer. (I think there may be a problem with the rdma plugin for larger MTU values but for MTU < 2000 or so, everything works fine.) / Elias On Tue, 2020-06-02 at 03:40 -0400, Christian Hopps wrote: > Hi vpp-dev, > > I've been contemplating trying to use native drivers in place of DPDK > with the understanding that I may be paying a ~20% penalty by using > DPDK. So I went to try things out, but had some trouble. The systems > in particular I'm interested in have 10GE intel NICs in them which I > believe would be supported by the ixge driver. I noticed that this > driver has been marked deprecated in VPP though. Is there a > replacement or is DPDK required for this NIC? > > I also have systems that have mlx5 (and eventually will have > connectx-6 cards). These cards appear to be supported by the rdma > native driver. I was able to create the interfaces and saw TX packets > but no RX. Is this driver considered stable and usable in 19.08 (and > if not which release would it be considered so)? > > Thanks, > Chris. > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16607): https://lists.fd.io/g/vpp-dev/message/16607 Mute This Topic: https://lists.fd.io/mt/74623336/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] ixge and rdma drivers
Hi Ben, > > (I think there may be a problem with the rdma plugin for larger MTU > > values but for MTU < 2000 or so, everything works fine.) > > It should work, jumbo support was added in the last months. Or do you > refer to something else? I think I mean something else, a problem that I noticed a few weeks ago but never had time to report it then. Now I tried again and it can still be reproduced with the current master branch. The setup is that I have one server running VPP doing NAT44 and then I have two other servers on inside and outside. This works fine when the MTU is 1500. Then I set the MTU to 3000 on all involved interfaces and restart VPP. Now it works as long as only small packets are used, but as soon as a packet larger than ~2048 bytes appears, VPP stops working. (Doing e.g. ping -s 2100 is enough to trigger it.) After that VPP is stuck in some kind of error state from which it does not recover, even small packets are not forwarded after that. I tried to investigate further and it seemed like what happens is that the RDMA_DEVICE_F_ERROR flag is set in src/plugins/rdma/input.c which causes the rdma plugin code to get stuck, the error flag is never cleared it seems. The reason why the larger packet size caused an error seems to be that the log2_cq_size value used in src/plugins/rdma/input.c is log2_cq_size = 11 which corresponds to 2^11 = 2048 bytes which is roughly the packet size where the problem appears. So I got the impression that the rdma plugin is limited to 2^11 = 2048 bytes MTU due to the log2_cq_size = 11 value. Maybe that can be configured somehow? In any case, it seems bad that VPP gets stuck after one such error appears, it would be better if it just increased an error counter and dropped the packet. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group.
View/Reply Online (#16612): https://lists.fd.io/g/vpp-dev/message/16612 Mute This Topic: https://lists.fd.io/mt/74623336/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Assertion failure triggered by "ip mroute add" command (master branch)
Hello VPP experts, There seems to be a problem with "ip mroute add" causing assertion failure. This happens for the current master branch and the stable/2005 branch, but not for stable/1908 and stable/2001. Doing the following is enough to see the problem: create int rdma host-if enp101s0f1 name Interface101 set int ip address Interface101 10.0.0.1/24 ip mroute add 224.0.0.1 via Interface101 Accept The "ip mroute add" command there then causes an assertion failure. Backtrace: Thread 1 "vpp_main" received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x74629801 in __GI_abort () at abort.c:79 #2 0x004071a3 in os_panic () at vpp/src/vpp/vnet/main.c:371 #3 0x755085b9 in debugger () at vpp/src/vppinfra/error.c:84 #4 0x75508337 in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x776b04b0 "%s:%d (%s) assertion `%s' fails") at vpp/src/vppinfra/error.c:143 #5 0x774d1ed8 in dpo_proto_to_fib (dpo_proto=255) at vpp/src/vnet/fib/fib_types.c:353 #6 0x77504111 in fib_path_attached_get_adj (path=0x7fffb602cda0, link=255, dpo=0x7fffa6f3c2e8) at vpp/src/vnet/fib/fib_path.c:721 #7 0x775038fa in fib_path_resolve (path_index=15) at vpp/src/vnet/fib/fib_path.c:1949 #8 0x774f6a18 in fib_path_list_paths_add (path_list_index=13, rpaths=0x7fffb6523b40) at vpp/src/vnet/fib/fib_path_list.c:902 #9 0x775c795a in mfib_entry_src_paths_add (msrc=0x7fffb6527c10, rpaths=0x7fffb6523b40) at vpp/src/vnet/mfib/mfib_entry.c:754 #10 0x775c764e in mfib_entry_path_update (mfib_entry_index=1, source=MFIB_SOURCE_CLI, rpaths=0x7fffb6523b40) at vpp/src/vnet/mfib/mfib_entry.c:1009 #11 0x775ce98a in mfib_table_entry_paths_update_i (fib_index=0, prefix=0x7fffa6f3c720, source=MFIB_SOURCE_CLI, rpaths=0x7fffb6523b40) at vpp/src/vnet/mfib/mfib_table.c:318 #12 0x775ce643 in mfib_table_entry_path_update 
(fib_index=0, prefix=0x7fffa6f3c720, source=MFIB_SOURCE_CLI, rpath=0x7fffb5ffa330) at vpp/src/vnet/mfib/mfib_table.c:335 #13 0x76f18ce2 in vnet_ip_mroute_cmd (vm=0x763969c0 , main_input=0x7fffa6f3cf18, cmd=0x7fffb5efced0) at vpp/src/vnet/ip/lookup.c:819 #14 0x76093139 in vlib_cli_dispatch_sub_commands (vm=0x763969c0 , cm=0x76396bf0 , input=0x7fffa6f3cf18, parent_command_index=463) at vpp/src/vlib/cli.c:568 #15 0x76092fdd in vlib_cli_dispatch_sub_commands (vm=0x763969c0 , cm=0x76396bf0 , input=0x7fffa6f3cf18, parent_command_index=0) at vpp/src/vlib/cli.c:528 #16 0x7609218f in vlib_cli_input (vm=0x763969c0 , input=0x7fffa6f3cf18, function=0x0, function_arg=0) at vpp/src/vlib/cli.c:667 #17 0x7616180b in startup_config_process (vm=0x763969c0 , rt=0x7fffb4a9c480, f=0x0) at vpp/src/vlib/unix/main.c:366 #18 0x760dd704 in vlib_process_bootstrap (_a=140736226945080) at vpp/src/vlib/main.c:1502 #19 0x7552c744 in clib_calljmp () at vpp/src/vppinfra/longjmp.S:123 #20 0x7fffb4d06830 in ?? () #21 0x760dd2a2 in vlib_process_startup (vm=0x288, p=0xcd5b1d5112dc20, f=0xb4d069a0) at vpp/src/vlib/main.c:1524 #22 0x0030b6523520 in ?? () #23 0x002f in ?? () #24 0x0035b4d429c0 in ?? () #25 0x0034 in ?? () #26 0x77b775b4 in vlibapi_get_main () at vpp/src/vlibapi/api_common.h:385 Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb)

The code at the assertion at fib_types.c:353 looks like this:

fib_protocol_t
dpo_proto_to_fib (dpo_proto_t dpo_proto)
{
  switch (dpo_proto)
    {
    case DPO_PROTO_IP6:
      return (FIB_PROTOCOL_IP6);
    case DPO_PROTO_IP4:
      return (FIB_PROTOCOL_IP4);
    case DPO_PROTO_MPLS:
      return (FIB_PROTOCOL_MPLS);
    default:
      break;
    }
  ASSERT(0);   <--- this assertion is triggered
  return (0);
}

so apparently dpo_proto does not have any of the allowed values.
Testing earlier commits in the git history pointed to the following seemingly unrelated and harmless refactoring commit as the point when this problem started: 30cca512c (build: remove valgrind leftovers, 2019-11-25) What we are trying to do, which has worked for VPP 19.08, is to enable receiving of multicast packets on a given interface using two commands like this: ip mroute add 224.0.0.1 via Interface101 Accept ip mroute add 224.0.0.1 via local Forward but now for the master branch the first of those "ip mroute add" lines gives the assertion failure. Has something changed regarding how the "ip mroute add" command is to be used? If not, could the assertion failure indicate a bug somewhere? The problem seems easy to reproduce, at least for me the assertion happens in the same way every time. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You
Re: [vpp-dev] Assertion failure triggered by "ip mroute add" command (master branch)
Hi Ben! > It is probably a bug but I could not reproduce it. > Note that commit 30cca512c (build: remove valgrind > leftovers, 2019-11-25) is present in stable/2001 > so probably not the culprit... Agreed. > Can you share how you built VPP and your complete startup.conf? > You seem to be running those commands from startup.conf directly. Yes, I had those three commands in a file and then pointed to that file as "exec /path/to/file" in the unix { } part of startup.conf. Anyway, I got inspired and debugged the issue further myself: the problem seems to be that the variable payload_proto in vnet_ip_mroute_cmd() does not get set to anything, it ends up having whatever value was on the stack, which could be any garbage. My test works correctly after initializing it to zero, like this:

--- a/src/vnet/ip/lookup.c
+++ b/src/vnet/ip/lookup.c
@@ -661,7 +661,7 @@ vnet_ip_mroute_cmd (vlib_main_t * vm,
   unformat_input_t _line_input, *line_input = &_line_input;
   fib_route_path_t rpath, *rpaths = NULL;
   clib_error_t *error = NULL;
-  u32 table_id, is_del, payload_proto;
+  u32 table_id, is_del, payload_proto = 0;

If you want to reproduce the problem, you can simply set payload_proto=77 (or whatever) instead of payload_proto=0 there, to mimic garbage on the stack. Just setting payload_proto=0 is probably not a good fix though; I guess that just means hard-coding the FIB_PROTOCOL_IP4 value, which happens to work in my case. To fix it properly I think payload_proto should be set to the appropriate protocol in the different "else if" clauses: when pfx.fp_proto is set then payload_proto should also be set, in the same way as it is done in the vnet_ip_route_cmd() function. I pushed a fix like that to gerrit, please have a look: https://gerrit.fd.io/r/c/vpp/+/27416 Best regards, Elias P.S. By the way, do you think address sanitizer could be used to find this kind of bug? (Or perhaps if there was a compiler option to poison the stack at each function call, or something like that.
I think it's a common problem that code relies on uninitialized things being zero; that can go undetected for a long time because things often happen to be zero, so forcing something nonzero could help detect such bugs.) -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16649): https://lists.fd.io/g/vpp-dev/message/16649 Mute This Topic: https://lists.fd.io/mt/74649468/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma
Hello VPP experts, There seems to be a problem with the RDMA driver in VPP when using Mellanox ConnectX5 network interfaces. This problem appears for the master branch and for the stable/2005 branch, while stable/2001 does not have this problem. The problem is that when a frame with 2 packets is to be sent, only the first packet is sent directly while the second packet gets delayed. The second packet only goes out later: when some other frame of packets is sent, the delayed earlier packet is sent along with it. Perhaps this can go undetected if there is lots of traffic all the time, if there is always new traffic to flush out any delayed packets from earlier. So to reproduce it, it seems best to have a testing setup with very little traffic such that there are several seconds without any traffic; then it seems like packets can get delayed for several seconds. Note that the delay is not seen inside VPP, where packet traces look like the packets are sent directly; VPP thinks they are sent, but it seems some packets are held in the NIC and only sent later on. Monitoring traffic arriving at the other end shows that there was a delay. The behavior seems reproducible, except when there is other traffic being sent soon after, since that causes the delayed packets to be sent. The specific case when this came up for us was when using VPP for NAT with ipfix logging turned on, and doing some ping tests. When a single ping echo request packet is to be NATed, that usually works fine, but sometimes there is also an ipfix logging packet to be sent, which ends up in the same frame so that the frame has 2 packets. Then the ipfix logging packet gets sent directly while the ICMP packet is delayed, sometimes so much that the ping fails with a timeout. I don't think the problem has anything to do with NAT or ipfix logging, it seems like a more general problem with the rdma plugin.
Testing previous commits indicates that the problem started with this commit: dc812d9a7 (rdma: introduce direct verb for Cx4/5 tx, 2019-12-16) That commit exists in master and in stable/2005 but not in stable/2001 which fits with that this problem is seen for master and stable/2005 but not for stable/2001. Tried updating to the latest Mellanox driver (v5.0-2.1.8) but that did not help. In the code in src/plugins/rdma/output.c it seems like the function rdma_device_output_tx_mlx5() is handling the packets, but I was not able to fully understand how it works. There is a concept of a "doorbell" function call there, apparently the idea is that when packets are to be sent, info about the packets is prepared and then the "doorbell" is used to alert the NIC that there are things to send. From my limited understanding, it seems like the doorbell currently results in only the first packet is really being physically sent by the NIC directly, while remaining packets are somehow stored and sent later. So far I don't understand exactly why that happens or how to fix it. As a workaround, it seems to work to simply revert the entire rdma plugin to the way it looks in the stable/2001 branch, then the problem seems to disappear. But that probably means we lose performance gains and other improvements in the newer code. Can someone with insight in the rdma plugin please help try to fix this? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16822): https://lists.fd.io/g/vpp-dev/message/16822 Mute This Topic: https://lists.fd.io/mt/75120690/21656 Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma
Hi Ben, Thanks, now I tried it (the Patchset 2 variant) but it seems to behave like before, the delay is still happening. Let me know if you have something more I could try. / Elias On Fri, 2020-06-26 at 12:04 +, Benoit Ganne (bganne) via lists.fd.io wrote: > Hi Elias, > > Thanks for the detailed report. I suspect you are correct, it seems > to be related to the doorbell update to notify the NIC there is some > work to do. > Could you check https://gerrit.fd.io/r/c/vpp/+/27708 and report > whether it fixes the issue? > > Best > ben -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16831): https://lists.fd.io/g/vpp-dev/message/16831 Mute This Topic: https://lists.fd.io/mt/75120690/21656 Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma
Hi Ben, Thanks, I tested that now but it did not help, it behaves the same also with "MLX5_SHUT_UP_BF=1" set. / Elias > Can you try to export "MLX5_SHUT_UP_BF=1" in your environment before > starting VPP (ie, VPP environment must contain this)? This should > disable the "BlueFlame" mechanism in Mellanox NIC. Otherwise I'll > need to take a deeper look. > > Best > ben -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16833): https://lists.fd.io/g/vpp-dev/message/16833 Mute This Topic: https://lists.fd.io/mt/75120690/21656 Mute #mellanox: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/mellanox Mute #rdma: https://lists.fd.io/g/fdio+vpp-dev/mutehashtag/rdma Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions
Hello VPP experts, There seems to be a problem with the way the port number is selected for NAT: sometimes the selected port number leads to a different thread index being selected for out2in packets, making that session useless. This applies to the current master branch as well as the latest stable branches, I think. Here is the story as I understand it, please correct me if I have misunderstood something. Each NAT thread has a range of port numbers that it can use, and when a new session is created a port number is picked at random from within that range. That happens when an in2out packet is NATed. Then later when a response comes as an out2in packet, VPP needs to make sure it is handled by the correct thread, the same thread that created the session. The port number to use for a new session is selected in nat_alloc_addr_and_port_default() like this:

portnum = (port_per_thread * snat_thread_index) + snat_random_port(1, port_per_thread) + 1024;

where port_per_thread is the number of ports each thread is allowed to use, and snat_random_port() returns a random number in the given range. This means that the smallest possible portnum is 1025, which can happen when snat_thread_index is zero. The corresponding calculation to get the thread index back based on the port number is essentially this:

(portnum - 1024) / port_per_thread

This works most of the time, but not always. It works in all cases except when snat_random_port() returns the largest possible value; in that case we end up with the wrong thread index. That means that out2in packets arriving for that session get handed off to another thread. The other thread is unaware of that session so all out2in packets are then dropped for that session. Since each thread has thousands of port numbers to choose from and the problem only appears for one particular choice, only a small fraction of all sessions are affected by this.
In my tests there were 8 NAT threads and the port_per_thread value was about 8000, so roughly 1/8000 or 0.0125% of all sessions failed. The test I used was simply to try many separate ping commands with the "-c 1" option; all should give the normal result "1 packets transmitted, 1 received, 0% packet loss" but due to this problem some of the pings fail. Note that it needs to be separate ping commands so that VPP creates a new session for each of them. Provided that you test a large enough number of sessions, it is straightforward to reproduce the problem. It could be fixed in different ways, one way is to simply shift the arguments to snat_random_port() down by one:

snat_random_port(1, port_per_thread) --> snat_random_port(0, port_per_thread-1)

I pushed such a change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/27786 The smallest port number used then becomes 1024 instead of 1025 as it has been so far. I suppose that should be OK since it is the "well-known ports" from 0 to 1023 that should be avoided; port 1024 should be okay to use. What do you think, does it make sense to fix it in this way? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16880): https://lists.fd.io/g/vpp-dev/message/16880 Mute This Topic: https://lists.fd.io/mt/75267169/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP 20.05.1 tomorrow 15th July 2020
Hello Andrew, The following two fixes have been merged to the master branch, it would be good to have them in stable/2005 also: https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat u16 collector_port fix) https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for vlib_time_now call) Best regards, Elias On Tue, 2020-07-14 at 19:04 +0200, Andrew Yourtchenko wrote: > Hi all, > > As agreed on the VPP community call today, we will declare the > current stable/2005 branch as v20.05.1 tomorrow (15th July) > > If you have any fixes that are already in master but not yet in > stable/2005, that you want to get in there - please let me know > before noon UTC. > > --a > Your friendly release manager > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16966): https://lists.fd.io/g/vpp-dev/message/16966 Mute This Topic: https://lists.fd.io/mt/75503386/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP 20.05.1 tomorrow 15th July 2020
Hi Andrew, I don't know how to cherry-pick. I was under the impression that only the trusted committers were allowed to do that, but maybe I misunderstood. What I know so far about the gerrit system is what I read here: https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review Is there a guide somewhere describing how to do cherry-picking? Alternatively, could you do it for me? / Elias On Wed, 2020-07-15 at 12:27 +0200, Andrew 👽 Yourtchenko wrote: > Hi Elias, sure, feel free to cherry-pick to stable/2005 branch and > add me as a reviewer, then I can merge when JJB gives thumbs up. > > --a > > > On 15 Jul 2020, at 07:25, Elias Rudberg > > wrote: > > > > Hello Andrew, > > > > The following two fixes have been merged to the master branch, it > > would > > be good to have them in stable/2005 also: > > > > https://gerrit.fd.io/r/c/vpp/+/27280 (misc: ipfix-export unformat > > u16 > > collector_port fix) > > > > https://gerrit.fd.io/r/c/vpp/+/27281 (nat: fix regarding vm arg for > > vlib_time_now call) > > > > Best regards, > > Elias > > > > > > > On Tue, 2020-07-14 at 19:04 +0200, Andrew Yourtchenko wrote: > > > Hi all, > > > > > > As agreed on the VPP community call today, we will declare the > > > current stable/2005 branch as v20.05.1 tomorrow (15th July) > > > > > > If you have any fixes that are already in master but not yet in > > > stable/2005, that you want to get in there - please let me know > > > before noon UTC. > > > > > > --a > > > Your friendly release manager > > > -=-=-=-=-=-=-=-=-=-=-=- > > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16970): https://lists.fd.io/g/vpp-dev/message/16970 Mute This Topic: https://lists.fd.io/mt/75503386/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions
Hello, Just a reminder about this, see below. Best regards, Elias Forwarded Message From: Elias Rudberg To: vpp-dev@lists.fd.io Subject: [vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions Date: Thu, 02 Jul 2020 20:43:12 + Hello VPP experts, There seems to be a problem with the way the port number is selected for NAT: sometimes the selected port number leads to a different thread index being selected for out2in packets, making that session useless. This applies to the current master branch as well as the latest stable branches, I think. Here is the story as I understand it, please correct me if I have misunderstood something. Each NAT thread has a range of port numbers that it can use, and when a new session is created a port number is picked at random from within that range. That happens when an in2out packet is NATed. Then later when a response comes as an out2in packet, VPP needs to make sure it is handled by the correct thread, the same thread that created the session. The port number to use for a new session is selected in nat_alloc_addr_and_port_default() like this:

portnum = (port_per_thread * snat_thread_index) + snat_random_port(1, port_per_thread) + 1024;

where port_per_thread is the number of ports each thread is allowed to use, and snat_random_port() returns a random number in the given range. This means that the smallest possible portnum is 1025, which can happen when snat_thread_index is zero. The corresponding calculation to get the thread index back based on the port number is essentially this:

(portnum - 1024) / port_per_thread

This works most of the time, but not always. It works in all cases except when snat_random_port() returns the largest possible value; in that case we end up with the wrong thread index. That means that out2in packets arriving for that session get handed off to another thread. The other thread is unaware of that session so all out2in packets are then dropped for that session.
Since each thread has thousands of port numbers to choose from and the problem only appears for one particular choice, only a small fraction of all sessions are affected by this. In my tests there were 8 NAT threads and the port_per_thread value was about 8000, so roughly 1/8000 or 0.0125% of all sessions failed. The test I used was simply to try many separate ping commands with the "-c 1" option; all should give the normal result "1 packets transmitted, 1 received, 0% packet loss" but due to this problem some of the pings fail. Note that it needs to be separate ping commands so that VPP creates a new session for each of them. Provided that you test a large enough number of sessions, it is straightforward to reproduce the problem. It could be fixed in different ways, one way is to simply shift the arguments to snat_random_port() down by one:

snat_random_port(1, port_per_thread) --> snat_random_port(0, port_per_thread-1)

I pushed such a change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/27786 The smallest port number used then becomes 1024 instead of 1025 as it has been so far. I suppose that should be OK since it is the "well-known ports" from 0 to 1023 that should be avoided; port 1024 should be okay to use. What do you think, does it make sense to fix it in this way? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#17052): https://lists.fd.io/g/vpp-dev/message/17052 Mute This Topic: https://lists.fd.io/mt/75267169/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP load estimation
Hi Ben, > Yes, it is the main way to quickly assess VPP load, see > https://fd.io/docs/vpp/master/troubleshooting/cpuusage.html#vpp-cpu-load > My very crude rule-of-thumb looks like this (but your mileage may > vary): > - between 0 and 50: VPP is not working too hard > - between 50 and 100: VPP is starting to be pushed hard > - above 100: you'll probably experience drops with bursts > - 250+: you're dropping traffic Is it possible to get this information using the python API instead of the vppctl "show runtime" command? In our case we have some monitoring tools that fetch statistics from VPP regularly, like several times each minute, so we would like to do it in a way that does not cause performance problems. Is it a bad idea to use the vppctl "show runtime" command frequently (it causes a thread barrier I think) and if so, is there a better way of getting the corresponding information? I also have another question related to load estimation: we are using VPP for NAT44 and we are seeing a significant number (like 1000 per second) of congestion drops (meaning that a NAT thread wants to hand off packets to another thread but the handoff queue is full). Then we looked at the "show runtime" output and expected to see some large values for the vector rate there, but it just shows values like 7 and similar, far below 50, which by your rule of thumb should indicate that VPP is not working too hard. In this case, are there some other statistics we could look at to figure out what is happening? One theory is that there are some short bursts of more intense traffic causing our drops, that we do not see with "show runtime" because statistics there are smeared out over time. Are there some other statistics we could use to understand if that is the case, or better ways to investigate this kind of problem? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. 
View/Reply Online (#17984): https://lists.fd.io/g/vpp-dev/message/17984 Mute This Topic: https://lists.fd.io/mt/78132591/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hello VPP experts, We are using VPP for NAT44 and we get some "congestion drops", in a situation where we think VPP is far from overloaded in general. Then we started to investigate if it would help to use a larger handoff frame queue size. In theory at least, allowing a longer queue could help avoid drops in case of short spikes of traffic, or if it happens that some worker thread is temporarily busy for whatever reason. The NAT worker handoff frame queue size is hard-coded in the NAT_FQ_NELTS macro in src/plugins/nat/nat.h where the current value is 64. The idea is that putting a larger value there could help. We have run some tests where we changed the NAT_FQ_NELTS value from 64 to a range of other values, each time rebuilding VPP and running an identical test, a test case that is to some extent trying to mimic our real traffic, although of course it is simplified. The test runs many iperf3 tests simultaneously using TCP, combined with some UDP traffic chosen to trigger VPP to create more new sessions (to make the NAT "slowpath" happen more). The following NAT_FQ_NELTS values were tested:

16
32
64 <-- current value
128
256
512
1024
2048 <-- best performance in our tests
4096
8192
16384
32768
65536
131072

In those tests, performance was very bad for the smallest NAT_FQ_NELTS values of 16 and 32, while values larger than 64 gave improved performance. The best results in terms of throughput were seen for NAT_FQ_NELTS=2048. For even larger values than that, we got reduced performance compared to the 2048 case. The tests were done for VPP 20.05 running on a Ubuntu 18.04 server with a 12-core Intel Xeon CPU and two Mellanox mlx5 network cards. The number of NAT threads was 8 in some of the tests and 4 in some of the tests. According to these tests, the effect of changing NAT_FQ_NELTS can be quite large. 
For example, for one test case chosen such that congestion drops were a significant problem, the throughput increased from about 43 to 90 Gbit/second with the amount of congestion drops per second reduced to about one third. In another kind of test, throughput increased by about 20% with congestion drops reduced to zero. Of course such results depend a lot on how the tests are constructed. But anyway, it seems clear that the choice of NAT_FQ_NELTS value can be important and that increasing it would be good, at least for the kind of usage we have tested now. Based on the above, we are considering changing NAT_FQ_NELTS from 64 to a larger value and start trying that in our production environment (so far we have only tried it in a test environment). Were there specific reasons for setting NAT_FQ_NELTS to 64? Are there some potential drawbacks or dangers of changing it to a larger value? Would you consider changing to a larger value in the official VPP code? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18012): https://lists.fd.io/g/vpp-dev/message/18012 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Klement, Thanks! I have now tested your patch (28980), it seems to work and it does give some improvement. However, according to my tests, increasing NAT_FQ_NELTS seems to have a bigger effect; it improves performance a lot. When using the original NAT_FQ_NELTS value of 64, your patch gives some improvement but I still get the best performance when increasing NAT_FQ_NELTS. For example, one of the tests behaves like this:

Without patch, NAT_FQ_NELTS=64 --> 129 Gbit/s and ~600k cong. drops
With patch, NAT_FQ_NELTS=64 --> 136 Gbit/s and ~400k cong. drops
Without patch, NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops
With patch, NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops

So it still looks like increasing NAT_FQ_NELTS would be good, which brings me back to the same questions as before: Were there specific reasons for setting NAT_FQ_NELTS to 64? Are there some potential drawbacks or dangers of changing it to a larger value? I suppose everyone will agree that when there is a queue with a maximum length, the choice of that maximum length can be important. Is there some particular reason to believe that 64 would be enough? In our case we are using 8 NAT threads. Suppose thread 8 is held up briefly due to something taking a little longer than usual, while threads 1-7 each hand off 10 frames to it; that situation would require a queue size of at least 70, unless I have misunderstood how the handoff mechanism works. To me, allowing a longer queue seems like a good thing because it allows us to handle also more difficult cases when threads are not always equally fast; there can be spikes in traffic that affect some threads more than others, things like that. But maybe there are strong reasons for keeping the queue short, reasons I don't know about, that's why I'm asking. 
Best regards, Elias On Fri, 2020-11-13 at 15:14 +, Klement Sekera -X (ksekera - PANTHEON TECH SRO at Cisco) wrote: > Hi Elias, > > I’ve already debugged this and came to the conclusion that it’s the > infra which is the weak link. I was seeing congestion drops at mild > load, but not at full load. Issue is that with handoff, there is > uneven workload. For simplicity’s sake, just consider thread 1 > handing off all the traffic to thread 2. What happens is that for > thread 1, the job is much easier, it just does some ip4 parsing and > then hands packet to thread 2, which actually does the heavy lifting > of hash inserts/lookups/translation etc. 64 element queue can hold 64 > frames, one extreme is 64 1-packet frames, totalling 64 packets, > other extreme is 64 255-packet frames, totalling ~16k packets. What > happens is this: thread 1 is mostly idle and just picking a few > packets from NIC and every one of these small frames creates an entry > in the handoff queue. Now thread 2 picks one element from the handoff > queue and deals with it before picking another one. If the queue has > only 3-packet or 10-packet elements, then thread 2 can never really > get into what VPP excels in - bulk processing. > > Q: Why doesn’t it pick as many packets as possible from the handoff > queue? > A: It’s not implemented. > > I already wrote a patch for it, which made all congestion drops which > I saw (in above synthetic test case) disappear. Mentioned patch > https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit. > > Would you like to give it a try and see if it helps your issue? We > shouldn’t need big queues under mild loads anyway … > > Regards, > Klement > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. 
View/Reply Online (#18039): https://lists.fd.io/g/vpp-dev/message/18039 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Klement, > I see no reason why this shouldn’t be configurable. > [...] > Would you like to submit a patch? Sure, I'll give that a try, adding it as a config option of the same kind as other NAT options. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18061): https://lists.fd.io/g/vpp-dev/message/18061 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] RDMA problem in master and stable/2005, started with commit introducing direct verb for Cx4/5 tx #mellanox #rdma
Hi Ben, Returning to this issue, last discussed in June. > > Thanks, now I tried it (the Patchset 2 variant) but it seems to > > behave like before, the delay is still happening. > > Hmm thanks ☹ > Can you try to export "MLX5_SHUT_UP_BF=1" in your environment before > starting VPP (ie, VPP environment must contain this)? This should > disable the "BlueFlame" mechanism in Mellanox NIC. Otherwise I'll > need to take a deeper look. Unfortunately that did not help, it seemed to behave the same also with "MLX5_SHUT_UP_BF=1" set. We are still having this problem now, with the current master branch. Like before, the behavior seems to be that when 2 packets are to be sent, only the first one gets sent directly, while the second packet gets delayed. I have a test case now where the delay is more than 3 seconds. It seems the delay lasts until something else is to be sent, then the old packet gets sent also. So nothing gets lost, just delayed. But things can anyway fail, for example some ping tests fail because they time out. I have looked a bit more at it and tried to understand what happens, but I did not get much wiser; it still just seems to me like VPP rings the "doorbell" and expects the packets to be sent, but somehow only one packet is sent and the other is delayed. Am I right to assume that the "doorbell" action is the last thing VPP is doing that we can check in the VPP source code itself, and that beyond that we would need to go poke around inside the underlying rdma-core driver to see what is happening? Can you help more with this? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. 
View/Reply Online (#18130): https://lists.fd.io/g/vpp-dev/message/18130 Mute This Topic: https://lists.fd.io/mt/75120690/21656 Mute #mellanox: https://lists.fd.io/g/vpp-dev/mutehashtag/mellanox Mute #rdma: https://lists.fd.io/g/vpp-dev/mutehashtag/rdma Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] After recent "interface: improve logging" commit, "Secondary MAC Addresses not supported" message appears, does this mean something is wrong?
Hello VPP experts, Using the current master branch, we now get log messages like this (shown by journalctl in red color): Nov 25 15:10:29 vnet[...]: interface: hw_add_del_mac_address: vnet_hw_interface_add_del_mac_address: Secondary MAC Addresses not supported for interface index 0 Nov 25 15:10:29 vnet[...]: interface: hw_add_del_mac_address: vnet_hw_interface_add_del_mac_address: Secondary MAC Addresses not supported for interface index 0 This seems to have started with the commit d1bd5d26 "interface: improve logging" on November 23. Even though the commit message says it was only a logging change, I still wonder if the message is correct and, if so, whether it means that something is wrong with the way we have configured VPP. Here is an example of VPP commands leading to those log messages:

create bond mode lacp load-balance l23
create int rdma host-if enp101s0f1 name i1
create int rdma host-if enp179s0f1 name i2
bond add BondEthernet0 i1
bond add BondEthernet0 i2
create sub-interfaces BondEthernet0 1
create sub-interfaces BondEthernet0 2
set int ip address BondEthernet0.1 10.0.0.1/30

The "set int ip address" command there triggers two such "Secondary MAC Addresses not supported" messages -- what does that mean in case of the config above? Should we do something differently to avoid the error messages? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18135): https://lists.fd.io/g/vpp-dev/message/18135 Mute This Topic: https://lists.fd.io/mt/78500983/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] NAT memory usage problem for VPP 20.09 compared to 20.05 due to larger translation_buckets value
Hello VPP experts, We are using VPP for NAT44 and are currently looking at how to move from VPP 20.05 to 20.09. There are some differences in the way the NAT plugin is configured. One difficulty for us is the maximum number of sessions allowed; we need to handle large numbers of sessions, so that limit can be important for us. For VPP 20.05 we have used "translation hash buckets 1048576" and then the maximum number of sessions per thread becomes 10 times that because of this line in the source code in snat_config(): sm->max_translations = 10 * translation_buckets; So then we got a limit of about 10 million sessions per thread, which we have been happy with so far. With VPP 20.09 however, things have changed so that the maximum number of sessions is now configured explicitly, and the relationship between max_translations_per_thread and translation_buckets is no longer a factor of 10 but instead given by the nat_calc_bihash_buckets() function:

static u32 nat_calc_bihash_buckets (u32 n_elts)
{
  return 1 << (max_log2 (n_elts >> 1) + 1);
}

The above function corresponds to a factor of somewhere between 1 and 2 instead of 10. So, if I understood this correctly, for a given maximum number of sessions, the corresponding translation_buckets value will be something like 10 to 20 times larger in VPP 20.09 compared to how it was in VPP 20.05, leading to a significantly increased memory requirement given that we want to have the same maximum number of sessions as before. It seems a little strange that the translation_buckets value would change so much between VPP versions; was that change intentional? The old relationship "max_translations = 10 * translation_buckets" seems to have worked well in practice, at least for our use case. What could we do to get around this, if we want to switch to VPP 20.09 but without reducing the maximum number of sessions? 
If we were to simply divide the nat_calc_bihash_buckets() value by 8 or so to make it more similar to how it was earlier, would that lead to other problems? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18160): https://lists.fd.io/g/vpp-dev/message/18160 Mute This Topic: https://lists.fd.io/mt/78533277/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] minor doc change
On Sat, 2020-11-28 at 13:18 -0500, Paul Vinciguerra wrote: > > We don't see pull requests. Github is just a mirror of the gerrit > repo. I think it would be good if that could be clarified on the github page. When people search for "vpp source code" or similar, I think they will often end up on the github page and it's not immediately obvious from there that it's only a mirror. (People might get the wrong idea about some things, for example github shows a "contributors" list which I guess is not accurate as it only shows authors who happen to have github accounts that are linked to gerrit in some way?) The github page says, under "About" to the right, "No description, website, or topics provided." So there is apparently a possibility to enter a "description", perhaps that could be used to indicate that it is just a mirror? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18190): https://lists.fd.io/g/vpp-dev/message/18190 Mute This Topic: https://lists.fd.io/mt/78559913/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] minor doc change
Hi Hemant, > I agree with Elias. Long term, maybe use of gerrit > is deprecated and github is used. Perhaps I should clarify that I did not mean to recommend moving VPP to github. On the contrary, I think it is good that the VPP source code is managed independently from github and I hope it will stay that way. My point was just that in the current situation when there is a github mirror, it would be good to make that more clear to avoid confusion. Another way to avoid confusion would be to remove the code from github (that would be fine as I see it but of course there will be different opinions about that). > Github is free for public repos. That depends on what you mean by "free". It could be argued that there is a cost in terms of control over the project and being able to do what you want in the future. Moving something to github means partly giving up control. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18193): https://lists.fd.io/g/vpp-dev/message/18193 Mute This Topic: https://lists.fd.io/mt/78559913/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?
Hello everyone, For VPP 20.05 the following works to extract /sys/vector_rate statistics:

#!/usr/bin/python3
from vpp_papi.vpp_stats import VPPStats
stat = VPPStats("/run/vpp/stats.sock")
dir = stat.ls(['^/sys/vector_rate'])
counters = stat.dump(dir)
vector_rate = counters.get('/sys/vector_rate')
print("vector_rate = ", vector_rate)

Unfortunately, with VPP 20.09 the stat client crashes when doing that. Seems like a problem introduced by https://gerrit.fd.io/r/c/vpp/+/28017 (stats: remove offsets on vpp side) and fixed in master by https://gerrit.fd.io/r/c/vpp/+/29569 (stats: missing dimension in stat_set_simple_counter). I was hoping this could be fixed by cherry-picking the fix into the stable/2009 branch which I tried here: https://gerrit.fd.io/r/c/vpp/+/30161 However that does not pass the jenkins tests due to some problem related to "vom" which was recently deprecated in the master branch, that might explain why the fix works in master but not in stable/2009. Still, the fix does work for me in stable/2009, maybe different compiler version or other details matter and cause some of the jenkins builds to fail. How to get around this, to make the stat client work for stable/2009 also? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18196): https://lists.fd.io/g/vpp-dev/message/18196 Mute This Topic: https://lists.fd.io/mt/78601259/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?
Hi Ole, thanks for your answer.
> /w/workspace/vpp-verify-2009-ubuntu1804-x86_64/build-root/install-vpp-native/vpp/include/vpp-api/client/stat_client.h:107:11: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
>   ((p + sizeof (p)) < ((void *) sm->shared_header + sm->memory_size)))
>
> Doing pointer arithmetic on an incomplete type (void) isn't entirely kosher.
> GCC supports it, and you could disable the warning.
> But the correct-est approach would be to cast it to a type with size 1.
After adding some (char *) casts in stat_segment_adjust() it passed the tests, please have a look: https://gerrit.fd.io/r/c/vpp/+/30161 Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18203): https://lists.fd.io/g/vpp-dev/message/18203 Mute This Topic: https://lists.fd.io/mt/78601259/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] stat_set_simple_counter fix to avoid stat client crash needed in stable/2009 also?
Hi Ole, > Thanks Elias, merged. Great. Thanks! > Would you mind fixing that in master too? OK: https://gerrit.fd.io/r/c/vpp/+/30207 With that, the stat_client.h file becomes identical in master and stable/2009. Some pedantic part of me noticed that the same issue seems to exist also in stat_client.c in the stat_vec_dup macro and maybe other places, but the compiler does not complain about that in either of the branches so I did not change it. Perhaps the reason why there were compilation problems for stable/2009 is that stat_client.h is included from some C++ code in extras/vom/ and the rules and/or compiler options are different for C++. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18205): https://lists.fd.io/g/vpp-dev/message/18205 Mute This Topic: https://lists.fd.io/mt/78601259/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning
Hello VPP experts, For our NAT44 usage of VPP we have encountered a problem with VPP running out of memory, which now, after much headache and many out-of- memory crashes over the past several months, has turned out to be caused by an infinite loop where VPP gets stuck repeating the three nodes ip4-lookup, ip4-local and nat44-hairpinning. A single packet gets passed around and around between those three nodes, eating more and more memory which causes that worker thread to get stuck and VPP to run out of memory after a few seconds. (Earlier we speculated that it was due to a memory leak but now it seems it was not.) This concerns the current master branch as well as the stable/2009 branches and earlier VPP versions as well. One scenario when this happens is when a UDP (or TCP) packet is sent from a client on the inside with a destination IP address that matches an existing static NAT mapping that maps that IP address on the inside to the same IP address on the outside. Then, the problem can be triggered for example by doing this from a client on the inside, where DESTINATION_IP is the IP address of such a static mapping: echo hello > /dev/udp/$DESTINATION_IP/3 Here is the packet trace for the thread that receives the packet at rdma-input: -- Packet 42 00:03:07:636840: rdma-input rdma: Interface179 (4) next-node bond-input l2-ok l3-ok l4-ok ip4 udp 00:03:07:636841: bond-input src d4:6a:35:52:30:db, dst 02:fe:8d:23:60:a7, Interface179 -> BondEthernet0 00:03:07:636843: ethernet-input IP4: d4:6a:35:52:30:db -> 02:fe:8d:23:60:a7 802.1q vlan 1013 00:03:07:636844: ip4-input UDP: SOURCE_IP_INSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0xe7e3 dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 48824 -> 3 length 14, checksum 0x781e 00:03:07:636846: ip4-sv-reassembly-feature [not-fragmented] 00:03:07:636847: nat44-in2out-worker-handoff NAT44_IN2OUT_WORKER_HANDOFF : next-worker 8 trace index 41 -- So it is doing handoff to thread 8 with trace 
index 41. Nothing wrong so far, I think. Here is the beginning of the corresponding packet trace for the receiving thread: -- Packet 57 00:03:07:636850: handoff_trace HANDED-OFF: from thread 7 trace index 41 00:03:07:636850: nat44-in2out NAT44_IN2OUT_FAST_PATH: sw_if_index 6, next index 3, session -1 00:03:07:636855: nat44-in2out-slowpath NAT44_IN2OUT_SLOW_PATH: sw_if_index 6, next index 0, session 11 00:03:07:636927: ip4-lookup fib 0 dpo-idx 577 flow hash: 0x UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 0xb40b 00:03:07:636930: ip4-local UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 0xb40b 00:03:07:636932: nat44-hairpinning new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping 00:03:07:636934: ip4-lookup fib 0 dpo-idx 577 flow hash: 0x UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 0xb40b 00:03:07:636936: ip4-local UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 0xb40b 00:03:07:636937: nat44-hairpinning new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping 00:03:07:636937: ip4-lookup fib 0 dpo-idx 577 flow hash: 0x UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 0xb40b 00:03:07:636939: ip4-local UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN fragment id 0x50fe, flags DONT_FRAGMENT UDP: 63957 -> 3 length 14, checksum 
0xb40b 00:03:07:636940: nat44-hairpinning new dst addr DESTINATION_IP port 3 fib-index 0 is-static-mapping ... ... and so on. In principle it never ends. To get this trace I had added a hack in nat44-hairpinning to stop when my added debug counter exceeded a few thousand. Without that, it seems to loop forever, that worker thread gets stuck. What happens seems to be that the nat44-hairpinning node determines that there is an existing session and then decides the packet should go to the ip4-lookup node, followed by the ip4-local, followed by the nat44-hairpinning node which makes the same decision again, so it just goes round and round like that. Inside the snat_hairpinning() function it always comes to the "Destinat
Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning
Hi Klement, > > an existing static NAT mapping that maps that IP address on the > > inside to the same IP address on the outside. > what is the point of such static mapping? What is the use case here? We are using VPP for endpoint-independent NAT44. Then all traffic from outside is normally translated by NAT dynamic sessions but we have special treatment of traffic to a certain IP address that corresponds to our BGP (Border Gateway Protocol) traffic, that should not be translated, so then we have such a static mapping for that. If we do not have this static mapping then VPP tries to translate our BGP packets and then BGP does not work properly. It may be possible to do things differently so that no such mapping would be needed, but we have been using such a mapping until now and things have worked fine apart from this infinite loop issue, that happens when a client from inside happens to send something to our special BGP IP address that is intended to be used from the outside. That IP address is normally not used by traffic from clients, the normal thing is for the router to communicate with the VPP server using that address, from outside. This is why the out-of-memory problem has appeared random and hard to reproduce earlier, it just happened when a client behaved in an unusual way, that did not happen very often but when it did, we got the out-of-memory crash and now we finally know why. Now that we know, we can easily reproduce it, it is not really random it just seemed that way. Anyway, even if it would be unusual and possibly a bad idea to have such a static mapping, do you agree that VPP should handle the situation differently? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. 
View/Reply Online (#18233): https://lists.fd.io/g/vpp-dev/message/18233 Mute This Topic: https://lists.fd.io/mt/78662322/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning
Hi Klement, > Would you mind pushing it to gerrit? Here: https://gerrit.fd.io/r/c/vpp/+/30284 > It would be super cool if the change also contained a test case ;-) Coolness is always my goal. Have a look, see if the patch qualifies. :-) / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18252): https://lists.fd.io/g/vpp-dev/message/18252 Mute This Topic: https://lists.fd.io/mt/78662322/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning
Hi Klement, > > Would you mind pushing it to gerrit? > > Here: https://gerrit.fd.io/r/c/vpp/+/30284 I see you added "code review +1" there, thanks! What more is needed to get it merged? Do we need to add another reviewer? / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18282): https://lists.fd.io/g/vpp-dev/message/18282 Mute This Topic: https://lists.fd.io/mt/78662322/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP hanging and running out of memory due to infinite loop related to nat44-hairpinning
Hi Ole, Thanks for merging 30284. I did the same change in the stable/2009 branch also, here: https://gerrit.fd.io/r/c/vpp/+/30340 If that could get merged as well, it would be much appreciated. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18303): https://lists.fd.io/g/vpp-dev/message/18303 Mute This Topic: https://lists.fd.io/mt/78662322/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Klement, > > I see no reason why this shouldn’t be configurable. > > [...] > > Would you like to submit a patch? Here is a patch making it configurable: https://gerrit.fd.io/r/c/vpp/+/30433 Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18349): https://lists.fd.io/g/vpp-dev/message/18349 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [EXTERNAL] [vpp-dev] Check running status of vpp
Hello, On Wed, 2020-12-16 at 14:23 +, Chris Luke via lists.fd.io wrote: > [...] I wonder if the filesystem entry is a remnant from a previous > session that was not cleaned up. FWIW we have had such problems earlier, maybe the issue is similar. In our case we were mixing use of two different VPP versions that were using different conventions for naming of those .sock files under the /run/ directory. We got into trouble because the program that tried to communicate with VPP was trying those in some specific order and if it found an existing .sock file it would try to connect using that. When there was an old .sock file it tried and failed to connect using that, it did not realize that there was a new .sock file (with a slightly different name or path) that it was possible to connect to. We were able to resolve that situation by either removing the old .sock file or by rebooting the machine which had the same effect, cleaning up old stuff under /run/. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18362): https://lists.fd.io/g/vpp-dev/message/18362 Mute This Topic: https://lists.fd.io/mt/79001336/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Klement, > > > I see no reason why this shouldn’t be configurable. > > > [...] > > > Would you like to submit a patch? > > Here is a patch making it configurable: > [...] New patch, including API support and a test case: https://gerrit.fd.io/r/c/vpp/+/30482 Please check that one instead, I think it's better. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18403): https://lists.fd.io/g/vpp-dev/message/18403 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Klement, > > I see no reason why this shouldn’t be configurable. > > [...] > > Would you like to submit a patch? I had a patch in December that was lying around too long so there were merge conflicts, so now I made a new one again. Third time's the charm, I hope. Here it is: https://gerrit.fd.io/r/c/vpp/+/30933 It makes the frame queue size configurable and also adds API support and a test verifying the API support. Please have a look! / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18596): https://lists.fd.io/g/vpp-dev/message/18596 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
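With a configurable queue size, the intent of the patch above is that operators can tune the value from startup.conf instead of recompiling. The exact stanza keyword depends on the patch revision that was ultimately merged; the fragment below is illustrative only, based on the patch's intent:

```
nat {
  frame-queue-nelts 512
}
```

A larger queue gives the receiving worker more slack before handoff congestion drops start, at the cost of some extra buffering.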
Re: [vpp-dev] How to add in/out interfaces in NAT44 in vpp21
I think that with the latest VPP versions you need to use the "nat44 enable" command first, for example like this: nat44 enable sessions 100 users 1000 where the numbers are your choices for the maximum number of sessions and users per thread. Best regards, Elias On Sun, 2021-02-07 at 09:01 +, Юрий Иванов wrote: > Hi, > I'm trying to configure NAT44 feature on latest vpp: > vpp# show version > vpp v21.01-release built by root on fcb1bae62b24 at 2021-01- > 27T16:06:22 > > > Can someone help to determine why adding interfaces not working like > it should > vpp# set interface nat44 in GigabitEthernet0/5/0 out > GigabitEthernet0/4/0 > set interface nat44: add GigabitEthernet0/5/0 failed > > My config: > set interface ip address GigabitEthernet0/4/0 1.0.0.1/24 > set interface ip address GigabitEthernet0/5/0 10.0.1.1/24 > > set interface state GigabitEthernet0/4/0 up > set interface state GigabitEthernet0/5/0 up > > nat44 forwarding enable > nat44 add address 1.0.0.2-1.0.0.100 > vpp# show interface > Name IdxState MTU (L3/IP4/IP6/MPLS) > Counter Count > GigabitEthernet0/4/0 1 up 9000/0/0/0 > rx packets 9 > > rx bytes3339 > > drops 9 > GigabitEthernet0/5/0 2 up 9000/0/0/0 > local00 down 0/0/0/0 > > Maybe somehting has changed once more, because version 17-18 was > working as expected? > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18689): https://lists.fd.io/g/vpp-dev/message/18689 Mute This Topic: https://lists.fd.io/mt/80449289/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code
Hello VPP experts, We have a problem with VPP 20.09 crashing with SIGABRT, this happened several times lately but we do not have an exact way of reproducing it. Here is a backtrace from gdb: Thread 10 "vpp_wk_7" received signal SIGABRT, Aborted. [Switching to Thread 0x7feac47f8700 (LWP 6263)] __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x74044921 in __GI_abort () at abort.c:79 #2 0xc640 in os_panic () at src/vpp/vnet/main.c:368 #3 0x77719229 in alloc_aligned_16_8 (h=0x77b79990 , nbytes=) at src/vppinfra/bihash_template.c:34 #4 0x7771b650 in value_alloc_16_8 (h=0x77b79990 , log2_pages=4) at src/vppinfra/bihash_template.c:356 #5 0x7771b43a in split_and_rehash_16_8 (h=0x77b79990 , old_values=0x7ff87c7b0d40, old_log2_pages=3, new_log2_pages=4) at src/vppinfra/bihash_template.c:453 #6 0x77710f84 in clib_bihash_add_del_inline_with_hash_16_8 (h=0x77b79990 , add_v=0x7ffbf2088c60, hash=, is_add=, is_stale_cb=0x0, arg=0x0) at src/vppinfra/bihash_template.c:765 #7 clib_bihash_add_del_inline_16_8 (h=0x77b79990 , add_v=0x7ffbf2088c60, is_add=, is_stale_cb=0x0, arg=0x0) at src/vppinfra/bihash_template.c:857 #8 clib_bihash_add_del_16_8 (h=0x77b79990 , add_v=0x7ffbf2088c60, is_add=) at src/vppinfra/bihash_template.c:864 #9 0x766795ec in ip4_sv_reass_find_or_create (vm=, rm=, rt=, kv=, do_handoff=) at src/vnet/ip/reass/ip4_sv_reass.c:364 #10 ip4_sv_reass_inline (vm=, node=, frame=, is_feature=255, is_output_feature=false, is_custom=false) at src/vnet/ip/reass/ip4_sv_reass.c:726 #11 ip4_sv_reass_node_feature_fn_skx (vm=, node=, frame=) at src/vnet/ip/reass/ip4_sv_reass.c:919 #12 0x75ac806e in dispatch_node (vm=0x7ffbf1e74400, node=0x7ffbf2553fc0, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=, last_time_stamp=) at src/vlib/main.c:1194 #13 dispatch_pending_node (vm=0x7ffbf1e74400, pending_frame_index=, last_time_stamp=) at src/vlib/main.c:1353 #14 
vlib_main_or_worker_loop (vm=0x7ffbf1e74400, is_main=0) at src/vlib/main.c:1846 #15 vlib_worker_loop (vm=0x7ffbf1e74400) at src/vlib/main.c:1980 The line at bihash_template.c:34 is "os_out_of_memory ()". If VPP calls "os_out_of_memory()" at that point in the code, what does that mean, is there some way we could configure VPP to allow it to use more memory for this kind of allocations? We have plenty of physical memory available and the main heap ("heapsize" in startup.conf) has already been set to a large value but maybe this part of the code is using some other kind of memory allocation, not using the main heap? How do we know if this particular allocation is using the main heap or not? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18769): https://lists.fd.io/g/vpp-dev/message/18769 Mute This Topic: https://lists.fd.io/mt/80753669/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code
Thanks Dave, however it looks like BIHASH_USE_HEAP does not exist in VPP 20.09 but was introduced later. Looks like it appeared with the commit 2454de2d4 "vppinfra: use heap to store bihash data" which was after 20.09 was released. I guess this means that bihash data is not stored on the heap in VPP 20.09. Maybe switching to VPP 21.01 would help with this issue then, or at least with 21.01 all of our main heap space would need to be consumed before we get another os_out_of_memory() SIGABRT crash? / Elias On Fri, 2021-02-19 at 09:56 -0500, v...@barachs.net wrote: > See ../src/vppinfra/bihash_16_8.h: > > #define BIHASH_USE_HEAP 1 > > The the sv reassembly bihash table configuration appears to be > hardwired, and complex enough to satisfy the cash customers. If the > number of buckets is way too low for your use-case, bihash is capable > of wasting a considerable amount of memory. > > Suggest that you ping Klement Sekera, it's his code... > > D. > > -Original Message- > From: vpp-dev@lists.fd.io On Behalf Of Elias > Rudberg > Sent: Friday, February 19, 2021 7:41 AM > To: vpp-dev@lists.fd.io > Subject: [vpp-dev] VPP 20.09 os_out_of_memory() in > clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code > > Hello VPP experts, > > We have a problem with VPP 20.09 crashing with SIGABRT, this happened > several times lately but we do not have an exact way of reproducing > it. Here is a backtrace from gdb: > > Thread 10 "vpp_wk_7" received signal SIGABRT, Aborted. 
> [Switching to Thread 0x7feac47f8700 (LWP 6263)] __GI_raise ( > sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > #0 __GI_raise (sig=sig@entry=6) at > ../sysdeps/unix/sysv/linux/raise.c:51 > #1 0x74044921 in __GI_abort () at abort.c:79 > #2 0xc640 in os_panic () at src/vpp/vnet/main.c:368 > #3 0x77719229 in alloc_aligned_16_8 (h=0x77b79990 > , nbytes=) at > src/vppinfra/bihash_template.c:34 > #4 0x7771b650 in value_alloc_16_8 (h=0x77b79990 > , log2_pages=4) at > src/vppinfra/bihash_template.c:356 > #5 0x7771b43a in split_and_rehash_16_8 (h=0x77b79990 > , old_values=0x7ff87c7b0d40, old_log2_pages=3, > new_log2_pages=4) at src/vppinfra/bihash_template.c:453 > #6 0x77710f84 in clib_bihash_add_del_inline_with_hash_16_8 > (h=0x77b79990 , add_v=0x7ffbf2088c60, > hash=, is_add=, is_stale_cb=0x0, > arg=0x0) at src/vppinfra/bihash_template.c:765 > #7 clib_bihash_add_del_inline_16_8 (h=0x77b79990 > , add_v=0x7ffbf2088c60, is_add=, > is_stale_cb=0x0, arg=0x0) at src/vppinfra/bihash_template.c:857 > #8 clib_bihash_add_del_16_8 (h=0x77b79990 > , add_v=0x7ffbf2088c60, is_add=) > at > src/vppinfra/bihash_template.c:864 > #9 0x766795ec in ip4_sv_reass_find_or_create (vm= out>, rm=, rt=, kv=, > do_handoff=) at src/vnet/ip/reass/ip4_sv_reass.c:364 > #10 ip4_sv_reass_inline (vm=, node=, > frame=, is_feature=255, is_output_feature=false, > is_custom=false) at src/vnet/ip/reass/ip4_sv_reass.c:726 > #11 ip4_sv_reass_node_feature_fn_skx (vm=, > node=, frame=) at > src/vnet/ip/reass/ip4_sv_reass.c:919 > #12 0x75ac806e in dispatch_node (vm=0x7ffbf1e74400, > node=0x7ffbf2553fc0, type=VLIB_NODE_TYPE_INTERNAL, > dispatch_state=VLIB_NODE_STATE_POLLING, frame=, > last_time_stamp=) at src/vlib/main.c:1194 > #13 dispatch_pending_node (vm=0x7ffbf1e74400, > pending_frame_index=, last_time_stamp=) > at src/vlib/main.c:1353 > #14 vlib_main_or_worker_loop (vm=0x7ffbf1e74400, is_main=0) at > src/vlib/main.c:1846 > #15 vlib_worker_loop (vm=0x7ffbf1e74400) at src/vlib/main.c:1980 > > The line 
at bihash_template.c:34 is "os_out_of_memory ()". > > If VPP calls "os_out_of_memory()" at that point in the code, what > does that mean, is there some way we could configure VPP to allow it > to use more memory for this kind of allocations? > > We have plenty of physical memory available and the main heap > ("heapsize" in startup.conf) has already been set to a large value > but maybe this part of the code is using some other kind of memory > allocation, not using the main heap? How do we know if this > particular allocation is using the main heap or not? > > Best regards, > Elias > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18772): https://lists.fd.io/g/vpp-dev/message/18772 Mute This Topic: https://lists.fd.io/mt/80753669/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Marcos, If you are building VPP 20.05 from source then the easiest way is to simply change the value at "#define NAT_FQ_NELTS 64" in src/plugins/nat/nat.h from 64 to something larger, we have been using 512 which seems to work fine in our case. Note that this can help with one specific kind of packet drops in VPP NAT called "congestion drops", if you have packet loss for other reasons then a NAT_FQ_NELTS change will probably not help. Best regards, Elias On Wed, 2021-02-24 at 13:45 -0300, Marcos - Mgiga wrote: > Hi Elias, > > I have been following this discussion and finally I gave VPP a try > implementing it as a CGN gateway. Unfortunattely some issues came up, > like packets loss and I believe your patch can be helpful, > > Would mind give me guidance to deploy it? I'm using VPP 20.05 as you > did > > Best Regards > > -Mensagem original- > De: vpp-dev@lists.fd.io Em nome de Elias > Rudberg > Enviada em: terça-feira, 26 de janeiro de 2021 11:10 > Para: ksek...@cisco.com > Cc: vpp-dev@lists.fd.io > Assunto: Re: [vpp-dev] Increasing NAT worker handoff frame queue size > NAT_FQ_NELTS to avoid congestion drops? > > Hi Klement, > > > > I see no reason why this shouldn’t be configurable. > > > [...] > > > Would you like to submit a patch? > > I had a patch in December that was lying around too long so there > were merge conflicts, so now I made a new one again. Third time's the > charm, I hope. Here it is: > > https://gerrit.fd.io/r/c/vpp/+/30933 > > It makes the frame queue size configurable and also adds API support > and a test verifying the API support. Please have a look! > > / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18802): https://lists.fd.io/g/vpp-dev/message/18802 Mute This Topic: https://lists.fd.io/mt/78230881/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
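Concretely, for a source build of VPP 20.05 the change described above is a one-line edit (512 is the value we have been running with; keeping it a power of two, like the original 64, is prudent since the frame queue code may assume that):

```c
/* src/plugins/nat/nat.h — before: */
#define NAT_FQ_NELTS 64
/* after: */
#define NAT_FQ_NELTS 512
```

After editing, rebuild and restart VPP; congestion drops can then be observed (or their absence confirmed) via the NAT node counters in "show errors".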
[vpp-dev] Suggestion: clarify that github repo is only a mirror, add link to real repo?
Hello, Searching for "VPP source code" using my favourite web search engine gives the github page https://github.com/FDio/vpp as top search result. However that is not the real VPP repo, the github page is only a mirror. I think it would be good to clarify this in the "About" part for the github project, to avoid confusion. For comparison, look at how it is done for the Linux kernel source code mirror here: https://github.com/gregkh/linux To the top right there it says "Linux kernel stable tree mirror" with a link to the real repository which in that case is under git.kernel.org. Could that be done in the same way for VPP also? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18938): https://lists.fd.io/g/vpp-dev/message/18938 Mute This Topic: https://lists.fd.io/mt/81371296/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP 20.09 os_out_of_memory() in clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code
Hello Dave, Just to follow up on this, we switched from 20.09 to 21.01 and that indeed seems to have solved the problem for us, having now run for about a month without the issue coming back. Thanks for your help! Best regards, Elias On Sun, 2021-02-21 at 07:43 -0500, v...@barachs.net wrote: > That's right. In 20.09, bihash did its own os-level memory > allocation. You could (probably) pick up and port > src/vppinfra/bihash*.[ch] to 20.09, or you could add some config > knobs to the reassembly code. > > If switching to 21.01 is an option, that seems like the path of least > resistance. > > HTH... Dave > > -Original Message- > From: vpp-dev@lists.fd.io On Behalf Of Elias > Rudberg > Sent: Friday, February 19, 2021 12:10 PM > To: v...@barachs.net; vpp-dev@lists.fd.io > Subject: Re: [vpp-dev] VPP 20.09 os_out_of_memory() in > clib_bihash_add_del_16_8 in IPv4 Shallow Virtual reassembly code > > Thanks Dave, however it looks like BIHASH_USE_HEAP does not exist in > VPP 20.09 but was introduced later. Looks like it appeared with the > commit 2454de2d4 "vppinfra: use heap to store bihash data" which was > after 20.09 was released. > > I guess this means that bihash data is not stored on the heap in VPP > 20.09. Maybe switching to VPP 21.01 would help with this issue then, > or at least with 21.01 all of our main heap space would need to be > consumed before we get another os_out_of_memory() SIGABRT crash? > > / Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#19022): https://lists.fd.io/g/vpp-dev/message/19022 Mute This Topic: https://lists.fd.io/mt/80753669/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Thread safety issue in NAT plugin regarding counter for busy ports
Hello VPP experts, I think there is a thread safety issue in the NAT plugin regarding the counter for busy ports. Looking at this for the master branch now, there has been some refactoring lately but the issue has anyway been there for a long time, at least several VPP versions back, although filenames and function names have changed. Here I will take the endpoint-independent code in nat44-ei/nat44_ei.c because that is the part I am using, but it looks like a similar issue exists for nat44-ed as well. In the nat44_ei_alloc_default_cb() function in nat44_ei.c there is a part that looks like this:

  --a->busy_##n##_port_refcounts[portnum]; \
  a->busy_##n##_ports_per_thread[thread_index]++; \
  a->busy_##n##_ports++; \

where the variable "a" is an address (nat44_ei_address_t) that belongs to the "addresses" in the global nat44_ei_main, so it is not thread-specific. As I understand it, different threads may be using the same "a" at the same time. At first sight it might seem like all three lines are risky because different threads can execute this code at the same time for the same "a". However, the _port_refcounts[portnum] and _ports_per_thread[thread_index] parts are actually okay to access, because the [portnum] and [thread_index] indices ensure that those lines only touch the parts of those arrays that belong to the current thread; that is how the port number is selected. So the first two lines there are fine, I think, but the third line, incrementing a->busy_##n##_ports, can give a race condition when different threads execute it at the same time. The same issue is also there in other places where the busy_##n##_ports values are updated. I think this is not critical because the busy_##n##_ports information (which can be wrong because of this thread safety issue) is not used very much.
However those values are used in nat44_ei_del_address() where it looks like this:

  /* Delete sessions using address */
  if (a->busy_tcp_ports || a->busy_udp_ports || a->busy_icmp_ports)
    {

and then inside that if-statement there is some code to delete those sessions. If the busy_##n##_ports values are wrong it could in principle happen that the session deletion is skipped when there were actually some sessions that needed deleting. Perhaps rare, and perhaps resulting in nothing worse than a small memory leak, but still. One effect of this is that there can be an inconsistency: if we were to sum up the busy_##n##_ports_per_thread values for all threads, that sum should equal busy_##n##_ports, but due to this issue there could be a difference, because while the busy_##n##_ports_per_thread values are correct, the busy_##n##_ports values may have been corrupted by the race condition mentioned above. Not sure if the above is a problem in practice; my main motivation for reporting this is that it confuses me when I am trying to understand how the code works in order to do some modifications. Either the code is not thread safe there, or I have misunderstood things. What do you think, is it an issue? If not, what have I missed? (This is not an April fools' joke, I really am this pedantic) Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#19087): https://lists.fd.io/g/vpp-dev/message/19087 Mute This Topic: https://lists.fd.io/mt/81773552/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Thread safety issue in NAT plugin regarding counter for busy ports
Hi Klement, > it’s spot on. I think all of it. Would you like to push an atomic- > increment patch or should I? Better if you do it, I don't really know how such atomic-increment operations work, it's something new to me. If you have a way of fixing it like that, I would be interested to see how you did it. Do you think that could be done without too much performance cost, and still portable enough? Best regards, Elias > Thanks for spotting this!!! > Klement > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#19102): https://lists.fd.io/g/vpp-dev/message/19102 Mute This Topic: https://lists.fd.io/mt/81773552/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Progressive_VPP_Tutorial show ip: unknown input arp
Hi Farzad, I noticed also that "show ip arp" does not seem to be available anymore in recent VPP versions. Maybe the tutorial needs updating. You can try using "show ip neighbors" instead, I think it shows about the same info that "show ip arp" used to give. Best regards, Elias On Sun, 2021-05-02 at 10:41 +0430, Farzad Sadeghi wrote: > I'm very new to vpp so I decided to do the progressive vpp tutorial. > At some point you are supposed to run "show ip arp". I get this in > response: > show ip: unknown input `arp' > > I pulled the vagrant box mentioned in the tutorial so I'm on Ubuntu > Xenial. > Here's the output of "show version": > vpp v20.01-release built by root on 4d189446a03d at 2020-01- > 29T22:12:33 > > Do I need to enable a certain plugin for "show ip arp" to be > available? > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#19312): https://lists.fd.io/g/vpp-dev/message/19312 Mute This Topic: https://lists.fd.io/mt/82522555/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] How to use valgrind to check for memory errors in vpp?
Hello, I would like to use valgrind to check for memory errors in vpp. I understand that running something through valgrind makes it very very slow so that it is not an option for real production usage of vpp. However, valgrind is still very useful for finding errors even if it's only for very limited test runs, so I would very much like to make that work. I know that vpp has some built-in checking for memory leaks, but the reason I want to use valgrind is not primarily to check for memory leaks but to check for other kinds of memory-access-related errors, like the "invalid read" and "invalid write" errors that valgrind can detect. So far, what I have done is to build vpp (debug configuration) according to the instructions here: https://fdio-vpp.readthedocs.io/en/latest/gettingstarted/developers/building.html Then I stopped the vpp service since I want to run vpp from the command-line through valgrind, and finally I run it like this: sudo valgrind vpp -c /etc/vpp/startup.conf That gave warnings about "client switching stacks?" and suggested adding --max-stackframe=137286291952 so I did that: sudo valgrind --max-stackframe=137286291936 vpp -c /etc/vpp/startup.conf Then valgrind gives a warning "Warning: set address range perms: large range" followed by some error reports of the type "Conditional jump or move depends on uninitialised value(s)" inside the mspace_malloc routine in dlmalloc.c. I think these issues are probably related to the fact that vpp uses its own malloc implementation (in dlmalloc.c) instead of the default malloc, possibly combined with the fact that vpp uses very large (virtual) memory. Questions: - Are there ways to configure vpp to allow it to work together with valgrind? - Are there ways to make vpp use less memory?
(currently "top" shows 0.205t VIRT memory usage for the vpp_main process) - Is it possible to somehow configure vpp to use standard malloc instead of the dlmalloc.c implementation, perhaps sacrificing performance but making things work better with valgrind? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#13923): https://lists.fd.io/g/vpp-dev/message/13923 Mute This Topic: https://lists.fd.io/mt/34077527/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] How to use valgrind to check for memory errors in vpp?
Thanks Dave and Ben for your kind replies. Ben, the Address Sanitizer integration sounds very interesting, if you could share your WIP patches that would be great! Best regards, Elias On Mon, 2019-09-09 at 12:57 +, Benoit Ganne (bganne) via Lists.Fd.Io wrote: > Hi Elias, > > As mentioned by Dave, running Valgrind on VPP is challenging because > of speed and custom allocators. > That being said, I am (slowly) working on integrating Address > Sanitizer into VPP. I have some cleanup to do but I can share my WIP > patches if interested. > > Best > ben > > > -Original Message- > > From: vpp-dev@lists.fd.io On Behalf Of Dave > > Barach > > via Lists.Fd.Io > > Sent: lundi 9 septembre 2019 14:20 > > To: Elias Rudberg ; vpp-dev@lists.fd.io > > Cc: vpp-dev@lists.fd.io > > Subject: Re: [vpp-dev] How to use valgrind to check for memory > > errors in > > vpp? > > > > Dlmalloc [aka "Doug Lea Malloc"] is a lightly modified copy of the > > allocator described here: > > http://gee.cs.oswego.edu/dl/html/malloc.html. If > > you've managed to find an issue in it, please share the details. > > Until > > proven otherwise, I suspect the report rather than dlmalloc itself. > > > > Vpp does indeed manage its own thread stacks. The so-called vpp > > process > > model [in truth: cooperative multi-tasking threads] uses > > setjmp/longjmp to > > switch stacks. The scheme is fundamental, and won't be changed to > > accomodate valgrind. > > > > Dlmalloc does not support valgrind. It's a waste of a huge number > > of > > cycles to run valgrind unless the memory allocator supports it. My > > experience making vpp's previous memory allocator support valgrind > > might > > be worth sharing: it never worked very well. After > 15 years > > working on > > the code base, I've not felt the need to go back and make it work > > in > > detail. > > > > Vpp uses multiple, independent heaps - some in shared memory - so > > switching to vanilla malloc() seems like a non-starter. 
> > > > Vpp's virtual space is larger than one might like - note the > > difference > > with none of the plugins loaded - but in terms of real memory > > consumption > > we often see RSS sizes in the 20-30mb range. A decent fraction of > > the > > virtual space is used to avoid expensive computations in device > > drivers: > > to facilitate virtual <--> physical address translation. > > > > Any issues accidentally introduced into the memory allocator would > > be a > > severe nuisance. Folks would be well-advised not to tinker with it. > > > > HTH... Dave > > > > -Original Message- > > From: vpp-dev@lists.fd.io On Behalf Of Elias > > Rudberg > > Sent: Monday, September 9, 2019 4:43 AM > > To: vpp-dev@lists.fd.io > > Subject: [vpp-dev] How to use valgrind to check for memory errors > > in vpp? > > > > Hello, > > > > I would like to use valgrind to check for memory errors in vpp. > > > > I understand that running something through valgrind makes it very > > very > > slow so that it is not an option for real production usage of vpp. > > However, valgrind is still very useful for finding errors even if > > it's > > only for very limited test runs, so I would very much like to make > > that > > work. > > > > I know that vpp has some built-in checking for memory leaks, but > > the > > reason I want to use valgrind is not primarily to check for memory > > leaks > > but to check for other kinds of memory-access-related errors, like > > the > > "invalid read" and "invalid write" errors that valgrind can detect. > > > > So far, what I have done is to build vpp (debug configuration) > > according > > to the instructions here: https://fdio- > > vpp.readthedocs.io/en/latest/gettingstarted/developers/building.htm > > l > > Then I stopped the vpp service since I want to run vpp from the > > command- > > line through valgrind, and finally I run it like this: > > > > sudo valgrind vpp -c /etc/vpp/startup.conf > > > > That gave warnings about "client switching stacks?" 
and suggested > > adding --max-stackframe=137286291952 so I did that: > > > > sudo valgrind --max-stackframe=137286291936 vpp -c > > /etc/vpp/startup.conf > > > > Then valgrind gives a warning "Warning: set address range perms: > > large range"
[vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer
Hello, Thanks to the patches shared by Benoit Ganne on Monday, I was today able to use AddressSanitizer for vpp. AddressSanitizer detected a problem that I think is caused by a bug in plugins/dpdk/device/init.c related to how the conf->eal_init_args vector is manipulated in the dpdk_config function. It appears that the code there uses two different kinds of strings, both C-style null-terminated strings (char*) and vectors of type (u8*) which are not necessarily null-terminated but instead have their length stored in a different way (as described in vppinfra/vec.h). In the dpdk_config function, various strings are added to the conf->eal_init_args vector. Those strings need to be null-terminated because they are later used as input to the "format" function which expects null-terminated strings for its later arguments. The strings are mostly null-terminated but not all of them, which leads to the error detected by AddressSanitizer. I think what happens is that some string that was generated by the "format" function and is thus not null-terminated is later given as input to a function that needs null-terminated strings as input, leading to illegal memory access. I'm able to make AddressSanitizer happy by making the following two changes: (1) Null-terminate the tmp string for conf->nchannels in the same way as it is done in other places in the code: - tmp = format (0, "%d", conf->nchannels); + tmp = format (0, "%d%c", conf->nchannels, 0); (2) Null-terminate conf->eal_init_args_str before the call to dpdk_log_warn: + vec_add1(conf->eal_init_args_str, 0); After that, vpp starts without complaints from AddressSanitizer. Should this be reported as a new bug in the Jira system for VPP ( https://jira.fd.io/browse/VPP)? Should I push a fix myself (not sure if I have permission to do that) or could someone more familiar with that part of the code do it? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group.
View/Reply Online (#13955): https://lists.fd.io/g/vpp-dev/message/13955
Re: [vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer
OK, now I created a Jira issue about it: https://jira.fd.io/browse/VPP-1772 I would like to commit and push a fix also, but I'm not sure how to do that properly. Looking at "git log" it looks like you are using some special form of commit messages with special "Signed-off-by" and "Change-Id" parts, I don't know what those mean. Are you using some tool to generate those commit messages, rather than just doing "git commit" at the command-line? Best regards, Elias On Wed, 2019-09-11 at 15:03 -0400, Dave Wallace wrote: > Elias, > > Please open a Jira Ticket and push a patch with this fix. > > BTW, there is a macro [0] that safely adds c-string termination to a > vector which I would recommend using for your fix (2). > > Thanks, > -daw- > [0] > https://docs.fd.io/vpp/19.08/db/d65/vec_8h.html#a2bc43313bc727b5453c3e5d7cc57a464 > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#13969): https://lists.fd.io/g/vpp-dev/message/13969 Mute This Topic: https://lists.fd.io/mt/34104878/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
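Those trailers are mostly generated automatically: "Signed-off-by" comes from plain git (the -s flag), while "Change-Id" is inserted by Gerrit's commit-msg hook, fetched once per clone. A small demo of the -s part in a throwaway repo (names obviously made up):

```shell
# One-time per clone: install Gerrit's commit-msg hook so each commit
# gets a Change-Id trailer (standard Gerrit mechanism):
#   scp -p -P 29418 USERNAME@gerrit.fd.io:hooks/commit-msg .git/hooks/

# The Signed-off-by trailer needs no tooling at all -- "git commit -s"
# adds it from your configured name/email:
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.name "Jane Dev"
git config user.email "jane@example.com"
echo hello > file.txt
git add file.txt
git commit -q -s -m "demo: show sign-off trailer"
git log -1 --format=%B
# prints the message followed by:
# Signed-off-by: Jane Dev <jane@example.com>
```

With the hook installed, an ordinary "git commit" at the command line is all that is needed; the usual Gerrit push target for review is then refs/for/<branch>.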
Re: [vpp-dev] Bug in plugins/dpdk/device/init.c related to eal_init_args found using AddressSanitizer
Thanks! What about the Jira ticket here https://jira.fd.io/browse/VPP-1772 -- now I set "Resolution: Done" there, should the "Fix Version/s" field be changed also? / Elias On Thu, 2019-09-12 at 12:00 -0400, Dave Wallace wrote: > Elias, > > Thanks for the patch -- I just merged it. > > Welcome to the VPP community :) > > Thanks, > -daw- -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#13975): https://lists.fd.io/g/vpp-dev/message/13975 Mute This Topic: https://lists.fd.io/mt/34104878/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?
As we are about to switch from VPP 19.01 to 19.08 we encountered a problem with NAT performance. We try to use the same settings (as far as possible) for 19.08 as we did for 19.01, on the same computer. In 19.01 we used 11 worker threads in total, combined with "set nat workers 0-6" so that 7 of the worker threads were handling NAT work. That worked fine in 19.01, but now that we try the same with 19.08 the performance gets really bad. The problem seems related to the choice of NAT threads. Examples to illustrate the issue: "set nat workers 0-1" --> works fine for both 19.01 and 19.08. "set nat workers 2-3" --> works fine for 19.01, but gives bad performance for 19.08. It seems as if, for version 19.08, only threads 0 and 1 can do NAT work with decent performance; as soon as any other threads are specified, performance gets bad. In contrast, for version 19.01, seemingly any of the threads can be used for NAT without performance problems. "Bad" performance here means that things work something like 10x slower, e.g. VPP starts to drop packets already at only 10% of the amount of traffic that it could handle otherwise. So it is really a big difference. Using gdb I was able to verify that the NAT functions are really executed by those worker threads that were chosen using "set nat workers", and as long as there is not too much traffic vpp still processes the packets correctly, it is just that it gets really slow when using other NAT threads than 0 and 1. My best guess is that the problem has something to do with how threads are bound (or not) to certain CPU cores and/or NUMA memory banks. But we have not changed any configuration options related to such things. Maybe if there has been a change in default behavior between 19.01 and 19.08 then that could explain it. The behavior for the current master branch seems to be the same as for 19.08.
Questions: Are there some new configuration options that we need to use to make 19.08 work with good performance using more than 2 NAT threads? Has the default behavior regarding binding of threads to CPU cores changed between VPP versions 19.01 and 19.08? Other ideas of what could be causing this and/or how to troubleshoot further? (In case that matters, we are using Mellanox hardware interfaces that required "make dpdk-install-dev DPDK_MLX5_PMD=y DPDK_MLX5_PMD_DLOPEN_DEPS=n" when building for vpp 19.01, while for 19.08 the interfaces are setup using "create int rdma host-if ...".) Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14104): https://lists.fd.io/g/vpp-dev/message/14104 Mute This Topic: https://lists.fd.io/mt/34379814/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?
More info after investigating further: the issue seems related to the fact that the RDMA plugin is available in 19.08, which did not exist in 19.01. As a result, we no longer need the "make dpdk-install-dev DPDK_MLX5_PMD=y DPDK_MLX5_PMD_DLOPEN_DEPS=n" complication when building. The release notes for VPP 19.04 say "RDMA (ibverb) driver plugin - MLX5 with multiqueue". For 19.01 we had configured "num-rx-queues" for each of the two interfaces used, in the dpdk dev part of the startup.conf file. After testing different choices for that it turns out that if we set "num-rx-queues 1" for each interface, then 19.01 gets the same performance problem that we see for 19.08 (i.e. only threads 0 and 1 can be used efficiently for NAT). So it appears that the reason why our 19.01 installation can use more NAT threads is that we have set larger "num-rx-queues" values. For 19.08 however, the "num-rx-queues" values seem to be ignored, presumably because the RDMA plugin is used. Is it correct that the dpdk dev num-rx-queues option is ignored when the RDMA plugin is used? How can we add more queues or polling threads to RDMA interfaces so that we can use more NAT workers? Best regards, Elias On Thu, 2019-10-03 at 07:28 +, Elias Rudberg wrote: > As we are about to switch from VPP 19.01 to 19.08 we encountered a > problem with NAT performance. We try to use the same settings (as far > as possible) for 19.08 as we did for 19.01, on the same computer. > > [...] -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14105): https://lists.fd.io/g/vpp-dev/message/14105 Mute This Topic: https://lists.fd.io/mt/34379814/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Poor NAT performance with 19.08 compared to 19.01, problem related to thread placement?
Dear Chris and Ben, This solved the issue for us. Many thanks for your help! Best regards, Elias On Thu, 2019-10-03 at 11:55 +, Benoit Ganne (bganne) via Lists.Fd.Io wrote: > Chris is correct, rdma driver is independent from DPDK driver and as > such is not aware of any DPDK config option. > Here is an example to create 8 rx queues: > ~# vppctl create int rdma host-if enp94s0f0 name rdma-0 num-rx-queues > 8 > > Best > Ben > > > -Original Message- > > From: vpp-dev@lists.fd.io On Behalf Of > > Christian > > Hopps > > Sent: jeudi 3 octobre 2019 13:29 > > To: Elias Rudberg > > Cc: Christian Hopps ; vpp-dev@lists.fd.io > > Subject: Re: [vpp-dev] Poor NAT performance with 19.08 compared to > > 19.01, > > problem related to thread placement? > > > > "create interface rdma" CLI has an num-rx-queues config > > > > VLIB_CLI_COMMAND (rdma_create_command, static) = { > > .path = "create interface rdma", > > .short_help = "create interface rdma [name > > ]" > > " [rx-queue-size ] [tx-queue-size ]" > > " [num-rx-queues ]", > > .function = rdma_create_command_fn, > > }; > > > > is that were you are setting it? DPDK config will not apply when > > you are > > using the native driver. > > > > Thanks, > > Chris. > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14108): https://lists.fd.io/g/vpp-dev/message/14108 Mute This Topic: https://lists.fd.io/mt/34379814/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] per-worker stat vector length fix needed in 19.08 also?
I was just chasing a strange error that turned out to be related to some code in src/vpp/stats/stat_segment.c where something went wrong regarding statistics vectors for different threads (some kind of memory corruption that ended up causing an infinite loop inside dlmalloc.c). Then I saw the following commit by Ben in the master branch: - commit dba00cad1a2e41b4974911793cc76eab81a6e30e Author: Benoît Ganne Date: Mon Sep 30 12:39:55 2019 +0200 stats: fix per-worker stat vector length Type: fix - The above commit in the master branch fixes the problem I was struggling with for 19.08. Can that commit be applied (cherry-picked?) also for the 19.08 branch? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14113): https://lists.fd.io/g/vpp-dev/message/14113 Mute This Topic: https://lists.fd.io/mt/34391310/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] prevent loopback of broadcast packets rdma fix needed in 19.08 also?
We just had problems making LACP bonding work with RDMA and Mellanox cards, using VPP 19.08. It turned out to be caused by a problem with unintended loopback of some packets, something that is fixed by the following commit by Ben in the master branch: --- commit df213385d391f21d99eaeaf066f0130a20f7ccde Author: Benoît Ganne Date: Fri Oct 4 15:28:12 2019 +0200 rdma: prevent loopback of broadcast packets TX queues must be created before RX queues on Mellanox cards in order to not receive our own broadcast packets. Type: fix Change-Id: I32ae25a47d819f715feda621a5ecddcf4efd71ba Signed-off-by: Benoît Ganne --- Can that fix be applied also for the 19.08 branch? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14148): https://lists.fd.io/g/vpp-dev/message/14148 Mute This Topic: https://lists.fd.io/mt/34443946/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Access to gerrit.fd.io port 29418 works for IPv4 but not for IPv6?
Hello, According to the instructions here https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pulling_code_via_ssh pulling the code should be done like this: git clone ssh://usern...@gerrit.fd.io:29418/vpp.git However, from my computer that does not work (it hangs). First I thought this was due to port 29418 being blocked for me locally but it turns out that was not the issue. Doing "host gerrit.fd.io" shows that it has both an IPv4 and an IPv6 address: IPv4: 52.10.107.188 IPv6: 2600:1f14:9b3:3400:ee75:f90f:2247:905d If I use the IPv4 address instead of the hostname, like this, then it works: git clone ssh://USERNAME@52.10.107.188:29418/vpp.git Trying from another computer that only uses IPv4, it works as it should using the hostname. So, it seems like the ssh access to gerrit.fd.io:29418 works for IPv4 but not for IPv6. That would explain why I can get it to work by typing the IPv4 address instead of the hostname, I guess that forces IPv4 to be used. As another way of verifying this, I tested disabling IPv6 completely on my computer. Then things work, consistent with the hypothesis that the problem is related to the IPv6 configuration of the gerrit.fd.io server. (If I'm right, anyone trying to access gerrit.fd.io:29418 using IPv6 should see this problem.) For now, using the IPv4 address works as a workaround, but I guess this is something that should be fixed in how the server is configured? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14385): https://lists.fd.io/g/vpp-dev/message/14385 Mute This Topic: https://lists.fd.io/mt/39770462/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
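Until the server side is fixed, a workaround that avoids hard-coding the IP address is to tell ssh to use IPv4 only for this host. A sketch of a ~/.ssh/config entry (AddressFamily is a standard OpenSSH client option; the host entry itself is of course specific to this situation):

```
# ~/.ssh/config -- use IPv4 only when talking to gerrit.fd.io
Host gerrit.fd.io
    AddressFamily inet    # "inet" = IPv4 only (the default is "any")
```

With that in place, "git clone ssh://USERNAME@gerrit.fd.io:29418/vpp.git" should go over IPv4 while keeping the hostname; for a one-off test, "ssh -4" forces IPv4 on the command line.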
[vpp-dev] RDMA fix needed in 19.08 also
It seems like the rdma plugin is currently not working in the stable/1908 branch. It stopped working after commit b4c5f16889. In the master branch, the rdma plugin stopped working in commit 534de8b2a7 but started working again after the fix in commit 386ebb6e2b with commit message "rdma: build: fix ibverb compilation test". To make rdma work again in the stable/1908 branch, I think the fix 386ebb6e2b "rdma: build: fix ibverb compilation test" would be needed in that branch also. The change is quite small, only a few lines in the file src/plugins/rdma/CMakeLists.txt. Can that change be applied in the stable/1908 branch? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14432): https://lists.fd.io/g/vpp-dev/message/14432 Mute This Topic: https://lists.fd.io/mt/40219418/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] RDMA fix needed in 19.08 also
Yes, now it works. Thank you! / Elias On Fri, 2019-11-01 at 08:42 +0100, Andrew 👽 Yourtchenko wrote: > It’s merged. Please let me know if all ok now. > > --a > > > On 31 Oct 2019, at 23:55, Andrew Yourtchenko via Lists.Fd.Io < > > ayourtch=gmail@lists.fd.io> wrote: > > > > Elias, > > > > Thanks for telling! I have cherry-picked > > https://gerrit.fd.io/r/#/c/vpp/+/23164/ and will merge it tomorrow. > > > > --a > > > > > On 31 Oct 2019, at 19:18, Elias Rudberg < > > > elias.rudb...@bahnhof.net> wrote: > > > > > > 386ebb6e2b > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14458): https://lists.fd.io/g/vpp-dev/message/14458 Mute This Topic: https://lists.fd.io/mt/40219418/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] NAT worker HANDOFF but no HANDED-OFF -- no worker picks up the handed-off work
We are using VPP 19.08 for NAT (nat44) and are struggling with the following problem: it first works seemingly fine for a while, like several days or weeks, but then suddenly VPP stops forwarding traffic. Even ping to the "outside" IP address fails. The VPP process is still running so we try to investigate further using vppctl, enabling packet trace as follows: clear trace trace add rdma-input 5 then doing ping to "outside" and then "show trace". To see the normal behavior we have compared to another server running VPP without the strange problem happening; there we can see that the normal behavior is that one worker starts processing the packet and then does NAT44_OUT2IN_WORKER_HANDOFF after which another worker takes over: "handoff_trace" and then "HANDED-OFF: from thread..." and then that worker continues processing the packet. So the relevant parts of the trace look like this (abbreviated to show only node names and handoff info) for a case when thread 8 hands off work to thread 3: --- Start of thread 3 vpp_wk_2 --- Packet 1 08:15:10:781992: handoff_trace HANDED-OFF: from thread 8 trace index 0 08:15:10:781992: nat44-out2in 08:15:10:782008: ip4-lookup 08:15:10:782009: ip4-local 08:15:10:782010: ip4-icmp-input 08:15:10:782011: ip4-icmp-echo-request 08:15:10:782011: ip4-load-balance 08:15:10:782013: ip4-rewrite 08:15:10:782014: BondEthernet0-output --- Start of thread 8 vpp_wk_7 --- Packet 1 08:15:10:781986: rdma-input 08:15:10:781988: bond-input 08:15:10:781989: ethernet-input 08:15:10:781989: ip4-input 08:15:10:781990: nat44-out2in-worker-handoff NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0 The above is what it looks like normally. The problem is that sometimes, for some reason, the handoff stops working so that we only get the initial processing by a worker and that worker saying NAT44_OUT2IN_WORKER_HANDOFF but the other worker does not pick up the work, it is seemingly ignored.
Here is what it looks like then, when the problem has happened, thread 7 trying to hand off to thread 3: --- Start of thread 3 vpp_wk_2 --- No packets in trace buffer --- Start of thread 7 vpp_wk_6 --- Packet 1 08:38:41:904654: rdma-input 08:38:41:904656: bond-input 08:38:41:904658: ethernet-input 08:38:41:904660: ip4-input 08:38:41:904663: nat44-out2in-worker-handoff NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0 So, work is also in this case handed off to thread 3 but thread 3 does not pick it up. There is no "HANDED-OFF" message in the trace at all, not for any worker. It seems like the handed-off work was ignored. Then of course it is understandable that the ping does not work and packet forwarding does not work, the question is: why does that hand-off procedure fail? Are there some known reasons that can cause this behavior? When there is a NAT44_OUT2IN_WORKER_HANDOFF message in the packet trace, should there always be a corresponding "HANDED-OFF" message for another thread picking it up? One more question related to the above: sometimes when looking at trace for ICMP packets to investigate this problem we have seen a worker apparently handing off work to itself, which seems strange. Example: --- Start of thread 3 vpp_wk_2 --- Packet 1 08:31:23:871274: rdma-input 08:31:23:871279: bond-input 08:31:23:871282: ethernet-input 08:31:23:871285: ip4-input 08:31:23:871289: nat44-out2in-worker-handoff NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0 If the purpose of "handoff" is to let another thread take over, then this seems strange by itself (even without considering that there is no "HANDED-OFF" for any thread): why is thread 3 trying to hand off work to itself? Does that indicate something wrong or are there legitimate cases where a thread "hands off" something to itself?
We have encountered this problem several times but unfortunately we have not yet found a way to reproduce it in a lab environment, we do not know exactly what triggers the problem. Previous times, when we have restarted vpp it starts working normally again. Any input on this or ideas for how to troubleshoot further would be much appreciated. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14602): https://lists.fd.io/g/vpp-dev/message/14602 Mute This Topic: https://lists.fd.io/mt/59112885/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] NAT worker HANDOFF but no HANDED-OFF -- no worker picks up the handed-off work
Hi Andrew, Thanks, that looks promising. The issue https://jira.fd.io/browse/VPP-1734 that the fix refers to seems like it could be the same issue we are seeing. We have just restarted vpp with the fix, it will be interesting to see if it helps. Thanks again for your help! / Elias On Fri, 2019-11-15 at 11:26 +0100, Andrew 👽 Yourtchenko wrote: > Hi Elias, > > Could you give a shot running a build with > https://gerrit.fd.io/r/#/c/vpp/+/23461/ in ? > > I cherry-picked it from master today but it is not in 19.08 branch > yet. > > --a > > > On 15 Nov 2019, at 11:05, Elias Rudberg > > wrote: > > > > We are using VPP 19.08 for NAT (nat44) and are struggling with the > > following problem: it first works seemingly fine for a while, like > > several days or weeks, but then suddenly VPP stops forwarding > > traffic. > > Even ping to the "outside" IP address fails. > > > > The VPP process is still running so we try to investigate further > > using > > vppctl, enabling packet trace as follows: > > > > clear trace > > trace add rdma-input 5 > > > > then doing ping to "outside" and then "show trace". > > > > To see the normal behavior we have compared to another server > > running > > VPP without the strange problem happening; there we can see that > > the > > normal behavior is that one worker starts processing the packet and > > then does NAT44_OUT2IN_WORKER_HANDOFF after which another worker > > takes > > over: "handoff_trace" and then "HANDED-OFF: from thread..." and > > then > > that worker continues processing the packet. 
> > So the relevant parts of the trace look like this (abbreviated to > > show > > only node names and handoff info) for a case when thread 8 hands > > off > > work to thread 3: > > > > --- Start of thread 3 vpp_wk_2 --- > > Packet 1 > > > > 08:15:10:781992: handoff_trace > > HANDED-OFF: from thread 8 trace index 0 > > 08:15:10:781992: nat44-out2in > > 08:15:10:782008: ip4-lookup > > 08:15:10:782009: ip4-local > > 08:15:10:782010: ip4-icmp-input > > 08:15:10:782011: ip4-icmp-echo-request > > 08:15:10:782011: ip4-load-balance > > 08:15:10:782013: ip4-rewrite > > 08:15:10:782014: BondEthernet0-output > > > > --- Start of thread 8 vpp_wk_7 --- > > Packet 1 > > > > 08:15:10:781986: rdma-input > > 08:15:10:781988: bond-input > > 08:15:10:781989: ethernet-input > > 08:15:10:781989: ip4-input > > 08:15:10:781990: nat44-out2in-worker-handoff > > NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0 > > > > The above is what it looks like normally. The problem is that > > sometimes, for some reason, the handoff stops working so that we > > only > > get the initial processing by a worker and that working saying > > NAT44_OUT2IN_WORKER_HANDOFF but the other worker does not pick up > > the > > work, it is seemingly ignored. > > > > Here is what it looks like then, when the problem has happened, > > thread > > 7 trying to handoff to thread 3: > > > > --- Start of thread 3 vpp_wk_2 --- > > No packets in trace buffer > > > > --- Start of thread 7 vpp_wk_6 --- > > Packet 1 > > > > 08:38:41:904654: rdma-input > > 08:38:41:904656: bond-input > > 08:38:41:904658: ethernet-input > > 08:38:41:904660: ip4-input > > 08:38:41:904663: nat44-out2in-worker-handoff > > NAT44_OUT2IN_WORKER_HANDOFF : next-worker 3 trace index 0 > > > > So, work is also in this case handed off to thread 3 but thread 3 > > does > > not pick it up. There is no "HANDED-OFF" message in the trace at > > all, > > not for any worker. It seems like the handed-off work was ignored. 
> > Then > > of course it is understandable that the ping does not work and > > packet > > forwarding does not work, the question is: why does that hand-off > > procedure fail? > > > > Are there some known reasons that can cause this behavior? > > > > When there is a NAT44_OUT2IN_WORKER_HANDOFF message in the packet > > trace, should there always be a corresponding "HANDED-OFF" message > > for > > another thread picking it up? > > > > One more question related to the above: sometimes when looking at > > trace > > for ICMP packets to investigate this problem we have seen a worker > > apparently
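The handoff mechanism discussed in this thread can be pictured as a per-thread queue that one worker enqueues to and another worker polls. The sketch below is a deliberately simplified, hypothetical model (a plain single-producer/single-consumer ring with invented names), not VPP's actual vlib frame-queue code; it only illustrates that the "HANDED-OFF" trace corresponds to the target worker dequeuing, so if the target worker never polls its queue, the enqueued work just sits there:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-producer/single-consumer handoff queue.
 * Names are invented for illustration; this is not VPP's vlib code. */
#define QUEUE_SIZE 8

typedef struct {
  int packets[QUEUE_SIZE];
  size_t head;  /* next slot the consumer reads */
  size_t tail;  /* next slot the producer writes */
} handoff_queue_t;

/* Producer side: roughly what a *-worker-handoff node does conceptually. */
static int handoff_enqueue (handoff_queue_t *q, int pkt)
{
  if (q->tail - q->head == QUEUE_SIZE)
    return -1;                       /* queue full, nothing enqueued */
  q->packets[q->tail % QUEUE_SIZE] = pkt;
  q->tail++;
  return 0;
}

/* Consumer side: what the target worker does when it polls its queue;
 * a successful dequeue is the step that would show up as "HANDED-OFF". */
static int handoff_dequeue (handoff_queue_t *q, int *pkt)
{
  if (q->head == q->tail)
    return 0;                        /* nothing handed off */
  *pkt = q->packets[q->head % QUEUE_SIZE];
  q->head++;
  return 1;
}
```

In this model, the failure described above corresponds to the producer side succeeding (the NAT44_OUT2IN_WORKER_HANDOFF trace entry) while the consumer side is never run, leaving head lagging behind tail indefinitely.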
[vpp-dev] undefined symbol: nat_ha_resync (trying to use Active-Passive NAT HA)
When trying to use the Active-Passive NAT HA functionality described at https://docs.fd.io/vpp/20.01/dd/d2e/nat_ha_doc.html and trying the "nat ha resync" command, VPP crashes with the following message:

  symbol lookup error: [...] nat_plugin.so: undefined symbol: nat_ha_resync

The attempted function call is in nat_ha_resync_command_fn in plugins/nat/nat44_cli.c and looks like this:

  if (nat_ha_resync (0, 0, 0))
    error = clib_error_return (0, "NAT HA resync already running");

The nat_ha_resync function is declared in plugins/nat/nat_ha.h like this:

  /**
   * @brief Resync HA (resend existing sessions to new failover)
   */
  int nat_ha_resync (u32 client_index, u32 pid,
                     nat_ha_resync_event_cb_t event_callback);

So the function is declared, which lets the compiler accept the call, but apparently it is not implemented anywhere, leading to the symbol lookup error at runtime. We tried this with 19.08 as well as the current master branch and encounter the same problem for both. Any ideas on how to make this work? Also, any other advice regarding the NAT HA functionality, or links to further documentation or example usage (if there is more than https://docs.fd.io/vpp/20.01/dd/d2e/nat_ha_doc.html), would be much appreciated. Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14696): https://lists.fd.io/g/vpp-dev/message/14696 Mute This Topic: https://lists.fd.io/mt/61957444/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Good LACP packets giving "error-drop" statistics
We are using LACP and it works fine except that the "error-drop" statistics are increased for each LACP packet that arrives. We see this behavior both for VPP 19.08 and for the current master branch. Here is an example of a packet trace for a LACP packet:

  00:00:16:717846: rdma-input
    rdma: Interface101 (3) next-node bond-input
  00:00:16:717848: bond-input
    src [...], dst [...], Interface101 -> Interface101
  00:00:16:717849: ethernet-input
    SLOW_PROTOCOLS: [...]
  00:00:16:717850: lacp-input
    Interface101: Length: 110 LACPv1
    Actor Information TLV: length 20
    [... LACP info here ...]
    Partner Information TLV: length 20
    [... LACP info here ...]
  00:00:16:717851: error-drop
    rx:Interface101
  00:00:16:717852: drop
    lacp-input: good lacp packets -- cache hit

So it says "good lacp packets" but at the same time "error-drop", which seems contradictory. LACP is in fact working fine; the only issue we have is the "error-drop" statistics that we would like to avoid if there is in fact nothing wrong. Is there some reason why it is desirable to report error-drop for all LACP packets, or is this something that can be fixed so that error-drop is only used when there is something wrong? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14716): https://lists.fd.io/g/vpp-dev/message/14716 Mute This Topic: https://lists.fd.io/mt/62549109/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] Status of VPP Active-Passive NAT HA code?
Hello VPP experts, I would like to ask about the status of the Active-Passive NAT HA (high availability) code in src/plugins/nat/nat_ha.c and nat_ha.h. In the git history it looks like it was added by Matus Fabian in February 2019, with few changes since then. Having looked at it and tested it I think it is partly working: it can indeed sync sessions from the active to the passive vpp server, but the "resync" functionality needed to (re-)send all session data to a new passive vpp server is, as far as I can tell, not fully implemented. In particular, the function nat_ha_resync declared in nat_ha.h is not implemented in nat_ha.c, which makes vpp crash when trying to use the "nat ha resync" command in vppctl. The "resync" functionality would be really good to have since it would allow us to restore the primary server in a situation where the secondary has taken over: if resync is supported, the secondary can send the session data back once the primary has been fixed or upgraded, and the original redundant setup can be recovered, all without breaking existing user sessions. Is anyone working on that part of the code now, or using it, or does anyone have some idea about its status? Any advice in case I were to try implementing the missing pieces myself? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14727): https://lists.fd.io/g/vpp-dev/message/14727 Mute This Topic: https://lists.fd.io/mt/64117356/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Status of VPP Active-Passive NAT HA code?
Hi Ole, Thanks for explaining! The "programmable flow NAT" solution you describe sounds very interesting, it may be better for us to focus on that if it's not too far off in the future. Please let me know if, when and how I can help with that. Best regards, Elias > The NAT HA code was something Matus ported across from another > project. > The other work was an experiment with a split of the NAT fast-path > and slow-path, with a protocol between them. > The NAT fast-path (aka the NAT DP) used a flow cache, with > instructions. On cache miss it would send a protocol packet to the > NAT slow path / NAT CP, asking for instructions for this flow. > The flow cache was uni-directional. And forward / reverse flow could > be on different VPP instances, as could the NAT CP. > > In our experiment the NAT CP also ran on VPP (although it doesn't > have to). And while the NAT DP instances by design don't need HA > functionality, the backend NAT CP would have to. > The bits of NAT HA was upstreamed by Matus from that experiment. > > It hasn't been worked on since as far as I know. > Filip probably has a better understanding of the details of that > code. > Happy to help of course, and happy to declare you the owner of NAT HA > from now on. ;-) > > Upstreaming the "programmable flow NAT" is on my list. It will be in > a separate plugin. Let me know if you are interested / want to > contribute. > > Best regards, > Ole > > PS: On a more personal rant. I dislike HA solutions in general. They > tend, by their increased level of complexity, to decrease > reliability. > One benefit of IPv4 run-out is that applications/the transport layer > have been trained to expect that the network holds session state, and > that it's the application's responsibility to maintain that session > state in the network. I wouldn't be so worried about dropping > sessions in the case of an unexpected event. Applications will > recreate sessions. At least something worth testing. 
> > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > > View/Reply Online (#14728): > https://lists.fd.io/g/vpp-dev/message/14728 > Mute This Topic: https://lists.fd.io/mt/64117356/1968077 > Group Owner: vpp-dev+ow...@lists.fd.io > Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [elias.rudberg@bahn > hof.net] > -=-=-=-=-=-=-=-=-=-=-=- -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#14733): https://lists.fd.io/g/vpp-dev/message/14733 Mute This Topic: https://lists.fd.io/mt/64117356/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] How to receive broadcast messages in VPP?
Hello everyone, I am trying to figure out how to receive broadcast messages in VPP (vpp version 19.08 in case that matters). This is in the context of some changes we are considering in the VPP NAT HA functionality. That code in e.g. plugins/nat/nat_ha.c uses UDP messages to communicate information about NAT sessions between different VPP servers. It is currently using unicast messages, but we are considering the possibility of using broadcast messages instead, hoping that could be more efficient in case there are more than two servers involved. For example, when a new NAT session has been created, we could send a broadcast message about the new session that would reach several other VPP servers, without needing to send a separate unicast message to each server. The code in plugins/nat/nat_ha.c calls udp_register_dst_port() to register that it wants to receive UDP traffic, like this:

  udp_register_dst_port (ha->vlib_main, port, nat_ha_handoff_node.index, 1);

This works fine for unicast messages; when such packets arrive at the given port, they get handled by the nat_ha_handoff_node as desired. However, if broadcast packets arrive, those packets are dropped instead; they do not arrive at the nat_ha_handoff_node. For example, if the IP address of the relevant interface on the receiving side is 10.10.50.1/24 then unicast UDP messages with destination 10.10.50.1 are handled fine. However, if the destination is 10.10.50.255 (the broadcast address for that /24 subnet) then the packets are dropped. 
Here is an example of a packet trace when such a packet is received from 10.10.50.2:

  02:41:19:250212: rdma-input
    rdma: Interface101 (3) next-node bond-input
  02:41:19:250214: bond-input
    src 02:fe:ff:76:e4:5d, dst ff:ff:ff:ff:ff:ff, Interface101 -> BondEthernet0
  02:41:19:250214: ethernet-input
    IP4: 02:fe:ff:76:e4:5d -> ff:ff:ff:ff:ff:ff 802.1q vlan 1015
  02:41:19:250215: ip4-input
    UDP: 10.10.50.2 -> 10.10.50.255
      tos 0x80, ttl 254, length 92, checksum 0x02fa
      fragment id 0x0002, flags DONT_FRAGMENT
    UDP: 1234 -> 2345
      length 72, checksum 0x
  02:41:19:250216: ip4-lookup
    fib 0 dpo-idx 0 flow hash: 0x
    UDP: 10.10.50.2 -> 10.10.50.255
      tos 0x80, ttl 254, length 92, checksum 0x02fa
      fragment id 0x0002, flags DONT_FRAGMENT
    UDP: 1234 -> 2345
      length 72, checksum 0x
  02:41:19:250217: ip4-drop
    UDP: 10.10.50.2 -> 10.10.50.255
      tos 0x80, ttl 254, length 92, checksum 0x02fa
      fragment id 0x0002, flags DONT_FRAGMENT
    UDP: 1234 -> 2345
      length 72, checksum 0x
  02:41:19:250217: error-drop
    rx:BondEthernet0.1015
  02:41:19:250217: drop
    ethernet-input: no error

So the packet ends up at ip4-drop when I would have liked it to come to nat_ha_handoff_node. Does anyone have a suggestion about how to make this work? Is some special configuration of the receiving interface needed to tell VPP that we want it to receive broadcast packets? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15352): https://lists.fd.io/g/vpp-dev/message/15352 Mute This Topic: https://lists.fd.io/mt/71020576/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] How to receive broadcast messages in VPP?
Hi Neale and Dave, Thanks for your answers! I was able to make it work using multicast as Neale suggested. Here is roughly what I did to make it work using multicast instead of unicast. On the sending side, to make it send multicast packets:

  adj_index_t adj_index_for_multicast =
    adj_mcast_add_or_lock (FIB_PROTOCOL_IP4, VNET_LINK_IP4, sw_if_index);

and then, when a message is to be sent, use the adj_index created above before invoking ip4_rewrite_node (instead of ip4_lookup_node):

  vnet_buffer (b)->ip.adj_index[VLIB_TX] = adj_index_for_multicast;
  vlib_put_frame_to_node (vm, ip4_rewrite_node.index, f);

On the receiving side the following config was needed:

  ip mroute add 224.0.0.1 via MyInterface Accept
  ip mroute add 224.0.0.1 via local Forward

After that it works using multicast. Thanks for your help! (Please let me know if the above is not the right way to do it) Best regards, Elias On Thu, 2020-02-06 at 13:45 +, Neale Ranns via Lists.Fd.Io wrote: > Hi Elias, > > Please see inline. > > > On 06/02/2020 12:41, "vpp-dev@lists.fd.io on behalf of Elias > Rudberg" > wrote: > > Hello everyone, > > I am trying to figure out how to receive broadcast messages in > VPP (vpp > version 19.08 in case that matters). > > This is in the context of some changes we are considering in the > VPP > NAT HA functionality. That code in e.g. plugins/nat/nat_ha.c uses > UDP > messages to communicate information about NAT sessions between > different VPP servers. It is currently using unicast messages, > but we > are considering the possibility of using broadcast messages > instead, > hoping that could be more efficient in case there are more than > two > servers involved. For example, when a new NAT session has been > created, > we could send a broadcast message about the new session, that > would > reach several other VPP servers, without need to send a separate > unicast message to each server. 
> > The code in plugins/nat/nat_ha.c calls udp_register_dst_port() to > register that it wants to receive UDP traffic, like this: > > udp_register_dst_port (ha->vlib_main, port, > nat_ha_handoff_node.index, 1); > > This works fine for unicast messages; when such packets arrive at > the > given port, they get handled by the nat_ha_handoff_node as > desired. > > However, if broadcast packets arrive, those packets are dropped > instead, they do not arrive at the nat_ha_handoff_node. > > For example, if the IP address of the relevant interface on the > receiving side is 10.10.50.1/24 then unicast UDP messages with > destination 10.10.50.1 are handled fine. However, if the > destination is > 10.10.50.255 (the broadcast address for that /24 subnet) then the > packets are dropped. Here is an example of a packet trace when > such a > packet is received from 10.10.50.2: > > 02:41:19:250212: rdma-input > rdma: Interface101 (3) next-node bond-input > 02:41:19:250214: bond-input > src 02:fe:ff:76:e4:5d, dst ff:ff:ff:ff:ff:ff, Interface101 -> > BondEthernet0 > 02:41:19:250214: ethernet-input > IP4: 02:fe:ff:76:e4:5d -> ff:ff:ff:ff:ff:ff 802.1q vlan 1015 > 02:41:19:250215: ip4-input > UDP: 10.10.50.2 -> 10.10.50.255 > tos 0x80, ttl 254, length 92, checksum 0x02fa > fragment id 0x0002, flags DONT_FRAGMENT > UDP: 1234 -> 2345 > length 72, checksum 0x > 02:41:19:250216: ip4-lookup > fib 0 dpo-idx 0 flow hash: 0x > UDP: 10.10.50.2 -> 10.10.50.255 > tos 0x80, ttl 254, length 92, checksum 0x02fa > fragment id 0x0002, flags DONT_FRAGMENT > UDP: 1234 -> 2345 > length 72, checksum 0x > 02:41:19:250217: ip4-drop > UDP: 10.10.50.2 -> 10.10.50.255 > tos 0x80, ttl 254, length 92, checksum 0x02fa > fragment id 0x0002, flags DONT_FRAGMENT > UDP: 1234 -> 2345 > length 72, checksum 0x > > if you check: > sh ip fib 10.10.50.255/32 > you'll see an explicit entry to drop. You can't override this. 
> > > 02:41:19:250217: error-drop > rx:BondEthernet0.1015 > 02:41:19:250217: drop > ethernet-input: no error > > So the packet ends up at ip4-drop when I would have liked it to > come to > nat_ha_handoff_node. > > Does anyone have a suggestion about how to make this work? > Is some special configuration of the receiving interface needed > to tell > VPP that we want it to receive broadcast packets? >
[vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards
Hello VPP developers, We have a problem with VPP used for NAT on Ubuntu 18.04 servers equipped with Mellanox ConnectX-5 network cards (ConnectX-5 EN network interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket; ROHS R6). VPP is dropping packets in the ip4-input node due to "ip4 length > l2 length" errors, when we use the RDMA plugin. The interfaces are configured like this:

  create int rdma host-if enp101s0f1 name Interface101 num-rx-queues 1
  create int rdma host-if enp179s0f1 name Interface179 num-rx-queues 1

(we have set num-rx-queues 1 now to simplify while troubleshooting, in production we use num-rx-queues 4) We see some packets dropped due to "ip4 length > l2 length" for example in TCP tests with around 100 Mbit/s -- running such a test for a few seconds already gives some errors. More traffic gives more errors and it seems to be unrelated to the contents of the packets; it seems to happen quite randomly and already at such moderate amounts of traffic, very far below what should be the capacity of the hardware. Only a small fraction of packets are dropped: in tests at 100 Mbit/s and packet size 500, for each million packets about 3 or 4 packets get the "ip4 length > l2 length" drop problem. However, the effect appears stronger for larger amounts of traffic and has impacted some of our end users who observe decreased TCP speed as a result of these drops. The "ip4 length > l2 length" errors can be seen using vppctl "show errors":

  142   ip4-input   ip4 length > l2 length

To get more info about the "ip4 length > l2 length" error we printed the involved sizes when the error happens (ip_len0 and cur_len0 in src/vnet/ip/ip4_input.h), which shows that the actual packet size is often much smaller than the ip_len0 value, which is what the IP packet size should be according to the IP header. For example, when ip_len0=500, as is the case for many of our packets in the test runs, the cur_len0 value is sometimes much smaller. 
The smallest case we have seen was cur_len0 = 59 with ip_len0 = 500 -- the IP header said the IP packet size was 500 bytes, but the actual size was only 59 bytes. So it seems some data is lost: packets have been truncated, sometimes with large parts of the packets missing. The problems disappear if we skip using the RDMA plugin and use the (old?) dpdk way of handling the interfaces; then there are no "ip4 length > l2 length" drops at all. That makes us think there is something wrong with the rdma plugin, perhaps a bug or something wrong with how it is configured. We have tested this with both the current master branch and the stable/1908 branch, we see the same problem for both. We tried updating the Mellanox driver from v4.6 to v4.7 (latest version) but that did not help. After trying some different values of the rx-queue-size parameter to the "create int rdma" command, it seems like the "ip4 length > l2 length" error count becomes smaller as the rx-queue-size is increased, perhaps indicating the problem has to do with what happens when the end of that queue is reached. Do you agree that the above points to a problem with the RDMA plugin in VPP? Are there known bugs or other issues that could explain the "ip4 length > l2 length" drops? Does it seem like a good idea to set a very large value of the rx-queue-size parameter if that alleviates the "ip4 length > l2 length" problem, or are there big downsides of using a large rx-queue-size value? What else could we do to troubleshoot this further, are there configuration options to the RDMA plugin that could be used to solve this and/or get more information about what is happening? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. 
View/Reply Online (#15403): https://lists.fd.io/g/vpp-dev/message/15403 Mute This Topic: https://lists.fd.io/mt/71273976/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
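For reference, the "ip4 length > l2 length" check discussed in this thread boils down to comparing the total length claimed by the IPv4 header against the number of bytes actually present in the buffer. A minimal sketch of that comparison (function name invented for illustration; the real check lives in src/vnet/ip/ip4_input.h and works on vlib buffers):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified sketch of the sanity check behind the
 * "ip4 length > l2 length" error counter.
 * ip_len is the total length claimed by the IPv4 header;
 * cur_len is the number of bytes actually present at L2. */
static int
ip4_length_error (uint16_t ip_len, uint16_t cur_len)
{
  /* If the header claims more bytes than the buffer holds,
   * the packet was truncated somewhere and is counted as an error. */
  return ip_len > cur_len;
}
```

In the truncation case reported above (ip_len0 = 500 but only 59 bytes received), this comparison fires, which is why the drops show up under this particular counter.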
Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards
Hi Ben, Thanks for your answer. Now I think I found the problem; it looks like a bug in plugins/rdma/input.c related to what happens when the list of input packets wraps around to the beginning of the ring buffer. To fix it, the following change is needed:

  diff --git a/src/plugins/rdma/input.c b/src/plugins/rdma/input.c
  index 30fae83e0..f9979545d 100644
  --- a/src/plugins/rdma/input.c
  +++ b/src/plugins/rdma/input.c
  @@ -318,7 +318,7 @@ rdma_device_input_inline (vlib_main_t * vm, vlib_node_runtime_t * node,
                               &bt);
     if (n_tail < n_rx_packets)
       n_rx_bytes +=
  -      rdma_device_input_bufs (vm, rd, &to_next[n_tail], &rxq->bufs[0], wc,
  +      rdma_device_input_bufs (vm, rd, &to_next[n_tail], &rxq->bufs[0], &wc[n_tail],
                                 n_rx_packets - n_tail, &bt);
     rdma_device_input_ethernet (vm, node, rd, next_index);

At that point in the code, the rdma_device_input_bufs() function is called twice to handle the n_rx_packets that have arrived. First it is called for the part up to the end of the buffer, and then a second call is made to handle the remaining part, starting from the beginning of the buffer. The problem is that the same "wc" argument is passed both times, when in fact that pointer needs to be moved forward for the second call, so we need &wc[n_tail] instead of just wc for the second call to rdma_device_input_bufs() -- n_tail is the number of packets that were handled by the first rdma_device_input_bufs() call. In my tests so far it looks like the above change fixes the problem completely; after the fix there are no longer any "ip4 length > l2 length" errors. This explanation fits with what we saw in our tests earlier, that the problem with erroneous packets became smaller when the buffer size was increased, since the second call to rdma_device_input_bufs() only comes into play at the end of the ring buffer, which happens more rarely when the buffer is larger. (But after the fix above there is no longer any need to increase the buffer size.) What do you think, does this seem right? 
Best regards, Elias On Mon, 2020-02-17 at 15:38 +, Benoit Ganne (bganne) via Lists.Fd.Io wrote: > Hi Elias, > > As the problem only arise with VPP rdma driver and not the DPDK > driver, it is fair to say it is a VPP rdma driver issue. > I'll try to reproduce the issue on my setup and keep you posted. > In the meantime I do not see a big issue increasing the rx-queue-size > to mitigate it. > > ben > > > -Original Message- > > From: vpp-dev@lists.fd.io On Behalf Of Elias > > Rudberg > > Sent: vendredi 14 février 2020 16:56 > > To: vpp-dev@lists.fd.io > > Subject: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > > > l2 > > length" errors when using rdma with Mellanox mlx5 cards > > > > Hello VPP developers, > > > > We have a problem with VPP used for NAT on Ubuntu 18.04 servers > > equipped with Mellanox ConnectX-5 network cards (ConnectX-5 EN > > network > > interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket; > > ROHS R6). > > > > VPP is dropping packets in the ip4-input node due to "ip4 length > > > l2 > > length" errors, when we use the RDMA plugin. > > > > The interfaces are configured like this: > > > > create int rdma host-if enp101s0f1 name Interface101 num-rx-queues > > 1 > > create int rdma host-if enp179s0f1 name Interface179 num-rx-queues > > 1 > > > > (we have set num-rx-queues 1 now to simplify while troubleshooting, > > in > > production we use num-rx-queues 4) > > > > We see some packets dropped due to "ip4 length > l2 length" for > > example > > in TCP tests with around 100 Mbit/s -- running such a test for a > > few > > seconds already gives some errors. More traffic gives more errors > > and > > it seems to be unrelated to the contents of the packets, it seems > > to > > happen quite randomly and already at such moderate amounts of > > traffic, > > very far below what should be the capacity of the hardware. 
> > > > Only a small fraction of packets are dropped: in tests at 100 > > Mbit/s > > and packet size 500, for each million packets about 3 or 4 packets > > get > > the "ip4 length > l2 length" drop problem. However, the effect > > appears > > stronger for larger amounts of traffic and has impacted some of our > > end > > users who observe decresed TCP speed as a result of these drops. > > > > The "ip4 length > l2 length" errors can be seen using vppctl "show > > errors": > > > >
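The wrap-around mistake described in this thread can be reproduced in isolation. The sketch below (invented helper names, only loosely modelled on rdma_device_input_bufs and its wc completion array) drains a batch of completions into a ring in two chunks and shows that passing the same wc pointer to both chunks replays the first chunk's lengths into the wrapped slots, while advancing to &wc[n_tail] gives the correct result:

```c
#include <assert.h>
#include <stddef.h>

/* Copy per-packet byte counts from a completion array into slots,
 * loosely mimicking what rdma_device_input_bufs does with wc[]. */
static size_t
input_bufs (size_t *out, const size_t *wc, size_t n)
{
  size_t total = 0;
  for (size_t i = 0; i < n; i++)
    {
      out[i] = wc[i];
      total += wc[i];
    }
  return total;
}

/* Drain n_rx_packets completions into a ring of ring_size slots starting
 * at 'slot', splitting the work at the wrap point like the rdma input
 * node does.  With buggy != 0, the second chunk is given the same 'wc'
 * pointer as the first chunk (the original bug); otherwise &wc[n_tail]. */
static size_t
drain (size_t *ring, size_t ring_size, size_t slot,
       const size_t *wc, size_t n_rx_packets, int buggy)
{
  size_t n_tail = ring_size - slot;   /* slots left before the wrap */
  if (n_tail > n_rx_packets)
    n_tail = n_rx_packets;
  size_t bytes = input_bufs (&ring[slot], wc, n_tail);
  if (n_tail < n_rx_packets)
    bytes += input_bufs (&ring[0], buggy ? wc : &wc[n_tail],
                         n_rx_packets - n_tail);
  return bytes;
}
```

With the buggy variant, the slots after the wrap get the first chunk's completion lengths, so the recorded packet length no longer matches the data actually placed in those buffers, which is consistent with the truncated-looking packets and the observation that a larger ring (fewer wraps) made the errors rarer.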
Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards
Hi Ben, Great! I tried submitting a patch myself, here it is: https://gerrit.fd.io/r/c/vpp/+/25233 Let me know if something more is needed. I tried to follow the instructions here: https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review / Elias On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via Lists.Fd.Io wrote: > Hi Elias, > > > Now I think I found the problem, looks like a bug in > > plugins/rdma/input.c related to what happens when the list of input > > packets wrap around to the beginning of the ring buffer. > > To fix it, the following change is needed: > > Indeed, your fix is correct, good catch. Do you want to submit a > patch through gerrit or do you prefer me to do it? > > Best > ben -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15446): https://lists.fd.io/g/vpp-dev/message/15446 Mute This Topic: https://lists.fd.io/mt/71273976/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards
Hello, Could this fix be applied in the stable/1908 (and maybe stable/2001) branch also? Best regards, Elias On Tue, 2020-02-18 at 11:48 +, Elias Rudberg wrote: > Hi Ben, > > Great! I tried submitting a patch myself, here it is: > > https://gerrit.fd.io/r/c/vpp/+/25233 > > Let me know if something more is needed. I tried to follow the > instructions here: > https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review > > / Elias > > > On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via > Lists.Fd.Io wrote: > > Hi Elias, > > > > > Now I think I found the problem, looks like a bug in > > > plugins/rdma/input.c related to what happens when the list of > > > input > > > packets wrap around to the beginning of the ring buffer. > > > To fix it, the following change is needed: > > > > Indeed, your fix is correct, good catch. Do you want to submit a > > patch through gerrit or do you prefer me to do it? > > > > Best > > ben -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15702): https://lists.fd.io/g/vpp-dev/message/15702 Mute This Topic: https://lists.fd.io/mt/71273976/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] VPP ip4-input drops packets due to "ip4 length > l2 length" errors when using rdma with Mellanox mlx5 cards
Hello again, Thanks for the help with getting this fix into the 1908 branch! Could the same fix please be added in the stable/2001 branch also? That would be very helpful for us, since although we are until now using 19.08 we are about to move to 20.01 because we need the NAT improvements in 20.01 that are not available in 19.08. Best regards, Elias On Mon, 2020-03-09 at 09:21 +, Elias Rudberg wrote: > Hello, > > Could this fix be applied in the stable/1908 (and maybe stable/2001) > branch also? > > Best regards, > Elias > > > > On Tue, 2020-02-18 at 11:48 +, Elias Rudberg wrote: > > Hi Ben, > > > > Great! I tried submitting a patch myself, here it is: > > > > https://gerrit.fd.io/r/c/vpp/+/25233 > > > > Let me know if something more is needed. I tried to follow the > > instructions here: > > https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Pushing_Code_with_git_review > > > > / Elias > > > > > > On Tue, 2020-02-18 at 09:30 +, Benoit Ganne (bganne) via > > Lists.Fd.Io wrote: > > > Hi Elias, > > > > > > > Now I think I found the problem, looks like a bug in > > > > plugins/rdma/input.c related to what happens when the list of > > > > input > > > > packets wrap around to the beginning of the ring buffer. > > > > To fix it, the following change is needed: > > > > > > Indeed, your fix is correct, good catch. Do you want to submit a > > > patch through gerrit or do you prefer me to do it? > > > > > > Best > > > ben -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15725): https://lists.fd.io/g/vpp-dev/message/15725 Mute This Topic: https://lists.fd.io/mt/71273976/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] impact of API requests on forwarding performance?
Hi Andreas, I think you are right about the stop-world way it works. We have seen a performance impact, but that was for a command that was quite slow, listing something with many lines of output (the "show nat44 sessions" command). So then the worker threads were stopped during that whole operation and we saw some packet drops each time. Later we were able to extract the info we needed in other ways (like getting the number of sessions directly as a single number per thread via the python API instead of fetching a large output and counting lines in that), so we could avoid that performance problem. For small things like checking the values of some counters, we have not seen any performance impact. But then we only did those calls once every 30 seconds or so. If you do it very often, like many times per second, maybe there could be a performance impact also for small things. I suppose you could test it by gradually increasing the frequency of your API calls and seeing when drops start to happen. Best regards, Elias On Wed, 2020-03-11 at 17:03 +0100, Andreas Schultz wrote: > Hi, > > Has anyone benchmarked the impact of VPP API invocations on the > forwarding performance? > > Background: most calls on the VPP API run in a stop-world manner. That > means all graph node worker threads are stopped at a barrier, the API > call is executed and then the workers are released from the barrier. > Right? > > My question is now, when I do 1k, 10k or even 100k API invocations per > second, how does that impact the forwarding performance of VPP? > > Does anyone have a use-case running that is actually doing that? > > Many thanks, > Andreas -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15738): https://lists.fd.io/g/vpp-dev/message/15738 Mute This Topic: https://lists.fd.io/mt/71882379/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] impact of API requests on forwarding performance?
Hi Ole, Thanks for explaining! I'm sorry if what I wrote before was wrong or confusing. > Checking counter values in the stats segment has _no_ impact on VPP. > VPP writes those counters regardless of reader frequency. That's great! Just to be clear, to make sure I understand what this means, if we do the following in python:

  from vpp_papi.vpp_stats import VPPStats
  stat = VPPStats("/run/vpp/stats.sock")
  dir = stat.ls(['^/nat44/total-users'])
  counters = stat.dump(dir)
  list_of_counters = counters.get('/nat44/total-users')

(followed by a loop in python to sum up the counter values from different vpp threads) then what we are doing is checking counter values in the stats segment, so there should be no impact on VPP? Best regards, Elias -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#15743): https://lists.fd.io/g/vpp-dev/message/15743 Mute This Topic: https://lists.fd.io/mt/71882379/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[vpp-dev] NAT bugfix related to in2out/out2in handoff node index
Hello,

While working on moving from VPP 19.08 to 20.01 we found that NAT was no longer working, and it seems to be due to a bug in src/plugins/nat/nat.c for the dynamic endpoint-independent case, here:

sm->handoff_out2in_index = snat_in2out_node.index;
sm->handoff_in2out_index = snat_out2in_node.index;

As I understand it, handoff_out2in_index is supposed to be the node index of the out2in node, but it is set to the in2out node index instead, and the other way around: out2in and in2out are mixed up in those two lines.

I pushed a fix to gerrit; it's just those two lines that are changed:

https://gerrit.fd.io/r/c/vpp/+/25856

If you agree, can this fix please be accepted into master and also into the stable/2001 branch?

Best regards,
Elias
[vpp-dev] Approve NAT in2out/out2in handoff node index fix in stable/2001 branch also?
Hello,

Can someone please approve this change so that we get the fix in the stable/2001 branch also?

https://gerrit.fd.io/r/c/vpp/+/25861

(It was done in the master branch last week -- see https://gerrit.fd.io/r/c/vpp/+/25856 -- then it was cherry-picked for the stable/2001 branch.)

Best regards,
Elias
Re: [vpp-dev] VPP 20.05 problems related to memory allocations -- possible memory leak?
Hello Murthy,

I think that the problem with VPP 20.05 that I wrote about back in October 2020 later turned out to be related to NAT44 hairpinning, see the discussion here:

https://lists.fd.io/g/vpp-dev/topic/78662322

A fix was merged, so the exact same problem should not happen for the 21.06 version that you are using. If you have a similar problem in the sense that VPP runs out of memory for some unknown reason, then my advice would be to gather as much statistics as you can (error counters and so on) from around the time the problem happens, to see if that gives you any clue.

Best regards,
Elias

On Tue, 2021-11-16 at 23:10 -0800, Satya Murthy wrote:
> Hi Klemant/Elias/Vpp-Experts,
>
> We are also seeing the same crash with fdio 21.06 version.
>
> vec_resize_allocate_memory + 0x285
> vlib_put_next_frame + 0xbd
>
> Our main-heap size is set to 2G.
>
> Is this a known issue (or) any fix that is available for this?
>
> Any inputs will be helpful.
[vpp-dev] Are some VPP releases considered LTS releases?
Hello VPP experts,

Are some VPP releases considered LTS (long-term support) releases? If so, which is the latest LTS version at this time?

Best regards,
Elias
[vpp-dev] How to make VPP work with Mellanox ConnectX-6 NICs?
Hello VPP experts,

We have been using VPP with Mellanox ConnectX-5 cards for a while, which has been working great.

Now we have a new server where we want to run VPP in a similar way to what we are used to; the difference is that the new server has ConnectX-6 cards instead of ConnectX-5.

The lspci command shows each ConnectX-6 card as follows:

51:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Trying to create an interface using the following command:

create int rdma host-if ibs1f1 name if1 num-rx-queues 4

gives the following error:

DBGvpp# create int rdma host-if ibs1f1 name if1 num-rx-queues 4
create interface rdma: Queue Pair create failed: Invalid argument

and journalctl shows the following:

Nov 16 16:06:39 [...] vnet[3147]: rdma: rdma_txq_init: Queue Pair create failed: Invalid argument
Nov 16 16:06:39 [...] vnet[3147]: create interface rdma: Queue Pair create failed: Invalid argument
Nov 16 16:06:39 [...] kernel: infiniband mlx5_3: create_qp:3206:(pid 3147): Create QP type 8 failed

We are using Ubuntu 22.04 and the VPP version tested was vpp v22.10.

Do we need to do something different when using ConnectX-6 cards compared to the ConnectX-5 case?

Best regards,
Elias
Re: [vpp-dev] How to make VPP work with Mellanox ConnectX-6 NICs?
Hi Ben,

You were right that my issue had to do with IB/ETH mode. The card was set to IB mode. After changing to ETH mode, things are now working. There is no change in how VPP is configured for ConnectX-6 compared to ConnectX-5; everything is the same, except that the interface names are slightly different, as can be seen for example using the "ip link" command.

In case it helps someone else: the command used to change mode was "mlxconfig", and the options changed were LINK_TYPE_P1 and LINK_TYPE_P2, both of which were changed from IB(1) to ETH(2).

Thanks!
/ Elias

On Tue, 2022-11-22 at 08:59 +, Benoit Ganne (bganne) via lists.fd.io wrote:
> Hi Elias,
>
> Sorry, this slipped through my mind!
> I do not have any Cx6 to test (I think we should receive some in CSIT
> at some point), but as it seems to complain about QP type 8, which
> is supposed to be the ethernet queue type, can you check if your
> adapter supports Ethernet and, if so, whether it is set to Ethernet
> and not IB? You might need to use some mlx tools to query/change
> settings in the card fw.
>
> Best
> ben
>
> > -----Original Message-----
> > From: vpp-dev@lists.fd.io On Behalf Of Elias Rudberg
> > Sent: Wednesday, November 16, 2022 17:10
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev] How to make VPP work with Mellanox ConnectX-6 NICs?
> >
> > Hello VPP experts,
> >
> > We have been using VPP with Mellanox ConnectX-5 cards for a while,
> > which has been working great.
> >
> > Now we have a new server where we want to run VPP in a similar way
> > that we are used to, the difference is that the new server has
> > ConnectX-6 cards instead of ConnectX-5.
> >
> > The lspci command shows each ConnectX-6 card as follows:
> >
> > 51:00.0 Infiniband controller: Mellanox Technologies MT28908 Family
> > [ConnectX-6]
> >
> > Trying to create an interface using the following command:
> >
> > create int rdma host-if ibs1f1 name if1 num-rx-queues 4
> >
> > gives the following error:
> >
> > DBGvpp# create int rdma host-if ibs1f1 name if1 num-rx-queues 4
> > create interface rdma: Queue Pair create failed: Invalid argument
> >
> > and journalctl shows the following:
> >
> > Nov 16 16:06:39 [...] vnet[3147]: rdma: rdma_txq_init: Queue Pair
> > create failed: Invalid argument
> > Nov 16 16:06:39 [...] vnet[3147]: create interface rdma: Queue Pair
> > create failed: Invalid argument
> > Nov 16 16:06:39 [...] kernel: infiniband mlx5_3: create_qp:3206:(pid
> > 3147): Create QP type 8 failed
> >
> > We are using Ubuntu 22.04 and the VPP version tested was vpp v22.10.
> >
> > Do we need to do something different when using ConnectX-6 cards
> > compared to the ConnectX-5 case?
> >
> > Best regards,
> > Elias
[vpp-dev] Traffic shaping functionality in VPP?
Hello VPP experts,

We have been using VPP for NAT44 for a while, which has worked great. We contributed some fixes a couple of years ago and have been using VPP without issues since then.

Now we are considering the possibility of using VPP for a different use case as well, related to "broadband network gateway" (BNG) functionality. This would involve traffic shaping: something like buffering packets for each user/subscriber when the rate of traffic reaches a certain limit, allowing different limits for different users. There would need to be a separate buffer for each user and some counters to keep track of the current rate of traffic for each user.

Questions related to this:

- Is there some already existing traffic shaping functionality in VPP that could be used for this?
- Otherwise, if we were to implement such functionality, would you say it is feasible to do as a VPP plugin, and do you have advice on how to do it?
- Are others on this list also interested in this, or is someone perhaps already working on something like this?

I would also be interested in any other comments or thoughts you may have about this.

Best regards,
Elias
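The per-subscriber rate-limit idea described above is often implemented with a token bucket. The sketch below is not VPP code, just a generic illustration of the algorithm; all names and units are made up for the example, and a real shaper would queue (rather than merely reject) packets that exceed the budget.

```python
import time

class TokenBucket:
    """Generic per-subscriber token bucket: tokens accrue at a fixed
    rate up to a burst limit; a packet is forwarded only if enough
    tokens are available to cover its size."""

    def __init__(self, rate_bytes_per_s, burst_bytes, now=None):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic() if now is None else now

    def allow(self, packet_bytes, now=None):
        now = time.monotonic() if now is None else now
        # Refill tokens for the time elapsed since the last packet.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True   # forward immediately
        return False      # caller buffers (shaping) or drops (policing)
```

Per-user limits would then just be a mapping from subscriber ID to a TokenBucket configured with that subscriber's rate and burst.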