Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-24 Thread Raj Kumar
Hi Florin,
After fixing the UDP checksum offload issue and using a 64 kB tx buffer, I
am able to send 35 Gbps (half duplex).
In the DPDK code (./plugins/dpdk/device/init.c), the DEV_TX_OFFLOAD_TCP_CKSUM
and DEV_TX_OFFLOAD_UDP_CKSUM offload bits were not being set for the MLNX5
PMD.
In the UDP tx application I am using vppcom_session_write to write to the
session, and the write len is the same as the buffer size (64 kB).

Btw, I ran all the tests with the patch you provided:
https://gerrit.fd.io/r/c/vpp/+/24462

If I run a single UDP tx connection, the throughput is 35 Gbps. But on
starting other UDP rx connections (20 Gbps), the tx throughput goes down to
12 Gbps.
Even if I run 2 UDP tx connections, I am not able to scale up the
throughput; the overall throughput stays the same.
I first tried this test with 4 worker threads and then with 1 worker
thread.

I have the following 2 points:
1) With my UDP tx test application, I am getting this throughput after
using a 64 kB tx buffer. But in the actual product I have to send
variable-size UDP packets (max len 9000 bytes). That means the maximum tx
buffer size would be 9 kB, and with that buffer size I am getting 15 Gbps,
which is fine if I can somehow scale it up by running multiple
applications. But that does not seem to work with UDP (I am not using udpc).

2) My target is to achieve at least 30 Gbps rx and 30 Gbps tx UDP
throughput on one NUMA node. I tried running multiple VPP instances on VFs
(SR-IOV), and I can scale up the throughput (rx and tx) with the number of
VPP instances.
Here is the throughput test with VFs:
1 VPP instance (15 Gbps rx and 15 Gbps tx)
2 VPP instances (30 Gbps rx and 30 Gbps tx)
3 VPP instances (45 Gbps rx and 35 Gbps tx)

I have 2 NUMA nodes on the server, so I am expecting to get 60 Gbps rx and
60 Gbps tx total throughput.

Btw, I also tested TCP without VFs. It seems to scale up properly, as the
connections land on different threads.

vpp# sh thread

ID  Name      Type     LWP    Sched Policy (Priority)  lcore  Core  Socket State
0   vpp_main           22181  other (0)                1      0     0
1   vpp_wk_0  workers  22183  other (0)                2      2     0
2   vpp_wk_1  workers  22184  other (0)                3      3     0
3   vpp_wk_2  workers  22185  other (0)                4      4     0
4   vpp_wk_3  workers  22186  other (0)                5      8     0



4 worker threads

Iperf3 TCP tests - 8000-byte packets

1 Connection:

Rx only

18 Gbps

vpp# sh session verbose 1

Connection                                      State        Rx-f  Tx-f
[0:0][T] fd0d:edc4::2001::203:6669->:::0        LISTEN       0     0

Thread 0: active sessions 1



Connection                                      State        Rx-f  Tx-f
[1:0][T] fd0d:edc4::2001::203:6669->fd0d:edc4:  ESTABLISHED  0     0

Thread 1: active sessions 1

Thread 2: no sessions

Thread 3: no sessions



Connection                                      State        Rx-f  Tx-f
[4:0][T] fd0d:edc4::2001::203:6669->fd0d:edc4:  ESTABLISHED  0     0

Thread 4: active sessions 1



2 connections:



Rx only

32Gbps

vpp# sh session verbose 1

Connection                                      State        Rx-f  Tx-f
[0:0][T] fd0d:edc4::2001::203:6669->:::0        LISTEN       0     0
[0:1][T] fd0d:edc4::2001::203:6679->:::0        LISTEN       0     0

Thread 0: active sessions 2



Connection                                      State        Rx-f  Tx-f
[1:0][T] fd0d:edc4::2001::203:6669->fd0d:edc4:  ESTABLISHED  0     0

Thread 1: active sessions 1

Thread 2: no sessions

Thread 3: no sessions



Connection                                      State        Rx-f  Tx-f
[4:0][T] fd0d:edc4::2001::203:6669->fd0d:edc4:  ESTABLISHED  0     0
[4:1][T] fd0d:edc4::2001::203:6679->fd0d:edc4:  ESTABLISHED  0     0
[4:2][T] fd0d:edc4::2001::203:6679->fd0d:edc4:  ESTABLISHED  0     0

Thread 4: active sessions 3

3 connections:

Rx only

43Gbps

vpp# sh session verbose 1

Connection                                      State        Rx-f  Tx-f
[0:0][T] fd0d:edc4::2001::203:6669->:::0        LISTEN       0     0
[0:1][T] fd0d:edc4::2001::203:6679->:::0        LISTEN       0     0
[0:2][T] fd0d:edc4::2001::203:6689->:::0        LISTEN       0     0

Thread 0: active sessions 3



Connection                                      State        Rx-f  Tx-f
[1:0][T] fd0d:edc4::2001::203:6669->fd0d:edc4:  ESTABLISHED  0     0

Thread 1: active sessions 1

Thread 2: no sessions



Connection                                      State        Rx-f  Tx-f
[3:0][T] fd0d:edc4::2001::203:6689->fd0d:edc4:  ESTABLISHED  0     0

Thread 3: active sessions 1



Connection                                      State        Rx-f  Tx-f
[4:0][T] 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-21 Thread Florin Coras
Hi Raj, 

Inline.

> On Jan 21, 2020, at 3:41 PM, Raj Kumar  wrote:
> 
> Hi Florin,
> There is no drop on the interfaces. It is 100G card. 
> In UDP tx application, I am using 1460 bytes of buffer to send on select(). I 
> am getting 5 Gbps throughput  ,but if I start one more application then total 
> throughput goes down to 4 Gbps as both the sessions are on the same thread.   
> I increased the tx buffer to 8192 bytes and then I can get 11 Gbps throughput 
>  but again if I start one more application the throughput goes down to 10 
> Gbps.

FC: I assume you’re using vppcom_session_write to write to the session. How 
large is “len” typically? See lower on why that matters.
 
> 
> I found one issue in the code ( You must be aware of that) , the UDP send MSS 
> is hard-coded to 1460 ( /vpp/src/vnet/udp/udp.c file). So, the large packets  
> are getting fragmented. 
> udp_send_mss (transport_connection_t * t)
> {
>   /* TODO figure out MTU of output interface */
>   return 1460;
> }

FC: That’s a typical mss and actually what tcp uses as well. Given the nics, 
they should be fine sending a decent number of mpps without the need to do 
jumbo ip datagrams. 

> if I change the MSS to 8192 then I am getting 17 Mbps throughput. But , if i 
> start one more application then throughput is going down to 13 Mbps. 

> 
> It looks like the 17 Mbps is per core limit and since all the sessions are 
> pined to the same thread we can not get more throughput.  Here, per core 
> throughput look good to me. Please let me know there is any way to use 
> multiple threads for UDP tx applications. 
> 
> In your previous email you mentioned that we can use connected udp socket in 
> the UDP receiver. Can we do something similar for UDP tx ?

FC: I think it may work fine if vpp has main + 1 worker. I have a draft patch 
here [1] that seems to work with multiple workers but it’s not heavily tested. 

Out of curiosity, I ran a vcl_test_client/server test with 1 worker and with 
XL710s, I’m seeing this:

CLIENT RESULTS: Streamed 65536017791 bytes
  in 14.392678 seconds (36.427420 Gbps half-duplex)!

Should be noted that because of how datagrams are handled in the session layer, 
throughput is sensitive to write sizes. I ran the client like:
~/vcl_client -p udpc 6.0.1.2 1234 -U -N 100 -T 65536

Or in English: a unidirectional test, a 64 kB tx buffer, and 1M writes of that 
buffer. My vcl config was such that tx fifos were 4 MB and rx fifos 2 MB. The 
sender had few tx packet drops (1657) and the receiver few rx packet drops 
(801). If you plan to use it, make sure arp entries are first resolved (e.g., 
use ping) otherwise the first packet is lost. 

Throughput drops to ~15Gbps with 8kB writes. You should probably also test with 
bigger writes with udp. 

[1] https://gerrit.fd.io/r/c/vpp/+/24462

> 
> From the hardware stats , it seems that UDP tx checksum offload is not 
> enabled/active  which could impact the performance. I think, udp tx checksum 
> should be enabled by default if it is not disabled using parameter  
> "no-tx-checksum-offload".

FC: Performance might be affected by the limited number of offloads available. 
Here’s what I see on my XL710s:

rx offload active: ipv4-cksum jumbo-frame scatter
tx offload active: udp-cksum tcp-cksum multi-segs

> 
> Ethernet address b8:83:03:79:af:8c
>   Mellanox ConnectX-4 Family
> carrier up full duplex mtu 9206
> flags: admin-up pmd maybe-multiseg subif rx-ip4-cksum
> rx: queues 5 (max 65535), desc 1024 (min 0 max 65535 align 1)

FC: Are you running with 5 vpp workers? 

Regards,
Florin

> tx: queues 6 (max 65535), desc 1024 (min 0 max 65535 align 1)
> pci: device 15b3:1017 subsystem 1590:0246 address :12:00.00 numa 0
> max rx packet len: 65536
> promiscuous: unicast off all-multicast on
> vlan offload: strip off filter off qinq off
> rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum vlan-filter
>jumbo-frame scatter timestamp keep-crc
> rx offload active: ipv4-cksum jumbo-frame scatter
> tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
>outer-ipv4-cksum vxlan-tnl-tso gre-tnl-tso multi-segs
>udp-tnl-tso ip-tnl-tso
> tx offload active: multi-segs
> rss avail: ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4 ipv6-tcp-ex
>ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>ipv6-ex ipv6
> rss active:ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4 ipv6-tcp-ex
>ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>ipv6-ex ipv6
> tx burst function: (nil)
> rx burst function: mlx5_rx_burst
> 
> thanks,
> -Raj
> 
> On Mon, Jan 20, 2020 at 7:55 PM Florin Coras wrote:
> Hi Raj, 
> 
> Good to see progress. Check with “show int” the tx counters on the sender and 
> rx counters on the receiver as 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-21 Thread Raj Kumar
Correction : -
Please read 17 Mbps as 17 Gbps and 13Mbps as 13Gbps in my previous mail.

thanks,
-Raj

On Tue, Jan 21, 2020 at 6:41 PM Raj Kumar  wrote:

> Hi Florin,
> There is no drop on the interfaces. It is 100G card.
> In UDP tx application, I am using 1460 bytes of buffer to send on
> select(). I am getting 5 Gbps throughput  ,but if I start one more
> application then total throughput goes down to 4 Gbps as both the sessions
> are on the same thread.
> I increased the tx buffer to 8192 bytes and then I can get 11 Gbps
> throughput  but again if I start one more application the throughput goes
> down to 10 Gbps.
>
> I found one issue in the code ( You must be aware of that) , the UDP send
> MSS is hard-coded to 1460 ( /vpp/src/vnet/udp/udp.c file). So, the large
> packets  are getting fragmented.
> udp_send_mss (transport_connection_t * t)
> {
>   /* TODO figure out MTU of output interface */
>   return 1460;
> }
> if I change the MSS to 8192 then I am getting 17 Mbps throughput. But , if
> i start one more application then throughput is going down to 13 Mbps.
>
> It looks like the 17 Mbps is per core limit and since all the sessions are
> pined to the same thread we can not get more throughput.  Here, per core
> throughput look good to me. Please let me know there is any way to use
> multiple threads for UDP tx applications.
>
> In your previous email you mentioned that we can use connected udp socket
> in the UDP receiver. Can we do something similar for UDP tx ?
>
> From the hardware stats , it seems that UDP tx checksum offload is not
> enabled/active  which could impact the performance. I think, udp tx
> checksum should be enabled by default if it is not disabled using
> parameter  "no-tx-checksum-offload".
>
> Ethernet address b8:83:03:79:af:8c
>   Mellanox ConnectX-4 Family
> carrier up full duplex mtu 9206
> flags: admin-up pmd maybe-multiseg subif rx-ip4-cksum
> rx: queues 5 (max 65535), desc 1024 (min 0 max 65535 align 1)
> tx: queues 6 (max 65535), desc 1024 (min 0 max 65535 align 1)
> pci: device 15b3:1017 subsystem 1590:0246 address :12:00.00 numa 0
> max rx packet len: 65536
> promiscuous: unicast off all-multicast on
> vlan offload: strip off filter off qinq off
> rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum
> vlan-filter
>jumbo-frame scatter timestamp keep-crc
> rx offload active: ipv4-cksum jumbo-frame scatter
> tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
>outer-ipv4-cksum vxlan-tnl-tso gre-tnl-tso
> multi-segs
>udp-tnl-tso ip-tnl-tso
> tx offload active: multi-segs
> rss avail: ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
> ipv6-tcp-ex
>ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>ipv6-ex ipv6
> rss active:ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
> ipv6-tcp-ex
>ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>ipv6-ex ipv6
> tx burst function: (nil)
> rx burst function: mlx5_rx_burst
>
> thanks,
> -Raj
>
> On Mon, Jan 20, 2020 at 7:55 PM Florin Coras 
> wrote:
>
>> Hi Raj,
>>
>> Good to see progress. Check with “show int” the tx counters on the sender
>> and rx counters on the receiver as the interfaces might be dropping
>> traffic. One sender should be able to do more than 5Gbps.
>>
>> How big are the writes to the tx fifo? Make sure the tx buffer is some
>> tens of kB.
>>
>> As for the issue with the number of workers, you’ll have to switch to
>> udpc (connected udp), to ensure you have a separate connection for each
>> ‘flow’, and to use accept in combination with epoll to accept the sessions
>> udpc creates.
>>
>> Note that udpc currently does not work correctly with vcl and multiple
>> vpp workers if vcl is the sender (not the receiver) and traffic is
>> bidirectional. The sessions are all created on the first thread and once
>> return traffic is received, they’re migrated to the thread selected by RSS
>> hashing. VCL is not notified when that happens and it runs out of sync. You
>> might not be affected by this, as you’re not receiving any return traffic,
>> but because of that all sessions may end up stuck on the first thread.
>>
>> For udp transport, the listener is connection-less and bound to the main
>> thread. As a result, all incoming packets, even if they pertain to multiple
>> flows, are written to the listener’s buffer/fifo.
>>
>> Regards,
>> Florin
>>
>> On Jan 20, 2020, at 3:50 PM, Raj Kumar  wrote:
>>
>> Hi Florin,
>> I changed my application as you suggested. Now, I am able to achieve 5
>> Gbps with a single UDP stream.  Overall, I can get ~20Gbps with multiple
>> host application . Also, the TCP throughput  is improved to ~28Gbps after
>> tuning as mentioned in  [1].
>> On the similar topic; the UDP tx throughput is throttled to 5Gbps. Even
>> if I run the 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-21 Thread Raj Kumar
Hi Florin,
There is no drop on the interfaces. It is a 100G card.
In the UDP tx application, I am using a 1460-byte buffer to send on select().
I am getting 5 Gbps throughput, but if I start one more application then the
total throughput goes down to 4 Gbps, as both sessions are on the same
thread.
I increased the tx buffer to 8192 bytes, and then I can get 11 Gbps
throughput, but again if I start one more application the throughput goes
down to 10 Gbps.

I found one issue in the code (you must be aware of it): the UDP send
MSS is hard-coded to 1460 (vpp/src/vnet/udp/udp.c). So the large
packets are getting fragmented.
u16
udp_send_mss (transport_connection_t * t)
{
  /* TODO figure out MTU of output interface */
  return 1460;
}
If I change the MSS to 8192 then I am getting 17 Mbps throughput. But if
I start one more application then the throughput goes down to 13 Mbps.

It looks like 17 Mbps is a per-core limit, and since all the sessions are
pinned to the same thread we cannot get more throughput. Here, the per-core
throughput looks good to me. Please let me know if there is any way to use
multiple threads for UDP tx applications.

In your previous email you mentioned that we can use a connected udp socket
in the UDP receiver. Can we do something similar for UDP tx?

From the hardware stats, it seems that UDP tx checksum offload is not
enabled/active, which could impact performance. I think udp tx
checksum should be enabled by default unless it is disabled using the
"no-tx-checksum-offload" parameter.

Ethernet address b8:83:03:79:af:8c
  Mellanox ConnectX-4 Family
carrier up full duplex mtu 9206
flags: admin-up pmd maybe-multiseg subif rx-ip4-cksum
rx: queues 5 (max 65535), desc 1024 (min 0 max 65535 align 1)
tx: queues 6 (max 65535), desc 1024 (min 0 max 65535 align 1)
pci: device 15b3:1017 subsystem 1590:0246 address :12:00.00 numa 0
max rx packet len: 65536
promiscuous: unicast off all-multicast on
vlan offload: strip off filter off qinq off
rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum vlan-filter
   jumbo-frame scatter timestamp keep-crc
rx offload active: ipv4-cksum jumbo-frame scatter
tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
   outer-ipv4-cksum vxlan-tnl-tso gre-tnl-tso multi-segs
   udp-tnl-tso ip-tnl-tso
tx offload active: multi-segs
rss avail: ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
ipv6-tcp-ex
   ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
   ipv6-ex ipv6
rss active:ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
ipv6-tcp-ex
   ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
   ipv6-ex ipv6
tx burst function: (nil)
rx burst function: mlx5_rx_burst

thanks,
-Raj

On Mon, Jan 20, 2020 at 7:55 PM Florin Coras  wrote:

> Hi Raj,
>
> Good to see progress. Check with “show int” the tx counters on the sender
> and rx counters on the receiver as the interfaces might be dropping
> traffic. One sender should be able to do more than 5Gbps.
>
> How big are the writes to the tx fifo? Make sure the tx buffer is some
> tens of kB.
>
> As for the issue with the number of workers, you’ll have to switch to udpc
> (connected udp), to ensure you have a separate connection for each ‘flow’,
> and to use accept in combination with epoll to accept the sessions udpc
> creates.
>
> Note that udpc currently does not work correctly with vcl and multiple vpp
> workers if vcl is the sender (not the receiver) and traffic is
> bidirectional. The sessions are all created on the first thread and once
> return traffic is received, they’re migrated to the thread selected by RSS
> hashing. VCL is not notified when that happens and it runs out of sync. You
> might not be affected by this, as you’re not receiving any return traffic,
> but because of that all sessions may end up stuck on the first thread.
>
> For udp transport, the listener is connection-less and bound to the main
> thread. As a result, all incoming packets, even if they pertain to multiple
> flows, are written to the listener’s buffer/fifo.
>
> Regards,
> Florin
>
> On Jan 20, 2020, at 3:50 PM, Raj Kumar  wrote:
>
> Hi Florin,
> I changed my application as you suggested. Now, I am able to achieve 5
> Gbps with a single UDP stream.  Overall, I can get ~20Gbps with multiple
> host application . Also, the TCP throughput  is improved to ~28Gbps after
> tuning as mentioned in  [1].
> On the similar topic; the UDP tx throughput is throttled to 5Gbps. Even if
> I run the multiple host applications the overall throughput is 5Gbps. I
> also tried by configuring multiple worker threads . But the problem is that
> all the application sessions are assigned to the same worker thread. Is
> there any way to assign each session  to a different worker thread?
>
> vpp# sh session verbose 2
> Thread 0: no 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-20 Thread Florin Coras
Hi Raj, 

Good to see progress. Check with “show int” the tx counters on the sender and 
rx counters on the receiver as the interfaces might be dropping traffic. One 
sender should be able to do more than 5Gbps. 

How big are the writes to the tx fifo? Make sure the tx buffer is some tens of 
kB. 

As for the issue with the number of workers, you’ll have to switch to udpc 
(connected udp), to ensure you have a separate connection for each ‘flow’, and 
to use accept in combination with epoll to accept the sessions udpc creates. 

Note that udpc currently does not work correctly with vcl and multiple vpp 
workers if vcl is the sender (not the receiver) and traffic is bidirectional. 
The sessions are all created on the first thread and once return traffic is 
received, they’re migrated to the thread selected by RSS hashing. VCL is not 
notified when that happens and it runs out of sync. You might not be affected 
by this, as you’re not receiving any return traffic, but because of that all 
sessions may end up stuck on the first thread. 

For udp transport, the listener is connection-less and bound to the main 
thread. As a result, all incoming packets, even if they pertain to multiple 
flows, are written to the listener’s buffer/fifo.

Regards,
Florin

> On Jan 20, 2020, at 3:50 PM, Raj Kumar  wrote:
> 
> Hi Florin,
> I changed my application as you suggested. Now, I am able to achieve 5 Gbps 
> with a single UDP stream.  Overall, I can get ~20Gbps with multiple host 
> application . Also, the TCP throughput  is improved to ~28Gbps after tuning 
> as mentioned in  [1]. 
> On the similar topic; the UDP tx throughput is throttled to 5Gbps. Even if I 
> run the multiple host applications the overall throughput is 5Gbps. I also 
> tried by configuring multiple worker threads . But the problem is that all 
> the application sessions are assigned to the same worker thread. Is there any 
> way to assign each session  to a different worker thread?
> 
> vpp# sh session verbose 2
> Thread 0: no sessions
> [#1][U] fd0d:edc4::2001::203:58926->fd0d:edc4:
>  Rx fifo: cursize 0 nitems 399 has_event 0
>   head 0 tail 0 segment manager 1
>   vpp session 0 thread 1 app session 0 thread 0
>   ooo pool 0 active elts newest 0
>  Tx fifo: cursize 399 nitems 399 has_event 1
>   head 1460553 tail 1460552 segment manager 1
>   vpp session 0 thread 1 app session 0 thread 0
>   ooo pool 0 active elts newest 4294967295
>  session: state: opened opaque: 0x0 flags:
> [#1][U] fd0d:edc4::2001::203:63413->fd0d:edc4:
>  Rx fifo: cursize 0 nitems 399 has_event 0
>   head 0 tail 0 segment manager 2
>   vpp session 1 thread 1 app session 0 thread 0
>   ooo pool 0 active elts newest 0
>  Tx fifo: cursize 399 nitems 399 has_event 1
>   head 3965434 tail 3965433 segment manager 2
>   vpp session 1 thread 1 app session 0 thread 0
>   ooo pool 0 active elts newest 4294967295
>  session: state: opened opaque: 0x0 flags:
> Thread 1: active sessions 2
> Thread 2: no sessions
> Thread 3: no sessions
> Thread 4: no sessions
> Thread 5: no sessions
> Thread 6: no sessions
> Thread 7: no sessions
> vpp# sh app client
> Connection  App
> [#1][U] fd0d:edc4::2001::203:58926->udp6_tx_8092[shm]
> [#1][U] fd0d:edc4::2001::203:63413->udp6_tx_8093[shm]
> vpp#
> 
> 
> 
> thanks,
> -Raj
> 
> On Sun, Jan 19, 2020 at 8:50 PM Florin Coras wrote:
> Hi Raj,
> 
> The function used for receiving datagrams is limited to reading at most the 
> length of a datagram from the rx fifo. UDP datagrams are mtu sized, so your 
> reads are probably limited to ~1.5kB. On each epoll rx event try reading from 
> the session handle in a while loop until you get an VPPCOM_EWOULDBLOCK. That 
> might improve performance. 
> 
> Having said that, udp is lossy so unless you implement your own 
> congestion/flow control algorithms, the data you’ll receive might be full of 
> “holes”. What are the rx/tx error counters on your interfaces (check with “sh 
> int”). 
> 
> Also, with simple tuning like this [1], you should be able to achieve much 
> more than 15Gbps with tcp. 
> 
> Regards,
> Florin
> 
> [1] https://wiki.fd.io/view/VPP/HostStack/LDP/iperf 
> 
> 
>> On Jan 19, 2020, at 3:25 PM, Raj Kumar wrote:
>> 
>>   Hi Florin,
>>  By using VCL library in an UDP receiver application,  I am able to receive 
>> only 2 Mbps traffic. On increasing the traffic, I see Rx FIFO full error and 
>> application stopped receiving the traffic from the session layer.  Whereas, 
>> with TCP I can easily achieve 15Gbps throughput without tuning any DPDK 
>> parameter.  UDP tx also looks fine. From an host application I can send 
>> ~5Gbps without any issue. 
>> 
>> I am running VPP( stable/2001 code) on RHEL8 server using Mellanox 100G 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-20 Thread Raj Kumar
Hi Florin,
I changed my application as you suggested. Now I am able to achieve 5 Gbps
with a single UDP stream. Overall, I can get ~20 Gbps with multiple host
applications. Also, the TCP throughput is improved to ~28 Gbps after tuning
as mentioned in [1].
On a similar topic: the UDP tx throughput is throttled to 5 Gbps. Even if
I run multiple host applications, the overall throughput is 5 Gbps. I
also tried configuring multiple worker threads, but the problem is that
all the application sessions are assigned to the same worker thread. Is
there any way to assign each session to a different worker thread?

vpp# sh session verbose 2
Thread 0: no sessions
[#1][U] fd0d:edc4::2001::203:58926->fd0d:edc4:
 Rx fifo: cursize 0 nitems 399 has_event 0
  head 0 tail 0 segment manager 1
  vpp session 0 thread 1 app session 0 thread 0
  ooo pool 0 active elts newest 0
 Tx fifo: cursize 399 nitems 399 has_event 1
  head 1460553 tail 1460552 segment manager 1
  vpp session 0 thread 1 app session 0 thread 0
  ooo pool 0 active elts newest 4294967295
 session: state: opened opaque: 0x0 flags:
[#1][U] fd0d:edc4::2001::203:63413->fd0d:edc4:
 Rx fifo: cursize 0 nitems 399 has_event 0
  head 0 tail 0 segment manager 2
  vpp session 1 thread 1 app session 0 thread 0
  ooo pool 0 active elts newest 0
 Tx fifo: cursize 399 nitems 399 has_event 1
  head 3965434 tail 3965433 segment manager 2
  vpp session 1 thread 1 app session 0 thread 0
  ooo pool 0 active elts newest 4294967295
 session: state: opened opaque: 0x0 flags:
Thread 1: active sessions 2
Thread 2: no sessions
Thread 3: no sessions
Thread 4: no sessions
Thread 5: no sessions
Thread 6: no sessions
Thread 7: no sessions
vpp# sh app client
Connection  App
[#1][U] fd0d:edc4::2001::203:58926->udp6_tx_8092[shm]
[#1][U] fd0d:edc4::2001::203:63413->udp6_tx_8093[shm]
vpp#



thanks,
-Raj

On Sun, Jan 19, 2020 at 8:50 PM Florin Coras  wrote:

> Hi Raj,
>
> The function used for receiving datagrams is limited to reading at most
> the length of a datagram from the rx fifo. UDP datagrams are mtu sized, so
> your reads are probably limited to ~1.5kB. On each epoll rx event try
> reading from the session handle in a while loop until you get an
> VPPCOM_EWOULDBLOCK. That might improve performance.
>
> Having said that, udp is lossy so unless you implement your own
> congestion/flow control algorithms, the data you’ll receive might be full
> of “holes”. What are the rx/tx error counters on your interfaces (check
> with “sh int”).
>
> Also, with simple tuning like this [1], you should be able to achieve much
> more than 15Gbps with tcp.
>
> Regards,
> Florin
>
> [1] https://wiki.fd.io/view/VPP/HostStack/LDP/iperf
>
> On Jan 19, 2020, at 3:25 PM, Raj Kumar  wrote:
>
>   Hi Florin,
>  By using VCL library in an UDP receiver application,  I am able to
> receive only 2 Mbps traffic. On increasing the traffic, I see Rx FIFO full
> error and application stopped receiving the traffic from the session
> layer.  Whereas, with TCP I can easily achieve 15Gbps throughput without
> tuning any DPDK parameter.  UDP tx also looks fine. From an host
> application I can send ~5Gbps without any issue.
>
> I am running VPP( stable/2001 code) on RHEL8 server using Mellanox 100G
> (MLNX5) adapters.
> Please advise if I can use VCL library to receive high throughput UDP
> traffic ( in Gbps). I would be running multiple instances of host
> application to receive data ( ~50-60 Gbps).
>
> I also tried by increasing the Rx FIFO size to 16MB but did not help much.
> The host application is just throwing the received packets , it is not
> doing any packet processing.
>
> [root@orc01 vcl_test]# VCL_DEBUG=2 ./udp6_server_vcl
> VCL<20201>: configured VCL debug level (2) from VCL_DEBUG!
> VCL<20201>: allocated VCL heap = 0x7f39a17ab010, size 268435456
> (0x10000000)
> VCL<20201>: configured rx_fifo_size 4000000 (0x3d0900)
> VCL<20201>: configured tx_fifo_size 4000000 (0x3d0900)
> VCL<20201>: configured app_scope_local (1)
> VCL<20201>: configured app_scope_global (1)
> VCL<20201>: configured api-socket-name (/tmp/vpp-api.sock)
> VCL<20201>: completed parsing vppcom config!
> vppcom_connect_to_vpp:480: vcl<20201:0>: app (udp6_server) is connected to
> VPP!
> vppcom_app_create:1104: vcl<20201:0>: sending session enable
> vppcom_app_create:1112: vcl<20201:0>: sending app attach
> vppcom_app_create:1121: vcl<20201:0>: app_name 'udp6_server',
> my_client_index 256 (0x100)
> vppcom_epoll_create:2439: vcl<20201:0>: Created vep_idx 0
> vppcom_session_create:1179: vcl<20201:0>: created session 1
> vppcom_session_bind:1317: vcl<20201:0>: session 1 handle 1: binding to
> local IPv6 address fd0d:edc4::2001::203 port 8092, proto UDP
> vppcom_session_listen:1349: vcl<20201:0>: session 1: sending vpp listen
> request...
> vcl_session_bound_handler:604: 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-19 Thread Florin Coras
Hi Raj,

The function used for receiving datagrams is limited to reading at most the 
length of a datagram from the rx fifo. UDP datagrams are mtu sized, so your 
reads are probably limited to ~1.5 kB. On each epoll rx event, try reading from 
the session handle in a while loop until you get a VPPCOM_EWOULDBLOCK. That 
might improve performance. 

Having said that, udp is lossy so unless you implement your own congestion/flow 
control algorithms, the data you’ll receive might be full of “holes”. What are 
the rx/tx error counters on your interfaces (check with “sh int”). 

Also, with simple tuning like this [1], you should be able to achieve much more 
than 15Gbps with tcp. 

Regards,
Florin

[1] https://wiki.fd.io/view/VPP/HostStack/LDP/iperf

> On Jan 19, 2020, at 3:25 PM, Raj Kumar  wrote:
> 
>   Hi Florin,
>  By using VCL library in an UDP receiver application,  I am able to receive 
> only 2 Mbps traffic. On increasing the traffic, I see Rx FIFO full error and 
> application stopped receiving the traffic from the session layer.  Whereas, 
> with TCP I can easily achieve 15Gbps throughput without tuning any DPDK 
> parameter.  UDP tx also looks fine. From an host application I can send 
> ~5Gbps without any issue. 
> 
> I am running VPP( stable/2001 code) on RHEL8 server using Mellanox 100G 
> (MLNX5) adapters.
> Please advise if I can use VCL library to receive high throughput UDP traffic 
> ( in Gbps). I would be running multiple instances of host application to 
> receive data ( ~50-60 Gbps).
> 
> I also tried by increasing the Rx FIFO size to 16MB but did not help much. 
> The host application is just throwing the received packets , it is not doing 
> any packet processing.
> 
> [root@orc01 vcl_test]# VCL_DEBUG=2 ./udp6_server_vcl
> VCL<20201>: configured VCL debug level (2) from VCL_DEBUG!
> VCL<20201>: allocated VCL heap = 0x7f39a17ab010, size 268435456 (0x10000000)
> VCL<20201>: configured rx_fifo_size 4000000 (0x3d0900)
> VCL<20201>: configured tx_fifo_size 4000000 (0x3d0900)
> VCL<20201>: configured app_scope_local (1)
> VCL<20201>: configured app_scope_global (1)
> VCL<20201>: configured api-socket-name (/tmp/vpp-api.sock)
> VCL<20201>: completed parsing vppcom config!
> vppcom_connect_to_vpp:480: vcl<20201:0>: app (udp6_server) is connected to 
> VPP!
> vppcom_app_create:1104: vcl<20201:0>: sending session enable
> vppcom_app_create:1112: vcl<20201:0>: sending app attach
> vppcom_app_create:1121: vcl<20201:0>: app_name 'udp6_server', my_client_index 
> 256 (0x100)
> vppcom_epoll_create:2439: vcl<20201:0>: Created vep_idx 0
> vppcom_session_create:1179: vcl<20201:0>: created session 1
> vppcom_session_bind:1317: vcl<20201:0>: session 1 handle 1: binding to local 
> IPv6 address fd0d:edc4::2001::203 port 8092, proto UDP
> vppcom_session_listen:1349: vcl<20201:0>: session 1: sending vpp listen 
> request...
> vcl_session_bound_handler:604: vcl<20201:0>: session 1 [0x1]: listen 
> succeeded!
> vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 1, events 
> 0x1, data 0x1!
> vppcom_session_create:1179: vcl<20201:0>: created session 2
> vppcom_session_bind:1317: vcl<20201:0>: session 2 handle 2: binding to local 
> IPv6 address fd0d:edc4::2001::203 port 8093, proto UDP
> vppcom_session_listen:1349: vcl<20201:0>: session 2: sending vpp listen 
> request...
> vcl_session_app_add_segment_handler:765: vcl<20201:0>: mapped new segment 
> '20190-2' size 134217728
> vcl_session_bound_handler:604: vcl<20201:0>: session 2 [0x2]: listen 
> succeeded!
> vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 2, events 
> 0x1, data 0x2!
> 
> 
> vpp# sh session verbose 2
> [#0][U] fd0d:edc4::2001::203:8092->:::0
> 
>  Rx fifo: cursize 3999125 nitems 399 has_event 1
>   head 2554045 tail 2553170 segment manager 1
>   vpp session 0 thread 0 app session 1 thread 0
>   ooo pool 0 active elts newest 4294967295
>  Tx fifo: cursize 0 nitems 399 has_event 0
>   head 0 tail 0 segment manager 1
>   vpp session 0 thread 0 app session 1 thread 0
>   ooo pool 0 active elts newest 0
> [#0][U] fd0d:edc4::2001::203:8093->:::0
> 
>  Rx fifo: cursize 0 nitems 399 has_event 0
>   head 0 tail 0 segment manager 2
>   vpp session 1 thread 0 app session 2 thread 0
>   ooo pool 0 active elts newest 0
>  Tx fifo: cursize 0 nitems 399 has_event 0
>   head 0 tail 0 segment manager 2
>   vpp session 1 thread 0 app session 2 thread 0
>   ooo pool 0 active elts newest 0
> Thread 0: active sessions 2
> 
> [root@orc01 vcl_test]# cat /etc/vpp/vcl.conf
> vcl {
>   rx-fifo-size 400
>   tx-fifo-size 400
>   app-scope-local
>   app-scope-global
>   api-socket-name /tmp/vpp-api.sock
> }
> [root@orc01 vcl_test]#
> 
> --- Start of thread 0 vpp_main ---
> Packet 1
> 
> 00:09:53:445025: dpdk-input
>   HundredGigabitEthernet12/0/0 rx queue 0
>   buffer 0x88078: 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-19 Thread Raj Kumar
  Hi Florin,
 By using the VCL library in a UDP receiver application, I am able to receive
only ~2 Mbps of traffic. On increasing the traffic, I see an Rx FIFO full error
and the application stops receiving traffic from the session layer.
With TCP, by contrast, I can easily achieve 15 Gbps throughput without tuning
any DPDK parameters. UDP tx also looks fine; from a host application I can
send ~5 Gbps without any issue.

I am running VPP (stable/2001 code) on a RHEL8 server using Mellanox 100G
(mlx5) adapters.
Please advise whether I can use the VCL library to receive high-throughput UDP
traffic (in Gbps). I would be running multiple instances of the host
application to receive data (~50-60 Gbps).

I also tried increasing the Rx FIFO size to 16 MB, but it did not help much.
The host application simply discards the received packets; it is not doing
any packet processing.

[root@orc01 vcl_test]# VCL_DEBUG=2 ./udp6_server_vcl
VCL<20201>: configured VCL debug level (2) from VCL_DEBUG!
VCL<20201>: allocated VCL heap = 0x7f39a17ab010, size 268435456 (0x10000000)
VCL<20201>: configured rx_fifo_size 4000000 (0x3d0900)
VCL<20201>: configured tx_fifo_size 4000000 (0x3d0900)
VCL<20201>: configured app_scope_local (1)
VCL<20201>: configured app_scope_global (1)
VCL<20201>: configured api-socket-name (/tmp/vpp-api.sock)
VCL<20201>: completed parsing vppcom config!
vppcom_connect_to_vpp:480: vcl<20201:0>: app (udp6_server) is connected to
VPP!
vppcom_app_create:1104: vcl<20201:0>: sending session enable
vppcom_app_create:1112: vcl<20201:0>: sending app attach
vppcom_app_create:1121: vcl<20201:0>: app_name 'udp6_server',
my_client_index 256 (0x100)
vppcom_epoll_create:2439: vcl<20201:0>: Created vep_idx 0
vppcom_session_create:1179: vcl<20201:0>: created session 1
vppcom_session_bind:1317: vcl<20201:0>: session 1 handle 1: binding to
local IPv6 address fd0d:edc4::2001::203 port 8092, proto UDP
vppcom_session_listen:1349: vcl<20201:0>: session 1: sending vpp listen
request...
vcl_session_bound_handler:604: vcl<20201:0>: session 1 [0x1]: listen
succeeded!
vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 1, events
0x1, data 0x1!
vppcom_session_create:1179: vcl<20201:0>: created session 2
vppcom_session_bind:1317: vcl<20201:0>: session 2 handle 2: binding to
local IPv6 address fd0d:edc4::2001::203 port 8093, proto UDP
vppcom_session_listen:1349: vcl<20201:0>: session 2: sending vpp listen
request...
vcl_session_app_add_segment_handler:765: vcl<20201:0>: mapped new segment
'20190-2' size 134217728
vcl_session_bound_handler:604: vcl<20201:0>: session 2 [0x2]: listen
succeeded!
vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 2, events
0x1, data 0x2!


vpp# sh session verbose 2
[#0][U] fd0d:edc4::2001::203:8092->:::0

 Rx fifo: cursize 3999125 nitems 399 has_event 1
  head 2554045 tail 2553170 segment manager 1
  vpp session 0 thread 0 app session 1 thread 0
  ooo pool 0 active elts newest 4294967295
 Tx fifo: cursize 0 nitems 399 has_event 0
  head 0 tail 0 segment manager 1
  vpp session 0 thread 0 app session 1 thread 0
  ooo pool 0 active elts newest 0
[#0][U] fd0d:edc4::2001::203:8093->:::0

 Rx fifo: cursize 0 nitems 399 has_event 0
  head 0 tail 0 segment manager 2
  vpp session 1 thread 0 app session 2 thread 0
  ooo pool 0 active elts newest 0
 Tx fifo: cursize 0 nitems 399 has_event 0
  head 0 tail 0 segment manager 2
  vpp session 1 thread 0 app session 2 thread 0
  ooo pool 0 active elts newest 0
Thread 0: active sessions 2

[root@orc01 vcl_test]# cat /etc/vpp/vcl.conf
vcl {
  rx-fifo-size 4000000
  tx-fifo-size 4000000
  app-scope-local
  app-scope-global
  api-socket-name /tmp/vpp-api.sock
}
[root@orc01 vcl_test]#

--- Start of thread 0 vpp_main ---
Packet 1

00:09:53:445025: dpdk-input
  HundredGigabitEthernet12/0/0 rx queue 0
  buffer 0x88078: current data 0, length 1516, buffer-pool 0, ref-count 1,
totlen-nifb 0, trace handle 0x0
  ext-hdr-valid
  l4-cksum-computed l4-cksum-correct
  PKT MBUF: port 0, nb_segs 1, pkt_len 1516
buf_len 2176, data_len 1516, ol_flags 0x180, data_off 128, phys_addr
0x75601e80
packet_type 0x2e1 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0
rss 0x0 fdir.hi 0x0 fdir.lo 0x0
Packet Offload Flags
  PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. is valid
  PKT_RX_L4_CKSUM_GOOD (0x0100) L4 cksum of RX pkt. is valid
Packet Types
  RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet
  RTE_PTYPE_L3_IPV6_EXT_UNKNOWN (0x00e0) IPv6 packet with or without
extension headers
  RTE_PTYPE_L4_UDP (0x0200) UDP packet
  IP6: b8:83:03:79:9f:e4 -> b8:83:03:79:af:8c 802.1q vlan 2001
  UDP: fd0d:edc4::2001::201 -> fd0d:edc4::2001::203
tos 0x00, flow label 0x0, hop limit 64, payload length 1458
  UDP: 56944 -> 8092
length 1458, checksum 0xb22d
00:09:53:445028: 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-15 Thread Raj Kumar
Hi Florin,
Yes, patch [2] resolved the IPv6/UDP receiver issue.
Thanks for your help!

thanks,
-Raj

On Tue, Jan 14, 2020 at 9:35 PM Florin Coras  wrote:

> Hi Raj,
>
> First of all, with this [1], the vcl test app/client can establish a udpc
> connection. Note that udp will most probably lose packets, so large
> exchanges with those apps may not work.
>
> As for the second issue, does [2] solve it?
>
> Regards,
> Florin
>
> [1] https://gerrit.fd.io/r/c/vpp/+/24332
> [2] https://gerrit.fd.io/r/c/vpp/+/24334

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-14 Thread Florin Coras
Hi Raj, 

First of all, with this [1], the vcl test app/client can establish a udpc 
connection. Note that udp will most probably lose packets, so large exchanges 
with those apps may not work. 

As for the second issue, does [2] solve it?

Regards, 
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/24332 
[2] https://gerrit.fd.io/r/c/vpp/+/24334 


Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-14 Thread Raj Kumar
Hi Florin,
Thanks! for the reply.

I realized the issue with the non-connected case. For receiving datagrams,
I was using recvfrom() with the MSG_DONTWAIT flag, and because of that
vppcom_session_recvfrom() was failing. It expects either 0 or the MSG_PEEK
flag:
  if (flags == 0)
    rv = vppcom_session_read (session_handle, buffer, buflen);
  else if (flags & MSG_PEEK)	/* MSG_PEEK == 0x2 */
    rv = vppcom_session_peek (session_handle, buffer, buflen);
  else
    {
      VDBG (0, "Unsupport flags for recvfrom %d", flags);
      return VPPCOM_EAFNOSUPPORT;
    }

 I changed the flag to 0 in recvfrom(); after that, UDP rx works fine,
but only for IPv4.

I am facing a different issue with the IPv6/UDP receiver: I am getting a "no
listener for dst port" error.

Please let me know if I am doing something wrong.
Here are the traces:

[root@orc01 testcode]# VCL_DEBUG=2 LDP_DEBUG=2
LD_PRELOAD=/opt/vpp/build-root/install-vpp-native/vpp/lib/libvcl_ldpreload.so
 VCL_CONFIG=/etc/vpp/vcl.cfg ./udp6_rx
VCL<1164>: configured VCL debug level (2) from VCL_DEBUG!
VCL<1164>: allocated VCL heap = 0x7ff877439010, size 268435456 (0x10000000)
VCL<1164>: configured rx_fifo_size 4000000 (0x3d0900)
VCL<1164>: configured tx_fifo_size 4000000 (0x3d0900)
VCL<1164>: configured app_scope_local (1)
VCL<1164>: configured app_scope_global (1)
VCL<1164>: configured api-socket-name (/tmp/vpp-api.sock)
VCL<1164>: completed parsing vppcom config!
vppcom_connect_to_vpp:549: vcl<1164:0>: app (ldp-1164-app) is connected to
VPP!
vppcom_app_create:1067: vcl<1164:0>: sending session enable
vppcom_app_create:1075: vcl<1164:0>: sending app attach
vppcom_app_create:1084: vcl<1164:0>: app_name 'ldp-1164-app',
my_client_index 0 (0x0)
ldp_init:209: ldp<1164>: configured LDP debug level (2) from env var
LDP_DEBUG!
ldp_init:282: ldp<1164>: LDP initialization: done!
ldp_constructor:2490: LDP<1164>: LDP constructor: done!
socket:974: ldp<1164>: calling vls_create: proto 1 (UDP), is_nonblocking 0
vppcom_session_create:1142: vcl<1164:0>: created session 0
bind:1086: ldp<1164>: fd 32: calling vls_bind: vlsh 0, addr 0x7fff9a93efe0,
len 28
vppcom_session_bind:1280: vcl<1164:0>: session 0 handle 0: binding to local
IPv6 address :: port 8092, proto UDP
vppcom_session_listen:1312: vcl<1164:0>: session 0: sending vpp listen
request...
vcl_session_bound_handler:610: vcl<1164:0>: session 0 [0x1]: listen
succeeded!
bind:1102: ldp<1164>: fd 32: returning 0

vpp# sh app server
Connection  App  Wrk
[0:0][CT:U] :::8092->:::0   ldp-1164-app[shm] 0
[#0][U] :::8092->:::0   ldp-1164-app[shm] 0

vpp# sh err
   Count   Node  Reason
 7   dpdk-input   no error
  2606 ip6-udp-lookup no listener for dst port
 8arp-reply   ARP replies sent
 1  arp-disabled  ARP Disabled on this
interface
13ip6-glean   neighbor solicitations
sent
  2606ip6-input   valid ip6 packets
 4  ip6-local-hop-by-hop  Unknown protocol ip6
local h-b-h packets dropped
  2606 ip6-icmp-error destination unreachable
response sent
40 ip6-icmp-input valid packets
 1 ip6-icmp-input neighbor solicitations
from source not on link
12 ip6-icmp-input neighbor solicitations
for unknown targets
 1 ip6-icmp-input neighbor advertisements
sent
 1 ip6-icmp-input neighbor advertisements
received
40 ip6-icmp-input router advertisements sent
40 ip6-icmp-input router advertisements
received
 1 ip4-icmp-input echo replies sent
89   lldp-input   lldp packets received on
disabled interfaces
  1328llc-input   unknown llc ssap/dsap
vpp#

vpp# show trace
--- Start of thread 0 vpp_main ---
Packet 1

00:23:39:401354: dpdk-input
  HundredGigabitEthernet12/0/0 rx queue 0
  buffer 0x8894e: current data 0, length 1516, buffer-pool 0, ref-count 1,
totlen-nifb 0, trace handle 0x0
  ext-hdr-valid
  l4-cksum-computed l4-cksum-correct
  PKT MBUF: port 0, nb_segs 1, pkt_len 1516
buf_len 2176, data_len 1516, ol_flags 0x180, data_off 128, phys_addr
0x75025400
packet_type 0x2e1 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0
rss 0x0 fdir.hi 0x0 fdir.lo 0x0
Packet Offload Flags
  PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. is valid
  PKT_RX_L4_CKSUM_GOOD (0x0100) L4 cksum of RX pkt. is valid
Packet Types
  RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet
  RTE_PTYPE_L3_IPV6_EXT_UNKNOWN 

Re: [vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-14 Thread Florin Coras
Hi Raj,

Session layer does support connection-less transports but udp does not raise 
accept notifications to vcl. UDPC might, but we haven’t tested udpc with vcl in 
a long time so it might not work properly. 

What was the problem you were hitting in the non-connected case?

Regards,
Florin


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15165): https://lists.fd.io/g/vpp-dev/message/15165
Mute This Topic: https://lists.fd.io/mt/69694900/21656
Mute #vpp-hoststack: https://lists.fd.io/mk?hashtag=vpp-hoststack=1480452
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] #vpp-hoststack - Issue with UDP receiver application using VCL library

2020-01-14 Thread raj . gautam25
Hi,
I am trying some host application tests (using LD_PRELOAD). TCP rx and tx
both work fine. UDP tx also works fine.
The issue is only with UDP rx. In some discussion it was mentioned that the
session layer does not support connection-less transports, so protocols like
udp still need to accept connections and only afterwards read from the fifos.
So, I changed the UDP receiver application to use listen() and accept() before
read(). But I am still having trouble making it run.
After I start udp traffic from the other server, it seems to accept the
connection but never returns from the vppcom_session_accept() function.
VPP release is 19.08.

vpp# sh app server
Connection                              App                          Wrk
[0:0][CT:U] 0.0.0.0:8090->0.0.0.0:0     ldp-36646-app[shm]            0
[#0][U] 0.0.0.0:8090->0.0.0.0:0         ldp-36646-app[shm]            0
vpp#

[root@orc01 testcode]#  VCL_DEBUG=2 LDP_DEBUG=2 
LD_PRELOAD=/opt/vpp/build-root/install-vpp-native/vpp/lib/libvcl_ldpreload.so  
VCL_CONFIG=/etc/vpp/vcl.cfg ./udp_rx
VCL<36646>: configured VCL debug level (2) from VCL_DEBUG!
VCL<36646>: allocated VCL heap = 0x7f77e5309010, size 268435456 (0x10000000)
VCL<36646>: configured rx_fifo_size 4000000 (0x3d0900)
VCL<36646>: configured tx_fifo_size 4000000 (0x3d0900)
VCL<36646>: configured app_scope_local (1)
VCL<36646>: configured app_scope_global (1)
VCL<36646>: configured api-socket-name (/tmp/vpp-api.sock)
VCL<36646>: completed parsing vppcom config!
vppcom_connect_to_vpp:549: vcl<36646:0>: app (ldp-36646-app) is connected to 
VPP!
vppcom_app_create:1067: vcl<36646:0>: sending session enable
vppcom_app_create:1075: vcl<36646:0>: sending app attach
vppcom_app_create:1084: vcl<36646:0>: app_name 'ldp-36646-app', my_client_index 
0 (0x0)
ldp_init:209: ldp<36646>: configured LDP debug level (2) from env var LDP_DEBUG!
ldp_init:282: ldp<36646>: LDP initialization: done!
ldp_constructor:2490: LDP<36646>: LDP constructor: done!
socket:974: ldp<36646>: calling vls_create: proto 1 (UDP), is_nonblocking 0
vppcom_session_create:1142: vcl<36646:0>: created session 0
Socket successfully created..
bind:1086: ldp<36646>: fd 32: calling vls_bind: vlsh 0, addr 0x7fff3f3c1040, 
len 16
vppcom_session_bind:1280: vcl<36646:0>: session 0 handle 0: binding to local 
IPv4 address 0.0.0.0 port 8090, proto UDP
vppcom_session_listen:1312: vcl<36646:0>: session 0: sending vpp listen 
request...
vcl_session_bound_handler:610: vcl<36646:0>: session 0 [0x1]: listen succeeded!
bind:1102: ldp<36646>: fd 32: returning 0
Socket successfully binded..
listen:2005: ldp<36646>: fd 32: calling vls_listen: vlsh 0, n 5
vppcom_session_listen:1308: vcl<36646:0>: session 0 [0x1]: already in listen 
state!
listen:2020: ldp<36646>: fd 32: returning 0
Server listening..
ldp_accept4:2043: ldp<36646>: listen fd 32: calling vppcom_session_accept: 
listen sid 0, ep 0x0, flags 0x3f3c0fc0
vppcom_session_accept:1478: vcl<36646:0>: discarded event: 0