[dpdk-dev] NIC support for HPE Ethernet 10Gb 2-port 560FLR-SFP+ Adapter
Hi,

One of my customers intends to buy the HPE Ethernet 10Gb 2-port 560FLR-SFP+ Adapter (http://www8.hp.com/h20195/v2/getpdf.aspx/c04111435.pdf?ver=8) for running a DPDK-based app. I have never tested my app with this NIC (I have always used the X520 to test my app). If someone has already tried this NIC, can you please confirm, so that I can give the go-ahead for it?

Thanks in advance

Best Regards
-Prashant
[dpdk-dev] Exception Path issue with AF Packet PMD for fragmented packets
Hi,

I have a DPDK-based application where core 0 handles the exception path and the rest of the cores bring in the data from the NIC via the PMD. For the exception path handling I use the tun/tap interface. So the flow is like this:
- Fast path cores bring in the data from the NIC, classify it as exception path data, and write it to the tap (e.g. ICMP ping data)
- The kernel responds and sends data out of the tap
- My application is listening to the tap in a thread running on core 0 itself, takes the data, and writes it to the NIC via the PMD

The above works very well when I use a PMD which takes the NIC over, e.g. the ixgbe PMD with an Intel 82599 NIC. However, when I use the AF Packet PMD, I face an issue: while non-fragmented pings work fine, fragmented pings (pings with big payloads) don't work. I see, via tcpdump on the tap, that the fast path cores write all the fragments to the tap, but the kernel does not respond.

Interestingly, the fragmented usecase works if I put small delays at the following two places:
- just before writing to the tap
- just after reading from the tap and immediately before sending to the NIC with the tx burst PMD API

I believe the problem may not be related to DPDK at all and may have something to do with the Linux kernel or the tun/tap driver, but has anybody faced a similar issue or root-caused it?

Regards
-Prashant
[dpdk-dev] Regarding dpdk_qat example
Hi,

In the dpdk_qat example, the function alloc_memzone_region does the allocation for the memory of a crypto session context. Now, in a real application, sessions will be torn down as well. So if a strategy similar to that of alloc_memzone_region is followed, how can the memory be returned to the memzone? I see the comment over the alloc_memzone_region function which indicates that the allocation is meant to exist for the lifetime of the application and there is no possibility to free it. Or should a real application follow a different strategy for allocating memory to sessions? I would like to know the advice of QAT users.

Regards
-Prashant
[dpdk-dev] Regarding UDP checksum offload
Hi Olivier, On Wed, Jan 28, 2015 at 8:39 PM, Olivier MATZ wrote: > Hi Prashant, > > > On 01/28/2015 03:57 PM, Prashant Upadhyaya wrote: > >> I am using dpdk 1.6r1, intel 82599 NIC. >>>> I have an mbuf, I have hand-constructed a UDP packet (IPv4) in >>>> the data >>>> portion, filled the relevant fields of the headers and I do a tx >>>> burst. No >>>> problems, the destination gets the packet. I filled UDP checksum >>>> as zero >>>> and there was no checksum offloaded in ol_flags. >>>> >>>> Now in the same usecase, I want to offload UDP checksum. >>>> I am aware that the checksum field in UDP header has to be >>>> filled with the >>>> pseudo header checksum, I did that, duly added the >>>> PKT_TX_UDP_CKSUM flag in >>>> ol_flags, did a tx_burst and the packet does not reach the >>>> destination. >>>> >>>> I realized that I have to fill the following fields as well (my >>>> packet does >>>> not have vlan tag) >>>> mbuf->pkt.vlan_macip.f.l2_len >>>> mbuf->pkt.vlan_macip.f.l3_len >>>> >>>> so I filled the l2_len as 14 and l3_len as 20 (IP header with no >>>> options) >>>> Yet the packet did not reach the destination. >>>> >>>> So my question is -- am I filling the l2_len and l3_len >>>> properly ? >>>> Is there anything else to be done before I can get this UDP >>>> checksum >>>> offload to work properly for me. >>>> >>> >>> >>> >>> As far as I remember, this should be working on 1.6r1. >>> When you say "did not reach the destination", do you mean that the >>> packet is not transmitted at all? Or is it transmitted with a wrong >>> checksum? >>> >> >> >> The packet is not transmitted to destination. I cannot see it in tcpdump >> at wireshark. 
>> If I don't do the offload and fill UDP checksum as zero, then >> destination shows the packet in tcpdump >> If I don't do the offload and just fill the pseudo header checksum in >> UDP header (clearly the wrong checksum), then the destination shows the >> packet in tcpdump and wireshark decodes it to complain of wrong UDP >> checksum as expected. >> > > This is strange. I don't see anything obvious in what you are > describing. It looks like the packet is dropped in the driver > or in the hardware. You can check the device statistics. > > Another thing you can do is to retry on the latest stable dpdk which > is known to work (see csumonly.c in test-pmd). > > Let me add further, I am _just_ doing the UDP checksum offload and not >> the IP hdr checksum offload. I calculate and set IP header checksum by >> my own code. I hope that this is acceptable and does not interfere with >> UDP checksum offload >> > > This should not be a problem. > > Indeed it worked with DPDK1.7 and then I retried with DPDK1.6 and it worked there too. Must have been some mistake at my end, may be I did not clean properly when I was experimenting with some values of l2_len. Sorry for the botheration to the list. While we are at it, a quick question -- in case I have an mbuf chain whose payloads constitute a UDP packet, should I setup the ol_flags and the l2_len, l3_len fields only in the first mbuf header of the chain or in all the mbuf headers of the chain ? > Regards, > Olivier > >
[dpdk-dev] Regarding UDP checksum offload
On Wed, Jan 28, 2015 at 6:32 PM, Olivier MATZ wrote: > Hi Prashant, > > > On 01/28/2015 12:25 PM, Prashant Upadhyaya wrote: > >> Hi, >> >> I am aware that this topic has been discussed several times before, but I >> am somehow still stuck with this. >> >> I am using dpdk 1.6r1, intel 82599 NIC. >> I have an mbuf, I have hand-constructed a UDP packet (IPv4) in the data >> portion, filled the relevant fields of the headers and I do a tx burst. No >> problems, the destination gets the packet. I filled UDP checksum as zero >> and there was no checksum offloaded in ol_flags. >> >> Now in the same usecase, I want to offload UDP checksum. >> I am aware that the checksum field in UDP header has to be filled with the >> pseudo header checksum, I did that, duly added the PKT_TX_UDP_CKSUM flag >> in >> ol_flags, did a tx_burst and the packet does not reach the destination. >> >> I realized that I have to fill the following fields as well (my packet >> does >> not have vlan tag) >> mbuf->pkt.vlan_macip.f.l2_len >> mbuf->pkt.vlan_macip.f.l3_len >> >> so I filled the l2_len as 14 and l3_len as 20 (IP header with no options) >> Yet the packet did not reach the destination. >> >> So my question is -- am I filling the l2_len and l3_len properly ? >> Is there anything else to be done before I can get this UDP checksum >> offload to work properly for me. >> > > > As far as I remember, this should be working on 1.6r1. > When you say "did not reach the destination", do you mean that the > packet is not transmitted at all? Or is it transmitted with a wrong > checksum? > The packet is not transmitted to destination. I cannot see it in tcpdump at wireshark. If I don't do the offload and fill UDP checksum as zero, then destination shows the packet in tcpdump If I don't do the offload and just fill the pseudo header checksum in UDP header (clearly the wrong checksum), then the destination shows the packet in tcpdump and wireshark decodes it to complain of wrong UDP checksum as expected. 
Let me add further, I am _just_ doing the UDP checksum offload and not the IP hdr checksum offload. I calculate and set IP header checksum by my own code. I hope that this is acceptable and does not interfere with UDP checksum offload > > I think you should try to reproduce the issue with the latest DPDK > which is known to work with test-pmd (csum forward engine). > > Regards, > Olivier > >
[dpdk-dev] Regarding UDP checksum offload
Hi,

I am aware that this topic has been discussed several times before, but I am somehow still stuck with this.

I am using dpdk 1.6r1 and an Intel 82599 NIC. I have an mbuf in which I have hand-constructed a UDP packet (IPv4) in the data portion, filled the relevant fields of the headers, and done a tx burst. No problems: the destination gets the packet. I filled the UDP checksum as zero and no checksum offload was set in ol_flags.

Now, in the same usecase, I want to offload the UDP checksum. I am aware that the checksum field in the UDP header has to be filled with the pseudo-header checksum. I did that, duly added the PKT_TX_UDP_CKSUM flag in ol_flags, did a tx_burst, and the packet does not reach the destination.

I realized that I have to fill the following fields as well (my packet does not have a vlan tag):
mbuf->pkt.vlan_macip.f.l2_len
mbuf->pkt.vlan_macip.f.l3_len

So I filled l2_len as 14 and l3_len as 20 (IP header with no options). Yet the packet did not reach the destination.

So my question is -- am I filling l2_len and l3_len properly? Is there anything else to be done before I can get this UDP checksum offload to work properly for me?

Regards
-Prashant
[dpdk-dev] Segmentation fault in ixgbe_rxtx_vec.c:444 with 1.8.0
On Wed, Jan 21, 2015 at 7:19 PM, Bruce Richardson < bruce.richardson at intel.com> wrote: > On Tue, Jan 20, 2015 at 11:39:03AM +0100, Martin Weiser wrote: > > Hi again, > > > > I did some further testing and it seems like this issue is linked to > > jumbo frames. I think a similar issue has already been reported by > > Prashant Upadhyaya with the subject 'Packet Rx issue with DPDK1.8'. > > In our application we use the following rxmode port configuration: > > > > .mq_mode= ETH_MQ_RX_RSS, > > .split_hdr_size = 0, > > .header_split = 0, > > .hw_ip_checksum = 1, > > .hw_vlan_filter = 0, > > .jumbo_frame= 1, > > .hw_strip_crc = 1, > > .max_rx_pkt_len = 9000, > > > > and the mbuf size is calculated like the following: > > > > (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) > > > > This works fine with DPDK 1.7 and jumbo frames are split into buffer > > chains and can be forwarded on another port without a problem. > > With DPDK 1.8 and the default configuration (CONFIG_RTE_IXGBE_INC_VECTOR > > enabled) the application sometimes crashes like described in my first > > mail and sometimes packet receiving stops with subsequently arriving > > packets counted as rx errors. When CONFIG_RTE_IXGBE_INC_VECTOR is > > disabled the packet processing also comes to a halt as soon as jumbo > > frames arrive with a the slightly different effect that now > > rte_eth_tx_burst refuses to send any previously received packets. > > > > Is there anything special to consider regarding jumbo frames when moving > > from DPDK 1.7 to 1.8 that we might have missed? > > > > Martin > > > > > > > > On 19.01.15 11:26, Martin Weiser wrote: > > > Hi everybody, > > > > > > we quite recently updated one of our applications to DPDK 1.8.0 and are > > > now seeing a segmentation fault in ixgbe_rxtx_vec.c:444 after a few > minutes. 
> > > I just did some quick debugging and I only have a very limited > > > understanding of the code in question but it seems that the 'continue' > > > in line 445 without increasing 'buf_idx' might cause the problem. In > one > > > debugging session when the crash occurred the value of 'buf_idx' was 2 > > > and the value of 'pkt_idx' was 8965. > > > Any help with this issue would be greatly appreciated. If you need any > > > further information just let me know. > > > > > > Martin > > > > > > > > > Hi Martin, Prashant, > > I've managed to reproduce the issue here and had a look at it. Could you > both perhaps try the proposed change below and see if it fixes the problem > for > you and gives you a working system? If so, I'll submit this as a patch fix > officially - or go back to the drawing board, if not. :-) > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > index b54cb19..dfaccee 100644 > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > @@ -402,10 +402,10 @@ reassemble_packets(struct igb_rx_queue *rxq, struct > rte_mbuf **rx_bufs, > struct rte_mbuf *pkts[RTE_IXGBE_VPMD_RX_BURST]; /*finished pkts*/ > struct rte_mbuf *start = rxq->pkt_first_seg; > struct rte_mbuf *end = rxq->pkt_last_seg; > - unsigned pkt_idx = 0, buf_idx = 0; > + unsigned pkt_idx, buf_idx; > > > - while (buf_idx < nb_bufs) { > + for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) { > if (end != NULL) { > /* processing a split packet */ > end->next = rx_bufs[buf_idx]; > @@ -448,7 +448,6 @@ reassemble_packets(struct igb_rx_queue *rxq, struct > rte_mbuf **rx_bufs, > rx_bufs[buf_idx]->data_len += rxq->crc_len; > rx_bufs[buf_idx]->pkt_len += rxq->crc_len; > } > - buf_idx++; > } > > /* save the partial packet for next time */ > > > Regards, > /Bruce > > Hi Bruce, I am afraid your patch did not work for me. In my case I am not trying to receive jumbo frames but normal frames. 
They are not received at my application. Further, your patched function is not getting stimulated in my usecase. Regards -Prashant
[dpdk-dev] Packet Rx issue with DPDK1.8
Hi Bruce,

I tried your suggestion. When I disable the _vec function with the following config change, the usecase works for me, so it points to some issue in the _vec function: I changed CONFIG_RTE_IXGBE_INC_VECTOR=y to CONFIG_RTE_IXGBE_INC_VECTOR=n. There appears to be some gotcha in that function, therefore; somebody may want to run some tests again, perhaps with jumbo frames enabled (and sending small normal frames).

Regards
-Prashant

-Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bruce Richardson Sent: Thursday, January 08, 2015 7:17 PM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Packet Rx issue with DPDK1.8 On Thu, Jan 08, 2015 at 01:40:54PM +0530, Prashant Upadhyaya wrote: > Hi, > > I am migrating from DPDK1.7 to DPDK1.8. > My application works fine with DPDK1.7. > I am using 10 Gb Intel 82599 NIC. > I have jumbo frames enabled, with max_rx_pkt_len = 10232 My mbuf > dataroom size is 2048+headroom So naturally the > ixgbe_recv_scattered_pkts driver function is triggered for receiving. > This works with DPDK1.7 and my app receives packets. > However, it does not work with DPDK1.8 somehow.I don't receive any packets. > > So, I increased the mbuf data room size in my application to a higher > value so that the function ixgbe_recv_scattered_pkts is not enabled (I > believe ixgbe_recv_pkts will be used in this case), and now my > application starts getting packets with DPDK1.8 and the entire > application usecase works fine (ofcourse my application had to adapt > to the mbuf structure changes which I have done) > > I am kind of coming to the conclusion that ixgbe_recv_scattered_pkts > has something broken in DPDK1.8 as compared to the earlier versions by > the above empirical evidence. > > Has anybody else faced a similar issue ? > > Regards > -Prashant This is worrying to hear. In 1.8, there is now the receive_scattered_pkts_vec function which manages chained mbufs.
This was tested - both in development and in validation - before release, but since it's new code, it's entirely possible we missed something. Can you perhaps try disabling the vector driver in 1.8, and see if receiving scattered packets/chained mbufs works? Regards, /Bruce "DISCLAIMER: This message is proprietary to Aricent and is intended solely for the use of the individual to whom it is addressed. It may contain privileged or confidential information and should not be circulated or used for any purpose other than for what it is intended. If you have received this message in error, please notify the originator immediately. If you are not the intended recipient, you are notified that you are strictly prohibited from using, copying, altering, or disclosing the contents of this message. Aricent accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus."
[dpdk-dev] Packet Rx issue with DPDK1.8
Hi,

I am migrating from DPDK1.7 to DPDK1.8. My application works fine with DPDK1.7. I am using a 10 Gb Intel 82599 NIC. I have jumbo frames enabled, with max_rx_pkt_len = 10232. My mbuf dataroom size is 2048+headroom, so naturally the ixgbe_recv_scattered_pkts driver function is triggered for receiving. This works with DPDK1.7 and my app receives packets. However, it does not work with DPDK1.8 somehow; I don't receive any packets.

So, I increased the mbuf data room size in my application to a higher value so that the function ixgbe_recv_scattered_pkts is not enabled (I believe ixgbe_recv_pkts will be used in this case), and now my application starts getting packets with DPDK1.8 and the entire application usecase works fine (of course my application had to adapt to the mbuf structure changes, which I have done).

By the above empirical evidence, I am kind of coming to the conclusion that ixgbe_recv_scattered_pkts has something broken in DPDK1.8 as compared to the earlier versions.

Has anybody else faced a similar issue?

Regards
-Prashant
[dpdk-dev] Help with compilation of .s files with DPDK build system
Hi,

I have an application which consists of *.c files and I have been using the DPDK build system happily so far. The way I do it is: I include rte.vars.mk and rte.extlib.mk in my application Makefile, I set the LIB variable to the .a name, and I set SRCS-y to my source files. And that's it, it works.

Now, recently, I need to introduce the compilation of a .s file (not .S). I can easily compile it by hand using gcc, but I can't seem to find a trick to compile it within the DPDK environment. Any advice?

Regards
-Prashant
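Two hedged workarounds come to mind, neither confirmed against every DPDK release. The simplest is to rename the file to .S, since the stock rules handle .S sources and the extra C-preprocessor pass is usually harmless. If the file must stay .s, an explicit rule in the application Makefile may work; the fragment below is a hypothetical sketch (file names are placeholders, and whether a hand-added object is picked up this way can vary between DPDK versions):

```make
# Application Makefile fragment (hypothetical; adapt names and paths).
include $(RTE_SDK)/mk/rte.vars.mk

LIB = libmyapp.a
SRCS-y := main.c

# foo.s is plain assembly, so it bypasses SRCS-y and gets its own rule.
OBJS-y += foo.o

foo.o: $(SRCDIR)/foo.s
	$(CC) $(CFLAGS) -c $< -o $@

include $(RTE_SDK)/mk/rte.extlib.mk
```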
[dpdk-dev] Regarding Hardware Crypto Accelerator
Hi,

Currently I have a machine with a Xeon processor and it does not have a hardware crypto accelerator. I am running my DPDK-based application successfully on it. Now I want to use a hardware crypto accelerator with DPDK for the IPSec operations in my application. I am planning to buy the following -- PE3iS4CO2 -- Silicom's Quad HW Accelerator Crypto Compression PCI Express Gen 3.0 Server Adapter / ColetoCreek SKU2. Can somebody advise if this would work properly with DPDK, or is there any other catch I should be careful about before I go ahead and invest in the equipment? I would really appreciate any advice.

Regards
-Prashant
[dpdk-dev] Would DPDK run on AMD processors
Hi,

Has anybody attempted to run DPDK on AMD processors? Does it run straight away, or would there be some obvious issues where porting would be needed? I welcome any comments.

Regards
-Prashant
[dpdk-dev] DPDK on ARM
Hi guys,

Does DPDK also work on the ARM processor? If it does not, can anybody suggest what it would take to make it work on ARM (what would be the challenges and so forth, or is it even worth it)?

Regards
-Prashant
[dpdk-dev] Regarding Mellanox CX3 NIC
Hi,

I have a usecase coming up where I have to use DPDK with the Mellanox CX3 NIC. I see on dpdk.org that there is a PMD available for it: http://dpdk.org/about#6WIND Kindly let me know if the above PMD is open source or has to be purchased.

Regards
-Prashant
[dpdk-dev] compilation error with 1.6.0r2
[resending on the list]

Hi,

I recently picked up 1.6.0r2 from dpdk.org, tried to compile it the usual way, and ran into the following compilation errors. I am aware I can sidestep these by getting the compiler to treat them as warnings, but they did not use to come with 1.6.0r1, so I wanted to report them here. I am using Fedora 18 to compile this version of DPDK.

Regards
-Prashant

[root at localhost ~]# cd dpdk-1.6.0r2/
[root at localhost dpdk-1.6.0r2]# make install T=x86_64-default-linuxapp-gcc
== Installing x86_64-default-linuxapp-gcc
== Build scripts
== Build scripts/testhost
== Build lib
== Build lib/librte_eal
== Build lib/librte_eal/common
== Build lib/librte_eal/linuxapp
== Build lib/librte_eal/linuxapp/igb_uio
  Building modules, stage 2.
  MODPOST 1 modules
== Build lib/librte_eal/linuxapp/eal
== Build lib/librte_eal/linuxapp/kni
  CC [M] /root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.o
/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.c: In function 'igb_get_eee':
/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.c:2441:4: error: implicit declaration of function 'mmd_eee_adv_to_ethtool_adv_t' [-Werror=implicit-function-declaration]
/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.c: In function 'igb_set_eee':
/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.c:2551:2: error: implicit declaration of function 'ethtool_adv_to_mmd_eee_adv_t' [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[10]: *** [/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/igb_ethtool.o] Error 1
make[9]: *** [_module_/root/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni] Error 2
make[8]: *** [sub-make] Error 2
make[7]: *** [rte_kni.ko] Error 2
make[6]: *** [kni] Error 2
make[5]: *** [linuxapp] Error 2
make[4]: *** [librte_eal] Error 2
make[3]: *** [lib] Error 2
make[2]: *** [all] Error 2
make[1]: *** [x86_64-default-linuxapp-gcc_install] Error 2
make: *** [install] Error 2
[root at localhost dpdk-1.6.0r2]#
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, Kindly let me know if the multi-segment support for vmxnet3 pmd is already in, in a formal release of DPDK. Or which formal release you are targeting this for. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Friday, March 21, 2014 1:42 PM To: Stephen Hemminger Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Stephen, I believe the 1.6.0r2 is baking. It would be great if you could enhance the vmxnet3 driver on the above with multi-segment support. Any serious usecase ends up using multi-segment, so would be great if r2 can capture it. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Tuesday, March 11, 2014 10:57 AM To: Stephen Hemminger Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Stephen, This is great news ! I can wait for a formal release of DPDK with your driver. Please let me know when is the release expected. I will happily migrate to that. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue ! > > Now then, this multi-segment not being supported in vmxnet3 driver is a big > party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use > of multisegment buffers for sending out the data, so my usecase has failed > and I will have to fix that. > > Also, can you please adivse how much is the max data rates you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. 
> > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment, if you want I will backport that feature first. === Please refer to http://www.aricent.com/legal/email_disclaimer.html for important disclosures regarding this electronic communication. ===
[dpdk-dev] IGB_UIO port unbinding
Hi Anatoly, I might have used the term 'initialization' in a wrong fashion. But I have confirmed, the issue was related to this commit (which Thomas brought to my notice) -- http://dpdk.org/browse/dpdk/commit/?id=18f02ff75949de9c2468 The above should get you the context for the original issue. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Burakov, Anatoly Sent: Friday, April 11, 2014 2:29 PM To: dev at dpdk.org Subject: Re: [dpdk-dev] IGB_UIO port unbinding Hi Prashant, > There was a usecase with ESXi VMXNET3 NIC where I had to use this > parameter set to Y to make it work. > So kindly ensure that the initialization of vmxnet3 NIC is not vulnerable. > Did you try it on the latest 1.6.x code (that option was removed)? More to the point, I can't see how this would be relevant to vmxnet3 initialization. All this config option does is automatically unbinds NICs for you, which is something you can do manually with the included igb_uio_bind.py/pci_unbind.py script anyway. Could you please elaborate on how exactly this option helped with vmxnet3 initialization? Best regards, Anatoly Burakov DPDK SW Engineer -- Intel Shannon Limited Registered in Ireland Registered Office: Collinstown Industrial Park, Leixlip, County Kildare Registered Number: 308263 Business address: Dromore House, East Park, Shannon, Co. Clare
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, I believe the 1.6.0r2 is baking. It would be great if you could enhance the vmxnet3 driver on the above with multi-segment support. Any serious usecase ends up using multi-segment, so would be great if r2 can capture it. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Tuesday, March 11, 2014 10:57 AM To: Stephen Hemminger Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Stephen, This is great news ! I can wait for a formal release of DPDK with your driver. Please let me know when is the release expected. I will happily migrate to that. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue ! > > Now then, this multi-segment not being supported in vmxnet3 driver is a big > party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use > of multisegment buffers for sending out the data, so my usecase has failed > and I will have to fix that. > > Also, can you please adivse how much is the max data rates you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. > > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment, if you want I will backport that feature first.
[dpdk-dev] L2FWD Sample Application stops receiving packets after sometime
Hi Neeraj, I am glad your usecase works. Please do let me know what is the maximum throughput you are able to achieve with vmxnet3 (assuming your underlying physical NIC is 10 Gig), it will be interesting to see the performance. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jain, Neeraj 3. (NSN - IN/Bangalore) Sent: Thursday, March 20, 2014 5:00 PM To: dev at dpdk.org Subject: Re: [dpdk-dev] L2FWD Sample Application stops receiving packets after sometime Found the Problem. Few packet buffers were not being freed which lead to slow leak of mbufs. ~Neeraj -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of ext Jain, Neeraj 3. (NSN - IN/Bangalore) Sent: Thursday, March 20, 2014 2:42 PM To: dev at dpdk.org Subject: [dpdk-dev] L2FWD Sample Application stops receiving packets after sometime Hi, I am currently running few trials with DPDK sample application for one of my project requirements. Below are the system details under which the sample application (L2 Forward) is being run. Platform: Linux 2.6.32-358.17.1.el6.x86_64 VMWare: ESXI 5.1 Driver: vmxnet3-usermap L2 FWD program is being run with a load of ~100K packets per second on RX queue. The sample application successfully forwards the packets to other end for some time. However, after around 1 hour run, I see that rte_eth_rx_burst returns 0 packets received, though the load generator continuously sends packet to the L2 FWD program. The statistics reported by the sample application also shows the "Packets received:" value stops incrementing. I also used "rte_eth_stats_get" api to check if any packets are being received on the interface. This api shows the packets are being received at the interface, however rte_eth_rx_burst does not pick these packets. Can you please help me as to why this behavior is seen. Also, please let me know if you need further details. 
Thanks ~Neeraj
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, Can you please advise, from your experience, what kind of data rates you have been able to achieve with vmxnet3? Also, did you have to do any special optimizations at the vmnic of ESXi for the above? Kindly let me know. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Tuesday, March 11, 2014 10:57 AM To: Stephen Hemminger Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Stephen, This is great news ! I can wait for a formal release of DPDK with your driver. Please let me know when the release is expected. I will happily migrate to that. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue ! > > Now then, this multi-segment not being supported in vmxnet3 driver is a big > party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use > of multisegment buffers for sending out the data, so my usecase has failed > and I will have to fix that. > > Also, can you please advise what is the max data rate you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. > > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment, if you want I will backport that feature first.
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, This is great news ! I can wait for a formal release of DPDK with your driver. Please let me know when the release is expected. I will happily migrate to that. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue ! > > Now then, this multi-segment not being supported in vmxnet3 driver is a big > party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use > of multisegment buffers for sending out the data, so my usecase has failed > and I will have to fix that. > > Also, can you please advise what is the max data rate you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. > > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment, if you want I will backport that feature first.
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi, Regarding performance with the vmxnet3 driver, the programmer's guide says thus -- " Currently, the driver provides basic support for using the device in an Intel(r) DPDK application running on a guest OS. Optimization is needed on the backend, that is, the VMware* ESXi vmkernel switch, to achieve optimal performance end-to-end. " Can someone advise on the techniques for 'backend optimization on the vmkernel switch'? I ran some initial tests without any attempt at the above optimization, and the data rates I am achieving with vmxnet3 are not very encouraging. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Monday, March 10, 2014 1:31 PM To: Srinivasan J Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Srini, Thanks, I could also make it work, thanks to your cue ! Now then, this multi-segment not being supported in vmxnet3 driver is a big party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use of multisegment buffers for sending out the data, so my usecase has failed and I will have to fix that. Also, can you please advise what is the max data rate you have been able to achieve with one vmxnet3 10G port. Thanks a lot for the advice once again. Regards -Prashant -Original Message- From: Srinivasan J [mailto:srinid...@gmail.com] Sent: Sunday, March 09, 2014 12:38 AM To: Prashant Upadhyaya Cc: David Marchand; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Prashant, I was also able to hit the issue you're hitting, using an Esxi 5.1.0 evaluation and a Fedora 20 X86_64 guest. I was able to fix the issue by setting the CONFIG_RTE_EAL_UNBIND_PORTS=y option in the defconfig_x86_64-default-linuxapp-gcc configuration file. 
Issue seen EAL: PCI device :03:00.0 on NUMA socket -1 EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd EAL: Device is blacklisted, not initializing EAL: PCI device :0b:00.0 on NUMA socket -1 EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd Program received signal SIGSEGV, Segmentation fault. eth_vmxnet3_dev_init (eth_drv=, eth_dev=0x754480 ) at /root/source/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c:218 218 ver = VMXNET3_READ_BAR1_REG(hw, VMXNET3_REG_VRRS); Missing separate debuginfos, use: debuginfo-install glibc-2.18-11.fc20.x86_64 (gdb) p hw $1 = (struct vmxnet3_hw *) 0x7fffd8fc1040 (gdb) p *hw $2 = {hw_addr0 = 0x0, hw_addr1 = 0x0, back = 0x0, device_id = 1968, vendor_id = 5549, subsystem_device_id = 0, subsystem_vendor_id = 0, adapter_stopped = 0, perm_addr = "\000\000\000\000\000", num_tx_queues = 1 '\001', num_rx_queues = 1 '\001', bufs_per_pkt = 1 '\001', cur_mtu = 0, tqd_start = 0x0, rqd_start = 0x0, shared = 0x0, sharedPA = 0, queueDescPA = 0, queue_desc_len = 0, rss_conf = 0x0, rss_confPA = 0, mf_table = 0x0} (gdb) #define VMXNET3_PCI_BAR1_REG_ADDR(hw, reg) \ ((volatile uint32_t *)((char *)(hw)->hw_addr1 + (reg))) #define VMXNET3_READ_BAR1_REG(hw, reg) \ vmxnet3_read_addr(VMXNET3_PCI_BAR1_REG_ADDR((hw), (reg))) lib/librte_pmd_vmxnet3/vmxnet3_ethdev.h Issue not seen after enabling CONFIG_RTE_EAL_UNBIND_PORTS=y == [root at localhost build]# ./l2fwd -c 0xf -b :03:00.0 -n 1 -- -p 0x6 EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 2 on socket 0 EAL: Detected lcore 3 as core 3 on socket 0 EAL: Skip lcore 4 (not detected) EAL: Skip lcore 5 (not detected) EAL: Skip lcore 6 (not detected) EAL: Skip lcore 7 (not detected) EAL: Skip lcore 8 (not detected) EAL: Skip lcore 9 (not detected) EAL: Skip lcore 10 (not detected) EAL: Skip lcore 11 (not detected) EAL: Skip lcore 12 (not detected) EAL: Skip lcore 13 (not detected) EAL: Skip lcore 14 (not detected) EAL: Skip lcore 15 (not detected) EAL: 
Skip lcore 16 (not detected) EAL: Skip lcore 17 (not detected) EAL: Skip lcore 18 (not detected) EAL: Skip lcore 19 (not detected) EAL: Skip lcore 20 (not detected) EAL: Skip lcore 21 (not detected) EAL: Skip lcore 22 (not detected) EAL: Skip lcore 23 (not detected) EAL: Skip lcore 24 (not detected) EAL: Skip lcore 25 (not detected) EAL: Skip lcore 26 (not detected) EAL: Skip lcore 27 (not detected) EAL: Skip lcore 28 (not detected) EAL: Skip lcore 29 (not detected) EAL: Skip lcore 30 (not detected) EAL: Skip lcore 31 (not detected) EAL: Skip lcore 32 (not detected) EAL: Skip lcore 33 (not detected) EAL: Skip lcore 34 (not detected) EAL: Skip lcore 35 (not detected) EAL: Skip lcore 36 (not detected) EAL: Skip lcore 37 (not detected) EAL: Skip lcore 38 (not detected) EAL: Skip lcore 39 (not detected) EAL: Skip lcore 40 (not detected) EAL: Skip lcore 41 (not detected) EAL: Skip lcore 42 (not detected) EAL: Skip lcore 43 (not detected) EAL: Skip lcore 44 (not detected) EAL: Skip lcore 45 (not de
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Srini, Thanks, I could also make it work, thanks to your cue ! Now then, this multi-segment not being supported in vmxnet3 driver is a big party-pooper for me. Unfortunately in my usecase, I do indeed make heavy use of multisegment buffers for sending out the data, so my usecase has failed and I will have to fix that. Also, can you please advise what is the max data rate you have been able to achieve with one vmxnet3 10G port. Thanks a lot for the advice once again. Regards -Prashant -Original Message- From: Srinivasan J [mailto:srinid...@gmail.com] Sent: Sunday, March 09, 2014 12:38 AM To: Prashant Upadhyaya Cc: David Marchand; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Prashant, I was also able to hit the issue you're hitting, using an Esxi 5.1.0 evaluation and a Fedora 20 X86_64 guest. I was able to fix the issue by setting the CONFIG_RTE_EAL_UNBIND_PORTS=y option in the defconfig_x86_64-default-linuxapp-gcc configuration file. Issue seen EAL: PCI device :03:00.0 on NUMA socket -1 EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd EAL: Device is blacklisted, not initializing EAL: PCI device :0b:00.0 on NUMA socket -1 EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd Program received signal SIGSEGV, Segmentation fault. 
eth_vmxnet3_dev_init (eth_drv=, eth_dev=0x754480 ) at /root/source/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c:218 218 ver = VMXNET3_READ_BAR1_REG(hw, VMXNET3_REG_VRRS); Missing separate debuginfos, use: debuginfo-install glibc-2.18-11.fc20.x86_64 (gdb) p hw $1 = (struct vmxnet3_hw *) 0x7fffd8fc1040 (gdb) p *hw $2 = {hw_addr0 = 0x0, hw_addr1 = 0x0, back = 0x0, device_id = 1968, vendor_id = 5549, subsystem_device_id = 0, subsystem_vendor_id = 0, adapter_stopped = 0, perm_addr = "\000\000\000\000\000", num_tx_queues = 1 '\001', num_rx_queues = 1 '\001', bufs_per_pkt = 1 '\001', cur_mtu = 0, tqd_start = 0x0, rqd_start = 0x0, shared = 0x0, sharedPA = 0, queueDescPA = 0, queue_desc_len = 0, rss_conf = 0x0, rss_confPA = 0, mf_table = 0x0} (gdb) #define VMXNET3_PCI_BAR1_REG_ADDR(hw, reg) \ ((volatile uint32_t *)((char *)(hw)->hw_addr1 + (reg))) #define VMXNET3_READ_BAR1_REG(hw, reg) \ vmxnet3_read_addr(VMXNET3_PCI_BAR1_REG_ADDR((hw), (reg))) lib/librte_pmd_vmxnet3/vmxnet3_ethdev.h Issue not seen after enabling CONFIG_RTE_EAL_UNBIND_PORTS=y == [root at localhost build]# ./l2fwd -c 0xf -b :03:00.0 -n 1 -- -p 0x6 EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 2 on socket 0 EAL: Detected lcore 3 as core 3 on socket 0 EAL: Skip lcore 4 (not detected) EAL: Skip lcore 5 (not detected) EAL: Skip lcore 6 (not detected) EAL: Skip lcore 7 (not detected) EAL: Skip lcore 8 (not detected) EAL: Skip lcore 9 (not detected) EAL: Skip lcore 10 (not detected) EAL: Skip lcore 11 (not detected) EAL: Skip lcore 12 (not detected) EAL: Skip lcore 13 (not detected) EAL: Skip lcore 14 (not detected) EAL: Skip lcore 15 (not detected) EAL: Skip lcore 16 (not detected) EAL: Skip lcore 17 (not detected) EAL: Skip lcore 18 (not detected) EAL: Skip lcore 19 (not detected) EAL: Skip lcore 20 (not detected) EAL: Skip lcore 21 (not detected) EAL: Skip lcore 22 (not detected) EAL: Skip lcore 23 (not detected) EAL: Skip lcore 24 
(not detected) EAL: Skip lcore 25 (not detected) EAL: Skip lcore 26 (not detected) EAL: Skip lcore 27 (not detected) EAL: Skip lcore 28 (not detected) EAL: Skip lcore 29 (not detected) EAL: Skip lcore 30 (not detected) EAL: Skip lcore 31 (not detected) EAL: Skip lcore 32 (not detected) EAL: Skip lcore 33 (not detected) EAL: Skip lcore 34 (not detected) EAL: Skip lcore 35 (not detected) EAL: Skip lcore 36 (not detected) EAL: Skip lcore 37 (not detected) EAL: Skip lcore 38 (not detected) EAL: Skip lcore 39 (not detected) EAL: Skip lcore 40 (not detected) EAL: Skip lcore 41 (not detected) EAL: Skip lcore 42 (not detected) EAL: Skip lcore 43 (not detected) EAL: Skip lcore 44 (not detected) EAL: Skip lcore 45 (not detected) EAL: Skip lcore 46 (not detected) EAL: Skip lcore 47 (not detected) EAL: Skip lcore 48 (not detected) EAL: Skip lcore 49 (not detected) EAL: Skip lcore 50 (not detected) EAL: Skip lcore 51 (not detected) EAL: Skip lcore 52 (not detected) EAL: Skip lcore 53 (not detected) EAL: Skip lcore 54 (not detected) EAL: Skip lcore 55 (not detected) EAL: Skip lcore 56 (not detected) EAL: Skip lcore 57 (not detected) EAL: Skip lcore 58 (not detected) EAL: Skip lcore 59 (not detected) EAL: Skip lcore 60 (not detected) EAL: Skip lcore 61 (not detected) EAL: Skip lcore 62 (not detected) EAL: Skip lcore 63 (not detected) EAL: Setting up memory... EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7f3a76a0 (size = 0x20) EAL: Ask a virtual area of 0x7c0 bytes EAL: Virtual area found at 0x7f3a6ec0 (size = 0x7c00
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
And if it is any help, here is the startup log -- EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 2 on socket 0 EAL: Detected lcore 3 as core 3 on socket 0 EAL: Detected lcore 4 as core 4 on socket 0 EAL: Detected lcore 5 as core 5 on socket 0 EAL: Detected lcore 6 as core 6 on socket 0 EAL: Detected lcore 7 as core 7 on socket 0 EAL: Skip lcore 8 (not detected) EAL: Skip lcore 9 (not detected) EAL: Skip lcore 10 (not detected) EAL: Skip lcore 11 (not detected) EAL: Skip lcore 12 (not detected) EAL: Skip lcore 13 (not detected) EAL: Skip lcore 14 (not detected) EAL: Skip lcore 15 (not detected) EAL: Skip lcore 16 (not detected) EAL: Skip lcore 17 (not detected) EAL: Skip lcore 18 (not detected) EAL: Skip lcore 19 (not detected) EAL: Skip lcore 20 (not detected) EAL: Skip lcore 21 (not detected) EAL: Skip lcore 22 (not detected) EAL: Skip lcore 23 (not detected) EAL: Skip lcore 24 (not detected) EAL: Skip lcore 25 (not detected) EAL: Skip lcore 26 (not detected) EAL: Skip lcore 27 (not detected) EAL: Skip lcore 28 (not detected) EAL: Skip lcore 29 (not detected) EAL: Skip lcore 30 (not detected) EAL: Skip lcore 31 (not detected) EAL: Skip lcore 32 (not detected) EAL: Skip lcore 33 (not detected) EAL: Skip lcore 34 (not detected) EAL: Skip lcore 35 (not detected) EAL: Skip lcore 36 (not detected) EAL: Skip lcore 37 (not detected) EAL: Skip lcore 38 (not detected) EAL: Skip lcore 39 (not detected) EAL: Skip lcore 40 (not detected) EAL: Skip lcore 41 (not detected) EAL: Skip lcore 42 (not detected) EAL: Skip lcore 43 (not detected) EAL: Skip lcore 44 (not detected) EAL: Skip lcore 45 (not detected) EAL: Skip lcore 46 (not detected) EAL: Skip lcore 47 (not detected) EAL: Skip lcore 48 (not detected) EAL: Skip lcore 49 (not detected) EAL: Skip lcore 50 (not detected) EAL: Skip lcore 51 (not detected) EAL: Skip lcore 52 (not detected) EAL: Skip lcore 53 (not detected) EAL: Skip lcore 54 (not detected) 
EAL: Skip lcore 55 (not detected) EAL: Skip lcore 56 (not detected) EAL: Skip lcore 57 (not detected) EAL: Skip lcore 58 (not detected) EAL: Skip lcore 59 (not detected) EAL: Skip lcore 60 (not detected) EAL: Skip lcore 61 (not detected) EAL: Skip lcore 62 (not detected) EAL: Skip lcore 63 (not detected) EAL: Setting up memory... EAL: Ask a virtual area of 0x8000 bytes EAL: Virtual area found at 0x7f848ae0 (size = 0x8000) EAL: Requesting 1024 pages of size 2MB from socket 0 EAL: TSC frequency is ~200 KHz EAL: Master core 0 is ready (tid=b3f3f00) EAL: Core 1 is ready (tid=8a1f2700) EAL: Core 2 is ready (tid=899f1700) EAL: Core 3 is ready (tid=891f0700) EAL: Core 4 is ready (tid=889ef700) EAL: Core 5 is ready (tid=7bfff700) EAL: Core 6 is ready (tid=7b7fe700) EAL: Core 7 is ready (tid=7affd700) Pool initialized Global Variables initialized PMD: rte_vmxnet3_pmd_init(): >> EAL: PCI device :0b:00.0 on NUMA socket -1 EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd PMD: eth_vmxnet3_dev_init(): >> Segmentation fault (core dumped) -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Thursday, March 06, 2014 12:20 PM To: David Marchand Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi, Some further update on the crash I am facing. I am using DPDK1.6.0r1 and take over the vmxnet3 with igb_uio and then start the application. (so no external ko or vmxnet usermap etc.) During the port initializations, the crash is happening in the following function - eth_vmxnet3_dev_init and the crash is happening at the following line - /* Check h/w version compatibility with driver. */ ver = VMXNET3_READ_BAR1_REG(hw, VMXNET3_REG_VRRS); Any hints regarding what could be wrong ? Regards -Prashant From: Prashant Upadhyaya Sent: Wednesday, March 05, 2014 9:01 PM To: 'David Marchand' Cc: Srinivasan J; dev at dpdk.org Subject: RE: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? 
Hi David, The compilation error with debug flags on was that the functions -- vmxnet3_txq_dump, vmxnet3_rxq_dump -- are defined but not used. Not a serious error, I will try to get rid of the compiler flag which generates this. However, I must reiterate, I _did_ bind my vmxnet3 device with igb_uio (and I did not use any .so, because I was intending to use the builtin vmxnet3 driver of dpdk 1.6.0r1), the bind succeeded, but then when I started the application, the dev init for vmxnet3 gave a core dump. Your patch and solution seem to be suggesting the reverse, i.e. when I don't bind with igb_uio but try to use the native driver. So please do try the above combination as well. Regards -Prashant From: David Marchand [mailto:david.march...@6wind.com] Sent: Wednesday, March 05, 2014 8:41 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi, Some further update on the crash I am facing. I am using DPDK1.6.0r1 and take over the vmxnet3 with igb_uio and then start the application. (so no external ko or vmxnet usermap etc.) During the port initializations, the crash is happening in the following function - eth_vmxnet3_dev_init and the crash is happening at the following line - /* Check h/w version compatibility with driver. */ ver = VMXNET3_READ_BAR1_REG(hw, VMXNET3_REG_VRRS); Any hints regarding what could be wrong ? Regards -Prashant From: Prashant Upadhyaya Sent: Wednesday, March 05, 2014 9:01 PM To: 'David Marchand' Cc: Srinivasan J; dev at dpdk.org Subject: RE: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi David, The compilation error with debug flags on was that the functions -- vmxnet3_txq_dump, vmxnet3_rxq_dump -- are defined but not used. Not a serious error, I will try to get rid of the compiler flag which generates this. However, I must reiterate, I _did_ bind my vmxnet3 device with igb_uio (and I did not use any .so, because I was intending to use the builtin vmxnet3 driver of dpdk 1.6.0r1), the bind succeeded, but then when I started the application, the dev init for vmxnet3 gave a core dump. Your patch and solution seem to be suggesting the reverse, i.e. when I don't bind with igb_uio but try to use the native driver. So please do try the above combination as well. Regards -Prashant From: David Marchand [mailto:david.march...@6wind.com] Sent: Wednesday, March 05, 2014 8:41 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hello Prashant, On Wed, Mar 5, 2014 at 3:28 PM, Prashant Upadhyaya <prashant.upadhyaya at aricent.com> wrote: Hi, I am also keen to know the answer to the question posted by Srini. 
The real question is -- is dpdk1.6.0r1 self-sufficient so that I don't need any extensions etc., or do I still need something from outside like the usermap kernel driver etc. Secondly, if I turn on all the debug options for the vmxnet3 pmd in the config file, 1.6.0r1 compilation runs into a problem and reports a function which is defined but not used. Can you send your build error ? (maybe in a separate thread ?) I am trying to bring up DPDK inside Fedora18 Guest on ESXi -- when I used DPDK1.6.0r1 (without debug options turned on for vmxnet3 pmd) the igb_uio could take over the vmxnet3 NIC but I encountered a core dump in the dev init function for the vmxnet3 driver -- anybody encountered a similar issue ? I encountered these problems as well. - igb_uio module does not check if you disable vmxnet3-uio pmds, it will always try to take over vmxnet3 devices. I have a patch waiting in my working dir to cleanly disable vmxnet3-uio pmd. - If you don't bind vmxnet3 devices to uio, but forget to enable vmxnet3-usermap pmd (by specifying -d librte_pmd_vmxnet3.so), then internal vmxnet3-uio pmd will try to initialise and crash. I did not look any deeper into this, the easiest way is to disable vmxnet3-uio pmd + apply the patch I will send in a few minutes, as a first workaround. Regards, -- David Marchand
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi, I am also keen to know the answer to the question posted by Srini. The real question is -- is dpdk1.6.0r1 self-sufficient so that I don't need any extensions etc., or do I still need something from outside like the usermap kernel driver etc. Secondly, if I turn on all the debug options for the vmxnet3 pmd in the config file, 1.6.0r1 compilation runs into a problem and reports a function which is defined but not used. I am trying to bring up DPDK inside Fedora18 Guest on ESXi -- when I used DPDK1.6.0r1 (without debug options turned on for vmxnet3 pmd) the igb_uio could take over the vmxnet3 NIC but I encountered a core dump in the dev init function for the vmxnet3 driver -- anybody encountered a similar issue ? Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Srinivasan J Sent: Tuesday, March 04, 2014 8:53 PM To: dev at dpdk.org Subject: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? I want to try dpdk with vmxnet3 in Vmware Esxi 5.1. I see that the latest dpdk 1.6.0r1 includes a vmxnet3 pmd. The vmxnet3-usermap-1.1.tar.gz as well includes a vmxnet3 pmd driver. I'm confused as to which vmxnet3 pmd driver to use along with which vmxnet3 kernel driver (vmxnet3 native kernel driver or vmxnet3-usermap kernel driver). I also want to try RSS with vmxnet3 and dpdk. As per Intel DPDK Programmer's Guide (January 2014) RSS is supported with vmxnet3 since dpdk version 1.6.0. Thanks, Srini
[dpdk-dev] DPDK not recognizing 82599 on my setup
Hi Qinglai, Indeed, this was the issue, a consequence of power management being 'on' in the BIOS. Turning that off (as I am not doing power management right now) cures the problems, since the timers work as expected based on the fixed frequency at which the CPU operates. So it looks like the DPDK library has issues when power management is on. Thanks a bunch. Regards -Prashant -Original Message- From: jigsaw [mailto:jig...@gmail.com] Sent: Friday, February 14, 2014 3:11 PM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: Re: [dpdk-dev] DPDK not recognizing 82599 on my setup http://dpdk.org/ml/archives/dev/2014-January/001205.html Likely to be the root cause. It needs a patch because it caused problems now and then without giving a noticeable reason in the log. -Qinglai On Fri, Feb 14, 2014 at 11:30 AM, Prashant Upadhyaya wrote: > Pasting the logs inline as the attachments don't show up on list somehow -- > > =~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2014.02.14 13:29:51 =~=~=~=~=~=~=~=~=~=~=~= > ./lte_lg.out -cff -n4 > EAL: Detected lcore 0 as core 0 on socket 0 > EAL: Detected lcore 1 as core 1 on socket 0 > EAL: Detected lcore 2 as core 2 on socket 0 > EAL: Detected lcore 3 as core 3 on socket 0 > EAL: Detected lcore 4 as core 4 on socket 0 > EAL: Detected lcore 5 as core 8 on socket 0 > EAL: Detected lcore 6 as core 9 on socket 0 > EAL: Detected lcore 7 as core 10 on socket 0 > EAL: Detected lcore 8 as core 11 on socket 0 > EAL: Detected lcore 9 as core 12 on socket 0 > EAL: Detected lcore 10 as core 0 on socket 1 > EAL: Detected lcore 11 as core 1 on socket 1 > EAL: Detected lcore 12 as core 2 on socket 1 > EAL: Detected lcore 13 as core 3 on socket 1 > EAL: Detected lcore 14 as core 4 on socket 1 > EAL: Detected lcore 15 as core 8 on socket 1 > EAL: Detected lcore 16 as core 9 on socket 1 > EAL: Detected lcore 17 as core 10 on socket 1 > EAL: Detected lcore 18 as core 11 on socket 1 > EAL: Detected lcore 19 as core 12 on socket 1 > EAL: Detected lcore 20 as core 0 
on socket 0 > EAL: Detected lcore 21 as core 1 on socket 0 > EAL: Detected lcore 22 as core 2 on socket 0 > EAL: Detected lcore 23 as core 3 on socket 0 > EAL: Detected lcore 24 as core 4 on socket 0 > EAL: Detected lcore 25 as core 8 on socket 0 > EAL: Detected lcore 26 as core 9 on socket 0 > EAL: Detected lcore 27 as core 10 on socket 0 > EAL: Detected lcore 28 as core 11 on socket 0 > EAL: Detected lcore 29 as core 12 on socket 0 > EAL: Detected lcore 30 as core 0 on socket 1 > EAL: Detected lcore 31 as core 1 on socket 1 > EAL: Detected lcore 32 as core 2 on socket 1 > EAL: Detected lcore 33 as core 3 on socket 1 > EAL: Detected lcore 34 as core 4 on socket 1 > EAL: Detected lcore 35 as core 8 on socket 1 > EAL: Detected lcore 36 as core 9 on socket 1 > EAL: Detected lcore 37 as core 10 on socket 1 > EAL: Detected lcore 38 as core 11 on socket 1 > EAL: Detected lcore 39 as core 12 on socket 1 > EAL: Skip lcore 40 (not detected) > EAL: Skip lcore 41 (not detected) > EAL: Skip lcore 42 (not detected) > EAL: Skip lcore 43 (not detected) > EAL: Skip lcore 44 (not detected) > EAL: Skip lcore 45 (not detected) > EAL: Skip lcore 46 (not detected) > EAL: Skip lcore 47 (not detected) > EAL: Skip lcore 48 (not detected) > EAL: Skip lcore 49 (not detected) > EAL: Skip lcore 50 (not detected) > EAL: Skip lcore 51 (not detected) > EAL: Skip lcore 52 (not detected) > EAL: Skip lcore 53 (not detected) > EAL: Skip lcore 54 (not detected) > EAL: Skip lcore 55 (not detected) > EAL: Skip lcore 56 (not detected) > EAL: Skip lcore 57 (not detected) > EAL: Skip lcore 58 (not detected) > EAL: Skip lcore 59 (not detected) > EAL: Skip lcore 60 (not detected) > EAL: Skip lcore 61 (not detected) > EAL: Skip lcore 62 (not detected) > EAL: Skip lcore 63 (not detected) > EAL: Setting up memory... 
> EAL: Ask a virtual area of 0x2147483648 bytes > EAL: Virtual area found at 0x7f948000 (size = 0x8000) > EAL: Ask a virtual area of 0x2147483648 bytes > EAL: Virtual area found at 0x7f93c000 (size = 0x8000) > EAL: Requesting 2 pages of size 1024MB from socket 0 > EAL: Requesting 2 pages of size 1024MB from socket 1 > EAL: TSC frequency is ~120 KHz > EAL: Master core 0 is ready (tid=3d457040) > EAL: Core 1 is ready (tid=3ca9d700) > EAL: Core 2 is ready (tid=3c29c700) > EAL: Core 3 is ready (tid=3ba9b700) > EAL: Core 4 is ready (tid=3b29a700) > EAL: Core 5 is ready (tid=3aa99700) > EAL: Core 6 is ready (tid=3a298700) > EAL: Core 7 is ready (tid=39a97700) > Pool initialized > Global Variables initialized > PMD: rte_ixgbe_pmd_init(): >> > PMD: rte_ixgbevf_pmd_init(): rte_ixgbevf_pmd_init > EAL: PCI device :07:00.0 on NUMA socket 0 > EAL: probe driver: 8086:152
[dpdk-dev] DPDK not recognizing 82599 on my setup
Pasting the logs inline as the attachments don't show up on list somehow -- =~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2014.02.14 13:29:51 =~=~=~=~=~=~=~=~=~=~=~= ./lte_lg.out -cff -n4 EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 2 on socket 0 EAL: Detected lcore 3 as core 3 on socket 0 EAL: Detected lcore 4 as core 4 on socket 0 EAL: Detected lcore 5 as core 8 on socket 0 EAL: Detected lcore 6 as core 9 on socket 0 EAL: Detected lcore 7 as core 10 on socket 0 EAL: Detected lcore 8 as core 11 on socket 0 EAL: Detected lcore 9 as core 12 on socket 0 EAL: Detected lcore 10 as core 0 on socket 1 EAL: Detected lcore 11 as core 1 on socket 1 EAL: Detected lcore 12 as core 2 on socket 1 EAL: Detected lcore 13 as core 3 on socket 1 EAL: Detected lcore 14 as core 4 on socket 1 EAL: Detected lcore 15 as core 8 on socket 1 EAL: Detected lcore 16 as core 9 on socket 1 EAL: Detected lcore 17 as core 10 on socket 1 EAL: Detected lcore 18 as core 11 on socket 1 EAL: Detected lcore 19 as core 12 on socket 1 EAL: Detected lcore 20 as core 0 on socket 0 EAL: Detected lcore 21 as core 1 on socket 0 EAL: Detected lcore 22 as core 2 on socket 0 EAL: Detected lcore 23 as core 3 on socket 0 EAL: Detected lcore 24 as core 4 on socket 0 EAL: Detected lcore 25 as core 8 on socket 0 EAL: Detected lcore 26 as core 9 on socket 0 EAL: Detected lcore 27 as core 10 on socket 0 EAL: Detected lcore 28 as core 11 on socket 0 EAL: Detected lcore 29 as core 12 on socket 0 EAL: Detected lcore 30 as core 0 on socket 1 EAL: Detected lcore 31 as core 1 on socket 1 EAL: Detected lcore 32 as core 2 on socket 1 EAL: Detected lcore 33 as core 3 on socket 1 EAL: Detected lcore 34 as core 4 on socket 1 EAL: Detected lcore 35 as core 8 on socket 1 EAL: Detected lcore 36 as core 9 on socket 1 EAL: Detected lcore 37 as core 10 on socket 1 EAL: Detected lcore 38 as core 11 on socket 1 EAL: Detected lcore 39 as core 12 on socket 1 EAL: Skip lcore 40 
(not detected) EAL: Skip lcore 41 (not detected) EAL: Skip lcore 42 (not detected) EAL: Skip lcore 43 (not detected) EAL: Skip lcore 44 (not detected) EAL: Skip lcore 45 (not detected) EAL: Skip lcore 46 (not detected) EAL: Skip lcore 47 (not detected) EAL: Skip lcore 48 (not detected) EAL: Skip lcore 49 (not detected) EAL: Skip lcore 50 (not detected) EAL: Skip lcore 51 (not detected) EAL: Skip lcore 52 (not detected) EAL: Skip lcore 53 (not detected) EAL: Skip lcore 54 (not detected) EAL: Skip lcore 55 (not detected) EAL: Skip lcore 56 (not detected) EAL: Skip lcore 57 (not detected) EAL: Skip lcore 58 (not detected) EAL: Skip lcore 59 (not detected) EAL: Skip lcore 60 (not detected) EAL: Skip lcore 61 (not detected) EAL: Skip lcore 62 (not detected) EAL: Skip lcore 63 (not detected) EAL: Setting up memory... EAL: Ask a virtual area of 0x2147483648 bytes EAL: Virtual area found at 0x7f948000 (size = 0x8000) EAL: Ask a virtual area of 0x2147483648 bytes EAL: Virtual area found at 0x7f93c000 (size = 0x8000) EAL: Requesting 2 pages of size 1024MB from socket 0 EAL: Requesting 2 pages of size 1024MB from socket 1 EAL: TSC frequency is ~120 KHz EAL: Master core 0 is ready (tid=3d457040) EAL: Core 1 is ready (tid=3ca9d700) EAL: Core 2 is ready (tid=3c29c700) EAL: Core 3 is ready (tid=3ba9b700) EAL: Core 4 is ready (tid=3b29a700) EAL: Core 5 is ready (tid=3aa99700) EAL: Core 6 is ready (tid=3a298700) EAL: Core 7 is ready (tid=39a97700) Pool initialized Global Variables initialized PMD: rte_ixgbe_pmd_init(): >> PMD: rte_ixgbevf_pmd_init(): rte_ixgbevf_pmd_init EAL: PCI device :07:00.0 on NUMA socket 0 EAL: probe driver: 8086:1521 rte_igb_pmd EAL: :07:00.0 not managed by UIO driver, skipping EAL: PCI device :07:00.1 on NUMA socket 0 EAL: probe driver: 8086:1521 rte_igb_pmd EAL: :07:00.1 not managed by UIO driver, skipping EAL: PCI device :81:00.0 on NUMA socket 1 EAL: probe driver: 8086:10fb rte_ixgbe_pmd EAL: PCI memory mapped at 0x7f9539217000 EAL: PCI memory mapped at 
0x7f953d45f000 PMD: eth_ixgbe_dev_init(): >> PMD: ixgbe_init_shared_code(): ixgbe_init_shared_code PMD: ixgbe_set_mac_type(): ixgbe_set_mac_type PMD: ixgbe_set_mac_type(): ixgbe_set_mac_type found mac: 2, returns: 0 PMD: ixgbe_init_ops_82599(): ixgbe_init_ops_82599 PMD: ixgbe_init_phy_ops_generic(): ixgbe_init_phy_ops_generic PMD: ixgbe_init_ops_generic(): ixgbe_init_ops_generic PMD: ixgbe_init_mac_link_ops_82599(): ixgbe_init_mac_link_ops_82599 PMD: ixgbe_get_media_type_82599(): ixgbe_get_media_type_82599 PMD: ixgbe_get_media_type_82599(): ixgbe_get_media_type_82599 PMD: ixgbe_get_pcie_msix_count_generic(): ixgbe_get_pcie_msix_count_generic PMD: ixgbe_validate_eeprom_checksum_generic(): ixgbe_validate_eeprom_checksum_generic PMD: ixgbe_read_eeprom_82599(): ixgbe_read_eeprom_82599 PMD: ixgbe_read_eeprom_bit_bang_generic(): ixgbe_read_eeprom_bit_bang_generic PMD: ixgbe_init_eeprom_params_generic():
[dpdk-dev] DPDK not recognizing 82599 on my setup
Hi, Here is the lspci output -- 81:00.0 Ethernet controller [0200]: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01) 81:00.1 Ethernet controller [0200]: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01) 84:00.0 Ethernet controller [0200]: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01) 84:00.1 Ethernet controller [0200]: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01) I am using Fedora 18 and DPDK version 1.5.0r1. The first two ports (as in the lspci output) are connected to the switch, and Linux is able to ping using them. When they are under Linux control, I see that auto-negotiation settles at 10 Gbps. I unbind 81:00.0 and bind igb_uio. Now when I start the DPDK application, I get the error which I am attaching with DEBUG set to y. Somehow DPDK is not able to recognize the NIC. Any hints if someone has encountered a similar issue? Regards -Prashant === Please refer to http://www.aricent.com/legal/email_disclaimer.html for important disclosures regarding this electronic communication. ===
[dpdk-dev] NUMA CPU Sockets and DPDK
Hi Etai, Of course all DPDK threads consume 100% (unless some waits are introduced for power saving etc.; all typical DPDK threads are while(1) loops). When I said core 1 is unusually busy, I meant that it is not able to read beyond 2 Gbps or so and the packets are dropping at the NIC. (I have my own custom way of calculating the CPU utilization of core 1, based on how many polls came back empty and how many polls got me data which I then process.) On the 8-core machine with a single socket, core 1 was able to lift much higher data rates successfully, hence the question. Regards -Prashant -----Original Message----- From: Etai Lev Ran [mailto:elev...@gmail.com] Sent: Wednesday, February 12, 2014 5:18 PM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: RE: [dpdk-dev] NUMA CPU Sockets and DPDK Hi Prashant, Based on our experience, using DPDK across CPU sockets may indeed result in some performance degradation (~10% for our application vs. staying in-socket; YMMV based on HW, application structure, etc.). Regarding CPU utilization on core 1, the one picking up traffic: perhaps I had misunderstood your comment, but I would expect it to always be close to 100% since it's polling the device via the PMD and is not driven by interrupts. Regards, Etai -----Original Message----- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Wednesday, February 12, 2014 1:28 PM To: dev at dpdk.org Subject: [dpdk-dev] NUMA CPU Sockets and DPDK Hi guys, What has been your experience of using DPDK based apps in NUMA mode with multiple sockets, where some cores are present on one socket and other cores on another socket? I am migrating my application from one Intel machine with 8 cores, all in one socket, to a 32-core machine where 16 cores are in one socket and the 16 other cores are in the second socket. My core 0 does all initialization for mbufs, NIC ports, queues etc. and uses SOCKET_ID_ANY for socket related parameters.
The usecase works, but I think I am running into performance issues on the 32 core machine. The lscpu output on my 32 core machine shows the following - NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 I am using core 1 to lift all the data from a single queue of an 82599EB port and I see that the cpu utilization for this core 1 is way too high even for lifting traffic of 1 Gbps with packet size of 650 bytes. In general, does one need to be careful in working with multiple sockets and so forth, any comments would be helpful. Regards -Prashant
[dpdk-dev] NUMA CPU Sockets and DPDK
Hi guys, What has been your experience of using DPDK based app's in NUMA mode with multiple sockets where some cores are present on one socket and other cores on some other socket. I am migrating my application from one intel machine with 8 cores, all in one socket to a 32 core machine where 16 cores are in one socket and 16 other cores in the second socket. My core 0 does all initialization for mbuf's, nic ports, queues etc. and uses SOCKET_ID_ANY for socket related parameters. The usecase works, but I think I am running into performance issues on the 32 core machine. The lscpu output on my 32 core machine shows the following - NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 I am using core 1 to lift all the data from a single queue of an 82599EB port and I see that the cpu utilization for this core 1 is way too high even for lifting traffic of 1 Gbps with packet size of 650 bytes. In general, does one need to be careful in working with multiple sockets and so forth, any comments would be helpful. Regards -Prashant
[dpdk-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports.
Hi Pravin, Request you to please validate at least one method to interface VMs with your innovative dpdk port on the OVS, preferably IVSHM. Please do publish the steps for that too. We really need the above for wide acceptance. Regards -Prashant -----Original Message----- From: Pravin Shelar [mailto:pshe...@nicira.com] Sent: Thursday, January 30, 2014 3:00 AM To: Prashant Upadhyaya Cc: dev at openvswitch.org; dev at dpdk.org; dpdk-ovs at lists.01.org; Gerald Rogers Subject: Re: [dpdk-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports. On Wed, Jan 29, 2014 at 12:56 AM, Prashant Upadhyaya wrote: > Hi Pravin, > > I think your stuff is on the brink of creating a mini revolution :) > > Some questions inline below -- > +ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk > What do you mean by portid here, do you mean the physical interface id like eth0 which I have bound to igb_uio now ? > If I have multiple interfaces I have assigned igb_uio to, e.g. eth0, eth1, eth2 etc., what is the id mapping for those ? > The port id is the id assigned by DPDK; the DPDK interface takes this port id as an argument. Currently you need to look at the pci id to figure out the device mapping to port id. I know it is not clean, and I am exploring a better interface so that we can specify device names to ovs-vsctl. > If I have VM's running, then typically how to interface those VM's to this OVS in user space now, do I use the same classical 'tap' interface and add it to the OVS above. A tap device will work, but you would not get performance, primarily due to scheduling delay and memcopy. DPDK has multiple drivers to create interfaces with KVM guest OSes; those should perform better. I have not tried it yet. > What is the actual path the data takes from the VM now all the way to the switch, wouldn't it be hypervisor to kernel to OVS switch in user space to other VM/Network ? Depends on the method you use, e.g. Memnic bypasses the hypervisor and host kernel entirely.
> I think if we can solve the VM to OVS port connectivity remaining in > userspace only, then we have a great thing at our hand. Kindly comment on > this. > Right, performance looks pretty good. Still, DPDK needs constant polling, which consumes more power. The RFC ovs-dpdk patch has simple polling which needs tweaking for better power usage. Thanks, Pravin. > Regards > -Prashant >
[dpdk-dev] Selecting Linux distribution for DPDK applications: CentOS or Debian
Hi Dan, The Intel DPDK release notes (1.5.2) mention the following tested OSes -- * Fedora release 18 * Ubuntu* 12.04 LTS * Wind River* Linux* 5 * Red Hat* Enterprise Linux 6.3 * SUSE Enterprise Linux* 11 SP2 I have personally used Fedora 18 and it works fine for me for virtualization, including SRIOV and pass-through as well as virtio with a KNI backend. So I am tending to stick to Fedora 18. I don't know why CentOS is not tested and mentioned in the release notes. Regards -Prashant -----Original Message----- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Daniel Kan Sent: Thursday, January 30, 2014 2:01 PM To: dev at dpdk.org Subject: [dpdk-dev] Selecting Linux distribution for DPDK applications: CentOS or Debian I'm deciding between Debian 7.3 (3.2.0 kernel) and CentOS 6.5 (2.6.32 kernel) for production. I'm wondering if anyone has recommendations. We run the DPDK application in a virtualized environment. Currently, we configure NICs in pass-through mode, which gives the best performance. In the future, we plan to use DPDK with paravirtualized NICs (e.g. vmxnet3-usermap). Thanks. Dan
[dpdk-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports.
Hi Pravin, I think your stuff is on the brink of creating a mini revolution :) Some questions inline below -- +ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk What do you mean by portid here, do you mean the physical interface id like eth0 which I have bound to igb_uio now ? If I have multiple interfaces I have assigned igb_uio to, e.g. eth0, eth1, eth2 etc., what is the id mapping for those ? If I have VM's running, then typically how to interface those VM's to this OVS in user space now, do I use the same classical 'tap' interface and add it to the OVS above. What is the actual path the data takes from the VM now all the way to the switch, wouldn't it be hypervisor to kernel to OVS switch in user space to other VM/Network ? I think if we can solve the VM to OVS port connectivity remaining in userspace only, then we have a great thing at our hand. Kindly comment on this. Regards -Prashant -----Original Message----- From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of pshe...@nicira.com Sent: Tuesday, January 28, 2014 7:19 AM To: dev at openvswitch.org; dev at dpdk.org; dpdk-ovs at lists.01.org Cc: Gerald Rogers Subject: [dpdk-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports. From: Pravin B Shelar. The following patch adds a DPDK netdev-class to the userspace datapath. The approach taken in this patch differs from Intel(R) DPDK vSwitch, where DPDK datapath switching is done in a separate process. This patch adds support for a DPDK type port and uses the OVS userspace datapath for switching. Therefore all DPDK processing and flow miss handling is done in a single process. This also avoids code duplication by reusing the OVS userspace datapath switching, and therefore it supports all flow matching and actions that the userspace datapath supports. Refer to the INSTALL.DPDK doc for further info. With this patch I got similar performance for netperf TCP_STREAM tests compared to the kernel datapath. This is based on a patch from Gerald Rogers.
Signed-off-by: Pravin B Shelar
CC: "Gerald Rogers"
---
This patch is tested on latest OVS master (commit 9d0581fdf22bec79).
---
 INSTALL                 |    1 +
 INSTALL.DPDK            |   85 ++
 Makefile.am             |    1 +
 acinclude.m4            |   40 ++
 configure.ac            |    1 +
 lib/automake.mk         |    6 +
 lib/dpif-netdev.c       |  393 +++-
 lib/netdev-dpdk.c       | 1152 +++
 lib/netdev-dpdk.h       |    7 +
 lib/netdev-dummy.c      |   38 +-
 lib/netdev-linux.c      |   33 +-
 lib/netdev-provider.h   |   13 +-
 lib/netdev-vport.c      |    1 +
 lib/netdev.c            |   52 ++-
 lib/netdev.h            |   15 +-
 lib/ofpbuf.c            |    7 +-
 lib/ofpbuf.h            |   13 +-
 lib/packets.c           |    9 +
 lib/packets.h           |    1 +
 vswitchd/ovs-vswitchd.c |   14 +-
 20 files changed, 1702 insertions(+), 180 deletions(-)
 create mode 100644 INSTALL.DPDK
 create mode 100644 lib/netdev-dpdk.c
 create mode 100644 lib/netdev-dpdk.h

diff --git a/INSTALL b/INSTALL
index 001d3cb..74cd278 100644
--- a/INSTALL
+++ b/INSTALL
@@ -10,6 +10,7 @@ on a specific platform, please see one of these files:
 - INSTALL.RHEL
 - INSTALL.XenServer
 - INSTALL.NetBSD
+- INSTALL.DPDK

 Build Requirements
 ------------------

diff --git a/INSTALL.DPDK b/INSTALL.DPDK
new file mode 100644
index 000..1c95104
--- /dev/null
+++ b/INSTALL.DPDK
@@ -0,0 +1,85 @@
+Using Open vSwitch with DPDK
+
+Open vSwitch can use Intel(R) DPDK lib to operate entirely in userspace. This file explains how to install and use Open vSwitch in such a mode.
+
+The DPDK support of Open vSwitch is considered experimental. It has not been thoroughly tested.
+
+This version of Open vSwitch should be built manually with "configure" and "make".
+
+Building and Installing:
+
+DPDK:
+cd DPDK
+make install T=x86_64-default-linuxapp-gcc
+Refer to http://dpdk.org/ for details of requirements.
+
+Linux kernel:
+Refer to intel-dpdk-getting-started-guide.pdf for understanding the DPDK kernel requirement.
+
+OVS:
+cd $(OVS_DIR)/openvswitch
+./boot.sh
+./configure --with-dpdk=$(DPDK_BUILD)
+make
+
+Refer to INSTALL.userspace for general requirements of building userspace OVS.
+
+Using the DPDK with ovs-vswitchd:
+
+First set up the DPDK devices:
+  - insert igb_uio.ko
+    e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko
+  - mount hugetlbfs
+    e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/
+  - bind the network device to igb_uio
+    e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1
+
+Refer to http://www.dpdk.org/doc/quick-start for verifying the DPDK setup.
+
+Start vswitchd:
+DPDK configuration arguments can be passed to vswitchd via the `--dpdk` argument.
+  e.g. ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach
+
+To
[dpdk-dev] Query about this NIC
Hi, No, I have not used X520-T2 before. If anybody on the list has used the above successfully, please advise. Jayant, I will await your advice. Regards -Prashant -----Original Message----- From: Jayakumar, Muthurajan [mailto:muthurajan.jayaku...@intel.com] Sent: Thursday, January 09, 2014 9:32 PM To: Prashant Upadhyaya; dev at dpdk.org Subject: RE: Query about this NIC Hi Prashant, Thanks for using Intel DPDK. Intel DPDK supports X520-T2 (previously code named "Iron Pond") Will find out the differences between X520-T2 and X520-DA2 Server Adapter E10G42BTDA PCIe Dual-Port 2xSFP+ Copper 10GSFP+Cu Low-Profile BTW, have you used X520-T2? Thank you, M Jay -----Original Message----- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Thursday, January 09, 2014 7:39 AM To: dev at dpdk.org Subject: [dpdk-dev] Query about this NIC Hi, Would the following NIC work with DPDK - Intel 10-Gigabit Ethernet X520-DA2 Server Adapter E10G42BTDA PCIe Dual-Port 2xSFP+ Copper 10GSFP+Cu Low-Profile Regards -Prashant
[dpdk-dev] Query about this NIC
Hi, Would the following NIC work with DPDK - Intel 10-Gigabit Ethernet X520-DA2 Server Adapter E10G42BTDA PCIe Dual-Port 2xSFP+ Copper 10GSFP+Cu Low-Profile Regards -Prashant
[dpdk-dev] regarding this NIC
Intel(r) 10GbE network controller 82599ES From: Prashant Upadhyaya Sent: Wednesday, December 18, 2013 4:38 PM To: dev at dpdk.org Subject: regarding this NIC Hi, Could somebody advise if this NIC would work with DPDK. It mentions 82599ES, so far I have been working with 82599EB. Regards -Prashant
[dpdk-dev] regarding this NIC
Hi, Could somebody advise if this NIC would work with DPDK. It mentions 82599ES, so far I have been working with 82599EB. Regards -Prashant
[dpdk-dev] generic load balancing
Hi, Regarding this point -- "If intel supports round robin distribution of packets in the same flow, Intel needs to provide some way like Cavium's SSO (tag switch) to maintain packet order in the same flow. And it is hard to do so because intel's cpu and nic are decoupled" -- my main submission is: I understand there are issues like the above and the ooo stuff you pointed out. But that is for the usecase implementer to solve in software logic. The equivalent of tag switch can be attempted in software if the usecase so desires. But at least 'give' the facility in the NIC to fan out round robin on queues. Somehow we are trying to find reasons why we should not have it. I am saying, give it in the NIC and let people use it in innovative ways. People who don't want to use it can always choose not to use it. Regards -Prashant From: ?? [mailto:ydwoo0...@gmail.com] Sent: Friday, December 06, 2013 7:47 AM To: Thomas Monjalon Cc: Michael Quicquaro; Prashant Upadhyaya; dev at dpdk.org Subject: Re: [dpdk-dev] generic load balancing RSS is a way to distribute packets to multiple cores while packet order within the same flow is still maintained. Round robin distribution of packets may cause ooo (out of order) delivery of packets in the same flow. We also met this problem in an ipsec vpn case. The tunneled packets are RSS'd to the same queue if they are on the same tunnel. But if we dispatch the packets to other cores to process, ooo packets may occur and tcp performance may be greatly hurt. If you enable rss on udp packets and some udp packets are ip fragmented, the rss of udp fragments (hash only calculated from the ip addresses) may differ from the rss of non-fragmented udp packets (hash with information of the udp ports), so ooo may occur too. That is why the kernel driver disables udp rss by default. If intel supports round robin distribution of packets in the same flow, Intel needs to provide some way like Cavium's SSO (tag switch) to maintain packet order in the same flow.
And it is hard to do so because intel's cpu and nic are decoupled. 2013/12/6 Thomas Monjalon <thomas.monjalon at 6wind.com> Hello, 05/12/2013 16:42, Michael Quicquaro: > This is a good discussion and I hope Intel can see and benefit from it. Don't forget that this project is Open Source. So you can submit your patches for review. Thanks for participating -- Thomas
[dpdk-dev] generic load balancing
Hi, Well, GTP is the main usecase. We end up with a GTP tunnel between the two machines. And ordinarily with 82599, all the data will land up on a single queue and therefore must be polled on a single core. Bottleneck. But in general, if I want to employ all the CPU cores horsepower simultaneously to pickup the packets from NIC, then it is natural that I drop a queue each for every core into the NIC and if the NIC does a round robin then it naturally fans out and I can use all the cores to lift packets from NIC in a load balanced fashion. Imagine a theoretical usecase, where I have to lift the packets from the NIC, inspect it myself in the application and then switch them to the right core for further processing. So my cores have two jobs, one is to poll the NIC and then switch the packets to the right core. Here I would simply love to poll the queue and the intercore ring from each core to achieve the processing. No single core will become the bottleneck as far as polling the NIC is concerned. You might argue on what basis I switch to the relevant core for further processing, but that's _my_ usecase and headache to further equally distribute amongst the cores. Imagine an LTE usecase where I am on the core side (SGW), the packets come over GTP from thousands of mobiles (via eNB). I can employ all the cores to pickup the GTP packets (if NIC gives me round robin) and then based on the inner IP packet's src IP address (the mobile IP address), I can take it to the further relevant core for processing. This way I will get a complete load balancing done not only for polling from NIC but also for processing of the inner IP packets. I have also worked a lot on Cavium processors. Those of you who are familiar with that would know that the POW scheduler gives the packets to whichever core is requesting for work so the packets can go to any core in Cavium Octeon processor. 
The only way to achieve similar functionality in DPDK is to drop a queue per core into the NIC and then let the NIC do round robin on those queues blindly. What's the harm if this feature is added? Let those who want to use it, use it, and those who think it is useless can ignore it. Regards -Prashant -----Original Message----- From: François-Frédéric Ozog [mailto:f...@ozog.com] Sent: Thursday, December 05, 2013 2:16 PM To: Prashant Upadhyaya Cc: 'Michael Quicquaro'; 'Stephen Hemminger'; dev at dpdk.org Subject: RE: [dpdk-dev] generic load balancing Hi, If the traffic you manage is above MPLS or GTP encapsulations, then you can use cards that provide flexible hash functions. Chelsio cxgb5 provides combinations of "offset", length and tuple that may help. The only reason I would have loved to get a pure round robin feature was to pass certain "Breaking Point" (http://www.ixiacom.com/breakingpoint) tests where the traffic at issue was multicast from a single source... But that is not real-life traffic. If you could share the use case... François-Frédéric > -----Original Message----- > From: Prashant Upadhyaya [mailto:prashant.upadhyaya at aricent.com] > Sent: Thursday, December 5, 2013 06:30 > To: Stephen Hemminger > Cc: François-Frédéric Ozog; Michael Quicquaro; dev at dpdk.org > Subject: RE: [dpdk-dev] generic load balancing > > Hi Stephen, > > The awfulness depends upon the 'usecase'. > I have e.g. a usecase where I want this round-robin behaviour. > > I just want the NIC to give me a facility to use this. > > Regards > -Prashant > > > -----Original Message----- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Thursday, December 05, 2013 10:25 AM > To: Prashant Upadhyaya > Cc: François-Frédéric Ozog; Michael Quicquaro; dev at dpdk.org > Subject: Re: [dpdk-dev] generic load balancing > > Round robin would actually be awful for any protocol because it would cause out-of-order packets. > That is why flow-based algorithms like flow director and RSS work much better.
> > On Wed, Dec 4, 2013 at 8:31 PM, Prashant Upadhyaya wrote: > > Hi, > > > > It's a real pity that the Intel 82599 NIC (and possibly others) don't have a simple round robin scheduling of packets on the configured queues. > > > > I have requested Intel earlier, and using this forum requesting again -- please please put this facility in the NIC: if I drop N queues there and configure the NIC for some round robin scheduling on queues, then the NIC should simply put the received packets one by one on queue 1, then on queue 2, ..., then on queue N, and then back on queue 1. > > The above is very useful in a lot of load balancing cases. > > > > Regards > > -Prashant > > > > > > -----Original Message----- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of > > François-Frédéric Ozog > >
[dpdk-dev] generic load balancing
Hi, It's a real pity that the Intel 82599 NIC (and possibly others) don't have a simple round robin scheduling of packets on the configured queues. I have requested Intel earlier, and using this forum requesting again -- please please put this facility in the NIC: if I drop N queues there and configure the NIC for some round robin scheduling on queues, then the NIC should simply put the received packets one by one on queue 1, then on queue 2, ..., then on queue N, and then back on queue 1. The above is very useful in a lot of load balancing cases. Regards -Prashant -----Original Message----- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of François-Frédéric Ozog Sent: Thursday, December 05, 2013 2:35 AM To: 'Michael Quicquaro' Cc: dev at dpdk.org Subject: Re: [dpdk-dev] generic load balancing Hi, As far as I can tell, this is really hardware dependent. Some hash functions allow uplink and downlink packets of the same "session" to go to the same queue (I know Chelsio can do this). For the Intel card, you may find what you want in: http://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-datasheet.html Other cards require an NDA or other agreements to get details of RSS. If you have a performance problem, may I suggest you use kernel 3.10 and then monitor system activity with the "perf" command. For instance you can start with "perf top -a"; this will give you nice information. Then your creativity will do the rest ;-) You may be surprised what comes up in the top hot points... (the most unexpected hot function I found here was the Linux syscall gettimeofday!!!) François-Frédéric > -----Original Message----- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Quicquaro > Sent: Wednesday, December 4, 2013 18:53 > To: dev at dpdk.org > Subject: [dpdk-dev] generic load balancing > > Hi all, > I am writing a dpdk application that will receive packets from one > interface and process them. It does not forward packets in the traditional sense.
However, I do need to process them at full line rate and > therefore need more than one core. The packets can be somewhat > generic in nature and can be nearly identical (especially at the beginning of the packet). > I've used the rxonly function of testpmd as a model. > > I've run into problems in processing a full line rate of data since > the nature of the data causes all the data to be presented to only one core. I > get a large percentage of dropped packets (shows up as Rx-Errors in > "port stats") because of this. I've tried modifying the data so that > packets have different UDP ports and that seems to work when I use > --rss-udp > > My questions are: > 1) Is there a way to configure RSS so that it alternates packets to > all configured cores regardless of the packet data? > > 2) Where is the best place to learn more about RSS and how to > configure it? I have not found much in the DPDK documentation. > > Thanks for the help, > - Mike
[dpdk-dev] Regarding VM live migration with SRIOV
Hi Stephen, Agreed that the current code does not directly support hotplug. Now I am looking for a hint regarding how best the DPDK application can find out that the PCI device on which it was doing the I/O has been removed from underneath. In that case the application can call the ethdev stop and ethdev close functions gracefully. Question is -- is there a way for the PMD to know that the device is gone. Further the PCI device is a mapped memory, so when the plugout of the PCI device happens, is the DPDK application, which is doing I/O, vulnerable to a crash ? Regards -Prashant -----Original Message----- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Wednesday, November 27, 2013 11:54 AM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Regarding VM live migration with SRIOV On Wed, 27 Nov 2013 11:39:28 +0530 Prashant Upadhyaya wrote: > Hi Stephen, > > The rte_eal_pci_probe is typically called at the startup. > > Now let's say a DPDK application is running with a PCI device (doing > tx and rx) and I remove that PCI device underneath (hot plugout) So how does > the application now know that the device is gone ? > > Is it that rte_eal_pci_probe should be called periodically from, let's say, > the slow control path of the DPDK application ? > > Regards > -Prashant > Like I said current code doesn't do hotplug. If you wanted to add it, you would have to refactor the PCI management layer.
[dpdk-dev] Regarding VM live migration with SRIOV
Hi, Let me be more specific. Does DPDK support hot plugin/plugout of PCI devices ? What typically needs to be done if this is to be achieved inside an application. Typically, the NIC PF or VF appears to the DPDK application as a PCI device which is probed at startup. Now what happens if I insert a new VF dynamically and want to use it inside the DPDK application (while it is already running), how should this typically be done ? [hotplugin] And what happens if the DPDK application is in control of a PCI device and that PCI device is suddenly removed ? How can the application detect this and stop doing data transfer on this and sort of unload it ? [hotplugout] If the above can be coded inside the DPDK app, then we can think of live VM migration with SRIOV -- just hotplugin and plugout the VF's. Regards -Prashant -----Original Message----- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Monday, November 25, 2013 7:32 PM To: dev at dpdk.org Subject: [dpdk-dev] Regarding VM live migration with SRIOV Hi guys, I have a VM on top of QEMU/KVM hypervisor. Guest and Host are both Fedora 18. I am using 82599 NIC with SRIOV based VF's in the VM. In VM I am running a DPDK based application which uses the VF. Now I have to do a live migration of the running VM from one physical machine to the other. Right, so has anybody tried it before in the above environment. If yes, would love to hear from you and learn from your experience of doing it instead of making the same mistakes and learning the hard way. Regards -Prashant
[dpdk-dev] Regarding VM live migration with SRIOV
Hi guys, I have a VM on top of QEMU/KVM hypervisor. Guest and Host are both Fedora 18. I am using 82599 NIC with SRIOV based VF's in the VM. In VM I am running a DPDK based application which uses the VF. Now I have to do a live migration of the running VM from one physical machine to the other. Right, so has anybody tried it before in the above environment. If yes, would love to hear from you and learn from your experience of doing it instead of making the same mistakes and learning the hard way. Regards -Prashant
[dpdk-dev] Query regarding multiple processes in DPDK
Hi Bruce, Thanks, this was very useful information. Regards -Prashant -Original Message- From: Richardson, Bruce [mailto:bruce.richard...@intel.com] Sent: Monday, November 25, 2013 2:59 PM To: Prashant Upadhyaya; dev at dpdk.org Subject: RE: Query regarding multiple processes in DPDK If the primary process dies: a) The memory does not go away, so the second process can still use it b) When restarting the primary process, you should restart it as a secondary one, to ensure it reattaches to memory properly instead of trying to re-initialize it. Regards /Bruce > -Original Message- > From: Prashant Upadhyaya [mailto:prashant.upadhyaya at aricent.com] > Sent: Monday, November 25, 2013 4:08 AM > To: Richardson, Bruce; dev at dpdk.org > Subject: RE: Query regarding multiple processes in DPDK > > Hi Bruce, > > One more question -- > > Suppose the first instance comes up as primary and creates the mbuf > pool and rings etc. [ok] Now, the second instance comes up as > secondary and does the corresponding lookup functions [ok] Now the > primary exits -- at this point can the secondary still run with all > the memory to which it had done the lookup intact, or does the fact > that primary died will lead to all the memory also taken away with it > so that the secondary can no longer function now ? > > Regards > -Prashant > > > -Original Message----- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Prashant > Upadhyaya > Sent: Friday, November 22, 2013 7:16 PM > To: Richardson, Bruce; dev at dpdk.org > Subject: Re: [dpdk-dev] Query regarding multiple processes in DPDK > > Thanks Bruce, I think your suggested example of multi_process answers > my questions. > > Regards > -Prashant > > > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Prashant > Upadhyaya > Sent: Friday, November 22, 2013 7:10 PM > To: Richardson, Bruce; dev at dpdk.org > Subject: Re: [dpdk-dev] Query regarding multiple processes in DPDK > > Hi Bruce, > > Thanks. 
> > Regarding your comment -- > [BR] It will depend upon the application, but in most cases you > probably want to have slightly different code paths for primary and > secondary instances. For example, if a process is running as primary > instance, it will probably call rte_mempool_create or rte_ring_create. > A secondary instance which wants to use these should instead call > rte_mempool_lookup and rte_ring_lookup instead. > For an example of how to write the one binary to be used as both > primary and secondary process, I suggest looking at the symmetric_mp > example application in the examples/multi_process/ directory. > > I was really hoping that the --proc-type=auto, would make the DPDK > libraries internally resolving all this stuff, is that not the case ? > I have not started reading the code for all this yet. > I must launch the same executable twice in my usecase. Even if the > executable code has to make different calls when it comes up as > secondary, is there a way for the usercode to know that it has really > come up as secondary when the --proc-type=auto is used ? > > Regards > -Prashant > > -Original Message- > From: Richardson, Bruce [mailto:bruce.richardson at intel.com] > Sent: Friday, November 22, 2013 7:02 PM > To: Prashant Upadhyaya; dev at dpdk.org > Subject: RE: Query regarding multiple processes in DPDK > > Hi Prashant > > > === > > The EAL also supports an auto-detection mode (set by EAL > > --proc-type=auto flag), whereby an Intel(r) DPDK process is started > > as a secondary instance if a primary instance is already running. > > === > > > > So does this mean that if I have a DPDK exe foo.out, then when I run > > the first instance of foo.out with -proc-type = auto, then foo.out > > will run as a primary process and when I spawn the second instance > > of foo.out (with first already running) again with -proc-type=auto, > > then this second instance automatically becomes secondary ? > [BR] Yes, that is the idea. 
> >
> > Also is there any user code initialization change required or
> > exactly the same code will work for both the processes ?
> [BR] It will depend upon the application, but in most cases you
> probably want to have slightly different code paths for primary and
> secondary instances. For example, if a process is running as primary
> instance, it will probably call rte_mempool_create or rte_ring_create.
> A secondary instance which wants to use these should instead call
> rte_mempool_lookup and rte_ring_lookup.
> For an example of how to write the one binary to be used as both
> primary and secondary process, I suggest looking at the symmetric_mp
> example application in the examples/multi_process/ directory.
[dpdk-dev] Query regarding multiple processes in DPDK
Hi Bruce, One more question -- Suppose the first instance comes up as primary and creates the mbuf pool and rings etc. [ok] Now, the second instance comes up as secondary and does the corresponding lookup functions [ok] Now the primary exits -- at this point can the secondary still run with all the memory to which it had done the lookup intact, or does the fact that primary died will lead to all the memory also taken away with it so that the secondary can no longer function now ? Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Friday, November 22, 2013 7:16 PM To: Richardson, Bruce; dev at dpdk.org Subject: Re: [dpdk-dev] Query regarding multiple processes in DPDK Thanks Bruce, I think your suggested example of multi_process answers my questions. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Friday, November 22, 2013 7:10 PM To: Richardson, Bruce; dev at dpdk.org Subject: Re: [dpdk-dev] Query regarding multiple processes in DPDK Hi Bruce, Thanks. Regarding your comment -- [BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup instead. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory. I was really hoping that the --proc-type=auto, would make the DPDK libraries internally resolving all this stuff, is that not the case ? I have not started reading the code for all this yet. I must launch the same executable twice in my usecase. 
Even if the executable code has to make different calls when it comes up as secondary, is there a way for the usercode to know that it has really come up as secondary when the --proc-type=auto is used ?

Regards
-Prashant

-----Original Message-----
From: Richardson, Bruce [mailto:bruce.richard...@intel.com]
Sent: Friday, November 22, 2013 7:02 PM
To: Prashant Upadhyaya; dev at dpdk.org
Subject: RE: Query regarding multiple processes in DPDK

Hi Prashant

> ===
> The EAL also supports an auto-detection mode (set by EAL
> --proc-type=auto flag), whereby an Intel(r) DPDK process is started as
> a secondary instance if a primary instance is already running.
> ===
>
> So does this mean that if I have a DPDK exe foo.out, then when I run
> the first instance of foo.out with -proc-type = auto, then foo.out
> will run as a primary process and when I spawn the second instance of
> foo.out (with first already running) again with -proc-type=auto, then
> this second instance automatically becomes secondary ?
[BR] Yes, that is the idea.

>
> Also is there any user code initialization change required or exactly
> the same code will work for both the processes ?
[BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory.

Regards,
/Bruce
[dpdk-dev] Query regarding multiple processes in DPDK
Thanks Bruce, I think your suggested example of multi_process answers my questions. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Friday, November 22, 2013 7:10 PM To: Richardson, Bruce; dev at dpdk.org Subject: Re: [dpdk-dev] Query regarding multiple processes in DPDK Hi Bruce, Thanks. Regarding your comment -- [BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup instead. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory. I was really hoping that the --proc-type=auto, would make the DPDK libraries internally resolving all this stuff, is that not the case ? I have not started reading the code for all this yet. I must launch the same executable twice in my usecase. Even if the executable code has to make different calls when it comes up as secondary, is there a way for the usercode to know that it has really come up as secondary when the --proc-type=auto is used ? Regards -Prashant -Original Message- From: Richardson, Bruce [mailto:bruce.richard...@intel.com] Sent: Friday, November 22, 2013 7:02 PM To: Prashant Upadhyaya; dev at dpdk.org Subject: RE: Query regarding multiple processes in DPDK Hi Prashant > === > The EAL also supports an auto-detection mode (set by EAL > --proc-type=auto flag), whereby an Intel(r) DPDK process is started as > a secondary instance if a primary instance is already running. 
> ===
>
> So does this mean that if I have a DPDK exe foo.out, then when I run
> the first instance of foo.out with -proc-type = auto, then foo.out
> will run as a primary process and when I spawn the second instance of
> foo.out (with first already running) again with -proc-type=auto, then
> this second instance automatically becomes secondary ?
[BR] Yes, that is the idea.

>
> Also is there any user code initialization change required or exactly
> the same code will work for both the processes ?
[BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory.

Regards,
/Bruce
[dpdk-dev] Query regarding multiple processes in DPDK
Hi Bruce, Thanks. Regarding your comment -- [BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup instead. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory. I was really hoping that the --proc-type=auto, would make the DPDK libraries internally resolving all this stuff, is that not the case ? I have not started reading the code for all this yet. I must launch the same executable twice in my usecase. Even if the executable code has to make different calls when it comes up as secondary, is there a way for the usercode to know that it has really come up as secondary when the --proc-type=auto is used ? Regards -Prashant -Original Message- From: Richardson, Bruce [mailto:bruce.richard...@intel.com] Sent: Friday, November 22, 2013 7:02 PM To: Prashant Upadhyaya; dev at dpdk.org Subject: RE: Query regarding multiple processes in DPDK Hi Prashant > === > The EAL also supports an auto-detection mode (set by EAL > --proc-type=auto flag), whereby an Intel(r) DPDK process is started as > a secondary instance if a primary instance is already running. > === > > So does this mean that if I have a DPDK exe foo.out, then when I run > the first instance of foo.out with -proc-type = auto, then foo.out > will run as a primary process and when I spawn the second instance of > foo.out (with first already running) again with -proc-type=auto, then > this second instance automatically becomes secondary ? [BR] Yes, that is the idea. 
>
> Also is there any user code initialization change required or exactly
> the same code will work for both the processes ?
[BR] It will depend upon the application, but in most cases you probably want to have slightly different code paths for primary and secondary instances. For example, if a process is running as primary instance, it will probably call rte_mempool_create or rte_ring_create. A secondary instance which wants to use these should instead call rte_mempool_lookup and rte_ring_lookup. For an example of how to write the one binary to be used as both primary and secondary process, I suggest looking at the symmetric_mp example application in the examples/multi_process/ directory.

Regards,
/Bruce
[dpdk-dev] Query regarding multiple processes in DPDK
Hi guys, The DPDK programmer's guide mentions -

===
The EAL also supports an auto-detection mode (set by EAL --proc-type=auto flag), whereby an Intel(r) DPDK process is started as a secondary instance if a primary instance is already running.
===

So does this mean that if I have a DPDK exe foo.out, then when I run the first instance of foo.out with -proc-type = auto, then foo.out will run as a primary process, and when I spawn the second instance of foo.out (with the first already running) again with -proc-type=auto, then this second instance automatically becomes secondary ?

Also is there any user code initialization change required, or will exactly the same code work for both the processes ?

Regards
-Prashant
[dpdk-dev] [PATCH 1/2] igb/ixgbe: ETH_MQ_RX_NONE should disable RSS
Hi, So if I have multiple queues and was using ETH_MQ_RX_NONE (and therefore utilizing RSS), will it stop working for me now after this patch ?

Regards
-Prashant

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ivan Boule
Sent: Wednesday, November 20, 2013 3:20 PM
To: Maxime Leroy
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 1/2] igb/ixgbe: ETH_MQ_RX_NONE should disable RSS

On 11/19/2013 02:03 PM, Maxime Leroy wrote:
> As explained in rte_ethdev.h, ETH_MQ_RX_NONE allows to not choose
> between RSS, DCB or VMDQ modes for the selection of a rx queue.
>
> But the igb/ixgbe code always silently selects the RSS mode with
> ETH_MQ_RX_NONE. This patch fixes this incoherence between the API and
> the implementation.
>
> Signed-off-by: Maxime Leroy
> ---
>  lib/librte_pmd_e1000/igb_rxtx.c   | 4 ++--
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
> index f785d9f..641ceea 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -1745,8 +1745,6 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   */
>  if (dev->data->nb_rx_queues > 1)
>  switch (dev->data->dev_conf.rxmode.mq_mode) {
> -case ETH_MQ_RX_NONE:
> -/* if mq_mode not assign, we use rss mode.*/
>  case ETH_MQ_RX_RSS:
>  igb_rss_configure(dev);
>  break;
> @@ -1754,6 +1752,8 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>  /*Configure general VMDQ only RX parameters*/
>  igb_vmdq_rx_hw_configure(dev);
>  break;
> +case ETH_MQ_RX_NONE:
> +/* if mq_mode is none, disable rss mode.*/
>  default:
>  igb_rss_disable(dev);
>  break;
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 0f7be95..e1b90f9 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3217,8 +3217,6 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   */
>  if (dev->data->nb_rx_queues > 1)
>  switch (dev->data->dev_conf.rxmode.mq_mode) {
> -case ETH_MQ_RX_NONE:
> -/* if mq_mode not assign, we use rss mode.*/
>  case ETH_MQ_RX_RSS:
>  ixgbe_rss_configure(dev);
>  break;
> @@ -3231,6 +3229,8 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
>  ixgbe_vmdq_rx_hw_configure(dev);
>  break;
>
> +case ETH_MQ_RX_NONE:
> +/* if mq_mode is none, disable rss mode.*/
>  default:
>  ixgbe_rss_disable(dev);
>  }
>  else

Acked by Ivan Boule

--
Ivan Boule
6WIND Development Engineer
[dpdk-dev] olflags in SRIOV VF environment
Thanks Vladimir ! That seems to be the issue. Better to parse it by hand right now instead of depending on ol_flags.

Regards
-Prashant

From: Vladimir Medvedkin [mailto:medvedk...@gmail.com]
Sent: Tuesday, November 12, 2013 7:54 PM
To: Prashant Upadhyaya
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] olflags in SRIOV VF environment

Hi Prashant,

Maybe it doesn't work due to Known Issues and Limitations (see Release Notes), quote:

6.1 In packets provided by the PMD, some flags are missing

In packets provided by the PMD, some flags are missing. The application does not have access to information provided by the hardware (packet is broadcast, packet is multicast, packet is IPv4 and so on).

Regards,
Vladimir

2013/11/12 Prashant Upadhyaya <prashant.upadhyaya at aricent.com>:

Hi guys, I am facing a peculiar issue with the usage of the struct rte_mbuf-> ol_flags field in the rte_mbuf when I receive the packets with the rte_eth_rx_burst function. I use the ol_flags field to identify whether it is an IPv4 or IPv6 packet or not, thus -

if ((pkts_burst->ol_flags & PKT_RX_IPV4_HDR) || (pkts_burst->ol_flags & PKT_RX_IPV6_HDR))

[pkts_burst is my rte_mbuf pointer]

Now here are the observations -

1. This works mighty fine when my app is working on the native machine
2. This works good when I run this in a VM and use one VF over SRIOV from one NIC port
3. This works good when I run this in two VM's and use one VF from 2 different NIC ports (one VF from each) and use these VF's in these 2 VM's (VF1 from NIC port1 in VM1 and VF2 from NIC port2 in VM2)
4. However the ol_flags fails to classify the packets when I use 2 VM's and use 2 VF's from the 'same' NIC port and expose one each to the 2 VM's I have

There is no bug in my 'own' application, because when I stopped inspecting the ol_flags for classification of IPv4 and V6 packets and wrote a mini logic of my own by inspecting the ether type of the packets (the packets themselves come proper in all the cases, thankfully), my entire usecase passes (it is a rather significant usecase, so it can't be luck).

Any idea guys why it works and doesn't work ?

Regards
-Prashant
[dpdk-dev] olflags in SRIOV VF environment
Hi guys, I am facing a peculiar issue with the usage of the struct rte_mbuf-> ol_flags field in the rte_mbuf when I receive the packets with the rte_eth_rx_burst function. I use the ol_flags field to identify whether it is an IPv4 or IPv6 packet or not, thus -

if ((pkts_burst->ol_flags & PKT_RX_IPV4_HDR) || (pkts_burst->ol_flags & PKT_RX_IPV6_HDR))

[pkts_burst is my rte_mbuf pointer]

Now here are the observations -

1. This works mighty fine when my app is working on the native machine
2. This works good when I run this in a VM and use one VF over SRIOV from one NIC port
3. This works good when I run this in two VM's and use one VF from 2 different NIC ports (one VF from each) and use these VF's in these 2 VM's (VF1 from NIC port1 in VM1 and VF2 from NIC port2 in VM2)
4. However the ol_flags fails to classify the packets when I use 2 VM's and use 2 VF's from the 'same' NIC port and expose one each to the 2 VM's I have

There is no bug in my 'own' application, because when I stopped inspecting the ol_flags for classification of IPv4 and V6 packets and wrote a mini logic of my own by inspecting the ether type of the packets (the packets themselves come proper in all the cases, thankfully), my entire usecase passes (it is a rather significant usecase, so it can't be luck).

Any idea guys why it works and doesn't work ?

Regards
-Prashant
[dpdk-dev] raw frame to rte_mbuf
Hi Pepe, Of course a simple cast will not suffice. Please look at the rte_mbuf structure in the header files and let me know if you still have the confusion. There is a header and payload. Your raw frame will go in the payload.

Regards
-Prashant

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jose Gavine Cueto
Sent: Tuesday, November 12, 2013 1:49 PM
To: dev at dpdk.org
Subject: [dpdk-dev] raw frame to rte_mbuf

Hi,

In DPDK, how should a raw ethernet frame be converted to rte_mbuf * ? For example if I have an ARP packet:

void * arp_pkt

how should this be converted to an rte_mbuf * for transmission, does a simple cast suffice ?

Cheers,
Pepe

--
To stop learning is like to stop loving.
[dpdk-dev] Surprisingly high TCP ACK packets drop counter
Hi Alexander, Please confirm if the patch works for you. @Wang, are you saying that without the patch the NIC does not fan out the messages properly on all the receive queues ? So what exactly happens ?

Regards
-Prashant

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexander Belyakov
Sent: Monday, November 04, 2013 1:51 AM
To: Wang, Shawn
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hi,

thanks for the patch and explanation. We have tried DPDK 1.3 and 1.5 - both have the same issue.

Regards,
Alexander

On Fri, Nov 1, 2013 at 6:54 PM, Wang, Shawn wrote:

> Hi:
>
> We had the same problem before. It turned out that RSC (receive side
> coalescing) is enabled by default in DPDK. So we wrote this naïve
> patch to disable it. This patch is based on DPDK 1.3. Not sure if 1.5 has
> changed it or not.
> After this patch, the ACK rate should go back to 14.5Mpps. For details,
> you can refer to the Intel(r) 82599 10 GbE Controller Datasheet (7.11
> Receive Side Coalescing).
>
> From: xingbow
> Date: Wed, 21 Aug 2013 11:35:23 -0700
> Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file
>  ixgbe_rxtx.c
>
> ---
>  DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
>  DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> index 7fffd60..f03046f 100644
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> @@ -1930,7 +1930,7 @@ enum {
>  #define IXGBE_RFCTL_ISCSI_DIS 0x0001
>  #define IXGBE_RFCTL_ISCSI_DWC_MASK 0x003E
>  #define IXGBE_RFCTL_ISCSI_DWC_SHIFT 1
> -#define IXGBE_RFCTL_RSC_DIS 0x0010
> +#define IXGBE_RFCTL_RSC_DIS 0x0020
>  #define IXGBE_RFCTL_NFSW_DIS 0x0040
>  #define IXGBE_RFCTL_NFSR_DIS 0x0080
>  #define IXGBE_RFCTL_NFS_VER_MASK 0x0300
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 07830b7..ba6e05d 100755
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>  uint64_t bus_addr;
>  uint32_t rxctrl;
>  uint32_t fctrl;
> + uint32_t rfctl;
>  uint32_t hlreg0;
>  uint32_t maxfrs;
>  uint32_t srrctl;
> @@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>  fctrl |= IXGBE_FCTRL_PMCF;
>  IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);
>
> + /* Disable RSC */
> + RTE_LOG(INFO, PMD, "Disable RSC\n");
> + rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
> + rfctl |= IXGBE_RFCTL_RSC_DIS;
> + IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
> +
>  /*
>   * Configure CRC stripping, if any.
>   */
> --
>
> Thanks.
> Wang, Xingbo
>
> On 11/1/13 6:43 AM, "Alexander Belyakov" wrote:
>
> >Hello,
> >
> >we have a simple test application on top of DPDK whose sole purpose is
> >to forward as many packets as possible. Generally we easily achieve
> >14.5Mpps with two 82599EB (one as input and one as output). The only
> >surprising exception is forwarding pure TCP ACK flood, when performance
> >always drops to approximately 7Mpps.
> >
> >For simplicity consider two different types of traffic:
> >1) TCP SYN flood is forwarded at 14.5Mpps rate,
> >2) pure TCP ACK flood is forwarded only at 7Mpps rate.
> >
> >Both SYN and ACK packets have exactly the same length.
> >
> >It is worth to mention, this forwarding application looks at Ethernet
> >and IP headers, but never deals with L4 headers.
> >
> >We tracked the issue down to the RX circuit. To be specific, there are 4 RX
> >queues initialized on the input port and rte_eth_stats_get() shows
> >uniform packet distribution (q_ipackets) among them, while q_errors
> >remain zero for all queues. The only drop counter quickly increasing
> >in the case of pure ACK flood is ierrors, while rx_nombuf remains zero.
> >
> >We tried different kinds of traffic generators, but always got the
> >same result: 7Mpps (instead of the expected 14Mpps) for TCP packets with
> >the ACK flag bit set while all other flag bits are cleared. Source IPs
> >and ports are selected randomly.
> >
> >Please let us know if anyone is aware of such strange behavior and
> >where we should look to narrow down the problem.
> >
> >Thanks in advance,
> >Alexander Belyakov
[dpdk-dev] Debugging igbvf_pmd
Hi Sambath, Did you follow the step of applying the mac to each of the virtual functions as per the release notes in DPDK ? And of course the src mac of your packets should be 'that' mac as set above.

Regards
-Prashant

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Sambath Kumar Balasubramanian
Sent: Sunday, November 03, 2013 4:26 AM
To: dev at dpdk.org
Subject: Re: [dpdk-dev] Debugging igbvf_pmd

Sorry, pressed the send button too soon. The NIC card we are using is Intel Corporation 82576 Virtual Function (rev 01). Do we need to do NIC/CPU low level debugging now, or is there some issue in the sw that could cause the packet to be dropped below this log message ?

PMD: eth_igb_xmit_pkts(): port_id=3 queue_id=0 pktlen=60 tx_first=14 tx_last=14
PMD: eth_igb_xmit_pkts(): port_id=3 queue_id=0 tx_tail=15 nb_tx=1

Thanks,
Sambath

On Sat, Nov 2, 2013 at 3:54 PM, Sambath Kumar Balasubramanian < sambath.balasubramanian at gmail.com> wrote:

> Hi,
>
> We are developing an App over DPDK and in one scenario with SR-IOV
> with one of the VFs mapped to a VM and DPDK running on the VM, we see
> that the packets are not coming on the wire but I get the following
> debug logs for every packet transmitted. We are getting the same
> format of packets on the wire in a different scenario so IMO the
> Virtual Function ports are set up properly. Any idea how this can be
> debugged further. The NIC card we are using is
>
> PMD: eth_igb_xmit_pkts(): port_id=3 queue_id=0 pktlen=60 tx_first=14
> tx_last=14
>
> PMD: eth_igb_xmit_pkts(): port_id=3 queue_id=0 tx_tail=15 nb_tx=1
>
> Regards,
> Sambath
[dpdk-dev] Surprisingly high TCP ACK packets drop counter
Hi, I have used DPDK 1.4 and DPDK 1.5 and the packets do fan out nicely on the rx queues in some usecases I have. Alexander, can you please try using DPDK 1.4 or 1.5 and share the results.

Regards
-Prashant

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Wang, Shawn
Sent: Friday, November 01, 2013 8:24 PM
To: Alexander Belyakov
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hi:

We had the same problem before. It turned out that RSC (receive side coalescing) is enabled by default in DPDK. So we wrote this naïve patch to disable it. This patch is based on DPDK 1.3. Not sure if 1.5 has changed it or not. After this patch, the ACK rate should go back to 14.5Mpps. For details, you can refer to the Intel(r) 82599 10 GbE Controller Datasheet (7.11 Receive Side Coalescing).

From: xingbow
Date: Wed, 21 Aug 2013 11:35:23 -0700
Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file ixgbe_rxtx.c

---
 DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
 DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
index 7fffd60..f03046f 100644
--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
@@ -1930,7 +1930,7 @@ enum {
 #define IXGBE_RFCTL_ISCSI_DIS 0x0001
 #define IXGBE_RFCTL_ISCSI_DWC_MASK 0x003E
 #define IXGBE_RFCTL_ISCSI_DWC_SHIFT 1
-#define IXGBE_RFCTL_RSC_DIS 0x0010
+#define IXGBE_RFCTL_RSC_DIS 0x0020
 #define IXGBE_RFCTL_NFSW_DIS 0x0040
 #define IXGBE_RFCTL_NFSR_DIS 0x0080
 #define IXGBE_RFCTL_NFS_VER_MASK 0x0300
diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 07830b7..ba6e05d 100755
--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 uint64_t bus_addr;
 uint32_t rxctrl;
 uint32_t fctrl;
+ uint32_t rfctl;
 uint32_t hlreg0;
 uint32_t maxfrs;
 uint32_t srrctl;
@@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 fctrl |= IXGBE_FCTRL_PMCF;
 IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);

+ /* Disable RSC */
+ RTE_LOG(INFO, PMD, "Disable RSC\n");
+ rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
+ rfctl |= IXGBE_RFCTL_RSC_DIS;
+ IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
+
 /*
  * Configure CRC stripping, if any.
  */
--

Thanks.
Wang, Xingbo

On 11/1/13 6:43 AM, "Alexander Belyakov" wrote:

>Hello,
>
>we have a simple test application on top of DPDK whose sole purpose is to
>forward as many packets as possible. Generally we easily achieve
>14.5Mpps with two 82599EB (one as input and one as output). The only
>surprising exception is forwarding pure TCP ACK flood, when performance
>always drops to approximately 7Mpps.
>
>For simplicity consider two different types of traffic:
>1) TCP SYN flood is forwarded at 14.5Mpps rate,
>2) pure TCP ACK flood is forwarded only at 7Mpps rate.
>
>Both SYN and ACK packets have exactly the same length.
>
>It is worth to mention, this forwarding application looks at Ethernet
>and IP headers, but never deals with L4 headers.
>
>We tracked the issue down to the RX circuit. To be specific, there are 4 RX
>queues initialized on the input port and rte_eth_stats_get() shows uniform
>packet distribution (q_ipackets) among them, while q_errors remain zero
>for all queues. The only drop counter quickly increasing in the case of
>pure ACK flood is ierrors, while rx_nombuf remains zero.
>
>We tried different kinds of traffic generators, but always got the same
>result: 7Mpps (instead of the expected 14Mpps) for TCP packets with the
>ACK flag bit set while all other flag bits are cleared. Source IPs and
>ports are selected randomly.
>
>Please let us know if anyone is aware of such strange behavior and
>where we should look to narrow down the problem.
> >Thanks in advance, >Alexander Belyakov === Please refer to http://www.aricent.com/legal/email_disclaimer.html for important disclosures regarding this electronic communication. ===
[dpdk-dev] Surprisingly high TCP ACK packets drop counter
Hi Alexander, Regarding your following statement -- "The only drop counter quickly increasing in the case of pure ACK flood is ierrors, while rx_nombuf remains zero." Can you please explain the significance of the "ierrors" counter, since I am not familiar with it. Further, you said you have 4 queues; how many cores are you using for polling the queues? Hopefully 4 cores, one queue each, without locks. [It is absolutely critical that all 4 queues be polled] Further, is it possible for your application itself to report the traffic received in packets per second on each queue? [Don't try to forward the traffic here; simply receive and drop in your app and sample the counters every second] Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexander Belyakov Sent: Friday, November 01, 2013 7:13 PM To: dev at dpdk.org Subject: [dpdk-dev] Surprisingly high TCP ACK packets drop counter Hello, we have a simple test application on top of DPDK whose sole purpose is to forward as many packets as possible. Generally we easily achieve 14.5Mpps with two 82599EB (one as input and one as output). The only surprising exception is forwarding a pure TCP ACK flood, when performance always drops to approximately 7Mpps. For simplicity consider two different types of traffic: 1) TCP SYN flood is forwarded at 14.5Mpps rate, 2) pure TCP ACK flood is forwarded only at 7Mpps rate. Both SYN and ACK packets have exactly the same length. It is worth mentioning that this forwarding application looks at Ethernet and IP headers, but never deals with L4 headers. We tracked the issue down to the RX circuit. To be specific, there are 4 RX queues initialized on the input port and rte_eth_stats_get() shows uniform packet distribution (q_ipackets) among them, while q_errors remain zero for all queues. The only drop counter quickly increasing in the case of a pure ACK flood is ierrors, while rx_nombuf remains zero. 
We tried different kinds of traffic generators, but always got the same result: 7Mpps (instead of the expected 14Mpps) for TCP packets with the ACK flag bit set while all other flag bits are cleared. Source IPs and ports are selected randomly. Please let us know if anyone is aware of such strange behavior and where we should look to narrow down the problem. Thanks in advance, Alexander Belyakov
[dpdk-dev] rte_eth_rx_burst stops running on dpdk extlib
Hi Pepe, Please also make sure that you are following the correct makefile templates (as present in the examples) when you build your app separately. If you don't, then you will naturally miss out on some flags, and that can prove fatal because your app will see the rte header files differently from how the library was built, depending on the compiler flags. Regards -Prashant From: Jose Gavine Cueto [mailto:peped...@gmail.com] Sent: Wednesday, October 30, 2013 12:34 PM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: Re: [dpdk-dev] rte_eth_rx_burst stops running on dpdk extlib Hi Prashant, Are you referring to the rte_* libraries ? I had actually compiled them using the setup script ($(RTE_SDK)/tools/setup.sh) and then linked the PMD lib extension I've made and then linked the pktdump eventually, before seeing the problem. However, building it all as an app didn't show any issue. I will try to redo the building/linking just to make sure. Thank you! On Wed, Oct 30, 2013 at 2:57 PM, Prashant Upadhyaya mailto:prashant.upadhyaya at aricent.com>> wrote: Hi Pepe, How about this -- compile the libraries yourself and then link your application with them just like the original usecase where you find the problem. If this works, then the problem is with the precompiled libraries you were picking up from somewhere. Regards -Prashant -Original Message- From: dev [mailto:dev-bounces at dpdk.org<mailto:dev-boun...@dpdk.org>] On Behalf Of Jose Gavine Cueto Sent: Wednesday, October 30, 2013 12:22 PM To: dev at dpdk.org<mailto:dev at dpdk.org> Subject: Re: [dpdk-dev] rte_eth_rx_burst stops running on dpdk extlib Hi, Could someone help me ? Or at least let me know if what I'm doing (diagram above) is right with regard to using an external dpdk library ? I assume dpdk external libraries can be treated as normal C libraries. 
Cheers, Pepe On Wed, Oct 30, 2013 at 8:18 AM, Jose Gavine Cueto mailto:pepedocs at gmail.com>>wrote: > Hi, > > I'm writing a very simple packet dump application that can be > described by the following diagram: >
> ---------------------
> |      pktdump      |
> ---------------------
> | PMD lib extension |
> |     (extlib)      |
> ---------------------
> |   DPDK PMD lib    |
> ---------------------
> > pktdump - very simple app, built with gcc and linked with the pmd lib > extension and dpdk libs. > pmd lib extension - an extension of the dpdk pmd library, which provides > some higher-level APIs > dpdk pmd lib - pmd lib provided by Intel > > I have an issue where, when I run the pktdump app, its lcore > threads stop executing at varying times. Sometimes it doesn't > even run. > But this only happens if I use the PMD lib extension. On the other > hand, if pktdump is directly built with the pmd lib extension code while > the pmd lib extension is built as an extapp, it works very well. I wonder > what's the difference; code-wise there is none, the only difference I > can see is how they are built (extapp, extlib). > > The pmd lib extension's lcore threads basically do simple forwarding > (rx -> tx). So rte_eth_rx_burst is called when receiving packets and > rte_eth_tx_burst when transmitting packets. These run on an lcore thread. > > snippet of code that runs on the lcore: > > void burst_fwd(...){ > num_rx = rte_eth_rx_burst(...) > ... > rte_eth_tx_burst(...) > } > > Any tips on how to debug this? Some quick inspections may help. Is > there some specific build option for building libraries, because this > only happens on extlib. > -- To stop learning is like to stop loving.
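For reference, the makefile template Prashant mentions looks roughly like this for an external DPDK application in the 1.x SDK. Building through rte.extapp.mk (or rte.extlib.mk for a library) inherits the same CFLAGS the rte libraries were compiled with, which is exactly what avoids the header-interpretation mismatch. The SDK path, target name, app name, and source list below are placeholders:

```makefile
# RTE_SDK and RTE_TARGET are normally exported in the environment;
# the values here are placeholders.
RTE_SDK    ?= /path/to/dpdk
RTE_TARGET ?= x86_64-default-linuxapp-gcc

include $(RTE_SDK)/mk/rte.vars.mk

# Hypothetical app name and source file
APP = pktdump
SRCS-y := main.c

CFLAGS += -O3 $(WERROR_FLAGS)

include $(RTE_SDK)/mk/rte.extapp.mk
```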
[dpdk-dev] query about rte_eal_mp_remote_launch()
Hi, As long as you are doing the init properly in a controlled fashion from one core, you should be able to orchestrate the usecase with your own threads. I don't think there should be any limitations. Regards -Prashant From: Jyotiswarup Raiturkar [mailto:jyoti...@googlemail.com] Sent: Wednesday, October 23, 2013 8:01 PM To: Prashant Upadhyaya Cc: dev at dpdk.org Subject: Re: [dpdk-dev] query about rte_eal_mp_remote_launch() Hi Prashant Thanks for the reply. I understand what you said. But my query was: can I use pthread_create() to create the 'tight loop' threads on demand, rather than spawning the threads at start with rte_eal_mp_remote_launch(). Does anything in the dpdk core preclude using pthread_create() calls directly? -Jyoti On Wed, Oct 23, 2013 at 7:46 PM, Prashant Upadhyaya mailto:prashant.upadhyaya at aricent.com>> wrote: Hi Jyoti, You must carefully analyse your usecase. Typically each core must run a tight loop (and therefore one thread spawned by remote launch) which does a while 1 { get packet, service packet } You should try to build your application around the above paradigm. One of your cores can service the slow path using traditional linux with a tap interface. Regards -Prashant -Original Message- From: dev [mailto:dev-bounces at dpdk.org<mailto:dev-boun...@dpdk.org>] On Behalf Of Jyotiswarup Raiturkar Sent: Wednesday, October 23, 2013 5:11 PM To: dev at dpdk.org<mailto:dev at dpdk.org> Subject: [dpdk-dev] query about rte_eal_mp_remote_launch() Hello Devs I'm new to DPDK and trying to understand the basics.. I want to write a DPDK app where I want to configure shm rings on the fly, and I want one thread (per core) to service the ring. In some of the examples I saw rte_eal_mp_remote_launch() being used, but this is a one time launch. Can I use pthread_create() on-the-fly (taking care of CPU core allocation), after doing an initial threads launch using rte_eal_mp_remote_launch()? 
Thanks Jyotiswarup Raiturkar
[dpdk-dev] query about port queues
Hi Jyoti, You can configure the number of tx and rx queues via software when you call rte_eth_dev_configure. However, you cannot allocate more queues than what the NIC supports; you can of course allocate fewer. Typically the queues are used so that independent cores can do tx and rx on separate queues without locking. If you have configured 'n' rx queues, you must ensure that you read from _all_ the queues, because a packet can arrive on any of the rx queues based on the algorithm by which the NIC fans out incoming packets over the queues (eg. RSS). You can transmit freely from any queue (eg. each core of yours could have a tx queue of its own in your usecase) Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jyotiswarup Raiturkar Sent: Wednesday, October 23, 2013 5:10 PM To: dev at dpdk.org Subject: [dpdk-dev] query about port queues Hello Devs I'm new to DPDK and trying to understand the basics. I went through the programming guide but I had one question regarding Tx and Rx queues per port. Are they configurable entirely in software or do they depend on the HW (NIC)? Does the L2 configuration (MAC address) apply to all the queues on the port? (and hence will an application like say a network stack need packets from all the queues in the port)? Thanks Jyotiswarup Raiturkar
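The configuration Prashant describes can be sketched as follows. This is a rough sketch in DPDK-1.x-era terms; struct fields and defaults drifted between releases, so treat the exact names (ETH_MQ_RX_RSS, the zeroed rxconf/txconf) as approximate, and the queue and descriptor counts as illustrative:

```c
#include <string.h>
#include <rte_ethdev.h>

#define NB_RX_QUEUES 4    /* one per polling core; all must be drained */
#define NB_TX_QUEUES 4    /* one tx queue per core keeps tx lockless   */
#define NB_RXD 128
#define NB_TXD 512

/* Configure 'port' with multiple rx/tx queues (must not exceed what the
 * NIC supports). 'mp' is an existing mbuf mempool; the threshold values
 * in rxconf/txconf are left zeroed here, but real apps tune them. */
static int port_setup(uint8_t port, struct rte_mempool *mp)
{
    struct rte_eth_conf conf;
    struct rte_eth_rxconf rxconf;
    struct rte_eth_txconf txconf;
    uint16_t q;
    int ret;

    memset(&conf, 0, sizeof(conf));
    conf.rxmode.mq_mode = ETH_MQ_RX_RSS;   /* NIC fans rx out over queues */
    memset(&rxconf, 0, sizeof(rxconf));
    memset(&txconf, 0, sizeof(txconf));

    ret = rte_eth_dev_configure(port, NB_RX_QUEUES, NB_TX_QUEUES, &conf);
    if (ret < 0)
        return ret;

    for (q = 0; q < NB_RX_QUEUES; q++) {
        ret = rte_eth_rx_queue_setup(port, q, NB_RXD,
                                     rte_eth_dev_socket_id(port), &rxconf, mp);
        if (ret < 0)
            return ret;
    }
    for (q = 0; q < NB_TX_QUEUES; q++) {
        ret = rte_eth_tx_queue_setup(port, q, NB_TXD,
                                     rte_eth_dev_socket_id(port), &txconf);
        if (ret < 0)
            return ret;
    }
    return rte_eth_dev_start(port);
}
```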
[dpdk-dev] query about rte_eal_mp_remote_launch()
Hi Jyoti, You must carefully analyse your usecase. Typically each core must run a tight loop (and therefore one thread spawned by remote launch) which does a while 1 { get packet, service packet } You should try to build your application around the above paradigm. One of your cores can service the slow path using traditional linux with a tap interface. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jyotiswarup Raiturkar Sent: Wednesday, October 23, 2013 5:11 PM To: dev at dpdk.org Subject: [dpdk-dev] query about rte_eal_mp_remote_launch() Hello Devs I'm new to DPDK and trying to understand the basics.. I want to write a DPDK app where I want to configure shm rings on the fly, and I want one thread (per core) to service the ring. In some of the examples I saw rte_eal_mp_remote_launch() being used, but this is a one time launch. Can I use pthread_create() on-the-fly (taking care of CPU core allocation), after doing an initial threads launch using rte_eal_mp_remote_launch()? Thanks Jyotiswarup Raiturkar
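The tight-loop-per-core paradigm described above can be sketched roughly as below, in DPDK-1.x-era terms. The queue-per-lcore mapping derived from rte_lcore_id() is illustrative (real apps keep an explicit lcore-to-queue map), and port/queue setup is elided:

```c
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define BURST 32

/* One tight loop per lcore: poll "its" rx queue forever.
 * while 1 { get packet, service packet } */
static int lcore_main(void *arg)
{
    uint16_t queue = (uint16_t)rte_lcore_id();  /* illustrative mapping */
    struct rte_mbuf *bufs[BURST];
    (void)arg;

    for (;;) {
        uint16_t n = rte_eth_rx_burst(0 /* port */, queue, bufs, BURST);
        for (uint16_t i = 0; i < n; i++) {
            /* service packet: classify, forward, or hand to slow path */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)
        return -1;
    /* ... port and queue setup elided ... */
    rte_eal_mp_remote_launch(lcore_main, NULL, CALL_MASTER);
    rte_eal_mp_wait_lcore();
    return 0;
}
```

On Jyoti's pthread_create question: in this era the per-lcore EAL state (rte_lcore_id, per-lcore mempool caches) assumes EAL-managed threads, so threads created directly with pthread_create need care around any per-lcore API.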
[dpdk-dev] 82599 SR-IOV with passthrough
Hi Qinglai, I would say that SRIOV is 'useless' if the VF gets only one queue. At the heart of performance is using one queue per core so that the tx and rx remain lockless. Locks 'destroy' performance. So with one queue, if we want to remain lockless, that automatically means that the usecase is restricted to one core, ergo useless for any usecase worth its salt. It was courtesy your mail that I 'discovered' that DPDK has such a limitation. So I am all for this patch to go in DPDK. Good luck ! Regards -Prashant -Original Message- From: jigsaw [mailto:jig...@gmail.com] Sent: Thursday, October 17, 2013 6:14 PM To: Prashant Upadhyaya Cc: Thomas Monjalon; dev at dpdk.org Subject: Re: [dpdk-dev] 82599 SR-IOV with passthrough Hi Prashant, I patched both the Intel ixgbe PF driver and the DPDK 1.5 VF driver, so that DPDK gets 4 queues in one VF. It works fine with all 4 Tx queues. The only trick is to set the proper mac address for all outgoing packets, which must be the same mac as you set on the VF. This trick is described in the release note of DPDK. I wonder whether it makes sense to push this patch to DPDK. Any comments? thx & rgds, -ql On Thu, Oct 17, 2013 at 2:55 PM, Prashant Upadhyaya wrote: > Hi Qinglai, > > Why are you using the kernel driver at all? > Use the DPDK driver to control the PF on the host. The guest would > communicate with the PF on the host using the mailbox as usual. > Then the changes will be limited to DPDK, isn't it ? > > Regards > -Prashant > > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of jigsaw > Sent: Wednesday, October 16, 2013 6:51 PM > To: Thomas Monjalon > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] 82599 SR-IOV with passthrough > > Hi Thomas, > > Thanks for reply. > > The kernel has an older version of the PF driver than the one released on sf.net. So I'm > checking the sf.net release. > If the change is limited to DPDK then it is controllable. 
But now it affects > Intel's PF driver, and I don't even know how to push the feature to Intel. The > driver on sf.net is a read-only repository, isn't it? It would be painful to > maintain another branch of the 10G PF driver. > Could Intel give some advice or hints here? > > thx & > rgds, > -Qinglai > > On Wed, Oct 16, 2013 at 3:58 PM, Thomas Monjalon 6wind.com> wrote: >> 16/10/2013 14:18, jigsaw : >>> Therefore, to add support for multiple queues per VF, we have to at >>> least fix the PF driver, then add support in DPDK's VF driver. >> >> You're right, the Linux PF driver has to be updated to properly manage >> multiple queues per VF. Then the guest can be tested with DPDK or >> with the Linux driver (ixgbe_vf). >> >> Note that there are 2 versions of the Linux driver for ixgbe: kernel.org >> and sourceforge.net (supporting many kernel versions). >> >> -- >> Thomas
[dpdk-dev] 82599 SR-IOV with passthrough
Hi Qinglai, Even with 1 queue, were you able to run the DPDK app in the guest OS ? If you were able to, which version of DPDK did you use, please let me know. I am trying to run the DPDK app in a guest OS using QEMU/KVM with an SRIOV virtual function of an 82599 NIC. I can see the vf pci address in the lspci output on the guest OS, but when I try to run the DPDK app in the guest OS, the EAL complains with the following -- EAL: pci_uio_map_resource(): cannot store uio mmap details I am using DPDK1.4. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of jigsaw Sent: Wednesday, October 16, 2013 5:49 PM To: dev at dpdk.org Subject: [dpdk-dev] 82599 SR-IOV with passthrough Hi, I am doing experiments with SR-IOV + passthrough on 82599. My expectation is to have VT on and DCB off, under which configuration the total 128 TX queues will be split into 32 pools, each having 4 queues. With the latest driver ixgbe-3.18.7, the PF can be set with 16 pools, each having 4 queues, with these params: insmod ./ixgbe.ko MQ=1 max_vfs=8 RSS=8 VMDQ=16 I tried VMDQ=32 and got a panic. Also, it seems that if RSS is set to 4, the PF driver will set RSS to 2 somehow. Since I'm fine with 16 pools + 4 queues, I'm not going to investigate (at this moment) why the PF doesn't work as expected. The next step is then to try DPDK in the guest OS, which gets one VF by passthrough. Not surprisingly, DPDK says that the number of TX queues is 1. This is because the value is set arbitrarily in ixgbe_init_ops_vf of ixgbe_vf.c, and it never gets updated. Actually the mbox API has support for requesting Tx/Rx queue numbers from the VF. See the implementation of routines ixgbevf_get_queues and ixgbevf_negotiate_api_version. However, it is not straightforward to use these 2 routines to fetch the Tx/Rx queue number, because the PF driver is not ready to be used without modification. See ixgbe_get_vf_queues of ixgbe_sriov.c in ixgbe-3.18.7. 
The PF will always answer with 1 for Tx/Rx queue number requests, regardless of current config. Therefore, to add support for multiple queues per VF, we have to at least fix the PF driver, then add support in DPDK's VF driver. But the question is, is this enough? Before doing any experiments I wonder whether anybody has come across the same problem as I do, and if there's any implementation ongoing. thx & rgds, -Qinglai
[dpdk-dev] sending and receiving packets
Hi Gopi, I have not worked with the rumpkernel tcpip stack. Does it run 'with' the DPDK in userspace, and is your tcp client application interacting over sockets with that tcpip stack in user space ? If your stack is running in the kernel, then of course you have to use a tap interface to interface with the kernel. Can you describe your usecase in more detail, eg. what is the dpdk app, is the tcp client itself the dpdk app, and so forth. Normally, I use tcpclient/server as normal apps interfacing with the linux kernel. I run a DPDK app and use a tap interface to switch packets in and out of the kernel. The kernel interacts over sockets with tcpclient/server as usual. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Gopi Krishna B Sent: Wednesday, October 16, 2013 8:13 PM To: dev at dpdk.org Subject: [dpdk-dev] sending and receiving packets Hi, I have DPDK 1.5 configured on my machine, and I have also configured the rump kernel tcpip stack. Now, to check whether the setup works, I started a *TCP Server application* on another machine and connected the LAN cable to the port which is controlled by DPDK. And I am running a *TCP client application* on the machine having DPDK and the rumpkernel tcpip stack. The tcp client and server cannot communicate; is there some other configuration to be taken care of for the traffic to flow appropriately on the machine running DPDK. I have checked similar posts on the mailing list, but did not get a clue on how to debug the issue I am facing. Any pointers/suggestions would be really of great help. -- Regards Gopi Krishna
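The tap-based exception path mentioned above can be sketched with plain Linux calls; this is a minimal sketch, assuming a device name chosen by the kernel and CAP_NET_ADMIN privileges. A DPDK fast-path core would write() exception-path frames into this fd and read() the kernel's responses back for transmission via the tx burst API:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Open a tap device; returns the fd or -1. Caller needs CAP_NET_ADMIN.
 * If 'name' is non-empty it requests that interface name; on success
 * the actual name assigned by the kernel is copied back into 'name'. */
static int tap_open(char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;   /* raw L2 frames, no extra header */
    if (name != NULL && *name != '\0')
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    if (name != NULL)
        strncpy(name, ifr.ifr_name, IFNAMSIZ);
    return fd;
}

/* Exception-path shuttle (DPDK side elided): frames classified as
 * slow-path by the fast path are write()n to this fd so the kernel
 * stack answers them; frames read() from the fd go out via the NIC. */
```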