[dpdk-dev] MENNIC1.2 host-sim crashed for me

2014-07-15 Thread Hiroshi Shimamoto
Hi,

> Subject: [dpdk-dev] MENNIC1.2 host-sim crashed for me
> 
> Hi,
> I want to run MEMNIC 1.2 application .
> 
> 1.   I compiled DPDK1.6
> 
> 2.   I compiled memnic.12
> 
> 3.   And while running the memnic-host-sim app, it got stuck
> 
> 4.
> 
> 5.   [root at localhost host-sim]# ./memnic-host-sim /dev/shm/ivshm
> 
> Bus error (core dumped)
> 
> Core was generated by `./memnic-host-sim  /dev/shm/ivshm'.
> Program terminated with signal SIGBUS, Bus error.
> #0  0x003a82e894e4 in memset () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install glibc-2.18-11.fc20.x86_64
> 
> (gdb) bt
> #0  0x003a82e894e4 in memset () from /lib64/libc.so.6
> #1  0x004008a3 in init_memnic (nic=0x76fe2000) at host-sim.c:55
> #2  0x00400a8a in main (argc=2, argv=0x7fffe4a8) at host-sim.c:106
> (gdb)
> 
> Got the error at line 55, saying nic is read-only.


I have never tried host-sim myself, though.
I guess the cause is that host-sim doesn't increase the shared memory size.
Could you try booting qemu first with -device ivshmem,size=16,shm=/ivshm and then
run host-sim?
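
(For reference, SIGBUS on an mmap'ed region is the typical symptom of writing past
the end of the backing file. Below is a minimal sketch of the idea, assuming host-sim
only needs to size the shared-memory file before mapping it; the names and the 16MB
size are illustrative, not the actual host-sim code.)

/* Minimal sketch (not the actual host-sim code): size the shared-memory
 * file before mapping it, so writes such as the memset() in init_memnic()
 * touch backed pages instead of raising SIGBUS. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_SIZE (16 * 1024 * 1024)   /* matches ivshmem "size=16" (MB) */

int main(int argc, char **argv)
{
    int fd;
    void *area;

    if (argc < 2)
        return 1;
    fd = open(argv[1], O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* without this the file may be empty, and touching the mapping
     * past end-of-file raises SIGBUS, exactly the crash above */
    if (ftruncate(fd, SHM_SIZE) < 0) {
        perror("ftruncate");
        return 1;
    }
    area = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (area == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memset(area, 0, SHM_SIZE);   /* safe once the file is sized */
    munmap(area, SHM_SIZE);
    close(fd);
    return 0;
}

Booting qemu first with -device ivshmem,size=16,shm=/ivshm should have the same
effect, since qemu then sizes /dev/shm/ivshm before host-sim maps it.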

thanks,
Hiroshi

> 
> 
> 
> 53 static void init_memnic(struct memnic_area *nic)
> 54 {
> 55 memset(nic, 0, sizeof(*nic));
> 56 nic->hdr.magic = MEMNIC_MAGIC;
> 57 nic->hdr.version = MEMNIC_VERSION;
> 58 /* 00:09:c0:00:13:37 */
> 59 nic->hdr.mac_addr[0] = 0x00;
> 60 nic->hdr.mac_addr[1] = 0x09;
> 61 nic->hdr.mac_addr[2] = 0xc0;
> 62 nic->hdr.mac_addr[3] = 0x00;
> 63 nic->hdr.mac_addr[4] = 0x13;
> 64 nic->hdr.mac_addr[5] = 0x37;
> 65 }
> 
> 
> 
> Thanks,
> 
> Srinivas.
> 
> "DISCLAIMER: This message is proprietary to Aricent and is intended solely 
> for the use of the individual to whom it is
> addressed. It may contain privileged or confidential information and should 
> not be circulated or used for any purpose
> other than for what it is intended. If you have received this message in 
> error, please notify the originator immediately.
> If you are not the intended recipient, you are notified that you are strictly 
> prohibited from using, copying, altering,
> or disclosing the contents of this message. Aricent accepts no responsibility 
> for loss or damage arising from the use
> of the information transmitted by this email including damage from virus."


[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Zhou, Danny
According to my performance measurement results for 64B small packets, 1-queue
performance is better than 16 queues (1.35M pps vs. 0.93M pps), which makes sense to
me: in the 16-queue case, more CPU cycles are spent in kernel land (87% for 16 queues
vs. 80% for 1 queue) for the NAPI-enabled ixgbe driver to switch between polling and
interrupt modes in order to service per-queue rx interrupts, so more context-switch
overhead is involved. Also, since the eth_packet_rx/eth_packet_tx routines involve
two memory copies between the DPDK mbuf and pbuf for each packet, it can hardly
achieve high performance unless packets are DMA'd directly to the mbuf, which needs
ixgbe driver support.
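
(To make the copy cost concrete, here is a hypothetical sketch of the rx-side copy;
the function name, signature and ring-handling details are illustrative, not the
actual rte_eth_packet.c code.)

#include <linux/if_packet.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

/* Sketch of the rx-side copy: kernel AF_PACKET ring frame -> mbuf.
 * The tx path would do the mirror-image copy from the mbuf back into
 * the ring, hence two copies per forwarded packet. */
static inline struct rte_mbuf *
copy_frame_to_mbuf(const struct tpacket2_hdr *hdr, struct rte_mempool *mp)
{
    struct rte_mbuf *m = rte_pktmbuf_alloc(mp);

    if (m == NULL)
        return NULL;
    rte_memcpy(rte_pktmbuf_mtod(m, void *),
               (const uint8_t *)hdr + hdr->tp_mac, hdr->tp_snaplen);
    rte_pktmbuf_data_len(m) = hdr->tp_snaplen;
    rte_pktmbuf_pkt_len(m) = hdr->tp_snaplen;
    return m;
}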

> -Original Message-
> From: John W. Linville [mailto:linville at tuxdriver.com]
> Sent: Tuesday, July 15, 2014 2:25 AM
> To: dev at dpdk.org
> Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny
> Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual
> devices
> 
> This is a Linux-specific virtual PMD driver backed by an AF_PACKET socket.  
> This
> implementation uses mmap'ed ring buffers to limit copying and user/kernel
> transitions.  The PACKET_FANOUT_HASH behavior of AF_PACKET is used for
> frame reception.  In the current implementation, Tx and Rx queues are always 
> paired,
> and therefore are always equal in number -- changing this would be a Simple 
> Matter
> Of Programming.
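
(For readers unfamiliar with PACKET_FANOUT_HASH, here is a rough sketch of the
socket setup the description refers to; names are assumptions, and the
PACKET_RX_RING mmap setup and error handling that the real driver needs are omitted.)

#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

/* Open an AF_PACKET socket bound to one interface and join a hash fanout
 * group, so the kernel spreads flows across the sockets in the group. */
static int
open_fanout_socket(const char *iface, int group_id)
{
    struct sockaddr_ll addr;
    int fd, fanout_arg;

    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ETH_P_ALL);
    addr.sll_ifindex = if_nametoindex(iface);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;

    /* low 16 bits: group id, high 16 bits: fanout mode */
    fanout_arg = (group_id & 0xffff) | (PACKET_FANOUT_HASH << 16);
    if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT,
                   &fanout_arg, sizeof(fanout_arg)) < 0)
        return -1;
    return fd;
}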
> 
> Interfaces of this type are created with a command line option like
> "--vdev=eth_packet0,iface=...".  There are a number of options availabe as
> arguments:
> 
>  - Interface is chosen by "iface" (required)
>  - Number of queue pairs set by "qpairs" (optional, default: 1)
>  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
>  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
>  - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
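
(Putting the arguments above together, a hypothetical invocation, with interface
name and values chosen purely for illustration, might look like:)

./testpmd -c 0x3 -n 4 \
    --vdev=eth_packet0,iface=eth1,qpairs=2,blocksz=4096,framesz=2048,framecnt=512 \
    -- -i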
> 
> Signed-off-by: John W. Linville 
> ---
> This PMD is intended to provide a means for using DPDK on a broad range of
> hardware without hardware-specific PMDs and (hopefully) with better 
> performance
> than what PCAP offers in Linux.  This might be useful as a development 
> platform for
> DPDK applications when DPDK-supported hardware is expensive or unavailable.
> 
> New in v2:
> 
> -- fixup some style issues found by check patch
> -- use if_index as part of fanout group ID
> -- set default number of queue pairs to 1
> 
>  config/common_bsdapp   |   5 +
>  config/common_linuxapp |   5 +
>  lib/Makefile   |   1 +
>  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
>  lib/librte_pmd_packet/Makefile |  60 +++
>  lib/librte_pmd_packet/rte_eth_packet.c | 826
> +
> lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
>  mk/rte.app.mk  |   4 +
>  8 files changed, 957 insertions(+)
>  create mode 100644 lib/librte_pmd_packet/Makefile  create mode 100644
> lib/librte_pmd_packet/rte_eth_packet.c
>  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> 
> diff --git a/config/common_bsdapp b/config/common_bsdapp index
> 943dce8f1ede..c317f031278e 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
> CONFIG_RTE_LIBRTE_PMD_BOND=y
> 
>  #
> +# Compile software PMD backed by AF_PACKET sockets (Linux only) #
> +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> +
> +#
>  # Do prefetch of packet data within PMD driver receive function  #
> CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/config/common_linuxapp
> b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
> CONFIG_RTE_LIBRTE_PMD_BOND=y
> 
>  #
> +# Compile software PMD backed by AF_PACKET sockets (Linux only) #
> +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> +
> +#
>  # Compile Xen PMD
>  #
>  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> diff --git a/lib/Makefile b/lib/Makefile index 10c5bb3045bc..930fadf29898 
> 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=
> librte_pmd_i40e
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
>  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
>  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt diff --git
> a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> index 756d6b0c9301..feed24a63272 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether  CFLAGS +=
> -I$(RTE_SDK)/lib/librte_ivshmem  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring
> CFLAGS += -I$(RTE_SDK)/lib

[dpdk-dev] Make DPDK tailqs fully local

2014-07-15 Thread Gray, Mark D
Hi,

What are the plans to resolve this issue? Will this patch get upstreamed?

http://www.dpdk.org/ml/archives/dev/2014-June/003591.html

Thanks,

Mark



[dpdk-dev] Making space in mbuf data-structure

2014-07-15 Thread Ananyev, Konstantin

Hi Olivier,

> 
> As this change would impact the core of DPDK, I think it would be
> interesting to list some representative use-cases in order to evaluate
> the cost of each solution. This will also help for future modifications,
> and could be included in a sort of non-regression test?
> 

I think this is a very good idea.
Let's create a list of such test-cases.   
Two obvious apps from our side would probably be:
- test-pmd iofwd mode
- l3fwd
What else should we add here?
Konstantin


[dpdk-dev] Making space in mbuf data-structure

2014-07-15 Thread Ananyev, Konstantin
Hi Bruce,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Richardson, Bruce
> Sent: Friday, July 04, 2014 12:39 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Making space in mbuf data-structure
> 
> Hi all,
> 
> At this stage it's been well recognised that we need to make more space in 
> the mbuf data structure for new fields. We in Intel have
> had some discussion on this and this email is to outline what our current 
> thinking and approach on this issue is, and look for additional
> suggestions and feedback.
> 
> Firstly, we believe that there is no possible way that we can ever fit all 
> the fields we need to fit into a 64-byte mbuf, and so we need to
> start looking at a 128-byte mbuf instead. Examples of new fields that need to 
> fit in, include - 32-64 bits for additional offload flags, 32-
> bits more for filter information for support for the new filters in the i40e 
> driver, an additional 2-4 bytes for storing info on a second
> vlan tag, 4-bytes for storing a sequence number to enable packet reordering 
> in future, as well as potentially a number of other fields
> or splitting out fields that are superimposed over each other right now, e.g. 
> for the qos scheduler. We also want to allow space for use
> by other non-Intel NIC drivers that may be open-sourced to dpdk.org in the 
> future too, where they support fields and offloads that
> our hardware doesn't.
> 
> If we accept the fact of a 2-cache-line mbuf, then the issue becomes how to 
> rework things so that we spread our fields over the two
> cache lines while causing the lowest slow-down possible. The general approach 
> that we are looking to take is to focus the first cache
> line on fields that are updated on RX , so that receive only deals with one 
> cache line. The second cache line can be used for application
> data and information that will only be used on the TX leg. This would allow 
> us to work on the first cache line in RX as now, and have the
> second cache line being prefetched in the background so that it is available 
> when necessary. Hardware prefetches should help us out
> here. We also may move rarely used, or slow-path RX fields e.g. such as those 
> for chained mbufs with jumbo frames, to the second
> cache line, depending upon the performance impact and bytes savings achieved.
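
(As a purely illustrative sketch of the two-cache-line split described above; field
names, sizes and padding are simplified and this is not the actual rte_mbuf
definition.)

#include <stdint.h>

struct rte_mempool;
struct rte_mbuf;

/* Illustration only: RX-hot fields kept in cache line 0, TX-cleanup and
 * slow-path fields (mempool pointer, chaining) pushed to cache line 1. */
struct mbuf_sketch {
    /* cache line 0: everything RX needs to write */
    void *buf_addr;           /*  8B: start of data buffer              */
    uint16_t data_off;        /*  2B                                    */
    uint16_t data_len;        /*  2B                                    */
    uint32_t pkt_len;         /*  4B                                    */
    uint64_t ol_flags;        /*  8B: room for expanded offload flags   */
    uint32_t filter_info;     /*  4B: e.g. i40e filter match data       */
    uint32_t seqn;            /*  4B: sequence number for reordering    */
    uint8_t pad0[32];         /* pad cache line 0 out to 64 bytes       */

    /* cache line 1: only touched on the TX leg / slow path */
    struct rte_mempool *pool; /* needed only when freeing after TX      */
    struct rte_mbuf *next;    /* chained mbufs / jumbo frames           */
    uint8_t pad1[48];
} __attribute__((aligned(64)));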
> 
> With this approach, we still need to make space in the first cache line for 
> information for the new or expanded receive offloads. First
> off the blocks is to look at moving the mempool pointer into the second cache 
> line. This will free-up 8 bytes in cache line  0, with a field
> that is only used when cleaning up after TX. A prototype patch for this 
> change is given below, for your review and comment. Initial
> quick tests with testpmd (testpmd -c 600 -n 4 -- --mbcache=250 
> --txqflags=0xf01 --burst=32 --txrst=32 --txfreet=32 --rxfreet=32 --
> total-num-mbufs=65536), and l3fwd (l3fwd -c 400 -n 4 -- -p F -P 
> --config="(0,0,10),(1,0,10),(2,0,10),(3,0,10)") showed only a slight
> performance decrease with testpmd and equal or slightly better performance 
> with l3fwd. These would both have been using the
> vector PMD - I haven't looked to change the fallback TX implementation yet 
> for this change, since it's not as performance optimized
> and hence cycle-count sensitive.

Regarding the code changes themselves:
If I understand your code changes correctly:
ixgbe_tx_free_bufs() will use rte_mempool_put_bulk() *only* if all mbufs in the
bulk belong to the same mempool.
If we have, let's say, 2 groups of mbufs from 2 different mempools, it wouldn't be
able to use put_bulk.
While, as I remember, the current implementation is able to use put_bulk() in such
a case.
I think it is possible to change your code to do a better grouping.
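
(A rough sketch of one way such grouping could look; this is a hypothetical helper,
not the ixgbe code. It flushes a run of mbufs to rte_mempool_put_bulk() each time
the pool pointer changes.)

#include <rte_mbuf.h>
#include <rte_mempool.h>

static inline void
free_bufs_grouped(struct rte_mbuf **pkts, unsigned int n)
{
    unsigned int i, start = 0;

    for (i = 1; i <= n; i++) {
        /* flush the current run when the pool changes or at the end */
        if (i == n || pkts[i]->pool != pkts[start]->pool) {
            rte_mempool_put_bulk(pkts[start]->pool,
                                 (void **)&pkts[start], i - start);
            start = i;
        }
    }
}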

> 
> Beyond this change, I'm also investigating potentially moving the "next" 
> pointer to the second cache line, but it's looking harder to
> move without serious impact, so we'll have to see there what can be done. 
> Other fields I'll examine in due course too, and send on
> any findings I may have.

Could you explain it a bit?
I always had the impression that moving the next pointer to the 2nd cache line
would have less impact than moving the mempool pointer.
My thinking was: next is used only in the scatter versions of RX/TX, which are
slower than the optimised RX/TX anyway.
So a few extra cycles wouldn't be that noticeable.
Though I admit I never did any measurements for that case.

About space savings for the first cache line:
- I still think that Olivier's suggestion of replacing the data pointer (8 bytes)
with a data offset (2 bytes) makes a lot of sense.
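
(For context, the idea is roughly the following; a sketch, not the actual patch.)

#include <stdint.h>

/* Replace the absolute 8-byte data pointer with a 2-byte offset from
 * buf_addr; the data pointer is then recomputed on demand. */
struct mbuf_data_off_sketch {
    void *buf_addr;
    uint16_t data_off;
};

static inline void *
mbuf_data(const struct mbuf_data_off_sketch *m)
{
    return (char *)m->buf_addr + m->data_off;
}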

> 
> Once we have freed up space, then we can start looking at what fields get to 
> use that space and what way we shuffle the existing
> fields about, but that's a discussion for another day!
> 
> Please see patch below for moving pool pointer to second cache line of mbuf. 
> All feedback welcome, naturally.
> 
> Regards,
> /Bruce

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Neil Horman
On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> According to my performance measurement results for 64B small packet, 1 queue 
> perf. is better than 16 queues (1.35M pps vs. 0.93M pps) which make sense to 
> me as for 16 queues case more CPU cycles (16 queues' 87% vs. 1 queue' 80%) in 
> kernel land needed for NAPI-enabled ixgbe driver to switch between polling 
> and interrupt modes in order to service per-queue rx interrupts, so more 
> context switch overhead involved. Also, since the eth_packet_rx/eth_packet_tx 
> routines involves in two memory copies between DPDK mbuf and pbuf for each 
> packet, it can hardly achieve high performance unless packet are directly DMA 
> to mbuf which needs ixgbe driver to support.

I thought 16 queues would be spread out between as many cpus as you had though,
obviating the need for context switches, no?
Neil

> 
> > -Original Message-
> > From: John W. Linville [mailto:linville at tuxdriver.com]
> > Sent: Tuesday, July 15, 2014 2:25 AM
> > To: dev at dpdk.org
> > Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny
> > Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual
> > devices
> > 
> > This is a Linux-specific virtual PMD driver backed by an AF_PACKET socket.  
> > This
> > implementation uses mmap'ed ring buffers to limit copying and user/kernel
> > transitions.  The PACKET_FANOUT_HASH behavior of AF_PACKET is used for
> > frame reception.  In the current implementation, Tx and Rx queues are 
> > always paired,
> > and therefore are always equal in number -- changing this would be a Simple 
> > Matter
> > Of Programming.
> > 
> > Interfaces of this type are created with a command line option like
> > "--vdev=eth_packet0,iface=...".  There are a number of options availabe as
> > arguments:
> > 
> >  - Interface is chosen by "iface" (required)
> >  - Number of queue pairs set by "qpairs" (optional, default: 1)
> >  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
> >  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
> >  - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
> > 
> > Signed-off-by: John W. Linville 
> > ---
> > This PMD is intended to provide a means for using DPDK on a broad range of
> > hardware without hardware-specific PMDs and (hopefully) with better 
> > performance
> > than what PCAP offers in Linux.  This might be useful as a development 
> > platform for
> > DPDK applications when DPDK-supported hardware is expensive or unavailable.
> > 
> > New in v2:
> > 
> > -- fixup some style issues found by check patch
> > -- use if_index as part of fanout group ID
> > -- set default number of queue pairs to 1
> > 
> >  config/common_bsdapp   |   5 +
> >  config/common_linuxapp |   5 +
> >  lib/Makefile   |   1 +
> >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> >  lib/librte_pmd_packet/Makefile |  60 +++
> >  lib/librte_pmd_packet/rte_eth_packet.c | 826
> > +
> > lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
> >  mk/rte.app.mk  |   4 +
> >  8 files changed, 957 insertions(+)
> >  create mode 100644 lib/librte_pmd_packet/Makefile  create mode 100644
> > lib/librte_pmd_packet/rte_eth_packet.c
> >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> > 
> > diff --git a/config/common_bsdapp b/config/common_bsdapp index
> > 943dce8f1ede..c317f031278e 100644
> > --- a/config/common_bsdapp
> > +++ b/config/common_bsdapp
> > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
> > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > 
> >  #
> > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #
> > +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> > +
> > +#
> >  # Do prefetch of packet data within PMD driver receive function  #
> > CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/config/common_linuxapp
> > b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644
> > --- a/config/common_linuxapp
> > +++ b/config/common_linuxapp
> > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
> > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > 
> >  #
> > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #
> > +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> > +
> > +#
> >  # Compile Xen PMD
> >  #
> >  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> > diff --git a/lib/Makefile b/lib/Makefile index 10c5bb3045bc..930fadf29898 
> > 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=
> > librte_pmd_i40e
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
> >  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
> >  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd

[dpdk-dev] FW: MENNIC1.2 host-sim crashed for me

2014-07-15 Thread Srinivas Reddi

Hi Hiroshi,
Thanks for your reply. I have moved forward a little bit.

MEMNIC-1.2

1. I started qemu and then started the host-sim application

Qemu command :
qemu-system-x86_64 -enable-kvm -cpu host   -boot c -hda 
/home/vm-images/vm1-clone.img -m 8192M -smp 3 --enable-kvm -name vm1 -vnc :1 
-pidfile /tmp/vm1.pid -drive file=fat:rw:/tmp/share  -device 
ivshmem,size=16,shm=ivshm
vvfat fat:rw:/tmp/share chs 1024,16,63

2. Host-sim app command:
3. [root at localhost host-sim]# ./memnic-host-sim /dev/shm/ivshm
4. On the guest, compiled memnic-1.2.
5. Inserted memnic.ko.
6. Found an interface ens4 after insmod memnic.ko.

[root at localhost memnic-1.2]# ifconfig -a
ens3: flags=4163  mtu 1500
inet6 fe80::5054:ff:fe12:3456  prefixlen 64  scopeid 0x20
ether 52:54:00:12:34:56  txqueuelen 1000  (Ethernet)
RX packets 0  bytes 0 (0.0 B)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 8  bytes 648 (648.0 B)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens4: flags=4098  mtu 1500
ether 00:09:c0:00:13:37  txqueuelen 1000  (Ethernet)
RX packets 0  bytes 0 (0.0 B)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 0  bytes 0 (0.0 B)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73  mtu 65536
inet 127.0.0.1  netmask 255.0.0.0
inet6 ::1  prefixlen 128  scopeid 0x10
loop  txqueuelen 0  (Local Loopback)
RX packets 386  bytes 33548 (32.7 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 386  bytes 33548 (32.7 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

7.lspci on the guest

[root at localhost memnic-1.2]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet 
Controller (rev 03)
00:04.0 RAM memory: Red Hat, Inc Device 1110 [root at localhost memnic-1.2]#

8.on the Guest  ran test pmd application

[root at localhost test-pmd]# ./testpmd -c7 -n3  -- --d 
/usr/local/lib/librte_pmd_memnic_copy.so  -i --nb-cores=1 --nb-ports=1 
--port-topology=chained
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Setting up memory...
EAL: Ask a virtual area of 0x4000 bytes
EAL: Virtual area found at 0x7f551400 (size = 0x4000)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~3092833 KHz
EAL: Master core 0 is ready (tid=54398880)
EAL: Core 1 is ready (tid=135f8700)
EAL: Core 2 is ready (tid=12df7700)
EAL: PCI device :00:03.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   :00:03.0 not managed by UIO driver, skipping
EAL: Error - exiting with code: 1
  Cause: No probed ethernet devices - check that CONFIG_RTE_LIBRTE_IGB_PMD=y 
and that CONFIG_RTE_LIBRTE_EM_PMD=y and that CONFIG_RTE_LIBRTE_IXGBE_PMD=y in 
your configuration file
[root at localhost test-pmd]#

How can I bind the 00:04.0 RAM controller to the DPDK application (test-pmd)?
How does the DPDK test-pmd application find the memnic device?

9. Am I missing any steps in the guest or host configuration?
10. Is there a better manual for testing MEMNIC-1.2, or for better understanding?
11. Is there a better application to test MEMNIC for VM-to-VM or VM-to-host data
transfer?

Thanks &  regards,
Srinivas.



-Original Message-
From: Hiroshi Shimamoto [mailto:h-shimam...@ct.jp.nec.com]
Sent: Tuesday, July 15, 2014 5:31 AM
To: Srinivas Reddi; dev at dpdk.org
Subject: RE: MENNIC1.2 host-sim crashed for me

Hi,

> Subject: [dpdk-dev] MENNIC1.2 host-sim crashed for me
>
> Hi,
> I want to run MEMNIC 1.2 application .
>
> 1.   I compiled DPDK1.6
>
> 2.   I compiled memnic.12
>
> 3.   And while running memnic-hostsim appgot strucked
>
> 4.
>
> 5.   [root at localhost host-sim]# ./memnic-host-sim /dev/shm/ivshm
>
> Bus error (core dumped)
>
>
>
> Core was generated by `./memnic-host-sim  /dev/shm/ivshm'.
>
> Program terminated with signal SIGBUS, Bus error.
>
> #0  0x003a82e894e4 in memset () from /lib64/libc.so.6
>
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.18-11.fc20.x86_64
>
> (gdb) bt
>
> #0  0x003a82e894e4 in memset () from /lib64/libc.so.6
>
> #1  0x004008a3 in init_memnic (nic=0x76fe2000) at
> host-sim.c:55
>
> #2  0x00400a8a in main (argc=2, argv=0x7fffe4a8) at
> host-sim.c:106
>
> (gdb)
>
>
>
>
>
> Got error at line 55 .. saying nic is read only..


I have never tried host-sim yet though.
I guess it's the cause that host-sim doesn't increase the shared memory size.
Could you try b

[dpdk-dev] MENNIC1.2 host-sim crashed for me

2014-07-15 Thread Hiroshi Shimamoto
Hi Srinivas,

> Subject: FW: MENNIC1.2 host-sim crashed for me
> 
> 
> Hi Hiroshi,
> Thanks for ur reply .. I have moved forward little bit.
> 
> MEMNIC-1.2
> 
> 1. I started qemu and then started host-sim application
> 
> Qemu command :
> qemu-system-x86_64 -enable-kvm -cpu host   -boot c -hda 
> /home/vm-images/vm1-clone.img -m 8192M -smp 3 --enable-kvm -name
> vm1 -vnc :1 -pidfile /tmp/vm1.pid -drive file=fat:rw:/tmp/share  -device 
> ivshmem,size=16,shm=ivshm
> vvfat fat:rw:/tmp/share chs 1024,16,63
> 
> 2.Host-sim app command :
> 3.[root at localhost host-sim]# ./memnic-host-sim   /dev/shm/ivshm
> 4.On the guest compiled  memnic-1.2 .
> 5.Inserted memnic.ko
> 6.Found and interface ens4  after insmod memnic.ko
> 
> [root at localhost memnic-1.2]# ifconfig -a
> ens3: flags=4163  mtu 1500
> inet6 fe80::5054:ff:fe12:3456  prefixlen 64  scopeid 0x20
> ether 52:54:00:12:34:56  txqueuelen 1000  (Ethernet)
> RX packets 0  bytes 0 (0.0 B)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 8  bytes 648 (648.0 B)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> ens4: flags=4098  mtu 1500
> ether 00:09:c0:00:13:37  txqueuelen 1000  (Ethernet)
> RX packets 0  bytes 0 (0.0 B)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 0  bytes 0 (0.0 B)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> lo: flags=73  mtu 65536
> inet 127.0.0.1  netmask 255.0.0.0
> inet6 ::1  prefixlen 128  scopeid 0x10
> loop  txqueuelen 0  (Local Loopback)
> RX packets 386  bytes 33548 (32.7 KiB)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 386  bytes 33548 (32.7 KiB)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> 7.lspci on the guest
> 
> [root at localhost memnic-1.2]# lspci
> 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
> 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
> 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
> 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
> 00:02.0 VGA compatible controller: Cirrus Logic GD 5446
> 00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet 
> Controller (rev 03)
> 00:04.0 RAM memory: Red Hat, Inc Device 1110 [root at localhost memnic-1.2]#
> 
> 8.on the Guest  ran test pmd application

You cannot use both the kernel driver and the PMD concurrently.
Before running testpmd, you should unload memnic.ko with the rmmod command.

> 
> [root at localhost test-pmd]# ./testpmd -c7 -n3  -- --d 
> /usr/local/lib/librte_pmd_memnic_copy.so  -i --nb-cores=1
> --nb-ports=1 --port-topology=chained

I don't know testpmd that well, but I guess the correct EAL parameters look like
this.

# ./testpmd -c 0x7 -n 3 -d /usr/local/lib/librte_pmd_memnic_copy.so -- ...

Please pass the extra library as an EAL parameter, i.e. before the '--' separator.

> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 0 on socket 0
> EAL: Detected lcore 2 as core 0 on socket 0
> EAL: Setting up memory...
> EAL: Ask a virtual area of 0x4000 bytes
> EAL: Virtual area found at 0x7f551400 (size = 0x4000)
> EAL: Requesting 512 pages of size 2MB from socket 0
> EAL: TSC frequency is ~3092833 KHz
> EAL: Master core 0 is ready (tid=54398880)
> EAL: Core 1 is ready (tid=135f8700)
> EAL: Core 2 is ready (tid=12df7700)
> EAL: PCI device :00:03.0 on NUMA socket -1
> EAL:   probe driver: 8086:100e rte_em_pmd
> EAL:   :00:03.0 not managed by UIO driver, skipping
> EAL: Error - exiting with code: 1
>   Cause: No probed ethernet devices - check that CONFIG_RTE_LIBRTE_IGB_PMD=y 
> and that CONFIG_RTE_LIBRTE_EM_PMD=y and that
> CONFIG_RTE_LIBRTE_IXGBE_PMD=y in your configuration file
> [root at localhost test-pmd]#
> 
> How can I bind 00:04.0 Ram controller to dpdk application (test-pmd ) .
> How DPDK test-pmd application finds the memnic device.
> 
> 9.Am I missing any steps in the guest configurations or host configuration .
> 10.Is there any better manual for testing MEMNIC-1.2 or better understanding .
> 11.Is there any better application  to test MEMNIC   for VM-VM  or VM to host 
> data transfer .

I think the current host-sim doesn't have any packet-switching capability; we would
need to implement such functionality to test MEMNIC.

Actually, I started MEMNIC development in the DPDK vSwitch project.
You can see it at https://github.com/01org/dpdk-ovs/tree/development

thanks,
Hiroshi

> 
> Thanks &  regards,
> Srinivas.
> 
> 
> 
> -Original Message-
> From: Hiroshi Shimamoto [mailto:h-shimamoto at ct.jp.nec.com]
> Sent: Tuesday, July 15, 2014 5:31 AM
> To: Srinivas Reddi; dev at dpdk.org
> Subject: RE: MENNIC1.2 host-sim crashed for me
> 
> Hi,
> 
> > Subject: [dpdk-dev] MENNIC1.2 host-sim crashed for me
> >
> > Hi,
> > I want to run MEMNIC 1.2 application .
> >
> > 1.   I compiled DPDK1.6
> >
> > 2.   I

[dpdk-dev] [PATCH] port: fix doxygen

2014-07-15 Thread Yao Zhao
fix doxygen for rte_port_out_op_flush.

Signed-off-by: Yao Zhao 
---
 lib/librte_port/rte_port.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_port/rte_port.h b/lib/librte_port/rte_port.h
index 0934b00..1d763c2 100644
--- a/lib/librte_port/rte_port.h
+++ b/lib/librte_port/rte_port.h
@@ -165,7 +165,7 @@ typedef int (*rte_port_out_op_tx_bulk)(
uint64_t pkts_mask);

 /**
- * Output port free
+ * Output port flush
  *
  * @param port
  *   Handle to output port instance
-- 
1.9.1



[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread John W. Linville
On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:
> On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > According to my performance measurement results for 64B small
> > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M
> > pps) which make sense to me as for 16 queues case more CPU cycles (16
> > queues' 87% vs. 1 queue' 80%) in kernel land needed for NAPI-enabled
> > ixgbe driver to switch between polling and interrupt modes in order
> > to service per-queue rx interrupts, so more context switch overhead
> > involved. Also, since the eth_packet_rx/eth_packet_tx routines involves
> > in two memory copies between DPDK mbuf and pbuf for each packet,
> > it can hardly achieve high performance unless packet are directly
> > DMA to mbuf which needs ixgbe driver to support.
> 
> I thought 16 queues would be spread out between as many cpus as you had 
> though,
> obviating the need for context switches, no?

I think Danny is testing the single CPU case.  Having more queues
than CPUs probably does not provide any benefit.

It would be cool to hack the DPDK memory management to work directly
out of the mmap'ed AF_PACKET buffers.  But at this point I don't
have enough knowledge of DPDK internals to know if that is at all
reasonable...

John

P.S.  Danny, have you run any performance tests on the PCAP driver?

-- 
John W. LinvilleSomeday the world will need a hero, and you
linville at tuxdriver.com   might be all we have.  Be ready.


[dpdk-dev] MENNIC1.2 host-sim crashed for me

2014-07-15 Thread Srinivas Reddi
Hi Hiroshi,

Thanks for your comments.
I found that DPDK 1.6 doesn't work with MEMNIC 1.2.
I tried DPDK 1.7, and it worked for me.


1. host-sim app on the host
[root at localhost host-sim]# ./memnic-host-sim /dev/shm/ivshm
reset
reset
reset


2. testpmd app in the guest

[root at localhost test-pmd]# ./testpmd -c3 -n2 -d 
/usr/local/lib/librte_pmd_memnic_copy.so -- -i --nb-cores=1 --nb-ports=1 
--port-topology=chained
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 3 lcore(s)
EAL:   cannot open VFIO container, error 2 (No such file or directory)
EAL: VFIO support could not be initialized
EAL: Setting up memory...
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f4ccca0 (size = 0x20)
EAL: Ask a virtual area of 0x3fc0 bytes
EAL: Virtual area found at 0x7f4c8cc0 (size = 0x3fc0)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f4c8c80 (size = 0x20)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~3092832 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable 
clock cycles !
EAL: open shared lib /usr/local/lib/librte_pmd_memnic_copy.so
EAL: Master core 0 is ready (tid=cd01f880)
EAL: Core 1 is ready (tid=8bdf8700)
EAL: PCI device :00:03.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   :00:03.0 not managed by UIO driver, skipping
EAL: PCI device :00:04.0 on NUMA socket -1
EAL:   probe driver: 1af4:1110 rte_pmd_memnic_copy
PMD: PORT MAC: 00:09:C0:00:13:37
Interactive-mode selected
Configuring Port 0 (socket 0)
PMD: memnic: configure OK
PMD: txq: 0x7f4c8d6f2980
PMD: rxq: 0x7f4c8d6f2900
Port 0: 00:09:C0:00:13:37
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> port reset 0
Bad arguments
testpmd> port stop 0
Stopping ports...
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> port start 0
Port 0: 00:09:C0:00:13:37
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> port stop 0
Stopping ports...
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> port start 0
Port 0: 00:09:C0:00:13:37
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd>


Thanks & Regards,
Srinivas.


-Original Message-
From: Hiroshi Shimamoto [mailto:h-shimam...@ct.jp.nec.com]
Sent: Tuesday, July 15, 2014 6:23 PM
To: Srinivas Reddi; dev at dpdk.org
Subject: RE: MENNIC1.2 host-sim crashed for me

Hi Srinivas,

> Subject: FW: MENNIC1.2 host-sim crashed for me
>
>
> Hi Hiroshi,
> Thanks for ur reply .. I have moved forward little bit.
>
> MEMNIC-1.2
>
> 1. I started qemu and then started host-sim application
>
> Qemu command :
> qemu-system-x86_64 -enable-kvm -cpu host   -boot c -hda 
> /home/vm-images/vm1-clone.img -m 8192M -smp 3 --enable-kvm -name
> vm1 -vnc :1 -pidfile /tmp/vm1.pid -drive file=fat:rw:/tmp/share
> -device ivshmem,size=16,shm=ivshm vvfat fat:rw:/tmp/share chs
> 1024,16,63
>
> 2.Host-sim app command :
> 3.[root at localhost host-sim]# ./memnic-host-sim   /dev/shm/ivshm
> 4.On the guest compiled  memnic-1.2 .
> 5.Inserted memnic.ko
> 6.Found and interface ens4  after insmod memnic.ko
>
> [root at localhost memnic-1.2]# ifconfig -a
> ens3: flags=4163  mtu 1500
> inet6 fe80::5054:ff:fe12:3456  prefixlen 64  scopeid 0x20
> ether 52:54:00:12:34:56  txqueuelen 1000  (Ethernet)
> RX packets 0  bytes 0 (0.0 B)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 8  bytes 648 (648.0 B)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> ens4: flags=4098  mtu 1500
> ether 00:09:c0:00:13:37  txqueuelen 1000  (Ethernet)
> RX packets 0  bytes 0 (0.0 B)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 0  bytes 0 (0.0 B)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> lo: flags=73  mtu 65536
> inet 127.0.0.1  netmask 255.0.0.0
> inet6 ::1  prefixlen 128  scopeid 0x10
> loop  txqueuelen 0  (Local Loopback)
> RX packets 386  bytes 33548 (32.7 KiB)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 386  bytes 33548 (32.7 KiB)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> 7.lspci on the guest
>
> [root at localhost memnic-1.2]# lspci
> 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma]
> (rev 02)
> 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton
> II]
> 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE
> [Natoma/Triton II]
> 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
> 00:02.0 VGA compatible controller: Cirrus Logic GD 5446
> 00:03.0 Ethernet controller: Intel Corporation 

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Zhou, Danny

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Tuesday, July 15, 2014 8:18 PM
> To: Zhou, Danny
> Cc: John W. Linville; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for
> AF_PACKET-based virtual devices
> 
> On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > According to my performance measurement results for 64B small packet, 1 
> > queue
> perf. is better than 16 queues (1.35M pps vs. 0.93M pps) which make sense to 
> me as
> for 16 queues case more CPU cycles (16 queues' 87% vs. 1 queue' 80%) in kernel
> land needed for NAPI-enabled ixgbe driver to switch between polling and 
> interrupt
> modes in order to service per-queue rx interrupts, so more context switch 
> overhead
> involved. Also, since the eth_packet_rx/eth_packet_tx routines involves in two
> memory copies between DPDK mbuf and pbuf for each packet, it can hardly 
> achieve
> high performance unless packet are directly DMA to mbuf which needs ixgbe 
> driver
> to support.
> 
> I thought 16 queues would be spread out between as many cpus as you had 
> though,
> obviating the need for context switches, no?
> Neil
> 

If you set the per-queue MSI-X interrupt affinity to different CPUs, performance
would be much better and linear scaling is expected. But in order to do an
apples-to-apples comparison against the 1-queue case on a single core, all
interrupts are by default handled by one core, say core0, so I think the extra
context switching hurts performance.
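
(For reference, spreading the per-queue vectors would be done roughly like this;
the IRQ numbers are illustrative, the real ones are listed in /proc/interrupts.)

# hypothetical IRQ numbers: pin each rx queue's MSI-X vector to its own core
echo 1 > /proc/irq/120/smp_affinity   # queue 0 -> core 0 (hex CPU mask)
echo 2 > /proc/irq/121/smp_affinity   # queue 1 -> core 1
echo 4 > /proc/irq/122/smp_affinity   # queue 2 -> core 2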

> >
> > > -Original Message-
> > > From: John W. Linville [mailto:linville at tuxdriver.com]
> > > Sent: Tuesday, July 15, 2014 2:25 AM
> > > To: dev at dpdk.org
> > > Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny
> > > Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based
> > > virtual devices
> > >
> > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET
> > > socket.  This implementation uses mmap'ed ring buffers to limit
> > > copying and user/kernel transitions.  The PACKET_FANOUT_HASH
> > > behavior of AF_PACKET is used for frame reception.  In the current
> > > implementation, Tx and Rx queues are always paired, and therefore
> > > are always equal in number -- changing this would be a Simple Matter Of
> Programming.
> > >
> > > Interfaces of this type are created with a command line option like
> > > "--vdev=eth_packet0,iface=...".  There are a number of options
> > > availabe as
> > > arguments:
> > >
> > >  - Interface is chosen by "iface" (required)
> > >  - Number of queue pairs set by "qpairs" (optional, default: 1)
> > >  - AF_PACKET MMAP block size set by "blocksz" (optional, default:
> > > 4096)
> > >  - AF_PACKET MMAP frame size set by "framesz" (optional, default:
> > > 2048)
> > >  - AF_PACKET MMAP frame count set by "framecnt" (optional, default:
> > > 512)
> > >
> > > Signed-off-by: John W. Linville 
> > > ---
> > > This PMD is intended to provide a means for using DPDK on a broad
> > > range of hardware without hardware-specific PMDs and (hopefully)
> > > with better performance than what PCAP offers in Linux.  This might
> > > be useful as a development platform for DPDK applications when
> DPDK-supported hardware is expensive or unavailable.
> > >
> > > New in v2:
> > >
> > > -- fixup some style issues found by check patch
> > > -- use if_index as part of fanout group ID
> > > -- set default number of queue pairs to 1
> > >
> > >  config/common_bsdapp   |   5 +
> > >  config/common_linuxapp |   5 +
> > >  lib/Makefile   |   1 +
> > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> > >  lib/librte_pmd_packet/Makefile |  60 +++
> > >  lib/librte_pmd_packet/rte_eth_packet.c | 826
> > > +
> > > lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
> > >  mk/rte.app.mk  |   4 +
> > >  8 files changed, 957 insertions(+)
> > >  create mode 100644 lib/librte_pmd_packet/Makefile  create mode
> > > 100644 lib/librte_pmd_packet/rte_eth_packet.c
> > >  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp index
> > > 943dce8f1ede..c317f031278e 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
> > > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > >
> > >  #
> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #
> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> > > +
> > > +#
> > >  # Do prefetch of packet data within PMD driver receive function  #
> > > CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git
> a/config/common_linuxapp
> > > b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644
> > > --- a/config/common_linuxapp
> > > +++ b/config/common_linuxapp
> > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
> > > CONFIG_RTE_LIBRTE_PMD_BOND=y
> > >
> > >  #
> > > +# Co

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Zhou, Danny

> -Original Message-
> From: John W. Linville [mailto:linville at tuxdriver.com]
> Sent: Tuesday, July 15, 2014 10:01 PM
> To: Neil Horman
> Cc: Zhou, Danny; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for
> AF_PACKET-based virtual devices
> 
> On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:
> > On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > > According to my performance measurement results for 64B small
> > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M
> > > pps) which make sense to me as for 16 queues case more CPU cycles
> > > (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for
> > > NAPI-enabled ixgbe driver to switch between polling and interrupt
> > > modes in order to service per-queue rx interrupts, so more context
> > > switch overhead involved. Also, since the
> > > eth_packet_rx/eth_packet_tx routines involves in two memory copies
> > > between DPDK mbuf and pbuf for each packet, it can hardly achieve
> > > high performance unless packet are directly DMA to mbuf which needs ixgbe
> driver to support.
> >
> > I thought 16 queues would be spread out between as many cpus as you
> > had though, obviating the need for context switches, no?
> 
> I think Danny is testing the single CPU case.  Having more queues than CPUs
> probably does not provide any benefit.
> 
> It would be cool to hack the DPDK memory management to work directly out of 
> the
> mmap'ed AF_PACKET buffers.  But at this point I don't have enough knowledge of
> DPDK internals to know if that is at all reasonable...
> 
> John
> 
> P.S.  Danny, have you run any performance tests on the PCAP driver?

No, I do not have PCAP driver performance results in hand. But I remember it is 
less than
1M pps for 64B.

> 
> --
> John W. Linville  Someday the world will need a hero, and you
> linville at tuxdriver.com might be all we have.  Be ready.


[dpdk-dev] Making space in mbuf data-structure

2014-07-15 Thread Richardson, Bruce
> -Original Message-
> From: Ananyev, Konstantin
> Sent: Tuesday, July 15, 2014 2:31 AM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: RE: Making space in mbuf data-structure
> 
> Hi Bruce,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Richardson, Bruce
> > Sent: Friday, July 04, 2014 12:39 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Making space in mbuf data-structure
> >
> > Hi all,
> >
> > At this stage it's been well recognised that we need to make more space in 
> > the
> mbuf data structure for new fields. We in Intel have
> > had some discussion on this and this email is to outline what our current
> thinking and approach on this issue is, and look for additional
> > suggestions and feedback.
> >
> > Firstly, we believe that there is no possible way that we can ever fit all 
> > the
> fields we need to fit into a 64-byte mbuf, and so we need to
> > start looking at a 128-byte mbuf instead. Examples of new fields that need 
> > to
> fit in, include - 32-64 bits for additional offload flags, 32-
> > bits more for filter information for support for the new filters in the i40e
> driver, an additional 2-4 bytes for storing info on a second
> > vlan tag, 4-bytes for storing a sequence number to enable packet reordering 
> > in
> future, as well as potentially a number of other fields
> > or splitting out fields that are superimposed over each other right now, 
> > e.g. for
> the qos scheduler. We also want to allow space for use
> > by other non-Intel NIC drivers that may be open-sourced to dpdk.org in the
> future too, where they support fields and offloads that
> > our hardware doesn't.
> >
> > If we accept the fact of a 2-cache-line mbuf, then the issue becomes how to
> rework things so that we spread our fields over the two
> > cache lines while causing the lowest slow-down possible. The general
> approach that we are looking to take is to focus the first cache
> > line on fields that are updated on RX , so that receive only deals with one
> cache line. The second cache line can be used for application
> > data and information that will only be used on the TX leg. This would allow 
> > us
> to work on the first cache line in RX as now, and have the
> > second cache line being prefetched in the background so that it is available
> when necessary. Hardware prefetches should help us out
> > here. We also may move rarely used, or slow-path RX fields e.g. such as 
> > those
> for chained mbufs with jumbo frames, to the second
> > cache line, depending upon the performance impact and bytes savings
> achieved.
> >
> > With this approach, we still need to make space in the first cache line for
> information for the new or expanded receive offloads. First
> > off the blocks is to look at moving the mempool pointer into the second 
> > cache
> line. This will free-up 8 bytes in cache line  0, with a field
> > that is only used when cleaning up after TX. A prototype patch for this 
> > change
> is given below, for your review and comment. Initial
> > quick tests with testpmd (testpmd -c 600 -n 4 -- --mbcache=250 --
> txqflags=0xf01 --burst=32 --txrst=32 --txfreet=32 --rxfreet=32 --
> > total-num-mbufs=65536), and l3fwd (l3fwd -c 400 -n 4 -- -p F -P --
> config="(0,0,10),(1,0,10),(2,0,10),(3,0,10)") showed only a slight
> > performance decrease with testpmd and equal or slightly better performance
> with l3fwd. These would both have been using the
> > vector PMD - I haven't looked to change the fallback TX implementation yet
> for this change, since it's not as performance optimized
> > and hence cycle-count sensitive.
> 
> Regarding code changes itself:
> If I understand your code changes correctly:
> ixgbe_tx_free_bufs() will use rte_mempool_put_bulk() *only* if all mbufs in 
> the
> bulk belong to the same mempool.
> If we have let say 2 groups of mbufs from 2 different mempools - it wouldn't 
> be
> able to use put_bulk.
> While, as I remember,  current implementation is able to use put_bulk() in 
> such
> case.
> I think, it is possible to change you code to do a better grouping.

Given two sets of mbufs from two separate pools, this proposed change will use
put_bulk for the first group encountered, but not the second. We can look at
enhancing it to work with different groups, but this version works well where
all or most mbufs in a TX queue come from a common mempool. In the case of every
second or every third packet coming from a different mempool (for the two- or
three-mempool case), this version should also perform better, as it doesn't rely
on having long runs of packets from the same mempool.
However, this is very much WIP, so we'll see if we can come up with a better
solution if we think one is necessary. (Most apps seem to use at most one mempool
per NUMA socket, though, so I don't think we should over-optimise for multiple
mempools.)

> 
> >
> > Beyond this change, I'm also investigating potentially moving the 

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread John W. Linville
On Tue, Jul 15, 2014 at 03:40:56PM +, Zhou, Danny wrote:
> 
> > -Original Message-
> > From: John W. Linville [mailto:linville at tuxdriver.com]
> > Sent: Tuesday, July 15, 2014 10:01 PM
> > To: Neil Horman
> > Cc: Zhou, Danny; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for
> > AF_PACKET-based virtual devices
> > 
> > On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:
> > > On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > > > According to my performance measurement results for 64B small
> > > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M
> > > > pps) which make sense to me as for 16 queues case more CPU cycles
> > > > (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for
> > > > NAPI-enabled ixgbe driver to switch between polling and interrupt
> > > > modes in order to service per-queue rx interrupts, so more context
> > > > switch overhead involved. Also, since the
> > > > eth_packet_rx/eth_packet_tx routines involves in two memory copies
> > > > between DPDK mbuf and pbuf for each packet, it can hardly achieve
> > > > high performance unless packet are directly DMA to mbuf which needs 
> > > > ixgbe
> > driver to support.
> > >
> > > I thought 16 queues would be spread out between as many cpus as you
> > > had though, obviating the need for context switches, no?
> > 
> > I think Danny is testing the single CPU case.  Having more queues than CPUs
> > probably does not provide any benefit.
> > 
> > It would be cool to hack the DPDK memory management to work directly out of 
> > the
> > mmap'ed AF_PACKET buffers.  But at this point I don't have enough knowledge 
> > of
> > DPDK internals to know if that is at all reasonable...
> > 
> > John
> > 
> > P.S.  Danny, have you run any performance tests on the PCAP driver?
> 
> No, I do not have PCAP driver performance results in hand. But I remember it 
> is less than
> 1M pps for 64B.

Cool, good info...thanks!

-- 
John W. LinvilleSomeday the world will need a hero, and you
linville at tuxdriver.com   might be all we have.  Be ready.


[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Neil Horman
On Tue, Jul 15, 2014 at 10:01:11AM -0400, John W. Linville wrote:
> On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:
> > On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > > According to my performance measurement results for 64B small
> > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M
> > > pps) which make sense to me as for 16 queues case more CPU cycles (16
> > > queues' 87% vs. 1 queue' 80%) in kernel land needed for NAPI-enabled
> > > ixgbe driver to switch between polling and interrupt modes in order
> > > to service per-queue rx interrupts, so more context switch overhead
> > > involved. Also, since the eth_packet_rx/eth_packet_tx routines involves
> > > in two memory copies between DPDK mbuf and pbuf for each packet,
> > > it can hardly achieve high performance unless packet are directly
> > > DMA to mbuf which needs ixgbe driver to support.
> > 
> > I thought 16 queues would be spread out between as many cpus as you had 
> > though,
> > obviating the need for context switches, no?
> 
> I think Danny is testing the single CPU case.  Having more queues
> than CPUs probably does not provide any benefit.
> 
Ah, yes, generally speaking, you never want nr_cpus < nr_queues.  Otherwise
you'll just be fighting yourself.

> It would be cool to hack the DPDK memory management to work directly
> out of the mmap'ed AF_PACKET buffers.  But at this point I don't
> have enough knowledge of DPDK internals to know if that is at all
> reasonable...
> 
> John
> 
> P.S.  Danny, have you run any performance tests on the PCAP driver?
> 
> -- 
> John W. Linville  Someday the world will need a hero, and you
> linville at tuxdriver.com might be all we have.  Be ready.
> 


[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Zhou, Danny

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Wednesday, July 16, 2014 4:31 AM
> To: John W. Linville
> Cc: Zhou, Danny; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for
> AF_PACKET-based virtual devices
> 
> On Tue, Jul 15, 2014 at 10:01:11AM -0400, John W. Linville wrote:
> > On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:
> > > On Tue, Jul 15, 2014 at 12:15:49AM +, Zhou, Danny wrote:
> > > > According to my performance measurement results for 64B small
> > > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs.
> > > > 0.93M
> > > > pps) which make sense to me as for 16 queues case more CPU cycles
> > > > (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for
> > > > NAPI-enabled ixgbe driver to switch between polling and interrupt
> > > > modes in order to service per-queue rx interrupts, so more context
> > > > switch overhead involved. Also, since the
> > > > eth_packet_rx/eth_packet_tx routines involves in two memory copies
> > > > between DPDK mbuf and pbuf for each packet, it can hardly achieve
> > > > high performance unless packet are directly DMA to mbuf which needs 
> > > > ixgbe
> driver to support.
> > >
> > > I thought 16 queues would be spread out between as many cpus as you
> > > had though, obviating the need for context switches, no?
> >
> > I think Danny is testing the single CPU case.  Having more queues than
> > CPUs probably does not provide any benefit.
> >
> Ah, yes, generally speaking, you never want nr_cpus < nr_queues.  Otherwise 
> you'll
> just be fighting yourself.
> 

That is true for an interrupt-based NIC driver and for this AF_PACKET-based PMD,
because it depends on the kernel NIC driver. But with a poll-mode DPDK native NIC
driver, you can have a thread pinned to a core polling multiple queues on one NIC,
or queues on different NICs, at the cost of more power consumption or wasted CPU
cycles busy-waiting for packets.
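
(A minimal sketch of that poll-mode pattern; port/queue numbers are illustrative,
and a real application would process the packets instead of freeing them.)

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* One lcore, pinned to one core, busy-polling queue 0 of two ports. */
static void
poll_two_ports(uint16_t port0, uint16_t port1)
{
    struct rte_mbuf *bufs[BURST_SIZE];
    uint16_t i, n;

    for (;;) {   /* burns cycles even when no packets arrive */
        n = rte_eth_rx_burst(port0, 0, bufs, BURST_SIZE);
        for (i = 0; i < n; i++)
            rte_pktmbuf_free(bufs[i]);   /* stand-in for real work */
        n = rte_eth_rx_burst(port1, 0, bufs, BURST_SIZE);
        for (i = 0; i < n; i++)
            rte_pktmbuf_free(bufs[i]);
    }
}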

> > It would be cool to hack the DPDK memory management to work directly
> > out of the mmap'ed AF_PACKET buffers.  But at this point I don't have
> > enough knowledge of DPDK internals to know if that is at all
> > reasonable...
> >
> > John
> >
> > P.S.  Danny, have you run any performance tests on the PCAP driver?
> >
> > --
> > John W. LinvilleSomeday the world will need a hero, and you
> > linville at tuxdriver.com   might be all we have.  Be ready.
> >


[dpdk-dev] [PATCH] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-07-15 Thread Thomas Monjalon
2014-07-14 09:46, John W. Linville:
> On Sat, Jul 12, 2014 at 12:34:46AM +0200, Thomas Monjalon wrote:
> > 2014-07-11 13:40, John W. Linville:
> > > Is there an example of code in DPDK that requires specific kernel
> > > versions?  What is the preferred method for coding such dependencies?
> > 
> > No, there is no userspace code checking the kernel version in DPDK.
> > Feel free to use what you think is the best method.
> > Please keep in mind that checking the version number is a maintenance
> > nightmare because of backports (as Red Hat does ;).
> 
> I suppose that it could be a configuration option?

If there is no other way to configure kernel-dependent features, we can add
options. But I feel that relying on a macro (#ifdef) would be better if such a
macro exists.
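
(For illustration, probing for the feature macro itself rather than the kernel
version keeps backported kernels working; a sketch assuming the AF_PACKET fanout
feature is the one being guarded, not actual DPDK code.)

#include <sys/socket.h>
#include <linux/if_packet.h>

/* Enable the fanout setup only when the kernel headers provide the
 * feature macros, regardless of the kernel version number. */
static int
configure_fanout(int sockfd, int group_id)
{
#if defined(PACKET_FANOUT) && defined(PACKET_FANOUT_HASH)
    int arg = (group_id & 0xffff) | (PACKET_FANOUT_HASH << 16);

    return setsockopt(sockfd, SOL_PACKET, PACKET_FANOUT,
                      &arg, sizeof(arg));
#else
    (void)sockfd;
    (void)group_id;
    return -1;   /* feature not available in these headers */
#endif
}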

-- 
Thomas