[dpdk-dev] Sporadic errors while initializing NICs in example applications, dpdk-1.5.0r1

2013-12-06 Thread Dmitry Vyal
On 11/29/2013 04:39 PM, Thomas Monjalon wrote:
> 29/11/2013 13:25, Thomas Monjalon :
>
> Please check whether your hardware lacks support for invariant TSC.
> That would explain why you need to fix the frequency.
>
> I attach a simple code to test CPU feature "Invariant TSC".

I compiled and ran the code on all the platforms I had trouble with.
Invariant TSC is supported everywhere.

> It seems that the file is stripped on the mailing-list.
> Code inlined:
>
> #include <stdio.h>
> #include <stdint.h>
>
>
> int main()
> {
>  uint32_t a = 0x80000000;
>  uint32_t b, d;
>
>  __asm__("cpuid;"
>  :"=a"(b)
>  :"0"(a)
>  :"ebx", "ecx", "edx");
>
>  if (b >= 0x80000007) {
>
>  a = 0x80000007;
>  __asm__("cpuid;"
>  :"=a"(b), "=d"(d)
>  :"0"(a)
>  :"ebx", "ecx");
>
>  if (d & (1<<8)) {
>  printf("Invariant TSC is supported\n");
>  } else {
>  printf("Invariant TSC is NOT supported\n");
>  }
>  } else {
>  printf("No support for Advanced Power Management Information in
> CPUID\n");
>  }
>  return 0;
> }
>



[dpdk-dev] DPDK delaying individual packets infinitely

2013-12-06 Thread Dmitry Vyal
Hello list,

For some time I've been writing a custom packet generator coupled with a 
packet receiver using DPDK.

It works great when generating millions of packets, but unexpectedly it
has trouble generating a few dozen packets or so.

The application consists of two threads: the first one sends packets from
one port and the second receives them on another. These are two ports of a
four-port 1Gb NIC, detected as Intel Corporation 82576 Gigabit Network
Connection (rev 01). The ports are connected with a patch cord.

My experiment runs as follows:
After initializing the NICs, both the generator and the receiver wait for 1 second.
The generator sends N packets by calling rte_eth_tx_burst for each
individual packet. It waits for 1 cpu ticks between bursts.
rte_eth_tx_burst reports that all packets are sent.
The receiver repeatedly calls rte_eth_rx_burst and waits for 5000 ticks
whenever the function returns zero. After the generator sends all the packets,
the receiver continues polling for several seconds.
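
A rough sketch of the two loops described above (simplified: single-mbuf
bursts, placeholder microsecond delays instead of the tick-based waits,
error handling omitted):

#include <rte_ethdev.h>
#include <rte_cycles.h>
#include <rte_mbuf.h>

/* generator: one packet per rte_eth_tx_burst() call */
static void generator_loop(uint8_t port, struct rte_mempool *pool, unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++) {
		struct rte_mbuf *m = rte_pktmbuf_alloc(pool);
		/* ... fill in headers and payload ... */
		while (rte_eth_tx_burst(port, 0, &m, 1) == 0)
			;			/* retry until the mbuf is queued */
		rte_delay_us(10);		/* placeholder inter-packet delay */
	}
}

/* receiver: poll, back off briefly when nothing arrives */
static void receiver_loop(uint8_t port)
{
	struct rte_mbuf *pkts[32];

	for (;;) {
		uint16_t nb = rte_eth_rx_burst(port, 0, pkts, 32);
		if (nb == 0) {
			rte_delay_us(5);	/* placeholder back-off */
			continue;
		}
		/* ... count received packets ... */
		while (nb > 0)
			rte_pktmbuf_free(pkts[--nb]);
	}
}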

I'm observing the following behavior:

If N is small, say 20, then no packets are received. See the logs below:
the receiver prints Z when it gets zero packets and R when it gets at least
one packet.

Starting experiment
receiver 0 started main loop
generator 0 sitting on socket 0 is waiting for experiment start
generator 0 started main loop
Zgenerating mbuf for file 0, port 0, addr 0 on socket 0
free_count on pool 0 = 1
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 1 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 2 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 3 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 4 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 5 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 6 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 7 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 8 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 9 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 10 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 11 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 12 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 13 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 14 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 15 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 16 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 17 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 18 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 19 on socket 0
sent 1 in queue 0 freed 0
Stopping after reaching 20 packets limit
ZZwaiting for receivers to 
stopZZZ*
 
Statistics:
Seconds elapsed: 6.123671


If I make N bigger, say 25, then all of a sudden some packets are
received, like so:


Starting experiment
receiver 0 started main loop
generator 0 sitting on socket 0 is waiting for experiment start
generator 0 started main loop
generating mbuf for file 0, port 0, addr 0 on socket 0
Zfree_count on pool 0 = 1
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 1 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 2 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 3 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 4 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 5 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 6 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 7 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 8 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 9 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 10 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 11 on socket 0
sent 1 in queue 0 freed 0
ZZgenerating mbuf for file 0, port 0, addr 12 on socket 0
sent 1 in queue 0 freed 0
ZZgene

[dpdk-dev] Any ideas how to stop DPDK from banning me from the box.

2013-07-18 Thread Dmitry Vyal
Greetings!

I've been playing with DPDK on my desktop for some time and decided to
finally test it on a server. It has plenty of NICs:

dev at box:~$ lspci |grep Ether
01:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
03:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
05:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
0c:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
0c:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
10:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
10:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
12:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network 
Connection
82:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
88:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
88:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
8a:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
8a:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)

I connect to the server over ssh through one of the gigabit ports. The
problem is that no matter which cards I blacklist with the -b option when
starting my DPDK application, it takes away all the interfaces from the
kernel and only later returns the blacklisted ones. This disrupts my ssh
session and I can no longer connect to the server.

Any ideas how to overcome this?

Best regards,
Dmitry Vyal.



[dpdk-dev] Any ideas how to stop DPDK from banning me from the box.

2013-07-19 Thread Dmitry Vyal
Hi Marco,

thanks for posting the example. I carefully composed mine using yours as a
basis and, guess what, it all works. I don't have the exact cmdline I issued
yesterday; looks like bash didn't get a chance to save the history. But I
guess I put all the -b options after the -- delimiter, so they weren't
parsed by rte_eal_init(). You saved me a day!
>
> pino:~/dpdk-1.2.3r4 ~> sudo build/app/testpmd -c 0xf -n 3 -r 1 -b 
> 0000:05:00.0 -b 0000:05:00.1
> EAL: coremask set to f
> ...
> Initializing port 0... done:  Link Down
> Initializing port 1... done:  Link Down
> Initializing port 2... done:  Link Up - speed 1 Mbps - full-duplex
> Initializing port 3... done:  Link Up - speed 1 Mbps - full-duplex
> No commandline core given, start packet forwarding
>   io packet forwarding - CRC stripping disabled - packets/burst=16
>   nb forwarding cores=1 - nb forwarding ports=4
>   RX queues=1 - RX desc=128 - RX free threshold=0
>   RX threshold registers: pthresh=8 hthresh=8 wthresh=4
>   TX queues=1 - TX desc=512 - TX free threshold=0
>   TX threshold registers: pthresh=36 hthresh=0 wthresh=0
>   TX RS bit threshold=0
> Press enter to exit
>
> [ssh is still up]
>
>
> Regards,
> Marco


[dpdk-dev] Thread preemption and rte_ring

2013-11-05 Thread Dmitry Vyal
Hello,

Documentation for rte_ring says the ring implementation is not
preemptable: an lcore must not be interrupted by another task that uses the
same ring. What does that mean precisely? Must all the producers and
consumers be non-preemptive? Can that restriction be relaxed somehow? Say,
can I have multiple non-preemptive writers running on dedicated cores
and a single reader running as a regular Linux thread (see the sketch below)?
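
To make the setup concrete, this is the kind of ring I have in mind
(multi-producer enqueue, single-consumer dequeue); the flag names are the
standard rte_ring ones, the ring name and size are just examples:

#include <rte_ring.h>

static struct rte_ring *
make_mp_sc_ring(void)
{
	/* default enqueue mode is multi-producer; RING_F_SC_DEQ makes
	 * the dequeue side single-consumer */
	return rte_ring_create("mp_sc_ring", 1024, SOCKET_ID_ANY,
			       RING_F_SC_DEQ);
}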

Thanks,
Dmitry


[dpdk-dev] Sporadic errors while initializing NICs in example applications, dpdk-1.5.0r1

2013-11-22 Thread Dmitry Vyal
Hi, I'm experiencing weird problems running the DPDK examples on my 
server running ubuntu-12.04. The application either manages to use the 
Ethernet ports or it doesn't. For example, here are the results of two 
identical sequential runs of l2fwd:

**
dev at tiny-one:~/dpdk-1.5.0r1/examples/l2fwd$ s -E ./build/l2fwd -c 0x3 -n 2
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 1 on socket 0
EAL: Detected lcore 6 as core 2 on socket 0
EAL: Detected lcore 7 as core 3 on socket 0
EAL: Skip lcore 8 (not detected)

EAL: Setting up memory...
EAL: Ask a virtual area of 0x4194304 bytes
EAL: Virtual area found at 0x7f6b8240 (size = 0x40)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b8200 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b81c0 (size = 0x20)
EAL: Ask a virtual area of 0x1056964608 bytes
EAL: Virtual area found at 0x7f6b42a0 (size = 0x3f00)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b4260 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b4220 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b41e0 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f6b41a0 (size = 0x20)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~160 KHz
EAL: Master core 0 is ready (tid=836da800)
EAL: Core 1 is ready (tid=40ff8700)
EAL: PCI device :02:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f6b83687000
EAL:   PCI memory mapped at 0x7f6b83683000
EAL: PCI device :02:00.1 on NUMA socket -1
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f6b83663000
EAL:   PCI memory mapped at 0x7f6b8365f000
EAL: PCI device :03:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :03:00.0 not managed by UIO driver, skipping
EAL: PCI device :04:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :04:00.0 not managed by UIO driver, skipping
EAL: PCI device :05:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :05:00.0 not managed by UIO driver, skipping
EAL: PCI device :06:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :06:00.0 not managed by UIO driver, skipping
EAL: PCI device :07:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :07:00.0 not managed by UIO driver, skipping
EAL: PCI device :08:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :08:00.0 not managed by UIO driver, skipping
EAL: PCI device :09:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10d3 rte_em_pmd
EAL:   :09:00.0 not managed by UIO driver, skipping
EAL: Error - exiting with code: 1
   Cause: No Ethernet ports - bye



dev at econat-tiny-one:~/dpdk-1.5.0r1/examples/l2fwd$ s -E ./build/l2fwd -c 
0x3 -n 2
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 1 on socket 0
EAL: Detected lcore 6 as core 2 on socket 0
EAL: Detected lcore 7 as core 3 on socket 0
EAL: Skip lcore 8 (not detected)

EAL: Setting up memory...
EAL: Ask a virtual area of 0x4194304 bytes
EAL: Virtual area found at 0x7f538a60 (size = 0x40)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f538a20 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f5389e0 (size = 0x20)
EAL: Ask a virtual area of 0x1056964608 bytes
EAL: Virtual area found at 0x7f534ac0 (size = 0x3f00)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f534a80 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f534a40 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f534a00 (size = 0x20)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7f5349c0 (size = 0x20)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~3301000 KHz
EAL: Master core 0 is ready (tid=8b98d800)
EAL: Core 1 is ready (tid=491f8700)
EAL: PCI device :02:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f538b93a000
EAL:   PCI memory mapped at 0x7f538b936000
EAL:

[dpdk-dev] Sporadic errors while initializing NICs in example applications, dpdk-1.5.0r1

2013-11-27 Thread Dmitry Vyal
Looks like I finally found the reason. After applying this patch I can 
no longer reproduce the error.

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index db07789..5f825fa 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -2347,7 +2347,7 @@ s32 ixgbe_reset_pipeline_82599(struct ixgbe_hw *hw)
 /* Write AUTOC register with toggled LMS[2] bit and Restart_AN */
 IXGBE_WRITE_REG(hw, IXGBE_AUTOC, autoc_reg ^ 
IXGBE_AUTOC_LMS_1G_AN);
 /* Wait for AN to leave state 0 */
-   for (i = 0; i < 10; i++) {
+   for (i = 0; i < 100; i++) {
 msec_delay(4);
 anlp1_reg = IXGBE_READ_REG(hw, IXGBE_ANLP1);
 if (anlp1_reg & IXGBE_ANLP1_AN_STATE_MASK)

On 11/22/2013 04:48 PM, Thomas Monjalon wrote:
> Hello,
>
> 22/11/2013 13:29, Dmitry Vyal :
>> EAL: PCI device :02:00.0 on NUMA socket -1
>> EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
>> EAL:   PCI memory mapped at 0x7f6b83687000
>> EAL:   PCI memory mapped at 0x7f6b83683000
>> EAL: PCI device :02:00.1 on NUMA socket -1
>> EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
>> EAL:   PCI memory mapped at 0x7f6b83663000
>> EAL:   PCI memory mapped at 0x7f6b8365f000
> [...]
>> EAL: Error - exiting with code: 1
>> Cause: No Ethernet ports - bye
>>
>> Any ideas how to investigate this?
> Could you try this patch in order to see the root cause of your issue ?
>
> --- a/lib/librte_pmd_ixgbe/ixgbe_logs.h
> +++ b/lib/librte_pmd_ixgbe/ixgbe_logs.h
> @@ -34,41 +34,44 @@
>   #ifndef _IXGBE_LOGS_H_
>   #define _IXGBE_LOGS_H_
>   
> +#define PMD_LOG(level, fmt, args...) \
> + RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ##args)
> +
>   #ifdef RTE_LIBRTE_IXGBE_DEBUG_INIT
> -#define PMD_INIT_LOG(level, fmt, args...) \
> - RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
> +#define PMD_INIT_LOG(level, fmt, args...) PMD_LOG(level, fmt, ##args)
>   #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
>   #else
> -#define PMD_INIT_LOG(level, fmt, args...) do { } while(0)
> +#define PMD_INIT_LOG(level, fmt, args...) \
> + (void)(RTE_LOG_##level <= RTE_LOG_ERR ? PMD_LOG(level, fmt, ##args) : 0)
>   #define PMD_INIT_FUNC_TRACE() do { } while(0)
>   #endif
>   
>   #ifdef RTE_LIBRTE_IXGBE_DEBUG_RX
> -#define PMD_RX_LOG(level, fmt, args...) \
> - RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
> +#define PMD_RX_LOG(level, fmt, args...) PMD_LOG(level, fmt, ##args)
>   #else
> -#define PMD_RX_LOG(level, fmt, args...) do { } while(0)
> +#define PMD_RX_LOG(level, fmt, args...) \
> + (void)(RTE_LOG_##level <= RTE_LOG_ERR ? PMD_LOG(level, fmt, ##args) : 0)
>   #endif
>   
>   #ifdef RTE_LIBRTE_IXGBE_DEBUG_TX
> -#define PMD_TX_LOG(level, fmt, args...) \
> - RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
> +#define PMD_TX_LOG(level, fmt, args...) PMD_LOG(level, fmt, ##args)
>   #else
> -#define PMD_TX_LOG(level, fmt, args...) do { } while(0)
> +#define PMD_TX_LOG(level, fmt, args...) \
> + (void)(RTE_LOG_##level <= RTE_LOG_ERR ? PMD_LOG(level, fmt, ##args) : 0)
>   #endif
>   
>   #ifdef RTE_LIBRTE_IXGBE_DEBUG_TX_FREE
> -#define PMD_TX_FREE_LOG(level, fmt, args...) \
> - RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
> +#define PMD_TX_FREE_LOG(level, fmt, args...) PMD_LOG(level, fmt, ##args)
>   #else
> -#define PMD_TX_FREE_LOG(level, fmt, args...) do { } while(0)
> +#define PMD_TX_FREE_LOG(level, fmt, args...) \
> + (void)(RTE_LOG_##level <= RTE_LOG_ERR ? PMD_LOG(level, fmt, ##args) : 0)
>   #endif
>   
>   #ifdef RTE_LIBRTE_IXGBE_DEBUG_DRIVER
> -#define PMD_DRV_LOG(level, fmt, args...) \
> - RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
> +#define PMD_DRV_LOG(level, fmt, args...) PMD_LOG(level, fmt, ##args)
>   #else
> -#define PMD_DRV_LOG(level, fmt, args...) do { } while(0)
> +#define PMD_DRV_LOG(level, fmt, args...) \
> + (void)(RTE_LOG_##level <= RTE_LOG_ERR ? PMD_LOG(level, fmt, ##args) : 0)
>   #endif
>   
>   #endif /* _IXGBE_LOGS_H_ */
>



[dpdk-dev] Sporadic errors while initializing NICs in example applications, dpdk-1.5.0r1

2013-11-29 Thread Dmitry Vyal
Hmm, that's strange. I don't know how to interpret my observations then. 
I have access to two platforms, one based on an Intel(R) Xeon(R) CPU 
E3-1230 V2 @ 3.30GHz and another on an Intel(R) Xeon(R) CPU E3-1270 v3 @ 
3.50GHz, both running ubuntu-12.04 server. I see recurring errors during 
the NIC initialisation phase. The error frequency drops greatly if I patch 
the loop limit as I described earlier, or if I call rte_power_init and 
rte_power_freq_max as Thomas suggested (a sketch below).

But the only way to get rid of them completely is to set the performance 
governor.
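
For reference, a minimal sketch of the rte_power calls mentioned above
(assuming librte_power can take control of the frequency on every lcore):

#include <rte_power.h>
#include <rte_lcore.h>

static void set_max_freq_all_lcores(void)
{
	unsigned lcore_id;

	RTE_LCORE_FOREACH(lcore_id) {
		/* takes over the cpufreq governor for this lcore */
		if (rte_power_init(lcore_id) == 0)
			rte_power_freq_max(lcore_id);
	}
}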

On 11/28/2013 03:01 PM, Richardson, Bruce wrote:
>> It's probably due to a frequency scaling.
>> The timer based is initialized when DPDK initialize and the CPU can change
>> its frequency, breaking next timers.
>>
>> The fix is to control the CPU frequency.
>> Please try this, without your patch:
>>  for g in /sys/devices/system/cpu/*/cpufreq/scaling_governor; do
>> echo performance >$g; done The right fix for applications (examples and
>> testpmd included) could be to call rte_power_init(). Patches are welcomed.
>>
> [BR] Frequency changes should not affect timers for modern Intel CPUs. Please 
> see the " Intel(r) 64 and IA-32 Architectures Software Developer's Manual" 
> Volume 3 
> (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf)
>  , Section 17.13 for more details on this.
>



[dpdk-dev] dropped packet count

2013-10-09 Thread Dmitry Vyal
Hi John,

take a look at void rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);
http://dpdk.org/doc/api/rte__ethdev_8h.html#aac7b274a66c959f827a0750eaf22a5cb

The structure it fills has a member q_errors which seems to be what 
you're looking for.
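
A minimal sketch of reading it (the port id and queue count are
assumptions; in this DPDK version the call just fills the struct):

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

static void print_queue_errors(uint8_t port_id, unsigned nb_rxq)
{
	struct rte_eth_stats stats;
	unsigned q;

	rte_eth_stats_get(port_id, &stats);
	printf("port %u: ierrors=%" PRIu64 "\n", (unsigned)port_id, stats.ierrors);
	for (q = 0; q < nb_rxq && q < RTE_ETHDEV_QUEUE_STAT_CNTRS; q++)
		printf("  rxq %u: errors=%" PRIu64 "\n", q, stats.q_errors[q]);
}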

Regards,
Dmitry

On 10/09/2013 12:38 AM, John Lange wrote:
> Greetings,
>
> Is there a standard way of retrieving the dropped packet count for a 
> particular port/queue in the DPDK?   If no way is currently defined, could 
> this please be added in a future version?
>
> Thanks!



[dpdk-dev] Recommended method of getting timestamps?

2013-09-06 Thread Dmitry Vyal
Hello Patrick,

I guess gettimeofday is too heavy if all you need is an abstract 
timestamp not tied to any particular calendar. I think you should look at 
rte_rdtsc(). It returns the current value of the CPU tick counter, so it's 
very cheap (just a few cycles) and has great resolution (a fraction of a 
nanosecond).
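
A minimal sketch of the staleness check (MAX_AGE_SEC and the refresh hook
are illustrative, not something from your driver):

#include <stdint.h>
#include <rte_cycles.h>

#define MAX_AGE_SEC 5

static uint64_t last_refresh_tsc;

static void maybe_refresh(void)
{
	uint64_t now = rte_rdtsc();

	if (now - last_refresh_tsc > MAX_AGE_SEC * rte_get_tsc_hz()) {
		/* refresh_global_data(); */
		last_refresh_tsc = now;
	}
}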

Regards,
Dmitry

> I have a need to keep a timestamp on a piece of global data.  When the 
> timestamp grows too old I want to refresh that data.  Is it safe to use 
> gettimeofday()?
>
> I thought about using an alarm, but I need to set an alarm from inside the 
> alarm callback which doesn't look like it will work due to the spinlock on 
> the alarm list.
>
> And since this is inside the driver I am working on, setting up a timer is 
> not simple.
>
> So, I figure to timestamp the data, wait until I need to access it, check the 
> timestamp and refresh if it is too old.
>
> Thoughts?  Suggestions?
>
> Thanks,
>
> Patrick
>
> Coming to you from deep inside Fortress Mahan



[dpdk-dev] How to free a ring?

2013-09-11 Thread Dmitry Vyal
Hello all.

Is there a way to deallocate an rte_ring? Maybe I'm missing something, 
but I can't find an API function for that at
http://dpdk.org/doc/api/rte__ring_8h.html

Regards,
Dmitry


[dpdk-dev] How to free a ring?

2013-09-11 Thread Dmitry Vyal
On 09/11/2013 11:46 AM, AndyChen wrote:
> DPDK has no API to free a ring/memzone/mempool; in what scenario do you 
> need to free the ring?
>
Well, I'm writing packet defragmenting code. I have a data structure 
consisting of an rte_hash, an rte_ring and an array allocated with 
rte_zmalloc. All of these except the ring can be freed, so it looks like I 
can't destroy my object after allocating it. Strictly speaking, I don't 
have to; the need shouldn't arise in a normal situation. I just considered 
writing allocation/deallocation function pairs and handling all 
allocation errors to be good practice.

Isn't it a bit strange to have rte_hash_free() and not have 
rte_ring_free()? What's the reason for this?

Best wishes,
Dmitry


[dpdk-dev] Looks like rte_mempool_free_count() and rte_mempool_count() are swapped

2013-09-12 Thread Dmitry Vyal
Greetings.

I suspected I had run into an mbuf depletion issue and decided to check 
using rte_mempool_free_count(). To my surprise, it returned a value 
equal to the mempool size. I then tried calling rte_mempool_count() and it 
returned zero.

I inspected the code in dpdk-1.3.1-7 and dpdk.1.4.1-4:

unsigned
rte_mempool_count(const struct rte_mempool *mp)
{
 unsigned count;

 count = rte_ring_count(mp->ring);

#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
 {
 unsigned lcore_id;
 if (mp->cache_size == 0)
 return count;

 for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
 count += mp->local_cache[lcore_id].len;
 }
#endif

 /*
  * due to race condition (access to len is not locked), the
  * total can be greater than size... so fix the result
  */
 if (count > mp->size)
 return mp->size;
 return count;
}

If I understand it correctly, the ring contains free buffers and 
rte_ring_count() returns the number of entries inside the ring. So this 
function actually calculates the number of free entries, not the number in use.

Moreover, rte_mempool_count() is used in many places. For example it's 
called in rte_mempool_free_count() and rte_mempool_full().
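
For reference, if I read rte_mempool.h correctly, those helpers are defined
roughly like this, which is why the swap would propagate to them as well:

static inline unsigned
rte_mempool_free_count(const struct rte_mempool *mp)
{
	return mp->size - rte_mempool_count(mp);
}

static inline int
rte_mempool_full(const struct rte_mempool *mp)
{
	return !!(rte_mempool_count(mp) == mp->size);
}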

Can anyone confirm or refute my findings?

Regards,
Dmitry


[dpdk-dev] How to fight forwarding performance regression on large mempool sizes.

2013-09-19 Thread Dmitry Vyal
Good day everyone,

While working on IP packet defragmenting I had to enlarge the mempool size. 
I did this to provide a large enough time window for assembling a fragment 
sequence. Unfortunately, I got a performance regression: if I enlarge the 
mempool size from 2**12 to 2**20 mbufs, forwarding performance for 
non-fragmented packets drops from ~8.5mpps to ~5.5mpps on a single core. I 
made only a single measurement per size, so the data are noisy, but the trend is evident:
SIZE 4096 - 8.47mpps
SIZE 8192 - 8.26mpps
SIZE 16384 - 8.29mpps
SIZE 32768 - 8.31mpps
SIZE 65536 - 8.12mpps
SIZE 131072 - 7.93mpps
SIZE 262144 - 6.22mpps
SIZE 524288 - 5.72mpps
SIZE 1048576 - 5.63mpps

And I need even larger sizes.

I want to ask for advice on how best to tackle this. One way I'm 
thinking about is to make two mempools: a large one for fragments (we may 
accumulate a big number of them) and a small one for full packets, which we 
just forward burst by burst (see the sketch below). Is it possible to 
configure RSS to distribute packets between queues according to this 
scheme? Perhaps there are better ways?
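
A rough sketch of the two-pool idea, with each RX queue drawing from its
own mempool (sizes, names and descriptor counts are only examples; passing
NULL for rx_conf assumes a DPDK version that fills in default thresholds):

#include <rte_mempool.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)

static struct rte_mempool *small_pool, *big_pool;

static void setup_two_pools(uint8_t port, unsigned socket)
{
	small_pool = rte_mempool_create("small_pool", (1 << 13) - 1, MBUF_SIZE,
			256, sizeof(struct rte_pktmbuf_pool_private),
			rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
			socket, 0);
	big_pool = rte_mempool_create("big_pool", (1 << 20) - 1, MBUF_SIZE,
			256, sizeof(struct rte_pktmbuf_pool_private),
			rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
			socket, 0);

	/* queue 0: regular traffic from the small pool,
	 * queue 1: fragments steered here draw from the big pool */
	rte_eth_rx_queue_setup(port, 0, 128, socket, NULL, small_pool);
	rte_eth_rx_queue_setup(port, 1, 128, socket, NULL, big_pool);
}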

Thanks,
Dmitry


[dpdk-dev] How to fight forwarding performance regression on large mempool sizes.

2013-09-20 Thread Dmitry Vyal
On 09/19/2013 11:39 PM, Robert Sanford wrote:
> Hi Dmitry,
>
> The biggest drop-off seems to be from size 128K to 256K. Are you using 
> 1GB huge pages already (rather than 2MB)?
>
> I would think that it would not use over 1GB until you ask for 512K 
> mbufs or more.
>

Hi Robert,

Yes, I've been using 1GB pages for a while. My L3 cache is 20MB and my 
mbufs are 2240 bytes in size. So something strange indeed happens when 
we move from ~200MB to ~400MB. Any ideas?

Regards,
Dmitry



[dpdk-dev] How to fight forwarding performance regression on large mempool sizes.

2013-09-20 Thread Dmitry Vyal
On 09/19/2013 11:43 PM, Venkatesan, Venky wrote:
> Dmitry,
> One other question - what version of DPDK are you doing on?
> -Venky
>
It's DPDK-1.3.1-7 downloaded from intel.com. Should I try upgrading?


[dpdk-dev] How to fight forwarding performance regression on large mempool sizes.

2013-09-22 Thread Dmitry Vyal
On 09/20/2013 07:34 PM, Robert Sanford wrote:
> One more point, if you're not doing this already: Allocate 2^N-1 
> mbufs, not 2^N. According to the code and comments: "The optimum size 
> (in terms of memory usage) for a mempool is when n is a power of two 
> minus one: n = (2^q - 1)."
>
Many thanks! Didn't know about it.


[dpdk-dev] SFP/SFP+ modules hotplugging question.

2014-08-11 Thread Dmitry Vyal
Dear mailing list, I have a question concerning SFP module hotplugging.

I made some experiments and want to confirm my findings.

Looks like hotplug is basically supported out of the box; the only thing 
one has to do is register a callback for RTE_ETH_EVENT_INTR_LSC and avoid 
sending mbufs to the NIC while the link is down (a sketch follows).
Having implemented that, I can now remove an SFP module in the middle of a 
run, insert another one, and NIC operation resumes.
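
A rough sketch of that handling (the link_up flag is just illustrative;
it assumes the port was configured with intr_conf.lsc = 1):

#include <rte_ethdev.h>

static volatile int link_up[RTE_MAX_ETHPORTS];

static void lsc_cb(uint8_t port_id, enum rte_eth_event_type event, void *arg)
{
	struct rte_eth_link link;

	(void)event;
	(void)arg;
	rte_eth_link_get_nowait(port_id, &link);
	link_up[port_id] = link.link_status;	/* TX path skips ports that are down */
}

/* during port init:
 * rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC, lsc_cb, NULL);
 */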

So far so good. But when I inserted an SFP module into an SFP+ slot during 
testing, I got unsatisfactory behavior: the link went down, but to my 
surprise it didn't come back up after I replaced the SFP module with one of 
the SFP+ type. I had to restart the application to bring the port back into 
a working state. So my question is whether it's possible to detect and 
gracefully recover from such situations without disrupting the operation of 
other ports.

P.S. Forgot to mention, I observed this behavior with DPDK-1.5.1 and an 
82599EB NIC.

Best regards,
Dmitry.


[dpdk-dev] Rx-errors with testpmd (only 75% line rate)

2014-01-22 Thread Dmitry Vyal
Hello Michael,

I suggest you check the average burst sizes on your receive queues. It 
looks like I've stumbled upon a similar issue several times. If you call 
rte_eth_rx_burst too frequently, the NIC begins losing packets no matter 
how much CPU horsepower you have (the more you have, the more it loses, 
actually). In my case this situation occurred when the average burst size 
was less than 20 packets or so (a quick way to check is sketched below). 
I'm not sure what the reason for this behavior is, but I observed it in 
several applications on Intel 82599 10Gb cards.
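
Something like this is enough to estimate the average burst size (the
counters are purely illustrative):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define MAX_BURST 32

static uint64_t pkt_total, burst_total;

static uint16_t rx_and_count(uint8_t port, uint16_t queue, struct rte_mbuf **pkts)
{
	uint16_t nb = rte_eth_rx_burst(port, queue, pkts, MAX_BURST);

	if (nb > 0) {
		pkt_total += nb;
		burst_total++;	/* average burst size = pkt_total / burst_total */
	}
	return nb;
}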

Regards, Dmitry


On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running the netmap
> pkt-gen program - generating 64 byte packets at close to line rate.
>
> I am only able to receive approx. 75% of line rate and I see the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but strangely
> enough, it doesn't matter how many queues more than 2 that I use, the
> throughput is the same.  I have verified with 'mpstat -P ALL' that all
> specified cores are used.  The utilization of each core is only roughly 25%.
>
> Here is my command line:
> testpmd -c 0x -n 4 -- --nb-ports=1 --coremask=0xfffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512 --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem?  It seems very similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael



[dpdk-dev] Rx-errors with testpmd (only 75% line rate)

2014-01-28 Thread Dmitry Vyal
On 01/28/2014 12:00 AM, Michael Quicquaro wrote:
> Dmitry,
> I cannot thank you enough for this information.  This too was my main 
> problem.  I put a "small" unmeasured delay before the call to 
> rte_eth_rx_burst() and suddenly it starts returning bursts of 512 
> packets vs. 4!!
> Best Regards,
> Mike
>

Thanks for confirming my guesses! By the way, make sure the number of 
packets you receive in a single burst is less than the configured queue 
size, or you will lose packets too. Maybe your "small" delay is not so 
small :) For my own purposes I use a delay of about 150 usecs.

P.S. I wonder why this issue is not mentioned in the documentation. Is it 
evident to everyone doing network programming?


>
> On Wed, Jan 22, 2014 at 9:52 AM, Dmitry Vyal <dmitryvyal at gmail.com> wrote:
>
> Hello MIchael,
>
> I suggest you to check average burst sizes on receive queues.
> Looks like I stumbled upon a similar issue several times. If you
> are calling rte_eth_rx_burst too frequently, NIC begins losing
> packets no matter how many CPU horse power you have (more you
> have, more it loses, actually). In my case this situation occured
> when average burst size is less than 20 packets or so. I'm not
> sure what's the reason for this behavior, but I observed it on
> several applications on Intel 82599 10Gb cards.
>
> Regards, Dmitry
>
>
>
> On 01/09/2014 11:28 PM, Michael Quicquaro wrote:
>
> Hello,
> My hardware is a Dell PowerEdge R820:
> 4x Intel Xeon E5-4620 2.20GHz 8 core
> 16GB RDIMM 1333 MHz Dual Rank, x4 - Quantity 16
> Intel X520 DP 10Gb DA/SFP+
>
> So in summary 32 cores @ 2.20GHz and 256GB RAM
>
> ... plenty of horsepower.
>
> I've reserved 16 1GB Hugepages
>
> I am configuring only one interface and using testpmd in
> rx_only mode to
> first see if I can receive at line rate.
>
> I am generating traffic on a different system which is running
> the netmap
> pkt-gen program - generating 64 byte packets at close to line
> rate.
>
> I am only able to receive approx. 75% of line rate and I see
> the Rx-errors
> in the port stats going up proportionally.
> I have verified that all receive queues are being used, but
> strangely
> enough, it doesn't matter how many queues more than 2 that I
> use, the
> throughput is the same.  I have verified with 'mpstat -P ALL'
> that all
> specified cores are used.  The utilization of each core is
> only roughly 25%.
>
> Here is my command line:
> testpmd -c 0x -n 4 -- --nb-ports=1 --coremask=0xfffe
> --nb-cores=8 --rxd=2048 --txd=2048 --mbcache=512 --burst=512
> --rxq=8
> --txq=8 --interactive
>
> What can I do to trace down this problem?  It seems very
> similar to a
> thread on this list back in May titled "Best example for showing
> throughput?" where no resolution was ever mentioned in the thread.
>
> Thanks for any help.
> - Michael
>
>
>



[dpdk-dev] rte_malloc_socket UNABLE to allocate 8GB.

2014-09-24 Thread Dmitry Vyal
Dear list,

I'm experiencing problems allocating big chunks of memory with 
rte_malloc_socket. Basically, it successfully allocates 6GB but returns 
NULL when I try to allocate 8GB. I tried dpdk-1.5.1 and 1.7.1 and got 
similar behavior. The first machine I tried this on had 29*1GB 
hugepages on a single NUMA node, preallocated via GRUB. Another one had 
14000*2MB hugepages split between two sockets.

Are there any restrictions on the maximum size of a block allocated 
with rte_malloc_socket? A minimal reproducer is sketched below.
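
A minimal sketch of the failing call (socket 0 and no particular alignment
assumed):

#include <stdio.h>
#include <rte_malloc.h>

static void try_big_alloc(void)
{
	void *buf = rte_malloc_socket("big_buf", 8ULL << 30, 0, 0);

	if (buf == NULL)
		printf("rte_malloc_socket() returned NULL for 8GB\n");
	else
		rte_free(buf);
}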

$ mount |grep hug
nodev on /mnt/huge type hugetlbfs (rw,pagesize=1Gb)

$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
29

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 12.04.2 LTS
Release:12.04
Codename:   precise

Best regards,
Dmitry