[dpdk-dev] 2nd parameter of driver init function can be NULL using latest code

2015-02-19 Thread Tetsuya Mukawa
Hi,

It seems that after applying the patch below, the 2nd parameter of the PMD
initialization function can be NULL when the vdev option looks like the
example below.


commit c07691ae10894bb6bf284fed75829b95844eacdb

devargs: remove limit on parameters length


Here is an example vdev option:

--vdev 'eth_name0'
(the case with no option after the driver name)


It seems some PMDs assume the 2nd parameter will never be NULL, even
if there is no option after the driver name.
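
One defensive pattern, sketched here under assumptions and not taken from
any existing PMD, is to map a NULL argument string to an empty one at the
top of the init function. The callback shape follows rte_dev_init_t
(const char *name, const char *args); normalize_args() and
demo_pmd_devinit() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a NULL argument string to "" so that later
 * parsing code never dereferences a NULL pointer. */
static const char *
normalize_args(const char *args)
{
	return (args == NULL) ? "" : args;
}

/* Hypothetical init callback with the same shape as rte_dev_init_t.
 * Returns the length of the argument string, or -1 if name is NULL. */
static int
demo_pmd_devinit(const char *name, const char *args)
{
	if (name == NULL)
		return -1;

	args = normalize_args(args);
	return (int)strlen(args);
}
```

With this in place, "--vdev 'eth_name0'" and "--vdev 'eth_name0,'" would
take the same code path inside the driver.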

For example, here is a log of the bond PMD before applying the patch.

$ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 1 --vdev 'eth_bond0' -- -i
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 6 on socket 0
EAL: Detected lcore 7 as core 7 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 8 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up memory...
EAL: Ask a virtual area of 0x28000 bytes
EAL: Virtual area found at 0x7ffd4000 (size = 0x28000)
EAL: Requesting 10 pages of size 1024MB from socket 0
EAL: TSC frequency is ~3991438 KHz
EAL: Master core 0 is ready (tid=f7fd6840)
EAL: Initializing pmd_bond for eth_bond0
EAL: Mode must be specified only once for bonded device eth_bond0
PMD: ENICPMD trace: rte_enic_pmd_init
EAL: Core 3 is ready (tid=f58e0700)
EAL: Core 2 is ready (tid=f60e1700)
EAL: Core 1 is ready (tid=f68e2700)
EAL: PCI device :02:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10b9 rte_em_pmd
EAL:   :02:00.0 not managed by UIO driver, skipping
EAL: Error - exiting with code: 1
  Cause: No probed ethernet device


After applying the patch:

$ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 1 --vdev 'eth_bond0' -- -i
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 6 on socket 0
EAL: Detected lcore 7 as core 7 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 8 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up memory...
EAL: Ask a virtual area of 0x28000 bytes
EAL: Virtual area found at 0x7ffd4000 (size = 0x28000)
EAL: Requesting 10 pages of size 1024MB from socket 0
EAL: TSC frequency is ~3991439 KHz
EAL: Master core 0 is ready (tid=f7fd6840)
EAL: Initializing pmd_bond for eth_bond0
$

It seems an error is returned in the PMD code.
I am not sure whether this is an issue, but I am reporting it just in
case, because the behavior has changed.

Thanks,
Tetsuya



[dpdk-dev] [PATCH] Fix compilation error when compiling with clang-3.4 in C++11 mode: in C++11 concatenated string literals need to have a space in between.

2015-02-19 Thread Stefan Puiu
From: Stefan Puiu 

Sample error message:
dpdk/include/rte_pci.h:96:26: error: invalid suffix on literal; C++11 requires 
a space between literal and identifier [-Wreserved-user-defined-literal]
---
 lib/librte_eal/common/include/rte_pci.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..12ae5a7 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -93,10 +93,10 @@ extern struct pci_device_list pci_device_list; /**< Global 
list of PCI devices.
 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

 /** Formatting string for PCI device identifier: Ex: :00:01.0 */
-#define PCI_PRI_FMT "%.4"PRIx16":%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

 /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
-#define PCI_SHORT_PRI_FMT "%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

 /** Nb. of values in PCI device identifier format string. */
 #define PCI_FMT_NVAL 4
-- 
1.7.9.5



[dpdk-dev] [PATCH] Fix C++11 compilation error with rte_pci.h

2015-02-19 Thread Stefan Puiu
 In C++11 concatenated string literals need to have a
 space in between. Found while trying to compile with clang++-3.4. 

Sample error message:
dpdk/include/rte_pci.h:96:26: error: invalid suffix on literal; C++11 requires 
a space between literal and identifier [-Wreserved-user-defined-literal]
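
For illustration, here is a minimal sketch of a format string written in
the spaced style. In C++11 a string literal immediately followed by an
identifier is parsed as a literal with a user-defined suffix, so the PRIx
macros are never expanded; with spaces the same line is accepted by both C
and C++11 compilers. DEMO_PCI_PRI_FMT and demo_format_pci_addr() are
hypothetical names, not part of rte_pci.h.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Same shape as the patched macro: a space on both sides of every
 * PRIx macro so the preprocessor sees separate tokens. */
#define DEMO_PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

/* Hypothetical helper: format domain:bus:devid.function into buf.
 * Returns the snprintf() result (characters that would be written). */
static int
demo_format_pci_addr(char *buf, size_t len, uint16_t domain,
		uint8_t bus, uint8_t devid, uint8_t function)
{
	return snprintf(buf, len, DEMO_PCI_PRI_FMT,
			domain, bus, devid, function);
}
```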

Signed-off-by: Stefan Puiu 
---
 lib/librte_eal/common/include/rte_pci.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..12ae5a7 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -93,10 +93,10 @@ extern struct pci_device_list pci_device_list; /**< Global 
list of PCI devices.
 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

 /** Formatting string for PCI device identifier: Ex: :00:01.0 */
-#define PCI_PRI_FMT "%.4"PRIx16":%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

 /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
-#define PCI_SHORT_PRI_FMT "%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

 /** Nb. of values in PCI device identifier format string. */
 #define PCI_FMT_NVAL 4
-- 
1.7.9.5



[dpdk-dev] [PATCH v9 13/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-19 Thread Tetsuya Mukawa
On 2015/02/19 22:30, Tetsuya Mukawa wrote:
> On 2015/02/19 21:10, Thomas Monjalon wrote:
>> 2015-02-19 11:49, Tetsuya Mukawa:
>>> +/* attach the new virtual device, then store port_id of the device */
>>> +static int
>>> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
>>> +{
>>> +   char *args;
>>> +   uint8_t new_port_id;
>>> +   struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>>> +
>>> +   if ((vdevargs == NULL) || (port_id == NULL))
>>> +   goto err0;
>>> +
>>> +   args = strdup(vdevargs);
>>> +   if (args == NULL)
>>> +   goto err0;
>>> +
>>> +   /* save current port status */
>>> +   if (rte_eth_dev_save(devs, sizeof(devs)))
>>> +   goto err1;
>>> +   /* add the vdevargs to devargs_list */
>>> +   if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
>>> +   goto err1;
>> Could you explain why you store devargs in a list?
> I am trying to match what happens when rte_eal_init() is called.

Sorry for the lack of explanation.

The "vdevargs" argument of rte_eal_dev_attach_vdev() has the same format
as the "--vdev" option.
When rte_eal_init() is called, the value of each "--vdev" option is
stored in devargs_list.
So I am trying to do the same thing here.

> If hotplug were the only path that doesn't do this, someone may be
> confused when they try to understand devargs_list.
>
>>> +   /* parse vdevargs, then retrieve device name */
>>> +   get_vdev_name(args);
>>> +   /* walk around dev_driver_list to find the driver of the device,
>>> +* then invoke probe function o the driver */
>>> +   if (rte_eal_vdev_find_and_init(args))
>> TODO: get port_id from init.
> Yes, I will.
> I will also add a comment about it.
>
>>> +   goto err2;
>>> +   /* get port_id enabled by above procedures */
>>> +   if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>>> +   goto err2;
>> [...]
>>> --- a/lib/librte_eal/common/include/rte_dev.h
>>> +++ b/lib/librte_eal/common/include/rte_dev.h
>>> @@ -47,6 +47,7 @@ extern "C" {
>>>  #endif
>>>  
>>>  #include 
>>> +#include 
>>>  
>>>  /** Double linked list of device drivers. */
>>>  TAILQ_HEAD(rte_driver_list, rte_driver);
>>> @@ -57,6 +58,11 @@ TAILQ_HEAD(rte_driver_list, rte_driver);
>>>  typedef int (rte_dev_init_t)(const char *name, const char *args);
>>>  
>>>  /**
>>> + * Uninitilization function called for each device driver once.
>>> + */
>>> +typedef int (rte_dev_uninit_t)(const char *name);
>> Why using name as parameter and not port_id?
> This function pointer will be implemented in PMDs.
> For example, in pcap PMD, rte_pmd_pcap_devuninit() is the function.
>
> static struct rte_driver pmd_pcap_drv = {
> .name = "eth_pcap",
> .type = PMD_VDEV,
> .init = rte_pmd_pcap_devinit,
> .uninit = rte_pmd_pcap_devuninit,
> };
>
> "port_id" isn't needed in PMD.
>
>> [...]
>>> --- a/lib/librte_eal/linuxapp/eal/Makefile
>>> +++ b/lib/librte_eal/linuxapp/eal/Makefile
>>> @@ -45,6 +45,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
>>>  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
>>>  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
>>>  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
>>> +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
>> Why do you need mbuf?
> To include rte_ethdev.h in this code, rte_mbuf.h is also needed.
>
>> [...]
>>> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>>> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>>> @@ -21,6 +21,8 @@ DPDK_2.0 {
>>> rte_eal_alarm_cancel;
>>> rte_eal_alarm_set;
>>> rte_eal_dev_init;
>>> +   rte_eal_dev_attach;
>>> +   rte_eal_dev_detach;
>> Please keep alphabetical order.
>>
> Sure, I will.
>
> Thanks,
> Tetsuya
>




[dpdk-dev] [PATCH v9 13/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-19 Thread Tetsuya Mukawa
On 2015/02/19 21:10, Thomas Monjalon wrote:
> 2015-02-19 11:49, Tetsuya Mukawa:
>> +/* attach the new virtual device, then store port_id of the device */
>> +static int
>> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
>> +{
>> +char *args;
>> +uint8_t new_port_id;
>> +struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>> +
>> +if ((vdevargs == NULL) || (port_id == NULL))
>> +goto err0;
>> +
>> +args = strdup(vdevargs);
>> +if (args == NULL)
>> +goto err0;
>> +
>> +/* save current port status */
>> +if (rte_eth_dev_save(devs, sizeof(devs)))
>> +goto err1;
>> +/* add the vdevargs to devargs_list */
>> +if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
>> +goto err1;
> Could you explain why you store devargs in a list?

I am trying to match what happens when rte_eal_init() is called.
If hotplug were the only path that doesn't do this, someone may be
confused when they try to understand devargs_list.

>> +/* parse vdevargs, then retrieve device name */
>> +get_vdev_name(args);
>> +/* walk around dev_driver_list to find the driver of the device,
>> + * then invoke probe function o the driver */
>> +if (rte_eal_vdev_find_and_init(args))
> TODO: get port_id from init.
Yes, I will.
I will also add a comment about it.

>> +goto err2;
>> +/* get port_id enabled by above procedures */
>> +if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>> +goto err2;
> [...]
>> --- a/lib/librte_eal/common/include/rte_dev.h
>> +++ b/lib/librte_eal/common/include/rte_dev.h
>> @@ -47,6 +47,7 @@ extern "C" {
>>  #endif
>>  
>>  #include 
>> +#include 
>>  
>>  /** Double linked list of device drivers. */
>>  TAILQ_HEAD(rte_driver_list, rte_driver);
>> @@ -57,6 +58,11 @@ TAILQ_HEAD(rte_driver_list, rte_driver);
>>  typedef int (rte_dev_init_t)(const char *name, const char *args);
>>  
>>  /**
>> + * Uninitilization function called for each device driver once.
>> + */
>> +typedef int (rte_dev_uninit_t)(const char *name);
> Why using name as parameter and not port_id?

This function pointer will be implemented in PMDs.
For example, in pcap PMD, rte_pmd_pcap_devuninit() is the function.

static struct rte_driver pmd_pcap_drv = {
.name = "eth_pcap",
.type = PMD_VDEV,
.init = rte_pmd_pcap_devinit,
.uninit = rte_pmd_pcap_devuninit,
};

"port_id" isn't needed in PMD.

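To make the name-based design concrete, here is a small self-contained
sketch of a driver table whose uninit callback is looked up by device
name. It only mirrors the shape of the struct above; demo_driver,
demo_vdev_uninit() and the prefix-matching rule are assumptions for
illustration, not the code in this patch.

```c
#include <stddef.h>
#include <string.h>

typedef int (demo_dev_init_t)(const char *name, const char *args);
typedef int (demo_dev_uninit_t)(const char *name);

/* Hypothetical stand-in for struct rte_driver. */
struct demo_driver {
	const char *name;           /* driver name, e.g. "eth_demo" */
	demo_dev_init_t *init;
	demo_dev_uninit_t *uninit;
};

static int demo_active; /* number of devices currently initialized */

static int
demo_devinit(const char *name, const char *args)
{
	(void)name;
	(void)args;
	demo_active++;
	return 0;
}

static int
demo_devuninit(const char *name)
{
	(void)name;
	if (demo_active == 0)
		return -1;
	demo_active--;
	return 0;
}

static struct demo_driver demo_drivers[] = {
	{ .name = "eth_demo", .init = demo_devinit, .uninit = demo_devuninit },
};

/* Find the driver whose name is a prefix of the device name and call
 * its uninit callback. No port_id is involved at this level. */
static int
demo_vdev_uninit(const char *devname)
{
	size_t i;

	if (devname == NULL)
		return -1;

	for (i = 0; i < sizeof(demo_drivers) / sizeof(demo_drivers[0]); i++) {
		const struct demo_driver *drv = &demo_drivers[i];

		if (strncmp(drv->name, devname, strlen(drv->name)) == 0 &&
		    drv->uninit != NULL)
			return drv->uninit(devname);
	}
	return -1;
}
```

The caller only needs the device name to reach the right driver; mapping
the name to a port is left to the driver itself.
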
>
> [...]
>> --- a/lib/librte_eal/linuxapp/eal/Makefile
>> +++ b/lib/librte_eal/linuxapp/eal/Makefile
>> @@ -45,6 +45,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
>> +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> Why do you need mbuf?

To include rte_ethdev.h in this code, rte_mbuf.h is also needed.

> [...]
>> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> @@ -21,6 +21,8 @@ DPDK_2.0 {
>>  rte_eal_alarm_cancel;
>>  rte_eal_alarm_set;
>>  rte_eal_dev_init;
>> +rte_eal_dev_attach;
>> +rte_eal_dev_detach;
> Please keep alphabetical order.
>

Sure, I will.

Thanks,
Tetsuya



[dpdk-dev] [PATCH v9 08/14] ethdev: Add functions that will be used by port hotplug functions

2015-02-19 Thread Tetsuya Mukawa
On 2015/02/19 20:24, Thomas Monjalon wrote:
> 2015-02-19 11:49, Tetsuya Mukawa:
>> --- a/lib/librte_ether/rte_ether_version.map
>> +++ b/lib/librte_ether/rte_ether_version.map
>> @@ -109,6 +109,13 @@ DPDK_2.0 {
>>  rte_eth_tx_queue_setup;
>>  rte_eth_xstats_get;
>>  rte_eth_xstats_reset;
>> +rte_eth_dev_allocated;
>> +rte_eth_dev_is_detachable;
>> +rte_eth_dev_get_name_by_port;
>> +rte_eth_dev_get_addr_by_port;
>> +rte_eth_dev_get_port_by_addr;
>> +rte_eth_dev_get_changed_port;
>> +rte_eth_dev_save;
> In this file, alphabetical order is preferred.
>

Hi Thomas,

Sure, I will change order.

Tetsuya





[dpdk-dev] [PATCH v9 02/14] eal_pci: Add flag to hold kernel driver type

2015-02-19 Thread Tetsuya Mukawa
On 2015/02/19 20:17, Thomas Monjalon wrote:
>> @@ -152,6 +159,7 @@ struct rte_pci_device {
>>  uint16_t max_vfs;   /**< sriov enable if not zero */
>>  int numa_node;  /**< NUMA node connection */
>>  struct rte_devargs *devargs;/**< Device user arguments */
>> +enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
> [...]
>> +static int
>> +pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
> I think "kernel driver" is a good name. Why not using this name in the
> pci_device struct to be more consistent?

Hi Michael,

Could you please let me know what you think about it?

Thanks
Tetsuya

> Thanks



[dpdk-dev] Binding NICs while running DPDK app

2015-02-19 Thread Daniele Di Proietto
Hi,

I would like to implement a scenario like the following:

- Some NICs are bound to the igb_uio driver
- A DPDK application is started and it sends/receives packets to/from those NICs.
- Another set of NICs is bound to the igb_uio driver.
- The same DPDK application continues processing packets from the old
NICs and starts processing packets from the new NICs as well.

I have done some tests and the DPDK application sees the new NICs
after a call to rte_eal_pci_probe(). I have some questions, though:

- Can I call rte_eal_pci_probe() whenever I want? Will it interfere
with existing NICs' operations? (the documentation doesn't say much
about that)
- Will it work with DPDK 1.7 and DPDK 1.8? I'm asking because I've
noticed that in DPDK 1.7 rte_eal_pci_probe() checks if a NIC is
already registered, while in DPDK 1.8 this check has been removed.

I've seen that there's some work towards proper hotplug support, but I
was hoping to get the above (just adding, no removals) working with
current DPDK releases.

Thanks,

Daniele Di Proietto


[dpdk-dev] [PATCH v4 5/5] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch

2015-02-19 Thread Zhou Danny
v3 changes
- Add spinlock to ensure thread safety when accessing the interrupt mask
  register

v2 changes
- Remove unused function that was only for debugging

Demonstrate how to handle per rx queue interrupts in a NAPI-like
implementation in userspace. The DPDK polling thread mainly works in
polling mode and switches to interrupt mode only if no packet has been
received in recent polls.
Userspace interrupt notification generally takes a lot more cycles
than in the kernel, so a one-shot interrupt is used here to guarantee
minimum overhead, and the DPDK polling thread returns to polling mode
immediately once it receives an interrupt notification for an incoming
packet.
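
The mode switch described above can be sketched as a small loop. This is
an illustrative model under assumptions, not the code in main.c: the
demo_rx_ops callbacks, the fake NIC and DEMO_MIN_ZERO_POLL_COUNT are
hypothetical stand-ins for the rx burst call, the one-shot interrupt
enable and the blocking wait.

```c
#include <stdint.h>

#define DEMO_MIN_ZERO_POLL_COUNT 10 /* empty polls before sleeping */

struct demo_rx_ops {
	uint16_t (*rx_burst)(void *ctx); /* returns packets received */
	void (*intr_enable)(void *ctx);  /* arm a one-shot rx interrupt */
	void (*intr_wait)(void *ctx);    /* block until the interrupt */
};

/* Poll for a fixed number of iterations. After enough consecutive
 * empty polls, arm the interrupt, sleep once, then resume polling.
 * Returns the total number of packets received. */
static uint64_t
demo_rx_loop(const struct demo_rx_ops *ops, void *ctx, unsigned iterations)
{
	uint64_t total = 0;
	unsigned zero_polls = 0;
	unsigned i;

	for (i = 0; i < iterations; i++) {
		uint16_t n = ops->rx_burst(ctx);

		if (n > 0) {
			total += n;
			zero_polls = 0;
			continue;
		}
		if (++zero_polls < DEMO_MIN_ZERO_POLL_COUNT)
			continue;

		/* Idle for a while: switch to interrupt mode once. */
		ops->intr_enable(ctx);
		ops->intr_wait(ctx);
		zero_polls = 0;
	}
	return total;
}

/* Fake device used only to exercise the loop. */
struct demo_fake_nic {
	unsigned idle_polls_left; /* polls that return no packets */
	unsigned sleeps;          /* times the loop went to sleep */
};

static uint16_t
demo_fake_rx_burst(void *ctx)
{
	struct demo_fake_nic *nic = ctx;

	if (nic->idle_polls_left > 0) {
		nic->idle_polls_left--;
		return 0;
	}
	return 4;
}

static void
demo_fake_intr_enable(void *ctx)
{
	(void)ctx;
}

static void
demo_fake_intr_wait(void *ctx)
{
	struct demo_fake_nic *nic = ctx;

	nic->sleeps++;
}
```

Because the interrupt is one-shot, the loop has to re-arm it each time it
decides to sleep, which keeps the common busy-polling path free of any
interrupt handling cost.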

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 examples/l3fwd-power/main.c | 153 
 1 file changed, 112 insertions(+), 41 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index f6b55b9..d482710 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -75,12 +75,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1

 #define MAX_PKT_BURST 32

-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10

 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES   2ULL
@@ -156,6 +158,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];

+/* ethernet addresses of ports */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -188,6 +193,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128

+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
uint8_t port_id;
@@ -214,7 +222,7 @@ static uint16_t nb_lcore_params = 
sizeof(lcore_params_array_default) /

 static struct rte_eth_conf port_conf = {
.rxmode = {
-   .mq_mode= ETH_MQ_RX_RSS,
+   .mq_mode = ETH_MQ_RX_RSS,
.max_rx_pkt_len = ETHER_MAX_LEN,
.split_hdr_size = 0,
.header_split   = 0, /**< Header Split disabled */
@@ -226,11 +234,14 @@ static struct rte_eth_conf port_conf = {
.rx_adv_conf = {
.rss_conf = {
.rss_key = NULL,
-   .rss_hf = ETH_RSS_IP,
+   .rss_hf = ETH_RSS_UDP,
},
},
.txmode = {
-   .mq_mode = ETH_DCB_NONE,
+   .mq_mode = ETH_MQ_TX_NONE,
+   },
+   .intr_conf = {
+   .rxq = 1, /**< rxq interrupt feature enabled */
},
 };

@@ -402,19 +413,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer 
*tim,
/* accumulate total execution time in us when callback is invoked */
sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
(float)SCALING_PERIOD;
-
/**
 * check whether need to scale down frequency a step if it sleep a lot.
 */
-   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-   rte_power_freq_down(lcore_id);
+   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
/**
 * scale down a step if average packet per iteration less
 * than expectation.
 */
-   rte_power_freq_down(lcore_id);
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }

/**
 * initialize another timer according to current frequency to ensure
@@ -707,22 +721,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,

 }

-#define SLEEP_GEAR1_THRESHOLD100
-#define SLEEP_GEAR2_THRESHOLD1000
+#define MINIMUM_SLEEP_TIME 1
+#define SUSPEND_THRESHOLD  300

 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-   /* If zero count is less than 100, use it as the sleep time in us */
-   if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-   return zero_rx_packet_count;
-   /* If zero count is less than 1000, sleep time should be 100 us */
-   else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-   (zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-   return SLEEP_GEAR1_THRESHOLD;
-   /* If zero count is greater than 1000, sleep time should be 1000 us */
-   else if (zero_

[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-19 Thread Zhou Danny
v4 changes:
- Adjust position of new-added structure fields

v3 changes:
- Fix review comments

v2 changes:
- Fix compilation issue for a missed header file
- Bug fix: free unreleased resources on the exception path before return
- Consolidate coding style related review comments

This patch does the following:
- Create multiple VFIO eventfds for rx queues.
- Handle per rx queue interrupts.
- Eliminate the unnecessary wakeup mechanism for a suspended DPDK polling
thread on rx interrupt by allowing the polling thread to epoll_wait for
the rx queue interrupt notification.
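
As a rough illustration of the last point, the sketch below waits on an
eventfd with epoll_wait, using only standard Linux calls. With VFIO the
kernel would signal the eventfd when the queue's interrupt vector fires;
here the caller writes to it instead. demo_register_queue_fd() and
demo_wait_rx_event() are hypothetical helpers, not functions from this
patch.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Register one per-queue eventfd with an epoll instance. */
static int
demo_register_queue_fd(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN };

	ev.data.fd = fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms for a queue event.
 * Returns the eventfd counter value, 0 on timeout, -1 on error. */
static int64_t
demo_wait_rx_event(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t count;
	int n;

	n = epoll_wait(epfd, &ev, 1, timeout_ms);
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;

	/* Reading the eventfd resets its counter so the next wait blocks. */
	if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

The polling thread can block in the wait call directly, so no separate
thread is needed to wake it up.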

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_eal/common/include/rte_eal.h|  12 ++
 lib/librte_eal/linuxapp/eal/Makefile   |   1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 -
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
 5 files changed, 175 insertions(+), 44 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..d81331f 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -150,6 +150,18 @@ int rte_eal_iopl_init(void);
  *   - On failure, a negative error value.
  */
 int rte_eal_init(int argc, char **argv);
+
+/**
+ * @param port_id
+ *   the port id
+ * @param queue_id
+ *   the queue id
+ * @return
+ *   - On success, return 0
+ *   - On failure, returns -1.
+ */
+int rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
+
 /**
  * Usage function typedef used by the application usage function.
  *
diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
b/lib/librte_eal/linuxapp/eal/Makefile
index e117cec..c593dfa 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_ring
+CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
 CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
 CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
 CFLAGS += -I$(RTE_SDK)/lib/librte_ether
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index dc2668a..97215ad 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -64,6 +64,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "eal_private.h"
 #include "eal_vfio.h"
@@ -127,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT

 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+   sizeof(int) * (VFIO_MAX_QUEUE_ID + 1))

 /* enable legacy (INTx) interrupts */
 static int
@@ -218,10 +222,10 @@ vfio_disable_intx(struct rte_intr_handle *intr_handle) {
return 0;
 }

-/* enable MSI-X interrupts */
+/* enable MSI interrupts */
 static int
 vfio_enable_msi(struct rte_intr_handle *intr_handle) {
-   int len, ret;
+   int len, ret, max_intr;
char irq_set_buf[IRQ_SET_BUF_LEN];
struct vfio_irq_set *irq_set;
int *fd_ptr;
@@ -230,12 +234,19 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {

irq_set = (struct vfio_irq_set *) irq_set_buf;
irq_set->argsz = len;
-   irq_set->count = 1;
+   if ((!intr_handle->max_intr) ||
+   (intr_handle->max_intr > VFIO_MAX_QUEUE_ID))
+   max_intr = VFIO_MAX_QUEUE_ID + 1;
+   else
+   max_intr = intr_handle->max_intr;
+
+   irq_set->count = max_intr;
irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
VFIO_IRQ_SET_ACTION_TRIGGER;
irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
irq_set->start = 0;
fd_ptr = (int *) &irq_set->data;
-   *fd_ptr = intr_handle->fd;
+   memcpy(fd_ptr, intr_handle->queue_fd, sizeof(intr_handle->queue_fd));
+   fd_ptr[max_intr - 1] = intr_handle->fd;

ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);

@@ -244,27 +255,10 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
intr_handle->fd);
return -1;
}
-
-   /* manually trigger interrupt to enable it */
-   memset(irq_set, 0, len);
-   len = sizeof(struct vfio_irq_set);
-   irq_set->argsz = len;
-   irq_set->count = 1;
-   irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-   irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-   irq_set->start = 0;
-
-   ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-   if (ret) {
-   RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-   intr_handle->fd);
-   re

[dpdk-dev] [PATCH v4 3/5] igb: enable rx queue interrupts for PF

2015-02-19 Thread Zhou Danny
v3 changes
- Remove unnecessary variables in e1000_mac_info
- Remove spinlock from PMD

v2 changes
- Consolidate review comments related to coding style

The patch does the following for the igb PF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions
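
To illustrate the IVAR step, the sketch below packs one entry into a
32-bit register value. It assumes each IVAR register holds four 8-bit
entries and that bit 7 of an entry marks it valid; both the layout and the
DEMO_IVAR_VALID value are assumptions to be checked against the NIC
datasheet, and demo_ivar_set_entry() is a hypothetical helper that works
on a plain integer, not on hardware.

```c
#include <stdint.h>

#define DEMO_IVAR_VALID 0x80u /* assumed "entry valid" bit */

/* Return ivar with the 8-bit entry at bit position offset (0, 8, 16
 * or 24) replaced by msix_vector plus the valid bit. The other three
 * entries are left untouched. */
static uint32_t
demo_ivar_set_entry(uint32_t ivar, uint8_t offset, uint8_t msix_vector)
{
	ivar &= ~((uint32_t)0xFF << offset);
	ivar |= ((uint32_t)msix_vector | DEMO_IVAR_VALID) << offset;
	return ivar;
}
```

A real driver would read the register, apply this kind of update and
write the result back, selecting the register index and offset from the
queue number and direction.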

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_pmd_e1000/e1000_ethdev.h |   3 +
 lib/librte_pmd_e1000/igb_ethdev.c   | 228 
 2 files changed, 206 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_e1000/e1000_ethdev.h 
b/lib/librte_pmd_e1000/e1000_ethdev.h
index d155e77..452c6bf 100644
--- a/lib/librte_pmd_e1000/e1000_ethdev.h
+++ b/lib/librte_pmd_e1000/e1000_ethdev.h
@@ -105,6 +105,9 @@
 #define E1000_FTQF_QUEUE_SHIFT   16
 #define E1000_FTQF_QUEUE_ENABLE  0x0100

+/* maximum number of other interrupts besides Rx & Tx interrupts */
+#define E1000_MAX_OTHER_INTR   1
+
 /* structure for interrupt relative data */
 struct e1000_interrupt {
uint32_t flags;
diff --git a/lib/librte_pmd_e1000/igb_ethdev.c 
b/lib/librte_pmd_e1000/igb_ethdev.c
index 2a268b8..bafa332 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -97,6 +97,7 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -191,6 +192,16 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_op filter_op,
 void *arg);

+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+   uint8_t queue, uint8_t msix_vector);
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+   uint8_t index, uint8_t offset);
+
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
  */
@@ -250,6 +261,8 @@ static struct eth_dev_ops eth_igb_ops = {
.vlan_tpid_set= eth_igb_vlan_tpid_set,
.vlan_offload_set = eth_igb_vlan_offload_set,
.rx_queue_setup   = eth_igb_rx_queue_setup,
+   .rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+   .rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
.rx_queue_release = eth_igb_rx_queue_release,
.rx_queue_count   = eth_igb_rx_queue_count,
.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -471,6 +484,7 @@ eth_igb_dev_init(__attribute__((unused)) struct eth_driver 
*eth_drv,
struct e1000_vfta * shadow_vfta =
E1000_DEV_PRIVATE_TO_VFTA(eth_dev->data->dev_private);
uint32_t ctrl_ext;
+   struct rte_eth_dev_info dev_info;

pci_dev = eth_dev->pci_dev;
   eth_dev->dev_ops = &eth_igb_ops;
@@ -592,6 +606,13 @@ eth_igb_dev_init(__attribute__((unused)) struct eth_driver 
*eth_drv,
 eth_dev->data->port_id, pci_dev->id.vendor_id,
 pci_dev->id.device_id);

+   /* set max interrupt vfio request */
+   memset(&dev_info, 0, sizeof(dev_info));
+   eth_igb_infos_get(eth_dev, &dev_info);
+
+   pci_dev->intr_handle.max_intr = dev_info.max_rx_queues +
+   E1000_MAX_OTHER_INTR;
+
rte_intr_callback_register(&(pci_dev->intr_handle),
eth_igb_interrupt_handler, (void *)eth_dev);

@@ -754,7 +775,7 @@ eth_igb_start(struct rte_eth_dev *dev)
 {
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   int ret, i, mask;
+   int ret, mask;
uint32_t ctrl_ext;

PMD_INIT_FUNC_TRACE();
@@ -794,6 +815,9 @@ eth_igb_start(struct rte_eth_dev *dev)
/* configure PF module if SRIOV enabled */
igb_pf_host_configure(dev);

+   /* confiugre msix for rx interrupt */
+   eth_igb_configure_msix_intr(dev);
+
/* Configure for OS presence */
igb_init_manageability(hw);

@@ -821,33 +845,9 @@ eth_igb_start(struct rte_eth_dev *dev)
igb_vmdq_vlan_hw_filter_enable(dev);
}

-   /*
-* Configure the Interrupt Moderation register (EITR) with the maximum
-* possible value (0x) to minimize "Sy

[dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-19 Thread Zhou Danny
v3 changes
- Remove spinlock from PMD

v2 changes
- Consolidate review comments related to coding style

The patch does the following for the ixgbe PF and VF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou 
Signed-off-by: Yong Liu 
Tested-by: Yong Liu 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 365 +++-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   6 +
 2 files changed, 367 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index d6d408e..7e72808 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -83,6 +83,9 @@
  */
 #define IXGBE_FC_LO0x40

+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680

@@ -173,6 +176,7 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -186,11 +190,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct 
ixgbe_dcb_config *dcb_conf
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct eth_driver *eth_drv,
struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -200,6 +207,15 @@ static void ixgbevf_vlan_strip_queue_set(struct 
rte_eth_dev *dev,
uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+   void *param);
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+uint8_t queue, uint8_t msix_vector);
+static void ixgbevf_configure_msix(struct  ixgbe_hw *hw);

 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -217,6 +233,14 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
uint8_t rule_id);

+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+   uint8_t queue, uint8_t msix_vector);
+static void ixgbe_configure_msix(struct  ixgbe_hw *hw);
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
uint16_t queue_idx, uint16_t tx_rate);
 static int ixgbe_set_vf_rate_limit(struct rte_eth_dev *dev, uint16_t vf,
@@ -257,7 +281,7 @@ static int ixgbe_dev_filter_ctrl(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur) \
 {   \
-   u32 latest = IXGBE_READ_REG(hw, reg);   \
+   uint32_t latest = IXGBE_READ_REG(hw, reg);   \
cur += latest - last;   \
last = latest;  \
 }
@@ -338,6 +362,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = {
.tx_queue_start   = ixgbe_dev_tx_queue_start,
.tx_queue_stop= ixgbe_dev_tx_queue_stop,
.rx_queue_setup   = ixgbe_dev_rx_queue_setup,
+   .rx_queue_intr_enable = ixgbe_dev_rx_queue_intr_enable,
+   .rx_queue_intr_disable = ixgbe_dev_rx_queue

[dpdk-dev] [PATCH v4 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Zhou Danny
v4 changes
- Export interrupt enable/disable functions for shared libraries
- Put new functions at the end of eth_dev_ops to avoid breaking ABI

v3 changes
- Add return value for interrupt enable/disable functions

Add two dev_ops functions to enable and disable rx queue interrupts
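
The dispatch pattern behind these two functions can be sketched outside DPDK
as below. This is a simplified illustration and not the code of the patch:
struct demo_dev, struct demo_dev_ops, demo_setup() and
demo_rx_queue_intr_enable() are hypothetical names. Only the idea is taken
from the patch: validate the port, return -ENOTSUP when the driver leaves the
callback unset, otherwise call through the function pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dev;

/* Hypothetical per-driver operations table; a NULL entry means "not supported". */
struct demo_dev_ops {
	int (*rx_queue_intr_enable)(struct demo_dev *dev, uint16_t queue_id);
	int (*rx_queue_intr_disable)(struct demo_dev *dev, uint16_t queue_id);
};

struct demo_dev {
	const struct demo_dev_ops *ops;
	uint64_t intr_mask; /* bit n set => rx interrupt enabled on queue n */
};

#define DEMO_MAX_PORTS 4
static struct demo_dev demo_devices[DEMO_MAX_PORTS];
static uint8_t demo_nb_ports;

/* Generic entry point: checks the port, then dispatches to the driver. */
int demo_rx_queue_intr_enable(uint8_t port_id, uint16_t queue_id)
{
	struct demo_dev *dev;

	if (port_id >= demo_nb_ports)
		return -ENODEV;

	dev = &demo_devices[port_id];
	if (dev->ops == NULL || dev->ops->rx_queue_intr_enable == NULL)
		return -ENOTSUP;

	return dev->ops->rx_queue_intr_enable(dev, queue_id);
}

/* Example driver callback: record the enabled queue in a bit mask. */
static int demo_drv_intr_enable(struct demo_dev *dev, uint16_t queue_id)
{
	if (queue_id >= 64)
		return -EINVAL;
	dev->intr_mask |= UINT64_C(1) << queue_id;
	return 0;
}

/* One driver implements the enable callback, the other implements nothing. */
static const struct demo_dev_ops demo_full_ops = {
	.rx_queue_intr_enable = demo_drv_intr_enable,
};
static const struct demo_dev_ops demo_empty_ops = { NULL, NULL };

/* Register port 0 with the full driver and port 1 with the empty one. */
void demo_setup(void)
{
	demo_devices[0].ops = &demo_full_ops;
	demo_devices[0].intr_mask = 0;
	demo_devices[1].ops = &demo_empty_ops;
	demo_devices[1].intr_mask = 0;
	demo_nb_ports = 2;
}
```

Keeping optional callbacks as NULL-able pointers lets drivers that do not
support rx interrupts stay unchanged, while callers get a clear -ENOTSUP.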

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_ether/rte_ethdev.c  | 43 +
 lib/librte_ether/rte_ethdev.h  | 59 ++
 lib/librte_ether/rte_ether_version.map |  2 ++
 3 files changed, 104 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ea3a1fb..d27469a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
}
rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+int
+rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
+   uint16_t queue_id)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+
+   dev = &rte_eth_devices[port_id];
+   if (dev == NULL) {
+   PMD_DEBUG_TRACE("Invalid port device\n");
+   return (-ENODEV);
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+   return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+}
+
+int
+rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
+   uint16_t queue_id)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+
+   dev = &rte_eth_devices[port_id];
+   if (dev == NULL) {
+   PMD_DEBUG_TRACE("Invalid port device\n");
+   return (-ENODEV);
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+   return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+}
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 84160c3..43035c2 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -848,6 +848,8 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
uint16_t lsc;
+   /** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+   uint16_t rxq;
 };

 /**
@@ -1109,6 +,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev 
*dev,
const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */

+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+   uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+   uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */

@@ -1520,6 +1530,10 @@ struct eth_dev_ops {
eth_remove_flex_filter_t   remove_flex_filter;   /**< remove flex 
filter. */
eth_get_flex_filter_t  get_flex_filter;  /**< get flex 
filter. */
eth_filter_ctrl_t  filter_ctrl;  /**< common filter 
control*/
+
+   /** Enable/disable Rx queue interrupt. */
+   eth_rx_enable_intr_t   rx_queue_intr_enable; /**< Enable Rx queue 
interrupt. */
+   eth_rx_disable_intr_t  rx_queue_intr_disable; /**< Disable Rx queue 
interrupt.*/
 };

 /**
@@ -2811,6 +2825,51 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev 
*dev,
enum rte_eth_event_type event);

 /**
+ * When there is no rx packet coming in Rx Queue for a long time, we can
+ * sleep lcore related to RX Queue for power saving, and enable rx interrupt
+ * to be triggered when rx packet arrives.
+ *
+ * The rte_eth_dev_rx_queue_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ * that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
+   uint16_t queue_id);

[dpdk-dev] [PATCH v4 0/5] Interrupt mode PMD

2015-02-19 Thread Zhou Danny
v4 changes
- Export interrupt enable/disable functions for shared libraries
- Adjust position of new-added structure fields and functions to
avoid breaking ABI

v3 changes
- Add return value for interrupt enable/disable functions
- Move spinlock from PMD to L3fwd-power
- Remove unnecessary variables in e1000_mac_info
- Fix miscellaneous review comments

v2 changes
- Fix compilation issue in Makefile for missed header file.
- Consolidate internal and community review comments of v1 patch set.

This patch series introduces low-latency one-shot rx interrupts into DPDK,
together with an example of switching between polling and interrupt mode.

The DPDK userspace interrupt notification and handling mechanism is based on
UIO and has the following limitations:
1) It is designed to handle the LSC interrupt only, with an inefficient
suspended-pthread wakeup procedure (e.g. UIO wakes up the LSC interrupt
handling thread, which then wakes up the DPDK polling thread). This introduces
non-deterministic wakeup latency for the DPDK polling thread, as well as packet
latency if it is used to handle rx interrupts.
2) UIO only supports a single interrupt vector, which has to be shared by the
LSC interrupt and the interrupts assigned to dedicated rx queues.

This patchset includes the following features:
1) Enable one-shot rx queue interrupts in the ixgbe PMD (PF & VF) and the igb
PMD (PF only).
2) Build on top of the VFIO mechanism instead of UIO, so it can support
up to 64 interrupt vectors for rx queue interrupts.
3) Have one DPDK polling thread handle each rx queue interrupt with a dedicated
VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
user space.
4) Demonstrate the interrupt control APIs and a userspace NAPI-like
polling/interrupt switch algorithm in the L3fwd-power example.

Known limitations:
1) It does not work with UIO, because a single interrupt eventfd shared by the
LSC and rx queue interrupt handlers causes conflicts.
2) The LSC interrupt is not supported by the VF driver, so it is disabled by
default in L3fwd-power for now. Feel free to turn it on if you want to support
both LSC and rx queue interrupts on a PF.
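
The polling/interrupt switch described above can be sketched as a small state
machine. This is a minimal sketch under stated assumptions, not the algorithm
from the L3fwd-power patch: the names demo_rxq_state, demo_after_poll() and
demo_on_wakeup(), and the threshold of 3 empty polls, are all hypothetical.
The idea is that a thread keeps polling while packets arrive, and only arms
the one-shot rx interrupt and sleeps after several consecutive empty polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: consecutive empty polls before going to sleep. */
#define DEMO_IDLE_THRESHOLD 3

enum demo_mode { DEMO_POLLING, DEMO_INTERRUPT };

struct demo_rxq_state {
	enum demo_mode mode;
	uint32_t idle_polls;   /* consecutive polls that returned no packets */
	uint32_t intr_enables; /* how many times the interrupt was armed */
};

/*
 * Feed the result of one rx burst into the state machine.
 * Returns true when the caller should arm the one-shot rx interrupt and
 * block until a packet arrives; returns false to keep polling.
 */
bool demo_after_poll(struct demo_rxq_state *st, uint16_t nb_rx)
{
	if (nb_rx > 0) {
		/* Traffic is flowing: stay in (or return to) polling mode. */
		st->idle_polls = 0;
		st->mode = DEMO_POLLING;
		return false;
	}

	if (++st->idle_polls < DEMO_IDLE_THRESHOLD)
		return false;

	/* Queue has been idle long enough: switch to interrupt mode. */
	st->mode = DEMO_INTERRUPT;
	st->intr_enables++;
	return true;
}

/* Called when the blocked thread is woken up by the rx interrupt. */
void demo_on_wakeup(struct demo_rxq_state *st)
{
	st->idle_polls = 0;
	st->mode = DEMO_POLLING;
}
```

In a real application the "arm and block" step would call the rx queue
interrupt enable function and wait on the per-queue event; here that step is
left to the caller so the sketch stays self-contained.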

Danny Zhou (5):
  ethdev: add rx interrupt enable/disable functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  eal: add per rx queue interrupt handling based on VFIO
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode  
  switch

 examples/l3fwd-power/main.c| 153 ++---
 lib/librte_eal/common/include/rte_eal.h|  12 +
 lib/librte_eal/linuxapp/eal/Makefile   |   1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 ---
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
 lib/librte_ether/rte_ethdev.c  |  43 +++
 lib/librte_ether/rte_ethdev.h  |  59 
 lib/librte_ether/rte_ether_version.map |   2 +
 lib/librte_pmd_e1000/e1000_ethdev.h|   3 +
 lib/librte_pmd_e1000/igb_ethdev.c  | 228 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c| 365 -
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h|   6 +
 13 files changed, 964 insertions(+), 114 deletions(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH v9 2/2] librte_pmd_null: Support port hotplug function

2015-02-19 Thread Tetsuya Mukawa
This patch adds port hotplug support to Null PMD.

v9:
 - Use rte_eth_dev_release_port() instead of rte_eth_dev_free().
v7:
 - Add parameter checks.
   (Thanks to Iremonger, Bernard)
v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Fix commit title.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_null/rte_eth_null.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/librte_pmd_null/rte_eth_null.c 
b/lib/librte_pmd_null/rte_eth_null.c
index 779db63..5cf40f4 100644
--- a/lib/librte_pmd_null/rte_eth_null.c
+++ b/lib/librte_pmd_null/rte_eth_null.c
@@ -336,6 +336,13 @@ eth_stats_reset(struct rte_eth_dev *dev)
}
 }

+static struct eth_driver rte_null_pmd = {
+   .pci_drv = {
+   .name = "rte_null_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 static void
 eth_queue_release(void *q)
 {
@@ -429,10 +436,12 @@ eth_dev_null_create(const char *name,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name, eth_dev->data->name, strlen(eth_dev->data->name));

eth_dev->data = data;
eth_dev->dev_ops = &ops;
eth_dev->pci_dev = pci_dev;
+   eth_dev->driver = &rte_null_pmd;

/* finally assign rx and tx ops */
if (packet_copy) {
@@ -532,10 +541,36 @@ rte_pmd_null_devinit(const char *name, const char *params)
return eth_dev_null_create(name, numa_node, packet_size, packet_copy);
 }

+static int
+rte_pmd_null_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   RTE_LOG(INFO, PMD, "Closing null ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   /* find the ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_null_drv = {
.name = "eth_null",
.type = PMD_VDEV,
.init = rte_pmd_null_devinit,
+   .uninit = rte_pmd_null_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_null_drv);
-- 
1.9.1



[dpdk-dev] [PATCH v9 1/2] librte_pmd_null: Add Null PMD

2015-02-19 Thread Tetsuya Mukawa
Null PMD is a driver for a virtual device designed particularly to measure
the performance of DPDK PMDs. When an application calls rx, Null PMD just
allocates mbufs and returns them. On tx, the PMD just frees the mbufs.

The PMD has the following options.
- size: specify the packet size allocated by RX. The default packet size is 64.
- copy: specify 1 or 0 to enable or disable copying during RX and TX.
  The default value is 0 (disabled).
  This option is used to emulate more realistic data transfer.
  The copy size is equal to the packet size.
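
The rx/tx behavior and the two options can be illustrated with a plain C
sketch. It is not the PMD code: it uses malloc()/free() in place of mbuf
allocation, and struct demo_buf, struct demo_null_queue and the demo_null_*()
functions are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_PKT_SIZE 2048

/* Stand-in for an mbuf: a length followed by the packet bytes. */
struct demo_buf {
	uint16_t len;
	uint8_t data[];
};

struct demo_null_queue {
	uint16_t packet_size;             /* "size" option, default 64 */
	int packet_copy;                  /* "copy" option, default 0 */
	uint8_t dummy[DEMO_MAX_PKT_SIZE]; /* scratch packet for the copy option */
	uint64_t rx_pkts;
	uint64_t tx_pkts;
};

/* Returns 0 on success, -1 if the requested size does not fit. */
int demo_null_queue_init(struct demo_null_queue *q, uint16_t size, int copy)
{
	if (size > DEMO_MAX_PKT_SIZE)
		return -1;
	memset(q, 0, sizeof(*q));
	q->packet_size = size;
	q->packet_copy = copy;
	return 0;
}

/* rx: allocate buffers of the configured size and hand them to the caller. */
uint16_t demo_null_rx(struct demo_null_queue *q, struct demo_buf **bufs,
		uint16_t nb_bufs)
{
	uint16_t i;

	for (i = 0; i < nb_bufs; i++) {
		bufs[i] = malloc(sizeof(struct demo_buf) + q->packet_size);
		if (bufs[i] == NULL)
			break;
		bufs[i]->len = q->packet_size;
		if (q->packet_copy)
			memcpy(bufs[i]->data, q->dummy, q->packet_size);
	}
	q->rx_pkts += i;
	return i;
}

/* tx: optionally copy the payload out, then free every buffer. */
uint16_t demo_null_tx(struct demo_null_queue *q, struct demo_buf **bufs,
		uint16_t nb_bufs)
{
	uint16_t i;

	for (i = 0; i < nb_bufs; i++) {
		if (q->packet_copy)
			memcpy(q->dummy, bufs[i]->data, bufs[i]->len);
		free(bufs[i]);
	}
	q->tx_pkts += i;
	return i;
}
```

With copy disabled, the only cost per packet is allocation and release, which
is why such a driver is useful as a baseline when measuring other PMDs.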

To use the PMD, enable CONFIG_RTE_BUILD_SHARED_LIB in the config file, then
compile the PMD as a shared library. The library can be loaded with the '-d'
option when the application is invoked.

Here is an example.
$ sudo ./testpmd -c f -n 4 -d librte_pmd_null.so \
--vdev 'eth_null0' --vdev 'eth_null1' -- -i --no-flush-rx

If testpmd is compiled with CONFIG_RTE_BUILD_SHARED_LIB, you may need to
specify more libraries using the '-d' option.

v8:
 - Fix Makefile and add version map file.
   (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
 - Add parameter checks.
   (Thanks to Iremonger, Bernard)
 - Remove needless "__rte_unused".
v4:
 - Fix memory leak.
   (Thanks to Iremonger, Bernard)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp |   5 +
 config/common_linuxapp   |   5 +
 lib/Makefile |   1 +
 lib/librte_pmd_null/Makefile |  62 +++
 lib/librte_pmd_null/rte_eth_null.c   | 541 +++
 lib/librte_pmd_null/rte_pmd_null_version.map |   4 +
 6 files changed, 618 insertions(+)
 create mode 100644 lib/librte_pmd_null/Makefile
 create mode 100644 lib/librte_pmd_null/rte_eth_null.c
 create mode 100644 lib/librte_pmd_null/rte_pmd_null_version.map

diff --git a/config/common_bsdapp b/config/common_bsdapp
index e9d07e4..d019edb 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -241,6 +241,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
 CONFIG_RTE_LIBRTE_PMD_BOND=y

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 5f7fe28..ed749f2 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -248,6 +248,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
 CONFIG_RTE_LIBRTE_PMD_XENVIRT=n

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/lib/Makefile b/lib/Makefile
index 6575a4e..5fcbb3c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += librte_pmd_null
 DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
diff --git a/lib/librte_pmd_null/Makefile b/lib/librte_pmd_null/Makefile
new file mode 100644
index 000..6472015
--- /dev/null
+++ b/lib/librte_pmd_null/Makefile
@@ -0,0 +1,62 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IGEL Co.,Ltd.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of IGEL Co.,Ltd. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(R

[dpdk-dev] [PATCH v3 6/6] bond: add unit tests for link bonding mode 6.

2015-02-19 Thread Michal Jastrzebski
From: Maciej Gajdzica 

Added 4 unit tests checking link bonding mode 6 behavior.

Also modified virtual_pmd so that it is possible to provide the packets
that should be received with rx_burst and to inspect the packets
transmitted by tx_burst.

In packet_burst_generator.c, the function creating eth_header is
modified so that it accepts ether_type as a parameter, and a function
creating arp_header is added. Other unit tests are updated to get
rid of compilation errors.
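
The header construction described above can be sketched in plain C as
follows. This is an illustration only: the demo_* structures and functions
are hypothetical stand-ins for the DPDK types, VLAN handling is left out, and
the packed attribute assumes GCC or Clang. As in the patch, the IP addresses
are stored as passed in, so the caller supplies them in network byte order.

```c
#include <arpa/inet.h> /* htons, htonl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETHER_ADDR_LEN  6
#define DEMO_ETHER_TYPE_IPV4 0x0800
#define DEMO_ETHER_TYPE_ARP  0x0806
#define DEMO_ARP_HRD_ETHER   1
#define DEMO_ARP_OP_REQUEST  1

struct demo_ether_hdr {
	uint8_t d_addr[DEMO_ETHER_ADDR_LEN];
	uint8_t s_addr[DEMO_ETHER_ADDR_LEN];
	uint16_t ether_type;
} __attribute__((packed));

struct demo_arp_hdr {
	uint16_t hrd; /* hardware address format */
	uint16_t pro; /* protocol address format */
	uint8_t hln;  /* hardware address length */
	uint8_t pln;  /* protocol address length */
	uint16_t op;  /* ARP opcode */
	uint8_t sha[DEMO_ETHER_ADDR_LEN]; /* sender hardware address */
	uint32_t sip;                     /* sender IP, network byte order */
	uint8_t tha[DEMO_ETHER_ADDR_LEN]; /* target hardware address */
	uint32_t tip;                     /* target IP, network byte order */
} __attribute__((packed));

/* The caller chooses the ether type instead of passing an "ipv4" flag. */
void demo_init_eth_header(struct demo_ether_hdr *eth, const uint8_t *src_mac,
		const uint8_t *dst_mac, uint16_t ether_type)
{
	memcpy(eth->d_addr, dst_mac, DEMO_ETHER_ADDR_LEN);
	memcpy(eth->s_addr, src_mac, DEMO_ETHER_ADDR_LEN);
	eth->ether_type = htons(ether_type);
}

void demo_init_arp_header(struct demo_arp_hdr *arp, const uint8_t *src_mac,
		const uint8_t *dst_mac, uint32_t src_ip, uint32_t dst_ip,
		uint16_t opcode)
{
	arp->hrd = htons(DEMO_ARP_HRD_ETHER);
	arp->pro = htons(DEMO_ETHER_TYPE_IPV4);
	arp->hln = DEMO_ETHER_ADDR_LEN;
	arp->pln = sizeof(uint32_t);
	arp->op = htons(opcode);
	memcpy(arp->sha, src_mac, DEMO_ETHER_ADDR_LEN);
	arp->sip = src_ip;
	memcpy(arp->tha, dst_mac, DEMO_ETHER_ADDR_LEN);
	arp->tip = dst_ip;
}
```

Passing the ether type directly is what allows the same helper to build IPv4,
IPv6 and ARP frames without adding one flag per protocol.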

Signed-off-by: Maciej Gajdzica 
---
 app/test/packet_burst_generator.c |   41 +++-
 app/test/packet_burst_generator.h |   11 +-
 app/test/test_link_bonding.c  |  439 +++--
 app/test/test_pmd_perf.c  |3 +-
 app/test/virtual_pmd.c|  109 +
 app/test/virtual_pmd.h|5 +-
 6 files changed, 533 insertions(+), 75 deletions(-)

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index e9d059c..b46eed7 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -80,11 +80,10 @@ copy_buf_to_pkt(void *buf, unsigned len, struct rte_mbuf 
*pkt, unsigned offset)
copy_buf_to_pkt_segs(buf, len, pkt, offset);
 }

-
 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-   struct ether_addr *dst_mac, uint8_t ipv4, uint8_t vlan_enabled,
-   uint16_t van_id)
+   struct ether_addr *dst_mac, uint16_t ether_type,
+   uint8_t vlan_enabled, uint16_t van_id)
 {
ether_addr_copy(dst_mac, &eth_hdr->d_addr);
ether_addr_copy(src_mac, &eth_hdr->s_addr);
@@ -95,19 +94,27 @@ initialize_eth_header(struct ether_hdr *eth_hdr, struct 
ether_addr *src_mac,

eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);

-   if (ipv4)
-   vhdr->eth_proto =  rte_cpu_to_be_16(ETHER_TYPE_IPv4);
-   else
-   vhdr->eth_proto =  rte_cpu_to_be_16(ETHER_TYPE_IPv6);
-
+   vhdr->eth_proto =  rte_cpu_to_be_16(ether_type);
vhdr->vlan_tci = van_id;
} else {
-   if (ipv4)
-   eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
-   else
-   eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv6);
+   eth_hdr->ether_type = rte_cpu_to_be_16(ether_type);
}
+}

+void
+initialize_arp_header(struct arp_hdr *arp_hdr, struct ether_addr *src_mac,
+   struct ether_addr *dst_mac, uint32_t src_ip, uint32_t dst_ip,
+   uint32_t opcode)
+{
+   arp_hdr->arp_hrd = rte_cpu_to_be_16(ARP_HRD_ETHER);
+   arp_hdr->arp_pro = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
+   arp_hdr->arp_hln = ETHER_ADDR_LEN;
+   arp_hdr->arp_pln = sizeof(uint32_t);
+   arp_hdr->arp_op = rte_cpu_to_be_16(opcode);
+   ether_addr_copy(src_mac, &arp_hdr->arp_data.arp_sha);
+   arp_hdr->arp_data.arp_sip = src_ip;
+   ether_addr_copy(dst_mac, &arp_hdr->arp_data.arp_tha);
+   arp_hdr->arp_data.arp_tip = dst_ip;
 }

 uint16_t
@@ -265,9 +272,19 @@ nomore_mbuf:
if (ipv4) {
pkt->vlan_tci  = ETHER_TYPE_IPv4;
pkt->l3_len = sizeof(struct ipv4_hdr);
+
+   if (vlan_enabled)
+   pkt->ol_flags = PKT_RX_IPV4_HDR | 
PKT_RX_VLAN_PKT;
+   else
+   pkt->ol_flags = PKT_RX_IPV4_HDR;
} else {
pkt->vlan_tci  = ETHER_TYPE_IPv6;
pkt->l3_len = sizeof(struct ipv6_hdr);
+
+   if (vlan_enabled)
+   pkt->ol_flags = PKT_RX_IPV6_HDR | 
PKT_RX_VLAN_PKT;
+   else
+   pkt->ol_flags = PKT_RX_IPV6_HDR;
}

pkts_burst[nb_pkt] = pkt;
diff --git a/app/test/packet_burst_generator.h 
b/app/test/packet_burst_generator.h
index 666cc8e..edc1044 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -40,6 +40,7 @@ extern "C" {

 #include 
 #include 
+#include 
 #include 
 #include 

@@ -50,11 +51,15 @@ extern "C" {
 #define PACKET_BURST_GEN_PKT_LEN 60
 #define PACKET_BURST_GEN_PKT_LEN_128 128

-
 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-   struct ether_addr *dst_mac, uint8_t ipv4, uint8_t vlan_enabled,
-   uint16_t van_id);
+   struct ether_addr *dst_mac, uint16_t ether_type,
+   uint8_t vlan_enabled, uint16_t van_id);
+
+void
+initialize_arp_header(struct arp_hdr *arp_hdr, struct ether_addr *src_mac,
+   struct ether_addr *dst_mac, uint32_t src_ip, uint32_t dst_ip,
+   uint32_t opcode);

 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
ind

[dpdk-dev] [PATCH v3 5/6] bond: modify TLB unit tests

2015-02-19 Thread Michal Jastrzebski
From: Daniel Mrzyglod 

This patch renames the mode from
BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING to BONDING_MODE_TLB.
It also changes the order of the TEST_ASSERT macros in
test_tlb_verify_slave_link_status_change_failover.

Signed-off-by: Daniel Mrzyglod 
---
 app/test/test_link_bonding.c|   27 ++-
 lib/librte_pmd_bond/rte_eth_bond.h  |2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c  |8 
 lib/librte_pmd_bond/rte_eth_bond_args.c |2 +-
 lib/librte_pmd_bond/rte_eth_bond_pmd.c  |   12 ++--
 5 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 579ebbf..dd6e357 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -4053,7 +4053,7 @@ test_tlb_tx_burst(void)
uint64_t floor_obytes = 0, ceiling_obytes = 0;

TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves
-   (BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 1, 3, 
1),
+   (BONDING_MODE_TLB, 1, 3, 1),
"Failed to initialise bonded device");

burst_size = 20 * test_params->bonded_slave_count;
@@ -4153,7 +4153,7 @@ test_tlb_rx_burst(void)

/* Initialize bonded device with 4 slaves in transmit load balancing 
mode */
TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves(
-   BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING,
+   BONDING_MODE_TLB,

TEST_ADAPTIVE_TRANSMIT_LOAD_BALANCING_RX_BURST_SLAVE_COUNT, 1, 1),
"Failed to initialize bonded device");

@@ -4231,7 +4231,7 @@ test_tlb_verify_promiscuous_enable_disable(void)

/* Initialize bonded device with 4 slaves in transmit load balancing 
mode */
TEST_ASSERT_SUCCESS( initialize_bonded_device_with_slaves(
-   BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 0, 4, 1),
+   BONDING_MODE_TLB, 0, 4, 1),
"Failed to initialize bonded device");

primary_port = rte_eth_bond_primary_get(test_params->bonded_port_id);
@@ -4289,7 +4289,7 @@ test_tlb_verify_mac_assignment(void)

/* Initialize bonded device with 2 slaves in active backup mode */
TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves(
-   BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 0, 2, 1),
+   BONDING_MODE_TLB, 0, 2, 1),
"Failed to initialize bonded device");

/* Verify that bonded MACs is that of first slave and that the other 
slave
@@ -4409,7 +4409,7 @@ test_tlb_verify_slave_link_status_change_failover(void)

/* Initialize bonded device with 4 slaves in round robin mode */
TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves(
-   BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 0,
+   BONDING_MODE_TLB, 0,

TEST_ADAPTIVE_TRANSMIT_LOAD_BALANCING_RX_BURST_SLAVE_COUNT, 1),
"Failed to initialize bonded device with slaves");

@@ -4472,20 +4472,21 @@ test_tlb_verify_slave_link_status_change_failover(void)
rte_delay_us(11000);
}

-   rte_eth_stats_get(test_params->slave_port_ids[2], &port_stats);
-   TEST_ASSERT_NOT_EQUAL(port_stats.opackets, (int8_t)0,
-   "(%d) port_stats.opackets not as expected\n",
-   test_params->slave_port_ids[2]);
-
rte_eth_stats_get(test_params->slave_port_ids[0], &port_stats);
TEST_ASSERT_EQUAL(port_stats.opackets, (int8_t)0,
-   "(%d) port_stats.opackets not as expected\n",
-   test_params->slave_port_ids[0]);
+   "(%d) port_stats.opackets not as expected\n",
+   test_params->slave_port_ids[0]);

rte_eth_stats_get(test_params->slave_port_ids[1], &port_stats);
TEST_ASSERT_NOT_EQUAL(port_stats.opackets, (int8_t)0,
+   "(%d) port_stats.opackets not as 
expected\n",
+   test_params->slave_port_ids[1]);
+
+
+   rte_eth_stats_get(test_params->slave_port_ids[2], &port_stats);
+   TEST_ASSERT_NOT_EQUAL(port_stats.opackets, (int8_t)0,
"(%d) port_stats.opackets not as expected\n",
-   test_params->slave_port_ids[1]);
+   test_params->slave_port_ids[2]);

rte_eth_stats_get(test_params->slave_port_ids[3], &port_stats);
TEST_ASSERT_NOT_EQUAL(port_stats.opackets, (int8_t)0,
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h 
b/lib/librte_pmd_bond/rte_eth_bond.h
index 13581cb..4117a70 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -96,7 +96,7 @@ extern "C" {
  * to rx_burst should be at least 2 times the slave cou

[dpdk-dev] [PATCH v3 4/6] bond: add example application for link bonding mode 6

2015-02-19 Thread Michal Jastrzebski
This patch contains an example application for link bonding mode 6.
It interacts with the user through a command prompt. Available commands are:
Start - starts ARP_thread, which responds to ARP_requests and sends
ARP_updates (this is enabled by default after startup),
Stop - stops ARP_thread,
Send count ip - sends count ARP requests for IP,
Show - prints basic bond information, such as IPv4 statistics from clients,
Help,
Quit.
The best way to test mode 6 is to use this example together with the
previous patch:
[PATCH 3/4] bond: add debug info for mode 6 link bonding.
Connect clients through a switch to the bonding machine and send:
arping -c 1 bond_ip or
generate IPv4 traffic to bond_ip (IPv4 traffic from different clients
should then be balanced across slaves in a round robin manner).
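
The round robin balancing mentioned above can be pictured with the following
sketch. It is only an assumption about the general idea and not the bonding
driver's implementation: struct demo_balancer and demo_slave_for_client() are
hypothetical, and each new client IP is simply assigned the next slave in
turn while known clients keep their slave.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_CLIENTS 16

struct demo_client {
	uint32_t ip;   /* client IPv4 address */
	uint8_t slave; /* index of the slave assigned to this client */
};

struct demo_balancer {
	struct demo_client clients[DEMO_MAX_CLIENTS];
	uint8_t nb_clients;
	uint8_t nb_slaves;
	uint8_t next_slave; /* slave that the next new client will get */
};

/*
 * Return the slave index for a client. Known clients keep their slave;
 * new clients get slaves in round robin order.
 * Returns -1 if there are no slaves or the client table is full.
 */
int demo_slave_for_client(struct demo_balancer *b, uint32_t ip)
{
	uint8_t i;

	if (b->nb_slaves == 0)
		return -1;

	for (i = 0; i < b->nb_clients; i++)
		if (b->clients[i].ip == ip)
			return b->clients[i].slave;

	if (b->nb_clients >= DEMO_MAX_CLIENTS)
		return -1;

	i = b->next_slave;
	b->clients[b->nb_clients].ip = ip;
	b->clients[b->nb_clients].slave = i;
	b->nb_clients++;
	b->next_slave = (uint8_t)((b->next_slave + 1) % b->nb_slaves);
	return i;
}
```

With two slaves, three clients would end up on slaves 0, 1 and 0, which is
the kind of distribution the Show command is meant to make visible.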

Signed-off-by: Michal Jastrzebski 
Signed-off-by: Maciej Gajdzica  
---
 examples/bond/Makefile |   57 
 examples/bond/main.c   |  796 
 examples/bond/main.h   |   46 +++
 3 files changed, 899 insertions(+)
 create mode 100644 examples/bond/Makefile
 create mode 100644 examples/bond/main.c
 create mode 100644 examples/bond/main.h

diff --git a/examples/bond/Makefile b/examples/bond/Makefile
new file mode 100644
index 000..9262249
--- /dev/null
+++ b/examples/bond/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = bond_app
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+CFLAGS += -O3
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/bond/main.c b/examples/bond/main.c
new file mode 100644
index 000..e62fbde
--- /dev/null
+++ b/examples/bond/main.c
@@ -0,0 +1,796 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, IN

[dpdk-dev] [PATCH v3 3/6] bond: add debug info for mode 6 link bonding

2015-02-19 Thread Michal Jastrzebski
This patch adds debug information for link bonding mode 6.
When CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB=y, it prints basic information about
ARP packets on RX and TX (MAC, IP, packet number, ARP packet type).
If CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB_L1 is enabled instead, use the show
command to see IPv4 balancing from clients.

Signed-off-by: Michal Jastrzebski 
---
 config/common_linuxapp |3 +-
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  199 +++-
 2 files changed, 198 insertions(+), 4 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..7c54edf 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -220,7 +220,8 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
 # Compile link bonding PMD library
 #
 CONFIG_RTE_LIBRTE_PMD_BOND=y
-
+CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB=n
+CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB_L1=n
 #
 # Compile software PMD backed by AF_PACKET sockets (Linux only)
 #
diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c 
b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index 39039ba..af2ef8c 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -192,17 +192,186 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf 
**bufs,
return num_rx_total;
 }

+#if defined(RTE_LIBRTE_BOND_DEBUG_ALB) || defined(RTE_LIBRTE_BOND_DEBUG_ALB_L1)
+uint32_t burstnumberRX;
+uint32_t burstnumberTX;
+
+#ifdef RTE_LIBRTE_BOND_DEBUG_ALB
+
+static void
+arp_op_name(uint16_t arp_op, char *buf)
+{
+   switch (arp_op) {
+   case ARP_OP_REQUEST:
+   snprintf(buf, sizeof("ARP Request"), "%s", "ARP Request");
+   return;
+   case ARP_OP_REPLY:
+   snprintf(buf, sizeof("ARP Reply"), "%s", "ARP Reply");
+   return;
+   case ARP_OP_REVREQUEST:
+   snprintf(buf, sizeof("Reverse ARP Request"), "%s",
+   "Reverse ARP Request");
+   return;
+   case ARP_OP_REVREPLY:
+   snprintf(buf, sizeof("Reverse ARP Reply"), "%s",
+   "Reverse ARP Reply");
+   return;
+   case ARP_OP_INVREQUEST:
+   snprintf(buf, sizeof("Peer Identify Request"), "%s",
+   "Peer Identify Request");
+   return;
+   case ARP_OP_INVREPLY:
+   snprintf(buf, sizeof("Peer Identify Reply"), "%s",
+   "Peer Identify Reply");
+   return;
+   default:
+   break;
+   }
+   snprintf(buf, sizeof("Unknown"), "%s", "Unknown");
+   return;
+}
+#endif
+#define MaxIPv4String  16
+static void
+ipv4_addr_to_dot(uint32_t be_ipv4_addr, char *buf, uint8_t buf_size)
+{
+   uint32_t ipv4_addr;
+
+   ipv4_addr = rte_be_to_cpu_32(be_ipv4_addr);
+   snprintf(buf, buf_size, "%d.%d.%d.%d", (ipv4_addr >> 24) & 0xFF,
+   (ipv4_addr >> 16) & 0xFF, (ipv4_addr >> 8) & 0xFF,
+   ipv4_addr & 0xFF);
+}
+
+#define MAX_CLIENTS_NUMBER 128
+uint8_t active_clients;
+struct client_stats_t {
+   uint8_t port;
+   uint32_t ipv4_addr;
+   uint32_t ipv4_rx_packets;
+   uint32_t ipv4_tx_packets;
+};
+struct client_stats_t client_stats[MAX_CLIENTS_NUMBER];
+
+static void
+update_client_stats(uint32_t addr, uint8_t port, uint32_t *TXorRXindicator)
+{
+   int i = 0;
+
+   for (; i < MAX_CLIENTS_NUMBER; i++) {
+   if ((client_stats[i].ipv4_addr == addr) && (client_stats[i].port == port))  {
+   /* Just update RX packets number for this client */
+   if (TXorRXindicator == &burstnumberRX)
+   client_stats[i].ipv4_rx_packets++;
+   else
+   client_stats[i].ipv4_tx_packets++;
+   return;
+   }
+   }
+   /* We have a new client. Insert him to the table, and increment stats */
+   if (TXorRXindicator == &burstnumberRX)
+   client_stats[active_clients].ipv4_rx_packets++;
+   else
+   client_stats[active_clients].ipv4_tx_packets++;
+   client_stats[active_clients].ipv4_addr = addr;
+   client_stats[active_clients].port = port;
+   active_clients++;
+
+}
+
+void print_client_stats(void);
+void print_client_stats(void)
+{
+   int i = 0;
+   char buf[MaxIPv4String];
+
+   for (; i < active_clients; i++) {
+   ipv4_addr_to_dot(client_stats[i].ipv4_addr, buf, MaxIPv4String);
+   printf("port:%d client:%s RX:%d TX:%d\n", client_stats[i].port, buf,
+   client_stats[i].ipv4_rx_packets,
+   client_stats[i].ipv4_tx_packets);
+   }
+}
+#ifdef RTE_LIBRTE_BOND_DEBUG_ALB
+#define MODE6_DEBUG(info, src_ip, dst_ip, eth_h, arp_op, port, burstnumber) \
+   RTE_LOG(DEBUG, PMD, \
+   "%s " \
+  

[dpdk-dev] [PATCH v3 2/6] bond: add link bonding mode 6 implementation

2015-02-19 Thread Michal Jastrzebski
From: Maciej Gajdzica 

This mode includes adaptive TLB and receive load balancing (RLB). In RLB
the bonding driver intercepts ARP replies sent by the local system and
overwrites their source MAC address, so that different peers send data to
the server on different slave interfaces. When the local system sends an
ARP request, the driver saves the IP information from it. When the ARP
reply from that peer is received, its MAC is stored, one of the slave MACs
is assigned, and an ARP reply is sent to that peer.

Signed-off-by: Maciej Gajdzica 
Signed-off-by: Michal Jastrzebski 
Signed-off-by: Daniel Mrzyglod 
---
 lib/librte_pmd_bond/Makefile   |1 +
 lib/librte_pmd_bond/rte_eth_bond.h |9 +
 lib/librte_pmd_bond/rte_eth_bond_alb.c |  256 +++
 lib/librte_pmd_bond/rte_eth_bond_alb.h |  109 
 lib/librte_pmd_bond/rte_eth_bond_api.c |   28 ++-
 lib/librte_pmd_bond/rte_eth_bond_args.c|1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  259 
 lib/librte_pmd_bond/rte_eth_bond_private.h |   12 ++
 8 files changed, 640 insertions(+), 35 deletions(-)
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.h

diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
index d6c81a8..cb16356 100644
--- a/lib/librte_pmd_bond/Makefile
+++ b/lib/librte_pmd_bond/Makefile
@@ -50,6 +50,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_alb.c

 ifeq ($(CONFIG_RTE_MBUF_REFCNT),n)
 $(info WARNING: Link Bonding Broadcast mode is disabled because it needs MBUF_REFCNT.)
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h b/lib/librte_pmd_bond/rte_eth_bond.h
index 7177983..13581cb 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -101,6 +101,15 @@ extern "C" {
  * This mode provides an adaptive transmit load balancing. It dynamically
  * changes the transmitting slave, according to the computed load. Statistics
  * are collected in 100ms intervals and scheduled every 10ms */
+#define BONDING_MODE_ALB   (6)
+/**< Adaptive Load Balancing (Mode 6)
+ * This mode includes adaptive TLB and receive load balancing (RLB). In RLB the
+ * bonding driver intercepts ARP replies send by local system and overwrites its
+ * source MAC address, so that different peers send data to the server on
+ * different slave interfaces. When local system sends ARP request, it saves IP
+ * information from it. When ARP reply from that peer is received, its MAC is
+ * stored, one of slave MACs assigned and ARP reply send to that peer.
+ */

 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
diff --git a/lib/librte_pmd_bond/rte_eth_bond_alb.c b/lib/librte_pmd_bond/rte_eth_bond_alb.c
new file mode 100644
index 000..55ed842
--- /dev/null
+++ b/lib/librte_pmd_bond/rte_eth_bond_alb.c
@@ -0,0 +1,256 @@
+#include "rte_eth_bond_private.h"
+#include "rte_eth_bond_alb.h"
+
+static inline uint8_t
+simple_hash(uint8_t *hash_start, int hash_size)
+{
+   int i;
+   uint8_t hash;
+
+   hash = 0;
+   for (i = 0; i < hash_size; ++i)
+   hash ^= hash_start[i];
+
+   return hash;
+}
+
+static uint8_t
+calculate_slave(struct bond_dev_private *internals)
+{
+   uint8_t idx;
+
+   idx = (internals->mode6.last_slave + 1) % internals->active_slave_count;
+   internals->mode6.last_slave = idx;
+   return internals->active_slaves[idx];
+}
+
+int
+bond_mode_alb_enable(struct rte_eth_dev *bond_dev)
+{
+   struct bond_dev_private *internals = bond_dev->data->dev_private;
+   struct client_data *hash_table = internals->mode6.client_table;
+
+   uint16_t element_size;
+   char mem_name[RTE_ETH_NAME_MAX_LEN];
+   int socket_id = bond_dev->pci_dev->numa_node;
+
+   /* Fill hash table with initial values */
+   memset(hash_table, 0, sizeof(struct client_data) * ALB_HASH_TABLE_SIZE);
+   rte_spinlock_init(&internals->mode6.lock);
+   internals->mode6.last_slave = ALB_NULL_INDEX;
+   internals->mode6.ntt = 0;
+
+   /* Initialize memory pool for ARP packets to send */
+   if (internals->mode6.mempool == NULL) {
+   /*
+* 256 is size of ETH header, ARP header and nested VLAN headers.
+* The value is chosen to be cache aligned.
+*/
+   element_size = 256 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM;
+   snprintf(mem_name, sizeof(mem_name), "%s_MODE6", bond_dev->data->name);
+   internals->mode6.mempool = rte_mempool_create(mem_name,
+   512 * RTE_MAX_ETHPORTS,
+   element_size,
+  

[dpdk-dev] [PATCH v3 1/6] net: changed arp_hdr struct declaration

2015-02-19 Thread Michal Jastrzebski
From: Maciej Gajdzica 

Changed MAC address type from uint8_t[6] to struct ether_addr and IP
address type from uint8_t[4] to uint32_t. Also removed union from
arp_hdr struct. Updated test-pmd to match new arp_hdr version.
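
As a rough illustration of what the new layout allows, here is a minimal
standalone sketch. The names ether_addr_sketch, arp_ipv4_sketch and
arp_swap_sender_target are hypothetical stand-ins (they are not part of
this patch or of DPDK); only the field order and types follow the new
struct arp_ipv4. With typed fields, swapping sender and target becomes
plain assignment instead of memcpy() with hard-coded lengths.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for DPDK's struct ether_addr (6 raw bytes). */
struct ether_addr_sketch {
	uint8_t addr_bytes[6];
} __attribute__((__packed__));

/* Same field order and types as the new arp_ipv4 in this patch. */
struct arp_ipv4_sketch {
	struct ether_addr_sketch arp_sha; /* sender hardware address */
	uint32_t arp_sip;                 /* sender IP address */
	struct ether_addr_sketch arp_tha; /* target hardware address */
	uint32_t arp_tip;                 /* target IP address */
} __attribute__((__packed__));

/* Swap the sender and target fields, as an ARP responder might do. */
static void
arp_swap_sender_target(struct arp_ipv4_sketch *a)
{
	struct ether_addr_sketch mac = a->arp_tha;
	uint32_t ip = a->arp_sip;

	a->arp_tha = a->arp_sha;
	a->arp_sha = mac;
	a->arp_sip = a->arp_tip;
	a->arp_tip = ip;
}
```

The packed attribute keeps the payload at 20 bytes (6 + 4 + 6 + 4), so the
struct can still be overlaid on the wire format.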

Signed-off-by: Maciej Gajdzica 
---
 app/test-pmd/icmpecho.c  |   27 ++-
 lib/librte_net/rte_arp.h |   13 ++---
 2 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 08ea01d..010c5a9 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -371,18 +371,14 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
continue;
}
if (verbose_level > 0) {
-   memcpy(&eth_addr,
-  arp_h->arp_data.arp_ip.arp_sha, 6);
+   ether_addr_copy(&arp_h->arp_data.arp_sha, &eth_addr);
ether_addr_dump("sha=", &eth_addr);
-   memcpy(&ip_addr,
-  arp_h->arp_data.arp_ip.arp_sip, 4);
+   ip_addr = arp_h->arp_data.arp_sip;
ipv4_addr_dump(" sip=", ip_addr);
printf("\n");
-   memcpy(&eth_addr,
-  arp_h->arp_data.arp_ip.arp_tha, 6);
+   ether_addr_copy(&arp_h->arp_data.arp_tha, &eth_addr);
ether_addr_dump("tha=", &eth_addr);
-   memcpy(&ip_addr,
-  arp_h->arp_data.arp_ip.arp_tip, 4);
+   ip_addr = arp_h->arp_data.arp_tip;
ipv4_addr_dump(" tip=", ip_addr);
printf("\n");
}
@@ -402,17 +398,14 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
&eth_h->s_addr);

arp_h->arp_op = rte_cpu_to_be_16(ARP_OP_REPLY);
-   memcpy(&eth_addr, arp_h->arp_data.arp_ip.arp_tha, 6);
-   memcpy(arp_h->arp_data.arp_ip.arp_tha,
-  arp_h->arp_data.arp_ip.arp_sha, 6);
-   memcpy(arp_h->arp_data.arp_ip.arp_sha,
-  &eth_h->s_addr, 6);
+   ether_addr_copy(&arp_h->arp_data.arp_tha, &eth_addr);
+   ether_addr_copy(&arp_h->arp_data.arp_sha, &arp_h->arp_data.arp_tha);
+   ether_addr_copy(&eth_addr, &arp_h->arp_data.arp_sha);

/* Swap IP addresses in ARP payload */
-   memcpy(&ip_addr, arp_h->arp_data.arp_ip.arp_sip, 4);
-   memcpy(arp_h->arp_data.arp_ip.arp_sip,
-  arp_h->arp_data.arp_ip.arp_tip, 4);
-   memcpy(arp_h->arp_data.arp_ip.arp_tip, &ip_addr, 4);
+   ip_addr = arp_h->arp_data.arp_sip;
+   arp_h->arp_data.arp_sip = arp_h->arp_data.arp_tip;
+   arp_h->arp_data.arp_tip = ip_addr;
pkts_burst[nb_replies++] = pkt;
continue;
}
diff --git a/lib/librte_net/rte_arp.h b/lib/librte_net/rte_arp.h
index c7b0e51..72108a1 100644
--- a/lib/librte_net/rte_arp.h
+++ b/lib/librte_net/rte_arp.h
@@ -39,6 +39,7 @@
  */

 #include 
+#include 

 #ifdef __cplusplus
 extern "C" {
@@ -48,10 +49,10 @@ extern "C" {
  * ARP header IPv4 payload.
  */
 struct arp_ipv4 {
-   uint8_t  arp_sha[6]; /* sender hardware address */
-   uint8_t  arp_sip[4]; /* sender IP address */
-   uint8_t  arp_tha[6]; /* target hardware address */
-   uint8_t  arp_tip[4]; /* target IP address */
+   struct ether_addr  arp_sha; /* sender hardware address */
+   uint32_t  arp_sip;  /* sender IP address */
+   struct ether_addr  arp_tha; /* target hardware address */
+   uint32_t  arp_tip;  /* target IP address */
 } __attribute__((__packed__));

 /**
@@ -72,9 +73,7 @@ struct arp_hdr {
 #define ARP_OP_INVREQUEST 8 /* request to identify peer */
 #define ARP_OP_INVREPLY   9 /* response identifying peer */

-   union {
-   struct arp_ipv4 arp_ip;
-   } arp_data;
+   struct arp_ipv4 arp_data;
 } __attribute__((__packed__));

 #ifdef __cplusplus
-- 
1.7.9.5



[dpdk-dev] [PATCH v3 0/6] Link Bonding mode 6 support (ALB)

2015-02-19 Thread Michal Jastrzebski
v3 changes:
- completed description for mode 5 unit tests patch
- fixed errors required by checkpatch.pl
- moved patch version changes from patches to cover-letter

v2 changes:
in mode 6 patch 2/6:
- add VLAN support
- fixed sending duplicated ARP updates
- fixed assigning slaves for next clients
- fixed TLB mode

in debug patch 3/6:
- add IPv4 RX/TX information
- add mode6_debug(..) function

in the example application patch 4/6:
- remove count parameter from send command
- fixed quit command to use cmdline_quit(cl)
- add echo function - all IPv4 packets will be retransmitted. Bonding
driver will use TLB policy - this will show how TX works in mode 6
- remove unused structures rx_conf_default and tx_conf_default
- add VLAN support
- remove unnecessary comments
- modify show command in terms of printing DEBUG information

This patchset adds support for link bonding mode 6.
Additionally, it changes the arp_hdr structure definition.
A basic example application is also introduced. In this example,
bonding configures each client's ARP table so that packets from
each client are received on a different slave;
mode 6 uses a round-robin policy to assign a slave to each client IP address.
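
The round-robin assignment mentioned above can be sketched as follows.
This is only an illustration modelled on calculate_slave() in patch 2/6;
struct rr_state, rr_next_slave and SKETCH_MAX_SLAVES are hypothetical
names, and the sketch assumes at least one active slave.

```c
#include <stdint.h>

#define SKETCH_MAX_SLAVES 8 /* hypothetical limit for this sketch */

struct rr_state {
	uint8_t active_slaves[SKETCH_MAX_SLAVES]; /* port ids of active slaves */
	uint8_t active_slave_count;               /* assumed to be > 0 */
	uint8_t last_slave;                       /* index used for previous client */
};

/* Return the port id for the next client, advancing round-robin. */
static uint8_t
rr_next_slave(struct rr_state *s)
{
	uint8_t idx = (uint8_t)((s->last_slave + 1) % s->active_slave_count);

	s->last_slave = idx;
	return s->active_slaves[idx];
}
```

Each new client IP gets the slave following the one handed out last, so
with three active slaves the fourth client wraps around to the first.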

Daniel Mrzyglod (1):
  bond: modify TLB unit tests

Maciej Gajdzica (3):
  net: changed arp_hdr struct declaration
  bond: add link bonding mode 6 implementation
  bond: add unit tests for link bonding mode 6.

Michal Jastrzebski (2):
  bond: add debug info for mode 6 link bonding
  bond: add example application for link bonding mode 6

 app/test-pmd/icmpecho.c|   27 +-
 app/test/packet_burst_generator.c  |   41 +-
 app/test/packet_burst_generator.h  |   11 +-
 app/test/test_link_bonding.c   |  450 +++-
 app/test/test_pmd_perf.c   |3 +-
 app/test/virtual_pmd.c |  109 ++--
 app/test/virtual_pmd.h |5 +-
 config/common_linuxapp |3 +-
 examples/bond/Makefile |   57 ++
 examples/bond/main.c   |  796 
 examples/bond/main.h   |   46 ++
 lib/librte_net/rte_arp.h   |   13 +-
 lib/librte_pmd_bond/Makefile   |1 +
 lib/librte_pmd_bond/rte_eth_bond.h |   11 +-
 lib/librte_pmd_bond/rte_eth_bond_alb.c |  256 +
 lib/librte_pmd_bond/rte_eth_bond_alb.h |  109 
 lib/librte_pmd_bond/rte_eth_bond_api.c |   28 +-
 lib/librte_pmd_bond/rte_eth_bond_args.c|3 +-
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  460 ++--
 lib/librte_pmd_bond/rte_eth_bond_private.h |   12 +
 20 files changed, 2295 insertions(+), 146 deletions(-)
 create mode 100644 examples/bond/Makefile
 create mode 100644 examples/bond/main.c
 create mode 100644 examples/bond/main.h
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.h

-- 
1.7.9.5



[dpdk-dev] [PATCH v4 3/3] examples: example showing use of callbacks.

2015-02-19 Thread John McNamara
From: Richardson, Bruce 

Example showing how callbacks can be used to insert a timestamp
into each packet on RX. On TX the timestamp is used to calculate
the packet latency through the app, in cycles.
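
The idea can be sketched without any DPDK dependency as below. All names
here (pkt_sketch, latency_stats, stamp_on_rx, measure_on_tx) are
hypothetical, and the cycle counter is passed in as a plain argument;
the real example in main.c uses mbufs and the ethdev callback API instead.

```c
#include <stdint.h>

/* Hypothetical packet type: only the timestamp field matters here. */
struct pkt_sketch {
	uint64_t tstamp; /* cycle count recorded at RX */
};

struct latency_stats {
	uint64_t total_cycles; /* sum of per-packet latencies */
	uint64_t total_pkts;   /* number of packets measured */
};

/* RX side: stamp every packet in the burst with the current cycle count. */
static uint16_t
stamp_on_rx(struct pkt_sketch **pkts, uint16_t nb_pkts, uint64_t now)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		pkts[i]->tstamp = now;
	return nb_pkts;
}

/* TX side: accumulate (now - RX timestamp) for every packet in the burst. */
static uint16_t
measure_on_tx(struct pkt_sketch **pkts, uint16_t nb_pkts, uint64_t now,
		struct latency_stats *st)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		st->total_cycles += now - pkts[i]->tstamp;
	st->total_pkts += nb_pkts;
	return nb_pkts;
}
```

Dividing total_cycles by total_pkts gives the average latency through the
application in cycles.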

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 MAINTAINERS  |4 +
 examples/Makefile|1 +
 examples/rxtx_callbacks/Makefile |   57 ++
 examples/rxtx_callbacks/main.c   |  228 ++
 4 files changed, 290 insertions(+), 0 deletions(-)
 create mode 100644 examples/rxtx_callbacks/Makefile
 create mode 100644 examples/rxtx_callbacks/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7ac6d59..dcca441 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -432,6 +432,10 @@ F: doc/guides/sample_app_ug/netmap_compatibility.rst
 F: examples/quota_watermark/
 F: doc/guides/sample_app_ug/quota_watermark.rst

+M: Bruce Richardson 
+M: John McNamara 
+F: examples/rxtx_callbacks/
+
 F: examples/skeleton/

 F: examples/vmdq/
diff --git a/examples/Makefile b/examples/Makefile
index 095bad2..4a872f2 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -63,6 +63,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += packet_ordering
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
 DIRS-y += quota_watermark
+DIRS-$(CONFIG_RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
 DIRS-y += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
diff --git a/examples/rxtx_callbacks/Makefile b/examples/rxtx_callbacks/Makefile
new file mode 100644
index 000..0fafbb7
--- /dev/null
+++ b/examples/rxtx_callbacks/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = rxtx_callbacks
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+EXTRA_CFLAGS += -O3 -g -Wfatal-errors
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
new file mode 100644
index 000..9e5e68e
--- /dev/null
+++ b/examples/rxtx_callbacks/main.c
@@ -0,0 +1,228 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived

[dpdk-dev] [PATCH v4 2/3] ethdev: add optional rxtx callback support

2015-02-19 Thread John McNamara
From: Richardson, Bruce 

Add optional support for inline processing of packets inside the RX
or TX call. For an RX callback, what happens is that we get a set of
packets from the NIC and then pass them to a callback function, if
configured, to allow additional processing to be done on them, e.g.
filling in more mbuf fields, before passing back to the application.
On TX, the packets are similarly post-processed before being handed
to the NIC for transmission.
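
A simplified picture of the RX path described above, as a standalone
sketch. The types and names (rx_cb_fn, rx_queue_sketch, rx_burst_sketch,
count_cb) are hypothetical and do not match the prototypes added by this
patch; nb_rx stands in for the number of packets the driver returned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: may inspect or modify the burst. */
typedef uint16_t (*rx_cb_fn)(void **pkts, uint16_t nb_pkts, void *user_param);

struct rx_queue_sketch {
	rx_cb_fn post_rx_cb; /* NULL when no callback is configured */
	void *cb_param;      /* opaque argument handed to the callback */
};

/* After the driver fills pkts[], run the optional callback on the burst. */
static uint16_t
rx_burst_sketch(struct rx_queue_sketch *q, void **pkts, uint16_t nb_rx)
{
	if (q->post_rx_cb != NULL)
		nb_rx = q->post_rx_cb(pkts, nb_rx, q->cb_param);
	return nb_rx;
}

/* Example callback: count packets seen on the queue. */
static uint16_t
count_cb(void **pkts, uint16_t nb_pkts, void *user_param)
{
	(void)pkts;
	*(uint64_t *)user_param += nb_pkts;
	return nb_pkts;
}
```

When no callback is configured the only added cost in this sketch is one
pointer comparison per burst.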

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 config/common_bsdapp   |1 +
 config/common_linuxapp |1 +
 lib/librte_ether/rte_ethdev.c  |  192 +-
 lib/librte_ether/rte_ethdev.h  |  204 +++-
 lib/librte_ether/rte_ether_version.map |4 +
 5 files changed, 397 insertions(+), 5 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index f11ff39..e9c445e 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -133,6 +133,7 @@ CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
 CONFIG_RTE_MAX_ETHPORTS=32
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
+CONFIG_RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS=n

 #
 # Support NIC bypass logic
diff --git a/config/common_linuxapp b/config/common_linuxapp
index f921d8c..0cb850e 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -131,6 +131,7 @@ CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
 CONFIG_RTE_MAX_ETHPORTS=32
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
+CONFIG_RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS=n

 #
 # Support NIC bypass logic
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7c4e772..8a4e0e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -337,6 +337,20 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
dev->data->nb_rx_queues = 0;
return -(ENOMEM);
}
+
+#ifdef RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS
+   dev->post_rx_burst_cbs = rte_zmalloc(
+   "ethdev->post_rx_burst_cbs",
+   sizeof(*dev->post_rx_burst_cbs) * nb_queues,
+   RTE_CACHE_LINE_SIZE);
+   if (dev->post_rx_burst_cbs == NULL) {
+   rte_free(dev->data->rx_queues);
+   dev->data->rx_queues = NULL;
+   dev->data->nb_rx_queues = 0;
+   return -(ENOMEM);
+   }
+#endif
+
} else { /* re-configure */
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release, -ENOTSUP);

@@ -349,9 +363,25 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
if (rxq == NULL)
return -(ENOMEM);

-   if (nb_queues > old_nb_queues)
+#ifdef RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS
+   dev->post_rx_burst_cbs = rte_realloc(
+   dev->post_rx_burst_cbs,
+   sizeof(*dev->post_rx_burst_cbs) *
+   nb_queues, RTE_CACHE_LINE_SIZE);
+   if (dev->post_rx_burst_cbs == NULL)
+   return -(ENOMEM);
+#endif
+
+   if (nb_queues > old_nb_queues) {
+   uint16_t new_qs = nb_queues - old_nb_queues;
memset(rxq + old_nb_queues, 0,
-   sizeof(rxq[0]) * (nb_queues - old_nb_queues));
+   sizeof(rxq[0]) * new_qs);
+
+#ifdef RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS
+   memset(dev->post_rx_burst_cbs + old_nb_queues, 0,
+   sizeof(dev->post_rx_burst_cbs[0]) * new_qs);
+#endif
+   }

dev->data->rx_queues = rxq;

@@ -479,6 +509,20 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
dev->data->nb_tx_queues = 0;
return -(ENOMEM);
}
+
+#ifdef RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS
+   dev->pre_tx_burst_cbs = rte_zmalloc(
+   "ethdev->pre_tx_burst_cbs",
+   sizeof(*dev->pre_tx_burst_cbs) * nb_queues,
+   RTE_CACHE_LINE_SIZE);
+   if (dev->pre_tx_burst_cbs == NULL) {
+   rte_free(dev->data->tx_queues);
+   dev->data->tx_queues = NULL;
+   dev->data->nb_tx_queues = 0;
+   return -(ENOMEM);
+   }
+#endif
+
} else { /* re-configure */
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, -ENOTSUP);

@@ -491,9 +535,25 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
if (txq == NULL)
return -(ENOMEM);

-   if (nb_queues > old_nb_queues)
+#ifdef RTE_LIBRTE_ETHDEV_RXTX_CALLBACKS
+   dev->

[dpdk-dev] [PATCH v4 1/3] ethdev: rename callbacks field to link_intr_cbs

2015-02-19 Thread John McNamara
From: Richardson, Bruce 

The 'callbacks' member of the rte_eth_dev structure has been renamed
to 'link_intr_cbs' to make it clear that it refers to callbacks from
NIC interrupts. This allows us to add other types of callbacks to
the structure without ambiguity.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 app/test/virtual_pmd.c |2 +-
 lib/librte_ether/rte_ethdev.c  |   12 ++--
 lib/librte_ether/rte_ethdev.h  |2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c |2 +-
 lib/librte_pmd_ring/rte_eth_ring.c |2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 9fac95d..eb75846 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -576,7 +576,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;

-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

eth_dev->data->dev_link.link_status = 0;
eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_1;
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 17be2f3..7c4e772 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -265,7 +265,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_dev->data->rx_mbuf_alloc_failed = 0;

/* init user callbacks */
-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

/*
 * Set the default MTU.
@@ -2743,7 +2743,7 @@ rte_eth_dev_callback_register(uint8_t port_id,
dev = &rte_eth_devices[port_id];
rte_spinlock_lock(&rte_eth_dev_cb_lock);

-   TAILQ_FOREACH(user_cb, &(dev->callbacks), next) {
+   TAILQ_FOREACH(user_cb, &(dev->link_intr_cbs), next) {
if (user_cb->cb_fn == cb_fn &&
user_cb->cb_arg == cb_arg &&
user_cb->event == event) {
@@ -2757,7 +2757,7 @@ rte_eth_dev_callback_register(uint8_t port_id,
user_cb->cb_fn = cb_fn;
user_cb->cb_arg = cb_arg;
user_cb->event = event;
-   TAILQ_INSERT_TAIL(&(dev->callbacks), user_cb, next);
+   TAILQ_INSERT_TAIL(&(dev->link_intr_cbs), user_cb, next);
}

rte_spinlock_unlock(&rte_eth_dev_cb_lock);
@@ -2784,7 +2784,7 @@ rte_eth_dev_callback_unregister(uint8_t port_id,
rte_spinlock_lock(&rte_eth_dev_cb_lock);

ret = 0;
-   for (cb = TAILQ_FIRST(&dev->callbacks); cb != NULL; cb = next) {
+   for (cb = TAILQ_FIRST(&dev->link_intr_cbs); cb != NULL; cb = next) {

next = TAILQ_NEXT(cb, next);

@@ -2798,7 +2798,7 @@ rte_eth_dev_callback_unregister(uint8_t port_id,
 * then remove it.
 */
if (cb->active == 0) {
-   TAILQ_REMOVE(&(dev->callbacks), cb, next);
+   TAILQ_REMOVE(&(dev->link_intr_cbs), cb, next);
rte_free(cb);
} else {
ret = -EAGAIN;
@@ -2817,7 +2817,7 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
struct rte_eth_dev_callback dev_cb;

rte_spinlock_lock(&rte_eth_dev_cb_lock);
-   TAILQ_FOREACH(cb_lst, &(dev->callbacks), next) {
+   TAILQ_FOREACH(cb_lst, &(dev->link_intr_cbs), next) {
if (cb_lst->cb_fn == NULL || cb_lst->event != event)
continue;
dev_cb = *cb_lst;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 6e454e8..48e4ac9 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1539,7 +1539,7 @@ struct rte_eth_dev {
const struct eth_driver *driver;/**< Driver for this device */
struct eth_dev_ops *dev_ops;/**< Functions exported by PMD */
struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
-   struct rte_eth_dev_cb_list callbacks; /**< User application callbacks */
+   struct rte_eth_dev_cb_list link_intr_cbs; /**< User application callbacks on interrupt*/
 };

 struct rte_eth_dev_sriov {
diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c b/lib/librte_pmd_bond/rte_eth_bond_api.c
index 4ab3267..077cb73 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_api.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
@@ -251,7 +251,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t socket_id)
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;

-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

eth_dev->data->dev_link.link_status = 0;

diff --git a/lib/librte_pmd_ring/rte_eth_ring.c b/lib/librte_pmd_ring/rte_eth_ring.c
index a23e933..a5dc71e 100644
--- a/lib/librte_pmd_ring/rte

[dpdk-dev] [PATCH v4 0/3] DPDK ethdev callback support

2015-02-19 Thread John McNamara
This patchset is for a small optional addition to the ethdev library,
to add support for callbacks at the RX and TX stages. This allows
packet processing to be done on packets before they get returned
to applications using the rte_eth_rx_burst call.

See the RFC cover letter for the use cases:

http://dpdk.org/ml/archives/dev/2014-December/010491.html

For this version we spent some time investigating Stephen Hemminger's
suggestion of using the userspace RCU (read-copy-update) library for
SMP safety:

   http://urcu.so/

The default liburcu (which defaulted to liburcu-mb) requires the least
interaction from the end user but showed a 25% drop in packet throughput
in the callback sample app.

The liburcu-qsbr (quiescent state) variant showed a 1% drop in packet
throughput in the callback sample app. However it requires registered
RCU threads in the program to periodically announce quiescent states.
This makes it more difficult to implement for end user applications.

For this release we will document that adding and removing callbacks
is not thread safe.

Note: Sample application documentation to follow in a patch update.


Version 4 changes:
* Make the callback feature a compile time option.

Version 3 changes:
* Removed unnecessary header file from example folder
  (which included baremetal reference).
* Renamed the interrupt, RX and TX callbacks to make their function
  clearer (using the names suggested in the mailing list comments).
* Squashed ABI version update into the commit it relates to.
* Fixed various checkpatch warnings.

Version 2 changes:
* Added ABI versioning.
* Doxygen clarifications.

Version 1 changes:
* Added callback removal functions.
* Minor fixes.


Richardson, Bruce (3):
  ethdev: rename callbacks field to link_intr_cbs
  ethdev: add optional rxtx callback support
  examples: example showing use of callbacks.

 MAINTAINERS|4 +
 app/test/virtual_pmd.c |2 +-
 config/common_bsdapp   |1 +
 config/common_linuxapp |1 +
 examples/Makefile  |1 +
 examples/rxtx_callbacks/Makefile   |   57 
 examples/rxtx_callbacks/main.c |  228 
 lib/librte_ether/rte_ethdev.c  |  204 +++--
 lib/librte_ether/rte_ethdev.h  |  204 -
 lib/librte_ether/rte_ether_version.map |4 +
 lib/librte_pmd_bond/rte_eth_bond_api.c |2 +-
 lib/librte_pmd_ring/rte_eth_ring.c |2 +-
 12 files changed, 696 insertions(+), 14 deletions(-)
 create mode 100644 examples/rxtx_callbacks/Makefile
 create mode 100644 examples/rxtx_callbacks/main.c

-- 
1.7.4.1



[dpdk-dev] [PATCH] ACL: use setjmp/longjmp to handle memory allocation failure at build phase

2015-02-19 Thread Konstantin Ananyev
During the build phase, ACL performs quite a lot of memory allocations
for relatively small temporary structures.
In theory each such allocation can fail, so we need to handle
all these possible failures.
That adds a lot of extra checks and makes the code harder to read and follow.
To simplify the process, this patch handles all such failures
in one place.
Note that all the memory for temporary structures
is freed in one go at the end of the build phase.
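
The pattern can be sketched in plain C as below. This is not the code
from the patch (which uses sigsetjmp() on bcx->pool.fail); pool_sketch,
pool_alloc and build_sketch are hypothetical names, the pool is a tiny
fixed-size bump allocator, and alignment is ignored for brevity.

```c
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical temporary pool: a small fixed buffer plus a jump target. */
struct pool_sketch {
	jmp_buf fail;          /* where to go when an allocation fails */
	size_t used;           /* bytes handed out so far */
	unsigned char mem[64]; /* backing storage, released all at once */
};

/* Never returns NULL: on exhaustion it jumps back to the setjmp() point. */
static void *
pool_alloc(struct pool_sketch *p, size_t sz)
{
	void *ptr;

	if (sz > sizeof(p->mem) - p->used)
		longjmp(p->fail, 1);
	ptr = p->mem + p->used;
	p->used += sz;
	return ptr;
}

/* Build phase: many allocations, but only one failure check. */
static int
build_sketch(struct pool_sketch *p, size_t n_allocs, size_t sz)
{
	size_t i;

	if (setjmp(p->fail) != 0)
		return -ENOMEM; /* reached via longjmp() from pool_alloc() */

	for (i = 0; i < n_allocs; i++)
		(void)pool_alloc(p, sz); /* no NULL check needed here */
	return 0;
}
```

Callers of pool_alloc() can then drop their individual NULL checks, which
is the simplification the diff below makes in acl_bld.c.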

Signed-off-by: Konstantin Ananyev 
---
 lib/librte_acl/acl_bld.c | 42 +-
 lib/librte_acl/tb_mem.c  |  8 ++--
 lib/librte_acl/tb_mem.h  |  3 +++
 3 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/lib/librte_acl/acl_bld.c b/lib/librte_acl/acl_bld.c
index 1fe79fb..ddb23fd 100644
--- a/lib/librte_acl/acl_bld.c
+++ b/lib/librte_acl/acl_bld.c
@@ -302,8 +302,6 @@ acl_add_ptr(struct acl_build_context *context,
/* add room for more pointers */
num_ptrs = node->max_ptrs + ACL_PTR_ALLOC;
ptrs = acl_build_alloc(context, num_ptrs, sizeof(*ptrs));
-   if (ptrs == NULL)
-   return -ENOMEM;

/* copy current points to new memory allocation */
if (node->ptrs != NULL) {
@@ -477,16 +475,12 @@ acl_dup_node(struct acl_build_context *context, struct rte_acl_node *node)
struct rte_acl_node *next;

next = acl_alloc_node(context, node->level);
-   if (next == NULL)
-   return NULL;

/* allocate the pointers */
if (node->num_ptrs > 0) {
next->ptrs = acl_build_alloc(context,
node->max_ptrs,
sizeof(struct rte_acl_ptr_set));
-   if (next->ptrs == NULL)
-   return NULL;
next->max_ptrs = node->max_ptrs;
}

@@ -669,8 +663,6 @@ acl_merge_intersect(struct acl_build_context *context,

/* Duplicate A for intersection */
node_c = acl_dup_node(context, node_a->ptrs[idx_a].ptr);
-   if (node_c == NULL)
-   return -1;

/* Remove intersection from A */
acl_exclude_ptr(context, node_a, idx_a, intersect_ptr);
@@ -1328,14 +1320,10 @@ build_trie(struct acl_build_context *context, struct rte_acl_build_rule *head,
rule = head;

trie = acl_alloc_node(context, 0);
-   if (trie == NULL)
-   return NULL;

while (rule != NULL) {

root = acl_alloc_node(context, 0);
-   if (root == NULL)
-   return NULL;

root->ref_count = 1;
end = root;
@@ -1419,10 +1407,9 @@ build_trie(struct acl_build_context *context, struct rte_acl_build_rule *head,
 * Setup the results for this rule.
 * The result and priority of each category.
 */
-   if (end->mrt == NULL &&
-   (end->mrt = acl_build_alloc(context, 1,
-   sizeof(*end->mrt))) == NULL)
-   return NULL;
+   if (end->mrt == NULL)
+   end->mrt = acl_build_alloc(context, 1,
+   sizeof(*end->mrt));

for (m = 0; m < context->cfg.num_categories; m++) {
if (rule->f->data.category_mask & (1 << m)) {
@@ -1760,13 +1747,6 @@ acl_build_tries(struct acl_build_context *context,

/* Create a new copy of config for remaining rules. */
config = acl_build_alloc(context, 1, sizeof(*config));
-   if (config == NULL) {
-   RTE_LOG(ERR, ACL,
-   "New config allocation for %u-th "
-   "trie failed\n", num_tries);
-   return -ENOMEM;
-   }
-
memcpy(config, rule_sets[n]->config, sizeof(*config));

/* Make remaining rules use new config. */
@@ -1825,12 +1805,6 @@ acl_build_rules(struct acl_build_context *bcx)
sz = ofs + n * fn * sizeof(*wp);

br = tb_alloc(&bcx->pool, sz);
-   if (br == NULL) {
-   RTE_LOG(ERR, ACL, "ACL context %s: failed to create a copy "
-   "of %u build rules (%zu bytes)\n",
-   bcx->acx->name, n, sz);
-   return -ENOMEM;
-   }

wp = (uint32_t *)((uintptr_t)br + ofs);
num = 0;
@@ -1895,6 +1869,16 @@ acl_bld(struct acl_build_context *bcx, struct 
rte_acl_ctx *ctx,
bcx->category_mask = LEN2MASK(bcx->cfg.num_categories);
bcx->node_max = node_max;

+   rc = sigsetjmp(bcx->pool.fail, 0);
+
+   /* build phase runs out of memory. */
+   if (rc != 0) {
+   RTE_LOG(ERR, ACL,
+   "ACL context: %s, %s() failed with error code: %d\n",
+   bcx->acx->name, __func__, rc);
+   return rc;
+ 
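The hunk above replaces per-call NULL checks with a single jump target set in acl_bld(). A minimal sketch of that error-handling pattern follows. It uses plain ISO C setjmp()/longjmp() instead of the sigsetjmp() seen in the patch, and `struct pool`, `pool_alloc()` and `build()` are hypothetical names for illustration, not the librte_acl API.

```c
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical fixed-size pool with a jump target for allocation failure. */
struct pool {
	unsigned char buf[256]; /* backing storage */
	size_t used;            /* bytes handed out so far */
	jmp_buf fail;           /* set by the top-level build function */
};

/* Never returns NULL: when the pool is exhausted it jumps back to the
 * setjmp() site, so callers need no per-call error checks. */
static void *
pool_alloc(struct pool *p, size_t sz)
{
	void *mem;

	if (sz > sizeof(p->buf) - p->used)
		longjmp(p->fail, 1);

	mem = p->buf + p->used;
	p->used += sz;
	return mem;
}

/* Top-level entry point: the only place that handles out-of-memory. */
static int
build(struct pool *p, size_t n_nodes, size_t node_sz)
{
	size_t i;

	if (setjmp(p->fail) != 0)
		return -ENOMEM; /* reached via longjmp() from pool_alloc() */

	for (i = 0; i < n_nodes; i++) {
		unsigned char *node = pool_alloc(p, node_sz);
		node[0] = 0; /* safe: pool_alloc() cannot return NULL */
	}
	return 0;
}
```

The trade-off is that every function between the jump target and the allocator must tolerate being abandoned mid-way, which is acceptable here only because all memory comes from the pool.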

[dpdk-dev] [PATCH v4 7/7] pmd ixgbe: fix vlan setting in in PF

2015-02-19 Thread Pawel Wodkowski
ixgbe_vlan_filter_set() should use hw->mac.ops.set_vfta() to set
VLAN filtering, as this is a generic function that handles both the
non-SRIOV and SRIOV cases.

The bug was discovered by issuing the testpmd command
'rx_vlan add VLAN PORT' for the PF. The requested VLAN was enabled, but
the pool mask was not set. Only the command
'rx_vlan add VLAN port PORT vf MASK' could enable the given VLAN ID
for the PF.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index 7551bcc..7aef0e8 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -1162,21 +1162,18 @@ ixgbe_vlan_filter_set(struct rte_eth_dev *dev, uint16_t 
vlan_id, int on)
IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct ixgbe_vfta * shadow_vfta =
IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
-   uint32_t vfta;
+   struct rte_eth_dev_sriov *sriov = &RTE_ETH_DEV_SRIOV(dev);
+   u32 vind = sriov->active ? sriov->def_vmdq_idx : 0;
+   s32 ret_val;
uint32_t vid_idx;
-   uint32_t vid_bit;

-   vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
-   vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
-   vfta = IXGBE_READ_REG(hw, IXGBE_VFTA(vid_idx));
-   if (on)
-   vfta |= vid_bit;
-   else
-   vfta &= ~vid_bit;
-   IXGBE_WRITE_REG(hw, IXGBE_VFTA(vid_idx), vfta);
+   ret_val = hw->mac.ops.set_vfta(hw, vlan_id, vind, on);
+   if (ret_val != IXGBE_SUCCESS)
+   return ret_val;

+   vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
/* update local VFTA copy */
-   shadow_vfta->vfta[vid_idx] = vfta;
+   shadow_vfta->vfta[vid_idx] = IXGBE_READ_REG(hw, IXGBE_VFTA(vid_idx));

return 0;
 }
-- 
1.9.1
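For reference, the index and bit arithmetic that the removed lines performed by hand (and that the patch still uses for the shadow copy) can be sketched in isolation. The helper names and the `VFTA_WORDS` constant are made up for this illustration; only the shift and mask values come from the diff above.

```c
#include <stdint.h>

#define VFTA_WORDS 128 /* 4096 VLAN IDs, 32 bits per filter-table word */

/* Word index in the VLAN filter table: bits 11..5 of the VLAN ID. */
static uint32_t
vfta_index(uint16_t vlan_id)
{
	return (uint32_t)((vlan_id >> 5) & 0x7F);
}

/* Bit inside that word: bits 4..0 of the VLAN ID. */
static uint32_t
vfta_bit(uint16_t vlan_id)
{
	return UINT32_C(1) << (vlan_id & 0x1F);
}

/* Update a software shadow copy of the table (no hardware access here). */
static void
vfta_shadow_set(uint32_t shadow[VFTA_WORDS], uint16_t vlan_id, int on)
{
	uint32_t idx = vfta_index(vlan_id);
	uint32_t bit = vfta_bit(vlan_id);

	if (on)
		shadow[idx] |= bit;
	else
		shadow[idx] &= ~bit;
}
```

VLAN ID 100, for example, lands in word 3 at bit 4. The sketch covers only the filter bit; per the commit message, the pool mask is what set_vfta() additionally takes care of.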



[dpdk-dev] [PATCH v4 6/7] tespmd: fix DCB in SRIOV mode support

2015-02-19 Thread Pawel Wodkowski
This patch incorporates fixes to support DCB in SRIOV mode for testpmd.

Signed-off-by: Pawel Wodkowski 
---
 app/test-pmd/cmdline.c |  4 ++--
 app/test-pmd/testpmd.c | 39 +--
 app/test-pmd/testpmd.h | 10 --
 3 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4753bb4..1e30ca6 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1964,9 +1964,9 @@ cmd_config_dcb_parsed(void *parsed_result,

/* DCB in VT mode */
if (!strncmp(res->vt_en, "on",2))
-   dcb_conf.dcb_mode = DCB_VT_ENABLED;
+   dcb_conf.vt_en = 1;
else
-   dcb_conf.dcb_mode = DCB_ENABLED;
+   dcb_conf.vt_en = 0;

if (!strncmp(res->pfc_en, "on",2)) {
dcb_conf.pfc_en = 1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 3aebea6..bdbf237 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1766,7 +1766,8 @@ const uint16_t vlan_tags[] = {
 };

 static  int
-get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf)
+get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf,
+   uint16_t sriov)
 {
 uint8_t i;

@@ -1774,7 +1775,7 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct 
dcb_config *dcb_conf)
 * Builds up the correct configuration for dcb+vt based on the vlan 
tags array
 * given above, and the number of traffic classes available for use.
 */
-   if (dcb_conf->dcb_mode == DCB_VT_ENABLED) {
+   if (dcb_conf->vt_en == 1) {
struct rte_eth_vmdq_dcb_conf vmdq_rx_conf;
struct rte_eth_vmdq_dcb_tx_conf vmdq_tx_conf;

@@ -1791,9 +1792,17 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct 
dcb_config *dcb_conf)
vmdq_rx_conf.pool_map[i].vlan_id = vlan_tags[ i ];
vmdq_rx_conf.pool_map[i].pools = 1 << (i % 
vmdq_rx_conf.nb_queue_pools);
}
-   for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
-   vmdq_rx_conf.dcb_queue[i] = i;
-   vmdq_tx_conf.dcb_queue[i] = i;
+
+   if (sriov == 0) {
+   for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
+   vmdq_rx_conf.dcb_queue[i] = i;
+   vmdq_tx_conf.dcb_queue[i] = i;
+   }
+   } else {
+   for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
+   vmdq_rx_conf.dcb_queue[i] = i % 
dcb_conf->num_tcs;
+   vmdq_tx_conf.dcb_queue[i] = i % 
dcb_conf->num_tcs;
+   }
}

/*set DCB mode of RX and TX of multiple queues*/
@@ -1851,22 +1860,32 @@ init_port_dcb_config(portid_t pid,struct dcb_config 
*dcb_conf)
uint16_t nb_vlan;
uint16_t i;

-   /* rxq and txq configuration in dcb mode */
-   nb_rxq = 128;
-   nb_txq = 128;
rx_free_thresh = 64;

+   rte_port = &ports[pid];
memset(&port_conf,0,sizeof(struct rte_eth_conf));
/* Enter DCB configuration status */
dcb_config = 1;

nb_vlan = sizeof( vlan_tags )/sizeof( vlan_tags[ 0 ]);
/*set configuration of DCB in vt mode and DCB in non-vt mode*/
-   retval = get_eth_dcb_conf(&port_conf, dcb_conf);
+   retval = get_eth_dcb_conf(&port_conf, dcb_conf, 
rte_port->dev_info.max_vfs);
+
+   /* rxq and txq configuration in dcb mode */
+   nb_rxq = rte_port->dev_info.max_rx_queues;
+   nb_txq = rte_port->dev_info.max_tx_queues;
+
+   if (rte_port->dev_info.max_vfs) {
+   if (port_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB)
+   nb_rxq /= 
port_conf.rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
+
+   if (port_conf.txmode.mq_mode == ETH_MQ_TX_VMDQ_DCB)
+   nb_txq /= 
port_conf.tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
+   }
+
if (retval < 0)
return retval;

-   rte_port = &ports[pid];
memcpy(&rte_port->dev_conf, &port_conf,sizeof(struct rte_eth_conf));

rxtx_port_config(rte_port);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 581130b..0ef3257 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -230,20 +230,10 @@ struct fwd_config {
portid_t   nb_fwd_ports;/**< Nb. of ports involved. */
 };

-/**
- * DCB mode enable
- */
-enum dcb_mode_enable
-{
-   DCB_VT_ENABLED,
-   DCB_ENABLED
-};
-
 /*
  * DCB general config info
  */
 struct dcb_config {
-   enum dcb_mode_enable dcb_mode;
uint8_t vt_en;
enum rte_eth_nb_tcs num_tcs;
uint8_t pfc_en;
-- 
1.9.1



[dpdk-dev] [PATCH v4 5/7] pmd ixgbe: enable DCB in SRIOV

2015-02-19 Thread Pawel Wodkowski
Enable DCB in SRIOV mode for the ixgbe driver.

To use DCB in a VF, the PF must configure the port as DCB + VMDQ and the
VF must configure the port as DCB only. VFs are not allowed to change DCB
settings that are common to all ports, such as the number of TCs.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |  2 +-
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 19 ---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 18 +++---
 3 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index 8e9da3b..7551bcc 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -1514,7 +1514,7 @@ ixgbe_dev_configure(struct rte_eth_dev *dev)
if (conf->nb_queue_pools != ETH_16_POOLS &&
   conf->nb_queue_pools != ETH_32_POOLS) {
PMD_INIT_LOG(ERR, " VMDQ+DCB selected, "
-   "number of TX qqueue pools must be %d 
or %d\n",
+   "number of TX queue pools must be %d or 
%d\n",
ETH_16_POOLS, ETH_32_POOLS);
return (-EINVAL);
}
diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index a7b9333..7c4afba 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -109,9 +109,12 @@ int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
/* Fill sriov structure using default configuration. */
retval = ixgbe_pf_configure_mq_sriov(eth_dev);
if (retval != 0) {
-   if (retval < 0)
-   PMD_INIT_LOG(ERR, " Setting up SRIOV with default 
device "
+   if (retval < 0) {
+   PMD_INIT_LOG(ERR, "Setting up SRIOV with default device 
"
"configuration should not fail. This is 
a BUG.");
+   return retval;
+   }
+
return 0;
}

@@ -652,7 +655,9 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, 
uint32_t *msgbuf)
 {
struct ixgbe_vf_info *vfinfo =
*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
-   uint32_t default_q = vf * RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
+   struct ixgbe_dcb_config *dcbinfo =
+   IXGBE_DEV_PRIVATE_TO_DCB_CFG(dev->data->dev_private);
+   uint32_t default_q = RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx;

/* Verify if the PF supports the mbox APIs version or not */
switch (vfinfo[vf].api_version) {
@@ -670,10 +675,10 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, 
uint32_t *msgbuf)
/* Notify VF of default queue */
msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;

-   /*
-* FIX ME if it needs fill msgbuf[IXGBE_VF_TRANS_VLAN]
-* for VLAN strip or VMDQ_DCB or VMDQ_DCB_RSS
-*/
+   if (dcbinfo->num_tcs.pg_tcs)
+   msgbuf[IXGBE_VF_TRANS_VLAN] = dcbinfo->num_tcs.pg_tcs;
+   else
+   msgbuf[IXGBE_VF_TRANS_VLAN] = 1;

return 0;
 }
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index e6766b3..2e3522c 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3166,10 +3166,9 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)

/* check support mq_mode for DCB */
if ((dev_conf->rxmode.mq_mode != ETH_MQ_RX_VMDQ_DCB) &&
-   (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB))
-   return;
-
-   if (dev->data->nb_rx_queues != ETH_DCB_NUM_QUEUES)
+   (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB) &&
+   (dev_conf->txmode.mq_mode != ETH_MQ_TX_VMDQ_DCB) &&
+   (dev_conf->txmode.mq_mode != ETH_MQ_TX_DCB))
return;

/** Configure DCB hardware **/
@@ -3442,8 +3441,13 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
ixgbe_config_vf_rss(dev);
break;

-   /* FIXME if support DCB/RSS together with VMDq & SRIOV */
+   /*
+* DCB will be configured during port startup.
+*/
case ETH_MQ_RX_VMDQ_DCB:
+   break;
+
+   /* FIXME if support DCB+RSS together with VMDq & SRIOV */
case ETH_MQ_RX_VMDQ_DCB_RSS:
PMD_INIT_LOG(ERR,
"Could not support DCB with VMDq & SRIOV");
@@ -3488,8 +3492,8 @@ ixgbe_dev_mq_tx_configure(struct rte_eth_dev *dev)
switch (RTE_ETH_DEV_SRIOV(dev).active) {

/*
-* SRIOV active scheme
-* FIXME if support DCB together with VMDq & SRIOV
+* SRIOV active scheme.
+* Note: DCB will be configured during port startup.
 */
case ETH_64_POO

[dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode() logic to driver

2015-02-19 Thread Pawel Wodkowski
The function rte_eth_dev_check_mq_mode() is driver specific; its checks
should be done in the PF configuration phase. This patch moves the
igb/ixgbe driver-specific mq check and SRIOV configuration code into the
drivers. It also rewrites the log messages to be shorter and more
descriptive.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ethdev.c   | 197 ---
 lib/librte_pmd_e1000/igb_ethdev.c   |  43 
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 105 ++-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   5 +-
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 202 +++-
 5 files changed, 327 insertions(+), 225 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 4007054..aa27e39 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -502,195 +502,6 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
return (0);
 }

-static int
-rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
-{
-   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-   switch (nb_rx_q) {
-   case 1:
-   case 2:
-   RTE_ETH_DEV_SRIOV(dev).active =
-   ETH_64_POOLS;
-   break;
-   case 4:
-   RTE_ETH_DEV_SRIOV(dev).active =
-   ETH_32_POOLS;
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = nb_rx_q;
-   RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
-   dev->pci_dev->max_vfs * nb_rx_q;
-
-   return 0;
-}
-
-static int
-rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
- const struct rte_eth_conf *dev_conf)
-{
-   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-
-   if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
-   /* check multi-queue mode */
-   if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
-   (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS) ||
-   (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
-   /* SRIOV only works in VMDq enable mode */
-   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
-   " SRIOV active, "
-   "wrong VMDQ mq_mode rx %u tx %u\n",
-   port_id,
-   dev_conf->rxmode.mq_mode,
-   dev_conf->txmode.mq_mode);
-   return (-EINVAL);
-   }
-
-   switch (dev_conf->rxmode.mq_mode) {
-   case ETH_MQ_RX_VMDQ_DCB:
-   case ETH_MQ_RX_VMDQ_DCB_RSS:
-   /* DCB/RSS VMDQ in SRIOV mode, not implement yet */
-   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
-   " SRIOV active, "
-   "unsupported VMDQ mq_mode rx %u\n",
-   port_id, dev_conf->rxmode.mq_mode);
-   return (-EINVAL);
-   case ETH_MQ_RX_RSS:
-   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
-   " SRIOV active, "
-   "Rx mq mode is changed from:"
-   "mq_mode %u into VMDQ mq_mode %u\n",
-   port_id,
-   dev_conf->rxmode.mq_mode,
-   dev->data->dev_conf.rxmode.mq_mode);
-   case ETH_MQ_RX_VMDQ_RSS:
-   dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
-   if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool)
-   if (rte_eth_dev_check_vf_rss_rxq_num(port_id, 
nb_rx_q) != 0) {
-   PMD_DEBUG_TRACE("ethdev port_id=%d"
-   " SRIOV active, invalid queue"
-   " number for VMDQ RSS, allowed"
-   " value are 1, 2 or 4\n",
-   port_id);
-   return -EINVAL;
-   }
-   break;
-   default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
-   /* if nothing mq mode configure, use default scheme */
-   dev->data->dev_conf.rxmode.mq_mode = 
ETH_MQ_RX_VMDQ_ONLY;
-   if (RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool > 1)
-   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = 1;
-   break;
-   }
-
-   switch (dev_conf->txmode.mq_mode) {
-   case E

[dpdk-dev] [PATCH v4 3/7] pmd: igb/ixgbe split nb_q_per_pool to rx and tx nb_q_per_pool

2015-02-19 Thread Pawel Wodkowski
The number of RX and TX queues might differ if RX and TX are
configured in different modes. Tracking them separately allows the VF
to be informed of the proper number of queues.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ethdev.c   | 12 ++--
 lib/librte_ether/rte_ethdev.h   |  3 ++-
 lib/librte_pmd_e1000/igb_pf.c   |  3 ++-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |  2 +-
 lib/librte_pmd_ixgbe/ixgbe_pf.c |  9 +
 5 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2e814db..4007054 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -520,7 +520,7 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t 
nb_rx_q)
return -EINVAL;
}

-   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
+   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = nb_rx_q;
RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
dev->pci_dev->max_vfs * nb_rx_q;

@@ -567,7 +567,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
dev->data->dev_conf.rxmode.mq_mode);
case ETH_MQ_RX_VMDQ_RSS:
dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
-   if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
+   if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool)
if (rte_eth_dev_check_vf_rss_rxq_num(port_id, 
nb_rx_q) != 0) {
PMD_DEBUG_TRACE("ethdev port_id=%d"
" SRIOV active, invalid queue"
@@ -580,8 +580,8 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
/* if nothing mq mode configure, use default scheme */
dev->data->dev_conf.rxmode.mq_mode = 
ETH_MQ_RX_VMDQ_ONLY;
-   if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
-   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
+   if (RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool > 1)
+   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = 1;
break;
}

@@ -600,8 +600,8 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
}

/* check valid queue number */
-   if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) ||
-   (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)) {
+   if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool) ||
+   (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool)) {
PMD_DEBUG_TRACE("ethdev port_id=%d SRIOV active, "
"queue number must less equal to %d\n",
port_id, 
RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool);
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 84160c3..af86401 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1544,7 +1544,8 @@ struct rte_eth_dev {

 struct rte_eth_dev_sriov {
uint8_t active;   /**< SRIOV is active with 16, 32 or 64 
pools */
-   uint8_t nb_q_per_pool;/**< rx queue number per pool */
+   uint8_t nb_rx_q_per_pool;/**< rx queue number per pool */
+   uint8_t nb_tx_q_per_pool;/**< tx queue number per pool */
uint16_t def_vmdq_idx;/**< Default pool num used for PF */
uint16_t def_pool_q_idx;  /**< Default pool queue start reg index */
 };
diff --git a/lib/librte_pmd_e1000/igb_pf.c b/lib/librte_pmd_e1000/igb_pf.c
index bc3816a..9d2f858 100644
--- a/lib/librte_pmd_e1000/igb_pf.c
+++ b/lib/librte_pmd_e1000/igb_pf.c
@@ -115,7 +115,8 @@ void igb_pf_host_init(struct rte_eth_dev *eth_dev)
rte_panic("Cannot allocate memory for private VF data\n");

RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_8_POOLS;
-   RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = nb_queue;
+   RTE_ETH_DEV_SRIOV(eth_dev).nb_rx_q_per_pool = nb_queue;
+   RTE_ETH_DEV_SRIOV(eth_dev).nb_tx_q_per_pool = nb_queue;
RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx = vf_num;
RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx = (uint16_t)(vf_num * 
nb_queue);

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index d6d408e..02b9cda 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -3564,7 +3564,7 @@ static int ixgbe_set_vf_rate_limit(struct rte_eth_dev 
*dev, uint16_t vf,
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct ixgbe_vf_info *vfinfo =
*(IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private

[dpdk-dev] [PATCH v4 2/7] pmd igb: fix VMDQ mode checking

2015-02-19 Thread Pawel Wodkowski
RX mode is an enum created by ORing flags. Change the comparison by
value into a flag test when enabling/disabling VLAN filtering during RX
queue setup.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_pmd_e1000/igb_ethdev.c | 2 +-
 lib/librte_pmd_e1000/igb_rxtx.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_e1000/igb_ethdev.c 
b/lib/librte_pmd_e1000/igb_ethdev.c
index 2a268b8..d451086 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -816,7 +816,7 @@ eth_igb_start(struct rte_eth_dev *dev)
ETH_VLAN_EXTEND_MASK;
eth_igb_vlan_offload_set(dev, mask);

-   if (dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_ONLY) {
+   if ((dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_VMDQ_FLAG) != 0) {
/* Enable VLAN filter since VMDq always use VLAN filter */
igb_vmdq_vlan_hw_filter_enable(dev);
}
diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index 5c394a9..79c458f 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -2150,7 +2150,7 @@ eth_igb_rx_init(struct rte_eth_dev *dev)
(hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT);

/* Make sure VLAN Filters are off. */
-   if (dev->data->dev_conf.rxmode.mq_mode != ETH_MQ_RX_VMDQ_ONLY)
+   if ((dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_VMDQ_FLAG) == 0)
rctl &= ~E1000_RCTL_VFE;
/* Don't store bad packets. */
rctl &= ~E1000_RCTL_SBP;
-- 
1.9.1
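The difference between the two checks can be shown with a small stand-alone sketch. The enum and flag values below are local stand-ins chosen for the example (they follow the "enum built by ORing flags" idea from the commit message) and are not copied from rte_ethdev.h.

```c
/* Hypothetical flag bits, for illustration only. */
#define MQ_RX_RSS_FLAG  0x1
#define MQ_RX_DCB_FLAG  0x2
#define MQ_RX_VMDQ_FLAG 0x4

/* Modes are combinations of the flags above. */
enum mq_rx_mode {
	MQ_RX_NONE      = 0,
	MQ_RX_RSS       = MQ_RX_RSS_FLAG,
	MQ_RX_VMDQ_ONLY = MQ_RX_VMDQ_FLAG,
	MQ_RX_VMDQ_RSS  = MQ_RX_RSS_FLAG | MQ_RX_VMDQ_FLAG,
	MQ_RX_VMDQ_DCB  = MQ_RX_VMDQ_FLAG | MQ_RX_DCB_FLAG
};

/* Old check: true only for the single VMDQ-only value. */
static int
vmdq_enabled_by_value(enum mq_rx_mode mode)
{
	return mode == MQ_RX_VMDQ_ONLY;
}

/* New check: true for every mode that has the VMDQ flag set. */
static int
vmdq_enabled_by_flag(enum mq_rx_mode mode)
{
	return (mode & MQ_RX_VMDQ_FLAG) != 0;
}
```

With these definitions a combined mode such as `MQ_RX_VMDQ_RSS` is missed by the equality test but caught by the flag test, which is the situation the patch addresses.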



[dpdk-dev] [PATCH v4 1/7] ethdev: Allow zero rx/tx queues in SRIOV mode

2015-02-19 Thread Pawel Wodkowski
Allow zero rx/tx queues to be passed to rte_eth_dev_configure(). This
way the PF can be used only for configuration when no receive and/or
transmit functionality is needed.

Rationale:
in SRIOV mode the PF uses the first free VF for RX/TX (at least on ixgbe
based NICs). For example: when using an 82599EB based NIC with a VF count
of 16, 32 or 64, all resources are assigned to VFs, so the PF might be
used only for configuration purposes.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ethdev.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ea3a1fb..2e814db 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -333,7 +333,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
sizeof(dev->data->rx_queues[0]) * nb_queues,
RTE_CACHE_LINE_SIZE);
-   if (dev->data->rx_queues == NULL) {
+   if (dev->data->rx_queues == NULL && nb_queues > 0) {
dev->data->nb_rx_queues = 0;
return -(ENOMEM);
}
@@ -475,7 +475,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
sizeof(dev->data->tx_queues[0]) * nb_queues,
RTE_CACHE_LINE_SIZE);
-   if (dev->data->tx_queues == NULL) {
+   if (dev->data->tx_queues == NULL && nb_queues > 0) {
dev->data->nb_tx_queues = 0;
return -(ENOMEM);
}
@@ -731,7 +731,10 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
}
if (nb_rx_q == 0) {
PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_q == 0\n", port_id);
-   return (-EINVAL);
+   /* In SRIOV there can be no free resource for PF. So permit use 
only
+* for configuration. */
+   if (RTE_ETH_DEV_SRIOV(dev).active == 0)
+   return (-EINVAL);
}

if (nb_tx_q > dev_info.max_tx_queues) {
@@ -739,9 +742,13 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
port_id, nb_tx_q, dev_info.max_tx_queues);
return (-EINVAL);
}
+
if (nb_tx_q == 0) {
PMD_DEBUG_TRACE("ethdev port_id=%d nb_tx_q == 0\n", port_id);
-   return (-EINVAL);
+   /* In SRIOV there can be no free resource for PF. So permit use 
only
+* for configuration. */
+   if (RTE_ETH_DEV_SRIOV(dev).active == 0)
+   return (-EINVAL);
}

/* Copy the dev_conf parameter into the dev structure */
-- 
1.9.1
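The first two hunks rely on the idea that a NULL result is only an error when a non-zero number of queues was requested. A minimal sketch of that check, using standard calloc() and a hypothetical `struct port` rather than the ethdev structures:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-port queue bookkeeping. */
struct port {
	void **rx_queues;
	unsigned int nb_rx_queues;
};

/* Configure the RX queue array; zero queues is a valid request. */
static int
port_rx_queue_config(struct port *p, unsigned int nb_queues)
{
	/* Skip the allocator entirely for an empty array. */
	p->rx_queues = (nb_queues == 0) ? NULL :
		calloc(nb_queues, sizeof(p->rx_queues[0]));

	/* NULL only means failure when something was actually requested. */
	if (p->rx_queues == NULL && nb_queues > 0) {
		p->nb_rx_queues = 0;
		return -ENOMEM;
	}

	p->nb_rx_queues = nb_queues;
	return 0;
}
```

Callers that later index the array still need to honour `nb_rx_queues`, since the pointer is NULL in the zero-queue case.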



[dpdk-dev] [PATCH v4 0/7] Enable DCB in SRIOV mode for ixgbe driver

2015-02-19 Thread Pawel Wodkowski
This patchset enables DCB in SRIOV (ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB)
for each VF and PF for ixgbe driver.

As a side effect this allows the use of multiple queues for TX in a VF (8 if
there are 16 or fewer VFs, or 4 if there are 32 or fewer VFs) when PFC is not
enabled.

PATCH v4 changes:
 - Resend the patchset, as the previous one was sent by mistake with a different one.

PATCH v3 changes:
 - Rework patch to fit ixgbe RSS in VT mode changes.
 - Move driver specific code from rte_ethdev.c to driver code.
 - Fix a bug in ixgbe driver VLAN filter enabling in PF, discovered during testing.

PATCH v2 changes:
 - Split patch for easier review.
 - Remove "pmd: add api version negotiation for ixgbe driver" and "pmd: extend
  mailbox api to report number of RX/TX queues" patches as those are
  already merged from another patch.

Pawel Wodkowski (7):
  ethdev: Allow zero rx/tx queues in SRIOV mode
  pmd igb: fix VMDQ mode checking
  pmd: igb/ixgbe split nb_q_per_pool to rx and tx nb_q_per_pool
  move rte_eth_dev_check_mq_mode() logic to ixgbe driver
  pmd ixgbe: enable DCB in SRIOV
  tespmd: fix DCB in SRIOV mode support
  pmd ixgbe: fix vlan setting in in PF

 app/test-pmd/cmdline.c  |   4 +-
 app/test-pmd/testpmd.c  |  39 +--
 app/test-pmd/testpmd.h  |  10 --
 lib/librte_ether/rte_ethdev.c   | 212 ++
 lib/librte_ether/rte_ethdev.h   |   3 +-
 lib/librte_pmd_e1000/igb_ethdev.c   |  45 +++-
 lib/librte_pmd_e1000/igb_pf.c   |   3 +-
 lib/librte_pmd_e1000/igb_rxtx.c |   2 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 126 ++---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   5 +-
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 220 +++-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |  18 +--
 12 files changed, 407 insertions(+), 280 deletions(-)

-- 
1.9.1



[dpdk-dev] i40e and RSS woes

2015-02-19 Thread Gleb Natapov
CCing the i40e driver author in the hope of getting an answer.

On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
> I have an application that works reasonably well with the ixgbe driver, but
> when I try to use it with i40e I encounter various RSS-related issues.
> 
> The first one is that, for some reason, i40e rounds the number of queues
> down to a power of two when it builds the default reta table. Why is this?
> If I configure the reta on my own using all of the queues, everything seems
> to be working. To add insult to injury, I do not get any errors during
> configuration; some queues just do not receive any traffic.
> 
> The second problem is that, for some reason, i40e does not use a 40-byte
> Toeplitz hash key like any other driver, but expects the key to be 52
> bytes. That would have been fine (if we ignore the fact that it
> contradicts the MS spec), but how is my high-level code supposed to know
> that? And again, device configuration does not fail when the wrong key
> length is provided; it just uses some other key. Guys, this kind of error
> handling is completely unacceptable.
> 
> The last one is more of a question. Why is the interface to change the RSS
> hash function (XOR or Toeplitz) part of the filter configuration and not
> the RSS config?
> 
> --
>   Gleb.

--
Gleb.
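The first complaint can be made concrete with a small sketch of a redirection table fill. This is not i40e driver code; the table size, the helper names and the round-robin policy are assumptions used only to show why rounding the queue count down leaves some queues without traffic.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a redirection table so that entries cycle over all queues. */
static void
reta_fill_round_robin(uint8_t *reta, size_t reta_size, uint16_t nb_queues)
{
	size_t i;

	if (nb_queues == 0)
		return;
	for (i = 0; i < reta_size; i++)
		reta[i] = (uint8_t)(i % nb_queues);
}

/* Largest power of two that is <= n (expects n >= 1). */
static uint16_t
round_down_pow2(uint16_t n)
{
	uint16_t p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Return 1 if any table entry points at the given queue. */
static int
reta_uses_queue(const uint8_t *reta, size_t reta_size, uint8_t queue)
{
	size_t i;

	for (i = 0; i < reta_size; i++)
		if (reta[i] == queue)
			return 1;
	return 0;
}
```

With 6 configured queues, filling the table with `round_down_pow2(6)` (that is, 4) never references queues 4 and 5, which matches the described symptom of queues that silently receive nothing; filling it with all 6 reaches every queue.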


[dpdk-dev] [PATCH v1 1/3] eal: enable uio_pci_generic support

2015-02-19 Thread Zhou, Danny
Thomas, thanks for the review; I have added comments inline.

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 18, 2015 9:40 PM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 1/3] eal: enable uio_pci_generic support
> 
> Hi Danny,
> 
> I wanted to apply this patchset which was reviewed. But when having a quick
> overview, I've seen some strange additions.
> 
> 2015-01-29 17:28, Danny Zhou:
> > 1) Unify procedure to retrieve BAR resource mapping information.
> > 2) Setup bus master bit in NIC's PCIe configuration space for 
> > uio_pci_generic.
> >
> > Signed-off-by: Danny Zhou 
> > Tested-by: Qun Wan 
> [...]
> > --- a/lib/librte_eal/common/include/rte_pci.h
> > +++ b/lib/librte_eal/common/include/rte_pci.h
> > @@ -148,6 +148,7 @@ struct rte_pci_device {
> > struct rte_pci_id id;   /**< PCI ID. */
> > struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE];   /**< PCI 
> > Memory Resource */
> > struct rte_intr_handle intr_handle; /**< Interrupt handle */
> > +   char driver_name[BUFSIZ];   /**< driver name */
> 
> Why not embedding this field in driver struct?
> The name and comment should be more precise.
> There is also driver->name and hotplug patchset is adding a kernel driver 
> name.
> Please bring the light in all these driver names :)
> 

This driver_name is the name of the kernel driver (e.g. vfio_pci, igb_uio,
uio_pci_generic), while driver->name is a user-defined name for the user
space driver. I am going to change it to kernel_driver_name with a precise
comment in the V2 patch. Once the V2 patch is applied, I think the function
pci_get_kernel_driver_by_path() in the hotplug patchset will no longer be
necessary, as the kernel driver name could be retrieved directly from this
variable.

> > const struct rte_pci_driver *driver;/**< Associated driver */
> [...]
> > --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> > +#define IORESOURCE_MEM 0x0200
> 
> Please comment this value.

Will do.
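As background for the requested comment: in the Linux kernel, IORESOURCE_MEM is the resource-type bit that marks a memory-mapped region, and it appears in the flags column of a PCI device's sysfs `resource` file. A minimal sketch of how such a line could be parsed and tested follows; the helper names are hypothetical and this is not the eal_pci_uio.c code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Linux resource-type bit: the region is memory-mapped (not port I/O). */
#define IORESOURCE_MEM 0x00000200

/* Parse one "start end flags" line as found in a sysfs resource file.
 * Returns 0 on success, -1 if the line does not hold three hex fields. */
static int
parse_resource_line(const char *line, uint64_t *start, uint64_t *end,
		uint64_t *flags)
{
	if (sscanf(line, "%" SCNx64 " %" SCNx64 " %" SCNx64,
			start, end, flags) != 3)
		return -1;
	return 0;
}

/* Only memory resources are candidates for mapping BARs. */
static int
resource_is_mem(uint64_t flags)
{
	return (flags & IORESOURCE_MEM) != 0;
}
```

A line whose flags field is 0x40200 is reported as a memory resource, while one with 0x40101 (an I/O port range) is not.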

> 
> > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
> > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
> > @@ -50,8 +50,14 @@ enum rte_intr_handle_type {
> >
> >  /** Handle for interrupts. */
> >  struct rte_intr_handle {
> > -   int vfio_dev_fd; /**< VFIO device file descriptor */
> > -   int fd;  /**< file descriptor */
> > +   union {
> > +   int vfio_dev_fd;  /**< VFIO device file descriptor */
> > +   };
> > +union {
> > +   int uio_cfg_fd;  /**< UIO config file descriptor
> > +   for uio_pci_generic */
> > +   };
> 
> Apart the indent, it seems there is a mistake here.
> Why 2 unions with 1 field each?

It is a mistake I made during code merge, will fix it in V2.

> 
> > +   int fd;  /**< interrupt event file descriptor */
> > enum rte_intr_handle_type type;  /**< handle type */
> >  };



[dpdk-dev] [PATCH v6 0/7] rte_hash_crc reworked to be platform-independent

2015-02-19 Thread Bruce Richardson
On Mon, Feb 02, 2015 at 11:39:18AM +0600, Yerden Zhumabekov wrote:
> 
> 02.02.2015 9:31, Neil Horman wrote:
> > On Mon, Feb 02, 2015 at 09:07:45AM +0600, Yerden Zhumabekov wrote:
> >
> >> I think so, I've just successfully built it against latest snapshot with
> >> RTE_TARGET
> >> equal to 'x86_64-native-linuxapp-gcc'.
> >>
> > Please confirm that setting the machine type to default builds and runs 
> > properly.
> 
> If I understood you correctly, I set CONFIG_RTE_MACHINE="default" in the
> config and the build was successful.
> 

Confirmed, this worked for me too.
Looking at the patches, they look good. However, one thing I think we are
missing is a unit test to verify that all our CRC implementations give the
same result. That would be useful as a sanity check, especially of the
software fallback. The existing hash tests exercise the hash table
implementation rather than the mathematical algorithm used to compute the
hash values.

Overall, though, software fallback for CRC is something well worthwhile having.

Series Acked-by: Bruce Richardson 
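A rough sketch of the kind of cross-check suggested above: two independent software CRC32C (Castagnoli) routines, one bit-at-a-time and one table-driven, compared over a range of input lengths. These are generic textbook implementations with the conventional initial value and final inversion; they are not the rte_hash_crc functions, whose calling convention may differ, and a real test would put the hardware-assisted path in place of one of them.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_REFLECTED 0x82F63B78u /* Castagnoli polynomial, reflected */

/* Reference implementation: one bit at a time. */
static uint32_t
crc32c_bitwise(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;
	int k;

	while (len--) {
		crc ^= *p++;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^
				(CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Second implementation: byte-wise lookup table, built on first use. */
static uint32_t
crc32c_table(const void *data, size_t len)
{
	static uint32_t table[256];
	static int ready;
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;
	uint32_t i, c;
	int k;

	if (!ready) {
		for (i = 0; i < 256; i++) {
			c = i;
			for (k = 0; k < 8; k++)
				c = (c >> 1) ^
					(CRC32C_POLY_REFLECTED & (0u - (c & 1u)));
			table[i] = c;
		}
		ready = 1;
	}

	while (len--)
		crc = (crc >> 8) ^ table[(crc ^ *p++) & 0xFFu];
	return ~crc;
}

/* Return 1 if both implementations agree for every length 0..64. */
static int
crc32c_impls_agree(void)
{
	uint8_t buf[64];
	size_t i, len;

	for (i = 0; i < sizeof(buf); i++)
		buf[i] = (uint8_t)(i * 31 + 7);

	for (len = 0; len <= sizeof(buf); len++)
		if (crc32c_bitwise(buf, len) != crc32c_table(buf, len))
			return 0;
	return 1;
}
```

Checking every length from 0 upwards matters because optimized variants usually process 8 or 4 bytes at a time and handle the tail separately, which is where mismatches tend to hide.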

> -- 
> Sincerely,
> 
> Yerden Zhumabekov
> State Technical Service
> Astana, KZ
> 
> 


[dpdk-dev] Patches outstanding

2015-02-19 Thread O'driscoll, Tim
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Thursday, February 19, 2015 1:47 PM
> To: Neil Horman
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] Patches outstanding
> 
> 2015-02-19 08:08, Neil Horman:
> > On Tue, Feb 17, 2015 at 10:35:07AM -0500, Stephen Hemminger wrote:
> > > There are currently 1039 patches outstanding on DPDK.
> > > What is the schedule for getting these merged or resolved?
> > > I don't think it would be reasonable to declare 2.0 as done
> > > until the patch backlog is 0!
> > >
> >
> > I think the subtrees were supposed to start biting into this, but I don't 
> > see
> > them getting used yet.
> 
> Yes
> Actually the main problem is still on reviews.
> There are more good reviews in this cycle.
> But some patchset are not reviewed and some others are acked
> without being carefully reviewed.

Progress on reviews has been a little slow on our side. One of the reasons for 
this is that our PRC team are on their new year holidays at the moment, so 
we're a little short staffed. They return to the office in the middle of next 
week, after which we'll be back to full strength.

We do agree with Thomas on the need for thorough reviews. We're working on 
this, so you should see more reviews/acknowledgements soon.


Tim



[dpdk-dev] [PATCH 6/6] examples: remove unneeded casts

2015-02-19 Thread Bruce Richardson
On Sat, Feb 14, 2015 at 09:59:10AM -0500, Stephen Hemminger wrote:
> *alloc() routines return void * and therefore cast is not needed.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  examples/kni/main.c   | 4 ++--
>  examples/l3fwd-acl/main.c | 4 ++--
>  examples/vhost/main.c | 7 ---
>  3 files changed, 8 insertions(+), 7 deletions(-)
> 
...  ...
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index 3a35359..a96b19f 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -2592,9 +2592,10 @@ new_device (struct virtio_net *dev)
>  
>   }
>  
> - vdev->regions_hpa = (struct virtio_memory_regions_hpa *) rte_zmalloc("vhost hpa region",
> - sizeof(struct virtio_memory_regions_hpa) * vdev->nregions_hpa,
> - RTE_CACHE_LINE_SIZE);
> + vdev->regions_hpa = rte_calloc("vhost hpa region",
> +sizeof(struct virtio_memory_regions_hpa),
> +vdev->nregions_hpa,
> +RTE_CACHE_LINE_SIZE);

I know functionally it probably doesn't make a difference, but I think your
"num" and "size" parameters are reversed here.

/Bruce
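
To illustrate the argument order in question, here is a minimal sketch using plain calloc() instead of rte_calloc(), so that it builds without DPDK. The struct and the helper name are hypothetical stand-ins, not code from the patch. calloc() takes the element count first and the element size second, and rte_calloc() follows the same order after its name argument.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type standing in for struct virtio_memory_regions_hpa. */
struct region {
	uint64_t guest_phys_addr;
	uint64_t size;
};

/*
 * Allocate a zeroed array of `count` regions.
 * First argument: number of elements. Second argument: size of one element.
 * No cast is needed because void * converts implicitly in C.
 */
static struct region *
alloc_regions(size_t count)
{
	return calloc(count, sizeof(struct region));
}
```

Swapping the two arguments requests the same number of bytes, which is why the reversed call still works, but keeping the documented order makes the intent clear to readers.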


[dpdk-dev] [PATCH 5/6] eal: remove useless memset

2015-02-19 Thread Bruce Richardson
On Sat, Feb 14, 2015 at 09:59:09AM -0500, Stephen Hemminger wrote:
> The path variable is set via snprintf, and does not need to
> memset before that.
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> index 590cb56..8d29e06 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> @@ -84,8 +84,6 @@ get_num_hugepages(const char *subdir)
>   else
>   nr_hp_file = "free_hugepages";
>  
> - memset(path, 0, sizeof(path));
> -
>   snprintf(path, sizeof(path), "%s/%s/%s",
>   sys_dir_path, subdir, nr_hp_file);
>  
> -- 
> 2.1.4
> 
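
A small sketch of why the memset is redundant: snprintf() always writes a terminating NUL when the buffer size is non-zero, so the previous buffer contents do not matter. The helper name and the path components are hypothetical. The truncation check is an addition for illustration and is not part of the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "<dir>/<subdir>/<file>" into path.
 * Returns 0 on success, -1 on output error or if the result was truncated.
 * The buffer does not need to be cleared first: snprintf() NUL-terminates
 * whatever it writes as long as len > 0.
 */
static int
build_path(char *path, size_t len, const char *dir,
	   const char *subdir, const char *file)
{
	int n = snprintf(path, len, "%s/%s/%s", dir, subdir, file);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```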


[dpdk-dev] [PATCH 4/6] enic: eliminate useless cast

2015-02-19 Thread Bruce Richardson
On Sat, Feb 14, 2015 at 09:59:08AM -0500, Stephen Hemminger wrote:
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_enic/enic_clsf.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_pmd_enic/enic_clsf.c b/lib/librte_pmd_enic/enic_clsf.c
> index 577a382..b61d625 100644
> --- a/lib/librte_pmd_enic/enic_clsf.c
> +++ b/lib/librte_pmd_enic/enic_clsf.c
> @@ -121,9 +121,8 @@ int enic_fdir_add_fltr(struct enic *enic, struct rte_fdir_filter *params,
>   enic->fdir.stats.f_add++;
>   return -ENOSPC;
>   }
> - key = (struct enic_fdir_node *)rte_zmalloc(
> - "enic_fdir_node",
> - sizeof(struct enic_fdir_node), 0);
> + key = rte_zmalloc("enic_fdir_node",
> +   sizeof(struct enic_fdir_node), 0);
>   if (!key) {
>   enic->fdir.stats.f_add++;
>   return -ENOMEM;
> -- 
> 2.1.4
> 


[dpdk-dev] [PATCH 2/6] vhost_xen: remove unnecessary cast

2015-02-19 Thread Bruce Richardson
On Sat, Feb 14, 2015 at 09:59:06AM -0500, Stephen Hemminger wrote:
> Don't need to cast malloc family of functions since they return
> void *.
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

> ---
>  examples/vhost_xen/vhost_monitor.c  | 2 +-
>  examples/vhost_xen/xenstore_parse.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/examples/vhost_xen/vhost_monitor.c b/examples/vhost_xen/vhost_monitor.c
> index f683989..9d99962 100644
> --- a/examples/vhost_xen/vhost_monitor.c
> +++ b/examples/vhost_xen/vhost_monitor.c
> @@ -138,7 +138,7 @@ add_xen_guest(int32_t dom_id)
>   if ((guest = get_xen_guest(dom_id)) != NULL)
>   return guest;
>  
> - guest = (struct xen_guest * )calloc(1, sizeof(struct xen_guest));
> + guest = calloc(1, sizeof(struct xen_guest));
>   if (guest) {
>   RTE_LOG(ERR, XENHOST, "  %s: return newly created guest with %d rings\n", __func__, guest->vring_num);
>   TAILQ_INSERT_TAIL(&guest_root, guest, next);
> diff --git a/examples/vhost_xen/xenstore_parse.c b/examples/vhost_xen/xenstore_parse.c
> index 9441639..df191ac 100644
> --- a/examples/vhost_xen/xenstore_parse.c
> +++ b/examples/vhost_xen/xenstore_parse.c
> @@ -248,8 +248,8 @@ parse_gntnode(int dom_id, char *path)
>   goto err;
>   }
>  
> - gntnode = (struct xen_gntnode *)calloc(1, sizeof(struct xen_gntnode));
> - gnt = (struct xen_gnt *)calloc(gref_num, sizeof(struct xen_gnt));
> + gntnode = calloc(1, sizeof(struct xen_gntnode));
> + gnt = calloc(gref_num, sizeof(struct xen_gnt));
>   if (gnt == NULL || gntnode == NULL)
>   goto err;
>  
> -- 
> 2.1.4
> 


[dpdk-dev] [PATCH 1/6] test: remove unneeded casts

2015-02-19 Thread Bruce Richardson
On Sat, Feb 14, 2015 at 09:59:05AM -0500, Stephen Hemminger wrote:
> The malloc family returns void * and therefore cast is unnecessary.
> Use calloc rather than zmalloc with multiply for array.
> 
> Signed-off-by: Stephen Hemminger 

Looks like a good basic cleanup

Acked-by: Bruce Richardson 

> ---
>  app/test/test_hash_perf.c | 8 
>  app/test/test_mempool.c   | 2 +-
>  app/test/test_ring.c  | 2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
> index be34957..6f719fc 100644
> --- a/app/test/test_hash_perf.c
> +++ b/app/test/test_hash_perf.c
> @@ -459,13 +459,13 @@ run_single_tbl_perf_test(const struct rte_hash *h, hash_operation func,
>  
>   /* Initialise */
>   num_buckets = params->entries / params->bucket_entries;
> - key = (uint8_t *) rte_zmalloc("hash key",
> - params->key_len * sizeof(uint8_t), 16);
> + key = rte_zmalloc("hash key",
> +   params->key_len * sizeof(uint8_t), 16);
>   if (key == NULL)
>   return -1;
>  
> - bucket_occupancies = (uint32_t *) rte_zmalloc("bucket occupancies",
> - num_buckets * sizeof(uint32_t), 16);
> + bucket_occupancies = rte_calloc("bucket occupancies",
> + num_buckets, sizeof(uint32_t), 16);
>   if (bucket_occupancies == NULL) {
>   rte_free(key);
>   return -1;
> diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
> index 303d2b3..de85c9c 100644
> --- a/app/test/test_mempool.c
> +++ b/app/test/test_mempool.c
> @@ -360,7 +360,7 @@ test_mempool_basic_ex(struct rte_mempool * mp)
>   if (mp == NULL)
>   return ret;
>  
> - obj = (void **)rte_zmalloc("test_mempool_basic_ex", (MEMPOOL_SIZE * sizeof(void *)), 0);
> + obj = rte_calloc("test_mempool_basic_ex", MEMPOOL_SIZE , sizeof(void *), 0);
>   if (obj == NULL) {
>   printf("test_mempool_basic_ex fail to rte_malloc\n");
>   return ret;
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c
> index 2cd8e77..ce25329 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -1259,7 +1259,7 @@ test_ring_basic_ex(void)
>   struct rte_ring * rp;
>   void **obj = NULL;
>  
> - obj = (void **)rte_zmalloc("test_ring_basic_ex_malloc", (RING_SIZE * sizeof(void *)), 0);
> + obj = rte_calloc("test_ring_basic_ex_malloc", RING_SIZE, sizeof(void *), 0);
>   if (obj == NULL) {
>   printf("test_ring_basic_ex fail to rte_malloc\n");
>   goto fail_test;
> -- 
> 2.1.4
> 


[dpdk-dev] [PATCH v9 12/14] ethdev: Add one dev_type parameter to rte_eth_dev_allocate

2015-02-19 Thread Iremonger, Bernard


> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Thursday, February 19, 2015 2:50 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael; Iremonger, Bernard; thomas.monjalon at 6wind.com; Tetsuya 
> Mukawa
> Subject: [PATCH v9 12/14] ethdev: Add one dev_type parameter to 
> rte_eth_dev_allocate
> 
> This new parameter is needed to keep device type like PCI or virtual.
> Port detaching processes are different between PCI device and virtual device.
> RTE_ETH_DEV_PCI indicates device type is PCI. RTE_ETH_DEV_VIRTUAL indicates 
> device is virtual.
> 
> v9:
> - Fix commit log.
> - RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
>   (Thanks to Thomas Monjalon)
> v8:
> - NONE_TRACE is replaced by NO_TRACE.
> - Add missing symbol in version map.
>   (Thanks to Qiu, Michael and Iremonger, Bernard)
> v4:
> - Fix comments of rte_eth_dev_type.
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  app/test/virtual_pmd.c   |  2 +-
>  lib/librte_ether/rte_ethdev.c| 25 +++--
>  lib/librte_ether/rte_ethdev.h| 25 -
>  lib/librte_ether/rte_ether_version.map   |  1 +
>  lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
>  lib/librte_pmd_bond/rte_eth_bond_api.c   |  2 +-
>  lib/librte_pmd_pcap/rte_eth_pcap.c   |  2 +-
>  lib/librte_pmd_ring/rte_eth_ring.c   |  2 +-
>  lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |  2 +-
>  9 files changed, 54 insertions(+), 9 deletions(-)
> 
> diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
> index 9fac95d..c02644a 100644
> --- a/app/test/virtual_pmd.c
> +++ b/app/test/virtual_pmd.c
> @@ -556,7 +556,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
> *mac_addr,
>   goto err;
> 
>   /* reserve an ethdev entry */
> - eth_dev = rte_eth_dev_allocate(name);
> + eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
>   if (eth_dev == NULL)
>   goto err;
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 3b64f3a..201c04a 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -227,7 +227,7 @@ rte_eth_dev_find_free_port(void)
>  }
> 
>  struct rte_eth_dev *
> -rte_eth_dev_allocate(const char *name)
> +rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)
>  {
>   uint8_t port_id;
>   struct rte_eth_dev *eth_dev;
> @@ -251,6 +251,7 @@ rte_eth_dev_allocate(const char *name)
>   snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
>   eth_dev->data->port_id = port_id;
>   eth_dev->attached = DEV_ATTACHED;
> + eth_dev->dev_type = type;
>   nb_ports++;
>   return eth_dev;
>  }
> @@ -262,6 +263,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>   return -EINVAL;
> 
>   eth_dev->attached = 0;
> + eth_dev->dev_type = RTE_ETH_DEV_UNKNOWN;
>   nb_ports--;
>   return 0;
>  }
> @@ -299,7 +301,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
>   /* Create unique Ethernet device name using PCI address */
>   rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);
> 
> - eth_dev = rte_eth_dev_allocate(ethdev_name);
> + eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
>   if (eth_dev == NULL)
>   return -ENOMEM;
> 
> @@ -424,6 +426,14 @@ rte_eth_dev_count(void)
>   return (nb_ports);
>  }
> 
> +enum rte_eth_dev_type
> +rte_eth_dev_get_device_type(uint8_t port_id) {
> + if (!rte_eth_dev_is_valid_port(port_id))
> + return -1;
> + return rte_eth_devices[port_id].dev_type;
> +}
> +
>  int
>  rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
>  {
> @@ -521,6 +531,17 @@ rte_eth_dev_is_detachable(uint8_t port_id)
>   return -EINVAL;
>   }
> 
> + if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
> + switch (rte_eth_devices[port_id].pci_dev->pt_driver) {
> + case RTE_PT_IGB_UIO:
> + case RTE_PT_UIO_GENERIC:
> + break;
> + case RTE_PT_VFIO:
> + default:
> + return -ENOTSUP;
> + }
> + }
> +
>   drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
>   return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
>  }
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 90b7f25..4f66bb6 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1523,6 +1523,17 @@ struct eth_dev_ops {
>  };
> 
>  /**
> + * The eth device type
> + */
> +enum rte_eth_dev_type {
> + RTE_ETH_DEV_UNKNOWN,/**< unknown device type */
> + RTE_ETH_DEV_PCI,
> + /**< Physical function and Virtual function of PCI devices */
> + RTE_ETH_DEV_VIRTUAL,/**< non hardware device */
> + RTE_ETH_DEV_MAX /**< max value of this enum */
> +};
> +
> +/**

[dpdk-dev] Patches outstanding

2015-02-19 Thread Thomas Monjalon
2015-02-19 08:08, Neil Horman:
> On Tue, Feb 17, 2015 at 10:35:07AM -0500, Stephen Hemminger wrote:
> > There are currently 1039 patches outstanding on DPDK.
> > What is the schedule for getting these merged or resolved?
> > I don't think it would be reasonable to declare 2.0 as done
> > until the patch backlog is 0!
> > 
> 
> I think the subtrees were supposed to start biting into this, but I don't see
> them getting used yet.

Yes
Actually the main problem is still on reviews.
There are more good reviews in this cycle.
But some patchsets are not reviewed and some others are acked
without being carefully reviewed.


[dpdk-dev] [PATCH] ixgbe: fix build with gcc 5

2015-02-19 Thread Panu Matilainen
On 02/19/2015 02:02 PM, Ananyev, Konstantin wrote:
> Hi Panu,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>> Sent: Thursday, February 19, 2015 10:25 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] ixgbe: fix build with gcc 5
>>
>> Add extra parentheses to remove ambiguity on what we want to compare,
>> otherwise gcc 5 issues a "logical not is only applied to the left hand
>> side of comparison" warning which with -Werror fails the build.
>>
>> Signed-off-by: Panu Matilainen 
>> ---
>>   lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
>> b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
>> index 37e5bae..93a6a00 100644
>> --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
>> +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
>> @@ -2898,8 +2898,8 @@ STATIC s32 ixgbe_fc_autoneg_fiber(struct ixgbe_hw *hw)
>>   */
>>
>>  linkstat = IXGBE_READ_REG(hw, IXGBE_PCS1GLSTA);
>> -if ((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE) == 0) ||
>> -(!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT) == 1)) {
>> +if (((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE)) == 0) ||
>> +((!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT)) == 1)) {
>>  ERROR_REPORT1(IXGBE_ERROR_POLLING,
>>   "Auto-Negotiation did not complete or timed out");
>>  goto out;
>
> Unfortunately we are not supposed to change files under the ixgbe subdirectory
> (except ixgbe_osdep.*).

Oh, sorry about that, I didn't realize there were untouchable files in 
the repo. It's not a very common setup :)

> Usually we deal with it just by:
> If GCC_VERSION...
> CFLAGS_ixgbe_common.o += -Wno...
>
> You can have a look at lib/librte_pmd_ixgbe/Makefile, there are plenty of 
> such things.

Yup, noticed that but assumed the warning disablers were mainly for 
things that are not trivial to fix.

This one can be worked around just as easily by disabling 
-Wlogical-not-parentheses, but since this flag is new to gcc 5 it can't 
really be added until gcc 5 is recognized as a supported version by the 
makefiles:
http://dpdk.org/dev/patchwork/patch/3452/

I'll send an updated version using a warning disabler once the other gcc-5 
support goes in.

- Panu -




[dpdk-dev] [PATCH v9 07/14] eal, ethdev: Add a function and function pointers to close ether device

2015-02-19 Thread Iremonger, Bernard


> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Thursday, February 19, 2015 2:50 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael; Iremonger, Bernard; thomas.monjalon at 6wind.com; Tetsuya 
> Mukawa
> Subject: [PATCH v9 07/14] eal,ethdev: Add a function and function pointers to 
> close ether device
> 
> The patch adds function pointers to the rte_pci_driver and eth_driver
> structures. These function pointers are used when ports are detached.
> Also, the patch adds rte_eth_dev_uninit(). So far, it is not called from
> anywhere, but it will be called when the port hotplug function is
> implemented.
> 
> v9:
> - Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
> - Remove code that initializes callback of ethdev from
>   rte_eth_dev_uninit().
> - Add a function to create a unique device name.
>   (Thanks to Thomas Monjalon)
> v6:
> - Fix rte_eth_dev_uninit() to handle a return value of uninit
>   function of PMD.
> v4:
> - Add parameter checking.
> - Change function names.
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_eal/common/include/rte_pci.h |  6 
>  lib/librte_ether/rte_ethdev.c   | 62 
> +++--
>  lib/librte_ether/rte_ethdev.h   | 24 +
>  3 files changed, 90 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/rte_pci.h 
> b/lib/librte_eal/common/include/rte_pci.h
> index c609ef3..376f66a 100644
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -189,12 +189,18 @@ struct rte_pci_driver;
>  typedef int (pci_devinit_t)(struct rte_pci_driver *, struct rte_pci_device *);
> 
>  /**
> + * Uninitialisation function for the driver called during hotplugging.
> + */
> +typedef int (pci_devuninit_t)(struct rte_pci_device *);
> +
> +/**
>   * A structure describing a PCI driver.
>   */
>  struct rte_pci_driver {
>   TAILQ_ENTRY(rte_pci_driver) next;   /**< Next in list. */
>   const char *name;   /**< Driver name. */
>   pci_devinit_t *devinit; /**< Device init. function. */
> + pci_devuninit_t *devuninit; /**< Device uninit function. */
>   struct rte_pci_id *id_table;/**< ID table, NULL terminated. 
> */
>   uint32_t drv_flags; /**< Flags contolling handling 
> of device. */
>  };
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index be5aa18..ef5d226 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -266,6 +266,24 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>   return 0;
>  }
> 
> +static inline int
> +rte_eth_dev_create_unique_device_name(char *name,
> + struct rte_pci_device *pci_dev)
> +{
> + int ret;
> +
> + if ((name == NULL) || (pci_dev == NULL))
> + return -EINVAL;
> +
Hi Tetsuya,

It would be safer to pass in the size of the name buffer and use it in the 
snprintf() call.
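
A minimal sketch of that suggestion, assuming a simplified address struct in place of struct rte_pci_device (the struct, the function name and the -ENAMETOOLONG return are illustrative assumptions, not part of the patch). The caller passes the buffer size, and the helper reports truncation instead of silently accepting it:

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the bus/devid/function fields of a PCI address. */
struct pci_addr {
	unsigned int bus;
	unsigned int devid;
	unsigned int function;
};

/*
 * Format "bus:devid.function" into name, which holds size bytes.
 * Returns 0 on success, -EINVAL on bad arguments, -ENAMETOOLONG if the
 * name did not fit, or the negative snprintf() result on output error.
 */
static int
create_unique_device_name(char *name, size_t size,
			  const struct pci_addr *addr)
{
	int ret;

	if (name == NULL || size == 0 || addr == NULL)
		return -EINVAL;

	ret = snprintf(name, size, "%u:%u.%u",
		       addr->bus, addr->devid, addr->function);
	if (ret < 0)
		return ret;
	if ((size_t)ret >= size)
		return -ENAMETOOLONG;

	return 0;
}
```

A caller would pass sizeof(ethdev_name), so the helper no longer depends on every caller using a buffer of exactly RTE_ETH_NAME_MAX_LEN bytes.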


Regards,

Bernard.


> + ret = snprintf(name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
> + pci_dev->addr.bus, pci_dev->addr.devid,
> + pci_dev->addr.function);
> + if (ret < 0)
> + return ret;
> +
> + return 0;
> +}
> +
>  static int
>  rte_eth_dev_init(struct rte_pci_driver *pci_drv,
>struct rte_pci_device *pci_dev)
> @@ -279,8 +297,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
>   eth_drv = (struct eth_driver *)pci_drv;
> 
>   /* Create unique Ethernet device name using PCI address */
> - snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
> - pci_dev->addr.bus, pci_dev->addr.devid, 
> pci_dev->addr.function);
> + rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);
> 
>   eth_dev = rte_eth_dev_allocate(ethdev_name);
>   if (eth_dev == NULL)
> @@ -321,6 +338,46 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
>   return diag;
>  }
> 
> +static int
> +rte_eth_dev_uninit(struct rte_pci_device *pci_dev) {
> + const struct eth_driver *eth_drv;
> + struct rte_eth_dev *eth_dev;
> + char ethdev_name[RTE_ETH_NAME_MAX_LEN];
> + int ret;
> +
> + if (pci_dev == NULL)
> + return -EINVAL;
> +
> + /* Create unique Ethernet device name using PCI address */
> + rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);
> +
> + eth_dev = rte_eth_dev_allocated(ethdev_name);
> + if (eth_dev == NULL)
> + return -ENODEV;
> +
> + eth_drv = (const struct eth_driver *)pci_dev->driver;
> +
> + /* Invoke PMD device uninit function */
> + if (*eth_drv->eth_dev_uninit) {
> + ret = (*eth_drv->eth_dev_uninit)(eth_drv, eth_dev);
> + if (ret)
> + return ret;
> + }
> +
> + /* free ether device */
> + rte_eth_dev_release_port(eth_dev);
> +
> + if (rte_eal_process_type() == RTE_PROC_PR

[dpdk-dev] mapping lcore – port –queue

2015-02-19 Thread kuldeep.sam...@wipro.com
Hi Team ,

I am using DPDK 1.7.1 to process packets faster and am trying to understand how 
physical NIC ports are mapped to lcores.

Any suggestions on the lcore-port-queue mapping, and which file should I 
look at to follow it?



Regards ,
Kuldeep


[dpdk-dev] [PATCH v2] i40e: fix build with gcc 5

2015-02-19 Thread Panu Matilainen
Eliminate ambiguity in the condition which trips up a "logical not
is only applied to the left..." warning from gcc 5, causing build
failure with -Werror. Besides being unambiguous, the condition is
far more obvious this way.

Signed-off-by: Panu Matilainen 
---
 lib/librte_pmd_i40e/i40e_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index c9f1026..12c0831 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -613,7 +613,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct i40e_rx_queue *rxq)
 "rxq->nb_rx_desc=%d",
 rxq->rx_free_thresh, rxq->nb_rx_desc);
ret = -EINVAL;
-   } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh) == 0) {
+   } else if (rxq->nb_rx_desc % rxq->rx_free_thresh != 0) {
PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
 "rxq->nb_rx_desc=%d, "
 "rxq->rx_free_thresh=%d",
-- 
2.1.0
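
For readers wondering whether the rewrite changes behaviour: a small sketch, with hypothetical helper names, that spells out how the old condition parses and shows the new one side by side. Because `!` binds tighter than `==`, the old expression is `(!(a % b)) == 0`, which is true exactly when `a % b` is non-zero. Both helpers assume thresh is non-zero.

```c
#include <stdbool.h>

/* The pre-patch condition with its implicit grouping made explicit. */
static bool
old_form(unsigned int nb_desc, unsigned int thresh)
{
	return (!(nb_desc % thresh)) == 0;
}

/* The post-patch condition: true when nb_desc is not a multiple of thresh. */
static bool
new_form(unsigned int nb_desc, unsigned int thresh)
{
	return nb_desc % thresh != 0;
}
```

The two forms agree for every input, so the patch only removes the ambiguity that gcc 5 warns about.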



[dpdk-dev] [PATCH v5 3/3] MAINTAINERS: claim responsibility for headroom library and example app

2015-02-19 Thread Pawel Wodkowski
Signed-off-by: Pawel Wodkowski 
---
 MAINTAINERS | 4 
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a771fa3..782b585 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -362,6 +362,10 @@ F: app/test/test_timer*
 F: examples/timer/
 F: doc/guides/sample_app_ug/timer.rst

+Headroom
+M: Pawel Wodkowski 
+F: lib/librte_headroom/
+F: examples/l2fwd-headroom/

 Test Applications
 -
-- 
1.9.1



[dpdk-dev] [PATCH v5 2/3] examples: introduce new l2fwd-headroom example

2015-02-19 Thread Pawel Wodkowski
This app demonstrates usage of the new headroom library.
It is basically the original l2fwd with the following modifications to meet
headroom library requirements:
- main_loop() was split into two jobs: forward job and flush job. Logic
for those jobs is almost the same as in the original application.
- stats printing is moved to an rte_alarm callback so that it does not add
overhead to the forwarding loop.
- stats are expanded to show headroom statistics.
- added new parameter '-l' to enable automatic thousands separators.

Comparing the original l2fwd and l2fwd-headroom apps shows what is needed
to properly write your own application with headroom measurements.

New available statistics:
- Total and % of fwd and flush execution time
- management time - overhead of rte_timer + overhead of headroom library
- Idle time and % of time spent waiting for fwd or flush to be ready to
execute.
- per job execution time and period.


Signed-off-by: Pawel Wodkowski 
---
 examples/Makefile|1 +
 examples/l2fwd-headroom/Makefile |   51 ++
 examples/l2fwd-headroom/main.c   | 1040 ++
 mk/rte.app.mk|4 +
 4 files changed, 1096 insertions(+)
 create mode 100644 examples/l2fwd-headroom/Makefile
 create mode 100644 examples/l2fwd-headroom/main.c

diff --git a/examples/Makefile b/examples/Makefile
index 81f1d2f..8a459b7 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
 DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
 DIRS-y += l2fwd
+DIRS-y += l2fwd-headroom
 DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
 DIRS-y += l3fwd
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-headroom/Makefile
new file mode 100644
index 000..07da286
--- /dev/null
+++ b/examples/l2fwd-headroom/Makefile
@@ -0,0 +1,51 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-headroom
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-headroom/main.c
new file mode 100644
index 000..d7e557d
--- /dev/null
+++ b/examples/l2fwd-headroom/main.c
@@ -0,0 +1,1040 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or p

[dpdk-dev] [PATCH v5 1/3] librte_headroom: New library for checking core/system/app load

2015-02-19 Thread Pawel Wodkowski
This library provides an API to measure time spent in particular parts of
code and to calculate optimal polling time.

To calculate those statistics, application code needs to be divided into
parts (called jobs) that do something. It is up to the application to decide
what is considered a job.

A series of jobs must be surrounded with the rte_headroom_start_loop() and
rte_headroom_finish_loop() calls. After that, jobs might be started.
Each job must be surrounded with rte_headroom_start_job() and
rte_headroom_finish_job() calls.

After a job finishes its execution, the period in which it should be called
again is adjusted to minimize time wasted on unnecessary polls/calls.
Adjustment is based on data provided by the job itself (e.g. number of
packets it processed).
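
The adjustment rule itself is not spelled out in this message. As a rough illustration only, the sketch below assumes a simple halve/double policy driven by the number of packets a job reports against a target. The function name, parameters and policy are all assumptions and are not taken from rte_headroom.c.

```c
#include <stdint.h>

/*
 * Hypothetical period adjustment. If the job processed more than the
 * target, it is polled sooner (period halved). If it processed less,
 * the period is doubled. The result is clamped to [min_period, max_period].
 */
static uint64_t
adjust_period(uint64_t period, uint64_t min_period, uint64_t max_period,
	      unsigned int processed, unsigned int target)
{
	if (processed > target) {
		period /= 2;
	} else if (processed < target) {
		/* Saturate instead of overflowing when doubling. */
		if (period > max_period / 2)
			period = max_period;
		else
			period *= 2;
	}

	if (period < min_period)
		period = min_period;
	if (period > max_period)
		period = max_period;

	return period;
}
```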

After all jobs in the series are executed, the following statistics are updated
and can be used by the application. Statistics can be reset. Some of the
provided statistics:
 - total/min/max execution - time spent executing jobs.
 - total/min/max management - time spent outside the execution area. This
value can be used to measure the overhead of scheduling jobs. This time
also contains the overhead of the headroom library itself.
 - number of loops that executed at least one job
 - executed jobs
 - time when statistics were reset.

Each job provides total/min/max execution time and execution count
statistics.
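
As a rough picture of the per-job bookkeeping described above, here is a self-contained sketch. The struct and function names are hypothetical and do not reproduce the rte_headroom API. Timestamps are passed in by the caller, for example from a cycle counter, so the sketch has no dependency on a particular clock.

```c
#include <stdint.h>

/* Hypothetical per-job statistics: execution count and total/min/max time. */
struct job_stats {
	uint64_t exec_cnt;
	uint64_t total_time;
	uint64_t min_time;
	uint64_t max_time;
	uint64_t start_time;
};

/* Reset all counters; min starts at the largest value so any sample lowers it. */
static void
job_stats_reset(struct job_stats *s)
{
	s->exec_cnt = 0;
	s->total_time = 0;
	s->min_time = UINT64_MAX;
	s->max_time = 0;
	s->start_time = 0;
}

/* Mark the start of one job execution. */
static void
job_start(struct job_stats *s, uint64_t now)
{
	s->start_time = now;
}

/* Mark the end of one job execution and fold its duration into the stats. */
static void
job_finish(struct job_stats *s, uint64_t now)
{
	uint64_t elapsed = now - s->start_time;

	s->exec_cnt++;
	s->total_time += elapsed;
	if (elapsed < s->min_time)
		s->min_time = elapsed;
	if (elapsed > s->max_time)
		s->max_time = elapsed;
}
```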

Signed-off-by: Pawel Wodkowski 
---
 config/common_bsdapp |   5 +
 config/common_linuxapp   |   5 +
 lib/Makefile |   1 +
 lib/librte_headroom/Makefile |  54 +
 lib/librte_headroom/rte_headroom.c   | 271 ++
 lib/librte_headroom/rte_headroom.h   | 324 +++
 lib/librte_headroom/rte_headroom_version.map |  19 ++
 7 files changed, 679 insertions(+)
 create mode 100644 lib/librte_headroom/Makefile
 create mode 100644 lib/librte_headroom/rte_headroom.c
 create mode 100644 lib/librte_headroom/rte_headroom.h
 create mode 100644 lib/librte_headroom/rte_headroom_version.map

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..aa2e5fd 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -282,6 +282,11 @@ CONFIG_RTE_LIBRTE_HASH=y
 CONFIG_RTE_LIBRTE_HASH_DEBUG=n

 #
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
 # Compile librte_lpm
 #
 CONFIG_RTE_LIBRTE_LPM=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..055a37b 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -290,6 +290,11 @@ CONFIG_RTE_LIBRTE_HASH=y
 CONFIG_RTE_LIBRTE_HASH_DEBUG=n

 #
+# Compile librte_headroom
+#
+CONFIG_RTE_LIBRTE_HEADROOM=y
+
+#
 # Compile librte_lpm
 #
 CONFIG_RTE_LIBRTE_LPM=y
diff --git a/lib/Makefile b/lib/Makefile
index d617d81..4fc2819 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
+DIRS-$(CONFIG_RTE_LIBRTE_HEADROOM) += librte_headroom
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
diff --git a/lib/librte_headroom/Makefile b/lib/librte_headroom/Makefile
new file mode 100644
index 000..faefb3b
--- /dev/null
+++ b/lib/librte_headroom/Makefile
@@ -0,0 +1,54 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIA

[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-19 Thread Pawel Wodkowski
Hi community,
I would like to introduce a library for measuring the load of arbitrary jobs. It
can be used to profile any kind of job set on any execution unit or
tasking library.

In the provided l2fwd-headroom example I demonstrate how to use this library to
select the optimal rx burst poll time. Jobs are selected using existing rte_timer
library calls. This example does not limit the possible schemes in which this
library can be used.

PATCH v5 changes:
 - Fix spelling and checkpatch.pl errors.
 - Add maintainer claim for library and example app.

PATCH v4 changes:
 - use proper branch for generating patch.

PATCH v3 changes:
 - Fix spelling.

PATCH v2 changes:
 - Remove jobs management/callback from library to not duplicate tasking library
   behaviour.
 - Clean up/remove useless statistics.
 - Rework example application to use rte_timer library for jobs selection.
 - Introduce new app parameter '-l' for automatic thousands separating in stats.
 - More readable statistics format.



Pawel Wodkowski (3):
  librte_headroom: New library for checking core/system/app load
  examples: introduce new l2fwd-headroom example
  MAINTAINERS: claim responsibility for headroom library and example app

 MAINTAINERS  |4 +
 config/common_bsdapp |5 +
 config/common_linuxapp   |5 +
 examples/Makefile|1 +
 examples/l2fwd-headroom/Makefile |   51 ++
 examples/l2fwd-headroom/main.c   | 1040 ++
 lib/Makefile |1 +
 lib/librte_headroom/Makefile |   54 ++
 lib/librte_headroom/rte_headroom.c   |  271 +++
 lib/librte_headroom/rte_headroom.h   |  324 
 lib/librte_headroom/rte_headroom_version.map |   19 +
 mk/rte.app.mk|4 +
 12 files changed, 1779 insertions(+)
 create mode 100644 examples/l2fwd-headroom/Makefile
 create mode 100644 examples/l2fwd-headroom/main.c
 create mode 100644 lib/librte_headroom/Makefile
 create mode 100644 lib/librte_headroom/rte_headroom.c
 create mode 100644 lib/librte_headroom/rte_headroom.h
 create mode 100644 lib/librte_headroom/rte_headroom_version.map

-- 
1.9.1



[dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Zhou, Danny


> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Thursday, February 19, 2015 9:10 PM
> To: Zhou, Danny
> Cc: Gonzalez Monroy, Sergio; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> enable/disable functions
> 
> On Thu, Feb 19, 2015 at 08:34:22AM +, Zhou, Danny wrote:
> >
> >
> > > -Original Message-
> > > From: Gonzalez Monroy, Sergio
> > > Sent: Thursday, February 19, 2015 4:22 PM
> > > To: Zhou, Danny; Neil Horman
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> > > enable/disable functions
> > >
> > > On 19/02/2015 08:06, Zhou, Danny wrote:
> > > >
> > > >> -Original Message-
> > > >> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > >> Sent: Tuesday, February 17, 2015 11:53 PM
> > > >> To: Zhou, Danny
> > > >> Cc: dev at dpdk.org
> > > >> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> > > >> enable/disable functions
> > > >>
> > > >> On Tue, Feb 17, 2015 at 09:47:15PM +0800, Zhou Danny wrote:
> > > >>> v3 changes
> > > >>> - Add return value for interrupt enable/disable functions
> > > >>>
> > > >>> Add two dev_ops functions to enable and disable rx queue interrupts
> > > >>>
> > > >>> Signed-off-by: Danny Zhou 
> > > >>> Tested-by: Yong Liu 
> > > >>> ---
> > > >>>   lib/librte_ether/rte_ethdev.c | 43 
> > > >>>   lib/librte_ether/rte_ethdev.h | 57 
> > > >>> +++
> > > >>>   2 files changed, 100 insertions(+)
> > > >>>
> > > >>> diff --git a/lib/librte_ether/rte_ethdev.c 
> > > >>> b/lib/librte_ether/rte_ethdev.c
> > > >>> index ea3a1fb..d27469a 100644
> > > >>> --- a/lib/librte_ether/rte_ethdev.c
> > > >>> +++ b/lib/librte_ether/rte_ethdev.c
> > > >>> @@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct 
> > > >>> rte_eth_dev *dev,
> > > >>>   }
> > > >>>   rte_spinlock_unlock(&rte_eth_dev_cb_lock);
> > > >>>   }
> > > >>> +
> > > >>> +int
> > > >>> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> > > >>> + uint16_t queue_id)
> > > >>> +{
> > > >>> + struct rte_eth_dev *dev;
> > > >>> +
> > > >>> + if (port_id >= nb_ports) {
> > > >>> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > >>> + return (-ENODEV);
> > > >>> + }
> > > >>> +
> > > >>> + dev = &rte_eth_devices[port_id];
> > > >>> + if (dev == NULL) {
> > > >>> + PMD_DEBUG_TRACE("Invalid port device\n");
> > > >>> + return (-ENODEV);
> > > >>> + }
> > > >>> +
> > > >>> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, 
> > > >>> -ENOTSUP);
> > > >>> + return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
> > > >>> +}
> > > >>> +
> > > >>> +int
> > > >>> +rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
> > > >>> + uint16_t queue_id)
> > > >>> +{
> > > >>> + struct rte_eth_dev *dev;
> > > >>> +
> > > >>> + if (port_id >= nb_ports) {
> > > >>> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > >>> + return (-ENODEV);
> > > >>> + }
> > > >>> +
> > > >>> + dev = &rte_eth_devices[port_id];
> > > >>> + if (dev == NULL) {
> > > >>> + PMD_DEBUG_TRACE("Invalid port device\n");
> > > >>> + return (-ENODEV);
> > > >>> + }
> > > >>> +
> > > >>> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, 
> > > >>> -ENOTSUP);
> > > >>> + return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
> > > >>> +}
> > > >>> +
> > > >>>   #ifdef RTE_NIC_BYPASS
> > > >>>   int rte_eth_dev_bypass_init(uint8_t port_id)
> > > >>>   {
> > > >>> diff --git a/lib/librte_ether/rte_ethdev.h 
> > > >>> b/lib/librte_ether/rte_ethdev.h
> > > >>> index 84160c3..0f320a9 100644
> > > >>> --- a/lib/librte_ether/rte_ethdev.h
> > > >>> +++ b/lib/librte_ether/rte_ethdev.h
> > > >>> @@ -848,6 +848,8 @@ struct rte_eth_fdir {
> > > >>>   struct rte_intr_conf {
> > > >>>   /** enable/disable lsc interrupt. 0 (default) - disable, 1 
> > > >>> enable */
> > > >>>   uint16_t lsc;
> > > >>> + /** enable/disable rxq interrupt. 0 (default) - disable, 1 
> > > >>> enable */
> > > >>> + uint16_t rxq;
> > > >>>   };
> > > >>>
> > > >>>   /**
> > > >>> @@ -1109,6 +,14 @@ typedef int (*eth_tx_queue_setup_t)(struct 
> > > >>> rte_eth_dev *dev,
> > > >>>   const struct rte_eth_txconf 
> > > >>> *tx_conf);
> > > >>>   /**< @internal Setup a transmit queue of an Ethernet device. */
> > > >>>
> > > >>> +typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
> > > >>> + uint16_t rx_queue_id);
> > > >>> +/**< @internal Enable interrupt of a receive queue of an Ethernet 
> > > >>> device. */
> > > >>> +
> > > >>> +typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
> > > >>> +  

[dpdk-dev] [PATCH v9 13/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-19 Thread Thomas Monjalon
2015-02-19 11:49, Tetsuya Mukawa:
> +/* attach the new virtual device, then store port_id of the device */
> +static int
> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
> +{
> + char *args;
> + uint8_t new_port_id;
> + struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
> +
> + if ((vdevargs == NULL) || (port_id == NULL))
> + goto err0;
> +
> + args = strdup(vdevargs);
> + if (args == NULL)
> + goto err0;
> +
> + /* save current port status */
> + if (rte_eth_dev_save(devs, sizeof(devs)))
> + goto err1;
> + /* add the vdevargs to devargs_list */
> + if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
> + goto err1;

Could you explain why you store devargs in a list?

> + /* parse vdevargs, then retrieve device name */
> + get_vdev_name(args);
> + /* walk around dev_driver_list to find the driver of the device,
> +  * then invoke probe function of the driver */
> + if (rte_eal_vdev_find_and_init(args))

TODO: get port_id from init.

> + goto err2;
> + /* get port_id enabled by above procedures */
> + if (rte_eth_dev_get_changed_port(devs, &new_port_id))
> + goto err2;

[...]
> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -47,6 +47,7 @@ extern "C" {
>  #endif
>  
>  #include 
> +#include 
>  
>  /** Double linked list of device drivers. */
>  TAILQ_HEAD(rte_driver_list, rte_driver);
> @@ -57,6 +58,11 @@ TAILQ_HEAD(rte_driver_list, rte_driver);
>  typedef int (rte_dev_init_t)(const char *name, const char *args);
>  
>  /**
> + * Uninitialization function called for each device driver once.
> + */
> +typedef int (rte_dev_uninit_t)(const char *name);

Why use the name as a parameter and not port_id?

[...]
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -45,6 +45,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
>  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
>  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
>  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf

Why do you need mbuf?

[...]
> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> @@ -21,6 +21,8 @@ DPDK_2.0 {
>   rte_eal_alarm_cancel;
>   rte_eal_alarm_set;
>   rte_eal_dev_init;
> + rte_eal_dev_attach;
> + rte_eal_dev_detach;

Please keep alphabetical order.



[dpdk-dev] [PATCH] i40e: fix build with gcc 5

2015-02-19 Thread Panu Matilainen
On 02/19/2015 01:05 PM, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>> Sent: Thursday, February 19, 2015 10:25 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] i40e: fix build with gcc 5
>>
>> Eliminate ambiguity in the condition which trips up a "logical not
>> is only applied to the left..." warning from gcc 5, causing build
>> failure with -Werror.
>>
>> Signed-off-by: Panu Matilainen 
>> ---
>>   lib/librte_pmd_i40e/i40e_rxtx.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c 
>> b/lib/librte_pmd_i40e/i40e_rxtx.c
>> index c9f1026..ede5405 100644
>> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
>> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
>> @@ -613,7 +613,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused 
>> struct i40e_rx_queue *rxq)
>>   "rxq->nb_rx_desc=%d",
>>   rxq->rx_free_thresh, rxq->nb_rx_desc);
>>  ret = -EINVAL;
>> -} else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh) == 0) {
>> +} else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh == 0)) {
>
> Why not just:
> else if (rxq->nb_rx_desc % rxq->rx_free_thresh != 0)
> ?

The same occurred to me right after hitting send, it'll make it a whole 
lot more obvious. I'll send another version.

- Panu -
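
For readers following the precedence issue: unary `!` binds tighter than `==`,
so the original condition is evaluated as `(!(a % b)) == 0`. The sketch below
(function names are made up for illustration) shows that this is the same test
as the simpler `a % b != 0` suggested in the review. Both assume a non-zero
divisor.

```c
#include <stdbool.h>

/* Original form spelled out: `!` applies first, then the comparison. */
static bool
not_multiple_original(unsigned int n, unsigned int t)
{
	return (!(n % t)) == 0;
}

/* Form suggested in the review; same truth table, easier to read. */
static bool
not_multiple_simple(unsigned int n, unsigned int t)
{
	return n % t != 0;
}
```

With n = 128 and t = 32 both return false; with n = 100 and t = 32 both return
true.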



[dpdk-dev] [PATCH] i40e: fix build with gcc 5

2015-02-19 Thread Panu Matilainen
Eliminate ambiguity in the condition which trips up a "logical not
is only applied to the left..." warning from gcc 5, causing build
failure with -Werror.

Signed-off-by: Panu Matilainen 
---
 lib/librte_pmd_i40e/i40e_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index c9f1026..ede5405 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -613,7 +613,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct 
i40e_rx_queue *rxq)
 "rxq->nb_rx_desc=%d",
 rxq->rx_free_thresh, rxq->nb_rx_desc);
ret = -EINVAL;
-   } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh) == 0) {
+   } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh == 0)) {
PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
 "rxq->nb_rx_desc=%d, "
 "rxq->rx_free_thresh=%d",
-- 
2.1.0



[dpdk-dev] [PATCH] ixgbe: fix build with gcc 5

2015-02-19 Thread Panu Matilainen
Add extra parentheses to remove ambiguity on what we want to compare,
otherwise gcc 5 issues a "logical not is only applied to the left hand
side of comparison" warning which with -Werror fails the build.

Signed-off-by: Panu Matilainen 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
index 37e5bae..93a6a00 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
@@ -2898,8 +2898,8 @@ STATIC s32 ixgbe_fc_autoneg_fiber(struct ixgbe_hw *hw)
 */

linkstat = IXGBE_READ_REG(hw, IXGBE_PCS1GLSTA);
-   if ((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE) == 0) ||
-   (!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT) == 1)) {
+   if (((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE)) == 0) ||
+   ((!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT)) == 1)) {
ERROR_REPORT1(IXGBE_ERROR_POLLING,
 "Auto-Negotiation did not complete or timed out");
goto out;
-- 
2.1.0
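
As a side note on the `!!` idiom in this hunk: `!!x` normalizes any non-zero
value to 1, so comparing it with 0 or 1 is a plain "bit clear" or "bit set"
test. The sketch below uses made-up bit positions (AN_COMPLETE and
AN_TIMED_OUT are stand-ins, not the real IXGBE_PCS1GLSTA_* values) to show
that the parenthesized condition is equivalent to a form without any
comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define AN_COMPLETE  (1u << 16)
#define AN_TIMED_OUT (1u << 18)

/* Same shape as the patched condition, with explicit parentheses. */
static bool
autoneg_failed(uint32_t linkstat)
{
	return ((!!(linkstat & AN_COMPLETE)) == 0) ||
	       ((!!(linkstat & AN_TIMED_OUT)) == 1);
}

/* Equivalent test written without the normalize-and-compare step. */
static bool
autoneg_failed_plain(uint32_t linkstat)
{
	return !(linkstat & AN_COMPLETE) || (linkstat & AN_TIMED_OUT);
}
```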



[dpdk-dev] [PATCH v9 08/14] ethdev: Add functions that will be used by port hotplug functions

2015-02-19 Thread Thomas Monjalon
2015-02-19 11:49, Tetsuya Mukawa:
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -109,6 +109,13 @@ DPDK_2.0 {
>   rte_eth_tx_queue_setup;
>   rte_eth_xstats_get;
>   rte_eth_xstats_reset;
> + rte_eth_dev_allocated;
> + rte_eth_dev_is_detachable;
> + rte_eth_dev_get_name_by_port;
> + rte_eth_dev_get_addr_by_port;
> + rte_eth_dev_get_port_by_addr;
> + rte_eth_dev_get_changed_port;
> + rte_eth_dev_save;

In this file, alphabetical order is preferred.



[dpdk-dev] [PATCH v9 02/14] eal_pci: Add flag to hold kernel driver type

2015-02-19 Thread Thomas Monjalon
> @@ -152,6 +159,7 @@ struct rte_pci_device {
>   uint16_t max_vfs;   /**< sriov enable if not zero */
>   int numa_node;  /**< NUMA node connection */
>   struct rte_devargs *devargs;/**< Device user arguments */
> + enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
[...]
> +static int
> +pci_get_kernel_driver_by_path(const char *filename, char *dri_name)

I think "kernel driver" is a good name. Why not use this name in the
pci_device struct, to be more consistent?

Thanks


[dpdk-dev] [PATCH] ixgbe: fix build with gcc 5

2015-02-19 Thread Ananyev, Konstantin
Hi Panu,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> Sent: Thursday, February 19, 2015 10:25 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] ixgbe: fix build with gcc 5
> 
> Add extra parentheses to remove ambiguity on what we want to compare,
> otherwise gcc 5 issues a "logical not is only applied to the left hand
> side of comparison" warning which with -Werror fails the build.
> 
> Signed-off-by: Panu Matilainen 
> ---
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
> b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> index 37e5bae..93a6a00 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> @@ -2898,8 +2898,8 @@ STATIC s32 ixgbe_fc_autoneg_fiber(struct ixgbe_hw *hw)
>*/
> 
>   linkstat = IXGBE_READ_REG(hw, IXGBE_PCS1GLSTA);
> - if ((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE) == 0) ||
> - (!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT) == 1)) {
> + if (((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE)) == 0) ||
> + ((!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT)) == 1)) {
>   ERROR_REPORT1(IXGBE_ERROR_POLLING,
>"Auto-Negotiation did not complete or timed out");
>   goto out;

Unfortunately we are not supposed to change files under the ixgbe subdirectory
(except ixgbe_osdep.*).
Usually we deal with it just by:
If GCC_VERSION...
CFLAGS_ixgbe_common.o += -Wno...

You can have a look at lib/librte_pmd_ixgbe/Makefile, there are plenty of such 
things.
Konstantin


> --
> 2.1.0



[dpdk-dev] [PATCH v9] testpmd: Add port hotplug support

2015-02-19 Thread Tetsuya Mukawa
The patch introduces following commands.
- port attach [ident]
- port detach [port_id]
 - attach: attaching a port
 - detach: detaching a port
 - ident: pci address of physical device.
  Or device name and parameters of virtual device.
 (ex. :02:00.0, eth_pcap0,iface=eth0)
 - port_id: port identifier

v7:
- Fix doc.
  (Thanks to Iremonger, Bernard)
- Fix port checking implementation of start_port();
  (Thanks to Qiu, Michael)
v5:
- Add testpmd documentation.
  (Thanks to Iremonger, Bernard)
v4:
 - Fix strings of command help.

Signed-off-by: Tetsuya Mukawa 
---
 app/test-pmd/cmdline.c  | 137 +++
 app/test-pmd/config.c   | 116 +---
 app/test-pmd/parameters.c   |  22 ++-
 app/test-pmd/testpmd.c  | 199 +---
 app/test-pmd/testpmd.h  |  18 ++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
 6 files changed, 416 insertions(+), 133 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4753bb4..fa6e3a6 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -579,6 +579,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"port close (port_id|all)\n"
"Close all ports or port_id.\n\n"

+   "port attach (ident)\n"
+   "Attach physical or virtual dev by pci address or 
virtual device name\n\n"
+
+   "port detach (port_id)\n"
+   "Detach physical or virtual dev by port_id\n\n"
+
"port config (port_id|all)"
" speed (10|100|1000|1|4|auto)"
" duplex (half|full|auto)\n"
@@ -870,6 +876,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
},
 };

+/* *** attach a specified port *** */
+struct cmd_operate_attach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   cmdline_fixed_string_t identifier;
+};
+
+static void cmd_operate_attach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_attach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "attach"))
+   attach_port(res->identifier);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_attach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   keyword, "attach");
+cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   identifier, NULL);
+
+cmdline_parse_inst_t cmd_operate_attach_port = {
+   .f = cmd_operate_attach_port_parsed,
+   .data = NULL,
+   .help_str = "port attach identifier, "
+   "identifier: pci address or virtual dev name",
+   .tokens = {
+   (void *)&cmd_operate_attach_port_port,
+   (void *)&cmd_operate_attach_port_keyword,
+   (void *)&cmd_operate_attach_port_identifier,
+   NULL,
+   },
+};
+
+/* *** detach a specified port *** */
+struct cmd_operate_detach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   uint8_t port_id;
+};
+
+static void cmd_operate_detach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_detach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "detach"))
+   detach_port(res->port_id);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_detach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   keyword, "detach");
+cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_operate_detach_port = {
+   .f = cmd_operate_detach_port_parsed,
+   .data = NULL,
+   .help_str = "port detach port_id",
+   .tokens = {
+   (void *)&cmd_operate_detach_port_port,
+   (void *)&cmd_operate_detach_port_keyword,
+   

[dpdk-dev] [PATCH v9] librte_pmd_pcap: Add port hotplug support

2015-02-19 Thread Tetsuya Mukawa
This patch adds finalization code to free resources allocated by the
PMD.

v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c 
b/lib/librte_pmd_pcap/rte_eth_pcap.c
index af7fae8..5e94930 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -498,6 +498,13 @@ static struct eth_dev_ops ops = {
.stats_reset = eth_stats_reset,
 };

+static struct eth_driver rte_pcap_pmd = {
+   .pci_drv = {
+   .name = "rte_pcap_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 /*
  * Function handler that opens the pcap file for reading a stores a
  * reference of it for use it later on.
@@ -713,6 +720,10 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
if (*eth_dev == NULL)
goto error;

+   /* check length of device name */
+   if ((strlen((*eth_dev)->data->name) + 1) > sizeof(data->name))
+   goto error;
+
/* now put it all together
 * - store queue data in internals,
 * - store numa_node info in pci_driver
@@ -739,10 +750,13 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name,
+   (*eth_dev)->data->name, strlen((*eth_dev)->data->name));

(*eth_dev)->data = data;
(*eth_dev)->dev_ops = &ops;
(*eth_dev)->pci_dev = pci_dev;
+   (*eth_dev)->driver = &rte_pcap_pmd;

return 0;

@@ -927,10 +941,36 @@ rte_pmd_pcap_devinit(const char *name, const char *params)

 }

+static int
+rte_pmd_pcap_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(INFO, PMD, "Closing pcap ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   if (name == NULL)
+   return -1;
+
+   /* reserve an ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_pcap_drv = {
.name = "eth_pcap",
.type = PMD_VDEV,
.init = rte_pmd_pcap_devinit,
+   .uninit = rte_pmd_pcap_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_pcap_drv);
-- 
1.9.1



[dpdk-dev] [PATCH v9 14/14] doc: Add port hotplug framework section to programmers guide

2015-02-19 Thread Tetsuya Mukawa
This patch adds a new section for describing port hotplug framework.

Signed-off-by: Tetsuya Mukawa 
---
 doc/guides/prog_guide/index.rst  |   1 +
 doc/guides/prog_guide/port_hotplug_framework.rst | 110 +++
 2 files changed, 111 insertions(+)
 create mode 100644 doc/guides/prog_guide/port_hotplug_framework.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8d86dd4..428b76b 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -70,6 +70,7 @@ Programmer's Guide
 packet_classif_access_ctrl
 packet_framework
 vhost_lib
+port_hotplug_framework
 source_org
 dev_kit_build_system
 dev_kit_root_make_help
diff --git a/doc/guides/prog_guide/port_hotplug_framework.rst 
b/doc/guides/prog_guide/port_hotplug_framework.rst
new file mode 100644
index 000..355ae28
--- /dev/null
+++ b/doc/guides/prog_guide/port_hotplug_framework.rst
@@ -0,0 +1,110 @@
+..  BSD LICENSE
+Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of IGEL Co.,Ltd. nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Port Hotplug Framework
+======================
+
+The Port Hotplug Framework provides DPDK applications with the ability to
+attach and detach ports at runtime. Because the framework depends on PMD
+implementation, the ports that PMDs cannot handle are out of scope of this
+framework. Furthermore, after detaching a port from a DPDK application, the
+framework doesn't provide a way for removing the devices from the system.
+For the ports backed by a physical NIC, the kernel will need to support PCI
+Hotplug feature.
+
+Overview
+--------
+
+The basic requirements of the Port Hotplug Framework are:
+
+*   DPDK applications that use the Port Hotplug Framework must manage their
+own ports.
+
+The Port Hotplug Framework is implemented to allow DPDK applications to
+manage ports. For example, when DPDK applications call the port attach
+function, the attached port number is returned. DPDK applications can
+also detach the port by port number.
+
+*   Kernel support is needed for attaching or detaching physical device
+ports.
+
+To attach new physical device ports, the device must first be recognized by
+a userspace driver I/O framework in the kernel. Then DPDK
+applications can call the Port Hotplug functions to attach the ports.
+For detaching, the steps are reversed.
+
+*   Before detaching, ports must be stopped and closed.
+
+DPDK applications must call "rte_eth_dev_stop()" and
+"rte_eth_dev_close()" APIs before detaching ports. These functions will
+start finalization sequence of the PMDs.
+
+*   The framework doesn't affect legacy DPDK applications behavior.
+
+If the Port Hotplug functions aren't called, all legacy DPDK apps can
+still work without modifications.
+
+Port Hotplug API overview
+-------------------------
+
+*   Attaching a port
+
+"rte_eal_dev_attach()" API attaches a port to DPDK application, and
+returns the attached port number. Before calling the API, the device
+should be recognized by a userspace driver I/O framework. The API
+receives a pci address like ":01:00.0" or a virtual device name
+like "eth_pcap0,iface=eth0". In the case of virtual device name, the
+format is the same as the general "--vdev" option of DPDK.
+
+*   Detac

[dpdk-dev] [PATCH v9 13/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-19 Thread Tetsuya Mukawa
These functions are used for attaching or detaching a port.
When rte_eal_dev_attach() is called, the function tries to parse the
device name as a PCI address. If parsing succeeds,
rte_eal_dev_attach() attaches a physical device port. If not, it attaches
a virtual device port.
When rte_eal_dev_detach() is called, the function gets the device type
of the port to determine whether the port comes from a physical or a
virtual device, and then calls the corresponding detach function.
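
The attach-time decision described above can be sketched as follows. This is
only an illustration of the "try PCI first, otherwise treat as vdev" rule, not
the code in this patch: classify_ident() and the ident_kind enum are
hypothetical names, and sscanf() is used here as a simple stand-in for the
EAL's own PCI address parsing.

```c
#include <stdio.h>

/* Hypothetical result type, for illustration only. */
enum ident_kind { IDENT_INVALID, IDENT_PCI, IDENT_VDEV };

/*
 * Classify an attach identifier.
 * "domain:bus:devid.function" with nothing after it counts as PCI;
 * any other non-NULL string is treated as a virtual device name
 * with optional arguments.
 */
static enum ident_kind
classify_ident(const char *ident)
{
	unsigned int domain, bus, devid, func;
	char trailing;

	if (ident == NULL)
		return IDENT_INVALID;

	/* Exactly four hex fields converted, and no trailing character. */
	if (sscanf(ident, "%x:%x:%x.%x%c",
		   &domain, &bus, &devid, &func, &trailing) == 4)
		return IDENT_PCI;

	return IDENT_VDEV;
}
```

For example, classify_ident("0000:02:00.0") yields IDENT_PCI, while
classify_ident("eth_pcap0,iface=eth0") yields IDENT_VDEV. A real
implementation would also range-check the parsed fields.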

v9:
- Fix comments.
- Use strcmp() instead of strncmp().
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
- Change definition of rte_dev_uninit_t.
  (Thanks to Thomas Monjalon and Maxime Leroy)
v8:
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Fix typo of warning messages.
  (Thanks to Qiu, Michael)
v5:
- Change function names like below.
  rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
  rte_eal_dev_invoke() to rte_eal_vdev_invoke().
- Add code to handle a return value of rte_eal_devargs_remove().
- Fix pci address format in rte_eal_dev_detach().
v4:
- Fix comment.
- Add error checking.
- Fix indent of 'if' statement.
- Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_dev.c  | 307 
 lib/librte_eal/common/eal_private.h |  11 +
 lib/librte_eal/common/include/rte_dev.h |  33 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   6 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 6 files changed, 357 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index eae5656..4976bb9 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -32,10 +32,13 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+#include 
 #include 
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -107,3 +110,307 @@ rte_eal_dev_init(void)
}
return 0;
 }
+
+/* So far, DPDK hotplug function only supports linux */
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static int
+rte_eal_vdev_find_and_init(const char *name)
+{
+   struct rte_devargs *devargs;
+   struct rte_driver *driver;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   /* find the specified device and call init function */
+   TAILQ_FOREACH(devargs, &devargs_list, next) {
+
+   if (devargs->type != RTE_DEVTYPE_VIRTUAL)
+   continue;
+
+   if (strcmp(name, devargs->virtual.drv_name))
+   continue;
+
+   TAILQ_FOREACH(driver, &dev_driver_list, next) {
+   if (driver->type != PMD_VDEV)
+   continue;
+
+   /*
+* search a driver prefix in virtual device name.
+* For example, if the driver is pcap PMD, driver->name
+* will be "eth_pcap", but devargs->virtual.drv_name
+* will be "eth_pcapN". So use strncmp to compare.
+*/
+   if (!strncmp(driver->name, devargs->virtual.drv_name,
+   strlen(driver->name))) {
+   driver->init(devargs->virtual.drv_name,
+   devargs->args);
+   break;
+   }
+   }
+
+   if (driver == NULL) {
+   RTE_LOG(WARNING, EAL, "no driver found for %s\n",
+ devargs->virtual.drv_name);
+   }
+   return 0;
+   }
+   return 1;
+}
+
+static int
+rte_eal_vdev_find_and_uninit(const char *name)
+{
+   struct rte_devargs *devargs;
+   struct rte_driver *driver;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   /* find the specified device and call uninit function */
+   TAILQ_FOREACH(devargs, &devargs_list, next) {
+
+   if (devargs->type != RTE_DEVTYPE_VIRTUAL)
+   continue;
+
+   if (strcmp(name, devargs->virtual.drv_name))
+   continue;
+
+   TAILQ_FOREACH(driver, &dev_driver_list, next) {
+   if (driver->type != PMD_VDEV)
+   continue;
+
+   /*
+* search a driver prefix in virtual device name.
+* For example, if the driver is pcap PMD, driver->name
+* will be "eth_pcap", but devargs->virtual.drv_name
+* will be "eth_pcapN". So use strncmp to compare.
+*/
+   if (!strncmp(driver->name, devargs->virtual.drv_name,
+   strl

[dpdk-dev] [PATCH v9 12/14] ethdev: Add one dev_type parameter to rte_eth_dev_allocate

2015-02-19 Thread Tetsuya Mukawa
This new parameter is needed to record the device type, such as PCI or
virtual. The port detaching process differs between PCI devices and virtual
devices.
RTE_ETH_DEV_PCI indicates that the device type is PCI. RTE_ETH_DEV_VIRTUAL
indicates that the device is virtual.
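
To illustrate why the type has to be stored per port, here is a minimal sketch
of a detach routine that dispatches on it. The enum, struct and function names
are local stand-ins chosen for this example (they only mirror the shape of
rte_eth_dev_type), and the two detach helpers are hypothetical placeholders
for the PCI and virtual detach paths.

```c
#include <stddef.h>

/* Local stand-ins mirroring the shape of the enum added by this patch. */
enum dev_type { DEV_UNKNOWN, DEV_PCI, DEV_VIRTUAL };

struct port {
	enum dev_type type;
};

/* Hypothetical detach paths; each returns 0 on success. */
static int
detach_pci(struct port *p)
{
	p->type = DEV_UNKNOWN;  /* mirrors resetting dev_type on release */
	return 0;
}

static int
detach_vdev(struct port *p)
{
	p->type = DEV_UNKNOWN;
	return 0;
}

/* Pick the detach path from the type recorded at allocation time. */
static int
detach_port(struct port *p)
{
	if (p == NULL)
		return -1;

	switch (p->type) {
	case DEV_PCI:
		return detach_pci(p);
	case DEV_VIRTUAL:
		return detach_vdev(p);
	default:
		return -1;  /* unknown or already released */
	}
}
```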

v9:
- Fix commit log.
- RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is replaced by NO_TRACE.
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v4:
- Fix comments of rte_eth_dev_type.

Signed-off-by: Tetsuya Mukawa 
---
 app/test/virtual_pmd.c   |  2 +-
 lib/librte_ether/rte_ethdev.c| 25 +++--
 lib/librte_ether/rte_ethdev.h| 25 -
 lib/librte_ether/rte_ether_version.map   |  1 +
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c   |  2 +-
 lib/librte_pmd_pcap/rte_eth_pcap.c   |  2 +-
 lib/librte_pmd_ring/rte_eth_ring.c   |  2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |  2 +-
 9 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 9fac95d..c02644a 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -556,7 +556,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
*mac_addr,
goto err;

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name);
+   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
goto err;

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 3b64f3a..201c04a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -227,7 +227,7 @@ rte_eth_dev_find_free_port(void)
 }

 struct rte_eth_dev *
-rte_eth_dev_allocate(const char *name)
+rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)
 {
uint8_t port_id;
struct rte_eth_dev *eth_dev;
@@ -251,6 +251,7 @@ rte_eth_dev_allocate(const char *name)
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->attached = DEV_ATTACHED;
+   eth_dev->dev_type = type;
nb_ports++;
return eth_dev;
 }
@@ -262,6 +263,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return -EINVAL;

eth_dev->attached = 0;
+   eth_dev->dev_type = RTE_ETH_DEV_UNKNOWN;
nb_ports--;
return 0;
 }
@@ -299,7 +301,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
/* Create unique Ethernet device name using PCI address */
rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);

-   eth_dev = rte_eth_dev_allocate(ethdev_name);
+   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
return -ENOMEM;

@@ -424,6 +426,14 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+enum rte_eth_dev_type
+rte_eth_dev_get_device_type(uint8_t port_id)
+{
+   if (!rte_eth_dev_is_valid_port(port_id))
+   return -1;
+   return rte_eth_devices[port_id].dev_type;
+}
+
 int
 rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
 {
@@ -521,6 +531,17 @@ rte_eth_dev_is_detachable(uint8_t port_id)
return -EINVAL;
}

+   if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
+   switch (rte_eth_devices[port_id].pci_dev->pt_driver) {
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   break;
+   case RTE_PT_VFIO:
+   default:
+   return -ENOTSUP;
+   }
+   }
+
drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
 }
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 90b7f25..4f66bb6 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1523,6 +1523,17 @@ struct eth_dev_ops {
 };

 /**
+ * The eth device type
+ */
+enum rte_eth_dev_type {
+   RTE_ETH_DEV_UNKNOWN,/**< unknown device type */
+   RTE_ETH_DEV_PCI,
+   /**< Physical function and Virtual function of PCI devices */
+   RTE_ETH_DEV_VIRTUAL,/**< non hardware device */
+   RTE_ETH_DEV_MAX /**< max value of this enum */
+};
+
+/**
  * @internal
  * The generic data structure associated with each ethernet device.
  *
@@ -1541,6 +1552,7 @@ struct rte_eth_dev {
struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
struct rte_eth_dev_cb_list callbacks; /**< User application callbacks */
uint8_t attached; /**< Flag indicating the port is attached */
+   enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */
 };

 struct rte_eth_dev_sriov {
@@ -1618,6 +1630,15 @@ ext

[dpdk-dev] [PATCH v9 11/14] eal/pci: Add probe and close functions of pci driver

2015-02-19 Thread Tetsuya Mukawa
- Add pci_close_all_drivers()
  The function tries to find a driver for the specified device, and
  then closes the driver.
- Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
  The functions are used to probe and close a device.
  Each function first tries to find a device that has the specified
  PCI address, and then probes or closes that device.
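
The lookup-then-close flow described above can be sketched as follows. This
is only an illustration under simplifying assumptions: the types and names
(pci_device, close_fn, close_one, the toy drivers) are hypothetical stand-ins,
and arrays replace the TAILQ lists used by the EAL. It keeps the return
convention of the patch: a negative value is an error, a positive value means
"this driver does not handle the device", and 0 means the device was closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EAL types; not the real DPDK definitions. */
struct pci_addr { uint16_t domain; uint8_t bus, devid, function; };
struct pci_device { struct pci_addr addr; int closed; };

/* Close hook of a driver: <0 error, >0 "not my device", 0 device closed. */
typedef int (*close_fn)(struct pci_device *dev);

static int same_addr(const struct pci_addr *a, const struct pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->devid == b->devid && a->function == b->function;
}

/* Offer the device to each driver in turn until one of them claims it. */
static int close_all_drivers(struct pci_device *dev, close_fn *drivers, size_t n)
{
	size_t i;

	if (dev == NULL)
		return -1;
	for (i = 0; i < n; i++) {
		int rc = drivers[i](dev);

		if (rc < 0)
			return -1;	/* negative value is an error */
		if (rc > 0)
			continue;	/* this driver does not match */
		return 0;
	}
	return 1;			/* no driver found */
}

/* Find the device with the given address, then close it. */
static int close_one(struct pci_device *devs, size_t ndevs,
		     const struct pci_addr *addr, close_fn *drivers, size_t ndrv)
{
	size_t i;

	if (addr == NULL)
		return -1;
	for (i = 0; i < ndevs; i++) {
		if (!same_addr(&devs[i].addr, addr))
			continue;
		return close_all_drivers(&devs[i], drivers, ndrv) < 0 ? -1 : 0;
	}
	return -1;			/* no device at that address */
}

/* Two toy drivers used to exercise the flow. */
static int drv_other(struct pci_device *dev) { (void)dev; return 1; }
static int drv_match(struct pci_device *dev) { dev->closed = 1; return 0; }
```

As in the patch, a device that no driver claims is not treated as an error
by close_one(); only a failing driver is.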

v9:
- Fix commit title.
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
  (Thanks to Thomas Monjalon)
- Implement pci_unmap_device() in this patch.
v5:
- Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
v4:
- Fix parameter checking.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_pci.c  | 98 -
 lib/librte_eal/common/eal_private.h | 15 +
 lib/librte_eal/common/include/rte_pci.h | 32 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 94 +++
 4 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index bf2793f..5b6b55d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -108,7 +108,10 @@ static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
 {
struct rte_pci_driver *dr = NULL;
-   int rc;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;

TAILQ_FOREACH(dr, &pci_driver_list, next) {
rc = rte_eal_pci_probe_one_driver(dr, dev);
@@ -123,6 +126,99 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
return 1;
 }

+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/*
+ * If vendor/device ID match, call the devuninit() function of all
+ * registered driver for the given device. Return -1 if initialization
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+pci_close_all_drivers(struct rte_pci_device *dev)
+{
+   struct rte_pci_driver *dr = NULL;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dr, &pci_driver_list, next) {
+   rc = rte_eal_pci_close_one_driver(dr, dev);
+   if (rc < 0)
+   /* negative value is an error */
+   return -1;
+   if (rc > 0)
+   /* positive value means driver not found */
+   continue;
+   return 0;
+   }
+   return 1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke probe function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_probe_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_probe_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke close function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_close_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_close_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+
+   TAILQ_REMOVE(&pci_device_list, dev, next);
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 /*
  * Scan the content of the PCI bus, and call the devinit() function for
  * all registered drivers that have a matching entry in its id_table
diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index 159cd66..4acf5a0 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -165,6 +165,21 @@ int rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr,
struct rte_pci_device *dev);

 /**
+ * Munmap memory for single PCI device
+ *
+ * This function is private to EAL.
+ *
+ * @param  dr
+ *  The pointer to the pci driver structure
+ * @param  dev
+ *  The pointer to the pci device structure
+ * @return
+ 

[dpdk-dev] [PATCH v9 10/14] eal/pci: Add a function to remove the entry of devargs list

2015-02-19 Thread Tetsuya Mukawa
The function removes the specified devargs entry from devargs_list.
Also, the patch adds sanity checking to rte_eal_devargs_add().
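
A minimal sketch of the find, add-with-duplicate-check and remove operations
on a TAILQ list is shown below. The names (devarg, devarg_find, devarg_add,
devarg_remove) are hypothetical and the entry holds only a name. Unlike the
patch, which compares virtual device names with memcmp() over
strlen(args) bytes, this sketch uses an exact strcmp() match, and it frees
the entry on removal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, simplified list entry; not the real struct rte_devargs. */
struct devarg {
	TAILQ_ENTRY(devarg) next;
	char name[32];
};
TAILQ_HEAD(devarg_list, devarg);

/* Return the entry with the given name, or NULL if there is none. */
static struct devarg *devarg_find(struct devarg_list *list, const char *name)
{
	struct devarg *d;

	if (name == NULL)
		return NULL;
	TAILQ_FOREACH(d, list, next) {
		if (strcmp(d->name, name) == 0)
			return d;
	}
	return NULL;
}

/* Append a new entry; refuse names that are already registered. */
static int devarg_add(struct devarg_list *list, const char *name)
{
	struct devarg *d;

	if (name == NULL || strlen(name) >= sizeof(d->name))
		return -1;
	if (devarg_find(list, name) != NULL)
		return -1;	/* device already registered */
	d = calloc(1, sizeof(*d));
	if (d == NULL)
		return -1;
	strcpy(d->name, name);	/* length was checked above */
	TAILQ_INSERT_TAIL(list, d, next);
	return 0;
}

/* Unlink and free the entry with the given name. */
static int devarg_remove(struct devarg_list *list, const char *name)
{
	struct devarg *d = devarg_find(list, name);

	if (d == NULL)
		return -1;	/* device not found */
	TAILQ_REMOVE(list, d, next);
	free(d);
	return 0;
}
```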

v5:
- Change function definition of rte_eal_devargs_remove().
v4:
- Fix sanity check code.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_devargs.c  | 61 +
 lib/librte_eal/common/include/rte_devargs.h | 21 ++
 2 files changed, 82 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c 
b/lib/librte_eal/common/eal_common_devargs.c
index 4c7d11a..d71faaa 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -44,6 +44,36 @@
 struct rte_devargs_list devargs_list =
TAILQ_HEAD_INITIALIZER(devargs_list);

+
+/* find a entry specified by pci address or device name */
+static struct rte_devargs *
+rte_eal_devargs_find(enum rte_devtype devtype, void *args)
+{
+   struct rte_devargs *devargs;
+
+   if (args == NULL)
+   return NULL;
+
+   TAILQ_FOREACH(devargs, &devargs_list, next) {
+   switch (devtype) {
+   case RTE_DEVTYPE_WHITELISTED_PCI:
+   case RTE_DEVTYPE_BLACKLISTED_PCI:
+   if (rte_eal_compare_pci_addr(
+   &devargs->pci.addr, args) == 0)
+   goto found;
+   break;
+   case RTE_DEVTYPE_VIRTUAL:
+   if (memcmp(&devargs->virtual.drv_name, args,
+   strlen((char *)args)) == 0)
+   goto found;
+   break;
+   }
+   }
+   return NULL;
+found:
+   return devargs;
+}
+
 /* store a whitelist parameter for later parsing */
 int
 rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
@@ -87,6 +117,12 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char 
*devargs_str)
free(devargs);
return -1;
}
+   /* make sure there is no same entry */
+   if (rte_eal_devargs_find(devtype, &devargs->pci.addr)) {
+   RTE_LOG(ERR, EAL,
+   "device already registered: <%s>\n", buf);
+   return -1;
+   }
break;
case RTE_DEVTYPE_VIRTUAL:
/* save driver name */
@@ -98,6 +134,12 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char 
*devargs_str)
free(devargs);
return -1;
}
+   /* make sure there is no same entry */
+   if (rte_eal_devargs_find(devtype, &devargs->virtual.drv_name)) {
+   RTE_LOG(ERR, EAL,
+   "device already registered: <%s>\n", buf);
+   return -1;
+   }
break;
}

@@ -105,6 +147,25 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char 
*devargs_str)
return 0;
 }

+/* remove it from the devargs_list */
+int
+rte_eal_devargs_remove(enum rte_devtype devtype, void *args)
+{
+   struct rte_devargs *devargs;
+
+   if (args == NULL)
+   return -EINVAL;
+
+   devargs = rte_eal_devargs_find(devtype, args);
+   if (devargs == NULL) {
+   RTE_LOG(ERR, EAL, "device not found\n");
+   return -ENODEV;
+   }
+
+   TAILQ_REMOVE(&devargs_list, devargs, next);
+   return 0;
+}
+
 /* count the number of devices of a specified type */
 unsigned int
 rte_eal_devargs_type_count(enum rte_devtype devtype)
diff --git a/lib/librte_eal/common/include/rte_devargs.h 
b/lib/librte_eal/common/include/rte_devargs.h
index 9f9c98f..6d9763b 100644
--- a/lib/librte_eal/common/include/rte_devargs.h
+++ b/lib/librte_eal/common/include/rte_devargs.h
@@ -123,6 +123,27 @@ extern struct rte_devargs_list devargs_list;
 int rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str);

 /**
+ * Remove a device from the user device list
+ *
+ * For PCI devices, the format of arguments string is "PCI_ADDR". It shouldn't
+ * involve parameters for the device. Example: "08:00.1".
+ *
+ * For virtual devices, the format of arguments string is "DRIVER_NAME*". It
+ * shouldn't involve parameters for the device. Example: "eth_ring". The
+ * validity of the driver name is not checked by this function, it is done
+ * when closing the drivers.
+ *
+ * @param devtype
+ *   The type of the device.
+ * @param name
+ *   The name of the device.
+ *
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eal_devargs_remove(enum rte_devtype devtype, void *args);
+
+/**
  * Count the number of user devices of a specified type
  *
  * @param devtype
-- 
1.9.1



[dpdk-dev] [PATCH v9 09/14] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-19 Thread Tetsuya Mukawa
The patch adds functions for unmapping igb_uio resources. The patch is only
for Linux and igb_uio environment. VFIO and BSD are not supported.

v9:
- Remove "rte_dev_hotplug.h".
- Remove needless "#ifdef".
  (Thanks to Thomas Monjalon and Neil Horman)
- Remove pci_unmap_device(). It will be implemented in later patch.
v8:
- Fix typo.
  (Thanks to Iremonger, Bernard)
v5:
- Fix pci_unmap_device() to check pt_driver.
v4:
- Add parameter checking.
- Add header file to determine if hotplug can be enabled.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 17 
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  7 
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 65 ++
 3 files changed, 89 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index f593f2c..7349a60 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -167,6 +167,23 @@ pci_map_resource(void *requested_addr, int fd, off_t 
offset, size_t size)
return mapaddr;
 }

+/* unmap a particular resource */
+void
+pci_unmap_resource(void *requested_addr, size_t size)
+{
+   if (requested_addr == NULL)
+   return;
+
+   /* Unmap the PCI memory resource of device */
+   if (munmap(requested_addr, size)) {
+   RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+   __func__, requested_addr, (unsigned long)size,
+   strerror(errno));
+   } else
+   RTE_LOG(DEBUG, EAL, "  PCI memory unmapped at %p\n",
+   requested_addr);
+}
+
 /* parse the "resource" sysfs file */
 #define IORESOURCE_MEM  0x0200

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h 
b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 1070eb8..e2dd8a5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -71,6 +71,13 @@ void *pci_map_resource(void *requested_addr, int fd, off_t 
offset,
 /* map IGB_UIO resource prototype */
 int pci_uio_map_resource(struct rte_pci_device *dev);

+void pci_unmap_resource(void *requested_addr, size_t size);
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/* unmap IGB_UIO resource prototype */
+void pci_uio_unmap_resource(struct rte_pci_device *dev);
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 #ifdef VFIO_PRESENT

 #define VFIO_MAX_GROUPS 64
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c 
b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index d75eb59..43f47dc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -404,6 +404,71 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return 0;
 }

+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static void
+pci_uio_unmap(struct mapped_pci_resource *uio_res)
+{
+   int i;
+
+   if (uio_res == NULL)
+   return;
+
+   for (i = 0; i != uio_res->nb_maps; i++)
+   pci_unmap_resource(uio_res->maps[i].addr,
+   (size_t)uio_res->maps[i].size);
+}
+
+static struct mapped_pci_resource *
+pci_uio_find_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return NULL;
+
+   TAILQ_FOREACH(uio_res, pci_res_list, next) {
+
+   /* skip this element if it doesn't match our PCI address */
+   if (!rte_eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
+   return uio_res;
+   }
+   return NULL;
+}
+
+/* unmap the PCI resource of a PCI device in virtual memory */
+void
+pci_uio_unmap_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return;
+
+   /* find an entry for the device */
+   uio_res = pci_uio_find_resource(dev);
+   if (uio_res == NULL)
+   return;
+
+   /* secondary processes - just free maps */
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return pci_uio_unmap(uio_res);
+
+   TAILQ_REMOVE(pci_res_list, uio_res, next);
+
+   /* unmap all resources */
+   pci_uio_unmap(uio_res);
+
+   /* free uio resource */
+   rte_free(uio_res);
+
+   /* close fd if in primary process */
+   close(dev->intr_handle.fd);
+
+   dev->intr_handle.fd = -1;
+   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 /*
  * parse a sysfs file containing one integer value
  * different to the eal version, as it needs to work with 64-bit values
-- 
1.9.1



[dpdk-dev] [PATCH v9 08/14] ethdev: Add functions that will be used by port hotplug functions

2015-02-19 Thread Tetsuya Mukawa
The patch adds the following functions.

- rte_eth_dev_save()
  The function is used for saving current rte_eth_dev structures.
- rte_eth_dev_get_changed_port()
  The function receives the saved rte_eth_dev structures, then compares
  them with the current values to find which port was actually
  attached or detached.
- rte_eth_dev_get_addr_by_port()
  The function returns a pci address of an ethdev specified by port
  identifier.
- rte_eth_dev_get_port_by_addr()
  The function returns a port identifier of an ethdev specified by
  pci address.
- rte_eth_dev_get_name_by_port()
  The function returns a unique identifier name of an ethdev
  specified by port identifier.
- rte_eth_dev_is_detachable()
  The function returns whether a PMD supports detach function.

Also, the patch changes the scope of rte_eth_dev_allocated() to global,
because virtual PMDs will call this function to support port hotplug.
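
The save-then-compare idea behind rte_eth_dev_save() and
rte_eth_dev_get_changed_port() can be sketched with a reduced port table.
The names below (port, ports, MAX_PORTS, ports_save, ports_get_changed) are
hypothetical, and the table keeps only the 'attached' flag. One deliberate
difference from the patch: the sketch writes *port_id only when a changed
port is found, instead of using it as the loop counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 32	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Reduced port entry: only the flag needed to detect a change. */
struct port { uint8_t attached; };
static struct port ports[MAX_PORTS];

/* Copy the current table into a caller-provided snapshot. */
static int ports_save(struct port *snapshot, size_t size)
{
	if (snapshot == NULL || size != sizeof(ports))
		return -1;
	memcpy(snapshot, ports, size);
	return 0;
}

/* Report the first port whose attached flag differs from the snapshot. */
static int ports_get_changed(const struct port *snapshot, uint8_t *port_id)
{
	uint8_t i;

	if (snapshot == NULL || port_id == NULL)
		return -1;
	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached != snapshot[i].attached) {
			*port_id = i;
			return 0;
		}
	}
	return -1;	/* nothing was attached or detached */
}
```

A caller would take the snapshot before asking a PMD to attach or detach a
device, then call ports_get_changed() afterwards to learn which port the
operation affected.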

v9:
- rte_eth_dev_check_detachable() is replaced by
  rte_eth_dev_is_detachable().
- strncpy() is replaced by strcpy().
  (Thanks to Thomas Monjalon)
- Add missing symbol in version map.
  (Thanks to Neil Horman)
v8:
- Add size parameter to rte_eth_dev_save().
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Add pt_driver checking to rte_eth_dev_check_detachable().
  (Thanks to Qiu, Michael)
v5:
- Fix return value of below functions.
  rte_eth_dev_get_changed_port().
  rte_eth_dev_get_port_by_addr().
v4:
- Add parameter checking.
v3:
- Fix if-condition bug while comparing pci addresses.
- Add error checking codes.
Reported-by: Mark Enright 

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c  | 103 -
 lib/librte_ether/rte_ethdev.h  |  83 ++
 lib/librte_ether/rte_ether_version.map |   7 +++
 3 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ef5d226..3b64f3a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
 }

-static struct rte_eth_dev *
+struct rte_eth_dev *
 rte_eth_dev_allocated(const char *name)
 {
unsigned i;
@@ -424,6 +424,107 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+int
+rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
+{
+   if ((devs == NULL) ||
+   (size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
+   return -EINVAL;
+
+   /* save current rte_eth_devices */
+   memcpy(devs, rte_eth_devices, size);
+   return 0;
+}
+
+int
+rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
+{
+   if ((devs == NULL) || (port_id == NULL))
+   return -EINVAL;
+
+   /* check which port was attached or detached */
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
+   if (rte_eth_devices[*port_id].attached ^ devs->attached)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
+{
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (addr == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   *addr = rte_eth_devices[port_id].pci_dev->addr;
+   return 0;
+}
+
+int
+rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
+{
+   struct rte_pci_addr *tmp;
+
+   if ((addr == NULL) || (port_id == NULL)) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
+   if (!rte_eth_devices[*port_id].attached)
+   continue;
+   if (!rte_eth_devices[*port_id].pci_dev)
+   continue;
+   tmp = &rte_eth_devices[*port_id].pci_dev->addr;
+   if (rte_eal_compare_pci_addr(tmp, addr) == 0)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
+{
+   char *tmp;
+
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (name == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   /* shouldn't check 'rte_eth_devices[i].data',
+* because it might be overwritten by VDEV PMD */
+   tmp = rte_eth_dev_data[port_id].name;
+   strcpy(name, tmp);
+   return 0;
+}
+
+int
+rte_eth_dev_is_detachable(uint

[dpdk-dev] [PATCH v9 07/14] eal, ethdev: Add a function and function pointers to close ether device

2015-02-19 Thread Tetsuya Mukawa
The patch adds function pointers to the rte_pci_driver and eth_driver
structures. These function pointers are used when ports are detached.
The patch also adds rte_eth_dev_uninit(). Nothing calls it yet, but it
will be called once the port hotplug function is implemented.
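
The optional uninit hook can be sketched as below. The structures and names
(driver, eth_dev, dev_uninit, the toy hooks) are hypothetical reductions, not
the real rte_pci_driver or eth_driver definitions. The sketch shows the
pattern only: call the driver hook if it is set, propagate its error, and
release the device only after the hook succeeds.

```c
#include <assert.h>
#include <stddef.h>

struct eth_dev;

/* Per-driver uninit hook; returns 0 on success, non-zero on error. */
typedef int (*dev_uninit_t)(struct eth_dev *dev);

/* Hypothetical, reduced driver and device structures. */
struct driver { const char *name; dev_uninit_t uninit; };
struct eth_dev { const struct driver *drv; int attached; void *priv; };

/* Generic uninit: run the driver hook if present, then detach the device. */
static int dev_uninit(struct eth_dev *dev)
{
	int ret;

	if (dev == NULL || dev->drv == NULL)
		return -1;

	if (dev->drv->uninit != NULL) {
		ret = dev->drv->uninit(dev);
		if (ret != 0)
			return ret;	/* leave the device attached */
	}

	dev->attached = 0;
	dev->drv = NULL;
	return 0;
}

/* Toy hooks: one succeeds, one reports a failure. */
static int uninit_ok(struct eth_dev *dev) { dev->priv = NULL; return 0; }
static int uninit_busy(struct eth_dev *dev) { (void)dev; return -16; }
```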

v9:
- Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
- Remove code that initializes the ethdev callbacks from
  rte_eth_dev_uninit().
- Add a function to create a unique device name.
  (Thanks to Thomas Monjalon)
v6:
- Fix rte_eth_dev_uninit() to handle a return value of uninit
  function of PMD.
v4:
- Add parameter checking.
- Change function names.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  6 
 lib/librte_ether/rte_ethdev.c   | 62 +++--
 lib/librte_ether/rte_ethdev.h   | 24 +
 3 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index c609ef3..376f66a 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -189,12 +189,18 @@ struct rte_pci_driver;
 typedef int (pci_devinit_t)(struct rte_pci_driver *, struct rte_pci_device *);

 /**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (pci_devuninit_t)(struct rte_pci_device *);
+
+/**
  * A structure describing a PCI driver.
  */
 struct rte_pci_driver {
TAILQ_ENTRY(rte_pci_driver) next;   /**< Next in list. */
const char *name;   /**< Driver name. */
pci_devinit_t *devinit; /**< Device init. function. */
+   pci_devuninit_t *devuninit; /**< Device uninit function. */
struct rte_pci_id *id_table;/**< ID table, NULL terminated. 
*/
uint32_t drv_flags; /**< Flags contolling handling 
of device. */
 };
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index be5aa18..ef5d226 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -266,6 +266,24 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

+static inline int
+rte_eth_dev_create_unique_device_name(char *name,
+   struct rte_pci_device *pci_dev)
+{
+   int ret;
+
+   if ((name == NULL) || (pci_dev == NULL))
+   return -EINVAL;
+
+   ret = snprintf(name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
+   pci_dev->addr.bus, pci_dev->addr.devid,
+   pci_dev->addr.function);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
@@ -279,8 +297,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_drv = (struct eth_driver *)pci_drv;

/* Create unique Ethernet device name using PCI address */
-   snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid, 
pci_dev->addr.function);
+   rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);

eth_dev = rte_eth_dev_allocate(ethdev_name);
if (eth_dev == NULL)
@@ -321,6 +338,46 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

+static int
+rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+{
+   const struct eth_driver *eth_drv;
+   struct rte_eth_dev *eth_dev;
+   char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+   int ret;
+
+   if (pci_dev == NULL)
+   return -EINVAL;
+
+   /* Create unique Ethernet device name using PCI address */
+   rte_eth_dev_create_unique_device_name(ethdev_name, pci_dev);
+
+   eth_dev = rte_eth_dev_allocated(ethdev_name);
+   if (eth_dev == NULL)
+   return -ENODEV;
+
+   eth_drv = (const struct eth_driver *)pci_dev->driver;
+
+   /* Invoke PMD device uninit function */
+   if (*eth_drv->eth_dev_uninit) {
+   ret = (*eth_drv->eth_dev_uninit)(eth_drv, eth_dev);
+   if (ret)
+   return ret;
+   }
+
+   /* free ether device */
+   rte_eth_dev_release_port(eth_dev);
+
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   rte_free(eth_dev->data->dev_private);
+
+   eth_dev->pci_dev = NULL;
+   eth_dev->driver = NULL;
+   eth_dev->data = NULL;
+
+   return 0;
+}
+
 /**
  * Register an Ethernet [Poll Mode] driver.
  *
@@ -339,6 +396,7 @@ void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
eth_drv->pci_drv.devinit = rte_eth_dev_init;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
rte_eal_pci_register(&eth_drv->pci_drv);
 }

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 28ecafd..f403780 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/r

[dpdk-dev] [PATCH v9 06/14] ethdev: Add rte_eth_dev_release_port to release specified port

2015-02-19 Thread Tetsuya Mukawa
This patch adds rte_eth_dev_release_port(). The function clears the
attached status of the specified device.

v9:
- rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
  (Thanks to Thomas Monjalon)
v6:
- Use rte_eth_dev structure as the parameter of rte_eth_dev_free().
v4:
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c | 11 +++
 lib/librte_ether/rte_ethdev.h | 12 
 2 files changed, 23 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 56797a5..be5aa18 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -255,6 +255,17 @@ rte_eth_dev_allocate(const char *name)
return eth_dev;
 }

+int
+rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   eth_dev->attached = 0;
+   nb_ports--;
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 768f372..28ecafd 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1628,6 +1628,18 @@ extern uint8_t rte_eth_dev_count(void);
  */
 struct rte_eth_dev *rte_eth_dev_allocate(const char *name);

+/**
+ * Function for internal use by dummy drivers primarily, e.g. ring-based
+ * driver.
+ * Release the specified ethdev port.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
+
 struct eth_driver;
 /**
  * @internal
-- 
1.9.1



[dpdk-dev] [PATCH v9 05/14] eal/pci: Consolidate pci address comparison APIs

2015-02-19 Thread Tetsuya Mukawa
This patch replaces pci_addr_comparison() and memcmp() of pci addresses by
rte_eal_compare_pci_addr().

rte_eal_compare_pci_addr() doesn't use memcmp() to compare PCI addresses.
This is because sizeof(struct rte_pci_addr) is 6, while the members of the
structure occupy only 5 bytes, as shown below.

struct rte_pci_addr {
uint16_t domain;/**< Device domain */
uint8_t bus;/**< Device bus */
uint8_t devid;  /**< Device ID */
uint8_t function;   /**< Device function. */
};

If the structure is dynamically allocated in a function without bzero,
the last 1 byte (padding) may hold an arbitrary value. As a result, memcmp
may report a difference between equal addresses. To avoid such a case,
rte_eal_compare_pci_addr() compares the following value of each address.

dev_addr = (addr->domain << 24) | (addr->bus << 16) |
(addr->devid << 8) | addr->function;
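
A minimal sketch of such a padding-safe comparison follows. The struct and
function names (pci_addr, compare_pci_addr) are hypothetical, and the body is
not taken from the patch. It packs the four fields into one integer key, as
in the expression above, and casts each field to uint64_t before shifting so
that a large domain value cannot overflow an int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same member layout as the structure quoted above: 5 bytes of members. */
struct pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Pack the members into one key; padding bytes never take part. */
static uint64_t pci_addr_key(const struct pci_addr *addr)
{
	return ((uint64_t)addr->domain << 24) | ((uint64_t)addr->bus << 16) |
	       ((uint64_t)addr->devid << 8) | (uint64_t)addr->function;
}

/*
 * Return 0 if the addresses are equal, a positive value if *a is greater,
 * and a negative value if *a is smaller or a pointer is NULL.
 */
static int compare_pci_addr(const struct pci_addr *a, const struct pci_addr *b)
{
	uint64_t ka, kb;

	if (a == NULL || b == NULL)
		return -1;

	ka = pci_addr_key(a);
	kb = pci_addr_key(b);

	if (ka > kb)
		return 1;
	if (ka < kb)
		return -1;
	return 0;
}
```

Because only the named members are read, two addresses filled in over
different garbage (for example, buffers obtained from malloc() without
clearing) still compare equal.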

v9:
- eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
- Fix commit log.
  (Thanks to Thomas Monjalon)
v8:
- Fix pci_scan_one() to update sysfs values.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v5:
- Fix pci_scan_one to handle pt_driver correctly.
v4:
- Fix calculation method of eal_compare_pci_addr().
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 29 --
 lib/librte_eal/common/eal_common_pci.c|  2 +-
 lib/librte_eal/common/include/rte_pci.h   | 34 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  2 +-
 5 files changed, 63 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 74ecce7..9193f80 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return (0);
 }

-/* Compare two PCI device addresses. */
-static int
-pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr *addr2)
-{
-   uint64_t dev_addr = (addr->domain << 24) + (addr->bus << 16) + 
(addr->devid << 8) + addr->function;
-   uint64_t dev_addr2 = (addr2->domain << 24) + (addr2->bus << 16) + 
(addr2->devid << 8) + addr2->function;
-
-   if (dev_addr > dev_addr2)
-   return 1;
-   else
-   return 0;
-}
-
-
 /* Scan one pci sysfs entry, and fill the devices list from it. */
 static int
 pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
@@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
+   int ret;

TAILQ_FOREACH(dev2, &pci_device_list, next) {
-   if (pci_addr_comparison(&dev->addr, &dev2->addr))
+   ret = rte_eal_compare_pci_addr(&dev->addr, &dev2->addr);
+   if (ret > 0)
continue;
-   else {
+   else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
+   } else { /* already registered */
+   /* update pt_driver */
+   dev2->pt_driver = dev->pt_driver;
+   dev2->max_vfs = dev->max_vfs;
+   memmove(dev2->mem_resource,
+   dev->mem_resource,
+   sizeof(dev->mem_resource));
+   free(dev);
+   return 0;
}
}
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index f3c7f71..bf2793f 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(struct 
rte_pci_device *dev)
if (devargs->type != RTE_DEVTYPE_BLACKLISTED_PCI &&
devargs->type != RTE_DEVTYPE_WHITELISTED_PCI)
continue;
-   if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr)))
+   if (!rte_eal_compare_pci_addr(&dev->addr, &devargs->pci.addr))
return devargs;
}
return NULL;
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 7f2d699..c609ef3 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -269,6 +269,40 @@ eal_parse_pci_DomBDF(const char *input, struct 
rte_pci_addr *dev_addr)
 }
 #undef GET_PCIADDR_FIELD

+/* Compare two PCI device addresses. */
+/**
+ * Utility function to compare two PCI device addresses.
+ *
+ 

[dpdk-dev] [PATCH v9 04/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-19 Thread Tetsuya Mukawa
This patch adds "RTE_PCI_DRV_DETACHABLE" to drv_flags of the rte_pci_driver
structure. The flag indicates that the driver can detach devices at runtime.
The patch also removes the assumption that a port will never be detached.

To remove the assumption:
- Add 'attached' member to rte_eth_dev structure.
  This member indicates whether the port is attached or not.
  DEV_ATTACHED indicates a port is attached.
  DEV_DETACHED indicates a port is detached.
- Add rte_eth_dev_find_free_port().
  This function is used to find a free port when allocating a new one.
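
Why a per-port flag is needed can be seen in a small sketch. The names here
(port, ports, MAX_PORTS, find_free_port, port_allocate, port_release) are
hypothetical reductions of the ethdev table. If the port count were used as
the next index, then releasing port 0 while port 1 is in use would make the
next allocation reuse index 1; scanning for a DEV_DETACHED slot avoids that.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PORTS 4	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

enum { DEV_DETACHED = 0, DEV_ATTACHED };

/* Reduced port entry: attach flag plus a name. */
struct port { uint8_t attached; char name[32]; };
static struct port ports[MAX_PORTS];
static unsigned nb_ports;

/* Return the first detached slot, or MAX_PORTS if the table is full. */
static unsigned find_free_port(void)
{
	unsigned i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached == DEV_DETACHED)
			return i;
	}
	return MAX_PORTS;
}

/* Claim a free slot for a new port. */
static struct port *port_allocate(const char *name)
{
	unsigned id = find_free_port();

	if (id == MAX_PORTS)
		return NULL;	/* reached maximum number of ports */
	snprintf(ports[id].name, sizeof(ports[id].name), "%s", name);
	ports[id].attached = DEV_ATTACHED;
	nb_ports++;
	return &ports[id];
}

/* Mark a port as detached so that its slot can be reused. */
static int port_release(struct port *p)
{
	if (p == NULL)
		return -1;
	p->attached = DEV_DETACHED;
	nb_ports--;
	return 0;
}
```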

v9:
- DEV_INVALID/VALID are removed.
- DEV_DISCONNECTED is replaced by DEV_DETACHED.
- DEV_CONNECTED is replaced by DEV_ATTACHED.
- rte_eth_dev_allocate_new_port() is renamed to
  rte_eth_dev_find_free_port().
- rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
- rte_eth_dev_is_valid_port() is changed not to handle log toggle.
- Fix commit log to describe DEV_ATTACHED and DEV_DETACHED.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is changed to NO_TRACE.
  (Thanks to Iremonger, Bernard)
v5:
- Change parameters of rte_eth_dev_validate_port() to cleanup code.
v4:
- Use braces with 'for' loop.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |   2 +
 lib/librte_ether/rte_ethdev.c   | 273 
 lib/librte_ether/rte_ethdev.h   |   5 +
 3 files changed, 177 insertions(+), 103 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 7b48b55..7f2d699 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -207,6 +207,8 @@ struct rte_pci_driver {
 #define RTE_PCI_DRV_FORCE_UNBIND 0x0004
 /** Device driver supports link state interrupt */
 #define RTE_PCI_DRV_INTR_LSC   0x0008
+/** Device driver supports detaching capability */
+#define RTE_PCI_DRV_DETACHABLE 0x0010

 /**< Internal use only - Macro used by pci addr parsing functions **/
 #define GET_PCIADDR_FIELD(in, fd, lim, dlm)   \
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ea3a1fb..56797a5 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -175,6 +175,11 @@ enum {
STAT_QMAP_RX
 };

+enum {
+   DEV_DETACHED = 0,
+   DEV_ATTACHED
+};
+
 static inline void
 rte_eth_dev_data_alloc(void)
 {
@@ -201,19 +206,34 @@ rte_eth_dev_allocated(const char *name)
 {
unsigned i;

-   for (i = 0; i < nb_ports; i++) {
-   if (strcmp(rte_eth_devices[i].data->name, name) == 0)
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if ((rte_eth_devices[i].attached == DEV_ATTACHED) &&
+   strcmp(rte_eth_devices[i].data->name, name) == 0)
return &rte_eth_devices[i];
}
return NULL;
 }

+static uint8_t
+rte_eth_dev_find_free_port(void)
+{
+   unsigned i;
+
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if (rte_eth_devices[i].attached == DEV_DETACHED)
+   return i;
+   }
+   return RTE_MAX_ETHPORTS;
+}
+
 struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
+   uint8_t port_id;
struct rte_eth_dev *eth_dev;

-   if (nb_ports == RTE_MAX_ETHPORTS) {
+   port_id = rte_eth_dev_find_free_port();
+   if (port_id == RTE_MAX_ETHPORTS) {
PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
return NULL;
}
@@ -226,10 +246,12 @@ rte_eth_dev_allocate(const char *name)
return NULL;
}

-   eth_dev = &rte_eth_devices[nb_ports];
-   eth_dev->data = &rte_eth_dev_data[nb_ports];
+   eth_dev = &rte_eth_devices[port_id];
+   eth_dev->data = &rte_eth_dev_data[port_id];
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
-   eth_dev->data->port_id = nb_ports++;
+   eth_dev->data->port_id = port_id;
+   eth_dev->attached = DEV_ATTACHED;
+   nb_ports++;
return eth_dev;
 }

@@ -283,6 +305,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
(unsigned) pci_dev->id.device_id);
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
rte_free(eth_dev->data->dev_private);
+   eth_dev->attached = DEV_DETACHED;
nb_ports--;
return diag;
 }
@@ -308,10 +331,20 @@ rte_eth_driver_register(struct eth_driver *eth_drv)
rte_eal_pci_register(&eth_drv->pci_drv);
 }

+static int
+rte_eth_dev_is_valid_port(uint8_t port_id)
+{
+   if (port_id >= RTE_MAX_ETHPORTS ||
+   rte_eth_devices[port_id].attached != DEV_ATTACHED)
+   return 0;
+   else
+   return 1;
+}
+
 int
 rte_eth_dev_socket_id(uint8_t port_id)
 {
-   if (port_id >= nb_ports)
+   if (!rte_eth_dev_is_valid_port(port_id))
return -1;
return rte_eth_devic

[dpdk-dev] [PATCH v9 03/14] eal_pci: pci memory map work with driver type

2015-02-19 Thread Tetsuya Mukawa
From: Michael Qiu 

With the driver type flag in struct rte_pci_device, we no longer need
to always map uio devices with the vfio related function when
vfio is enabled.
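
A minimal standalone sketch of this dispatch, for illustration only. The enum
values mirror the ones added in patch 02/14, but struct device, vfio_map(),
uio_map() and the vfio_enabled argument are hypothetical stand-ins for the EAL
internals; the return convention (0 mapped, negative error, 1 skipped) follows
the diff below.

```c
#include <assert.h>
#include <stddef.h>

enum pt_driver {
	PT_UNKNOWN = 0,
	PT_IGB_UIO = 1,
	PT_VFIO = 2,
	PT_UIO_GENERIC = 3,
};

enum { BACKEND_NONE = 0, BACKEND_UIO, BACKEND_VFIO };

/* Hypothetical, reduced stand-in for struct rte_pci_device. */
struct device {
	enum pt_driver pt_driver;
	int mapped_by;		/* which backend mapped the device */
};

/* Hypothetical backends; they only record that they were called. */
int
vfio_map(struct device *dev)
{
	dev->mapped_by = BACKEND_VFIO;
	return 0;
}

int
uio_map(struct device *dev)
{
	dev->mapped_by = BACKEND_UIO;
	return 0;
}

/* Return 0 when mapped, negative on error, 1 when the device is skipped. */
int
map_device(struct device *dev, int vfio_enabled)
{
	int ret = -1;

	assert(dev != NULL);
	switch (dev->pt_driver) {
	case PT_VFIO:
		/* stays -1 when the device is bound to vfio but vfio is off */
		if (vfio_enabled)
			ret = vfio_map(dev);
		break;
	case PT_IGB_UIO:
	case PT_UIO_GENERIC:
		ret = uio_map(dev);
		break;
	default:
		/* not bound to a known passthrough driver: skip, no error */
		ret = 1;
		break;
	}
	return ret;
}
```

The point of the switch is that the device's own driver type picks the
backend, instead of trying vfio first and falling back to uio.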

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index e760452..3c463b2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -556,25 +556,29 @@ pci_config_space_set(struct rte_pci_device *dev)
 static int
 pci_map_device(struct rte_pci_device *dev)
 {
-   int ret, mapped = 0;
+   int ret = -1;

/* try mapping the NIC resources using VFIO if it exists */
+   switch (dev->pt_driver) {
+   case RTE_PT_VFIO:
 #ifdef VFIO_PRESENT
-   if (pci_vfio_is_enabled()) {
-   ret = pci_vfio_map_resource(dev);
-   if (ret == 0)
-   mapped = 1;
-   else if (ret < 0)
-   return ret;
-   }
+   if (pci_vfio_is_enabled())
+   ret = pci_vfio_map_resource(dev);
 #endif
-   /* map resources for devices that use igb_uio */
-   if (!mapped) {
+   break;
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   /* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
-   if (ret != 0)
-   return ret;
+   break;
+   default:
+   RTE_LOG(DEBUG, EAL, "  Not managed by known pt driver,"
+   " skipped\n");
+   ret = 1;
+   break;
}
-   return 0;
+
+   return ret;
 }

 /*
-- 
1.9.1



[dpdk-dev] [PATCH v9 02/14] eal_pci: Add flag to hold kernel driver type

2015-02-19 Thread Tetsuya Mukawa
From: Michael Qiu 

Currently, DPDK has no way to know which type of kernel driver
(vfio-pci/igb_uio/uio_pci_generic) a device uses. It can only
check statically whether vfio is enabled or not.

It is really useful to have this flag, because different driver types
need to be handled differently at runtime, for example in pci memory map,
port hotplug, and so on.

This patch adds a flag field to the pci device to solve the above issue.
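
The classification step can be sketched in isolation as below. This is an
illustration under assumptions, not the patch itself: the patch reads the
sysfs 'driver' link with readlink(), while this sketch takes the link target
as a plain string so that it runs without sysfs. The function names and the
example target path in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum pt_driver {
	PT_UNKNOWN = 0,
	PT_IGB_UIO = 1,
	PT_VFIO = 2,
	PT_UIO_GENERIC = 3,
};

/*
 * Copy the last path component of a symlink target into out.
 * Return 0 on success, -1 if there is no '/' or out is too small.
 */
int
driver_name_from_target(const char *target, char *out, size_t out_len)
{
	const char *slash;
	int n;

	assert(target != NULL && out != NULL);
	slash = strrchr(target, '/');
	if (slash == NULL)
		return -1;
	n = snprintf(out, out_len, "%s", slash + 1);
	if (n < 0 || (size_t)n >= out_len)
		return -1;	/* name would have been truncated */
	return 0;
}

/* Map a kernel driver name to the passthrough driver type. */
enum pt_driver
classify_driver(const char *name)
{
	assert(name != NULL);
	if (strcmp(name, "vfio-pci") == 0)
		return PT_VFIO;
	if (strcmp(name, "igb_uio") == 0)
		return PT_IGB_UIO;
	if (strcmp(name, "uio_pci_generic") == 0)
		return PT_UIO_GENERIC;
	return PT_UNKNOWN;
}
```

For a target such as "../../../bus/pci/drivers/igb_uio" (an assumed example of
what the link may point to), the name is "igb_uio" and the type is PT_IGB_UIO.
Unlike the strncpy() call in the diff, the sketch bounds the copy by the size
of the destination buffer.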

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  8 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 53 +++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..7b48b55 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -139,6 +139,13 @@ struct rte_pci_addr {

 struct rte_devargs;

+enum rte_pt_driver {
+   RTE_PT_UNKNOWN  = 0,
+   RTE_PT_IGB_UIO  = 1,
+   RTE_PT_VFIO = 2,
+   RTE_PT_UIO_GENERIC  = 3,
+};
+
 /**
  * A structure describing a PCI device.
  */
@@ -152,6 +159,7 @@ struct rte_pci_device {
uint16_t max_vfs;   /**< sriov enable if not zero */
int numa_node;  /**< NUMA node connection */
struct rte_devargs *devargs;/**< Device user arguments */
+   enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
 };

 /** Any PCI device identifier (vendor, device, ...) */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 15db9c4..e760452 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -97,6 +97,35 @@ error:
return -1;
 }

+static int
+pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
+{
+   int count;
+   char path[PATH_MAX];
+   char *name;
+
+   if (!filename || !dri_name)
+   return -1;
+
+   count = readlink(filename, path, PATH_MAX);
+   if (count >= PATH_MAX)
+   return -1;
+
+   /* For device does not have a driver */
+   if (count < 0)
+   return 1;
+
+   path[count] = '\0';
+
+   name = strrchr(path, '/');
+   if (name) {
+   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
+   return 0;
+   }
+
+   return -1;
+}
+
 void *
 pci_find_max_end_va(void)
 {
@@ -222,11 +251,12 @@ pci_scan_one(const char *dirname, uint16_t domain, 
uint8_t bus,
char filename[PATH_MAX];
unsigned long tmp;
struct rte_pci_device *dev;
+   char driver[PATH_MAX];
+   int ret;

dev = malloc(sizeof(*dev));
-   if (dev == NULL) {
+   if (dev == NULL)
return -1;
-   }

memset(dev, 0, sizeof(*dev));
dev->addr.domain = domain;
@@ -305,6 +335,25 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
return -1;
}

+   /* parse driver */
+   snprintf(filename, sizeof(filename), "%s/driver", dirname);
+   ret = pci_get_kernel_driver_by_path(filename, driver);
+   if (!ret) {
+   if (!strcmp(driver, "vfio-pci"))
+   dev->pt_driver = RTE_PT_VFIO;
+   else if (!strcmp(driver, "igb_uio"))
+   dev->pt_driver = RTE_PT_IGB_UIO;
+   else if (!strcmp(driver, "uio_pci_generic"))
+   dev->pt_driver = RTE_PT_UIO_GENERIC;
+   else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+   } else if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
+   free(dev);
+   return -1;
+   } else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+
/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-- 
1.9.1



[dpdk-dev] [PATCH v9 01/14] eal: Enable port Hotplug framework in Linux

2015-02-19 Thread Tetsuya Mukawa
The patch adds CONFIG_RTE_LIBRTE_EAL_HOTPLUG to the Linux and BSD
configurations. So far, the hotplug functions only support Linux.

v9:
- Move this patch at the top of this patch series.
  (Thanks to Thomas Monjalon)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp   | 6 ++
 config/common_linuxapp | 5 +
 2 files changed, 11 insertions(+)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 8cfa4e6..d73cbba 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -116,6 +116,12 @@ CONFIG_RTE_LIBRTE_EAL_BSDAPP=y
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=n

 #
+# Compile Environment Abstraction Layer to support hotplug
+# So far, Hotplug functions only support linux
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index db8332d..a677071 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -114,6 +114,11 @@ CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE=0
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=y

 #
+# Compile Environment Abstraction Layer to support hotplug
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=y
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
-- 
1.9.1



[dpdk-dev] [PATCH v9 00/14] Port Hotplug Framework

2015-02-19 Thread Tetsuya Mukawa
This patch series adds a dynamic port hotplug framework to DPDK.
With the patches, DPDK apps can attach or detach ports at runtime.

The basic concept of the port hotplug is as follows.
- DPDK apps are responsible for managing ports.
  Only DPDK apps know which ports are attached or detached at the moment.
  The port hotplug framework is implemented to allow DPDK apps to manage ports.
  For example, when a DPDK app calls the port attach function, the attached
  port number is returned. A DPDK app can also detach a port by port number.
- Kernel support is needed for attaching or detaching physical device ports.
  To attach a new physical device port, the device must first be recognized by
  a userspace direct I/O framework in the kernel. Then DPDK apps can
  call the port hotplug functions to attach ports.
  For detaching, the steps are reversed.
- Before detaching ports, the ports must be stopped and closed.
  A DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close() before
  detaching ports. These functions call the finalization code of the PMDs.
  But so far, no PMD frees all resources allocated by initialization.
  This means PMDs need to be fixed to support the port hotplug.
  'RTE_PCI_DRV_DETACHABLE' is a new flag indicating that a PMD supports
  detaching. Without this flag, detaching will fail.
- Legacy DPDK apps must not be affected.
  No DPDK EAL behavior is changed if the port hotplug functions aren't called.
  So all legacy DPDK apps can still work without modifications.
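
The ordering rule for detaching (stop, then close, then detach, and only for a
driver that sets the detachable flag) can be pictured with a small standalone
model. This is a sketch of the rule as described above, not of the EAL code:
struct port, enum port_state and the port_*() helpers are hypothetical, and
the real stop/close functions do not necessarily report errors this way. Only
the flag value 0x0010 is taken from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit value as RTE_PCI_DRV_DETACHABLE in this series. */
#define DRV_DETACHABLE 0x0010

enum port_state { PORT_STARTED, PORT_STOPPED, PORT_CLOSED, PORT_DETACHED };

/* Hypothetical model of one port and the flags of its driver. */
struct port {
	uint32_t drv_flags;
	enum port_state state;
};

/* Each helper returns 0 on success and -1 when the step is refused. */
int
port_stop(struct port *p)
{
	assert(p != NULL);
	if (p->state != PORT_STARTED)
		return -1;
	p->state = PORT_STOPPED;
	return 0;
}

int
port_close(struct port *p)
{
	assert(p != NULL);
	if (p->state != PORT_STOPPED)
		return -1;
	p->state = PORT_CLOSED;
	return 0;
}

int
port_detach(struct port *p)
{
	assert(p != NULL);
	/* the driver must advertise that it can release its resources */
	if (!(p->drv_flags & DRV_DETACHABLE))
		return -1;
	/* the port must have been stopped and closed first */
	if (p->state != PORT_CLOSED)
		return -1;
	p->state = PORT_DETACHED;
	return 0;
}
```

In this model a started or merely stopped port cannot be detached, and a port
whose driver lacks the flag cannot be detached in any state.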

And a few limitations.
- The port hotplug functions are not thread safe.
  DPDK apps should handle it.
- Only Linux and igb_uio are supported so far.
  BSD and VFIO are not supported. I will send VFIO patches at least, but I don't
  have a plan to submit a BSD patch so far.


Here are the port hotplug APIs.
---
/**
 * Attach a new device.
 *
 * @param devargs
 *   A pointer to a string describing the new device
 *   to be attached. The string should be a pci address like
 *   ':01:00.0' or a virtual device name like 'eth_pcap0'.
 * @param port_id
 *  A pointer to a port identifier actually attached.
 * @return
 *  0 on success and port_id is filled, negative on error
 */
int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);

/**
 * Detach a device.
 *
 * @param port_id
 *   The port identifier of the device to detach.
 * @param devname
 *  A pointer to a buffer that receives the name of the device actually detached.
 * @return
 *  0 on success and devname is filled, negative on error
 */
int rte_eal_dev_detach(uint8_t port_id, char *devname);
---
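
To show how the two out-parameters are meant to be used, here is a small
standalone mock with the same shape as the prototypes above. It is only a
sketch: mock_dev_attach() and mock_dev_detach() are hypothetical stand-ins
that keep device names in a table, and MOCK_MAX_PORTS and MOCK_NAME_LEN are
assumed sizes. Nothing here says how the real functions size or fill devname.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MOCK_MAX_PORTS 2	/* assumed table size */
#define MOCK_NAME_LEN 32	/* assumed size of the caller's devname buffer */

/* One name per port; an empty string marks a free slot. */
static char mock_names[MOCK_MAX_PORTS][MOCK_NAME_LEN];

/* Same shape as rte_eal_dev_attach(): 0 and *port_id filled on success. */
int
mock_dev_attach(const char *devargs, uint8_t *port_id)
{
	uint8_t i;

	if (devargs == NULL || port_id == NULL || devargs[0] == '\0')
		return -1;
	if (strlen(devargs) >= MOCK_NAME_LEN)
		return -1;
	for (i = 0; i < MOCK_MAX_PORTS; i++) {
		if (mock_names[i][0] == '\0') {
			snprintf(mock_names[i], MOCK_NAME_LEN, "%s", devargs);
			*port_id = i;
			return 0;
		}
	}
	return -1;	/* no free port */
}

/*
 * Same shape as rte_eal_dev_detach(): 0 and devname filled on success.
 * devname must point to at least MOCK_NAME_LEN bytes.
 */
int
mock_dev_detach(uint8_t port_id, char *devname)
{
	if (devname == NULL || port_id >= MOCK_MAX_PORTS ||
	    mock_names[port_id][0] == '\0')
		return -1;
	snprintf(devname, MOCK_NAME_LEN, "%s", mock_names[port_id]);
	assert(strlen(devname) < MOCK_NAME_LEN);	/* stored names always fit */
	mock_names[port_id][0] = '\0';
	return 0;
}
```

A caller passes the address of a uint8_t to attach and gets the port number
back; on detach it passes that number and a buffer, and gets the device name
back.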

This patch series is for the DPDK EAL. For DPDK apps to use the port hotplug
functions, each PMD should be fixed to support the 'RTE_PCI_DRV_DETACHABLE'
flag. Please check the patch for the pcap PMD.

Also, please check the testpmd patch. It shows how to fix your legacy
applications to support the port hotplug feature.

PATCH v9 changes
 - Fix commit title.
 - Fix commit log.
 - Fix comments.
 - Define CONFIG_RTE_LIBRTE_EAL_HOTPLUG at the top of this patch series.
 - DEV_INVALID/VALID are removed.
 - DEV_DISCONNECTED is replaced by DEV_DETACHED.
 - DEV_CONNECTED is replaced by DEV_ATTACHED.
 - rte_eth_dev_allocate_new_port() is renamed to
   rte_eth_dev_find_free_port().
 - rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
 - rte_eth_dev_is_valid_port() is changed not to handle log toggle.
 - eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
 - rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
 - Add a function to create a unique device name.
 - Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
 - Remove code that initialize callback of ethdev from
   rte_eth_dev_uninit().
 - Remove pci_unmap_device(). It will be implemented in later patch.
 - rte_eth_dev_check_detachable() is replaced by
   rte_eth_dev_is_detachable().
 - strncpy() is replaced by strcpy().
 - Implement pci_unmap_device() in this patch.
 - Remove "rte_dev_hotplug.h".
 - Remove needless "#ifdef".
 - Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
 - RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
 - Use strcmp() instead of strncmp().
 - Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
   (Thanks to Thomas Monjalon)
 - Change definition of rte_dev_uninit_t.
   (Thanks to Thomas Monjalon and Maxime Leroy)
 - Add missing symbol in version map.
   (Thanks to Neil Horman)

PATCH v8 changes
 - Fix Makefile and add version map file.
 - Add missing symbol in version map.
 - Fix pci_scan_one() to update sysfs values.
   (Thanks to Qiu, Michael and Iremonger, Bernard)
 - NONE_TRACE is replaced by NO_TRACE.
 - Fix typo.
 - Add size parameter to rte_eth_dev_save().
   (Thanks to Iremonger, Bernard)

PATCH v7 changes
 - Add a new section to programmer's guide.
   (Thanks to Iremonger, Bernard)
 - Fix port checking implementation of start_port().
 - Fix typo of warning messages.
 - Add 

[dpdk-dev] [PATCH v2] i40e: fix build with gcc 5

2015-02-19 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> Sent: Thursday, February 19, 2015 11:21 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] i40e: fix build with gcc 5
> 
> Eliminate ambiguity in the condition which trips up a "logical not
> is only applied to the left..." warning from gcc 5, causing build
> failure with -Werror. Besides non-ambiguous, the condition is
> far more obvious this way.
> 
> Signed-off-by: Panu Matilainen 
> ---
>  lib/librte_pmd_i40e/i40e_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
> index c9f1026..12c0831 100644
> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> @@ -613,7 +613,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused 
> struct i40e_rx_queue *rxq)
>"rxq->nb_rx_desc=%d",
>rxq->rx_free_thresh, rxq->nb_rx_desc);
>   ret = -EINVAL;
> - } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh) == 0) {
> + } else if (rxq->nb_rx_desc % rxq->rx_free_thresh != 0) {
>   PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
>"rxq->nb_rx_desc=%d, "
>"rxq->rx_free_thresh=%d",
> --

Acked-by: Konstantin Ananyev 

> 2.1.0



[dpdk-dev] [PATCH v2] eal: add rte_eal_iopl_init to version map

2015-02-19 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez
> Monroy
> Sent: Thursday, February 12, 2015 3:41 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] eal: add rte_eal_iopl_init to version map
> 
> Building shared libraries and using virtio PMD results in undefined
> reference to 'rte_eal_iopl_init'.
> 
> Add missing function to eal version map.
> 
> Signed-off-by: Sergio Gonzalez Monroy
> 
> ---

Acked-by: Pablo de Lara 



[dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)

2015-02-19 Thread Thomas Monjalon
2015-02-19 10:14, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-19 09:18, Jastrzebski, MichalX K:
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > 2015-02-13 16:12, Declan Doherty:
> > > > > On 13/02/15 15:16, Michal Jastrzebski wrote:
> > > > > > Michal Jastrzebski (6):
> > > > > >net: changed arp_hdr struct declaration
> > > > > >bond: add link bonding mode 6 implementation
> > > > > >bond: add debug info for mode 6 link bonding
> > > > > >bond: add example application for link bonding mode 6
> > > > > >bond: modify TLB unit tests
> > > > > >bond: add unit tests for link bonding mode 6.
> > > >
> > > Hi Thomas,
> > > > You didn't sign some of these patches. So I suspect that you should
> > > > fix some authorship.
> > > That's because I am not an author of all of these patches - 1/6, 5/6
> > > and 6/6
> > 
> > You probably broke it by importing patches with "patch" command instead of
> > "git am".
> > Then you must fix the authorship in your git tree before sending.
> 
> The authorship in v2 is correct, as I think I shouldn't sign off a patch that
> is not mine, that is, one whose code I never worked on. Should I?

No. Signed-off-by is about responsibility for a patch.
I'm speaking about authorship. Please use "git log" and check which name
comes after "Author: ". There is also an authoring date but it's not important.
When sending email, the Author field is converted into a From field.

> I edited the patches manually before submitting them to match the proper
> authorship. In my git tree, all patches are signed off by me by default.



[dpdk-dev] [PATCH] eal: fix fscanf format mismatch

2015-02-19 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez
> Monroy
> Sent: Thursday, February 12, 2015 3:40 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] eal: fix fscanf format mismatch
> 
> Variables are unsigned int but format scans for signed int.
> 
> Signed-off-by: Sergio Gonzalez Monroy
> 

Acked-by: Pablo de Lara 
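
The patch body is not quoted in this reply, so the following is only a generic
sketch of the kind of mismatch the commit log describes. parse_two_unsigned()
is a hypothetical helper, and sscanf() is used instead of fscanf() so that the
example needs no file; both take the same conversion specifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse two whitespace-separated unsigned decimal numbers from text.
 * Return 0 when both were read, -1 otherwise.
 */
int
parse_two_unsigned(const char *text, unsigned *a, unsigned *b)
{
	assert(text != NULL && a != NULL && b != NULL);
	/*
	 * %u expects a pointer to unsigned int. Using %d here would ask
	 * for a pointer to int, which does not match the arguments.
	 */
	return sscanf(text, "%u %u", a, b) == 2 ? 0 : -1;
}
```

Keeping the specifier and the pointer type in step also lets the compiler's
format checking stay quiet.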


[dpdk-dev] [PATCH] i40e: fix build with gcc 5

2015-02-19 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> Sent: Thursday, February 19, 2015 10:25 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] i40e: fix build with gcc 5
> 
> Eliminate ambiguity in the condition which trips up a "logical not
> is only applied to the left..." warning from gcc 5, causing build
> failure with -Werror.
> 
> Signed-off-by: Panu Matilainen 
> ---
>  lib/librte_pmd_i40e/i40e_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
> index c9f1026..ede5405 100644
> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> @@ -613,7 +613,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused 
> struct i40e_rx_queue *rxq)
>"rxq->nb_rx_desc=%d",
>rxq->rx_free_thresh, rxq->nb_rx_desc);
>   ret = -EINVAL;
> - } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh) == 0) {
> + } else if (!(rxq->nb_rx_desc % rxq->rx_free_thresh == 0)) {

Why not just:
else if (rxq->nb_rx_desc % rxq->rx_free_thresh != 0)
?

>   PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
>"rxq->nb_rx_desc=%d, "
>"rxq->rx_free_thresh=%d",
> --
> 2.1.0
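
For reference, the three spellings of the condition discussed in this thread
can be compared side by side. The sketch below is standalone; the function
names are made up for the comparison, and n and t stand for rxq->nb_rx_desc
and rxq->rx_free_thresh. The first function writes out how the original,
unparenthesized form is parsed, since '!' binds tighter than '=='.

```c
#include <assert.h>

/* Original line: !(n % t) == 0, which is parsed as (!(n % t)) == 0. */
int
cond_as_parsed(unsigned n, unsigned t)
{
	assert(t != 0);
	return (!(n % t)) == 0;
}

/* First version of the patch: negate the whole comparison. */
int
cond_v1(unsigned n, unsigned t)
{
	assert(t != 0);
	return !(n % t == 0);
}

/* Form suggested in the reply and used in v2. */
int
cond_v2(unsigned n, unsigned t)
{
	assert(t != 0);
	return n % t != 0;
}
```

All three are true exactly when n is not a multiple of t, so the change is
about silencing the warning and readability, not about behavior.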



[dpdk-dev] [PATCH v5 1/6] reorder: new reorder library

2015-02-19 Thread Olivier MATZ
Hi,

On 02/19/2015 10:20 AM, Olivier MATZ wrote:
> Hi Sergio,
>
> On 02/18/2015 03:58 PM, Sergio Gonzalez Monroy wrote:
>> This library provides reordering capability for out of order mbufs based
>> on a sequence number in the mbuf structure.
>>
>> Signed-off-by: Reshma Pattan 
>> Signed-off-by: Richardson Bruce 
>> Signed-off-by: Sergio Gonzalez Monroy 
>>
>> [...]
>>
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -289,6 +289,9 @@ struct rte_mbuf {
>>   uint32_t usr;  /**< User defined tags. See
>> @rte_distributor_process */
>>   } hash;   /**< hash information */
>>
>> +/* sequence number - field used in distributor and reorder
>> library */
>> +uint32_t seqn;
>> +
>>   /* second cache line - fields only used in slow path or on TX */
>>   MARKER cacheline1 __rte_cache_aligned;
>>
>
> Just one small comment about rte_mbuf: the comment should be in doxygen
> style.


I've just realized it's already applied. I'll submit a patch for this.

Olivier



[dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)

2015-02-19 Thread Thomas Monjalon
2015-02-19 09:18, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-13 16:12, Declan Doherty:
> > > On 13/02/15 15:16, Michal Jastrzebski wrote:
> > > > Michal Jastrzebski (6):
> > > >net: changed arp_hdr struct declaration
> > > >bond: add link bonding mode 6 implementation
> > > >bond: add debug info for mode 6 link bonding
> > > >bond: add example application for link bonding mode 6
> > > >bond: modify TLB unit tests
> > > >bond: add unit tests for link bonding mode 6.
> > 
> Hi Thomas,
> > You didn't sign some of these patches. So I suspect that you should
> > fix some authorship.
> That's because I am not an author of all of these patches - 1/6, 5/6 and 6/6

You probably broke it by importing patches with "patch" command instead of
"git am".
Then you must fix the authorship in your git tree before sending.

> > Some of the patches make some changes without explaining why.
> I noticed 5/6 has an incomplete description. It probably disappeared during
> editing.

Yes please ask yourself why each patch is done, and check it's explained in
commit log.

> > > Series Acked-by: Declan Doherty 
> > 
> > Please, use checkpatch before submitting and/or when reviewing.
> I was using checkpatch.pl and got no errors, but I will check again.
> Maybe I overlooked something.
> > 
> > A v3 is needed.
> 
> Best regards
> Michal



[dpdk-dev] [PATCH v5 1/6] reorder: new reorder library

2015-02-19 Thread Olivier MATZ
Hi Sergio,

On 02/18/2015 03:58 PM, Sergio Gonzalez Monroy wrote:
> This library provides reordering capability for out of order mbufs based
> on a sequence number in the mbuf structure.
>
> Signed-off-by: Reshma Pattan 
> Signed-off-by: Richardson Bruce 
> Signed-off-by: Sergio Gonzalez Monroy 
>
> [...]
>
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -289,6 +289,9 @@ struct rte_mbuf {
>   uint32_t usr; /**< User defined tags. See 
> @rte_distributor_process */
>   } hash;   /**< hash information */
>
> + /* sequence number - field used in distributor and reorder library */
> + uint32_t seqn;
> +
>   /* second cache line - fields only used in slow path or on TX */
>   MARKER cacheline1 __rte_cache_aligned;
>

Just one small comment about rte_mbuf: the comment should be in doxygen
style.
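
For illustration, a doxygen-style version of such a field comment could look
like the sketch below. struct mbuf_sketch and mbuf_seqn() are hypothetical,
reduced stand-ins and not the real rte_mbuf layout; only the seqn field and
its meaning come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/** Reduced stand-in for struct rte_mbuf, for this example only. */
struct mbuf_sketch {
	uint32_t usr;	/**< User defined tag (stand-in for hash.usr). */
	/** Sequence number, used by the distributor and reorder libraries. */
	uint32_t seqn;
};

/**
 * Return the sequence number stored in an mbuf.
 *
 * @param m
 *   Pointer to the mbuf, must not be NULL.
 * @return
 *   The value of the seqn field.
 */
uint32_t
mbuf_seqn(const struct mbuf_sketch *m)
{
	assert(m != NULL);
	return m->seqn;
}
```

The difference from the patch is the opening marker: the doubled asterisk, or
the same followed by '<' for a trailing comment, is what makes doxygen pick the
text up.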


Regards,
Olivier



[dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)

2015-02-19 Thread Jastrzebski, MichalX K
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, February 19, 2015 10:40 AM
> To: Jastrzebski, MichalX K
> Cc: Doherty, Declan; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)
> 
> 2015-02-19 09:18, Jastrzebski, MichalX K:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-02-13 16:12, Declan Doherty:
> > > > On 13/02/15 15:16, Michal Jastrzebski wrote:
> > > > > Michal Jastrzebski (6):
> > > > >net: changed arp_hdr struct declaration
> > > > >bond: add link bonding mode 6 implementation
> > > > >bond: add debug info for mode 6 link bonding
> > > > >bond: add example application for link bonding mode 6
> > > > >bond: modify TLB unit tests
> > > > >bond: add unit tests for link bonding mode 6.
> > >
> > Hi Thomas,
> > > You didn't sign some of these patches. So I suspect that you should
> > > fix some authorship.
> > That's because I am not an author of all of these patches - 1/6, 5/6 and 6/6
> 
> You probably broke it by importing patches with "patch" command instead of
> "git am".
> Then you must fix the authorship in your git tree before sending.
The authorship in v2 is correct, as I think I shouldn't sign off a patch that is
not mine, that is, one whose code I never worked on. Should I?
I edited the patches manually before submitting them to match the proper
authorship. In my git tree, all patches are signed off by me by default.
> 
> > > Some of the patches make some changes without explaining why.
> > I noticed 5/6 has an incomplete description. It probably disappeared
> > during editing.
> 
> Yes please ask yourself why each patch is done, and check it's explained in
> commit log.
> 
> > > > Series Acked-by: Declan Doherty 
> > >
> > > Please, use checkpatch before submitting and/or when reviewing.
> > I was using checkpatch.pl and got no errors, but I will check again.
> > Maybe I overlooked something.
> > >
> > > A v3 is needed.
> >
> > Best regards
> > Michal



[dpdk-dev] [PATCH v3 2/3] ethdev: Add rxtx callback support

2015-02-19 Thread Mcnamara, John
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 18, 2015 6:20 PM
> To: Mcnamara, John; Richardson, Bruce
> Cc: dev at dpdk.org; nhorman at tuxdriver.com; stephen at networkplumber.org;
> Doherty, Declan
> Subject: Re: [PATCH v3 2/3] ethdev: Add rxtx callback support
> 
> 2015-02-18 17:42, John McNamara:
> 
> Excuse me, it wasn't very clear for me but I thought from the following
> email that the consensus was to use a compile-time option:
>   http://dpdk.org/ml/archives/dev/2015-February/013450.html


Hi Thomas,

I think that got a little lost from our side in the follow-on discussions.

I'll revert with a revision that makes this feature a compile time option.

John.
-- 



[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-19 Thread Neil Horman
On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> Hi community,
> I would like to introduce a library for measuring the load of arbitrary jobs.
> It can be used to profile any kind of job set on any arbitrary execution unit
> or tasking library.
> 
> In the provided l2fwd-headroom example I demonstrate how to use this library to
> select the optimal rx burst poll time. Jobs are selected by using existing
> rte_timer library calls. This example does not limit the possible schemes in
> which this library can be used.
> 
> PATCH v5 changes:
>  - Fix spelling and checkpatch.pl errors.
>  - Add maintainer claim for library and example app.
> 
> PATCH v4 changes:
>  - use proper branch for generating patch.
> 
> PATCH v3 changes:
>  - Fix spelling.
> 
> PATCH v2 changes:
>  - Remove jobs management/callback from library to not duplicate tasking
>    library behaviour.
>  - Cleanup/remove useless statistics.
>  - Rework example application to use rte_timer library for jobs selection.
>  - Introduce new app parameter '-l' for automatic thousands separators in
>    stats.
>  - More readable statistics format.
> 
> 
> 
> Pawel Wodkowski (3):
>   librte_headroom: New library for checking core/system/app load
>   examples: introduce new l2fwd-headroom example
>   MAINTAINERS: claim responsibility for headroom library and example app
> 
>  MAINTAINERS  |4 +
>  config/common_bsdapp |5 +
>  config/common_linuxapp   |5 +
>  examples/Makefile|1 +
>  examples/l2fwd-headroom/Makefile |   51 ++
>  examples/l2fwd-headroom/main.c   | 1040 
> ++
>  lib/Makefile |1 +
>  lib/librte_headroom/Makefile |   54 ++
>  lib/librte_headroom/rte_headroom.c   |  271 +++
>  lib/librte_headroom/rte_headroom.h   |  324 
>  lib/librte_headroom/rte_headroom_version.map |   19 +
>  mk/rte.app.mk|4 +
>  12 files changed, 1779 insertions(+)
>  create mode 100644 examples/l2fwd-headroom/Makefile
>  create mode 100644 examples/l2fwd-headroom/main.c
>  create mode 100644 lib/librte_headroom/Makefile
>  create mode 100644 lib/librte_headroom/rte_headroom.c
>  create mode 100644 lib/librte_headroom/rte_headroom.h
>  create mode 100644 lib/librte_headroom/rte_headroom_version.map
> 
> -- 
> 1.9.1
> 
> 
I'm sorry but I still fail to see how this is a particularly useful library.  It
clearly works fine, but it composes an application event loop in its own terms,
and measures stats based on that.  While that's ok, any application is already
going to have to write its own event loop, and can make the same measurements
synchronously within that loop, using a lot less code to optimize its polling
time.

In other words, I think this is one of those cases where this library is
probably somewhat useful for anyone who just wants to write an application in
terms of the semantics exposed by this library, but not at all useful for anyone
else.  I'd personally rather not have the extra code to maintain here.

Stephen just gave a presentation at netdev about some of the performance
optimization measurements Brocade did with DPDK and how they fine tuned their
environment.  One of the big takeaways for me was that making time based
measurements (especially if it was using the tsc), created cpu stalls that
skewed the measurements, and so the best optimizations they made avoided time
measurements, opting instead for packet count metrics.
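
To make the idea of a packet count metric concrete, here is one way it might
look inside an application's own poll loop. This is a guess at the general
shape, not what was presented at netdev: struct poll_stats and both functions
are hypothetical, and nb_rx stands for whatever the application's receive
burst call returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counters updated once per poll; no clock or TSC reads are involved. */
struct poll_stats {
	uint64_t polls;		/* total poll attempts */
	uint64_t empty_polls;	/* polls that returned no packets */
	uint64_t packets;	/* total packets received */
};

/* Record the result of one poll that returned nb_rx packets. */
void
poll_stats_record(struct poll_stats *s, unsigned nb_rx)
{
	assert(s != NULL);
	s->polls++;
	if (nb_rx == 0)
		s->empty_polls++;
	s->packets += nb_rx;
}

/* Share of polls that found work, in percent; 0 before the first poll. */
unsigned
poll_stats_busy_percent(const struct poll_stats *s)
{
	assert(s != NULL);
	if (s->polls == 0)
		return 0;
	return (unsigned)(((s->polls - s->empty_polls) * 100) / s->polls);
}
```

A loop would call poll_stats_record() after each receive burst and read the
busy share from time to time to decide whether it is polling too often.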

Neil



[dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)

2015-02-19 Thread Jastrzebski, MichalX K
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 18, 2015 8:10 PM
> To: Doherty, Declan; Jastrzebski, MichalX K
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)
> 
> Hi,
> 
> 2015-02-13 16:12, Declan Doherty:
> > On 13/02/15 15:16, Michal Jastrzebski wrote:
> > > Michal Jastrzebski (6):
> > >net: changed arp_hdr struct declaration
> > >bond: add link bonding mode 6 implementation
> > >bond: add debug info for mode 6 link bonding
> > >bond: add example application for link bonding mode 6
> > >bond: modify TLB unit tests
> > >bond: add unit tests for link bonding mode 6.
> 
Hi Thomas,
> You didn't sign some of these patches. So I suspect that you should
> fix some authorship.
That's because I am not an author of all of these patches - 1/6, 5/6 and 6/6
> 
> Some of the patches make some changes without explaining why.
I noticed 5/6 has an incomplete description. It probably disappeared during
editing.
> 
> > Series Acked-by: Declan Doherty 
> 
> Please, use checkpatch before submitting and/or when reviewing.
I was using checkpatch.pl and got no errors, but I will check again.
Maybe I overlooked something.
> 
> A v3 is needed.

Best regards
Michal


[dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Zhou, Danny


> -Original Message-
> From: Gonzalez Monroy, Sergio
> Sent: Thursday, February 19, 2015 4:22 PM
> To: Zhou, Danny; Neil Horman
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> enable/disable functions
> 
> On 19/02/2015 08:06, Zhou, Danny wrote:
> >
> >> -Original Message-
> >> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> >> Sent: Tuesday, February 17, 2015 11:53 PM
> >> To: Zhou, Danny
> >> Cc: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> >> enable/disable functions
> >>
> >> On Tue, Feb 17, 2015 at 09:47:15PM +0800, Zhou Danny wrote:
> >>> v3 changes
> >>> - Add return value for interrupt enable/disable functions
> >>>
> >>> Add two dev_ops functions to enable and disable rx queue interrupts
> >>>
> >>> Signed-off-by: Danny Zhou 
> >>> Tested-by: Yong Liu 
> >>> ---
> >>>   lib/librte_ether/rte_ethdev.c | 43 
> >>>   lib/librte_ether/rte_ethdev.h | 57 
> >>> +++
> >>>   2 files changed, 100 insertions(+)
> >>>
> >>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> >>> index ea3a1fb..d27469a 100644
> >>> --- a/lib/librte_ether/rte_ethdev.c
> >>> +++ b/lib/librte_ether/rte_ethdev.c
> >>> @@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct rte_eth_dev 
> >>> *dev,
> >>>   }
> >>>   rte_spinlock_unlock(&rte_eth_dev_cb_lock);
> >>>   }
> >>> +
> >>> +int
> >>> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> >>> + uint16_t queue_id)
> >>> +{
> >>> + struct rte_eth_dev *dev;
> >>> +
> >>> + if (port_id >= nb_ports) {
> >>> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> >>> + return (-ENODEV);
> >>> + }
> >>> +
> >>> + dev = &rte_eth_devices[port_id];
> >>> + if (dev == NULL) {
> >>> + PMD_DEBUG_TRACE("Invalid port device\n");
> >>> + return (-ENODEV);
> >>> + }
> >>> +
> >>> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
> >>> + return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
> >>> +}
> >>> +
> >>> +int
> >>> +rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
> >>> + uint16_t queue_id)
> >>> +{
> >>> + struct rte_eth_dev *dev;
> >>> +
> >>> + if (port_id >= nb_ports) {
> >>> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> >>> + return (-ENODEV);
> >>> + }
> >>> +
> >>> + dev = &rte_eth_devices[port_id];
> >>> + if (dev == NULL) {
> >>> + PMD_DEBUG_TRACE("Invalid port device\n");
> >>> + return (-ENODEV);
> >>> + }
> >>> +
> >>> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
> >>> + return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
> >>> +}
> >>> +
> >>>   #ifdef RTE_NIC_BYPASS
> >>>   int rte_eth_dev_bypass_init(uint8_t port_id)
> >>>   {
> >>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> >>> index 84160c3..0f320a9 100644
> >>> --- a/lib/librte_ether/rte_ethdev.h
> >>> +++ b/lib/librte_ether/rte_ethdev.h
> >>> @@ -848,6 +848,8 @@ struct rte_eth_fdir {
> >>>   struct rte_intr_conf {
> >>>   /** enable/disable lsc interrupt. 0 (default) - disable, 1 
> >>> enable */
> >>>   uint16_t lsc;
> >>> + /** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
> >>> + uint16_t rxq;
> >>>   };
> >>>
> >>>   /**
> >>> @@ -1109,6 +1111,14 @@ typedef int (*eth_tx_queue_setup_t)(struct 
> >>> rte_eth_dev *dev,
> >>>   const struct rte_eth_txconf 
> >>> *tx_conf);
> >>>   /**< @internal Setup a transmit queue of an Ethernet device. */
> >>>
> >>> +typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
> >>> + uint16_t rx_queue_id);
> >>> +/**< @internal Enable interrupt of a receive queue of an Ethernet 
> >>> device. */
> >>> +
> >>> +typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
> >>> + uint16_t rx_queue_id);
> >>> +/**< @internal Disable interrupt of a receive queue of an Ethernet 
> >>> device. */
> >>> +
> >>>   typedef void (*eth_queue_release_t)(void *queue);
> >>>   /**< @internal Release memory resources allocated by given RX/TX queue. 
> >>> */
> >>>
> >>> @@ -1445,6 +1455,8 @@ struct eth_dev_ops {
> >>>   eth_queue_start_t  tx_queue_start;/**< Start TX for a 
> >>> queue.*/
> >>>   eth_queue_stop_t   tx_queue_stop;/**< Stop TX for a 
> >>> queue.*/
> >>>   eth_rx_queue_setup_t   rx_queue_setup;/**< Set up device RX 
> >>> queue.*/
> >>> + eth_rx_enable_intr_t   rx_queue_intr_enable; /**< Enable Rx queue 
> >>> interrupt. */
> >>> + eth_rx_disable_intr_t  rx_queue_intr_disable; /**< Disable Rx queue 
> >>> interrupt.*/
> >>>   eth_queue_release_t rx_queue_release;/**< Release RX 
> >>> queue.*/
> >>>   eth_rx_queue_count_t   rx_queue_coun

[dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Gonzalez Monroy, Sergio
On 19/02/2015 08:06, Zhou, Danny wrote:
>
>> -----Original Message-----
>> From: Neil Horman [mailto:nhorman at tuxdriver.com]
>> Sent: Tuesday, February 17, 2015 11:53 PM
>> To: Zhou, Danny
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
>> enable/disable functions
>>
>> On Tue, Feb 17, 2015 at 09:47:15PM +0800, Zhou Danny wrote:
>>> v3 changes
>>> - Add return value for interrupt enable/disable functions
>>>
>>> Add two dev_ops functions to enable and disable rx queue interrupts
>>>
>>> Signed-off-by: Danny Zhou 
>>> Tested-by: Yong Liu 
>>> ---
>>>   lib/librte_ether/rte_ethdev.c | 43 
>>>   lib/librte_ether/rte_ethdev.h | 57 
>>> +++
>>>   2 files changed, 100 insertions(+)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>>> index ea3a1fb..d27469a 100644
>>> --- a/lib/librte_ether/rte_ethdev.c
>>> +++ b/lib/librte_ether/rte_ethdev.c
>>> @@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct rte_eth_dev 
>>> *dev,
>>> }
>>> rte_spinlock_unlock(&rte_eth_dev_cb_lock);
>>>   }
>>> +
>>> +int
>>> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
>>> +   uint16_t queue_id)
>>> +{
>>> +   struct rte_eth_dev *dev;
>>> +
>>> +   if (port_id >= nb_ports) {
>>> +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>>> +   return (-ENODEV);
>>> +   }
>>> +
>>> +   dev = &rte_eth_devices[port_id];
>>> +   if (dev == NULL) {
>>> +   PMD_DEBUG_TRACE("Invalid port device\n");
>>> +   return (-ENODEV);
>>> +   }
>>> +
>>> +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
>>> +   return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
>>> +}
>>> +
>>> +int
>>> +rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
>>> +   uint16_t queue_id)
>>> +{
>>> +   struct rte_eth_dev *dev;
>>> +
>>> +   if (port_id >= nb_ports) {
>>> +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>>> +   return (-ENODEV);
>>> +   }
>>> +
>>> +   dev = &rte_eth_devices[port_id];
>>> +   if (dev == NULL) {
>>> +   PMD_DEBUG_TRACE("Invalid port device\n");
>>> +   return (-ENODEV);
>>> +   }
>>> +
>>> +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
>>> +   return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
>>> +}
>>> +
>>>   #ifdef RTE_NIC_BYPASS
>>>   int rte_eth_dev_bypass_init(uint8_t port_id)
>>>   {
>>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>>> index 84160c3..0f320a9 100644
>>> --- a/lib/librte_ether/rte_ethdev.h
>>> +++ b/lib/librte_ether/rte_ethdev.h
>>> @@ -848,6 +848,8 @@ struct rte_eth_fdir {
>>>   struct rte_intr_conf {
>>> /** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
>>> uint16_t lsc;
>>> +   /** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
>>> +   uint16_t rxq;
>>>   };
>>>
>>>   /**
>>> @@ -1109,6 +1111,14 @@ typedef int (*eth_tx_queue_setup_t)(struct 
>>> rte_eth_dev *dev,
>>> const struct rte_eth_txconf *tx_conf);
>>>   /**< @internal Setup a transmit queue of an Ethernet device. */
>>>
>>> +typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
>>> +   uint16_t rx_queue_id);
>>> +/**< @internal Enable interrupt of a receive queue of an Ethernet device. 
>>> */
>>> +
>>> +typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
>>> +   uint16_t rx_queue_id);
>>> +/**< @internal Disable interrupt of a receive queue of an Ethernet device. 
>>> */
>>> +
>>>   typedef void (*eth_queue_release_t)(void *queue);
>>>   /**< @internal Release memory resources allocated by given RX/TX queue. */
>>>
>>> @@ -1445,6 +1455,8 @@ struct eth_dev_ops {
>>> eth_queue_start_t  tx_queue_start;/**< Start TX for a queue.*/
>>> eth_queue_stop_t   tx_queue_stop;/**< Stop TX for a queue.*/
>>> eth_rx_queue_setup_t   rx_queue_setup;/**< Set up device RX queue.*/
>>> +   eth_rx_enable_intr_t   rx_queue_intr_enable; /**< Enable Rx queue 
>>> interrupt. */
>>> +   eth_rx_disable_intr_t  rx_queue_intr_disable; /**< Disable Rx queue 
>>> interrupt.*/
>>> eth_queue_release_t rx_queue_release;/**< Release RX queue.*/
>>> eth_rx_queue_count_t   rx_queue_count; /**< Get Rx queue count. */
>>> eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
>>> @@ -2811,6 +2823,51 @@ void _rte_eth_dev_callback_process(struct 
>>> rte_eth_dev *dev,
>>> enum rte_eth_event_type event);
>>>
>>>   /**
>>> + * When there is no rx packet coming in Rx Queue for a long time, we can
>>> + * sleep lcore related to RX Queue for power saving, and enable rx 
>>> interrupt
>>> + * to be triggered when rx packect arrives.
>>> + *
>>> + * The rte_eth_dev_rx_queue_intr_
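The dispatch pattern in the quoted patch (validate the port, look up the
device, return -ENOTSUP when the driver does not provide the callback,
otherwise forward the call) can be sketched in isolation. This is only an
illustration under stated assumptions: every toy_* name below is
hypothetical and none of it is DPDK API. Because the address of an array
element can never be NULL, the sketch checks the operations table instead
of the device pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_dev;

/* Hypothetical per-driver operation table, modeled on eth_dev_ops. */
struct toy_dev_ops {
	int (*rx_queue_intr_enable)(struct toy_dev *dev, uint16_t queue_id);
	int (*rx_queue_intr_disable)(struct toy_dev *dev, uint16_t queue_id);
};

struct toy_dev {
	const struct toy_dev_ops *ops;
	uint32_t intr_mask; /* one bit per queue, used by the toy driver only */
};

#define TOY_MAX_PORTS 4
struct toy_dev toy_devices[TOY_MAX_PORTS];
unsigned int toy_nb_ports;

/* Toy driver callbacks: track the enabled queues in a bit mask. */
int toy_drv_enable(struct toy_dev *dev, uint16_t queue_id)
{
	if (queue_id >= 32)
		return -EINVAL;
	dev->intr_mask |= (uint32_t)1 << queue_id;
	return 0;
}

int toy_drv_disable(struct toy_dev *dev, uint16_t queue_id)
{
	if (queue_id >= 32)
		return -EINVAL;
	dev->intr_mask &= ~((uint32_t)1 << queue_id);
	return 0;
}

const struct toy_dev_ops toy_ops = {
	.rx_queue_intr_enable = toy_drv_enable,
	.rx_queue_intr_disable = toy_drv_disable,
};

/* Generic layer: validate, then forward to the driver if it has the op. */
int toy_rx_queue_intr_enable(uint8_t port_id, uint16_t queue_id)
{
	struct toy_dev *dev;

	if (port_id >= toy_nb_ports)
		return -ENODEV;

	dev = &toy_devices[port_id];
	if (dev->ops == NULL || dev->ops->rx_queue_intr_enable == NULL)
		return -ENOTSUP;

	return dev->ops->rx_queue_intr_enable(dev, queue_id);
}

int toy_rx_queue_intr_disable(uint8_t port_id, uint16_t queue_id)
{
	struct toy_dev *dev;

	if (port_id >= toy_nb_ports)
		return -ENODEV;

	dev = &toy_devices[port_id];
	if (dev->ops == NULL || dev->ops->rx_queue_intr_disable == NULL)
		return -ENOTSUP;

	return dev->ops->rx_queue_intr_disable(dev, queue_id);
}
```

Returning the driver's own result, rather than void, is what lets a caller
tell "not supported" apart from "failed", which matches the v3 change of
adding return values.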

[dpdk-dev] [PATCH v3 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-19 Thread Zhou, Danny


> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Tuesday, February 17, 2015 11:59 PM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 4/5] eal: add per rx queue interrupt 
> handling based on VFIO
> 
> On Tue, Feb 17, 2015 at 09:47:18PM +0800, Zhou Danny wrote:
> > v3 changes:
> > - Fix review comments
> >
> > v2 changes:
> > - Fix compilation issue for a missed header file
> > - Bug fix: free unreleased resources on the exception path before return
> > - Consolidate coding style related review comments
> >
> > This patch does below:
> > - Create multiple VFIO eventfd for rx queues.
> > - Handle per rx queue interrupt.
> > - Eliminate unnecessary suspended DPDK polling thread wakeup mechanism
> > for rx interrupt by allowing polling thread epoll_wait rx queue
> > interrupt notification.
> >
> > Signed-off-by: Danny Zhou 
> > Tested-by: Yong Liu 
> > ---
> >  lib/librte_eal/common/include/rte_eal.h|  12 ++
> >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 
> > -
> >  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
> >  .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
> >  5 files changed, 175 insertions(+), 44 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/include/rte_eal.h 
> > b/lib/librte_eal/common/include/rte_eal.h
> > index f4ecd2e..d81331f 100644
> > --- a/lib/librte_eal/common/include/rte_eal.h
> > +++ b/lib/librte_eal/common/include/rte_eal.h
> > @@ -150,6 +150,18 @@ int rte_eal_iopl_init(void);
> >   *   - On failure, a negative error value.
> >   */
> >  int rte_eal_init(int argc, char **argv);
> > +
> > +/**
> > + * @param port_id
> > + *   the port id
> > + * @param queue_id
> > + *   the queue id
> > + * @return
> > + *   - On success, return 0
> > + *   - On failure, returns -1.
> > + */
> > +int rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
> > +
> >  /**
> >   * Usage function typedef used by the application usage function.
> >   *
> > diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
> > b/lib/librte_eal/linuxapp/eal/Makefile
> > index e117cec..c593dfa 100644
> > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
> > b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > index dc2668a..97215ad 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > @@ -64,6 +64,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include "eal_private.h"
> >  #include "eal_vfio.h"
> > @@ -127,6 +128,9 @@ static pthread_t intr_thread;
> >  #ifdef VFIO_PRESENT
> >
> >  #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
> > +/* irq set buffer length for queue interrupts and LSC interrupt */
> > +#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
> > +   sizeof(int) * (VFIO_MAX_QUEUE_ID + 1))
> >
> >  /* enable legacy (INTx) interrupts */
> >  static int
> > @@ -218,10 +222,10 @@ vfio_disable_intx(struct rte_intr_handle 
> > *intr_handle) {
> > return 0;
> >  }
> >
> > -/* enable MSI-X interrupts */
> > +/* enable MSI interrupts */
> >  static int
> >  vfio_enable_msi(struct rte_intr_handle *intr_handle) {
> > -   int len, ret;
> > +   int len, ret, max_intr;
> > char irq_set_buf[IRQ_SET_BUF_LEN];
> > struct vfio_irq_set *irq_set;
> > int *fd_ptr;
> > @@ -230,12 +234,19 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
> >
> > irq_set = (struct vfio_irq_set *) irq_set_buf;
> > irq_set->argsz = len;
> > -   irq_set->count = 1;
> > +   if ((!intr_handle->max_intr) ||
> > +   (intr_handle->max_intr > VFIO_MAX_QUEUE_ID))
> > +   max_intr = VFIO_MAX_QUEUE_ID + 1;
> > +   else
> > +   max_intr = intr_handle->max_intr;
> > +
> > +   irq_set->count = max_intr;
> > irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
> > VFIO_IRQ_SET_ACTION_TRIGGER;
> > irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
> > irq_set->start = 0;
> > fd_ptr = (int *) &irq_set->data;
> > -   *fd_ptr = intr_handle->fd;
> > +   memcpy(fd_ptr, intr_handle->queue_fd, sizeof(intr_handle->queue_fd));
> > +   fd_ptr[max_intr - 1] = intr_handle->fd;
> >
> > ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
> >
> > @@ -244,27 +255,10 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
> > intr
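The commit message above describes letting the polling thread block in
epoll_wait until an rx queue interrupt is signaled through an eventfd. A
minimal sketch of that wait mechanism, using only the standard Linux
eventfd and epoll calls, might look like the following. The helper names
rxq_wait_setup and rxq_wait are hypothetical, and this is not the
implementation of rte_eal_wait_rx_intr; in the patch the eventfd would be
signaled by VFIO, while here any writer to the descriptor plays that role.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance watching one eventfd. */
int rxq_wait_setup(int event_fd)
{
	struct epoll_event ev = { 0 };
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;

	ev.events = EPOLLIN;
	ev.data.fd = event_fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, event_fd, &ev) < 0) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/*
 * Hypothetical helper: block until the eventfd is signaled or the timeout
 * expires. Returns 1 if an event arrived, 0 on timeout, -1 on error.
 */
int rxq_wait(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t counter;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n <= 0)
		return n; /* 0: timeout, -1: error */

	/* Reading resets the eventfd counter so the next wait blocks again. */
	if (read(ev.data.fd, &counter, sizeof(counter)) !=
	    (ssize_t)sizeof(counter))
		return -1;
	return 1;
}
```

With one eventfd per rx queue, each polling thread can sleep on its own
descriptor, so no separate wakeup path through another thread is needed.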

[dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Neil Horman
On Thu, Feb 19, 2015 at 08:34:22AM +0000, Zhou, Danny wrote:
> 
> 
> > -----Original Message-----
> > From: Gonzalez Monroy, Sergio
> > Sent: Thursday, February 19, 2015 4:22 PM
> > To: Zhou, Danny; Neil Horman
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> > enable/disable functions
> > 
> > On 19/02/2015 08:06, Zhou, Danny wrote:
> > >
> > >> -----Original Message-----
> > >> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > >> Sent: Tuesday, February 17, 2015 11:53 PM
> > >> To: Zhou, Danny
> > >> Cc: dev at dpdk.org
> > >> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> > >> enable/disable functions
> > >>
> > >> On Tue, Feb 17, 2015 at 09:47:15PM +0800, Zhou Danny wrote:
> > >>> v3 changes
> > >>> - Add return value for interrupt enable/disable functions
> > >>>
> > >>> Add two dev_ops functions to enable and disable rx queue interrupts
> > >>>
> > >>> Signed-off-by: Danny Zhou 
> > >>> Tested-by: Yong Liu 
> > >>> ---
> > >>>   lib/librte_ether/rte_ethdev.c | 43 
> > >>>   lib/librte_ether/rte_ethdev.h | 57 
> > >>> +++
> > >>>   2 files changed, 100 insertions(+)
> > >>>
> > >>> diff --git a/lib/librte_ether/rte_ethdev.c 
> > >>> b/lib/librte_ether/rte_ethdev.c
> > >>> index ea3a1fb..d27469a 100644
> > >>> --- a/lib/librte_ether/rte_ethdev.c
> > >>> +++ b/lib/librte_ether/rte_ethdev.c
> > >>> @@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct rte_eth_dev 
> > >>> *dev,
> > >>> }
> > >>> rte_spinlock_unlock(&rte_eth_dev_cb_lock);
> > >>>   }
> > >>> +
> > >>> +int
> > >>> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> > >>> +   uint16_t queue_id)
> > >>> +{
> > >>> +   struct rte_eth_dev *dev;
> > >>> +
> > >>> +   if (port_id >= nb_ports) {
> > >>> +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > >>> +   return (-ENODEV);
> > >>> +   }
> > >>> +
> > >>> +   dev = &rte_eth_devices[port_id];
> > >>> +   if (dev == NULL) {
> > >>> +   PMD_DEBUG_TRACE("Invalid port device\n");
> > >>> +   return (-ENODEV);
> > >>> +   }
> > >>> +
> > >>> +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, 
> > >>> -ENOTSUP);
> > >>> +   return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
> > >>> +}
> > >>> +
> > >>> +int
> > >>> +rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
> > >>> +   uint16_t queue_id)
> > >>> +{
> > >>> +   struct rte_eth_dev *dev;
> > >>> +
> > >>> +   if (port_id >= nb_ports) {
> > >>> +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > >>> +   return (-ENODEV);
> > >>> +   }
> > >>> +
> > >>> +   dev = &rte_eth_devices[port_id];
> > >>> +   if (dev == NULL) {
> > >>> +   PMD_DEBUG_TRACE("Invalid port device\n");
> > >>> +   return (-ENODEV);
> > >>> +   }
> > >>> +
> > >>> +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, 
> > >>> -ENOTSUP);
> > >>> +   return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
> > >>> +}
> > >>> +
> > >>>   #ifdef RTE_NIC_BYPASS
> > >>>   int rte_eth_dev_bypass_init(uint8_t port_id)
> > >>>   {
> > >>> diff --git a/lib/librte_ether/rte_ethdev.h 
> > >>> b/lib/librte_ether/rte_ethdev.h
> > >>> index 84160c3..0f320a9 100644
> > >>> --- a/lib/librte_ether/rte_ethdev.h
> > >>> +++ b/lib/librte_ether/rte_ethdev.h
> > >>> @@ -848,6 +848,8 @@ struct rte_eth_fdir {
> > >>>   struct rte_intr_conf {
> > >>> /** enable/disable lsc interrupt. 0 (default) - disable, 1 
> > >>> enable */
> > >>> uint16_t lsc;
> > >>> +   /** enable/disable rxq interrupt. 0 (default) - disable, 1 
> > >>> enable */
> > >>> +   uint16_t rxq;
> > >>>   };
> > >>>
> > >>>   /**
> > >>> @@ -1109,6 +1111,14 @@ typedef int (*eth_tx_queue_setup_t)(struct 
> > >>> rte_eth_dev *dev,
> > >>> const struct rte_eth_txconf 
> > >>> *tx_conf);
> > >>>   /**< @internal Setup a transmit queue of an Ethernet device. */
> > >>>
> > >>> +typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
> > >>> +   uint16_t rx_queue_id);
> > >>> +/**< @internal Enable interrupt of a receive queue of an Ethernet 
> > >>> device. */
> > >>> +
> > >>> +typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
> > >>> +   uint16_t rx_queue_id);
> > >>> +/**< @internal Disable interrupt of a receive queue of an Ethernet 
> > >>> device. */
> > >>> +
> > >>>   typedef void (*eth_queue_release_t)(void *queue);
> > >>>   /**< @internal Release memory resources allocated by given RX/TX 
> > >>> queue. */
> > >>>
> > >>> @@ -1445,6 +1455,8 @@ struct eth_dev_ops {
> > >>> eth_queue_start_t  tx_queue_start;/**< Start TX for a 
> > >>> queue.*/
> > >>>  

[dpdk-dev] Patches outstanding

2015-02-19 Thread Neil Horman
On Tue, Feb 17, 2015 at 10:35:07AM -0500, Stephen Hemminger wrote:
> There are currently 1039 patches outstanding on DPDK.
> What is the schedule for getting these merged or resolved?
> I don't think it would be reasonable to declare 2.0 as done
> until the patch backlog is 0!
> 

I think the subtrees were supposed to start biting into this, but I don't see
them getting used yet.
Neil



[dpdk-dev] [PATCH] ixgbe: fix build with gcc 5

2015-02-19 Thread Neil Horman
On Thu, Feb 19, 2015 at 12:02:06PM +0000, Ananyev, Konstantin wrote:
> Hi Panu,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> > Sent: Thursday, February 19, 2015 10:25 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH] ixgbe: fix build with gcc 5
> > 
> > Add extra parentheses to remove ambiguity on what we want to compare,
> > otherwise gcc 5 issues a "logical not is only applied to the left hand
> > side of comparison" warning which with -Werror fails the build.
> > 
> > Signed-off-by: Panu Matilainen 
> > ---
> >  lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
> > b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> > index 37e5bae..93a6a00 100644
> > --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
> > @@ -2898,8 +2898,8 @@ STATIC s32 ixgbe_fc_autoneg_fiber(struct ixgbe_hw *hw)
> >  */
> > 
> > linkstat = IXGBE_READ_REG(hw, IXGBE_PCS1GLSTA);
> > -   if ((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE) == 0) ||
> > -   (!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT) == 1)) {
> > +   if (((!!(linkstat & IXGBE_PCS1GLSTA_AN_COMPLETE)) == 0) ||
> > +   ((!!(linkstat & IXGBE_PCS1GLSTA_AN_TIMED_OUT)) == 1)) {
> > ERROR_REPORT1(IXGBE_ERROR_POLLING,
> >  "Auto-Negotiation did not complete or timed out");
> > goto out;
> 
> Unfortunately we are not supposed to change files under the ixgbe subdirectory
> (except ixgbe_osdep.*).
> Usually we deal with it just by:
> If GCC_VERSION...
> CFLAGS_ixgbe_common.o += -Wno...
> 
Why don't you just send a patch to the netdev list to fix ixgbe in the Linux
tree, and then apply the same patch once it gets accepted?  Then the merge will
go smoothly when it comes down.  That would be much better than doing GCC
version ifdeffery.

Neil

> You can have a look at lib/librte_pmd_ixgbe/Makefile, there are plenty of 
> such things.
> Konstantin
> 
> 
> > --
> > 2.1.0
> 
> 
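For readers following the gcc 5 warning discussed in this thread: in C,
unary `!` binds tighter than `==`, so `!!(x & MASK) == 0` already parses
as `(!!(x & MASK)) == 0`. The extra parentheses in the patch spell that
grouping out and should not change behavior. The sketch below compares the
parenthesized form with a plain boolean rewrite of the same condition. The
mask values and function names are hypothetical stand-ins, not the real
IXGBE_PCS1GLSTA_* definitions.

```c
#include <stdint.h>

/* Hypothetical bit masks standing in for the IXGBE_PCS1GLSTA_* flags. */
#define AN_COMPLETE  0x10000u
#define AN_TIMED_OUT 0x40000u

/* Form used by the patch: the grouping of '!!' is spelled out. */
int an_failed_patched(uint32_t linkstat)
{
	return ((!!(linkstat & AN_COMPLETE)) == 0) ||
	       ((!!(linkstat & AN_TIMED_OUT)) == 1);
}

/* Same condition without comparing against 0 and 1. */
int an_failed_plain(uint32_t linkstat)
{
	return !(linkstat & AN_COMPLETE) || (linkstat & AN_TIMED_OUT) != 0;
}
```

Both functions report failure when auto-negotiation has not completed or
has timed out, which is the condition the original code tests.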


[dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-19 Thread Zhou, Danny


> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Tuesday, February 17, 2015 11:53 PM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add rx interrupt 
> enable/disable functions
> 
> On Tue, Feb 17, 2015 at 09:47:15PM +0800, Zhou Danny wrote:
> > v3 changes
> > - Add return value for interrupt enable/disable functions
> >
> > Add two dev_ops functions to enable and disable rx queue interrupts
> >
> > Signed-off-by: Danny Zhou 
> > Tested-by: Yong Liu 
> > ---
> >  lib/librte_ether/rte_ethdev.c | 43 
> >  lib/librte_ether/rte_ethdev.h | 57 
> > +++
> >  2 files changed, 100 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index ea3a1fb..d27469a 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -2825,6 +2825,49 @@ _rte_eth_dev_callback_process(struct rte_eth_dev 
> > *dev,
> > }
> > rte_spinlock_unlock(&rte_eth_dev_cb_lock);
> >  }
> > +
> > +int
> > +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> > +   uint16_t queue_id)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if (port_id >= nb_ports) {
> > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +   return (-ENODEV);
> > +   }
> > +
> > +   dev = &rte_eth_devices[port_id];
> > +   if (dev == NULL) {
> > +   PMD_DEBUG_TRACE("Invalid port device\n");
> > +   return (-ENODEV);
> > +   }
> > +
> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
> > +   return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
> > +}
> > +
> > +int
> > +rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
> > +   uint16_t queue_id)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if (port_id >= nb_ports) {
> > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +   return (-ENODEV);
> > +   }
> > +
> > +   dev = &rte_eth_devices[port_id];
> > +   if (dev == NULL) {
> > +   PMD_DEBUG_TRACE("Invalid port device\n");
> > +   return (-ENODEV);
> > +   }
> > +
> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
> > +   return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
> > +}
> > +
> >  #ifdef RTE_NIC_BYPASS
> >  int rte_eth_dev_bypass_init(uint8_t port_id)
> >  {
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 84160c3..0f320a9 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -848,6 +848,8 @@ struct rte_eth_fdir {
> >  struct rte_intr_conf {
> > /** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
> > uint16_t lsc;
> > +   /** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
> > +   uint16_t rxq;
> >  };
> >
> >  /**
> > @@ -1109,6 +1111,14 @@ typedef int (*eth_tx_queue_setup_t)(struct 
> > rte_eth_dev *dev,
> > const struct rte_eth_txconf *tx_conf);
> >  /**< @internal Setup a transmit queue of an Ethernet device. */
> >
> > +typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
> > +   uint16_t rx_queue_id);
> > +/**< @internal Enable interrupt of a receive queue of an Ethernet device. 
> > */
> > +
> > +typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
> > +   uint16_t rx_queue_id);
> > +/**< @internal Disable interrupt of a receive queue of an Ethernet device. 
> > */
> > +
> >  typedef void (*eth_queue_release_t)(void *queue);
> >  /**< @internal Release memory resources allocated by given RX/TX queue. */
> >
> > @@ -1445,6 +1455,8 @@ struct eth_dev_ops {
> > eth_queue_start_t  tx_queue_start;/**< Start TX for a queue.*/
> > eth_queue_stop_t   tx_queue_stop;/**< Stop TX for a queue.*/
> > eth_rx_queue_setup_t   rx_queue_setup;/**< Set up device RX queue.*/
> > +   eth_rx_enable_intr_t   rx_queue_intr_enable; /**< Enable Rx queue 
> > interrupt. */
> > +   eth_rx_disable_intr_t  rx_queue_intr_disable; /**< Disable Rx queue 
> > interrupt.*/
> > eth_queue_release_t rx_queue_release;/**< Release RX queue.*/
> > eth_rx_queue_count_t   rx_queue_count; /**< Get Rx queue count. */
> > eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
> > @@ -2811,6 +2823,51 @@ void _rte_eth_dev_callback_process(struct 
> > rte_eth_dev *dev,
> > enum rte_eth_event_type event);
> >
> >  /**
> > + * When there is no rx packet coming in Rx Queue for a long time, we can
> > + * sleep lcore related to RX Queue for power saving, and enable rx 
> > interrupt
> > + * to be triggered when rx packect arrives.
> > + *
> > + * The rte_eth_dev_rx_queue_intr_enable() function enables rx queue
> > + * interrupt on specif

[dpdk-dev] [PATCH v3 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-19 Thread Neil Horman
On Thu, Feb 19, 2015 at 08:10:47AM +0000, Zhou, Danny wrote:
> 
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Tuesday, February 17, 2015 11:59 PM
> > To: Zhou, Danny
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 4/5] eal: add per rx queue interrupt 
> > handling based on VFIO
> > 
> > On Tue, Feb 17, 2015 at 09:47:18PM +0800, Zhou Danny wrote:
> > > v3 changes:
> > > - Fix review comments
> > >
> > > v2 changes:
> > > - Fix compilation issue for a missed header file
> > > - Bug fix: free unreleased resources on the exception path before return
> > > - Consolidate coding style related review comments
> > >
> > > This patch does below:
> > > - Create multiple VFIO eventfd for rx queues.
> > > - Handle per rx queue interrupt.
> > > - Eliminate unnecessary suspended DPDK polling thread wakeup mechanism
> > > for rx interrupt by allowing polling thread epoll_wait rx queue
> > > interrupt notification.
> > >
> > > Signed-off-by: Danny Zhou 
> > > Tested-by: Yong Liu 
> > > ---
> > >  lib/librte_eal/common/include/rte_eal.h|  12 ++
> > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> > >  lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 
> > > -
> > >  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
> > >  .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
> > >  5 files changed, 175 insertions(+), 44 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/include/rte_eal.h 
> > > b/lib/librte_eal/common/include/rte_eal.h
> > > index f4ecd2e..d81331f 100644
> > > --- a/lib/librte_eal/common/include/rte_eal.h
> > > +++ b/lib/librte_eal/common/include/rte_eal.h
> > > @@ -150,6 +150,18 @@ int rte_eal_iopl_init(void);
> > >   *   - On failure, a negative error value.
> > >   */
> > >  int rte_eal_init(int argc, char **argv);
> > > +
> > > +/**
> > > + * @param port_id
> > > + *   the port id
> > > + * @param queue_id
> > > + *   the queue id
> > > + * @return
> > > + *   - On success, return 0
> > > + *   - On failure, returns -1.
> > > + */
> > > +int rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
> > > +
> > >  /**
> > >   * Usage function typedef used by the application usage function.
> > >   *
> > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
> > > b/lib/librte_eal/linuxapp/eal/Makefile
> > > index e117cec..c593dfa 100644
> > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
> > > b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > > index dc2668a..97215ad 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
> > > @@ -64,6 +64,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include "eal_private.h"
> > >  #include "eal_vfio.h"
> > > @@ -127,6 +128,9 @@ static pthread_t intr_thread;
> > >  #ifdef VFIO_PRESENT
> > >
> > >  #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
> > > +/* irq set buffer length for queue interrupts and LSC interrupt */
> > > +#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
> > > + sizeof(int) * (VFIO_MAX_QUEUE_ID + 1))
> > >
> > >  /* enable legacy (INTx) interrupts */
> > >  static int
> > > @@ -218,10 +222,10 @@ vfio_disable_intx(struct rte_intr_handle 
> > > *intr_handle) {
> > >   return 0;
> > >  }
> > >
> > > -/* enable MSI-X interrupts */
> > > +/* enable MSI interrupts */
> > >  static int
> > >  vfio_enable_msi(struct rte_intr_handle *intr_handle) {
> > > - int len, ret;
> > > + int len, ret, max_intr;
> > >   char irq_set_buf[IRQ_SET_BUF_LEN];
> > >   struct vfio_irq_set *irq_set;
> > >   int *fd_ptr;
> > > @@ -230,12 +234,19 @@ vfio_enable_msi(struct rte_intr_handle 
> > > *intr_handle) {
> > >
> > >   irq_set = (struct vfio_irq_set *) irq_set_buf;
> > >   irq_set->argsz = len;
> > > - irq_set->count = 1;
> > > + if ((!intr_handle->max_intr) ||
> > > + (intr_handle->max_intr > VFIO_MAX_QUEUE_ID))
> > > + max_intr = VFIO_MAX_QUEUE_ID + 1;
> > > + else
> > > + max_intr = intr_handle->max_intr;
> > > +
> > > + irq_set->count = max_intr;
> > >   irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
> > > VFIO_IRQ_SET_ACTION_TRIGGER;
> > >   irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
> > >   irq_set->start = 0;
> > >   fd_ptr = (int *) &irq_set->data;
> > > - *fd_ptr = intr_handle->fd;
> > > + memcpy(fd_ptr, intr_handle->queue_fd, sizeof(intr_han
