[dpdk-dev] [PATCH v6 00/16] lib/librte_pmd_fm10k : fm10k pmd driver

2015-02-18 Thread Thomas Monjalon
2015-02-17 22:18, Chen Jing D:
> From: "Chen Jing D(Mark)" 
> 
> The patch set add poll mode driver for the host interface of Intel
> Ethernet Switch FM1 Series of silicons, which integrate NIC and
> switch functionalities. The patch set include below features:
> 
> 1. Basic RX/TX functions for PF/VF.
> 2. Interrupt handling mechanism for PF/VF.
> 3. per queue start/stop functions for PF/VF.
> 4. Mailbox handling between PF/VF and PF/Switch Manager.
> 5. Receive Side Scaling (RSS) for PF/VF.
> 6. Scatter receive function for PF/VF.
> 7. reta update/query for PF/VF.
> 8. VLAN filter set for PF.
> 9. Link status query for PF/VF.
> 
> Change in v6:
> - Merge ABI patch with fm10k driver regsiter patch.
> - Fix typo.
> - Rework comments.
> - Minor adjustment on commit log.
> - Increase error variable after mbuf allocation failed.
> 
> Change in v5:
> - Add sanity check for mbuf allocation.
> - Add a new patch to claim fm10k driver review
> - Change commit log.
> - Add unlikely in func rx_desc_to_ol_flags to gain performance
> - Add a new patch to add ABI version
> 
> Change in v4:
> - Change commit log to remove improper words.
> 
> Changes in v3:
> - Update base driver.
> - Define several macros to pass base driver compile.
> 
> Changes in v2:
> - Merge 3 patches into 1 to configure fm10k compile environment.
> - Rework on log code to follow style in ixgbe.
> - Rework log message, remove redundant '\n'
> - Update Copyright year from "2014" to "2015"
> - Change base driver directory name from SHARED to base
> - Add more description in log for patch "add PF and VF interrupt"
> - Merge 2 patches into 1 to register fm10k driver
> - Define macro to replace numeric for lower 32-bit mask.
> 
> Chen Jing D(Mark) (1):
>   maintainers: claim for fm10k review
> 
> Jeff Shaw (15):
>   fm10k: add base driver
>   eal: add fm10k device id
>   fm10k: register fm10k pmd PF driver
>   config: change config files to add fm10k into compile
>   fm10k: add reta update/requery functions
>   fm10k: add Rx queue setup/release function
>   fm10k: add Tx queue setup/release function
>   fm10k: add Rx/Tx single queue start/stop function
>   fm10k: add dev start/stop functions
>   fm10k: add receive and tranmit function
>   fm10k: add PF RSS support
>   fm10k: add scatter receive function
>   fm10k: add function to set vlan
>   fm10k: add SRIOV-VF support
>   fm10k: add PF and VF interrupt handling function
> 
>  MAINTAINERS |4 +
>  config/common_bsdapp|   11 +
>  config/common_linuxapp  |   11 +
>  lib/Makefile|1 +
>  lib/librte_eal/common/include/rte_pci_dev_ids.h |   22 +
>  lib/librte_pmd_fm10k/Makefile   |  100 +
>  lib/librte_pmd_fm10k/base/fm10k_api.c   |  341 
>  lib/librte_pmd_fm10k/base/fm10k_api.h   |   61 +
>  lib/librte_pmd_fm10k/base/fm10k_common.c|  572 ++
>  lib/librte_pmd_fm10k/base/fm10k_common.h|   52 +
>  lib/librte_pmd_fm10k/base/fm10k_mbx.c   | 2185 
> +++
>  lib/librte_pmd_fm10k/base/fm10k_mbx.h   |  329 
>  lib/librte_pmd_fm10k/base/fm10k_osdep.h |  148 ++
>  lib/librte_pmd_fm10k/base/fm10k_pf.c| 1992 +
>  lib/librte_pmd_fm10k/base/fm10k_pf.h|  155 ++
>  lib/librte_pmd_fm10k/base/fm10k_tlv.c   |  914 ++
>  lib/librte_pmd_fm10k/base/fm10k_tlv.h   |  199 ++
>  lib/librte_pmd_fm10k/base/fm10k_type.h  |  937 ++
>  lib/librte_pmd_fm10k/base/fm10k_vf.c|  641 +++
>  lib/librte_pmd_fm10k/base/fm10k_vf.h|   91 +
>  lib/librte_pmd_fm10k/fm10k.h|  292 +++
>  lib/librte_pmd_fm10k/fm10k_ethdev.c | 1867 +++
>  lib/librte_pmd_fm10k/fm10k_logs.h   |   78 +
>  lib/librte_pmd_fm10k/fm10k_rxtx.c   |  462 +
>  lib/librte_pmd_fm10k/rte_pmd_fm10k_version.map  |4 +
>  mk/rte.app.mk   |4 +

Pulled from next/dpdk-fm10k, thanks.


[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Thomas Monjalon
2015-02-17 15:14, Tetsuya Mukawa:
> On 2015/02/17 9:36, Thomas Monjalon wrote:
> > 2015-02-16 13:14, Tetsuya Mukawa:
> > Is uint8_t sill a good size for hotpluggable virtual device ids?
> 
> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> as port id.
> If someone reports it doesn't enough, I guess it will be the time to
> write a patch to change all uint_8 in one patch.

It's a big ABI breakage. So if we feel it's going to be required,
it's better to do it now in 2.0 release I think.

Any opinion?



[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Thomas Monjalon
2015-02-17 15:14, Tetsuya Mukawa:
> On 2015/02/17 9:44, Thomas Monjalon wrote:
> > 2015-02-16 13:14, Tetsuya Mukawa:
> >> @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
> >>}
> >>else {
> >>struct rte_pci_device *dev2 = NULL;
> >> +  int ret;
> >>  
> >>TAILQ_FOREACH(dev2, &pci_device_list, next) {
> >> -  if (pci_addr_comparison(&dev->addr, &dev2->addr))
> >> +  ret = eal_compare_pci_addr(&dev->addr, &dev2->addr);
> >> +  if (ret > 0)
> >>continue;
> >> -  else {
> >> +  else if (ret < 0) {
> >>TAILQ_INSERT_BEFORE(dev2, dev, next);
> >>return 0;
> >> +  } else { /* already registered */
> >> +  /* update pt_driver */
> >> +  dev2->pt_driver = dev->pt_driver;
> >> +  dev2->max_vfs = dev->max_vfs;
> >> +  memmove(dev2->mem_resource,
> >> +  dev->mem_resource,
> >> +  sizeof(dev->mem_resource));
> >> +  free(dev);
> >> +  return 0;
> > Could you comment this "else part" please?
> 
> PCI device list is allocated when rte_eal_init() is called. At the time,
> to fill pci device information, sysfs value is used.
> If sysfs values written by kernel device driver will not be changed by
> igb_uio, vfio or pci_uio_genereic, above code isn't needed.
> But actually above values are changed or added by them.
> 
> Here is a step to cause issue.
> 1. Boot linux.
> 2. Start DPDK application without any physical NIC ports.
>  - Here, some sysfs values are read, and store to pci device list.
> 3. igb_uio starts managing a port.
>  - Here, some sysfs values are changed.
> 4. Add a NIC port to DPDK application using hotplug functions.
>  - Here, we need to replace some values.

I think that you are showing that something is wrongly designed in these
EAL structures. I suggest to try cleaning this mess instead of workarounding.

[...]
> >> -  if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))
> >> +  if (eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
> > Why memcmp is not sufficient to compare PCI addresses?
> > The only exception I see is endianness for natural sorting.
> 
> Here is the definition.
> 
> struct rte_pci_addr {
> uint16_t domain;/**< Device domain */
> uint8_t bus;/**< Device bus */
> uint8_t devid;  /**< Device ID */
> uint8_t function;   /**< Device function. */
> };
> 
> But, sizeof(struct rte_pci_addr) will be 6.
> If rte_pci_addr is allocated in a function without bzero, last 1 byte
> may have some value.
> As a result, memcmp may not work. To avoid such a case, I checked like
> above.

OK thanks. That's the kind of information which is valuable in a commit log.



[dpdk-dev] [PATCH v8 08/14] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-18 Thread Thomas Monjalon
2015-02-17 15:15, Tetsuya Mukawa:
> On 2015/02/17 10:11, Thomas Monjalon wrote:
> > 2015-02-16 13:14, Tetsuya Mukawa:
> >> +#ifdef ENABLE_HOTPLUG
> > Please avoid using #ifdef if not really necessary.
> 
> I agree with you.
> In this case, only hotplug functions call pci_unmap_resource().
> So this will be needed.

Why is it needed?


[dpdk-dev] [PATCH v8 12/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-18 Thread Thomas Monjalon
2015-02-17 19:26, Tetsuya Mukawa:
> On 2015/02/17 18:23, Thomas Monjalon wrote:
> > 2015-02-17 17:51, Tetsuya Mukawa:
> >> On 2015/02/17 10:48, Thomas Monjalon wrote:
> >>> 2015-02-16 13:14, Tetsuya Mukawa:
>  +/* attach the new physical device, then store port_id of the device */
>  +static int
>  +rte_eal_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
>  +{
>  +uint8_t new_port_id;
>  +struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>  +
>  +if ((addr == NULL) || (port_id == NULL))
>  +goto err;
>  +
>  +/* save current port status */
>  +if (rte_eth_dev_save(devs, sizeof(devs)))
>  +goto err;
>  +/* re-construct pci_device_list */
>  +if (rte_eal_pci_scan())
>  +goto err;
>  +/* invoke probe func of the driver can handle the new device */
>  +if (rte_eal_pci_probe_one(addr))
>  +goto err;
> >>> You should get the port_id from the previous function instead of 
> >>> searching it.
> >> I agree this will beautify this code, but actually to do like above
> >> changes current DPDK code much more, and it will not be clear, and not
> >> beautiful.
> >>
> >> Could I explain it more.
> >> Problem is initialization sequence of virtual device and physical device
> >> are completely different.
> >>
> >> (1) Attaching a physical device case
> >> - To return port id, pci_invoke_all_drivers() needs to return port id.
> >> - It means "devinit" of "struct rte_pci_driver" needs to return port_id.
> >> - "devinit" will be rte_eth_dev_init(). But if the device is virtual,
> >> this function is not implemented.
> >>
> >> (2) Attaching virtual device case
> >> - To return port id from rte_eal_pci_probe_one(),
> >> pci_invoke_all_drivers() needs to return port id.
> >> - It means "init" of "struct rte_driver" needs to return port_id.
> >> - "init" will be implemented in PMD. But this function has different
> >> usage in physical device and virtual device.
> >> - Especially, In the case of physical device, "init" doesn't allocate
> >> eth device, so cannot return port id.
> >>
> >> As a result, to remove  rte_eth_dev_save() and
> >> rte_eth_dev_get_changed_port(), below different functions needs to
> >> return port id.
> >>  - "devinit" of "struct rte_pci_driver".
> >>  - "init" of "struct rte_driver".
> > Yes, exactly,
> > I think you shouldn't hesitate to improve existing EAL code.
> > And I also think we should try to remove differences between virtual and
> > pci devices.
> 
> I strongly agree with it. But I haven't investigated how to remove it so
> far.
> To be honest, I want to submit hotplug patches to DPDK-2.0.
> Is above functionality needed to merge the hotplug patches?
> I guess I will not be able to remove it by 23rd.

Obviously, it would be better to have it in dpdk-2.0.0-rc1.
If not possible to fix it, would it be possible to work on other comments
and keep this cleanup for post-rc1 integration?
I feel this cleanup is important to get the right design but it wouldn't be
fair to block this (old) patchset for this reason.



[dpdk-dev] [PATCH v3 0/5] Interrupt mode PMD

2015-02-18 Thread Liang, Cunming
Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhou Danny
> Sent: Tuesday, February 17, 2015 9:47 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/5] Interrupt mode PMD
> 
> v3 changes
> - Add return value for interrupt enable/disable functions
> - Move spinlok from PMD to L3fwd-power
> - Remove unnecessary variables in e1000_mac_info
> - Fix miscelleous review comments
> 
> v2 changes
> - Fix compilation issue in Makefile for missed header file.
> - Consolidate internal and community review comments of v1 patch set.
> 
> The patch series introduce low-latency one-shot rx interrupt into DPDK with
> polling and interrupt mode switch control example.
> 
> DPDK userspace interrupt notification and handling mechanism is based on UIO
> with below limitation:
> 1) It is designed to handle LSC interrupt only with inefficient suspended
> pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
> which then wakes up DPDK polling thread). In this way, it introduces
> non-deterministic wakeup latency for DPDK polling thread as well as packet
> latency if it is used to handle Rx interrupt.
> 2) UIO only supports a single interrupt vector which has to been shared by
> LSC interrupt and interrupts assigned to dedicated rx queues.
> 
> This patchset includes below features:
> 1) Enable one-shot rx queue interrupt in ixgbe PMD(PF & VF) and igb PMD(PF
> only).
> 2) Build on top of the VFIO mechanism instead of UIO, so it could support
> up to 64 interrupt vectors for rx queue interrupts.
> 3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
> VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
> user space.
> 4) Demonstrate interrupts control APIs and userspace NAIP-like 
> polling/interrupt
> switch algorithms in L3fwd-power example.
> 
> Known limitations:
> 1) It does not work for UIO due to a single interrupt eventfd shared by LSC
> and rx queue interrupt handlers causes a mess.
> 2) LSC interrupt is not supported by VF driver, so it is by default disabled
> in L3fwd-power now. Feel free to turn in on if you want to support both LSC
> and rx queue interrupts on a PF.
> 
> Danny Zhou (5):
>   ethdev: add rx interrupt enable/disable functions
>   ixgbe: enable rx queue interrupts for both PF and VF
>   igb: enable rx queue interrupts for PF
>   eal: add per rx queue interrupt handling based on VFIO
>   l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
>   switch
> 
>  examples/l3fwd-power/main.c| 153 ++---
>  lib/librte_eal/common/include/rte_eal.h|  12 +
>  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 ---
>  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
>  .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
>  lib/librte_ether/rte_ethdev.c  |  43 +++
>  lib/librte_ether/rte_ethdev.h  |  57 
>  lib/librte_pmd_e1000/e1000_ethdev.h|   3 +
>  lib/librte_pmd_e1000/igb_ethdev.c  | 228 +++--
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c| 365
> -
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.h|   6 +
>  13 files changed, 963 insertions(+), 114 deletions(-)
> 
> --
> 1.8.1.4

Acked-by:  Cunming Liang 



[dpdk-dev] [PATCH v8 12/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 1:15, Maxime Leroy wrote:
> Hi Tetsuya,
>
> On Tue, Feb 17, 2015 at 9:51 AM, Tetsuya Mukawa  wrote:
>>
 +/* get port_id enabled by above procedures */
 +if (rte_eth_dev_get_changed_port(devs, &new_port_id))
 +goto err2;
>>> [...]
>>>
  /**
 + * Uninitilization function called for each device driver once.
 + */
 +typedef int (rte_dev_uninit_t)(const char *name, const char *args);
>>> Why do you need args for uninit?
>>>
>> I just added for the case that finalization code of PMD needs it.
>> But, probably "args" parameter can be removed.
>>
>>
> I think there are no needs to have any args in the uninit function:
> 1) You librte_pmd_null doesn't use it
> 2) You give exactly the same argument that was used by the init
> function. A driver should have already stored these parameters in an
> internal private structure at initialization. So it's not needed to
> give me back for uninit method.
>
> From my understanding devargs_list is only needed at the init to store
> the arguments when we parse the command line. Then, at initialization,
> rte_eal_dev_init  creates the devices from this list .
>
> By removing args from uninit function, you doesn't need to add and
> remove anymore devargs in devargs_list to (de)attach a new device.
>
> What do you think ?

Hi Maxime,

Yes, I agree. We can remove the argument.
I will do it.

Regards,
Tetsuya

>
> Maxime




[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 9:31, Thomas Monjalon wrote:
> 2015-02-17 15:14, Tetsuya Mukawa:
>> On 2015/02/17 9:36, Thomas Monjalon wrote:
>>> 2015-02-16 13:14, Tetsuya Mukawa:
>>> Is uint8_t sill a good size for hotpluggable virtual device ids?
>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
>> as port id.
>> If someone reports it doesn't enough, I guess it will be the time to
>> write a patch to change all uint_8 in one patch.
> It's a big ABI breakage. So if we feel it's going to be required,
> it's better to do it now in 2.0 release I think.
>
> Any opinion?
>

Hi Thomas,

I agree with it.
I will add an one more patch to change uint8_t to uint16_t.

Thanks,
Tetsuya



[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 10:02, Thomas Monjalon wrote:
> 2015-02-17 15:14, Tetsuya Mukawa:
>> On 2015/02/17 9:44, Thomas Monjalon wrote:
>>> 2015-02-16 13:14, Tetsuya Mukawa:
 @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
 +  int ret;
  
TAILQ_FOREACH(dev2, &pci_device_list, next) {
 -  if (pci_addr_comparison(&dev->addr, &dev2->addr))
 +  ret = eal_compare_pci_addr(&dev->addr, &dev2->addr);
 +  if (ret > 0)
continue;
 -  else {
 +  else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
 +  } else { /* already registered */
 +  /* update pt_driver */
 +  dev2->pt_driver = dev->pt_driver;
 +  dev2->max_vfs = dev->max_vfs;
 +  memmove(dev2->mem_resource,
 +  dev->mem_resource,
 +  sizeof(dev->mem_resource));
 +  free(dev);
 +  return 0;
>>> Could you comment this "else part" please?
>> PCI device list is allocated when rte_eal_init() is called. At the time,
>> to fill pci device information, sysfs value is used.
>> If sysfs values written by kernel device driver will not be changed by
>> igb_uio, vfio or pci_uio_genereic, above code isn't needed.
>> But actually above values are changed or added by them.
>>
>> Here is a step to cause issue.
>> 1. Boot linux.
>> 2. Start DPDK application without any physical NIC ports.
>>  - Here, some sysfs values are read, and store to pci device list.
>> 3. igb_uio starts managing a port.
>>  - Here, some sysfs values are changed.
>> 4. Add a NIC port to DPDK application using hotplug functions.
>>  - Here, we need to replace some values.
> I think that you are showing that something is wrongly designed in these
> EAL structures. I suggest to try cleaning this mess instead of workarounding.
>
> [...]
 -  if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))
 +  if (eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
>>> Why memcmp is not sufficient to compare PCI addresses?
>>> The only exception I see is endianness for natural sorting.
>> Here is the definition.
>>
>> struct rte_pci_addr {
>> uint16_t domain;/**< Device domain */
>> uint8_t bus;/**< Device bus */
>> uint8_t devid;  /**< Device ID */
>> uint8_t function;   /**< Device function. */
>> };
>>
>> But, sizeof(struct rte_pci_addr) will be 6.
>> If rte_pci_addr is allocated in a function without bzero, last 1 byte
>> may have some value.
>> As a result, memcmp may not work. To avoid such a case, I checked like
>> above.
> OK thanks. That's the kind of information which is valuable in a commit log.
>

Sure I will add it.

Thanks,
Tetsuya


[dpdk-dev] [PATCH v8 12/14] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 10:17, Thomas Monjalon wrote:
> 2015-02-17 19:26, Tetsuya Mukawa:
>> On 2015/02/17 18:23, Thomas Monjalon wrote:
>>> 2015-02-17 17:51, Tetsuya Mukawa:
 On 2015/02/17 10:48, Thomas Monjalon wrote:
> 2015-02-16 13:14, Tetsuya Mukawa:
>> +/* attach the new physical device, then store port_id of the device */
>> +static int
>> +rte_eal_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
>> +{
>> +uint8_t new_port_id;
>> +struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>> +
>> +if ((addr == NULL) || (port_id == NULL))
>> +goto err;
>> +
>> +/* save current port status */
>> +if (rte_eth_dev_save(devs, sizeof(devs)))
>> +goto err;
>> +/* re-construct pci_device_list */
>> +if (rte_eal_pci_scan())
>> +goto err;
>> +/* invoke probe func of the driver can handle the new device */
>> +if (rte_eal_pci_probe_one(addr))
>> +goto err;
> You should get the port_id from the previous function instead of 
> searching it.
 I agree this will beautify this code, but actually to do like above
 changes current DPDK code much more, and it will not be clear, and not
 beautiful.

 Could I explain it more.
 Problem is initialization sequence of virtual device and physical device
 are completely different.

 (1) Attaching a physical device case
 - To return port id, pci_invoke_all_drivers() needs to return port id.
 - It means "devinit" of "struct rte_pci_driver" needs to return port_id.
 - "devinit" will be rte_eth_dev_init(). But if the device is virtual,
 this function is not implemented.

 (2) Attaching virtual device case
 - To return port id from rte_eal_pci_probe_one(),
 pci_invoke_all_drivers() needs to return port id.
 - It means "init" of "struct rte_driver" needs to return port_id.
 - "init" will be implemented in PMD. But this function has different
 usage in physical device and virtual device.
 - Especially, In the case of physical device, "init" doesn't allocate
 eth device, so cannot return port id.

 As a result, to remove  rte_eth_dev_save() and
 rte_eth_dev_get_changed_port(), below different functions needs to
 return port id.
  - "devinit" of "struct rte_pci_driver".
  - "init" of "struct rte_driver".
>>> Yes, exactly,
>>> I think you shouldn't hesitate to improve existing EAL code.
>>> And I also think we should try to remove differences between virtual and
>>> pci devices.
>> I strongly agree with it. But I haven't investigated how to remove it so
>> far.
>> To be honest, I want to submit hotplug patches to DPDK-2.0.
>> Is above functionality needed to merge the hotplug patches?
>> I guess I will not be able to remove it by 23rd.
> Obviously, it would be better to have it in dpdk-2.0.0-rc1.
> If not possible to fix it, would it be possible to work on other comments
> and keep this cleanup for post-rc1 integration?
> I feel this cleanup is important to get the right design but it wouldn't be
> fair to block this (old) patchset for this reason.
>

I appreciate for your suggestion. I will keep working on it for post-rc1.

Thanks,
Tetsuya


[dpdk-dev] rte_memcpy optimization patch to dpdk ver 1.7

2015-02-18 Thread Vithal S Mohare
Ok, crash, as expected.   So, now dpdk mandates either AVX2 or SSSE2 supported 
CPUs.   OR applications needs to handle it run-time.

Thanks,
-Vithal

-Original Message-
From: Neil Horman [mailto:nhor...@tuxdriver.com] 
Sent: Tuesday, February 17, 2015 6:32 PM
To: Vithal S Mohare
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] rte_memcpy optimization patch to dpdk ver 1.7

On Tue, Feb 17, 2015 at 08:39:22AM +, Vithal S Mohare wrote:
> Hi,
> 
> I am trying to use rte_memcpy optimization patch along with dpdk version 1.7. 
>  With the patch, while dpdk itself is compiled, applications failed with 
> below error:
> ---
> include/rte_memcpy.h:629:2: error: implicit declaration of function 
> '_mm_alignr_epi8' [-Werror=implicit-function-declaration]
> /home/vithals/adu_src/build/x-men_dev/Default/shumway/infra/dpdk/shumway_obj/lib/../include/rte_memcpy.h:629:2:
>  error: incompatible type for argument 2 of '_mm_storeu_si128'
> ---
> 
> After including -mssse3 flags, compilation (cross compiled for a x86 linux 
> based platform) went through.  Now the question is, when this binary is 
> loaded on system that doesn't support SSSE3 instruction set (but just sse2 
> etc), what would be the behavior?
> 
A crash.  You'll attempt to send an unknown binary instruction into the 
execution pipeline and the processor will fault.

Neil




[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 10:54, Tetsuya Mukawa wrote:
> On 2015/02/18 9:31, Thomas Monjalon wrote:
>> 2015-02-17 15:14, Tetsuya Mukawa:
>>> On 2015/02/17 9:36, Thomas Monjalon wrote:
 2015-02-16 13:14, Tetsuya Mukawa:
 Is uint8_t sill a good size for hotpluggable virtual device ids?
>>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
>>> as port id.
>>> If someone reports it doesn't enough, I guess it will be the time to
>>> write a patch to change all uint_8 in one patch.
>> It's a big ABI breakage. So if we feel it's going to be required,
>> it's better to do it now in 2.0 release I think.
>>
>> Any opinion?
>>
> Hi Thomas,
>
> I agree with it.
> I will add an one more patch to change uint8_t to uint16_t.
>
> Thanks,
> Tetsuya
>

Hi Thomas,

Could I make sure.
After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
need to change other applications and libraries that call ethdev APIs?
If so, I would not finish it by 23rd.

I've counted how many lines call ethdev APIs that are related to port_id.
Could you please check an attached file?
It's over 1200 lines. Probably to fix  one of caller, I will need to
check how port_id is used, and fix more related lines. So probably
thousands lines may need to be fixed.

When is deadline for fixing this changing?
Also, if you have a good idea to fix it easier, could you please let me
know?

Thanks,
Tetsuya

-- next part --
rte_eth_dev_configure   app/test-pipeline/init.c240
rte_eth_dev_configure   app/test-pmd/testpmd.c  1304
rte_eth_dev_configure   app/test/test_kni.c 523
rte_eth_dev_configure   app/test/test_link_bonding.c238
rte_eth_dev_configure   app/test/test_link_bonding.c240
rte_eth_dev_configure   app/test/test_pmd_perf.c751
rte_eth_dev_configure   app/test/test_pmd_ring.c67
rte_eth_dev_configure   app/test/test_pmd_ring.c73
rte_eth_dev_configure   app/test/test_pmd_ring.c77
rte_eth_dev_configure   app/test/test_pmd_ring.c81
rte_eth_dev_configure   app/test/test_pmd_ring.c256
rte_eth_dev_configure   app/test/test_pmd_ring.c257
rte_eth_dev_configure   examples/distributor/main.c 125
rte_eth_dev_configure   examples/dpdk_qat/main.c726
rte_eth_dev_configure   examples/exception_path/main.c  433
rte_eth_dev_configure   examples/ip_fragmentation/main.c890
rte_eth_dev_configure   examples/ip_pipeline/init.c 486
rte_eth_dev_configure   examples/ip_reassembly/main.c   1095
rte_eth_dev_configure   examples/ipv4_multicast/main.c  755
rte_eth_dev_configure   examples/kni/main.c 617
rte_eth_dev_configure   examples/kni/main.c 725
rte_eth_dev_configure   examples/l2fwd-ivshmem/host/host.c  745
rte_eth_dev_configure   examples/l2fwd/main.c   650
rte_eth_dev_configure   examples/l3fwd-acl/main.c   1991
rte_eth_dev_configure   examples/l3fwd-power/main.c 1534
rte_eth_dev_configure   examples/l3fwd-vf/main.c1013
rte_eth_dev_configure   examples/l3fwd/main.c   2457
rte_eth_dev_configure   examples/link_status_interrupt/main.c   696
rte_eth_dev_configure   examples/link_status_interrupt/main.c   702
rte_eth_dev_configure   examples/load_balancer/init.c   446
rte_eth_dev_configure   
examples/multi_process/client_server_mp/mp_server/init.c144
rte_eth_dev_configure   examples/multi_process/l2fwd_fork/main.c1121
rte_eth_dev_configure   examples/multi_process/symmetric_mp/main.c  248
rte_eth_dev_configure   examples/netmap_compat/lib/compat_netmap.c  706
rte_eth_dev_configure   examples/qos_meter/main.c   370
rte_eth_dev_configure   examples/qos_meter/main.c   386
rte_eth_dev_configure   examples/qos_sched/init.c   129
rte_eth_dev_configure   examples/quota_watermark/qw/init.c  82
rte_eth_dev_configure   examples/skeleton/basicfwd.c69
rte_eth_dev_configure   examples/vhost/main.c   442
rte_eth_dev_configure   examples/vhost_xen/main.c   306
rte_eth_dev_configure   examples/vmdq/main.c254
rte_eth_dev_configure   examples/vmdq_dcb/main.c177
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.c   728
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   91
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   102
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1726
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1744
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1783
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1844
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1860
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1877
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   1893
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   2112
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   2133
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   2225
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   2377
rte_eth_dev_configure   lib/librte_ether/rte_ethdev.h   2492
rte_eth_

[dpdk-dev] Vhost-user early adopter feedback

2015-02-18 Thread BenoƮt Canet

Hello Xie,

As promized I integrated your vhost-user patchset from january in my vswitch.

I just tried it, it works pretty well.

I just had a minor bug with rte_vhost_driver_register taking ownership of the
string patch pointer too late. I freed it out of habit just after registering 
in the
caller and when ifname[IFNAMESIZ] was written the pointer was used for a new 
string I
allocated later. Maybe an early strdup() would fix this.

The last patch of your new version is really a great idea since it will
simplify a lot the socket creation and management code.

Best regards

Beno?t


[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
Hi Sergio,

On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> field in the mbuf struct permanently.
>
> Signed-off-by: Sergio Gonzalez Monroy 

I think removing the refcount compile option goes in the right
direction. However, activating the refcount will break the applications
that reserve a private zone in mbufs. This is due to the macros
RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
data buffer.

For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
mbuf pool could store the size of the private size like it's done
for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
or m->pool, we can retrieve the mbuf pool and this value, then
compute the buffer address.

For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
a backpointer to the mbuf is always located before the data buffer,
but it looks difficult to do.

Another idea would be to add a field in indirect mbufs that stores
the pointer to the "parent" mbuf.

Regards,
Olivier



[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Iremonger, Bernard


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Wednesday, February 18, 2015 6:10 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption 
> that port will not be
> detached
> 
> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
> > On 2015/02/18 9:31, Thomas Monjalon wrote:
> >> 2015-02-17 15:14, Tetsuya Mukawa:
> >>> On 2015/02/17 9:36, Thomas Monjalon wrote:
>  2015-02-16 13:14, Tetsuya Mukawa:
>  Is uint8_t sill a good size for hotpluggable virtual device ids?
> >>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> >>> as port id.
> >>> If someone reports it doesn't enough, I guess it will be the time to
> >>> write a patch to change all uint_8 in one patch.
> >> It's a big ABI breakage. So if we feel it's going to be required,
> >> it's better to do it now in 2.0 release I think.
> >>
> >> Any opinion?
> >>
> > Hi Thomas,
> >
> > I agree with it.
> > I will add an one more patch to change uint8_t to uint16_t.
> >
> > Thanks,
> > Tetsuya
> >
> 
> Hi Thomas,
> 
> Could I make sure.
> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also need to 
> change other applications
> and libraries that call ethdev APIs?
> If so, I would not finish it by 23rd.
> 
> I've counted how many lines call ethdev APIs that are related to port_id.
> Could you please check an attached file?
> It's over 1200 lines. Probably to fix  one of caller, I will need to check 
> how port_id is used, and fix more
> related lines. So probably thousands lines may need to be fixed.
> 
> When is deadline for fixing this changing?
> Also, if you have a good idea to fix it easier, could you please let me know?
> 
> Thanks,
> Tetsuya

Hi Tetsuya, Thomas,

As uint8_t is already widely used for port_id, I don't think it should be 
changed in this patchset.
If it is to be changed to uint16_t it should be done as a separate task (in a 
new patchset).

Regards,

Bernard.





[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> Hi Sergio,
> 
> On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> >This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> >field in the mbuf struct permanently.
> >
> >Signed-off-by: Sergio Gonzalez Monroy 
> 
> I think removing the refcount compile option goes in the right
> direction. However, activating the refcount will break the applications
> that reserve a private zone in mbufs. This is due to the macros
> RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> data buffer.
>

While I understand how the macros make certain assumptions, how does activating
the refcnt specifically lead to the problems you describe? Could you explain
that part in a bit more detail?

Thanks,
/Bruce

> For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> mbuf pool could store the size of the private size like it's done
> for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> or m->pool, we can retrieve the mbuf pool and this value, then
> compute the buffer address.
> 
> For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> a backpointer to the mbuf is always located before the data buffer,
> but it looks difficult to do.
> 
> Another idea would be to add a field in indirect mbufs that stores
> the pointer to the "parent" mbuf.
> 
> Regards,
> Olivier
> 


[dpdk-dev] [PATCH v8 08/14] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 10:09, Thomas Monjalon wrote:
> 2015-02-17 15:15, Tetsuya Mukawa:
>> On 2015/02/17 10:11, Thomas Monjalon wrote:
>>> 2015-02-16 13:14, Tetsuya Mukawa:
 +#ifdef ENABLE_HOTPLUG
>>> Please avoid using #ifdef if not really necessary.
>> I agree with you.
>> In this case, only hotplug functions call pci_unmap_resource().
>> So this will be needed.
> Why is it needed?

It was my misunderstanding. Yes, we can remove it.
I will fix it.

Thanks,
Tetsuya


[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Ananyev, Konstantin
Hi lads,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, February 18, 2015 9:36 AM
> To: Olivier MATZ
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> 
> On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> > Hi Sergio,
> >
> > On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> > >This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> > >field in the mbuf struct permanently.
> > >
> > >Signed-off-by: Sergio Gonzalez Monroy 
> >
> > I think removing the refcount compile option goes in the right
> > direction. However, activating the refcount will break the applications
> > that reserve a private zone in mbufs. This is due to the macros
> > RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> > the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> > data buffer.
> >
> 
> While I understand how the macros make certain assumptions, how does 
> activating
> the refcnt specifically lead to the problems you describe? Could you explain
> that part in a bit more detail?
> 
> Thanks,
> /Bruce
> 

Olivier, I also don't understand your concern here.
As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
RTE_MBUF_FROM_BADDR macros.
They are still there, for example rte_pktmbuf_detach() still uses it to restore 
mbuf's buf_addr.
The only principal change here, is that we don't rely more  on 
RTE_MBUF_FROM_BADDR to determine,
Is that indirect mbuf or not. 
Instead we use a special falg for that purpose:

-#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
+#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)

BTW, Sergio as I said before, I think it should be:
#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)

Konstantin


> > For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> > mbuf pool could store the size of the private size like it's done
> > for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> > or m->pool, we can retrieve the mbuf pool and this value, then
> > compute the buffer address.
> >
> > For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> > a backpointer to the mbuf is always located before the data buffer,
> > but it looks difficult to do.
> >
> > Another idea would be to add a field in indirect mbufs that stores
> > the pointer to the "parent" mbuf.
> >
> > Regards,
> > Olivier
> >


[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
Hi Bruce,

On 02/18/2015 10:35 AM, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
>> Hi Sergio,
>>
>> On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
>>> This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
>>> field in the mbuf struct permanently.
>>>
>>> Signed-off-by: Sergio Gonzalez Monroy 
>>
>> I think removing the refcount compile option goes in the right
>> direction. However, activating the refcount will break the applications
>> that reserve a private zone in mbufs. This is due to the macros
>> RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
>> the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
>> data buffer.
>>
>
> While I understand how the macros make certain assumptions, how does 
> activating
> the refcnt specifically lead to the problems you describe? Could you explain
> that part in a bit more detail?

Indeed, you are right, removing the refcount option does not break
the applications if they do not use it. So there is probably no need
to fix that problem in this patch series.

However we should consider this problem as the refcount (which can
not be deactivated now) is not compatible with the private mbuf data.

By the way, once we are on this, shouldn't we consider removing the
RTE_MBUF_REFCNT_ATOMIC compile option?


Regards,
Olivier



>
> Thanks,
> /Bruce
>
>> For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
>> mbuf pool could store the size of the private size like it's done
>> for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
>> or m->pool, we can retrieve the mbuf pool and this value, then
>> compute the buffer address.
>>
>> For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
>> a backpointer to the mbuf is always located before the data buffer,
>> but it looks difficult to do.
>>
>> Another idea would be to add a field in indirect mbufs that stores
>> the pointer to the "parent" mbuf.
>>
>> Regards,
>> Olivier
>>



[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Thomas Monjalon
2015-02-18 15:10, Tetsuya Mukawa:
> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
> > On 2015/02/18 9:31, Thomas Monjalon wrote:
> >> 2015-02-17 15:14, Tetsuya Mukawa:
> >>> On 2015/02/17 9:36, Thomas Monjalon wrote:
>  2015-02-16 13:14, Tetsuya Mukawa:
>  Is uint8_t sill a good size for hotpluggable virtual device ids?
> >>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> >>> as port id.
> >>> If someone reports it doesn't enough, I guess it will be the time to
> >>> write a patch to change all uint_8 in one patch.
> >> It's a big ABI breakage. So if we feel it's going to be required,
> >> it's better to do it now in 2.0 release I think.
> >>
> >> Any opinion?
> >>
> > Hi Thomas,
> >
> > I agree with it.
> > I will add an one more patch to change uint8_t to uint16_t.
> >
> > Thanks,
> > Tetsuya
> >
> 
> Hi Thomas,
> 
> Could I make sure.
> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> need to change other applications and libraries that call ethdev APIs?
> If so, I would not finish it by 23rd.
> 
> I've counted how many lines call ethdev APIs that are related to port_id.
> Could you please check an attached file?
> It's over 1200 lines. Probably to fix  one of caller, I will need to
> check how port_id is used, and fix more related lines. So probably
> thousands lines may need to be fixed.
> 
> When is deadline for fixing this changing?
> Also, if you have a good idea to fix it easier, could you please let me
> know?

It was an open question.
If everybody is fine with 255 ports maximum, let's keep it as is.



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
> Hi lads,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, February 18, 2015 9:36 AM
> > To: Olivier MATZ
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> > 
> > On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> > > Hi Sergio,
> > >
> > > On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> > > >This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> > > >field in the mbuf struct permanently.
> > > >
> > > >Signed-off-by: Sergio Gonzalez Monroy  > > >intel.com>
> > >
> > > I think removing the refcount compile option goes in the right
> > > direction. However, activating the refcount will break the applications
> > > that reserve a private zone in mbufs. This is due to the macros
> > > RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> > > the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> > > data buffer.
> > >
> > 
> > While I understand how the macros make certain assumptions, how does 
> > activating
> > the refcnt specifically lead to the problems you describe? Could you explain
> > that part in a bit more detail?
> > 
> > Thanks,
> > /Bruce
> > 
> 
> Olivier, I also don't understand your concern here.
> As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
> RTE_MBUF_FROM_BADDR macros.
> They are still there, for example rte_pktmbuf_detach() still uses it to 
> restore mbuf's buf_addr.
> The only principal change here, is that we don't rely more  on 
> RTE_MBUF_FROM_BADDR to determine,
> Is that indirect mbuf or not. 
> Instead we use a special falg for that purpose:
> 
> -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
> +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
>  
> BTW, Sergio as I said before, I think it should be:
> #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> 
> Konstantin
> 
> 
> > > For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> > > mbuf pool could store the size of the private size like it's done
> > > for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> > > or m->pool, we can retrieve the mbuf pool and this value, then
> > > compute the buffer address.

Agreed, that makes sense.

> > >
> > > For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> > > a backpointer to the mbuf is always located before the data buffer,
> > > but it looks difficult to do.

On the other hand, with the proposed refcnt change Sergio proposes, we no
longer use this macro in any of the built-in mbuf handling for freeing mbufs.
Does this need to be solved at anything other than the application level?

/Bruce

> > >
> > > Another idea would be to add a field in indirect mbufs that stores
> > > the pointer to the "parent" mbuf.
> > >
> > > Regards,
> > > Olivier
> > >


[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
> 2015-02-18 15:10, Tetsuya Mukawa:
> > On 2015/02/18 10:54, Tetsuya Mukawa wrote:
> > > On 2015/02/18 9:31, Thomas Monjalon wrote:
> > >> 2015-02-17 15:14, Tetsuya Mukawa:
> > >>> On 2015/02/17 9:36, Thomas Monjalon wrote:
> >  2015-02-16 13:14, Tetsuya Mukawa:
> >  Is uint8_t sill a good size for hotpluggable virtual device ids?
> > >>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> > >>> as port id.
> > >>> If someone reports it doesn't enough, I guess it will be the time to
> > >>> write a patch to change all uint_8 in one patch.
> > >> It's a big ABI breakage. So if we feel it's going to be required,
> > >> it's better to do it now in 2.0 release I think.
> > >>
> > >> Any opinion?
> > >>
> > > Hi Thomas,
> > >
> > > I agree with it.
> > > I will add an one more patch to change uint8_t to uint16_t.
> > >
> > > Thanks,
> > > Tetsuya
> > >
> > 
> > Hi Thomas,
> > 
> > Could I make sure.
> > After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> > need to change other applications and libraries that call ethdev APIs?
> > If so, I would not finish it by 23rd.
> > 
> > I've counted how many lines call ethdev APIs that are related to port_id.
> > Could you please check an attached file?
> > It's over 1200 lines. Probably to fix  one of caller, I will need to
> > check how port_id is used, and fix more related lines. So probably
> > thousands lines may need to be fixed.
> > 
> > When is deadline for fixing this changing?
> > Also, if you have a good idea to fix it easier, could you please let me
> > know?
> 
> It was an open question.
> If everybody is fine with 255 ports maximum, let's keep it as is.
> 
I think we are probably ok for now (and forseeable future) with 255 max.

However, if we did change it, I agree that in 2.0 is a very good time to do so.
Since we are expanding the field, rather than shrinking it, I don't see why we
can't just make the change at the ethdev level (and in libs API) in 2.0 and 
then in
later releases (e.g. 2.1) update the apps and examples to match. That way the
ABI stays the same from 2.0 onwards, and we don't have a huge amount of churn
changing it everywhere late in the 2.0 release cycle.

/Bruce


[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
On 02/18/2015 11:00 AM, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
>> Hi lads,
>>
>>> -Original Message-
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
>>> Sent: Wednesday, February 18, 2015 9:36 AM
>>> To: Olivier MATZ
>>> Cc: dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
>>>
>>> On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
 Hi Sergio,

 On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> field in the mbuf struct permanently.
>
> Signed-off-by: Sergio Gonzalez Monroy  intel.com>

 I think removing the refcount compile option goes in the right
 direction. However, activating the refcount will break the applications
 that reserve a private zone in mbufs. This is due to the macros
 RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
 the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
 data buffer.

>>>
>>> While I understand how the macros make certain assumptions, how does 
>>> activating
>>> the refcnt specifically lead to the problems you describe? Could you explain
>>> that part in a bit more detail?
>>>
>>> Thanks,
>>> /Bruce
>>>
>>
>> Olivier, I also don't understand your concern here.
>> As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
>> RTE_MBUF_FROM_BADDR macros.
>> They are still there, for example rte_pktmbuf_detach() still uses it to 
>> restore mbuf's buf_addr.
>> The only principal change here, is that we don't rely more  on 
>> RTE_MBUF_FROM_BADDR to determine,
>> Is that indirect mbuf or not.
>> Instead we use a special falg for that purpose:
>>
>> -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
>> (mb))
>> +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
>>
>> BTW, Sergio as I said before, I think it should be:
>> #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
>>
>> Konstantin
>>
>>
 For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
 mbuf pool could store the size of the private size like it's done
 for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
 or m->pool, we can retrieve the mbuf pool and this value, then
 compute the buffer address.
>
> Agreed, that makes sense.
>

 For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
 a backpointer to the mbuf is always located before the data buffer,
 but it looks difficult to do.
>
> On the other hand, with the proposed refcnt change Sergio proposes, we no
> longer use this macro in any of the built-in mbuf handling for freeing mbufs.
> Does this need to be solved at anything other than the application level?

It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
parent mbuf (direct) from the indirect mbuf beeing freed.




 Another idea would be to add a field in indirect mbufs that stores
 the pointer to the "parent" mbuf.

 Regards,
 Olivier




[dpdk-dev] Using bypass on Silicom PE210G2BPi9 with Intel 82599ES chip

2015-02-18 Thread Martin DraŔar
Hi,

I hope that I am using the correct mailing list...

I have a Silicom PE210G2BPi9 LR card with Intel 82599ES chip that has a
bypass. Now I would like to turn the bypass on and off from software.
However, I am having some problems with it.

I am able to check whether the card supports the bypass (i.e. for a card
without bypass it shows me that bypass is not supported and for this one
is), but I am not able to correctly get or set the bypass mode.

If I call the following:

uint32_t bypassState;
rc = rte_eth_dev_bypass_state_show(port, &bypassState);

Then the bypass state is always 0, which, according to documentation, is
not a valid value. The same goes for rte_eth_dev_bypass_state_set.

Does the DPDK support this card and this functionality?

I am currently using DPDK 1.5.2r1.

Thank you.
Martin



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Ananyev, Konstantin


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, February 18, 2015 10:15 AM
> To: Richardson, Bruce; Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> 
> On 02/18/2015 11:00 AM, Bruce Richardson wrote:
> > On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
> >> Hi lads,
> >>
> >>> -Original Message-
> >>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> >>> Sent: Wednesday, February 18, 2015 9:36 AM
> >>> To: Olivier MATZ
> >>> Cc: dev at dpdk.org
> >>> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> >>>
> >>> On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
>  Hi Sergio,
> 
>  On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> > This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> > field in the mbuf struct permanently.
> >
> > Signed-off-by: Sergio Gonzalez Monroy  > intel.com>
> 
>  I think removing the refcount compile option goes in the right
>  direction. However, activating the refcount will break the applications
>  that reserve a private zone in mbufs. This is due to the macros
>  RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
>  the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
>  data buffer.
> 
> >>>
> >>> While I understand how the macros make certain assumptions, how does 
> >>> activating
> >>> the refcnt specifically lead to the problems you describe? Could you 
> >>> explain
> >>> that part in a bit more detail?
> >>>
> >>> Thanks,
> >>> /Bruce
> >>>
> >>
> >> Olivier, I also don't understand your concern here.
> >> As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
> >> RTE_MBUF_FROM_BADDR macros.
> >> They are still there, for example rte_pktmbuf_detach() still uses it to 
> >> restore mbuf's buf_addr.
> >> The only principal change here, is that we don't rely more  on 
> >> RTE_MBUF_FROM_BADDR to determine,
> >> Is that indirect mbuf or not.
> >> Instead we use a special falg for that purpose:
> >>
> >> -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
> >> (mb))
> >> +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
> >>
> >> BTW, Sergio as I said before, I think it should be:
> >> #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> >>
> >> Konstantin
> >>
> >>
>  For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
>  mbuf pool could store the size of the private size like it's done
>  for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
>  or m->pool, we can retrieve the mbuf pool and this value, then
>  compute the buffer address.
> >
> > Agreed, that makes sense.
> >
> 
>  For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
>  a backpointer to the mbuf is always located before the data buffer,
>  but it looks difficult to do.
> >
> > On the other hand, with the proposed refcnt change Sergio proposes, we no
> > longer use this macro in any of the built-in mbuf handling for freeing 
> > mbufs.
> > Does this need to be solved at anything other than the application level?
> 
> It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
> parent mbuf (direct) from the indirect mbuf beeing freed.
> 

Yes, if the INDIRECT flag is set.
Though I still don't understand, what is the problem with these 2 macros with 
that patch?
Why we need to replace it with something?
What exactly you think will be broken?

Konstantin

> 
> 
> 
>  Another idea would be to add a field in indirect mbufs that stores
>  the pointer to the "parent" mbuf.
> 
>  Regards,
>  Olivier
> 



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 11:14:42AM +0100, Olivier MATZ wrote:
> On 02/18/2015 11:00 AM, Bruce Richardson wrote:
> >On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
> >>Hi lads,
> >>
> >>>-Original Message-
> >>>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> >>>Sent: Wednesday, February 18, 2015 9:36 AM
> >>>To: Olivier MATZ
> >>>Cc: dev at dpdk.org
> >>>Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> >>>
> >>>On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> Hi Sergio,
> 
> On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> >This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
> >field in the mbuf struct permanently.
> >
> >Signed-off-by: Sergio Gonzalez Monroy  >intel.com>
> 
> I think removing the refcount compile option goes in the right
> direction. However, activating the refcount will break the applications
> that reserve a private zone in mbufs. This is due to the macros
> RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> data buffer.
> 
> >>>
> >>>While I understand how the macros make certain assumptions, how does 
> >>>activating
> >>>the refcnt specifically lead to the problems you describe? Could you 
> >>>explain
> >>>that part in a bit more detail?
> >>>
> >>>Thanks,
> >>>/Bruce
> >>>
> >>
> >>Olivier, I also don't understand your concern here.
> >>As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
> >>RTE_MBUF_FROM_BADDR macros.
> >>They are still there, for example rte_pktmbuf_detach() still uses it to 
> >>restore mbuf's buf_addr.
> >>The only principal change here, is that we don't rely more  on 
> >>RTE_MBUF_FROM_BADDR to determine,
> >>Is that indirect mbuf or not.
> >>Instead we use a special falg for that purpose:
> >>
> >>-#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
> >>(mb))
> >>+#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
> >>
> >>BTW, Sergio as I said before, I think it should be:
> >>#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> >>
> >>Konstantin
> >>
> >>
> For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> mbuf pool could store the size of the private size like it's done
> for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> or m->pool, we can retrieve the mbuf pool and this value, then
> compute the buffer address.
> >
> >Agreed, that makes sense.
> >
> 
> For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> a backpointer to the mbuf is always located before the data buffer,
> but it looks difficult to do.
> >
> >On the other hand, with the proposed refcnt change Sergio proposes, we no
> >longer use this macro in any of the built-in mbuf handling for freeing mbufs.
> >Does this need to be solved at anything other than the application level?
> 
> It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
> parent mbuf (direct) from the indirect mbuf beeing freed.
>
Yes, my bad.
How was this managed before, since refcnt field seems to be necessary in order
to effectively manage indirect mbufs? Is this just the case that this is 
something
that never worked and that needs to be solved, or is it something that was 
working that this patch will now break?

Thanks,
/Bruce


[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Iremonger, Bernard
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Wednesday, February 18, 2015 1:55 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address 
> comparison APIs
> 
> On 2015/02/18 10:02, Thomas Monjalon wrote:
> > 2015-02-17 15:14, Tetsuya Mukawa:
> >> On 2015/02/17 9:44, Thomas Monjalon wrote:
> >>> 2015-02-16 13:14, Tetsuya Mukawa:
>  @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
>   }
>   else {
>   struct rte_pci_device *dev2 = NULL;
>  +int ret;
> 
>   TAILQ_FOREACH(dev2, &pci_device_list, next) {
>  -if (pci_addr_comparison(&dev->addr, 
>  &dev2->addr))
>  +ret = eal_compare_pci_addr(&dev->addr, 
>  &dev2->addr);
>  +if (ret > 0)
>   continue;
>  -else {
>  +else if (ret < 0) {
>   TAILQ_INSERT_BEFORE(dev2, dev, next);
>   return 0;
>  +} else { /* already registered */
>  +/* update pt_driver */
>  +dev2->pt_driver = dev->pt_driver;
>  +dev2->max_vfs = dev->max_vfs;
>  +memmove(dev2->mem_resource,
>  +dev->mem_resource,
>  +sizeof(dev->mem_resource));
>  +free(dev);
>  +return 0;
> >>> Could you comment this "else part" please?
> >> PCI device list is allocated when rte_eal_init() is called. At the
> >> time, to fill pci device information, sysfs value is used.
> >> If sysfs values written by kernel device driver will not be changed
> >> by igb_uio, vfio or pci_uio_genereic, above code isn't needed.
> >> But actually above values are changed or added by them.
> >>
> >> Here is a step to cause issue.
> >> 1. Boot linux.
> >> 2. Start DPDK application without any physical NIC ports.
> >>  - Here, some sysfs values are read, and store to pci device list.
> >> 3. igb_uio starts managing a port.
> >>  - Here, some sysfs values are changed.
> >> 4. Add a NIC port to DPDK application using hotplug functions.
> >>  - Here, we need to replace some values.
> > I think that you are showing that something is wrongly designed in
> > these EAL structures. I suggest to try cleaning this mess instead of 
> > workarounding.

Hi Tetsuya, Thomas,
I think that redesigning the EAL structures is beyond the scope of this 
patchset and should be undertaken as a separate task.
I suspect there may be a problem in the original code when  a device which was 
using a kernel driver is bound to igb_uio.  The igb_uio  driver adds 
/sys/bus/pci/devices/\:05\:00.0/max_vfs.

Regards,

Bernard.

> >
> > [...]
>  -if (memcmp(&uio_res->pci_addr, &dev->addr, 
>  sizeof(dev->addr)))
>  +if (eal_compare_pci_addr(&uio_res->pci_addr, 
>  &dev->addr))
> >>> Why memcmp is not sufficient to compare PCI addresses?
> >>> The only exception I see is endianness for natural sorting.
> >> Here is the definition.
> >>
> >> struct rte_pci_addr {
> >> uint16_t domain;/**< Device domain */
> >> uint8_t bus;/**< Device bus */
> >> uint8_t devid;  /**< Device ID */
> >> uint8_t function;   /**< Device function. */
> >> };
> >>
> >> But, sizeof(struct rte_pci_addr) will be 6.
> >> If rte_pci_addr is allocated in a function without bzero, last 1 byte
> >> may have some value.
> >> As a result, memcmp may not work. To avoid such a case, I checked
> >> like above.
> > OK thanks. That's the kind of information which is valuable in a commit log.
> >
> 
> Sure I will add it.
> 
> Thanks,
> Tetsuya


[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Thomas Monjalon
Hi Bernard,

2015-02-18 10:26, Iremonger, Bernard:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> > On 2015/02/18 10:02, Thomas Monjalon wrote:
> > > 2015-02-17 15:14, Tetsuya Mukawa:
> > >> On 2015/02/17 9:44, Thomas Monjalon wrote:
> > >>> 2015-02-16 13:14, Tetsuya Mukawa:
> >  @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf 
> >  *conf)
> > }
> > else {
> > struct rte_pci_device *dev2 = NULL;
> >  +  int ret;
> > 
> > TAILQ_FOREACH(dev2, &pci_device_list, next) {
> >  -  if (pci_addr_comparison(&dev->addr, 
> >  &dev2->addr))
> >  +  ret = eal_compare_pci_addr(&dev->addr, 
> >  &dev2->addr);
> >  +  if (ret > 0)
> > continue;
> >  -  else {
> >  +  else if (ret < 0) {
> > TAILQ_INSERT_BEFORE(dev2, dev, next);
> > return 0;
> >  +  } else { /* already registered */
> >  +  /* update pt_driver */
> >  +  dev2->pt_driver = dev->pt_driver;
> >  +  dev2->max_vfs = dev->max_vfs;
> >  +  memmove(dev2->mem_resource,
> >  +  dev->mem_resource,
> >  +  sizeof(dev->mem_resource));
> >  +  free(dev);
> >  +  return 0;
> > >>> Could you comment this "else part" please?
> > >> PCI device list is allocated when rte_eal_init() is called. At the
> > >> time, to fill pci device information, sysfs value is used.
> > >> If sysfs values written by kernel device driver will not be changed
> > >> by igb_uio, vfio or pci_uio_genereic, above code isn't needed.
> > >> But actually above values are changed or added by them.
> > >>
> > >> Here is a step to cause issue.
> > >> 1. Boot linux.
> > >> 2. Start DPDK application without any physical NIC ports.
> > >>  - Here, some sysfs values are read, and store to pci device list.
> > >> 3. igb_uio starts managing a port.
> > >>  - Here, some sysfs values are changed.
> > >> 4. Add a NIC port to DPDK application using hotplug functions.
> > >>  - Here, we need to replace some values.
> > > 
> > > I think that you are showing that something is wrongly designed in
> > > these EAL structures. I suggest to try cleaning this mess instead of 
> > > workarounding.
> 
> Hi Tetsuya, Thomas,
> I think that redesigning the EAL structures is beyond the scope of this 
> patchset and should be undertaken as a separate task.

I strongly disagrees this opinion. We should never workaround design problems
and add more complex/weird code.
I think that this kind of consideration is the heart of some design problems we
have to face today. Please let's stop adding some code which just works without
thinking the whole design.

> I suspect there may be a problem in the original code when  a device which 
> was using a kernel driver is bound to igb_uio.  The igb_uio  driver adds 
> /sys/bus/pci/devices/\:05\:00.0/max_vfs.



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
Hi,

On 02/18/2015 11:22 AM, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 11:14:42AM +0100, Olivier MATZ wrote:
>> On 02/18/2015 11:00 AM, Bruce Richardson wrote:
>>> On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
 Hi lads,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, February 18, 2015 9:36 AM
> To: Olivier MATZ
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
>
> On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
>> Hi Sergio,
>>
>> On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
>>> This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
>>> field in the mbuf struct permanently.
>>>
>>> Signed-off-by: Sergio Gonzalez Monroy >> intel.com>
>>
>> I think removing the refcount compile option goes in the right
>> direction. However, activating the refcount will break the applications
>> that reserve a private zone in mbufs. This is due to the macros
>> RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
>> the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
>> data buffer.
>>
>
> While I understand how the macros make certain assumptions, how does 
> activating
> the refcnt specifically lead to the problems you describe? Could you 
> explain
> that part in a bit more detail?
>
> Thanks,
> /Bruce
>

 Olivier, I also don't understand your concern here.
 As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
 RTE_MBUF_FROM_BADDR macros.
 They are still there, for example rte_pktmbuf_detach() still uses it to 
 restore mbuf's buf_addr.
 The only principal change here, is that we don't rely more  on 
 RTE_MBUF_FROM_BADDR to determine,
 Is that indirect mbuf or not.
 Instead we use a special falg for that purpose:

 -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
 (mb))
 +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)

 BTW, Sergio as I said before, I think it should be:
 #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)

 Konstantin


>> For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
>> mbuf pool could store the size of the private size like it's done
>> for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
>> or m->pool, we can retrieve the mbuf pool and this value, then
>> compute the buffer address.
>>>
>>> Agreed, that makes sense.
>>>
>>
>> For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
>> a backpointer to the mbuf is always located before the data buffer,
>> but it looks difficult to do.
>>>
>>> On the other hand, with the proposed refcnt change Sergio proposes, we no
>>> longer use this macro in any of the built-in mbuf handling for freeing 
>>> mbufs.
>>> Does this need to be solved at anything other than the application level?
>>
>> It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
>> parent mbuf (direct) from the indirect mbuf beeing freed.
>>
> Yes, my bad.
> How was this managed before, since refcnt field seems to be necessary in order
> to effectively manage indirect mbufs? Is this just the case that this is 
> something
> that never worked and that needs to be solved, or is it something that was
> working that this patch will now break?

This is something that never worked before: refcounts are not compatible
with reserving private data in mbufs. This patch does not change the
issue, it is still there.

Before the patch, an application that wanted to reserve a private
data could disable refcounts at compile-time.
After the patch, the solution is just to avoid using refcounts.

Regards,
Olivier




[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 11:33:48AM +0100, Olivier MATZ wrote:
> Hi,
> 
> On 02/18/2015 11:22 AM, Bruce Richardson wrote:
> >On Wed, Feb 18, 2015 at 11:14:42AM +0100, Olivier MATZ wrote:
> >>On 02/18/2015 11:00 AM, Bruce Richardson wrote:
> >>>On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
> Hi lads,
> 
> >-Original Message-
> >From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> >Sent: Wednesday, February 18, 2015 9:36 AM
> >To: Olivier MATZ
> >Cc: dev at dpdk.org
> >Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> >
> >On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> >>Hi Sergio,
> >>
> >>On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> >>>This patch removes all references to RTE_MBUF_REFCNT, setting the 
> >>>refcnt
> >>>field in the mbuf struct permanently.
> >>>
> >>>Signed-off-by: Sergio Gonzalez Monroy  >>>intel.com>
> >>
> >>I think removing the refcount compile option goes in the right
> >>direction. However, activating the refcount will break the applications
> >>that reserve a private zone in mbufs. This is due to the macros
> >>RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> >>the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> >>data buffer.
> >>
> >
> >While I understand how the macros make certain assumptions, how does 
> >activating
> >the refcnt specifically lead to the problems you describe? Could you 
> >explain
> >that part in a bit more detail?
> >
> >Thanks,
> >/Bruce
> >
> 
> Olivier, I also don't understand your concern here.
> As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
> RTE_MBUF_FROM_BADDR macros.
> They are still there, for example rte_pktmbuf_detach() still uses it to 
> restore mbuf's buf_addr.
> The only principal change here, is that we don't rely more  on 
> RTE_MBUF_FROM_BADDR to determine,
> Is that indirect mbuf or not.
> Instead we use a special falg for that purpose:
> 
> -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
> (mb))
> +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
> 
> BTW, Sergio as I said before, I think it should be:
> #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> 
> Konstantin
> 
> 
> >>For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> >>mbuf pool could store the size of the private size like it's done
> >>for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> >>or m->pool, we can retrieve the mbuf pool and this value, then
> >>compute the buffer address.
> >>>
> >>>Agreed, that makes sense.
> >>>
> >>
> >>For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> >>a backpointer to the mbuf is always located before the data buffer,
> >>but it looks difficult to do.
> >>>
> >>>On the other hand, with the proposed refcnt change Sergio proposes, we no
> >>>longer use this macro in any of the built-in mbuf handling for freeing 
> >>>mbufs.
> >>>Does this need to be solved at anything other than the application level?
> >>
> >>It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
> >>parent mbuf (direct) from the indirect mbuf beeing freed.
> >>
> >Yes, my bad.
> >How was this managed before, since refcnt field seems to be necessary in 
> >order
> >to effectively manage indirect mbufs? Is this just the case that this is 
> >something
> >that never worked and that needs to be solved, or is it something that was
> >working that this patch will now break?
> 
> This is something that never worked before: refcounts are not compatible
> with reserving private data in mbufs. This patch does not change the
> issue, it is still there.
> 
> Before the patch, an application that wanted to reserve a private
> data could disable refcounts at compile-time.
> After the patch, the solution is just to avoid using refcounts.
> 
> Regards,
> Olivier
> 
Thanks for clarifying.
So, you ok with this patch as a step in the right direction?



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Ananyev, Konstantin


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, February 18, 2015 10:34 AM
> To: Richardson, Bruce
> Cc: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> 
> Hi,
> 
> On 02/18/2015 11:22 AM, Bruce Richardson wrote:
> > On Wed, Feb 18, 2015 at 11:14:42AM +0100, Olivier MATZ wrote:
> >> On 02/18/2015 11:00 AM, Bruce Richardson wrote:
> >>> On Wed, Feb 18, 2015 at 09:48:58AM +, Ananyev, Konstantin wrote:
>  Hi lads,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, February 18, 2015 9:36 AM
> > To: Olivier MATZ
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references
> >
> > On Wed, Feb 18, 2015 at 10:16:56AM +0100, Olivier MATZ wrote:
> >> Hi Sergio,
> >>
> >> On 02/16/2015 05:08 PM, Sergio Gonzalez Monroy wrote:
> >>> This patch removes all references to RTE_MBUF_REFCNT, setting the 
> >>> refcnt
> >>> field in the mbuf struct permanently.
> >>>
> >>> Signed-off-by: Sergio Gonzalez Monroy  >>> intel.com>
> >>
> >> I think removing the refcount compile option goes in the right
> >> direction. However, activating the refcount will break the applications
> >> that reserve a private zone in mbufs. This is due to the macros
> >> RTE_MBUF_TO_BADDR() and RTE_MBUF_FROM_BADDR() that suppose that
> >> the beginning of the mbuf is 128 bytes (sizeof mbuf) before the
> >> data buffer.
> >>
> >
> > While I understand how the macros make certain assumptions, how does 
> > activating
> > the refcnt specifically lead to the problems you describe? Could you 
> > explain
> > that part in a bit more detail?
> >
> > Thanks,
> > /Bruce
> >
> 
>  Olivier, I also don't understand your concern here.
>  As I can see, that patch has nothing to do with RTE_MBUF_FROM_BADDR/ 
>  RTE_MBUF_FROM_BADDR macros.
>  They are still there, for example rte_pktmbuf_detach() still uses it to 
>  restore mbuf's buf_addr.
>  The only principal change here, is that we don't rely more  on 
>  RTE_MBUF_FROM_BADDR to determine,
>  Is that indirect mbuf or not.
>  Instead we use a special falg for that purpose:
> 
>  -#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != 
>  (mb))
>  +#define RTE_MBUF_INDIRECT(mb)   (mb->ol_flags & IND_ATTACHED_MBUF)
> 
>  BTW, Sergio as I said before, I think it should be:
>  #define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> 
>  Konstantin
> 
> 
> >> For RTE_MBUF_TO_BADDR(), it's relatively easy to replace it. The
> >> mbuf pool could store the size of the private size like it's done
> >> for mbp_priv->mbuf_data_room_size. Using rte_mempool_from_obj(m)
> >> or m->pool, we can retrieve the mbuf pool and this value, then
> >> compute the buffer address.
> >>>
> >>> Agreed, that makes sense.
> >>>
> >>
> >> For RTE_MBUF_FROM_BADDR(), it's more complex. We could ensure that
> >> a backpointer to the mbuf is always located before the data buffer,
> >> but it looks difficult to do.
> >>>
> >>> On the other hand, with the proposed refcnt change Sergio proposes, we no
> >>> longer use this macro in any of the built-in mbuf handling for freeing 
> >>> mbufs.
> >>> Does this need to be solved at anything other than the application level?
> >>
> >> It's still used in __rte_pktmbuf_prefree_seg() to retrieve the
> >> parent mbuf (direct) from the indirect mbuf beeing freed.
> >>
> > Yes, my bad.
> > How was this managed before, since refcnt field seems to be necessary in 
> > order
> > to effectively manage indirect mbufs? Is this just the case that this is 
> > something
> > that never worked and that needs to be solved, or is it something that was
> > working that this patch will now break?
> 
> This is something that never worked before: refcounts are not compatible
> with reserving private data in mbufs. This patch does not change the
> issue, it is still there.
> 
> Before the patch, an application that wanted to reserve a private
> data could disable refcounts at compile-time.
> After the patch, the solution is just to avoid using refcounts.

I'd say avoid using mbuf_attach/detach.
refcnt itself has nothing to do with that.
I finally understood what you  are talking about ...
About private data - I suppose it is a matter of another patch.
I still think it would be better to reserve private data space before mbuf, not 
after
(at mbuf pool initialisation time). 
Then *BADDR* macros could be unaffected.
Konstantin

> 
> Regards,
> Olivier
> 



[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
Hi,

On 02/18/2015 11:37 AM, Bruce Richardson wrote:
>>> How was this managed before, since refcnt field seems to be necessary in 
>>> order
>>> to effectively manage indirect mbufs? Is this just the case that this is 
>>> something
>>> that never worked and that needs to be solved, or is it something that was
>>> working that this patch will now break?
>>
>> This is something that never worked before: refcounts are not compatible
>> with reserving private data in mbufs. This patch does not change the
>> issue, it is still there.
>>
>> Before the patch, an application that wanted to reserve a private
>> data could disable refcounts at compile-time.
>> After the patch, the solution is just to avoid using refcounts.
>>
>> Regards,
>> Olivier
>>
> Thanks for clarifying.
> So, you ok with this patch as a step in the right direction?

Yep,
Acked-by: Olivier Matz 




[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 19:03, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
>> 2015-02-18 15:10, Tetsuya Mukawa:
>>> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
 On 2015/02/18 9:31, Thomas Monjalon wrote:
> 2015-02-17 15:14, Tetsuya Mukawa:
>> On 2015/02/17 9:36, Thomas Monjalon wrote:
>>> 2015-02-16 13:14, Tetsuya Mukawa:
>>> Is uint8_t sill a good size for hotpluggable virtual device ids?
>> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
>> as port id.
>> If someone reports it doesn't enough, I guess it will be the time to
>> write a patch to change all uint_8 in one patch.
> It's a big ABI breakage. So if we feel it's going to be required,
> it's better to do it now in 2.0 release I think.
>
> Any opinion?
>
 Hi Thomas,

 I agree with it.
 I will add an one more patch to change uint8_t to uint16_t.

 Thanks,
 Tetsuya

>>> Hi Thomas,
>>>
>>> Could I make sure.
>>> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
>>> need to change other applications and libraries that call ethdev APIs?
>>> If so, I would not finish it by 23rd.
>>>
>>> I've counted how many lines call ethdev APIs that are related to port_id.
>>> Could you please check an attached file?
>>> It's over 1200 lines. Probably to fix  one of caller, I will need to
>>> check how port_id is used, and fix more related lines. So probably
>>> thousands lines may need to be fixed.
>>>
>>> When is deadline for fixing this changing?
>>> Also, if you have a good idea to fix it easier, could you please let me
>>> know?
>> It was an open question.
>> If everybody is fine with 255 ports maximum, let's keep it as is.
>>
> I think we are probably ok for now (and forseeable future) with 255 max.
>
> However, if we did change it, I agree that in 2.0 is a very good time to do 
> so.
> Since we are expanding the field, rather than shrinking it, I don't see why we
> can't just make the change at the ethdev level (and in libs API) in 2.0 and 
> then in
> later releases (e.g. 2.1) update the apps and examples to match. That way the
> ABI stays the same from 2.0 onwards, and we don't have a huge amount of churn
> changing it everywhere late in the 2.0 release cycle.

Hi Bruce,

Could you please check my RFC patch I will send soon?
I wrote the patch like below.

1. Copy header file like below.
$ cp lib/librte_ether/rte_ethdev.h lib/librte_ether/rte_ethdev_internal.h
2. Change "rte_ethdev.c" to include "rte_ethdev_internal.h"
3. Change type of port id in "rte_ethdev.c" and "rte_ethdev_internal.h".

If the patch is OK, I wll send it with hotplug patches.

Thanks,
Tetsuya


> /Bruce




[dpdk-dev] [PATCH 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Olivier MATZ
Hi Konstantin,

On 02/18/2015 11:47 AM, Ananyev, Konstantin wrote:
>>> How was this managed before, since refcnt field seems to be necessary in 
>>> order
>>> to effectively manage indirect mbufs? Is this just the case that this is 
>>> something
>>> that never worked and that needs to be solved, or is it something that was
>>> working that this patch will now break?
>>
>> This is something that never worked before: refcounts are not compatible
>> with reserving private data in mbufs. This patch does not change the
>> issue, it is still there.
>>
>> Before the patch, an application that wanted to reserve a private
>> data could disable refcounts at compile-time.
>> After the patch, the solution is just to avoid using refcounts.
>
> I'd say avoid using mbuf_attach/detach.
> refcnt itself has nothing to do with that.
> I finally understood what you  are talking about ...
> About private data - I suppose it is a matter of another patch.
> I still think it would be better to reserve private data space before mbuf, 
> not after
> (at mbuf pool initialisation time).
> Then *BADDR* macros could be unaffected.

Indeed that could be a good idea.


Regards,
Olivier



[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Tetsuya Mukawa
Currently uint8_t is used for port identifier. This patch changes it,
and use uint16_t as port identifier.
This patch only changes ethdev library. ABI of the library will be
kept even after applying it.

Also, this patch involves following fixes.
- Use "port_id" as variable name instead of "port".

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c  |  212 +-
 lib/librte_ether/rte_ethdev_internal.h | 3672 
 2 files changed, 3778 insertions(+), 106 deletions(-)
 create mode 100644 lib/librte_ether/rte_ethdev_internal.h

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ea3a1fb..3568e4a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -68,7 +68,7 @@
 #include 

 #include "rte_ether.h"
-#include "rte_ethdev.h"
+#include "rte_ethdev_internal.h"

 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 #define PMD_DEBUG_TRACE(fmt, args...) do {\
@@ -109,7 +109,7 @@
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
 static struct rte_eth_dev_data *rte_eth_dev_data = NULL;
-static uint8_t nb_ports = 0;
+static uint16_t nb_ports = 0;

 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;
@@ -309,14 +309,14 @@ rte_eth_driver_register(struct eth_driver *eth_drv)
 }

 int
-rte_eth_dev_socket_id(uint8_t port_id)
+rte_eth_dev_socket_id(uint16_t port_id)
 {
if (port_id >= nb_ports)
return -1;
return rte_eth_devices[port_id].pci_dev->numa_node;
 }

-uint8_t
+uint16_t
 rte_eth_dev_count(void)
 {
return (nb_ports);
@@ -361,7 +361,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
 }

 int
-rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id)
+rte_eth_dev_rx_queue_start(uint16_t port_id, uint16_t rx_queue_id)
 {
struct rte_eth_dev *dev;

@@ -387,7 +387,7 @@ rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t 
rx_queue_id)
 }

 int
-rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t rx_queue_id)
+rte_eth_dev_rx_queue_stop(uint16_t port_id, uint16_t rx_queue_id)
 {
struct rte_eth_dev *dev;

@@ -413,7 +413,7 @@ rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t 
rx_queue_id)
 }

 int
-rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id)
+rte_eth_dev_tx_queue_start(uint16_t port_id, uint16_t tx_queue_id)
 {
struct rte_eth_dev *dev;

@@ -439,7 +439,7 @@ rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t 
tx_queue_id)
 }

 int
-rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id)
+rte_eth_dev_tx_queue_stop(uint16_t port_id, uint16_t tx_queue_id)
 {
struct rte_eth_dev *dev;

@@ -503,7 +503,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
 }

 static int
-rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
+rte_eth_dev_check_vf_rss_rxq_num(uint16_t port_id, uint16_t nb_rx_q)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
switch (nb_rx_q) {
@@ -528,7 +528,7 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t 
nb_rx_q)
 }

 static int
-rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
+rte_eth_dev_check_mq_mode(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  const struct rte_eth_conf *dev_conf)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
@@ -692,7 +692,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
 }

 int
-rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
+rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  const struct rte_eth_conf *dev_conf)
 {
struct rte_eth_dev *dev;
@@ -830,7 +830,7 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
 }

 static void
-rte_eth_dev_config_restore(uint8_t port_id)
+rte_eth_dev_config_restore(uint16_t port_id)
 {
struct rte_eth_dev *dev;
struct rte_eth_dev_info dev_info;
@@ -879,7 +879,7 @@ rte_eth_dev_config_restore(uint8_t port_id)
 }

 int
-rte_eth_dev_start(uint8_t port_id)
+rte_eth_dev_start(uint16_t port_id)
 {
struct rte_eth_dev *dev;
int diag;
@@ -915,7 +915,7 @@ rte_eth_dev_start(uint8_t port_id)
 }

 void
-rte_eth_dev_stop(uint8_t port_id)
+rte_eth_dev_stop(uint16_t port_id)
 {
struct rte_eth_dev *dev;

@@ -943,7 +943,7 @@ rte_eth_dev_stop(uint8_t port_id)
 }

 int
-rte_eth_dev_set_link_up(uint8_t port_id)
+rte_eth_dev_set_link_up(uint16_t port_id)
 {
struct rte_eth_dev *dev;

@@ -962,7 +962,7 @@ rte_eth_dev_set_link_up(uint8_t port_id)
 }

 int
-rte_eth_dev_set_link_down(uint8_t port_id)
+rte_eth_dev_set_link_down(uint16_t port_id)
 {
struct rte_eth_dev *dev;

@@ -981,7 +981,7 @@ rte_eth_dev_set_link_down(uint8_t port_id)
 }

 void
-rte_eth_dev_close(uint8

[dpdk-dev] [PATCH v2 0/2] Removal of RTE_MBUF_REFCNT

2015-02-18 Thread Sergio Gonzalez Monroy
This patch tries to remove the RTE_MBUF_REFCNT config options and dependencies
by introducing a new mbuf flag IND_ATTACHED_MBUF that would indicate when the 
mbuf
is an indirect attached mbuf, to differentiate between indirect mbufs and mbufs
with external memory buffers (ie. vhost zero copy).

Previous discussion:
http://dpdk.org/ml/archives/dev/2014-October/007127.html

Currently for mbufs with refcnt, we cannot free mbufs with external memory
buffers (ie. vhost zero copy), as they are recognized as indirect
attached mbufs and therefore we free the direct mbuf it points to,
resulting in an error in the case of external memory buffers.

We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
that the mbuf is an indirect attached mbuf pointing to another mbuf.
When we free an mbuf, we only free the direct mbuf if the flag is set.
Freeing an mbuf with external buffer is the same as freeing a non attached mbuf.
The flag is set during attach and clear on detach.

So in the case of vhost zero copy where we have mbufs with external
buffers, by default we just free the mbuf and it is up to the user to deal with
the external buffer.

v2:
 - Add missing parenthesis to RTE_MBUF_INDIRECT macro

Sergio Gonzalez Monroy (2):
  mbuf: Introduce IND_ATTACHED_MBUF flag
  Remove RTE_MBUF_REFCNT references

 app/test/test_link_bonding.c| 15 ---
 app/test/test_mbuf.c| 17 +++--
 config/common_bsdapp|  1 -
 config/common_linuxapp  |  1 -
 examples/Makefile   |  4 +--
 examples/ip_fragmentation/Makefile  |  4 ---
 examples/ip_pipeline/Makefile   |  3 ---
 examples/ip_pipeline/main.c |  5 
 examples/ipv4_multicast/Makefile|  4 ---
 examples/vhost/main.c   | 19 +++---
 lib/librte_ip_frag/Makefile |  4 ---
 lib/librte_ip_frag/rte_ip_frag.h|  4 ---
 lib/librte_mbuf/rte_mbuf.c  |  2 --
 lib/librte_mbuf/rte_mbuf.h  | 45 +++--
 lib/librte_pmd_bond/Makefile|  4 ---
 lib/librte_pmd_bond/rte_eth_bond.h  |  2 --
 lib/librte_pmd_bond/rte_eth_bond_args.c |  2 --
 lib/librte_pmd_bond/rte_eth_bond_pmd.c  | 10 
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c   |  8 --
 lib/librte_port/Makefile|  4 ---
 20 files changed, 19 insertions(+), 139 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH v2 2/2] Remove RTE_MBUF_REFCNT references

2015-02-18 Thread Sergio Gonzalez Monroy
This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
field in the mbuf struct permanently.

Signed-off-by: Sergio Gonzalez Monroy 
Acked-by: Olivier Matz  
---
 app/test/test_link_bonding.c| 15 ---
 app/test/test_mbuf.c| 17 -
 config/common_bsdapp|  1 -
 config/common_linuxapp  |  1 -
 examples/Makefile   |  4 ++--
 examples/ip_fragmentation/Makefile  |  4 
 examples/ip_pipeline/Makefile   |  3 ---
 examples/ip_pipeline/main.c |  5 -
 examples/ipv4_multicast/Makefile|  4 
 examples/vhost/main.c   | 13 -
 lib/librte_ip_frag/Makefile |  4 
 lib/librte_ip_frag/rte_ip_frag.h|  4 
 lib/librte_mbuf/rte_mbuf.c  |  2 --
 lib/librte_mbuf/rte_mbuf.h  | 30 --
 lib/librte_pmd_bond/Makefile|  4 
 lib/librte_pmd_bond/rte_eth_bond.h  |  2 --
 lib/librte_pmd_bond/rte_eth_bond_args.c |  2 --
 lib/librte_pmd_bond/rte_eth_bond_pmd.c  | 10 --
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c   |  8 
 lib/librte_port/Makefile|  4 
 20 files changed, 6 insertions(+), 131 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 579ebbf..54895ab 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -708,9 +708,7 @@ test_set_bonding_mode(void)
int bonding_modes[] = { BONDING_MODE_ROUND_ROBIN,

BONDING_MODE_ACTIVE_BACKUP,
BONDING_MODE_BALANCE,
-#ifdef RTE_MBUF_REFCNT
BONDING_MODE_BROADCAST
-#endif
};

/* Test supported link bonding modes */
@@ -1425,7 +1423,6 @@ test_roundrobin_tx_burst(void)
return remove_slaves_and_stop_bonded_device();
 }

-#ifdef RTE_MBUF_REFCNT
 static int
 verify_mbufs_ref_count(struct rte_mbuf **mbufs, int nb_mbufs, int val)
 {
@@ -1439,8 +1436,6 @@ verify_mbufs_ref_count(struct rte_mbuf **mbufs, int 
nb_mbufs, int val)
}
return 0;
 }
-#endif
-

 static void
 free_mbufs(struct rte_mbuf **mbufs, int nb_mbufs)
@@ -1545,12 +1540,10 @@ test_roundrobin_tx_burst_slave_tx_fail(void)
(unsigned int)port_stats.opackets, 
slave_expected_tx_count);
}

-#ifdef RTE_MBUF_REFCNT
/* Verify that all mbufs have a ref value of zero */
TEST_ASSERT_SUCCESS(verify_mbufs_ref_count(&pkt_burst[tx_count],
TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT, 1),
"mbufs refcnts not as expected");
-#endif
free_mbufs(&pkt_burst[tx_count], TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT);

/* Clean up and remove slaves from bonded device */
@@ -3056,12 +3049,10 @@ test_balance_tx_burst_slave_tx_fail(void)
(unsigned int)port_stats.opackets,
TEST_BAL_SLAVE_TX_FAIL_BURST_SIZE_2);

-#ifdef RTE_MBUF_REFCNT
/* Verify that all mbufs have a ref value of zero */
TEST_ASSERT_SUCCESS(verify_mbufs_ref_count(&pkts_burst_1[tx_count_1],
TEST_BAL_SLAVE_TX_FAIL_PACKETS_COUNT, 1),
"mbufs refcnts not as expected");
-#endif

free_mbufs(&pkts_burst_1[tx_count_1],
TEST_BAL_SLAVE_TX_FAIL_PACKETS_COUNT);
@@ -3472,9 +3463,6 @@ 
test_balance_verify_slave_link_status_change_behaviour(void)
return remove_slaves_and_stop_bonded_device();
 }

-#ifdef RTE_MBUF_REFCNT
-/** Broadcast Mode Tests */
-
 static int
 test_broadcast_tx_burst(void)
 {
@@ -4001,7 +3989,6 @@ 
test_broadcast_verify_slave_link_status_change_behaviour(void)
/* Clean up and remove slaves from bonded device */
return remove_slaves_and_stop_bonded_device();
 }
-#endif

 static int
 test_reconfigure_bonded_device(void)
@@ -4592,14 +4579,12 @@ static struct unit_test_suite link_bonding_test_suite  
= {
TEST_CASE(test_tlb_verify_mac_assignment),
TEST_CASE(test_tlb_verify_promiscuous_enable_disable),
TEST_CASE(test_tlb_verify_slave_link_status_change_failover),
-#ifdef RTE_MBUF_REFCNT
TEST_CASE(test_broadcast_tx_burst),
TEST_CASE(test_broadcast_tx_burst_slave_tx_fail),
TEST_CASE(test_broadcast_rx_burst),
TEST_CASE(test_broadcast_verify_promiscuous_enable_disable),
TEST_CASE(test_broadcast_verify_mac_assignment),

TEST_CASE(test_broadcast_verify_slave_link_status_change_behaviour),
-#endif
TEST_CASE(test_reconfigure_bonded_device),
TEST_CASE(test_close_bonded_device),

diff --git a/app/test/test_mbuf.c b/app/test/test_

[dpdk-dev] [PATCH v2 1/2] mbuf: Introduce IND_ATTACHED_MBUF flag

2015-02-18 Thread Sergio Gonzalez Monroy
Currently for mbufs with refcnt, we cannot free mbufs with external memory
buffers (ie. vhost zero copy), as they are recognized as indirect
attached mbufs and therefore we free the direct mbuf it points to,
resulting in an error in the case of external memory buffers.

We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
that the mbuf is an indirect attached mbuf pointing to another mbuf.
When we free an mbuf, we only free the direct mbuf if the flag is set.
Freeing an mbuf with external buffer is the same as freeing a non attached mbuf.
The flag is set during attach and clear on detach.

So in the case of vhost zero copy where we have mbufs with external
buffers, by default we just free the mbuf and it is up to the user to deal with
the external buffer.

This patch would allow the removal of the RTE_MBUF_REFCNT config option,
setting refcnt for all mbufs permanently.

The patch also modifies the vhost example as it was using the
RTE_MBUF_INDERECT macro to detect if it was an mbuf with external buffer.

Signed-off-by: Sergio Gonzalez Monroy 
Acked-by: Olivier Matz  
---
v2:
 - Add missing parenthesis to RTE_MBUF_INDIRECT macro

 examples/vhost/main.c  |  6 --
 lib/librte_mbuf/rte_mbuf.h | 15 +--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 3a35359..5e341d6 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -139,6 +139,8 @@
 /* Number of descriptors per cacheline. */
 #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))

+#define MBUF_EXT_MEM(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;

@@ -1567,7 +1569,7 @@ txmbuf_clean_zcp(struct virtio_net *dev, struct vpool 
*vpool)

for (index = 0; index < mbuf_count; index++) {
mbuf = __rte_mbuf_raw_alloc(vpool->pool);
-   if (likely(RTE_MBUF_INDIRECT(mbuf)))
+   if (likely(MBUF_EXT_MEM(mbuf)))
pktmbuf_detach_zcp(mbuf);
rte_ring_sp_enqueue(vpool->ring, mbuf);

@@ -1630,7 +1632,7 @@ static void mbuf_destroy_zcp(struct vpool *vpool)
for (index = 0; index < mbuf_count; index++) {
mbuf = __rte_mbuf_raw_alloc(vpool->pool);
if (likely(mbuf != NULL)) {
-   if (likely(RTE_MBUF_INDIRECT(mbuf)))
+   if (likely(MBUF_EXT_MEM(mbuf)))
pktmbuf_detach_zcp(mbuf);
rte_ring_sp_enqueue(vpool->ring, (void *)mbuf);
}
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 16059c6..1e5aa1f 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -162,6 +162,8 @@ extern "C" {
 /** Tell the NIC it's an outer IPv6 packet for tunneling packet */
 #define PKT_TX_OUTER_IPV6(1ULL << 60)

+#define IND_ATTACHED_MBUF(1ULL << 62) /**< Indirect attached mbuf */
+
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

@@ -305,13 +307,12 @@ struct rte_mbuf {
 /**
  * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
  */
-#define RTE_MBUF_INDIRECT(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
+#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)

 /**
  * Returns TRUE if given mbuf is direct, or FALSE otherwise.
  */
-#define RTE_MBUF_DIRECT(mb) (RTE_MBUF_FROM_BADDR((mb)->buf_addr) == (mb))
-
+#define RTE_MBUF_DIRECT(mb) (!RTE_MBUF_INDIRECT(mb))

 /**
  * Private data in case of pktmbuf pool.
@@ -713,7 +714,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, 
struct rte_mbuf *md)
mi->next = NULL;
mi->pkt_len = mi->data_len;
mi->nb_segs = 1;
-   mi->ol_flags = md->ol_flags;
+   mi->ol_flags = md->ol_flags | IND_ATTACHED_MBUF;
mi->packet_type = md->packet_type;

__rte_mbuf_sanity_check(mi, 1);
@@ -744,6 +745,8 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
RTE_PKTMBUF_HEADROOM : m->buf_len;

m->data_len = 0;
+
+   m->ol_flags = 0;
 }

 #endif /* RTE_MBUF_REFCNT */
@@ -757,7 +760,6 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 #ifdef RTE_MBUF_REFCNT
if (likely (rte_mbuf_refcnt_read(m) == 1) ||
likely (rte_mbuf_refcnt_update(m, -1) == 0)) {
-   struct rte_mbuf *md = RTE_MBUF_FROM_BADDR(m->buf_addr);

rte_mbuf_refcnt_set(m, 0);

@@ -765,7 +767,8 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 *  - detach mbuf
 *  - free attached mbuf segment
 */
-   if (unlikely (md != m)) {
+   if (RTE_MBUF_INDIRECT(m)) {
+   struct rte_mbuf *md = RTE_MBUF_FROM_BADDR(m->buf_addr);
rte_pktmbuf_detach(m);

[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Iremonger, Bernard


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 18, 2015 10:33 AM
> To: Iremonger, Bernard
> Cc: Tetsuya Mukawa; dev at dpdk.org; ivan.boule at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address 
> comparison APIs
> 
> Hi Bernard,
> 
> 2015-02-18 10:26, Iremonger, Bernard:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> > > On 2015/02/18 10:02, Thomas Monjalon wrote:
> > > > 2015-02-17 15:14, Tetsuya Mukawa:
> > > >> On 2015/02/17 9:44, Thomas Monjalon wrote:
> > > >>> 2015-02-16 13:14, Tetsuya Mukawa:
> > >  @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf 
> > >  *conf)
> > >   }
> > >   else {
> > >   struct rte_pci_device *dev2 = NULL;
> > >  +int ret;
> > > 
> > >   TAILQ_FOREACH(dev2, &pci_device_list, next) {
> > >  -if (pci_addr_comparison(&dev->addr, 
> > >  &dev2->addr))
> > >  +ret = eal_compare_pci_addr(&dev->addr, 
> > >  &dev2->addr);
> > >  +if (ret > 0)
> > >   continue;
> > >  -else {
> > >  +else if (ret < 0) {
> > >   TAILQ_INSERT_BEFORE(dev2, dev, next);
> > >   return 0;
> > >  +} else { /* already registered */
> > >  +/* update pt_driver */
> > >  +dev2->pt_driver = dev->pt_driver;
> > >  +dev2->max_vfs = dev->max_vfs;
> > >  +memmove(dev2->mem_resource,
> > >  +dev->mem_resource,
> > >  +sizeof(dev->mem_resource));
> > >  +free(dev);
> > >  +return 0;
> > > >>> Could you comment this "else part" please?
> > > >> PCI device list is allocated when rte_eal_init() is called. At
> > > >> the time, to fill pci device information, sysfs value is used.
> > > >> If sysfs values written by kernel device driver will not be
> > > >> changed by igb_uio, vfio or pci_uio_genereic, above code isn't needed.
> > > >> But actually above values are changed or added by them.
> > > >>
> > > >> Here is a step to cause issue.
> > > >> 1. Boot linux.
> > > >> 2. Start DPDK application without any physical NIC ports.
> > > >>  - Here, some sysfs values are read, and store to pci device list.
> > > >> 3. igb_uio starts managing a port.
> > > >>  - Here, some sysfs values are changed.
> > > >> 4. Add a NIC port to DPDK application using hotplug functions.
> > > >>  - Here, we need to replace some values.
> > > >
> > > > I think that you are showing that something is wrongly designed in
> > > > these EAL structures. I suggest to try cleaning this mess instead of 
> > > > workarounding.
> >
> > Hi Tetsuya, Thomas,
> > I think that redesigning the EAL structures is beyond the scope of this 
> > patchset and should be
> undertaken as a separate task.
> 
> I strongly disagrees this opinion. We should never workaround design problems 
> and add more
> complex/weird code.
> I think that this kind of consideration is the heart of some design problems 
> we have to face today.
> Please let's stop adding some code which just works without thinking the 
> whole design.
> 
> > I suspect there may be a problem in the original code when  a device which 
> > was using a kernel driver
> is bound to igb_uio.  The igb_uio  driver adds 
> /sys/bus/pci/devices/\:05\:00.0/max_vfs.

Hi Tomas, Tetsuya,

In general, I agree that we should not workaround design problems.
In this case I don't think there is a problem with the rte_pci_device and 
pci_device_list structures.
The "already registered" device has been replaced. It would probably be cleaner 
to remove the "already registered" device from the list and then add the new 
device to the list rather than update the "already registered" device.

Regards,

Bernard.



[dpdk-dev] [PATCH v2 0/2] Removal of RTE_MBUF_REFCNT

2015-02-18 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez Monroy
> Sent: Wednesday, February 18, 2015 11:03 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/2] Removal of RTE_MBUF_REFCNT
> 
> This patch tries to remove the RTE_MBUF_REFCNT config options and dependencies
> by introducing a new mbuf flag IND_ATTACHED_MBUF that would indicate when the 
> mbuf
> is an indirect attached mbuf, to differentiate between indirect mbufs and 
> mbufs
> with external memory buffers (ie. vhost zero copy).
> 
> Previous discussion:
> http://dpdk.org/ml/archives/dev/2014-October/007127.html
> 
> Currently for mbufs with refcnt, we cannot free mbufs with external memory
> buffers (ie. vhost zero copy), as they are recognized as indirect
> attached mbufs and therefore we free the direct mbuf it points to,
> resulting in an error in the case of external memory buffers.
> 
> We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
> that the mbuf is an indirect attached mbuf pointing to another mbuf.
> When we free an mbuf, we only free the direct mbuf if the flag is set.
> Freeing an mbuf with external buffer is the same as freeing a non attached 
> mbuf.
> The flag is set during attach and clear on detach.
> 
> So in the case of vhost zero copy where we have mbufs with external
> buffers, by default we just free the mbuf and it is up to the user to deal 
> with
> the external buffer.
> 
> v2:
>  - Add missing parenthesis to RTE_MBUF_INDIRECT macro
> 
> Sergio Gonzalez Monroy (2):
>   mbuf: Introduce IND_ATTACHED_MBUF flag
>   Remove RTE_MBUF_REFCNT references
> 
>  app/test/test_link_bonding.c| 15 ---
>  app/test/test_mbuf.c| 17 +++--
>  config/common_bsdapp|  1 -
>  config/common_linuxapp  |  1 -
>  examples/Makefile   |  4 +--
>  examples/ip_fragmentation/Makefile  |  4 ---
>  examples/ip_pipeline/Makefile   |  3 ---
>  examples/ip_pipeline/main.c |  5 
>  examples/ipv4_multicast/Makefile|  4 ---
>  examples/vhost/main.c   | 19 +++---
>  lib/librte_ip_frag/Makefile |  4 ---
>  lib/librte_ip_frag/rte_ip_frag.h|  4 ---
>  lib/librte_mbuf/rte_mbuf.c  |  2 --
>  lib/librte_mbuf/rte_mbuf.h  | 45 
> +++--
>  lib/librte_pmd_bond/Makefile|  4 ---
>  lib/librte_pmd_bond/rte_eth_bond.h  |  2 --
>  lib/librte_pmd_bond/rte_eth_bond_args.c |  2 --
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c  | 10 
>  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c   |  8 --
>  lib/librte_port/Makefile|  4 ---
>  20 files changed, 19 insertions(+), 139 deletions(-)
> 

Acked-by: Konstantin Ananyev 

> --
> 1.9.3



[dpdk-dev] [PATCH] mk: Rework gcc version detection to permit versions newer than 4.x

2015-02-18 Thread Panu Matilainen
Separately comparing major and minor versions becomes seriously clumsy
when with major version changes, convert the entire version string into
a numeric value (ie 4.6.0 becomes 460 and 5.0.0 becomes 500) and use
that for comparisons. This simplifies the comparisons and makes
gcc 5.0 naturally recognized at least as capable as newest 4.x.

This three-digit scheme would run into trouble if gcc ever went to
two-digit version segments, but that hasn't happened in the last 10+
years so it seems like a safe assumption.

Signed-off-by: Panu Matilainen 
---
 lib/librte_pmd_fm10k/Makefile|  2 +-
 lib/librte_pmd_i40e/Makefile |  2 +-
 lib/librte_pmd_ixgbe/Makefile|  6 +++---
 lib/librte_pmd_vmxnet3/Makefile  |  2 +-
 mk/toolchain/gcc/rte.toolchain-compat.mk | 22 ++
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/lib/librte_pmd_fm10k/Makefile b/lib/librte_pmd_fm10k/Makefile
index 986f4ef..dd37f19 100644
--- a/lib/librte_pmd_fm10k/Makefile
+++ b/lib/librte_pmd_fm10k/Makefile
@@ -62,7 +62,7 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 
&& echo 1), 1)
+ifneq ($(shell test $(GCC_VERSION) -le 430 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 9a0eec8..484379a 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -69,7 +69,7 @@ CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
 CFLAGS_BASE_DRIVER += -Wno-format-security

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 4 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 440 && echo 1), 1)
 CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
 endif

diff --git a/lib/librte_pmd_ixgbe/Makefile b/lib/librte_pmd_ixgbe/Makefile
index d580f62..49ecc2f 100644
--- a/lib/librte_pmd_ixgbe/Makefile
+++ b/lib/librte_pmd_ixgbe/Makefile
@@ -60,18 +60,18 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 
&& echo 1), 1)
+ifneq ($(shell test $(GCC_VERSION) -le 430 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
 CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -Wno-format-extra-args

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 6 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 460 && echo 1), 1)
 CFLAGS_ixgbe_common.o += -Wno-unused-but-set-variable
 CFLAGS_ixgbe_x550.o += -Wno-unused-but-set-variable -Wno-maybe-uninitialized
 endif

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 6 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -le 460 && echo 1), 1)
 CFLAGS_ixgbe_x550.o += -Wno-uninitialized
 CFLAGS_ixgbe_phy.o += -Wno-uninitialized
 endif
diff --git a/lib/librte_pmd_vmxnet3/Makefile b/lib/librte_pmd_vmxnet3/Makefile
index 93e5580..7d7002c 100644
--- a/lib/librte_pmd_vmxnet3/Makefile
+++ b/lib/librte_pmd_vmxnet3/Makefile
@@ -56,7 +56,7 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 
&& echo 1), 1)
+ifneq ($(shell test $(GCC_VERSION) -le 430 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
diff --git a/mk/toolchain/gcc/rte.toolchain-compat.mk 
b/mk/toolchain/gcc/rte.toolchain-compat.mk
index e40e103..9d262c4 100644
--- a/mk/toolchain/gcc/rte.toolchain-compat.mk
+++ b/mk/toolchain/gcc/rte.toolchain-compat.mk
@@ -38,17 +38,15 @@

 #find out GCC version

-GCC_MAJOR_VERSION = $(shell $(CC) -dumpversion | cut -f1 -d.)
+GCC_VERSION = $(subst .,,$(shell $(CC) -dumpversion))

-# if GCC is not 4.x
-ifneq ($(GCC_MAJOR_VERSION),4)
+# if GCC is older than 4.x
+ifneq ($(shell test $(GCC_VERSION) -ge 400 && echo 1), 1)
MACHINE_CFLAGS =
-$(warning You are not using GCC 4.x. This is neither supported, nor tested.)
+$(warning You are not using GCC >= 4.x. This is neither supported, nor tested.)


 else
-   GCC_MINOR_VERSION = $(shell $(CC) -dumpversion | cut -f2 -d.)
-
 # GCC graceful degradation
 # GCC 4.2.x - added support for generic target
 # GCC 4.3.x - added support for core2, ssse3, sse4.1, sse4.2
@@ -57,18 +55,18 @@ else
 # GCC 4.6.x - added support for corei7, corei7-avx
 # GCC 4.7.x - added support for fsgsbase, rdrnd, f16c, core-avx-i, core-avx2

-   ifeq ($(shell test $(GCC_MINOR_VERSION) -le 7 && echo 1), 1)
+   ifeq ($(shell test $(GCC_VERSION) -le 470 && echo 1), 1)
MACHINE_CFLAGS := $(patsubst 
-march=core-avx-i,-march=corei7-avx,$(MACHINE_CFLAGS))
MACHINE_CFLAGS := $(patsubst 
-march=core-avx2,-march=core-avx2,$(MACHINE_CFLAGS))
endif
-   ifeq ($(shell test $(GCC_MINOR_VERSION) -lt 6 && echo 1), 1)
+   ifeq ($(shell test $(GC

[dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 20:39, Iremonger, Bernard wrote:
>
>> -Original Message-
>> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>> Sent: Wednesday, February 18, 2015 10:33 AM
>> To: Iremonger, Bernard
>> Cc: Tetsuya Mukawa; dev at dpdk.org; ivan.boule at 6wind.com
>> Subject: Re: [dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address 
>> comparison APIs
>>
>> Hi Bernard,
>>
>> 2015-02-18 10:26, Iremonger, Bernard:
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
 On 2015/02/18 10:02, Thomas Monjalon wrote:
> 2015-02-17 15:14, Tetsuya Mukawa:
>> On 2015/02/17 9:44, Thomas Monjalon wrote:
>>> 2015-02-16 13:14, Tetsuya Mukawa:
 @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf 
 *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
 +  int ret;

TAILQ_FOREACH(dev2, &pci_device_list, next) {
 -  if (pci_addr_comparison(&dev->addr, 
 &dev2->addr))
 +  ret = eal_compare_pci_addr(&dev->addr, 
 &dev2->addr);
 +  if (ret > 0)
continue;
 -  else {
 +  else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
 +  } else { /* already registered */
 +  /* update pt_driver */
 +  dev2->pt_driver = dev->pt_driver;
 +  dev2->max_vfs = dev->max_vfs;
 +  memmove(dev2->mem_resource,
 +  dev->mem_resource,
 +  sizeof(dev->mem_resource));
 +  free(dev);
 +  return 0;
>>> Could you comment this "else part" please?
>> PCI device list is allocated when rte_eal_init() is called. At
>> the time, to fill pci device information, sysfs value is used.
>> If sysfs values written by kernel device driver will not be
>> changed by igb_uio, vfio or pci_uio_genereic, above code isn't needed.
>> But actually above values are changed or added by them.
>>
>> Here is a step to cause issue.
>> 1. Boot linux.
>> 2. Start DPDK application without any physical NIC ports.
>>  - Here, some sysfs values are read, and store to pci device list.
>> 3. igb_uio starts managing a port.
>>  - Here, some sysfs values are changed.
>> 4. Add a NIC port to DPDK application using hotplug functions.
>>  - Here, we need to replace some values.
> I think that you are showing that something is wrongly designed in
> these EAL structures. I suggest to try cleaning this mess instead of 
> workarounding.
>>> Hi Tetsuya, Thomas,
>>> I think that redesigning the EAL structures is beyond the scope of this 
>>> patchset and should be
>> undertaken as a separate task.
>>
>> I strongly disagrees this opinion. We should never workaround design 
>> problems and add more
>> complex/weird code.
>> I think that this kind of consideration is the heart of some design problems 
>> we have to face today.
>> Please let's stop adding some code which just works without thinking the 
>> whole design.
>>
>>> I suspect there may be a problem in the original code when  a device which 
>>> was using a kernel driver
>> is bound to igb_uio.  The igb_uio  driver adds 
>> /sys/bus/pci/devices/\:05\:00.0/max_vfs.
> Hi Tomas, Tetsuya,
>
> In general, I agree that we should not workaround design problems.
> In this case I don't think there is a problem with the rte_pci_device and 
> pci_device_list structures.

I agree with it.

> The "already registered" device has been replaced. It would probably be 
> cleaner to remove the "already registered" device from the list and then add 
> the new device to the list rather than update the "already registered" device.
>

I guess "replacing" will not work, because rte_pci_device structure is
also registered in rte_eth_dev structure.
If we remove and free the pci device, I guess something goes wrong in
ethdev library.
Just removing is one more option, but it means there is a working pci
device that is not registered in the pci_device_list. I guess it's weird.

I still think updating may be correct behavior.
The pci_device_list is used like below when rte_eal_init() is called.

1. When rte_eal_pci_init() is called, all pci devices are registered in
the pci_device_list.
2. When rte_eal_dev_init() is called, dev_driver_list is traversed, and
if a PCI device for a driver is found in the pci_device_list, init() of
the driver is called.

I guess it's not so strange design.
But this design 

[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 07:58:06PM +0900, Tetsuya Mukawa wrote:
> On 2015/02/18 19:03, Bruce Richardson wrote:
> > On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
> >> 2015-02-18 15:10, Tetsuya Mukawa:
> >>> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
>  On 2015/02/18 9:31, Thomas Monjalon wrote:
> > 2015-02-17 15:14, Tetsuya Mukawa:
> >> On 2015/02/17 9:36, Thomas Monjalon wrote:
> >>> 2015-02-16 13:14, Tetsuya Mukawa:
> >>> Is uint8_t sill a good size for hotpluggable virtual device ids?
> >> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> >> as port id.
> >> If someone reports it doesn't enough, I guess it will be the time to
> >> write a patch to change all uint_8 in one patch.
> > It's a big ABI breakage. So if we feel it's going to be required,
> > it's better to do it now in 2.0 release I think.
> >
> > Any opinion?
> >
>  Hi Thomas,
> 
>  I agree with it.
>  I will add an one more patch to change uint8_t to uint16_t.
> 
>  Thanks,
>  Tetsuya
> 
> >>> Hi Thomas,
> >>>
> >>> Could I make sure.
> >>> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> >>> need to change other applications and libraries that call ethdev APIs?
> >>> If so, I would not finish it by 23rd.
> >>>
> >>> I've counted how many lines call ethdev APIs that are related to port_id.
> >>> Could you please check an attached file?
> >>> It's over 1200 lines. Probably to fix  one of caller, I will need to
> >>> check how port_id is used, and fix more related lines. So probably
> >>> thousands lines may need to be fixed.
> >>>
> >>> When is deadline for fixing this changing?
> >>> Also, if you have a good idea to fix it easier, could you please let me
> >>> know?
> >> It was an open question.
> >> If everybody is fine with 255 ports maximum, let's keep it as is.
> >>
> > I think we are probably ok for now (and forseeable future) with 255 max.
> >
> > However, if we did change it, I agree that in 2.0 is a very good time to do 
> > so.
> > Since we are expanding the field, rather than shrinking it, I don't see why 
> > we
> > can't just make the change at the ethdev level (and in libs API) in 2.0 and 
> > then in
> > later releases (e.g. 2.1) update the apps and examples to match. That way 
> > the
> > ABI stays the same from 2.0 onwards, and we don't have a huge amount of 
> > churn
> > changing it everywhere late in the 2.0 release cycle.
> 
> Hi Bruce,
> 
> Could you please check my RFC patch I will send soon?
> I wrote the patch like below.
> 
> 1. Copy header file like below.
> $ cp lib/librte_ether/rte_ethdev.h lib/librte_ether/rte_ethdev_internal.h
> 2. Change "rte_ethdev.c" to include "rte_ethdev_internal.h"
> 3. Change type of port id in "rte_ethdev.c" and "rte_ethdev_internal.h".
> 
> If the patch is OK, I wll send it with hotplug patches.
> 
> Thanks,
> Tetsuya
> 
>
Why the new ethdev internal file? 


[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
> Currently uint8_t is used for port identifier. This patch changes it,
> and use uint16_t as port identifier.
> This patch only changes ethdev library. ABI of the library will be
> kept even after applying it.
> 
> Also, this patch involves following fixes.
> - Use "port_id" as variable name instead of "port".
> 
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_ether/rte_ethdev.c  |  212 +-
>  lib/librte_ether/rte_ethdev_internal.h | 3672 
> 
>  2 files changed, 3778 insertions(+), 106 deletions(-)
>  create mode 100644 lib/librte_ether/rte_ethdev_internal.h
> 
I'm not sure I follow why we need a new header file for this.
Also, thinking about this change, a more fundamental problem is going to be
the mbuf structure, which stores a port id inside it in an 8-bit value.
Upgrading that to a 16-bit value requires some thought, and verification to
ensure any adjustment of fields does not lead to serious performance issues.

Therefore, I suggest we leave the port id values as 8-bits until such time
as we need greater than 255 port values in a real-world use case.
Out of interest - anyone have a DPDK app where they use >16 port id values? If
so, how high does the port id value get?

Regards,
/Bruce



[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 12:30:07PM +, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
> > Currently uint8_t is used for port identifier. This patch changes it,
> > and use uint16_t as port identifier.
> > This patch only changes ethdev library. ABI of the library will be
> > kept even after applying it.
> > 
> > Also, this patch involves following fixes.
> > - Use "port_id" as variable name instead of "port".
> > 
> >
> > Signed-off-by: Tetsuya Mukawa 
> > ---
> >  lib/librte_ether/rte_ethdev.c  |  212 +-
> >  lib/librte_ether/rte_ethdev_internal.h | 3672 
> > 
> >  2 files changed, 3778 insertions(+), 106 deletions(-)
> >  create mode 100644 lib/librte_ether/rte_ethdev_internal.h
> > 
> I'm not sure I follow why we need a new header file for this.
> Also, thinking about this change, a more fundamental problem is going to be
> the mbuf structure, which stores a port id inside it in an 8-bit value.
> Upgrading that to a 16-bit value requires some thought, and verification to
> ensure any adjustment of fields does not lead to serious performance issues.
> 
> Therefore, I suggest we leave the port id values as 8-bits until such time
> as we need greater than 255 port values in a real-world use case.
> Out of interest - anyone have a DPDK app where they use >16 port id values? If
> so, how high does the port id value get?
> 
> Regards,
> /Bruce
> 

Resending with correct email addr for Neil.

/Bruce


[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Iremonger, Bernard


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Wednesday, February 18, 2015 10:58 AM
> To: Richardson, Bruce; Thomas Monjalon
> Cc: dev at dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption 
> that port will not be
> detached
> 
> On 2015/02/18 19:03, Bruce Richardson wrote:
> > On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
> >> 2015-02-18 15:10, Tetsuya Mukawa:
> >>> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
>  On 2015/02/18 9:31, Thomas Monjalon wrote:
> > 2015-02-17 15:14, Tetsuya Mukawa:
> >> On 2015/02/17 9:36, Thomas Monjalon wrote:
> >>> 2015-02-16 13:14, Tetsuya Mukawa:
> >>> Is uint8_t sill a good size for hotpluggable virtual device ids?
> >> I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
> >> as port id.
> >> If someone reports it doesn't enough, I guess it will be the time
> >> to write a patch to change all uint_8 in one patch.
> > It's a big ABI breakage. So if we feel it's going to be required,
> > it's better to do it now in 2.0 release I think.
> >
> > Any opinion?
> >
>  Hi Thomas,
> 
>  I agree with it.
>  I will add an one more patch to change uint8_t to uint16_t.
> 
>  Thanks,
>  Tetsuya
> 
> >>> Hi Thomas,
> >>>
> >>> Could I make sure.
> >>> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> >>> need to change other applications and libraries that call ethdev APIs?
> >>> If so, I would not finish it by 23rd.
> >>>
> >>> I've counted how many lines call ethdev APIs that are related to port_id.
> >>> Could you please check an attached file?
> >>> It's over 1200 lines. Probably to fix  one of caller, I will need to
> >>> check how port_id is used, and fix more related lines. So probably
> >>> thousands lines may need to be fixed.
> >>>
> >>> When is deadline for fixing this changing?
> >>> Also, if you have a good idea to fix it easier, could you please let
> >>> me know?
> >> It was an open question.
> >> If everybody is fine with 255 ports maximum, let's keep it as is.
> >>
> > I think we are probably ok for now (and forseeable future) with 255 max.
> >
> > However, if we did change it, I agree that in 2.0 is a very good time to do 
> > so.
> > Since we are expanding the field, rather than shrinking it, I don't
> > see why we can't just make the change at the ethdev level (and in libs
> > API) in 2.0 and then in later releases (e.g. 2.1) update the apps and
> > examples to match. That way the ABI stays the same from 2.0 onwards,
> > and we don't have a huge amount of churn changing it everywhere late in the 
> > 2.0 release cycle.
> 
> Hi Bruce,
> 
> Could you please check my RFC patch I will send soon?
> I wrote the patch like below.
> 
> 1. Copy header file like below.
> $ cp lib/librte_ether/rte_ethdev.h lib/librte_ether/rte_ethdev_internal.h
> 2. Change "rte_ethdev.c" to include "rte_ethdev_internal.h"
> 3. Change type of port id in "rte_ethdev.c" and "rte_ethdev_internal.h".
> 
> If the patch is OK, I wll send it with hotplug patches.
> 
> Thanks,
> Tetsuya
> 
> 
> > /Bruce
> 
Hi Tetsuya,

After this change there will be two header files with a lot of the same 
information.
lib/librte_ether/rte_ethdev.h
lib/librte_ether/rte_ethdev_internal.h
I don't think this is a good idea for maintenance in the future.
If 255 is ok for the foreseeable future, why change it now.

Regards,

Bernard.




[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 21:23, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 07:58:06PM +0900, Tetsuya Mukawa wrote:
>> On 2015/02/18 19:03, Bruce Richardson wrote:
>>> On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
 2015-02-18 15:10, Tetsuya Mukawa:
> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
>> On 2015/02/18 9:31, Thomas Monjalon wrote:
>>> 2015-02-17 15:14, Tetsuya Mukawa:
 On 2015/02/17 9:36, Thomas Monjalon wrote:
> 2015-02-16 13:14, Tetsuya Mukawa:
> Is uint8_t sill a good size for hotpluggable virtual device ids?
 I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
 as port id.
 If someone reports it doesn't enough, I guess it will be the time to
 write a patch to change all uint_8 in one patch.
>>> It's a big ABI breakage. So if we feel it's going to be required,
>>> it's better to do it now in 2.0 release I think.
>>>
>>> Any opinion?
>>>
>> Hi Thomas,
>>
>> I agree with it.
>> I will add an one more patch to change uint8_t to uint16_t.
>>
>> Thanks,
>> Tetsuya
>>
> Hi Thomas,
>
> Could I make sure.
> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> need to change other applications and libraries that call ethdev APIs?
> If so, I would not finish it by 23rd.
>
> I've counted how many lines call ethdev APIs that are related to port_id.
> Could you please check an attached file?
> It's over 1200 lines. Probably to fix  one of caller, I will need to
> check how port_id is used, and fix more related lines. So probably
> thousands lines may need to be fixed.
>
> When is deadline for fixing this changing?
> Also, if you have a good idea to fix it easier, could you please let me
> know?
 It was an open question.
 If everybody is fine with 255 ports maximum, let's keep it as is.

>>> I think we are probably ok for now (and forseeable future) with 255 max.
>>>
>>> However, if we did change it, I agree that in 2.0 is a very good time to do 
>>> so.
>>> Since we are expanding the field, rather than shrinking it, I don't see why 
>>> we
>>> can't just make the change at the ethdev level (and in libs API) in 2.0 and 
>>> then in
>>> later releases (e.g. 2.1) update the apps and examples to match. That way 
>>> the
>>> ABI stays the same from 2.0 onwards, and we don't have a huge amount of 
>>> churn
>>> changing it everywhere late in the 2.0 release cycle.
>> Hi Bruce,
>>
>> Could you please check my RFC patch I will send soon?
>> I wrote the patch like below.
>>
>> 1. Copy header file like below.
>> $ cp lib/librte_ether/rte_ethdev.h lib/librte_ether/rte_ethdev_internal.h
>> 2. Change "rte_ethdev.c" to include "rte_ethdev_internal.h"
>> 3. Change type of port id in "rte_ethdev.c" and "rte_ethdev_internal.h".
>>
>> If the patch is OK, I wll send it with hotplug patches.
>>
>> Thanks,
>> Tetsuya
>>
>>
> Why the new ethdev internal file? 

I guess some libraries that  include "rte_ethdev.h". To compile these
libraries, I thought such a header was needed.
But, it seems it's not the time to change type of port_id.
I appreciate for your checking.

Tetsuya



[dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-18 Thread Tetsuya Mukawa
On 2015/02/18 21:33, Iremonger, Bernard wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
>> Sent: Wednesday, February 18, 2015 10:58 AM
>> To: Richardson, Bruce; Thomas Monjalon
>> Cc: dev at dpdk.org; Neil Horman
>> Subject: Re: [dpdk-dev] [PATCH v8 03/14] eal/pci, ethdev: Remove assumption 
>> that port will not be
>> detached
>>
>> On 2015/02/18 19:03, Bruce Richardson wrote:
>>> On Wed, Feb 18, 2015 at 10:57:25AM +0100, Thomas Monjalon wrote:
 2015-02-18 15:10, Tetsuya Mukawa:
> On 2015/02/18 10:54, Tetsuya Mukawa wrote:
>> On 2015/02/18 9:31, Thomas Monjalon wrote:
>>> 2015-02-17 15:14, Tetsuya Mukawa:
 On 2015/02/17 9:36, Thomas Monjalon wrote:
> 2015-02-16 13:14, Tetsuya Mukawa:
> Is uint8_t sill a good size for hotpluggable virtual device ids?
 I am not sure it's enough, but uint8_t is widely used in "rte_ethdev.c"
 as port id.
 If someone reports it doesn't enough, I guess it will be the time
 to write a patch to change all uint_8 in one patch.
>>> It's a big ABI breakage. So if we feel it's going to be required,
>>> it's better to do it now in 2.0 release I think.
>>>
>>> Any opinion?
>>>
>> Hi Thomas,
>>
>> I agree with it.
>> I will add an one more patch to change uint8_t to uint16_t.
>>
>> Thanks,
>> Tetsuya
>>
> Hi Thomas,
>
> Could I make sure.
> After changing uint8_t to uint16_t in "rte_ethdev.[ch]", must I also
> need to change other applications and libraries that call ethdev APIs?
> If so, I would not finish it by 23rd.
>
> I've counted how many lines call ethdev APIs that are related to port_id.
> Could you please check an attached file?
> It's over 1200 lines. Probably to fix  one of caller, I will need to
> check how port_id is used, and fix more related lines. So probably
> thousands lines may need to be fixed.
>
> When is deadline for fixing this changing?
> Also, if you have a good idea to fix it easier, could you please let
> me know?
 It was an open question.
 If everybody is fine with 255 ports maximum, let's keep it as is.

>>> I think we are probably ok for now (and forseeable future) with 255 max.
>>>
>>> However, if we did change it, I agree that in 2.0 is a very good time to do 
>>> so.
>>> Since we are expanding the field, rather than shrinking it, I don't
>>> see why we can't just make the change at the ethdev level (and in libs
>>> API) in 2.0 and then in later releases (e.g. 2.1) update the apps and
>>> examples to match. That way the ABI stays the same from 2.0 onwards,
>>> and we don't have a huge amount of churn changing it everywhere late in the 
>>> 2.0 release cycle.
>> Hi Bruce,
>>
>> Could you please check my RFC patch I will send soon?
>> I wrote the patch like below.
>>
>> 1. Copy header file like below.
>> $ cp lib/librte_ether/rte_ethdev.h lib/librte_ether/rte_ethdev_internal.h
>> 2. Change "rte_ethdev.c" to include "rte_ethdev_internal.h"
>> 3. Change type of port id in "rte_ethdev.c" and "rte_ethdev_internal.h".
>>
>> If the patch is OK, I wll send it with hotplug patches.
>>
>> Thanks,
>> Tetsuya
>>
>>
>>> /Bruce
> Hi Tetsuya,
>
> After this change there will be two header files with a lot of the same 
> information.
> lib/librte_ether/rte_ethdev.h
> lib/librte_ether/rte_ethdev_internal.h
> I don't think this is a good idea for maintenance in the future.
> If 255 is ok for the foreseeable future, why change it now.

Hi Bernard,

I appreciate for your checking.
Agree, it will not be good to have almost same headers.

Thanks,
Tetsuya

> Regards,
>
> Bernard.
>  
>



[dpdk-dev] [PATCH v2 0/3] remove limit on devargs parameters length

2015-02-18 Thread Thomas Monjalon
2015-02-13 16:03, David Marchand:
> Here is a little patchset that removes the limit on the devargs parameters
> length. Previously, arguments specified by user would be stored in a static
> buffer, while there is no particular reason why we should have such a
> constraint, afaik.
> 
> Changes since v1:
> - fix devargs tests (problem reported by Thomas)
> 
> David Marchand (3):
>   devargs: indent and cleanup
>   devargs: remove limit on parameters length
>   app/test: fix devargs tests

Acked-by: Thomas Monjalon 

Applied, thanks


[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Wodkowski, PawelX
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, February 18, 2015 1:32 PM
> To: Tetsuya Mukawa
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier
> 
> On Wed, Feb 18, 2015 at 12:30:07PM +, Bruce Richardson wrote:
> > On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
> > > Currently uint8_t is used for port identifier. This patch changes it,
> > > and use uint16_t as port identifier.
> > > This patch only changes ethdev library. ABI of the library will be
> > > kept even after applying it.
> > >
> > > Also, this patch involves following fixes.
> > > - Use "port_id" as variable name instead of "port".
> > >
> > >
> > > Signed-off-by: Tetsuya Mukawa 
> > > ---
> > >  lib/librte_ether/rte_ethdev.c  |  212 +-
> > >  lib/librte_ether/rte_ethdev_internal.h | 3672
> 
> > >  2 files changed, 3778 insertions(+), 106 deletions(-)
> > >  create mode 100644 lib/librte_ether/rte_ethdev_internal.h
> > >
> > I'm not sure I follow why we need a new header file for this.
> > Also, thinking about this change, a more fundamental problem is going to be
> > the mbuf structure, which stores a port id inside it in an 8-bit value.
> > Upgrading that to a 16-bit value requires some thought, and verification to
> > ensure any adjustment of fields does not lead to serious performance issues.
> >
> > Therefore, I suggest we leave the port id values as 8-bits until such time
> > as we need greater than 255 port values in a real-world use case.
> > Out of interest - anyone have a DPDK app where they use >16 port id values? 
> > If
> > so, how high does the port id value get?

Not real application but simple example of setup:
4 Niantic x 2 ports x 64 VF = 512 port id

I don't know what would be the real need/advantage of such setup (bonding?) but 
you see, in theory it is already insufficient.

> >
> > Regards,
> > /Bruce
> >
> 
> Resending with correct email addr for Neil.
> 
> /Bruce


[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Marc Sune

On 18/02/15 13:31, Bruce Richardson wrote:
> On Wed, Feb 18, 2015 at 12:30:07PM +, Bruce Richardson wrote:
>> On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
>>> Currently uint8_t is used for port identifier. This patch changes it,
>>> and use uint16_t as port identifier.
>>> This patch only changes ethdev library. ABI of the library will be
>>> kept even after applying it.
>>>
>>> Also, this patch involves following fixes.
>>> - Use "port_id" as variable name instead of "port".
>>>
>>>
>>> Signed-off-by: Tetsuya Mukawa 
>>> ---
>>>   lib/librte_ether/rte_ethdev.c  |  212 +-
>>>   lib/librte_ether/rte_ethdev_internal.h | 3672 
>>> 
>>>   2 files changed, 3778 insertions(+), 106 deletions(-)
>>>   create mode 100644 lib/librte_ether/rte_ethdev_internal.h
>>>
>> I'm not sure I follow why we need a new header file for this.
>> Also, thinking about this change, a more fundamental problem is going to be
>> the mbuf structure, which stores a port id inside it in an 8-bit value.
>> Upgrading that to a 16-bit value requires some thought, and verification to
>> ensure any adjustment of fields does not lead to serious performance issues.
>>
>> Therefore, I suggest we leave the port id values as 8-bits until such time
>> as we need greater than 255 port values in a real-world use case.
>> Out of interest - anyone have a DPDK app where they use >16 port id values? 
>> If
>> so, how high does the port id value get?

Just a though on port_id in general; I wouldn't see why other type of 
ports could fall into the same abstraction of using port_ids as we do 
for PHY ports, if eventually we could create a unified API TX/RX 
routines he same regardless of the port (I know KNI deprecated this 
approach in the past). Of course initialization routines should be 
different for each type of port.

I see quite a bit of code duplicity, basically in TX/RX routines for PHY 
ports, KNI ports, SHMEM (ring) ports like ivshmem etc.., which are very 
similar, and we put into the shoulders of all users of DPDK to have to 
do the "switch() - case" based on the type of port (which is state that 
they have to store themselves too). This seems to me it could be 
improved from a DPDK user's point of view.

By no means I am saying lower level APIs should not be exposed (current 
APIs)... There is the need to, since users using one type of ports only 
should be able to by-pass that (small?) extra overhead of this higher 
level APIs.

If the implementation would eventually would go into this direction, 
there would be more pressure in the port_id identifier; e.g. KNI 
interfaces and other SW like interfaces can be created and destroyed 
quite frequently (e.g. VMs), so more than 8 bits for addressing would 
probably be needed.

I know it not helping in the short-term, but let's see if someone thinks 
this makes any sense at all.

Marc

>>
>> Regards,
>> /Bruce
>>
> Resending with correct email addr for Neil.
>
> /Bruce



[dpdk-dev] [PATCH v4 1/2] librte_headroom: New library for checking core/system/app load

2015-02-18 Thread De Lara Guarch, Pablo
Hi Pawel,

A few things to fix in this patch:

> -Original Message-
> From: Wodkowski, PawelX
> Sent: Tuesday, February 17, 2015 4:42 PM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH v4 1/2] librte_headroom: New library for checking
> core/system/app load
> 
> This library provide API to measure time spend in particular parts of
> code and to calculate optimal polling time.
> 
> To calculate a those statistics application code need to be devided into

Typo in "devided"

> parts (called jobs) that do something. It is up to application to decide
> what is considered a job.
> 
> Series of jobs must be surrounded with the rte_headroom_start_loop() and
> rte_headroom_finish_loop() calls. After that, jobs might be started.
> Each job must be surrounded with rte_headroom_start_job() and
> rte_headroom_finish_job() calls.
> 
> After job finish its execution, period in which it should be called

Finishes

> again is adjusted to minimize time wasted on unnecessary polls/calls.
> Adjustmend is based on data provided by job itself (ex: number of
> packets it processed).

Adjustment

> 
> After all jobs in serie are executed fallowing statistics are updated
> and might be used by application. Statistics can be reset. Some of
> provided statistic data:
>  - total/min/max execution - time spent in executing jobs.
>  - total/min/max management - time spent outside execution area. This
> value might used to measure overhead of sheduling jobs. This time also

Be used, scheduling

> contains overhead of headroom library itself.
>  - number of loops that executed at least one job
>  - executed jobs
>  - time when statistics were reset.
> 
> Each job provide total/min/max execution time and execution count
> statistics.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  config/common_bsdapp |   5 +
>  config/common_linuxapp   |   5 +
>  lib/Makefile |   1 +
>  lib/librte_headroom/Makefile |  54 +
>  lib/librte_headroom/rte_headroom.c   | 271
> ++
>  lib/librte_headroom/rte_headroom.h   | 324
> +++
>  lib/librte_headroom/rte_headroom_version.map |  20 ++
>  7 files changed, 680 insertions(+)
>  create mode 100644 lib/librte_headroom/Makefile
>  create mode 100644 lib/librte_headroom/rte_headroom.c
>  create mode 100644 lib/librte_headroom/rte_headroom.h
>  create mode 100644 lib/librte_headroom/rte_headroom_version.map
> 

[...]

> diff --git a/lib/librte_headroom/rte_headroom_version.map
> b/lib/librte_headroom/rte_headroom_version.map
> new file mode 100644
> index 000..1f20016
> --- /dev/null
> +++ b/lib/librte_headroom/rte_headroom_version.map
> @@ -0,0 +1,20 @@
> +DPDK_2.0 {
> + global:
> +
> + rte_headroom_init;
> + rte_headroom_start_loop;
> + rte_headroom_finish_loop;
> + rte_headroom_job_init;
> + rte_headroom_set_job_target;
> + rte_headroom_start_job;
> + rte_headroom_finish_job;
> + rte_headroom_job_set_period;
> + rte_headroom_set_min_period;
> + rte_headroom_set_max_period;
> + rte_headroom_set_update_period_function;
> + rte_headroom_reset_job_stats;
> + rte_headroom_reset_stats;
> +

Trailing whitespaces here.

> + local: *;
> +};
> +

Trailing whitespaces here.
> \ No newline at end of file
> --
> 1.9.1



[dpdk-dev] [PATCH v1 1/3] eal: enable uio_pci_generic support

2015-02-18 Thread Thomas Monjalon
Hi Danny,

I wanted to apply this patchset which was reviewed. But when having a quick
overview, I've seen some strange additions.

2015-01-29 17:28, Danny Zhou:
> 1) Unify procedure to retrieve BAR resource mapping information.  
> 2) Setup bus master bit in NIC's PCIe configuration space for uio_pci_generic.
> 
> Signed-off-by: Danny Zhou 
> Tested-by: Qun Wan 
[...]
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -148,6 +148,7 @@ struct rte_pci_device {
>   struct rte_pci_id id;   /**< PCI ID. */
>   struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE];   /**< PCI 
> Memory Resource */
>   struct rte_intr_handle intr_handle; /**< Interrupt handle */
> + char driver_name[BUFSIZ];   /**< driver name */

Why not embedding this field in driver struct?
The name and comment should be more precise.
There is also driver->name and hotplug patchset is adding a kernel driver name.
Please bring the light in all these driver names :)

>   const struct rte_pci_driver *driver;/**< Associated driver */
[...]
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +#define IORESOURCE_MEM0x0200

Please comment this value.

> --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
> @@ -50,8 +50,14 @@ enum rte_intr_handle_type {
>  
>  /** Handle for interrupts. */
>  struct rte_intr_handle {
> - int vfio_dev_fd; /**< VFIO device file descriptor */
> - int fd;  /**< file descriptor */
> + union {
> + int vfio_dev_fd;  /**< VFIO device file descriptor */
> + };
> +union {
> + int uio_cfg_fd;  /**< UIO config file descriptor
> + for uio_pci_generic */
> + };

Apart the indent, it seems there is a mistake here.
Why 2 unions with 1 field each?

> + int fd;  /**< interrupt event file descriptor */
>   enum rte_intr_handle_type type;  /**< handle type */
>  };



[dpdk-dev] [PATCH v4 2/2] examples: introduce new l2fwd-headroom example

2015-02-18 Thread De Lara Guarch, Pablo
Hi Pawel,

A few things to fix:

> -Original Message-
> From: Wodkowski, PawelX
> Sent: Tuesday, February 17, 2015 4:42 PM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH v4 2/2] examples: introduce new l2fwd-headroom example
> 
> This app demonstrate usage of new headroom library.
> It is basicaly orginal l2fwd with following modificantions to met

Typo: Basically the original, modifications

> headroom library requirements:
> - main_loop() was split into two jobs: forward job and flush job. Logic
> for those jobs is almost the same as in orginal application.

original

> - stats is moved to rte_alarm callbac to not introduce overhead of

callback

> printing.
> - stats are expanded to show headroom statistics.
> - added new parameter '-l' to automatic thousands separator.
> 
> Comparing orginal l2fwd and l2fwd-headroom apps will show approach what

original

> is needed to properly write own application with headroom measurements.
> 
> New available statistics:
> - Total and % of fwd and flush execution time
> - management time - overhead of rte_timer + overhead of headroom library
> - Idle time and % of time spent waiting for fwd or flush to be ready to
> execute.
> - per job execution time and period.
> 
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  examples/Makefile|1 +
>  examples/l2fwd-headroom/Makefile |   51 ++
>  examples/l2fwd-headroom/main.c   | 1039
> ++
>  mk/rte.app.mk|4 +
>  4 files changed, 1095 insertions(+)
>  create mode 100644 examples/l2fwd-headroom/Makefile
>  create mode 100644 examples/l2fwd-headroom/main.c
> 
> diff --git a/examples/Makefile b/examples/Makefile
> index 81f1d2f..8a459b7 100644
> --- a/examples/Makefile
> +++ b/examples/Makefile
> @@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_MBUF_REFCNT) +=
> ip_fragmentation
>  DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
>  DIRS-y += l2fwd
> +DIRS-y += l2fwd-headroom
>  DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
>  DIRS-y += l3fwd
>  DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
> diff --git a/examples/l2fwd-headroom/Makefile b/examples/l2fwd-
> headroom/Makefile
> new file mode 100644
> index 000..07da286
> --- /dev/null
> +++ b/examples/l2fwd-headroom/Makefile
> @@ -0,0 +1,51 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +
> +ifeq ($(RTE_SDK),)
> +$(error "Please define RTE_SDK environment variable")
> +endif
> +
> +# Default target, can be overriden by command line or environment

overridden

> +RTE_TARGET ?= x86_64-native-linuxapp-gcc
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = l2fwd-headroom
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c
> +
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +include $(RTE_SDK)/mk/rte.extapp.mk
> diff --git a/examples/l2fwd-headroom/main.c b/examples/l2fwd-
> headroom/main.c

[...]

> + if (qconf->n_rx_port > 0) {
> + job = &qconf->flush_job;
> + printf("\n\nJob %" PRIu32 ": %-20s "
> + "\n%-18s %'14" PRIu64
> + "\n%-18s %'14.0f"
> + STAT_FMT,
> + i, job->name,
> +

[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 02:10:48PM +0100, Marc Sune wrote:
> 
> On 18/02/15 13:31, Bruce Richardson wrote:
> >On Wed, Feb 18, 2015 at 12:30:07PM +, Bruce Richardson wrote:
> >>On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
> >>>Currently uint8_t is used for port identifier. This patch changes it,
> >>>and use uint16_t as port identifier.
> >>>This patch only changes ethdev library. ABI of the library will be
> >>>kept even after applying it.
> >>>
> >>>Also, this patch involves following fixes.
> >>>- Use "port_id" as variable name instead of "port".
> >>>
> >>>
> >>>Signed-off-by: Tetsuya Mukawa 
> >>>---
> >>>  lib/librte_ether/rte_ethdev.c  |  212 +-
> >>>  lib/librte_ether/rte_ethdev_internal.h | 3672 
> >>> 
> >>>  2 files changed, 3778 insertions(+), 106 deletions(-)
> >>>  create mode 100644 lib/librte_ether/rte_ethdev_internal.h
> >>>
> >>I'm not sure I follow why we need a new header file for this.
> >>Also, thinking about this change, a more fundamental problem is going to be
> >>the mbuf structure, which stores a port id inside it in an 8-bit value.
> >>Upgrading that to a 16-bit value requires some thought, and verification to
> >>ensure any adjustment of fields does not lead to serious performance issues.
> >>
> >>Therefore, I suggest we leave the port id values as 8-bits until such time
> >>as we need greater than 255 port values in a real-world use case.
> >>Out of interest - anyone have a DPDK app where they use >16 port id values? 
> >>If
> >>so, how high does the port id value get?
> 
> Just a though on port_id in general; I wouldn't see why other type of ports
> could fall into the same abstraction of using port_ids as we do for PHY
> ports, if eventually we could create a unified API TX/RX routines he same
> regardless of the port (I know KNI deprecated this approach in the past). Of
> course initialization routines should be different for each type of port.
> 
> I see quite a bit of code duplicity, basically in TX/RX routines for PHY
> ports, KNI ports, SHMEM (ring) ports like ivshmem etc.., which are very
> similar, and we put into the shoulders of all users of DPDK to have to do
> the "switch() - case" based on the type of port (which is state that they
> have to store themselves too). This seems to me it could be improved from a
> DPDK user's point of view.
> 
> By no means I am saying lower level APIs should not be exposed (current
> APIs)... There is the need to, since users using one type of ports only
> should be able to by-pass that (small?) extra overhead of this higher level
> APIs.
> 
> If the implementation would eventually would go into this direction, there
> would be more pressure in the port_id identifier; e.g. KNI interfaces and
> other SW like interfaces can be created and destroyed quite frequently (e.g.
> VMs), so more than 8 bits for addressing would probably be needed.
> 
> I know it not helping in the short-term, but let's see if someone thinks
> this makes any sense at all.
> 
> Marc
> 

Yes, this makes complete sense, Marc, and it's something I (and some others 
here in Intel ) have certainly been thinking about - and go on thinking about.
The ability to have multiple objects of different types accessible under a
generic rx_burst/tx_burst interface is a very powerful one. 
[I actually tried something a bit like this before, with
patches to allow a form of type-casting from rings to ethdevs, but it didn't
go further as it was felt to overlap too much with the rings pmd.
http://dpdk.org/ml/archives/dev/2014-May/002505.html ].

Overall, at this point we believe that the ethdev itself is a bit of overkill
for a common API, since it's got a lot of NIC specific baggage in its APIs.
We are thinking that perhaps a "higher-level", more minimal abstraction, which
just has rx_burst or tx_burst functions and not a lot else, might be more
useful, as it can then be "sub-classed" [to use object-oriented terminology]
into device-type specific versions such as ethdev.
At this stage, we're just thinking about what such a thing might look like, and 
trying
a few things out to see if such an idea is workable. If we think it's something
worth pursuing, we hope to have something to share very soon, to get input
from the whole community to see if it's worth doing for future releases. 
However,
it appears your thinking closely aligns with what we are thinking, which is good
to see!

Regards,
/Bruce

> >>
> >>Regards,
> >>/Bruce
> >>
> >Resending with correct email addr for Neil.
> >
> >/Bruce
> 


[dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port identifier

2015-02-18 Thread Bruce Richardson
On Wed, Feb 18, 2015 at 01:05:10PM +, Wodkowski, PawelX wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, February 18, 2015 1:32 PM
> > To: Tetsuya Mukawa
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [RFC PATCH] lib/librte_ethdev: Expand port 
> > identifier
> > 
> > On Wed, Feb 18, 2015 at 12:30:07PM +, Bruce Richardson wrote:
> > > On Wed, Feb 18, 2015 at 08:02:49PM +0900, Tetsuya Mukawa wrote:
> > > > Currently uint8_t is used for port identifier. This patch changes it,
> > > > and use uint16_t as port identifier.
> > > > This patch only changes ethdev library. ABI of the library will be
> > > > kept even after applying it.
> > > >
> > > > Also, this patch involves following fixes.
> > > > - Use "port_id" as variable name instead of "port".
> > > >
> > > >
> > > > Signed-off-by: Tetsuya Mukawa 
> > > > ---
> > > >  lib/librte_ether/rte_ethdev.c  |  212 +-
> > > >  lib/librte_ether/rte_ethdev_internal.h | 3672
> > 
> > > >  2 files changed, 3778 insertions(+), 106 deletions(-)
> > > >  create mode 100644 lib/librte_ether/rte_ethdev_internal.h
> > > >
> > > I'm not sure I follow why we need a new header file for this.
> > > Also, thinking about this change, a more fundamental problem is going to 
> > > be
> > > the mbuf structure, which stores a port id inside it in an 8-bit value.
> > > Upgrading that to a 16-bit value requires some thought, and verification 
> > > to
> > > ensure any adjustment of fields does not lead to serious performance 
> > > issues.
> > >
> > > Therefore, I suggest we leave the port id values as 8-bits until such time
> > > as we need greater than 255 port values in a real-world use case.
> > > Out of interest - anyone have a DPDK app where they use >16 port id 
> > > values? If
> > > so, how high does the port id value get?
> 
> Not real application but simple example of setup:
> 4 Niantic x 2 ports x 64 VF = 512 port id

However, I'd find it hard to see why you would split a single port into 64
within a *single* application, let alone do that with 8 of them :-)

> 
> I don't know what would be the real need/advantage of such setup (bonding?) 
> but 
> you see, in theory it is already insufficient.

Well, in theory any number of ports is insufficient, as you can create thousands
and thousands of pcap or ring ethdevs in an app if you really want. Hence the
question about any actual use cases which use a large number of ports. :-)

/Bruce

> 
> > >
> > > Regards,
> > > /Bruce
> > >
> > 
> > Resending with correct email addr for Neil.
> > 
> > /Bruce


[dpdk-dev] [PATCH] eal: add missing symbol export for rte_eal_iopl_init()

2015-02-18 Thread Panu Matilainen
Signed-off-by: Panu Matilainen 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 1 +
 lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index d36286e..5ed6e4d 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -33,6 +33,7 @@ DPDK_2.0 {
rte_eal_has_hugepages;
rte_eal_hpet_init;
rte_eal_init;
+   rte_eal_iopl_init;
rte_eal_lcore_role;
rte_eal_mp_remote_launch;
rte_eal_mp_wait_lcore;
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index d36286e..5ed6e4d 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -33,6 +33,7 @@ DPDK_2.0 {
rte_eal_has_hugepages;
rte_eal_hpet_init;
rte_eal_init;
+   rte_eal_iopl_init;
rte_eal_lcore_role;
rte_eal_mp_remote_launch;
rte_eal_mp_wait_lcore;
-- 
2.1.0



[dpdk-dev] [PATCH] eal: add missing symbol export for rte_eal_iopl_init()

2015-02-18 Thread Gonzalez Monroy, Sergio
On 18/02/2015 14:17, Panu Matilainen wrote:
> Signed-off-by: Panu Matilainen 
> ---
>   lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 1 +
>   lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>   2 files changed, 2 insertions(+)
>
> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> index d36286e..5ed6e4d 100644
> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
> @@ -33,6 +33,7 @@ DPDK_2.0 {
>   rte_eal_has_hugepages;
>   rte_eal_hpet_init;
>   rte_eal_init;
> + rte_eal_iopl_init;
>   rte_eal_lcore_role;
>   rte_eal_mp_remote_launch;
>   rte_eal_mp_wait_lcore;
> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
> b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> index d36286e..5ed6e4d 100644
> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> @@ -33,6 +33,7 @@ DPDK_2.0 {
>   rte_eal_has_hugepages;
>   rte_eal_hpet_init;
>   rte_eal_init;
> + rte_eal_iopl_init;
>   rte_eal_lcore_role;
>   rte_eal_mp_remote_launch;
>   rte_eal_mp_wait_lcore;
Duplicated of http://dpdk.org/dev/patchwork/patch/3234/ ?

Sergio


[dpdk-dev] [PATCH v4 0/5] New Reorder Library

2015-02-18 Thread Thomas Monjalon
Hi Sergio,

2015-02-11 13:07, Sergio Gonzalez Monroy:
> This series introduces the new reorder library along with unit tests,
> sample app and a new entry in the programmers guide describing the library.
> 
> The library provides reordering of mbufs based on their sequence number.
> 
> As mention in the patch describing the library, one use case is the
> packet distributor.
> The distributor receives packets, assigns them a sequence number and sends
> them to the workers.
> The workers process those packets and return them to the distributor.
> The distributor collects out-of-order packets from the workers and uses
> this library to reorder the packets based on the sequence number they
> were assigned.
> 
> v4:
>  - add missing version.map and related versioning macros

You forgot to update the MAINTAINERS file.

> v3:
>  - fix copyright date
>  - add option to sample app to disable reordering
>  - add packet ordering sample guide entry
> 
> v2:
>  - add programmers guide entry describing the library
>  - use malloc instead of memzone to allocate memory
>  - modify create and init implementation, init takes a reorder buffer as input
>and create reserves memory and call init.
>  - update unit tests
> 
> Sergio Gonzalez Monroy (5):
>   reorder: new reorder library
>   app: New reorder unit test
>   examples: new sample app packet_ordering
>   doc: new reorder library description
>   doc: new packet ordering app description



[dpdk-dev] [PATCH] eal: add missing symbol export for rte_eal_iopl_init()

2015-02-18 Thread Panu Matilainen
On 02/18/2015 04:21 PM, Gonzalez Monroy, Sergio wrote:
> On 18/02/2015 14:17, Panu Matilainen wrote:
>> Signed-off-by: Panu Matilainen 
>> ---
>>   lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 1 +
>>   lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>>   2 files changed, 2 insertions(+)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> index d36286e..5ed6e4d 100644
>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> @@ -33,6 +33,7 @@ DPDK_2.0 {
>>   rte_eal_has_hugepages;
>>   rte_eal_hpet_init;
>>   rte_eal_init;
>> +rte_eal_iopl_init;
>>   rte_eal_lcore_role;
>>   rte_eal_mp_remote_launch;
>>   rte_eal_mp_wait_lcore;
>> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> index d36286e..5ed6e4d 100644
>> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
>> @@ -33,6 +33,7 @@ DPDK_2.0 {
>>   rte_eal_has_hugepages;
>>   rte_eal_hpet_init;
>>   rte_eal_init;
>> +rte_eal_iopl_init;
>>   rte_eal_lcore_role;
>>   rte_eal_mp_remote_launch;
>>   rte_eal_mp_wait_lcore;
> Duplicated of http://dpdk.org/dev/patchwork/patch/3234/ ?

Yup, a dupe it is. Probably also saw your patch at the time as I had a 
slight sense of dejavu with this, but thought such a trivial thing 
would've been applied already...

- Panu -



[dpdk-dev] [PATCH v4 0/5] New Reorder Library

2015-02-18 Thread Gonzalez Monroy, Sergio
On 18/02/2015 14:22, Thomas Monjalon wrote:
> Hi Sergio,
>
> 2015-02-11 13:07, Sergio Gonzalez Monroy:
>> This series introduces the new reorder library along with unit tests,
>> sample app and a new entry in the programmers guide describing the library.
>>
>> The library provides reordering of mbufs based on their sequence number.
>>
>> As mention in the patch describing the library, one use case is the
>> packet distributor.
>> The distributor receives packets, assigns them a sequence number and sends
>> them to the workers.
>> The workers process those packets and return them to the distributor.
>> The distributor collects out-of-order packets from the workers and uses
>> this library to reorder the packets based on the sequence number they
>> were assigned.
>>
>> v4:
>>   - add missing version.map and related versioning macros
> You forgot to update the MAINTAINERS file.
>
>> v3:
>>   - fix copyright date
>>   - add option to sample app to disable reordering
>>   - add packet ordering sample guide entry
>>
>> v2:
>>   - add programmers guide entry describing the library
>>   - use malloc instead of memzone to allocate memory
>>   - modify create and init implementation, init takes a reorder buffer as 
>> input
>> and create reserves memory and call init.
>>   - update unit tests
>>
>> Sergio Gonzalez Monroy (5):
>>reorder: new reorder library
>>app: New reorder unit test
>>examples: new sample app packet_ordering
>>doc: new reorder library description
>>doc: new packet ordering app description
True!

I'll do v5.

Sergio


[dpdk-dev] [PATCH v5 0/6] New Reorder Library

2015-02-18 Thread Sergio Gonzalez Monroy
This series introduces the new reorder library along with unit tests,
sample app and a new entry in the programmers guide describing the library.

The library provides reordering of mbufs based on their sequence number.

As mention in the patch describing the library, one use case is the
packet distributor.
The distributor receives packets, assigns them a sequence number and sends
them to the workers.
The workers process those packets and return them to the distributor.
The distributor collects out-of-order packets from the workers and uses
this library to reorder the packets based on the sequence number they
were assigned.

v5:
 - update MAINTAINERS

v4:
 - add missing version.map and related versioning macros

v3:
 - fix copyright date
 - add option to sample app to disable reordering
 - add packet ordering sample guide entry

v2:
 - add programmers guide entry describing the library
 - use malloc instead of memzone to allocate memory
 - modify create and init implementation, init takes a reorder buffer as input
   and create reserves memory and call init.
 - update unit tests

Sergio Gonzalez Monroy (6):
  reorder: new reorder library
  app: New reorder unit test
  examples: new sample app packet_ordering
  doc: new reorder library description
  doc: new packet ordering app description
  MAINTAINERS: add and claim reorder

 MAINTAINERS|   8 +
 app/test/Makefile  |   2 +
 app/test/test_reorder.c| 393 ++
 config/common_bsdapp   |   5 +
 config/common_linuxapp |   5 +
 doc/guides/prog_guide/index.rst|   1 +
 doc/guides/prog_guide/reorder_lib.rst  | 115 
 doc/guides/sample_app_ug/index.rst |   1 +
 doc/guides/sample_app_ug/packet_ordering.rst   | 102 
 examples/packet_ordering/Makefile  |  50 ++
 examples/packet_ordering/main.c| 695 +
 lib/Makefile   |   1 +
 lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
 lib/librte_mbuf/rte_mbuf.h |   3 +
 lib/librte_reorder/Makefile|  54 ++
 lib/librte_reorder/rte_reorder.c   | 416 +++
 lib/librte_reorder/rte_reorder.h   | 181 +++
 lib/librte_reorder/rte_reorder_version.map |  13 +
 mk/rte.app.mk  |   4 +
 19 files changed, 2051 insertions(+)
 create mode 100644 app/test/test_reorder.c
 create mode 100644 doc/guides/prog_guide/reorder_lib.rst
 create mode 100644 doc/guides/sample_app_ug/packet_ordering.rst
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h
 create mode 100644 lib/librte_reorder/rte_reorder_version.map

-- 
1.9.3



[dpdk-dev] [PATCH v5 4/6] doc: new reorder library description

2015-02-18 Thread Sergio Gonzalez Monroy
This patch introduces a new section in the programmers guide describing
the reorder library.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/prog_guide/index.rst   |   1 +
 doc/guides/prog_guide/reorder_lib.rst | 115 ++
 2 files changed, 116 insertions(+)
 create mode 100644 doc/guides/prog_guide/reorder_lib.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8d86dd4..de69682 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -61,6 +61,7 @@ Programmer's Guide
 lpm_lib
 lpm6_lib
 packet_distrib_lib
+reorder_lib
 ip_fragment_reassembly_lib
 multi_proc_support
 kernel_nic_interface
diff --git a/doc/guides/prog_guide/reorder_lib.rst 
b/doc/guides/prog_guide/reorder_lib.rst
new file mode 100644
index 000..bbd0521
--- /dev/null
+++ b/doc/guides/prog_guide/reorder_lib.rst
@@ -0,0 +1,115 @@
+..  BSD LICENSE
+Copyright(c) 2015 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Reorder_Library:
+
+Reorder Library
+=
+
+The Reorder Library provides a mechanism for reordering mbufs based on their
+sequence number.
+
+Operation
+--
+
+The reorder library is essentially a buffer that reorders mbufs.
+The user inserts out of order mbufs into the reorder buffer and pulls in-order
+mbufs from it.
+
+At a given time, the reorder buffer contains mbufs whose sequence number are
+inside the sequence window. The sequence window is determined by the minimum
+sequence number and the number of entries that the buffer was configured to 
hold.
+For example, given a reorder buffer with 200 entries and a minimum sequence
+number of 350, the sequence window has low and high limits of 350 and 550
+respectively.
+
+When inserting mbufs, the reorder library differentiates between valid, early
+and late mbufs depending on the sequence number of the inserted mbuf:
+
+* valid: the sequence number is inside the window.
+* late: the sequence number is outside the window and less than the low limit.
+* early: the sequence number is outside the window and greater than the high
+  limit.
+
+The reorder buffer directly returns late mbufs and tries to accommodate early
+mbufs.
+
+
+Implementation Details
+-
+
+The reorder library is implemented as a pair of buffers, which referred to as
+the *Order* buffer and the *Ready* buffer.
+
+On an insert call, valid mbufs are inserted directly into the Order buffer and
+late mbufs are returned to the user with an error.
+
+In the case of early mbufs, the reorder buffer will try to move the window
+(incrementing the minimum sequence number) so that the mbuf becomes a valid 
one.
+To that end, mbufs in the Order buffer are moved into the Ready buffer.
+Any mbufs that have not arrived yet are ignored and therefore will become
+late mbufs.
+This means that as long as there is room in the Ready buffer, the window will
+be moved to accommodate early mbufs that would otherwise be outside the
+reordering window.
+
+For example, assuming that we have a buffer of 200 entries with a 350 minimum
+sequence number, and we need to insert an early mbuf with 565 sequence number.
+That means that we would need to move the windows at least 15 positions to
+accommodate the mbuf.
+The reorder buffer would try to move mbufs from at least the next 15 slots in
+the Order buffer to the Ready buffer, as lo

[dpdk-dev] [PATCH v5 2/6] app: New reorder unit test

2015-02-18 Thread Sergio Gonzalez Monroy
Adding new reorder unit test for the test app.
The command to run the unit test from the test shell is: reorder_autotest

Signed-off-by: Reshma Pattan 
Signed-off-by: Sergio Gonzalez Monroy 
---
 app/test/Makefile   |   2 +
 app/test/test_reorder.c | 393 
 mk/rte.app.mk   |   4 +
 3 files changed, 399 insertions(+)
 create mode 100644 app/test/test_reorder.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 4311f96..24b27d7 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c

+SRCS-$(CONFIG_RTE_LIBRTE_REORDER) += test_reorder.c
+
 SRCS-y += test_devargs.c
 SRCS-y += virtual_pmd.c
 SRCS-y += packet_burst_generator.c
diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c
new file mode 100644
index 000..61cf8d3
--- /dev/null
+++ b/app/test/test_reorder.c
@@ -0,0 +1,393 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "test.h"
+#include "stdio.h"
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define BURST 32
+#define REORDER_BUFFER_SIZE 16384
+#define NUM_MBUFS (2*REORDER_BUFFER_SIZE)
+#define REORDER_BUFFER_SIZE_INVALID 2049
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+struct reorder_unittest_params {
+   struct rte_mempool *p;
+   struct rte_reorder_buffer *b;
+};
+
+static struct reorder_unittest_params default_params  = {
+   .p = NULL,
+   .b = NULL
+};
+
+static struct reorder_unittest_params *test_params = &default_params;
+
+static int
+test_reorder_create(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+
+   b = rte_reorder_create(NULL, rte_socket_id(), REORDER_BUFFER_SIZE);
+   TEST_ASSERT((b == NULL) && (rte_errno == EINVAL),
+   "No error on create() with NULL name");
+
+   b = rte_reorder_create("PKT", rte_socket_id(), 
REORDER_BUFFER_SIZE_INVALID);
+   TEST_ASSERT((b == NULL) && (rte_errno == EINVAL),
+   "No error on create() with invalid buffer size param.");
+
+   b = rte_reorder_create("PKT_RO1", rte_socket_id(), REORDER_BUFFER_SIZE);
+   printf("DEBUG: b= %p, orig_b= %p\n", b, test_params->b);
+   TEST_ASSERT_EQUAL(b, test_params->b,
+   "New reorder instance created with already existing 
name");
+
+   return 0;
+}
+
+static int
+test_reorder_init(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+   unsigned int size;
+   /*
+* The minimum memory area size that should be passed to library is,
+* sizeof(struct rte_reorder_buffer) + (2 * size * sizeof(struct 
rte_mbuf *));
+* Otherwise error will be thrown
+*/
+
+   size = 100;
+   b = rte_reorder_init(b, size, "PKT1", REORDER_BUFFER_SIZE);
+   TEST_ASSERT((b == NULL) && (rte_errno == EINVAL),
+   "No error on init with NULL buffer.");
+
+   b = rte_malloc(NULL, size, 0);
+   b = rte_reorder_init(b, size, "PKT1", REORDER_BUFFER_SIZE);
+   TEST_ASSERT((b == NULL) && (rte_errno == EINVAL),
+   

[dpdk-dev] [PATCH v5 1/6] reorder: new reorder library

2015-02-18 Thread Sergio Gonzalez Monroy
This library provides reordering capability for out of order mbufs based
on a sequence number in the mbuf structure.

Signed-off-by: Reshma Pattan 
Signed-off-by: Richardson Bruce 
Signed-off-by: Sergio Gonzalez Monroy 
---
 config/common_bsdapp   |   5 +
 config/common_linuxapp |   5 +
 lib/Makefile   |   1 +
 lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
 lib/librte_mbuf/rte_mbuf.h |   3 +
 lib/librte_reorder/Makefile|  54 
 lib/librte_reorder/rte_reorder.c   | 416 +
 lib/librte_reorder/rte_reorder.h   | 181 +++
 lib/librte_reorder/rte_reorder_version.map |  13 +
 9 files changed, 680 insertions(+)
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h
 create mode 100644 lib/librte_reorder/rte_reorder_version.map

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 8cfa4e6..f11ff39 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -344,6 +344,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y

 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index db8332d..f921d8c 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -352,6 +352,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y

 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index 561a696..6575a4e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -67,6 +67,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += librte_distributor
 DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
+DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder

 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_tailq_elem.h 
b/lib/librte_eal/common/include/rte_tailq_elem.h
index f74fc7c..3013869 100644
--- a/lib/librte_eal/common/include/rte_tailq_elem.h
+++ b/lib/librte_eal/common/include/rte_tailq_elem.h
@@ -84,6 +84,8 @@ rte_tailq_elem(RTE_TAILQ_ACL, "RTE_ACL")

 rte_tailq_elem(RTE_TAILQ_DISTRIBUTOR, "RTE_DISTRIBUTOR")

+rte_tailq_elem(RTE_TAILQ_REORDER, "RTE_REORDER")
+
 rte_tailq_end(RTE_TAILQ_NUM)

 #undef rte_tailq_elem
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index e3008c6..ace6736 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -289,6 +289,9 @@ struct rte_mbuf {
uint32_t usr; /**< User defined tags. See 
@rte_distributor_process */
} hash;   /**< hash information */

+   /* sequence number - field used in distributor and reorder library */
+   uint32_t seqn;
+
/* second cache line - fields only used in slow path or on TX */
MARKER cacheline1 __rte_cache_aligned;

diff --git a/lib/librte_reorder/Makefile b/lib/librte_reorder/Makefile
new file mode 100644
index 000..0c01de1
--- /dev/null
+++ b/lib/librte_reorder/Makefile
@@ -0,0 +1,54 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGL

[dpdk-dev] [PATCH v5 3/6] examples: new sample app packet_ordering

2015-02-18 Thread Sergio Gonzalez Monroy
This new app makes use of the librte_reorder library.

It requires at least 3 lcores for RX, Workers (1 or more) and TX threads.
Communication between RX-Workers and Workers-TX is done by using rings.
The flow of mbufs is the following:
 * RX thread gets mbufs from driver, set sequence number and enqueue
   them in ring.
 * Workers dequeue mbufs from ring, do some 'work' and enqueue mbufs in
   ring.
 * TX dequeue mbufs from ring, inserts them in reorder buffer, drains
   mbufs from reorder and sends them to the driver.

Signed-off-by: Reshma Pattan 
Signed-off-by: Sergio Gonzalez Monroy 
---
 examples/packet_ordering/Makefile |  50 +++
 examples/packet_ordering/main.c   | 695 ++
 2 files changed, 745 insertions(+)
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c

diff --git a/examples/packet_ordering/Makefile 
b/examples/packet_ordering/Makefile
new file mode 100644
index 000..9e080a3
--- /dev/null
+++ b/examples/packet_ordering/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-ivshmem-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = packet_ordering
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
new file mode 100644
index 000..75e2f46
--- /dev/null
+++ b/examples/packet_ordering/main.c
@@ -0,0 +1,695 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE U

[dpdk-dev] [PATCH v5 6/6] MAINTAINERS: add and claim reorder

2015-02-18 Thread Sergio Gonzalez Monroy
Add files related to reorder library and claim it.

Signed-off-by: Sergio Gonzalez Monroy 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e7a425b..d7d672c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -264,6 +264,14 @@ F: app/test/test_distributor*
 F: examples/distributor/
 F: doc/guides/sample_app_ug/dist_app.rst

+Reorder
+M: Sergio Gonzalez Monroy 
+F: lib/librte_reorder/
+F: doc/guides/prog_guide/reorder_lib.rst
+F: app/test/test_reorder*
+F: examples/packet_ordering/
+F: doc/guides/sample_app_ug/packet_ordering.rst
+
 Hierarchical scheduler
 M: Cristian Dumitrescu 
 F: lib/librte_sched/
-- 
1.9.3



[dpdk-dev] [PATCH v5 5/6] doc: new packet ordering app description

2015-02-18 Thread Sergio Gonzalez Monroy
This patch describes how to build and run he new packet ordering sample
application that exercises the reorder library.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/sample_app_ug/index.rst   |   1 +
 doc/guides/sample_app_ug/packet_ordering.rst | 102 +++
 2 files changed, 103 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/packet_ordering.rst

diff --git a/doc/guides/sample_app_ug/index.rst 
b/doc/guides/sample_app_ug/index.rst
index d07dec3..5720181 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -60,6 +60,7 @@ Sample Applications User Guide
 intel_quickassist
 quota_watermark
 timer
+packet_ordering
 vmdq_dcb_forwarding
 vhost
 netmap_compatibility
diff --git a/doc/guides/sample_app_ug/packet_ordering.rst 
b/doc/guides/sample_app_ug/packet_ordering.rst
new file mode 100644
index 000..481f1b7
--- /dev/null
+++ b/doc/guides/sample_app_ug/packet_ordering.rst
@@ -0,0 +1,102 @@
+..  BSD LICENSE
+Copyright(c) 2015 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Packet Ordering Application
+
+
+The Packet Ordering sample app simply shows the impact of reordering a stream.
+It's meant to stress the library with different configurations for performance.
+
+Overview
+
+
+The application uses at least three CPU cores:
+
+* RX core (maser core) receives traffic from the NIC ports and feeds Worker
+  cores with traffic through SW queues.
+
+* Worker core (slave core) basically do some light work on the packet.
+  Currently it modifies the output port of the packet for configurations with
+  more than one port enabled.
+
+* TX Core (slave core) receives traffic from Woker cores through software 
queues,
+  inserts out-of-order packets into reorder buffer, extracts ordered packets
+  from the reorder buffer and sends them to the NIC ports for transmission.
+
+Compiling the Application
+--
+
+#.  Go to the example directory:
+
+.. code-block:: console
+
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/helloworld
+
+#.  Set the target (a default target is used if not specified). For example:
+
+.. code-block:: console
+
+export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+See the *DPDK Getting Started* Guide for possible RTE_TARGET values.
+
+#.  Build the application:
+
+.. code-block:: console
+
+make
+
+Running the Application
+---
+
+Refer to *DPDK Getting Started Guide* for general information on running 
applications
+and the Environment Abstraction Layer (EAL) options.
+
+Application Command Line
+
+
+The application execution command line is:
+
+.. code-block:: console
+
+./test-pipeline [EAL options] -- -p PORTMASK [--disable-reorder]
+
+The -c EAL CPU_COREMASK option has to contain at least 3 CPU cores.
+The first CPU core in the core mask is the master core and would be assigned to
+RX core, the last to TX core and the rest to Worker cores.
+
+The PORTMASK parameter must contain either 1 or even enabled port numbers.
+When setting more than 1 port, traffic would be forwarderd in pairs.
+For example, if we enable 4 ports, traffic from port 0 to 1 and from 1 to 0,
+then the other pair from 2 to 3 and from 3 to 2, having [0,1] and [2,3] pairs.
+
+The disable-reorder long option does, as its name 

[dpdk-dev] [PATCH v5 0/6] New Reorder Library

2015-02-18 Thread Thomas Monjalon
> This series introduces the new reorder library along with unit tests,
> sample app and a new entry in the programmers guide describing the library.
> 
> The library provides reordering of mbufs based on their sequence number.
> 
> As mention in the patch describing the library, one use case is the
> packet distributor.
> The distributor receives packets, assigns them a sequence number and sends
> them to the workers.
> The workers process those packets and return them to the distributor.
> The distributor collects out-of-order packets from the workers and uses
> this library to reorder the packets based on the sequence number they
> were assigned.
> 
> v5:
>  - update MAINTAINERS
> 
> v4:
>  - add missing version.map and related versioning macros
> 
> v3:
>  - fix copyright date
>  - add option to sample app to disable reordering
>  - add packet ordering sample guide entry
> 
> v2:
>  - add programmers guide entry describing the library
>  - use malloc instead of memzone to allocate memory
>  - modify create and init implementation, init takes a reorder buffer as input
>and create reserves memory and call init.
>  - update unit tests
> 
> Sergio Gonzalez Monroy (6):
>   reorder: new reorder library
>   app: New reorder unit test
>   examples: new sample app packet_ordering
>   doc: new reorder library description
>   doc: new packet ordering app description
>   MAINTAINERS: add and claim reorder

Applied, thanks


[dpdk-dev] [PATCH 2/3] doc: ACL - add description for different classification methods

2015-02-18 Thread Konstantin Ananyev
Signed-off-by: Konstantin Ananyev 
---
 doc/guides/prog_guide/packet_classif_access_ctrl.rst | 16 
 1 file changed, 16 insertions(+)

diff --git a/doc/guides/prog_guide/packet_classif_access_ctrl.rst 
b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
index e018c68..d2adbff 100644
--- a/doc/guides/prog_guide/packet_classif_access_ctrl.rst
+++ b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
@@ -269,6 +269,22 @@ When creating a set of rules, for each rule, additional 
information must be supp
 When adding new rules into an ACL context, all fields must be in host byte 
order (LSB).
 When the search is performed for an input tuple, all fields in that tuple 
must be in network byte order (MSB).

+Classification methods
+~~
+
+After rte_acl_build() over given ACL context has finished successfully, it can 
be used to perform classification - search for a ACL rule with highest priority 
over the input data.
+There are several implementations of classify algorithm:
+
+*   **RTE_ACL_CLASSIFY_SCALAR**: generic implementation, doesn't require any 
specific HW support.
+
+*   **RTE_ACL_CLASSIFY_SSE**: vector implementation, can process up to 8 flows 
in parallel. Requires SSE 4.1 support.
+
+*   **RTE_ACL_CLASSIFY_AVX2**: vector implementation, can process up to 16 
flows in parallel. Requires AVX2 support.
+
+It is purely a runtime decision which method to choose, there is no build-time 
difference.
+All implementations operates over the same internal RT structures and use 
similar principles. The main difference is that vector implementations can 
manually exploit IA SIMD instructions and process several input data flows in 
parallel.
+At startup ACL library determines the highest available classify method for 
the given platform and sets it as default one. Though the user has an ability 
to override the default classifier function for a given ACL context or perform 
particular search using non-default classify method. In that case it is user 
responsibility to make sure that given platform supports selected classify 
implementation.
+
 Application Programming Interface (API) Usage
 -

-- 
1.8.3.1



[dpdk-dev] [PATCH 1/3] doc: Add restrictions for ACL rule fields

2015-02-18 Thread Konstantin Ananyev
Signed-off-by: Konstantin Ananyev 
---
 doc/guides/prog_guide/packet_classif_access_ctrl.rst | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/packet_classif_access_ctrl.rst 
b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
index 72f4510..e018c68 100644
--- a/doc/guides/prog_guide/packet_classif_access_ctrl.rst
+++ b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -51,8 +51,18 @@ The library API provides the following basic operations:
 Overview
 

+Rule definition
+~~~
+
 The current implementation allows the user for each AC context to specify its 
own rule (set of fields)
 over which packet classification will be performed.
+Though there are few restrictions on the rule fields layout:
+
+*  First field in the rule definition has to be one byte long.
+*  All subsequent fields has to be grouped into sets of 4 consecutive bytes.
+
+This is done mainly for performance reasons - search function processes the 
first input byte as part of the flow setup and then the inner loop of the 
search function is unrolled to process four input bytes at a time.
+
 To define each field inside an AC rule, the following structure is used:

 .. code-block:: c
@@ -85,10 +95,7 @@ To define each field inside an AC rule, the following 
structure is used:
 A zero-based value that represents the position of the field inside the 
rule; 0 to N-1 for N fields.

 *   input_index
-For performance reasons, the inner loop of the search function is unrolled 
to process four input bytes at a time.
-This requires the input to be grouped into sets of 4 consecutive bytes.
-The loop processes the first input byte as part of the setup and then
-subsequent bytes must be in groups of 4 consecutive bytes.
+As mentioned above, all input fields, except the very first one, must be 
in groups of 4 consecutive bytes.
 The input index specifies to which input group that field belongs to.

 *   offset
-- 
1.8.3.1



[dpdk-dev] [PATCH 0/3] doc: ACL - add description for new features.

2015-02-18 Thread Konstantin Ananyev
Konstantin Ananyev (3):
  doc: Add restrictions for ACL rule fields
  doc: ACL - add description for different classification methods
  doc: ACL - add description for max_size build parameter

 .../prog_guide/packet_classif_access_ctrl.rst  | 83 --
 1 file changed, 78 insertions(+), 5 deletions(-)

-- 
1.8.3.1



[dpdk-dev] [PATCH 3/3] doc: ACL - add description for max_size build parameter

2015-02-18 Thread Konstantin Ananyev
Signed-off-by: Konstantin Ananyev 
---
 .../prog_guide/packet_classif_access_ctrl.rst  | 52 +-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/packet_classif_access_ctrl.rst 
b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
index d2adbff..210b020 100644
--- a/doc/guides/prog_guide/packet_classif_access_ctrl.rst
+++ b/doc/guides/prog_guide/packet_classif_access_ctrl.rst
@@ -269,10 +269,60 @@ When creating a set of rules, for each rule, additional 
information must be supp
 When adding new rules into an ACL context, all fields must be in host byte 
order (LSB).
 When the search is performed for an input tuple, all fields in that tuple 
must be in network byte order (MSB).

+RT memory size limit
+
+
+Build phase (rte_acl_build()) creates for a given set of rules internal 
structure for further run-time traversal.
+With current implementation it is a set of multi-bit tries (with stride == 8).
+Depending on the rules set, that could consume significant amount of memory.
+In attempt to conserve some space ACL build process tries to split the given
+rule-set into several non-intersecting subsets and construct a separate trie
+for each of them.
+Depending on the rule-set, it might reduce RT memory requirements but might
+increase classification time.
+There is a possibility at build-time to specify maximum memory limit for 
internal RT structures for given AC context.
+It could be done via **max_size** field of the **rte_acl_config** strucure.
+Setting it to the value greater than zero, instructs rte_acl_build() to:
+
+*   attempt to minimise number of tries in the RT table, but
+*   make sure that size of RT table wouldn't exceed given value.
+
+Setting it to zero makes rte_acl_build() to use the default behaviour:
+try to minimise size of the RT structures, but doesn't expose any hard limit 
on it.
+
+That gives the user the ability to decisions about performance/space trade-off.
+For example:
+
+.. code-block:: c
+
+struct rte_acl_ctx * acx;
+struct rte_acl_config cfg;
+int ret;
+
+/*
+ * assuming that acx points to already created and
+ * populated with rules AC context and cfg filled properly.
+ */
+
+ /* try to build AC context, with RT strcutures less then 8MB. */
+ cfg.max_size = 0x80;
+ ret = rte_acl_build(acx, &cfg);
+
+ /*
+  * RT strcutures can't fit into 8MB for given context.
+  * Try to build without exposing any hard limit.
+  */
+ if (ret == -ERANGE) {
+cfg.max_size = 0;
+ret = rte_acl_build(acx, &cfg);
+ }
+
+
+
 Classification methods
 ~~

-After rte_acl_build() over given ACL context has finished successfully, it can 
be used to perform classification - search for a ACL rule with highest priority 
over the input data.
+After rte_acl_build() over given AC context has finished successfully, it can 
be used to perform classification - search for a rule with highest priority 
over the input data.
 There are several implementations of classify algorithm:

 *   **RTE_ACL_CLASSIFY_SCALAR**: generic implementation, doesn't require any 
specific HW support.
-- 
1.8.3.1



[dpdk-dev] [PATCH 0/2] rte_ethdev fix/improvement

2015-02-18 Thread Thomas Monjalon
> > This patch series include a fix and an improvement to rte_ethdev lib.
> > 
> > Jia Yu (2):
> >   rte_ethdev: update link status (speed, duplex, link_up) after
> > rte_eth_dev_start
> >   rte_ethdev: add return status for rte_eth_stats_get
> 
> Acked-by: Helin Zhang 

Applied, thanks


[dpdk-dev] fm10k_rxtx.c does not compile

2015-02-18 Thread Wiles, Keith
I just pulled the code and found a unused function error ?dump_rxd()?

I had to add the ifdef around the function and remove the ifdef inside the 
function:

#ifdef RTE_LIBRTE_FM10K_DEBUG_RX
static inline void dump_rxd(union fm10k_rx_desc *rxd)
{
RTE_SET_USED(rxd);
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "| GLORT  | PKT HDR & TYPE |");
PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.glort,
rxd->d.data);
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "|   VLAN & LEN   | STATUS |");
PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.vlan_len,
rxd->d.staterr);
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "|RESERVED|RSS_HASH|");
PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", 0, rxd->d.rss);
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "|TIME TAG |");
PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
PMD_RX_LOG(DEBUG, "+|+");
}
#endif

Also clang on Ubuntu 14.04 does not like the option:
  CC fm10k_pf.o
error: unknown warning option '-Wno-unused-but-set-variable'; did you mean 
'-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option]



[dpdk-dev] [PATCH] bond: fix for kvlist memory leak on rte_kvargs_process failure identified by klockwork scan

2015-02-18 Thread Thomas Monjalon
> > Signed-off-by: Declan Doherty 
> 
> Acked-by: Olivier Matz 

Applied, thanks



[dpdk-dev] fm10k_rxtx.c does not compile

2015-02-18 Thread Thomas Monjalon
Hi Keith,

2015-02-18 17:01, Wiles, Keith:
> I just pulled the code and found a unused function error ?dump_rxd()?
> 
> I had to add the ifdef around the function and remove the ifdef inside the 
> function:
> 
> #ifdef RTE_LIBRTE_FM10K_DEBUG_RX
> static inline void dump_rxd(union fm10k_rx_desc *rxd)
> {
> RTE_SET_USED(rxd);
> PMD_RX_LOG(DEBUG, "+|+");
> PMD_RX_LOG(DEBUG, "| GLORT  | PKT HDR & TYPE |");
> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.glort,
> rxd->d.data);
> PMD_RX_LOG(DEBUG, "+|+");
> PMD_RX_LOG(DEBUG, "|   VLAN & LEN   | STATUS |");
> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.vlan_len,
> rxd->d.staterr);
> PMD_RX_LOG(DEBUG, "+|+");
> PMD_RX_LOG(DEBUG, "|RESERVED|RSS_HASH|");
> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", 0, rxd->d.rss);
> PMD_RX_LOG(DEBUG, "+|+");
> PMD_RX_LOG(DEBUG, "|TIME TAG |");
> PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
> PMD_RX_LOG(DEBUG, "+|+");
> }
> #endif
> 
> Also clang on Ubuntu 14.04 does not like the option:
>   CC fm10k_pf.o
> error: unknown warning option '-Wno-unused-but-set-variable'; did you mean 
> '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option]

Thanks for reporting.
I have no error with GCC 4.9.2.
Please could you try to send a patch?


[dpdk-dev] fm10k_rxtx.c does not compile

2015-02-18 Thread Wiles, Keith
I believe Jeff @ Intel is going to submit a patch soon.


On 2/18/15, 11:18 AM, "Thomas Monjalon"  wrote:

>Hi Keith,
>
>2015-02-18 17:01, Wiles, Keith:
>> I just pulled the code and found a unused function error ?dump_rxd()?
>> 
>> I had to add the ifdef around the function and remove the ifdef inside
>>the function:
>> 
>> #ifdef RTE_LIBRTE_FM10K_DEBUG_RX
>> static inline void dump_rxd(union fm10k_rx_desc *rxd)
>> {
>> RTE_SET_USED(rxd);
>> PMD_RX_LOG(DEBUG, "+|+");
>> PMD_RX_LOG(DEBUG, "| GLORT  | PKT HDR & TYPE |");
>> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.glort,
>> rxd->d.data);
>> PMD_RX_LOG(DEBUG, "+|+");
>> PMD_RX_LOG(DEBUG, "|   VLAN & LEN   | STATUS |");
>> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.vlan_len,
>> rxd->d.staterr);
>> PMD_RX_LOG(DEBUG, "+|+");
>> PMD_RX_LOG(DEBUG, "|RESERVED|RSS_HASH|");
>> PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", 0, rxd->d.rss);
>> PMD_RX_LOG(DEBUG, "+|+");
>> PMD_RX_LOG(DEBUG, "|TIME TAG |");
>> PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
>> PMD_RX_LOG(DEBUG, "+|+");
>> }
>> #endif
>> 
>> Also clang on Ubuntu 14.04 does not like the option:
>>   CC fm10k_pf.o
>> error: unknown warning option '-Wno-unused-but-set-variable'; did you
>>mean '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option]
>
>Thanks for reporting.
>I have no error with GCC 4.9.2.
>Please could you try to send a patch?



[dpdk-dev] [PATCH v2 0/3] PMD ring MAC management, fix initialization, link up/down

2015-02-18 Thread Thomas Monjalon
> > Patch split into smaller parts to separate features from previous version.
> > 
> > Tomasz Kulasek (3):
> >   PMD Ring - Add link up/down functions
> >   PMD Ring - Add MAC addr add/remove functions
> >   PMD Ring - Fix for per device management
> 
> Acked-by: Declan Doherty 

Applied, thanks


[dpdk-dev] [PATCH v3 0/3] DPDK ethdev callback support

2015-02-18 Thread John McNamara
This patchset is for a small addition to the ethdev library, to
add in support for callbacks at the RX and TX stages. This allows
packet processing to be done on packets before they get returned
to applications using rte_eth_rx_burst call.

See the RFC cover letter for the use cases:

http://dpdk.org/ml/archives/dev/2014-December/010491.html

For this version we spent some time investigating Stephen Hemminger's
suggestion of using the userspace RCU (read-copy-update) library for
SMP safety:

   http://urcu.so/

The default liburcu (which defaulted to liburcu-mb) requires the least
interaction from the end user but showed a 25% drop in packet throughput
in the callback sample app.

The liburcu-qsbr (quiescent state) variant showed a 1% drop in packet
throughput in the callback sample app. However it requires registered
RCU threads in the program to periodically announce quiescent states.
This makes it more difficult to implement for end user applications.

For this release we will document that adding and removing callbacks
is not thread safe.

Note: Sample application documentation to follow in a patch update.

Version 3 changes:
* Removed unnecessary header file from example folder
  (which included baremetal reference).
* Renamed the interrupt, RX and TX callbacks to make their function
  clearer (using the names suggested in the mailing list comments).
* Squashed ABI version update into the commit it relates to.
* Fixed various checkpatch warnings.

Version 2 changes:
* Added ABI versioning.
* Doxygen clarifications.

Version 1 changes:
* Added callback removal functions.
* Minor fixes.


Richardson, Bruce (3):
  ethdev: Rename callbacks field to link_intr_cbs
  ethdev: Add rxtx callback support
  examples: example showing use of callbacks.

 app/test/virtual_pmd.c |2 +-
 examples/rxtx_callbacks/Makefile   |   57 
 examples/rxtx_callbacks/main.c |  228 
 lib/librte_ether/rte_ethdev.c  |  183 --
 lib/librte_ether/rte_ethdev.h  |  192 ++-
 lib/librte_ether/rte_ether_version.map |4 +
 lib/librte_pmd_bond/rte_eth_bond_api.c |2 +-
 7 files changed, 654 insertions(+), 14 deletions(-)
 create mode 100644 examples/rxtx_callbacks/Makefile
 create mode 100644 examples/rxtx_callbacks/main.c

-- 
1.7.4.1



[dpdk-dev] [PATCH v3 1/3] ethdev: Rename callbacks field to link_intr_cbs

2015-02-18 Thread John McNamara
From: Richardson, Bruce 

The 'callbacks' member of the rte_eth_dev structure has been renamed
to 'link_intr_cbs' to make it clear that it refers to callbacks from
NIC interrupts. This allows us to add other types of callbacks to
the structure without ambiguity.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 app/test/virtual_pmd.c |2 +-
 lib/librte_ether/rte_ethdev.c  |   12 ++--
 lib/librte_ether/rte_ethdev.h  |2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c |2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 9fac95d..eb75846 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -576,7 +576,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
*mac_addr,
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;

-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

eth_dev->data->dev_link.link_status = 0;
eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_1;
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 17be2f3..7c4e772 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -265,7 +265,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_dev->data->rx_mbuf_alloc_failed = 0;

/* init user callbacks */
-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

/*
 * Set the default MTU.
@@ -2743,7 +2743,7 @@ rte_eth_dev_callback_register(uint8_t port_id,
dev = &rte_eth_devices[port_id];
rte_spinlock_lock(&rte_eth_dev_cb_lock);

-   TAILQ_FOREACH(user_cb, &(dev->callbacks), next) {
+   TAILQ_FOREACH(user_cb, &(dev->link_intr_cbs), next) {
if (user_cb->cb_fn == cb_fn &&
user_cb->cb_arg == cb_arg &&
user_cb->event == event) {
@@ -2757,7 +2757,7 @@ rte_eth_dev_callback_register(uint8_t port_id,
user_cb->cb_fn = cb_fn;
user_cb->cb_arg = cb_arg;
user_cb->event = event;
-   TAILQ_INSERT_TAIL(&(dev->callbacks), user_cb, next);
+   TAILQ_INSERT_TAIL(&(dev->link_intr_cbs), user_cb, next);
}

rte_spinlock_unlock(&rte_eth_dev_cb_lock);
@@ -2784,7 +2784,7 @@ rte_eth_dev_callback_unregister(uint8_t port_id,
rte_spinlock_lock(&rte_eth_dev_cb_lock);

ret = 0;
-   for (cb = TAILQ_FIRST(&dev->callbacks); cb != NULL; cb = next) {
+   for (cb = TAILQ_FIRST(&dev->link_intr_cbs); cb != NULL; cb = next) {

next = TAILQ_NEXT(cb, next);

@@ -2798,7 +2798,7 @@ rte_eth_dev_callback_unregister(uint8_t port_id,
 * then remove it.
 */
if (cb->active == 0) {
-   TAILQ_REMOVE(&(dev->callbacks), cb, next);
+   TAILQ_REMOVE(&(dev->link_intr_cbs), cb, next);
rte_free(cb);
} else {
ret = -EAGAIN;
@@ -2817,7 +2817,7 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
struct rte_eth_dev_callback dev_cb;

rte_spinlock_lock(&rte_eth_dev_cb_lock);
-   TAILQ_FOREACH(cb_lst, &(dev->callbacks), next) {
+   TAILQ_FOREACH(cb_lst, &(dev->link_intr_cbs), next) {
if (cb_lst->cb_fn == NULL || cb_lst->event != event)
continue;
dev_cb = *cb_lst;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 6e454e8..48e4ac9 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1539,7 +1539,7 @@ struct rte_eth_dev {
const struct eth_driver *driver;/**< Driver for this device */
struct eth_dev_ops *dev_ops;/**< Functions exported by PMD */
struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
-   struct rte_eth_dev_cb_list callbacks; /**< User application callbacks */
+   struct rte_eth_dev_cb_list link_intr_cbs; /**< User application 
callbacks on interrupt*/
 };

 struct rte_eth_dev_sriov {
diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c 
b/lib/librte_pmd_bond/rte_eth_bond_api.c
index 4ab3267..077cb73 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_api.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
@@ -251,7 +251,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t 
socket_id)
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;

-   TAILQ_INIT(&(eth_dev->callbacks));
+   TAILQ_INIT(&(eth_dev->link_intr_cbs));

eth_dev->data->dev_link.link_status = 0;

-- 
1.7.4.1



[dpdk-dev] [PATCH v3 3/3] examples: example showing use of callbacks.

2015-02-18 Thread John McNamara
From: Richardson, Bruce 

Example showing how callbacks can be used to insert a timestamp
into each packet on RX. On TX the timestamp is used to calculate
the packet latency through the app, in cycles.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 examples/rxtx_callbacks/Makefile |   57 ++
 examples/rxtx_callbacks/main.c   |  228 ++
 2 files changed, 285 insertions(+), 0 deletions(-)
 create mode 100644 examples/rxtx_callbacks/Makefile
 create mode 100644 examples/rxtx_callbacks/main.c

diff --git a/examples/rxtx_callbacks/Makefile b/examples/rxtx_callbacks/Makefile
new file mode 100644
index 000..0fafbb7
--- /dev/null
+++ b/examples/rxtx_callbacks/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = rxtx_callbacks
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+EXTRA_CFLAGS += -O3 -g -Wfatal-errors
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
new file mode 100644
index 000..9e5e68e
--- /dev/null
+++ b/examples/rxtx_callbacks/main.c
@@ -0,0 +1,228 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include

[dpdk-dev] [PATCH v3 2/3] ethdev: Add rxtx callback support

2015-02-18 Thread John McNamara
From: Richardson, Bruce 

Add in support for inline processing of packets inside the RX or
TX call. For an RX callback, what happens is that we get a set of
packets from the NIC and then pass them to a callback function, if
configured, to allow additional processing to be done on them, e.g.
filling in more mbuf fields, before passing back to the application.
On TX, the packets are similarly post-processed before being handed
to the NIC for transmission.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 lib/librte_ether/rte_ethdev.c  |  171 +++-
 lib/librte_ether/rte_ethdev.h  |  192 +++-
 lib/librte_ether/rte_ether_version.map |4 +
 3 files changed, 361 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7c4e772..dde1b49 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -337,6 +337,16 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->nb_rx_queues = 0;
return -(ENOMEM);
}
+   dev->post_rx_burst_cbs = rte_zmalloc(
+   "ethdev->post_rx_burst_cbs",
+   sizeof(*dev->post_rx_burst_cbs) * nb_queues,
+   RTE_CACHE_LINE_SIZE);
+   if (dev->post_rx_burst_cbs == NULL) {
+   rte_free(dev->data->rx_queues);
+   dev->data->rx_queues = NULL;
+   dev->data->nb_rx_queues = 0;
+   return -(ENOMEM);
+   }
} else { /* re-configure */
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release, -ENOTSUP);

@@ -348,10 +358,20 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
RTE_CACHE_LINE_SIZE);
if (rxq == NULL)
return -(ENOMEM);
+   dev->post_rx_burst_cbs = rte_realloc(
+   dev->post_rx_burst_cbs,
+   sizeof(*dev->post_rx_burst_cbs) *
+   nb_queues, RTE_CACHE_LINE_SIZE);
+   if (dev->post_rx_burst_cbs == NULL)
+   return -(ENOMEM);

-   if (nb_queues > old_nb_queues)
+   if (nb_queues > old_nb_queues) {
+   uint16_t new_qs = nb_queues - old_nb_queues;
memset(rxq + old_nb_queues, 0,
-   sizeof(rxq[0]) * (nb_queues - old_nb_queues));
+   sizeof(rxq[0]) * new_qs);
+   memset(dev->post_rx_burst_cbs + old_nb_queues, 0,
+   sizeof(dev->post_rx_burst_cbs[0]) * new_qs);
+   }

dev->data->rx_queues = rxq;

@@ -479,6 +499,16 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->nb_tx_queues = 0;
return -(ENOMEM);
}
+   dev->pre_tx_burst_cbs = rte_zmalloc(
+   "ethdev->pre_tx_burst_cbs",
+   sizeof(*dev->pre_tx_burst_cbs) * nb_queues,
+   RTE_CACHE_LINE_SIZE);
+   if (dev->pre_tx_burst_cbs == NULL) {
+   rte_free(dev->data->tx_queues);
+   dev->data->tx_queues = NULL;
+   dev->data->nb_tx_queues = 0;
+   return -(ENOMEM);
+   }
} else { /* re-configure */
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, -ENOTSUP);

@@ -490,10 +520,21 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
RTE_CACHE_LINE_SIZE);
if (txq == NULL)
return -(ENOMEM);
+   dev->pre_tx_burst_cbs = rte_realloc(
+   dev->pre_tx_burst_cbs,
+   sizeof(*dev->pre_tx_burst_cbs) *
+   nb_queues, RTE_CACHE_LINE_SIZE);
+   if (dev->pre_tx_burst_cbs == NULL)
+   return -(ENOMEM);

-   if (nb_queues > old_nb_queues)
+
+   if (nb_queues > old_nb_queues) {
+   uint16_t new_qs = nb_queues - old_nb_queues;
memset(txq + old_nb_queues, 0,
-   sizeof(txq[0]) * (nb_queues - old_nb_queues));
+   sizeof(txq[0]) * new_qs);
+   memset(dev->pre_tx_burst_cbs + old_nb_queues, 0,
+   sizeof(dev->pre_tx_burst_cbs[0]) * new_qs);
+   }

dev->data->tx_queues = txq;

@@ -3258,3 +3299,125 @@ rte_eth_dev_filter_ctrl(uint8_t port_id, enum 
rte_filter_type filter_type,
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);

[dpdk-dev] [PATCH] fm10k: fix clang warning flags

2015-02-18 Thread Jeff Shaw
This commit fixes the following error which was reported when
compiling with clang by removing the option.

error: unknown warning option '-Wno-unused-but-set-variable'

Signed-off-by: Jeff Shaw 
---
 lib/librte_pmd_fm10k/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_fm10k/Makefile b/lib/librte_pmd_fm10k/Makefile
index 986f4ef..26663ae 100644
--- a/lib/librte_pmd_fm10k/Makefile
+++ b/lib/librte_pmd_fm10k/Makefile
@@ -55,7 +55,7 @@ else ifeq ($(CC), clang)
 #
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
 CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -Wno-format-extra-args
-CFLAGS_BASE_DRIVER += -Wno-unused-variable -Wno-unused-but-set-variable
+CFLAGS_BASE_DRIVER += -Wno-unused-variable
 CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers

 else
-- 
2.1.0



[dpdk-dev] [PATCH v4 0/4] Unit tests for mode 4

2015-02-18 Thread Thomas Monjalon
> > This patch series depends of "PMD ring" patches and should be applied after 
> > them
> > to run successfully.
> >
> > v4 changes
> > - Adapting to changes in the initialize_eth_header API
> > - Fix linking problem against librte_pmd_bond.so for shared libraries
> > - Patchset cleanup for smoother application
> >
> > v3 changes
> > - Fix compilation issues for dynamic libraries
> > - Re-create a series of patches to maintain consistency
> >
> > v2 changes
> > - Patch split for better readability
> >
> > Tomasz Kulasek (4):
> >test: test.h rework
> >test: Unit tests for mode 4
> >mk: Link test app against librte_pmd_ring when needed
> >pmd_bond: Set routines required by test app global
> 
> Series Acked-by: Declan Doherty 

Applied, thanks


[dpdk-dev] [PATCH] fm10k: fix clang unused function error

2015-02-18 Thread Jeff Shaw
This commit fixes the following error which was reported when
compiling with clang by moving the function inside an
RTE_LIBRTE_FM10K_DEBUG_RX ifdef block.

error: unused function 'dump_rxd'

Signed-off-by: Jeff Shaw 
---
 lib/librte_pmd_fm10k/fm10k_rxtx.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_fm10k/fm10k_rxtx.c 
b/lib/librte_pmd_fm10k/fm10k_rxtx.c
index f6768f0..83bddfc 100644
--- a/lib/librte_pmd_fm10k/fm10k_rxtx.c
+++ b/lib/librte_pmd_fm10k/fm10k_rxtx.c
@@ -41,11 +41,9 @@
 #define rte_packet_prefetch(p)  do {} while (0)
 #endif

+#ifdef RTE_LIBRTE_FM10K_DEBUG_RX
 static inline void dump_rxd(union fm10k_rx_desc *rxd)
 {
-#ifndef RTE_LIBRTE_FM10K_DEBUG_RX
-   RTE_SET_USED(rxd);
-#endif
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "| GLORT  | PKT HDR & TYPE |");
PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.glort,
@@ -62,6 +60,7 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
PMD_RX_LOG(DEBUG, "+|+");
 }
+#endif

 static inline void
 rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
-- 
2.1.0



[dpdk-dev] fm10k_rxtx.c does not compile

2015-02-18 Thread Jeff Shaw
On Wed, Feb 18, 2015 at 05:29:02PM +, Wiles, Keith wrote:
> I believe Jeff @ Intel is going to submit a patch soon.

I sent patches to fix these errors.

Thanks,
Jeff



[dpdk-dev] [PATCH] fm10k: fix clang warning flags

2015-02-18 Thread Wiles, Keith


On 2/18/15, 11:57 AM, "Shaw, Jeffrey B"  wrote:

>This commit fixes the following error which was reported when
>compiling with clang by removing the option.
>
>error: unknown warning option '-Wno-unused-but-set-variable'
>
>Signed-off-by: Jeff Shaw 

Acked-by: keith.wiles at intel.com
>---
> lib/librte_pmd_fm10k/Makefile | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/lib/librte_pmd_fm10k/Makefile b/lib/librte_pmd_fm10k/Makefile
>index 986f4ef..26663ae 100644
>--- a/lib/librte_pmd_fm10k/Makefile
>+++ b/lib/librte_pmd_fm10k/Makefile
>@@ -55,7 +55,7 @@ else ifeq ($(CC), clang)
> #
> CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
> CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -Wno-format-extra-args
>-CFLAGS_BASE_DRIVER += -Wno-unused-variable -Wno-unused-but-set-variable
>+CFLAGS_BASE_DRIVER += -Wno-unused-variable
> CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
> 
> else
>-- 
>2.1.0
>



[dpdk-dev] [PATCH v3 2/3] ethdev: Add rxtx callback support

2015-02-18 Thread Thomas Monjalon
2015-02-18 17:42, John McNamara:
> From: Richardson, Bruce 
> 
> Add in support for inline processing of packets inside the RX or
> TX call. For an RX callback, what happens is that we get a set of
> packets from the NIC and then pass them to a callback function, if
> configured, to allow additional processing to be done on them, e.g.
> filling in more mbuf fields, before passing back to the application.
> On TX, the packets are similarly post-processed before being handed
> to the NIC for transmission.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: John McNamara 
[...]
> @@ -2393,7 +2447,18 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>   struct rte_eth_dev *dev;
>  
>   dev = &rte_eth_devices[port_id];
> - return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], rx_pkts, 
> nb_pkts);
> + nb_pkts = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], rx_pkts,
> + nb_pkts);
> + struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
> +
> + if (unlikely(cb != NULL)) {
> + do {
> + nb_pkts = cb->fn(port_id, queue_id, rx_pkts, nb_pkts,
> + cb->param);
> + cb = cb->next;
> + } while (cb != NULL);
> + }

Excuse me, it wasn't very clear for me but I thought from the following email
that the consensus was to use a compile-time option:
http://dpdk.org/ml/archives/dev/2015-February/013450.html



[dpdk-dev] [PATCH 1/3] bond change warning

2015-02-18 Thread Thomas Monjalon
> Remove function name from warning. 
> 
> Signed-off-by: Pawel Wodkowski 

Other patches of series were split in other series.

Applied, thanks


[dpdk-dev] [PATCH] fm10k: fix clang unused function error

2015-02-18 Thread Wiles, Keith


On 2/18/15, 12:07 PM, "Shaw, Jeffrey B"  wrote:

>This commit fixes the following error which was reported when
>compiling with clang by moving the function inside an
>RTE_LIBRTE_FM10K_DEBUG_RX ifdef block.
>
>error: unused function 'dump_rxd'
>
>Signed-off-by: Jeff Shaw 

Acked-by: keith.wiles at intel.com
>---
> lib/librte_pmd_fm10k/fm10k_rxtx.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
>diff --git a/lib/librte_pmd_fm10k/fm10k_rxtx.c
>b/lib/librte_pmd_fm10k/fm10k_rxtx.c
>index f6768f0..83bddfc 100644
>--- a/lib/librte_pmd_fm10k/fm10k_rxtx.c
>+++ b/lib/librte_pmd_fm10k/fm10k_rxtx.c
>@@ -41,11 +41,9 @@
> #define rte_packet_prefetch(p)  do {} while (0)
> #endif
> 
>+#ifdef RTE_LIBRTE_FM10K_DEBUG_RX
> static inline void dump_rxd(union fm10k_rx_desc *rxd)
> {
>-#ifndef RTE_LIBRTE_FM10K_DEBUG_RX
>-  RTE_SET_USED(rxd);
>-#endif
>   PMD_RX_LOG(DEBUG, "+|+");
>   PMD_RX_LOG(DEBUG, "| GLORT  | PKT HDR & TYPE |");
>   PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", rxd->d.glort,
>@@ -62,6 +60,7 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
>   PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
>   PMD_RX_LOG(DEBUG, "+|+");
> }
>+#endif
> 
> static inline void
> rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
>-- 
>2.1.0
>



[dpdk-dev] [PATCH v2 0/6] Link Bonding mode 6 support (ALB)

2015-02-18 Thread Thomas Monjalon
Hi,

2015-02-13 16:12, Declan Doherty:
> On 13/02/15 15:16, Michal Jastrzebski wrote:
> > Michal Jastrzebski (6):
> >net: changed arp_hdr struct declaration
> >bond: add link bonding mode 6 implementation
> >bond: add debug info for mode 6 link bonding
> >bond: add example application for link bonding mode 6
> >bond: modify TLB unit tests
> >bond: add unit tests for link bonding mode 6.

You didn't sign some of these patches. So I suspect that you should
fix some authorship.

Some of the patches make some changes without explaining why.

> Series Acked-by: Declan Doherty 

Please, use checkpatch before submitting and/or when reviewing.

A v3 is needed.


[dpdk-dev] [PATCH 0/3] doc: ACL - add description for new features.

2015-02-18 Thread Butler, Siobhan A

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Konstantin
> Ananyev
> Sent: Wednesday, February 18, 2015 4:29 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] doc: ACL - add description for new features.
> 
> Konstantin Ananyev (3):
>   doc: Add restrictions for ACL rule fields
>   doc: ACL - add description for different classification methods
>   doc: ACL - add description for max_size build parameter
> 
>  .../prog_guide/packet_classif_access_ctrl.rst  | 83
> --
>  1 file changed, 78 insertions(+), 5 deletions(-)
> 
> --
> 1.8.3.1

Series Acked-by: Siobhan Butler 


[dpdk-dev] [PATCH] x32 ABI support, first iteration

2015-02-18 Thread Thomas Monjalon
> > > Signed-off-by: Konstantin Ananyev 
> > > Signed-off-by: Daniel Mrzyglod 
> > 
> > Acked-by: Konstantin Ananyev 
> 
> Acked-by: Pablo de Lara 
> 
> Just add that documentation should be updated for this.

Applied, thanks


[dpdk-dev] [PATCH v2] doc: Add requirements for x32 ABI

2015-02-18 Thread Thomas Monjalon
> > This patch add requirements about compiler and distribution support.
> > 
> > v2:
> > spelling fixes
> > 
> > Signed-off-by: Daniel Mrzyglod 
> 
> Acked-by: Pablo de Lara 
> 
> Thanks Daniel!

Applied, thanks


[dpdk-dev] test_mp_secondary??

2015-02-18 Thread Stephen Hemminger
Why is test_mp_secondary in export list for eal?

$ git grep test_mp_secondary
MAINTAINERS:F: app/test/test_mp_secondary.c
app/test/Makefile:SRCS-y += test_mp_secondary.c
app/test/test.c:{ "run_secondary_instances", 
test_mp_secondary },
app/test/test.h:int test_mp_secondary(void);
app/test/test_mp_secondary.c:test_mp_secondary(void)
app/test/test_mp_secondary.c:   .callback = test_mp_secondary,
lib/librte_eal/bsdapp/eal/rte_eal_version.map:  test_mp_secondary;
lib/librte_eal/linuxapp/eal/rte_eal_version.map:test_mp_secondary;