> -----Original Message----- > From: Neil Horman [mailto:nhorman at tuxdriver.com] > Sent: Friday, June 13, 2014 5:08 PM > To: Doherty, Declan > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 2/5] Link Bonding PMD Library > (librte_eal/librte_ether link bonding support changes) > > On Fri, Jun 13, 2014 at 03:41:59PM +0100, Declan Doherty wrote: > > Updating functionality in EAL to support adding link bonding > > devices via ?vdev option. Link bonding devices will be > > initialized after all physical devices have been probed and > > initialized. > > > > Signed-off-by: Declan Doherty <declan.doherty at intel.com> > > --- > > lib/librte_eal/common/eal_common_dev.c | 66 > +++++++++++++++++++++++++++-- > > lib/librte_eal/common/eal_common_pci.c | 6 +++ > > lib/librte_eal/common/include/eal_private.h | 7 +++ > > lib/librte_eal/common/include/rte_dev.h | 1 + > > lib/librte_ether/rte_ethdev.c | 34 +++++++++++++-- > > lib/librte_ether/rte_ethdev.h | 7 ++- > > lib/librte_pmd_pcap/rte_eth_pcap.c | 22 +++++----- > > lib/librte_pmd_ring/rte_eth_ring.c | 32 +++++++------- > > lib/librte_pmd_ring/rte_eth_ring.h | 3 +- > > lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 2 +- > > 10 files changed, 144 insertions(+), 36 deletions(-) > > > > diff --git a/lib/librte_eal/common/eal_common_dev.c > b/lib/librte_eal/common/eal_common_dev.c > > index eae5656..b50c908 100644 > > --- a/lib/librte_eal/common/eal_common_dev.c > > +++ b/lib/librte_eal/common/eal_common_dev.c > > @@ -75,14 +75,28 @@ rte_eal_dev_init(void) > > > > /* call the init function for each virtual device */ > > TAILQ_FOREACH(devargs, &devargs_list, next) { > > + uint8_t bdev = 0; > > > > if (devargs->type != RTE_DEVTYPE_VIRTUAL) > > continue; > > > > TAILQ_FOREACH(driver, &dev_driver_list, next) { > > - if (driver->type != PMD_VDEV) > > + /* RTE_DEVTYPE_VIRTUAL can only be a virtual or bonded > device*/ > > + if (driver->type != PMD_VDEV && driver->type != > PMD_BDEV) > > continue; > > > > + /* > > + * Bonded devices are not initialize here, we do it > > later in > > + * rte_eal_bonded_dev_init() after all physical devices > have been > > + * probed and initialized > > + */ > > + if (driver->type == PMD_BDEV && > > + !strncmp(driver->name, devargs- > >virtual.drv_name, > > + strlen(driver->name))) { > > + bdev = 1; > > + break; > > + } > > + > I really don't think you need to add a new device type for bonded devs. Its > got > no specific hardware that it drives, and you configure it with a --vdev > command, > so treat it as one here. I understand that you need to pass additional > information about slaves to a bonded device, which is fine, but you can do > that > with kvargs pretty easily, at which point its just another vdev. The only > other > requirement is that you initilize the bonded vdev after the slave vdevs have > been created, which you can do by any of several methods (a priority field to > indicate that bonded drivers should be initilized last/later, a deferral > return > code from the init routine, or by dead reckoning via the careful construction > of > the application command line (placed the bonded --vdev option last on the > command line argument list at run time). > It was my initial intent to do as you have describe above, but the physical devices cause a real issue here, physical devices don't call through to rte_eth_dev_allocate until during rte_eal_pci_probe call, so it's not possible to initialize the bonded device from within rte_eal_dev_init as the physical devices have not been fully initialized at this point, as a port_id has not been allocated and can't be added as bonding slaves. I don't see away around this without changing the EAL API, which I've tried to avoid with this solution.
Ordering isn't an issue, and can easily be solved if the above problem didn't exist, and although a new device type isn't technically required, I think it's a cleaner solution than doing string comparisons. > > > /* search a driver prefix in virtual device name */ > > if (!strncmp(driver->name, devargs->virtual.drv_name, > > strlen(driver->name))) { > > @@ -92,9 +106,9 @@ rte_eal_dev_init(void) > > } > > } > > > > - if (driver == NULL) { > > - rte_panic("no driver found for %s\n", > > - devargs->virtual.drv_name); > > + if (driver == NULL && !bdev) { > > + rte_panic("no driver found for %s and is not a bonded > vdev %d\n", > > + devargs->virtual.drv_name, bdev); > > } > > } > > > > @@ -107,3 +121,47 @@ rte_eal_dev_init(void) > > } > > return 0; > > } > > + > > +#ifdef RTE_LIBRTE_PMD_BOND > > +int > > +rte_eal_bonded_dev_init(void) > > +{ > > + struct rte_devargs *devargs; > > + struct rte_driver *driver; > > + > > + TAILQ_FOREACH(devargs, &devargs_list, next) { > > + int vdev = 0; > > + > > + if (devargs->type != RTE_DEVTYPE_VIRTUAL) > > + continue; > > + > > + TAILQ_FOREACH(driver, &dev_driver_list, next) { > > + if (driver->type != PMD_VDEV && driver->type != > PMD_BDEV) > > + continue; > > + > > + /* Virtual devices have already been initialized so we > > skip > them > > + * here*/ > > + if (driver->type == PMD_VDEV && > > + !strncmp(driver->name, devargs- > >virtual.drv_name, > > + strlen(driver->name))) { > > + vdev = 1; > > + break; > > + } > > + > > + /* search a driver prefix in bonded device name */ > > + if (!strncmp(driver->name, devargs->virtual.drv_name, > > + strlen(driver->name))) { > > + driver->init(devargs->virtual.drv_name, devargs- > >args); > > + break; > > + } > > + } > > + > > + if (driver == NULL && !vdev) { > > + rte_panic("no driver found for %s\n", > > + devargs->virtual.drv_name); > > + } > > + } > > + return 0; > > +} > > +#endif > > + > If you treat bonded devices as vdevs, you can remove this function entirely. Agreed but only if there is a solution to the issues described above. > > > diff --git a/lib/librte_eal/common/eal_common_pci.c > b/lib/librte_eal/common/eal_common_pci.c > > index 4d877ea..9b584f5 100644 > > --- a/lib/librte_eal/common/eal_common_pci.c > > +++ b/lib/librte_eal/common/eal_common_pci.c > > @@ -166,7 +166,13 @@ rte_eal_pci_probe(void) > > dev->addr.devid, dev->addr.function); > > } > > > > +#ifdef RTE_LIBRTE_PMD_BOND > > + /* After all physical PCI devices have been probed and initialized then > > we > > + * initialize the bonded devices */ > > + return rte_eal_bonded_dev_init(); > > +#else > This is the wrong place for this, bonded devices are not pci devices, this > doesn't belong in the pci device probe path. If you treat the bonded devices > as > vdevs and handle the ordering as described above, you won't need this anyway. See first comment, but unless a new API is added for initialization of the bonded devices, then some sort signal/callback is required here to notify that it is now safe to initialize the bonded devices. > > > return 0; > > +#endif > > } > > > > /* dump one device */ > > diff --git a/lib/librte_eal/common/include/eal_private.h > b/lib/librte_eal/common/include/eal_private.h > > index 232fcec..f6081bb 100644 > > --- a/lib/librte_eal/common/include/eal_private.h > > +++ b/lib/librte_eal/common/include/eal_private.h > > @@ -203,4 +203,11 @@ int rte_eal_alarm_init(void); > > */ > > int rte_eal_dev_init(void); > > > > +#ifdef RTE_LIBRTE_PMD_BOND > > +/** > > + * Initialize the bonded devices > > + */ > > +int rte_eal_bonded_dev_init(void); > > +#endif > > + > > #endif /* _EAL_PRIVATE_H_ */ > > diff --git a/lib/librte_eal/common/include/rte_dev.h > b/lib/librte_eal/common/include/rte_dev.h > > index f7e3a10..f0a780a 100644 > > --- a/lib/librte_eal/common/include/rte_dev.h > > +++ b/lib/librte_eal/common/include/rte_dev.h > > @@ -62,6 +62,7 @@ typedef int (rte_dev_init_t)(const char *name, const char > *args); > > enum pmd_type { > > PMD_VDEV = 0, > > PMD_PDEV = 1, > > + PMD_BDEV = 2, /**< Poll Mode Driver Bonded Device*/ > > }; > Can drop this as noted above. > > > > > /** > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > > index 8011b8b..4c2f1d3 100644 > > --- a/lib/librte_ether/rte_ethdev.c > > +++ b/lib/librte_ether/rte_ethdev.c > > @@ -64,6 +64,7 @@ > > #include <rte_mbuf.h> > > #include <rte_errno.h> > > #include <rte_spinlock.h> > > +#include <rte_string_fns.h> > > > > #include "rte_ether.h" > > #include "rte_ethdev.h" > > @@ -152,8 +153,21 @@ rte_eth_dev_data_alloc(void) > > RTE_MAX_ETHPORTS * > sizeof(*rte_eth_dev_data)); > > } > > > > +static int > > +rte_eth_dev_name_unique(const char* name) > > +{ > > + unsigned i; > > + > > + for (i = 0; i < nb_ports; i++) { > > + if (strcmp(rte_eth_devices[i].data->name, name) == 0) > > + return -1; > > + } > > + > > + return 0; > > +} > > + > > struct rte_eth_dev * > > -rte_eth_dev_allocate(void) > > +rte_eth_dev_allocate(const char* name) > > { > > struct rte_eth_dev *eth_dev; > > > > @@ -165,23 +179,37 @@ rte_eth_dev_allocate(void) > > if (rte_eth_dev_data == NULL) > > rte_eth_dev_data_alloc(); > > > > + if (rte_eth_dev_name_unique(name)) { > > + PMD_DEBUG_TRACE("Ethernet Device with name %s already > allocated!\n"); > > + return NULL; > > + } > > + > This seems fairly racy if you allow dynamic device creation at run time from > the > application, if multiple threads attempt to create bonds in parallel. True but if this is an issue then there probably should be some locking around this allocation anyway. > > > > eth_dev = &rte_eth_devices[nb_ports]; > > eth_dev->data = &rte_eth_dev_data[nb_ports]; > > + rte_snprintf(eth_dev->data->name , sizeof(eth_dev->data->name ), > > + "%s", name); > > eth_dev->data->port_id = nb_ports++; > > return eth_dev; > > } > > > > static int > > rte_eth_dev_init(struct rte_pci_driver *pci_drv, > > - struct rte_pci_device *pci_dev) > > + struct rte_pci_device *pci_dev) > > { > > struct eth_driver *eth_drv; > > struct rte_eth_dev *eth_dev; > > + char ethdev_name[RTE_ETH_NAME_MAX_LEN]; > > + > > int diag; > > > > eth_drv = (struct eth_driver *)pci_drv; > > > > - eth_dev = rte_eth_dev_allocate(); > > + /* Create unique ethdev name by concatenating drive name and number > of > > + * ports */ > > + rte_snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d", > > + pci_dev->addr.bus, pci_dev->addr.devid, pci_dev- > >addr.function); > > + > > + eth_dev = rte_eth_dev_allocate(ethdev_name); > > if (eth_dev == NULL) > > return -ENOMEM; > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > > index 67eda50..27ed0ab 100644 > > --- a/lib/librte_ether/rte_ethdev.h > > +++ b/lib/librte_ether/rte_ethdev.h > > @@ -1233,6 +1233,8 @@ struct rte_eth_dev_sriov { > > }; > > #define RTE_ETH_DEV_SRIOV(dev) ((dev)->data->sriov) > > > > +#define RTE_ETH_NAME_MAX_LEN (32) > > + > > /** > > * @internal > > * The data part, with no function pointers, associated with each ethernet > device. > > @@ -1241,6 +1243,8 @@ struct rte_eth_dev_sriov { > > * processes in a multi-process configuration. > > */ > > struct rte_eth_dev_data { > > + char name[RTE_ETH_NAME_MAX_LEN]; /**< Unique identifier name */ > > + > > void **rx_queues; /**< Array of pointers to RX queues. */ > > void **tx_queues; /**< Array of pointers to TX queues. */ > > uint16_t nb_rx_queues; /**< Number of RX queues. */ > > @@ -1293,10 +1297,11 @@ extern uint8_t rte_eth_dev_count(void); > > * Allocates a new ethdev slot for an ethernet device and returns the > > pointer > > * to that slot for the driver to use. > > * > > + * @param name Unique identifier name for each Ethernet device > > * @return > > * - Slot in the rte_dev_devices array for a new device; > > */ > > -struct rte_eth_dev *rte_eth_dev_allocate(void); > > +struct rte_eth_dev *rte_eth_dev_allocate(const char *name); > > > > struct eth_driver; > > > > /** > > diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c > b/lib/librte_pmd_pcap/rte_eth_pcap.c > Hmm, we're modifying other pmds for the naming feature, I think it would be > best > split out into a separate patch. Something entitled "support unique interface > naming for virtual pmds" or something. True, but if I do that, I will need to break support for virtual devices in this patch set, and I would prefer to keep this functionality working initial release of link bonding. Also another patchset will be required at some point in the near future to tidy up/rationalize the current handling of virtual devices identification and these changes do not change functionally how these devices work.