On Tue, Feb 11, 2020 at 2:08 PM Gaetan Rivet <gr...@u256.net> wrote: > > [...] > >>> (gdb) p dev2 > >>> $5 = (struct rte_pci_device *) 0x54de5e0 > >>> (gdb) p /x *dev2 > >>> $6 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device = > >>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver = > >>> 0x3a6b7f0, > >>> bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = > >>> 0x0, bus = 0x6, devid = 0x2, function = 0x0} > >>> > >>> we hit the pci_device ( 0x54de5e0) that was created during the first > >>> time probe of the parent device ? > >>> Since it is already probed, it goes to line 381 where it does frees > >>> the just allocated 'pci_device' inside pci_scan_one() via 'free(dev)' > >>> > >>> As you can see this pci_device does not have it's devargs set (rightly > >>> so as initially there were no devargs for this device) > >>> > >>> But as shown in the stack trace in my previous reply, when > >>> pci_find_device() walks the rte_pci_devices list , it finds this > >>> earlier probed device (without devargs) > >>> > >> > >> Alright, I think you are right, this is what is happening. > >> The issue IMO, is that the PCI scan is thus hitting an already existing > >> device, but we have > >> missed the case where the new device has more info that the previous one > >> (linked devargs). > >> > > That is correct > > > >>> pci_find_device (start=0x0, cmp=0x4b92d8 <cmp_dev_name>, > >>> data=0x55a4af8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426 > >>> 426 return &pdev->device; > >>> (gdb) p pdev > >>> $15 = (struct rte_pci_device *) 0x54de5e0 > >>> > >>> (gdb) p /x *pdev > >>> $16 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device = > >>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver = > >>> 0x3a6b7f0, > >>> bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = > >>> 0x0, bus = 0x6, devid = 0x2, function = 0x0}, > >>> > >>> Hope this explains why the pci_device has NULL devargs at this point > >>> and how my fix to set it at this point helps solve the problem? > >>> > >>> Please let me know your thoughts > >>> > >> > >> I think the proper fix is instead to have a clean update during the PCI > >> scan. > >> The proper way is to keep the old device, but override its fields as new > >> info was found. > > > > Agreed > >> > >> Something like calling pci_name_set(dev2); line 359 maybe. BSD should also > >> be updated for consistency. > >> > >> My issue with your patch is that I think this issue is very specific to > >> the PCI bus and the capabilities > >> of some devices on it. It would be better to have a fully compliant scan + > >> probe ops considering > >> the supported capabilities, instead of forcing this on all buses. > > Sure, so i'm guessing you meant something like this ? > > > > dex 740a2cd..92a41c2 100644 > > --- a/drivers/bus/pci/linux/pci.c > > +++ b/drivers/bus/pci/linux/pci.c > > @@ -377,6 +377,8 @@ > > */ > > RTE_LOG(ERR, EAL, > > "Unexpected device scan at %s!\n", > > filename); > > + else if (dev2->device.devargs > > != dev->device.devargs) > > + pci_name_set(dev2); > > } > > free(dev); > > } > > > > Which is basically checking for devargs mismatch between the 'new' > > device and a match with an > > already probed device and return the old device while adjusting the > > new info (devargs) > > Is that OK? > > > >> > >> I'm wondering whether this can happen with an already existing devargs? If > >> so, is it more correct to keep > >> the new one, or ignore the probe, free the new devargs and report an > >> error? If the former, > >> please clean up properly the devargs using rte_devargs_remove() (before > >> calling pci_name_set() of course). > > Hello Somnath, > > Yes, this is basically it, however you also need to take care of a potential > already existing devargs here, because the PCI bus does not guarantee that > all devargs of a device will be named the same Thanks a lot Gaetan, have sent out V2 incorporating your suggestions here.. Please ack
> > .