Re: pci: kernel crash in bus_find_device
>> > Look for callers of bus_find_device. Unless I am missing something, only pci > and scsi code call it with non-NULL 'start' argument, and the scsi use is > limited to a walk through scsi devices for a proc file. > > Makes me wonder if the start argument should go away, and if pci and scsi > should use another means to walk through devices. I think that would be the correct approach. In case of pci all functions using pci_get_device, pci_get_subsys or pci_get_class (which call pci_get_dev_by_id/bus_find_device) to iterate over the whole list using a non-NULL start argument would have to be audited. There seem to be quite a few of them using loops of the kind while ((dev = pci_get_device( …, dev)) != NULL) (and similarly for pci_get_subsys and pci_get_class) and they could all be vulnerable if they try to resume their search from a device that was unregistered. Francesco > > Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
Look for callers of bus_find_device. Unless I am missing something, only pci and scsi code call it with non-NULL 'start' argument, and the scsi use is limited to a walk through scsi devices for a proc file. Makes me wonder if the start argument should go away, and if pci and scsi should use another means to walk through devices. I think that would be the correct approach. In case of pci all functions using pci_get_device, pci_get_subsys or pci_get_class (which call pci_get_dev_by_id/bus_find_device) to iterate over the whole list using a non-NULL start argument would have to be audited. There seem to be quite a few of them using loops of the kind while ((dev = pci_get_device( …, dev)) != NULL) (and similarly for pci_get_subsys and pci_get_class) and they could all be vulnerable if they try to resume their search from a device that was unregistered. Francesco Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 04:21:00PM -0700, Greg KH wrote: > On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: > > In-Reply-To: <20140523023141.gc13...@kroah.com> > > > > > > Hi Guenter, > > I got back to looking into this crash. > > Just as an example, the attached diffs also fix my bus_find_device problem > > for > > traversals that start from the head of the list and traverse it completely. > > They are very specific to the case of bus_find_device, and a complete > > solution > > would affect a lot of code. > > The main issue seems to be that when a device is found in a klist by say > > bus_find_device the klist_node reference should be returned to the caller, > > who should then decide whether to use it for the next klist search, drop it > > or > > maybe exchange it for a struct device reference. When resuming a search one > > should already hold a klist_node reference from the previous search. > > This model is broken by several functions using struct devices such as > > bus_find_device, which resume klist searches on the implicit assumption that > > holding a reference to the struct device is enough to acquire one on the > > klist_node. > > The only reason that this has not been a big issue so far is probably that > > on most systems struct devices are not destroyed and created very often. > > Not true, this happens on every USB device insertion and removal, and on > startup and shutdown. What makes PCI special that we aren't hitting > these issues in USB and other subsystems that do a lot of device > creation/removal? > Look for callers of bus_find_device. Unless I am missing something, only pci and scsi code call it with non-NULL 'start' argument, and the scsi use is limited to a walk through scsi devices for a proc file. Makes me wonder if the start argument should go away, and if pci and scsi should use another means to walk through devices. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: > @@ -719,6 +719,11 @@ struct pci_dev *pci_get_device(unsigned > struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device, > unsigned int ss_vendor, unsigned int ss_device, > struct pci_dev *from); > +struct pci_dev *pci_get_device_noref(unsigned int vendor, unsigned int > device, > + struct pci_dev *from); > +struct pci_dev *pci_get_subsys_noref(unsigned int vendor, unsigned int > device, > + unsigned int ss_vendor, unsigned int > ss_device, > + struct pci_dev *from); To follow up, what drivers are you thinking need to make these calls? Perhaps they shouldn't be doing something like this? And, to answer my previous question, the reason PCI is different is that drivers want to walk the list of devices to find "stupid" things like this out, USB doesn't do dumb stuff like that :) thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: > In-Reply-To: <20140523023141.gc13...@kroah.com> > > > Hi Guenter, > I got back to looking into this crash. > Just as an example, the attached diffs also fix my bus_find_device problem for > traversals that start from the head of the list and traverse it completely. > They are very specific to the case of bus_find_device, and a complete solution > would affect a lot of code. > The main issue seems to be that when a device is found in a klist by say > bus_find_device the klist_node reference should be returned to the caller, > who should then decide whether to use it for the next klist search, drop it or > maybe exchange it for a struct device reference. When resuming a search one > should already hold a klist_node reference from the previous search. > This model is broken by several functions using struct devices such as > bus_find_device, which resume klist searches on the implicit assumption that > holding a reference to the struct device is enough to acquire one on the > klist_node. > The only reason that this has not been a big issue so far is probably that > on most systems struct devices are not destroyed and created very often. Not true, this happens on every USB device insertion and removal, and on startup and shutdown. What makes PCI special that we aren't hitting these issues in USB and other subsystems that do a lot of device creation/removal? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
In-Reply-To: <20140523023141.gc13...@kroah.com> Hi Guenter, I got back to looking into this crash. Just as an example, the attached diffs also fix my bus_find_device problem for traversals that start from the head of the list and traverse it completely. They are very specific to the case of bus_find_device, and a complete solution would affect a lot of code. The main issue seems to be that when a device is found in a klist by say bus_find_device the klist_node reference should be returned to the caller, who should then decide whether to use it for the next klist search, drop it or maybe exchange it for a struct device reference. When resuming a search one should already hold a klist_node reference from the previous search. This model is broken by several functions using struct devices such as bus_find_device, which resume klist searches on the implicit assumption that holding a reference to the struct device is enough to acquire one on the klist_node. The only reason that this has not been a big issue so far is probably that on most systems struct devices are not destroyed and created very often. Thanks, Francesco Index: linux-3.4/lib/klist.c === --- linux-3.4.orig/lib/klist.c +++ linux-3.4/lib/klist.c @@ -318,6 +318,51 @@ void klist_iter_exit(struct klist_iter * } EXPORT_SYMBOL_GPL(klist_iter_exit); +/** + * klist_iter_init_node_noref - Initialize a klist_iter structure without + * taking a kref. + * @k: klist we're iterating. + * @i: klist_iter we're filling. + * @n: node to start with. + * + * Similar to klist_iter_init_noref(), but starts the action off with @n, + * instead of with the list head. + */ +void klist_iter_init_node_noref(struct klist *k, struct klist_iter *i, + struct klist_node *n) +{ + i->i_klist = k; + i->i_cur = n; +} +EXPORT_SYMBOL_GPL(klist_iter_init_node_noref); + +/** + * klist_iter_init_noref - Initalize a klist_iter structure without taking + *a kref. + * @k: klist we're iterating. + * @i: klist_iter structure we're filling. + * + * Similar to klist_iter_init_node_noref(), but start with the list head. + */ +void klist_iter_init_noref(struct klist *k, struct klist_iter *i) +{ + klist_iter_init_node_noref(k, i, NULL); +} +EXPORT_SYMBOL_GPL(klist_iter_init_noref); + +/** + * klist_iter_exit_noref - Finish a list iteration without dropping a kref. + * @i: Iterator structure. + * + * Not really needed, but always good form. + */ +void klist_iter_exit_noref(struct klist_iter *i) +{ + if (i->i_cur) + i->i_cur = NULL; +} +EXPORT_SYMBOL_GPL(klist_iter_exit_noref); + static struct klist_node *to_klist_node(struct list_head *n) { return container_of(n, struct klist_node, n_node); Index: linux-3.4/drivers/base/bus.c === --- linux-3.4.orig/drivers/base/bus.c +++ linux-3.4/drivers/base/bus.c @@ -341,6 +341,37 @@ struct device *bus_find_device(struct bu } EXPORT_SYMBOL_GPL(bus_find_device); +/** + * bus_find_device_noref - device iterator for locating a particular device. + * @bus: bus type + * @start: Device to begin with + * @data: Data to pass to match function + * @match: Callback function to check device + * + * Same as bus_find_device, but it assumes caller already has klist_node ref, + * and it returns to caller a struct device with the new kref. + */ + +struct device *bus_find_device_noref(struct bus_type *bus, +struct device *start, void *data, +int (*match)(struct device *dev, void *data)) +{ + struct klist_iter i; + struct device *dev; + + if (!bus || !bus->p) + return NULL; + + klist_iter_init_node_noref(>p->klist_devices, , + (start ? >p->knode_bus : NULL)); + while ((dev = next_device())) + if (match(dev, data) && get_device(dev)) + break; + klist_iter_exit_noref(); + return dev; +} +EXPORT_SYMBOL_GPL(bus_find_device_noref); + static int match_name(struct device *dev, void *data) { const char *name = data; Index: linux-3.4/drivers/pci/search.c === --- linux-3.4.orig/drivers/pci/search.c +++ linux-3.4/drivers/pci/search.c @@ -289,6 +289,94 @@ pci_get_device(unsigned int vendor, unsi return pci_get_subsys(vendor, device, PCI_ANY_ID, PCI_ANY_ID, from); } +/* + * pci_get_dev_by_id_noref - begin or continue searching for a PCI device by id + * @id: pointer to struct pci_device_id to match for the device + * @from: Previous PCI device found in search, or %NULL for new search. + * + * Same as pci_get_dev_by_id, but it expects caller to hold a kref on knode_bus + * in "from" from previous search, and it holds a kref on knode_bus in returned + *
Re: pci: kernel crash in bus_find_device
In-Reply-To: 20140523023141.gc13...@kroah.com Hi Guenter, I got back to looking into this crash. Just as an example, the attached diffs also fix my bus_find_device problem for traversals that start from the head of the list and traverse it completely. They are very specific to the case of bus_find_device, and a complete solution would affect a lot of code. The main issue seems to be that when a device is found in a klist by say bus_find_device the klist_node reference should be returned to the caller, who should then decide whether to use it for the next klist search, drop it or maybe exchange it for a struct device reference. When resuming a search one should already hold a klist_node reference from the previous search. This model is broken by several functions using struct devices such as bus_find_device, which resume klist searches on the implicit assumption that holding a reference to the struct device is enough to acquire one on the klist_node. The only reason that this has not been a big issue so far is probably that on most systems struct devices are not destroyed and created very often. Thanks, Francesco Index: linux-3.4/lib/klist.c === --- linux-3.4.orig/lib/klist.c +++ linux-3.4/lib/klist.c @@ -318,6 +318,51 @@ void klist_iter_exit(struct klist_iter * } EXPORT_SYMBOL_GPL(klist_iter_exit); +/** + * klist_iter_init_node_noref - Initialize a klist_iter structure without + * taking a kref. + * @k: klist we're iterating. + * @i: klist_iter we're filling. + * @n: node to start with. + * + * Similar to klist_iter_init_noref(), but starts the action off with @n, + * instead of with the list head. + */ +void klist_iter_init_node_noref(struct klist *k, struct klist_iter *i, + struct klist_node *n) +{ + i-i_klist = k; + i-i_cur = n; +} +EXPORT_SYMBOL_GPL(klist_iter_init_node_noref); + +/** + * klist_iter_init_noref - Initalize a klist_iter structure without taking + *a kref. + * @k: klist we're iterating. + * @i: klist_iter structure we're filling. + * + * Similar to klist_iter_init_node_noref(), but start with the list head. + */ +void klist_iter_init_noref(struct klist *k, struct klist_iter *i) +{ + klist_iter_init_node_noref(k, i, NULL); +} +EXPORT_SYMBOL_GPL(klist_iter_init_noref); + +/** + * klist_iter_exit_noref - Finish a list iteration without dropping a kref. + * @i: Iterator structure. + * + * Not really needed, but always good form. + */ +void klist_iter_exit_noref(struct klist_iter *i) +{ + if (i-i_cur) + i-i_cur = NULL; +} +EXPORT_SYMBOL_GPL(klist_iter_exit_noref); + static struct klist_node *to_klist_node(struct list_head *n) { return container_of(n, struct klist_node, n_node); Index: linux-3.4/drivers/base/bus.c === --- linux-3.4.orig/drivers/base/bus.c +++ linux-3.4/drivers/base/bus.c @@ -341,6 +341,37 @@ struct device *bus_find_device(struct bu } EXPORT_SYMBOL_GPL(bus_find_device); +/** + * bus_find_device_noref - device iterator for locating a particular device. + * @bus: bus type + * @start: Device to begin with + * @data: Data to pass to match function + * @match: Callback function to check device + * + * Same as bus_find_device, but it assumes caller already has klist_node ref, + * and it returns to caller a struct device with the new kref. + */ + +struct device *bus_find_device_noref(struct bus_type *bus, +struct device *start, void *data, +int (*match)(struct device *dev, void *data)) +{ + struct klist_iter i; + struct device *dev; + + if (!bus || !bus-p) + return NULL; + + klist_iter_init_node_noref(bus-p-klist_devices, i, + (start ? start-p-knode_bus : NULL)); + while ((dev = next_device(i))) + if (match(dev, data) get_device(dev)) + break; + klist_iter_exit_noref(i); + return dev; +} +EXPORT_SYMBOL_GPL(bus_find_device_noref); + static int match_name(struct device *dev, void *data) { const char *name = data; Index: linux-3.4/drivers/pci/search.c === --- linux-3.4.orig/drivers/pci/search.c +++ linux-3.4/drivers/pci/search.c @@ -289,6 +289,94 @@ pci_get_device(unsigned int vendor, unsi return pci_get_subsys(vendor, device, PCI_ANY_ID, PCI_ANY_ID, from); } +/* + * pci_get_dev_by_id_noref - begin or continue searching for a PCI device by id + * @id: pointer to struct pci_device_id to match for the device + * @from: Previous PCI device found in search, or %NULL for new search. + * + * Same as pci_get_dev_by_id, but it expects caller to hold a kref on knode_bus + * in from from previous search, and it holds a kref on knode_bus in returned + * pdev.
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: In-Reply-To: 20140523023141.gc13...@kroah.com Hi Guenter, I got back to looking into this crash. Just as an example, the attached diffs also fix my bus_find_device problem for traversals that start from the head of the list and traverse it completely. They are very specific to the case of bus_find_device, and a complete solution would affect a lot of code. The main issue seems to be that when a device is found in a klist by say bus_find_device the klist_node reference should be returned to the caller, who should then decide whether to use it for the next klist search, drop it or maybe exchange it for a struct device reference. When resuming a search one should already hold a klist_node reference from the previous search. This model is broken by several functions using struct devices such as bus_find_device, which resume klist searches on the implicit assumption that holding a reference to the struct device is enough to acquire one on the klist_node. The only reason that this has not been a big issue so far is probably that on most systems struct devices are not destroyed and created very often. Not true, this happens on every USB device insertion and removal, and on startup and shutdown. What makes PCI special that we aren't hitting these issues in USB and other subsystems that do a lot of device creation/removal? thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: @@ -719,6 +719,11 @@ struct pci_dev *pci_get_device(unsigned struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device, unsigned int ss_vendor, unsigned int ss_device, struct pci_dev *from); +struct pci_dev *pci_get_device_noref(unsigned int vendor, unsigned int device, + struct pci_dev *from); +struct pci_dev *pci_get_subsys_noref(unsigned int vendor, unsigned int device, + unsigned int ss_vendor, unsigned int ss_device, + struct pci_dev *from); To follow up, what drivers are you thinking need to make these calls? Perhaps they shouldn't be doing something like this? And, to answer my previous question, the reason PCI is different is that drivers want to walk the list of devices to find stupid things like this out, USB doesn't do dumb stuff like that :) thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, Jun 03, 2014 at 04:21:00PM -0700, Greg KH wrote: On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote: In-Reply-To: 20140523023141.gc13...@kroah.com Hi Guenter, I got back to looking into this crash. Just as an example, the attached diffs also fix my bus_find_device problem for traversals that start from the head of the list and traverse it completely. They are very specific to the case of bus_find_device, and a complete solution would affect a lot of code. The main issue seems to be that when a device is found in a klist by say bus_find_device the klist_node reference should be returned to the caller, who should then decide whether to use it for the next klist search, drop it or maybe exchange it for a struct device reference. When resuming a search one should already hold a klist_node reference from the previous search. This model is broken by several functions using struct devices such as bus_find_device, which resume klist searches on the implicit assumption that holding a reference to the struct device is enough to acquire one on the klist_node. The only reason that this has not been a big issue so far is probably that on most systems struct devices are not destroyed and created very often. Not true, this happens on every USB device insertion and removal, and on startup and shutdown. What makes PCI special that we aren't hitting these issues in USB and other subsystems that do a lot of device creation/removal? Look for callers of bus_find_device. Unless I am missing something, only pci and scsi code call it with non-NULL 'start' argument, and the scsi use is limited to a walk through scsi devices for a proc file. Makes me wonder if the start argument should go away, and if pci and scsi should use another means to walk through devices. Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Thu, May 22, 2014 at 12:22:40AM -0700, Guenter Roeck wrote: > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: > > On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: > >> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: > >>> I have been using an x86 platform. > >>> When I started working on it I got early crashes until I added the > >>> check for p not NULL in > >>> > >>> +void bus_release_device(struct device *dev) > >>> +{ > >>> + struct device_private *p = dev->p; > >>> + > >>> + if (p && klist_node_attached(>knode_bus)) > >>> + klist_put_last(>knode_bus); > >>> +} > >>> + > >>> > >>> Maybe on powerpc *p is overriden between device_del and device_release? > >>> > >>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are > >>> treated as WARN_ONs in the current klist code. > >>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I > >>> ran into it without the second patch (but only when I ran my module > >>> and tests). > >>> > >> Hi Francesco, > >> > >> I replaced the BUG_ON with WARN_ON; still crashes. > >> > >> Anyway, the problem seems to be known. I found two related exchanges. > >> > >> [1] describes pretty much the same problem. I don't see if/where it was > >> ever fixed, though. > >> > >> [2] is a patch to fix the problem. It did not apply cleanly to 3.14, > >> so I had to make some adjustments in klist_iter_init_node. Resulting > >> patch is below. With this patch, the problem is gone. It is not perfect, > >> as it aborts the loop if it encounters a deleted kobject, but it is better > >> than nothing. Unfortunately, the patch never made it upstream; no idea why. > >> Copying the author and Greg to get additional feedback. > >> > >> Guenter > >> > >> [1] https://lkml.org/lkml/2008/10/26/79 > >> [2] https://lkml.org/lkml/2012/4/16/218 > > > > 2 years ago? I have no idea what was up with that, sorry... > > > > Ok, but do you have comments on the patch itself in its current version ? I have no idea, and at the moment, no time to look at this at all, sorry. Feel free to work on it and verify if it is a valid fix or not for this issue and let me know. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Thu, May 22, 2014 at 09:19:40AM -0700, Francesco Ruggeri wrote: > Aborting a search does not sound like a correct solution. > How does a higher level user (eg for_each_pci_dev) know that a search > was aborted and decide whether it should try again, assuming it would > be ok repeating the action on the devices visited the first time? > Agreed, it is less than desirable. I would consider this to be a secondary problem, though, the immediate problem being the crash. One possible solution might be to have the various functions return error codes (ERR_PTR), but that would be quite invasive as well. I really think we need input from Greg and, if the solution touches the PCI subsystem, from Bjorn Helgaas to find an acceptable solution to that problem. Guenter > Francesco > > > On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck wrote: > > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: > >> > >> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: > >>> > >>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: > > I have been using an x86 platform. > When I started working on it I got early crashes until I added the > check for p not NULL in > > +void bus_release_device(struct device *dev) > +{ > + struct device_private *p = dev->p; > + > + if (p && klist_node_attached(>knode_bus)) > + klist_put_last(>knode_bus); > +} > + > > Maybe on powerpc *p is overriden between device_del and device_release? > > Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are > treated as WARN_ONs in the current klist code. > The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I > ran into it without the second patch (but only when I ran my module > and tests). > > >>> Hi Francesco, > >>> > >>> I replaced the BUG_ON with WARN_ON; still crashes. > >>> > >>> Anyway, the problem seems to be known. I found two related exchanges. > >>> > >>> [1] describes pretty much the same problem. I don't see if/where it was > >>> ever fixed, though. > >>> > >>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14, > >>> so I had to make some adjustments in klist_iter_init_node. Resulting > >>> patch is below. With this patch, the problem is gone. It is not perfect, > >>> as it aborts the loop if it encounters a deleted kobject, but it is > >>> better > >>> than nothing. Unfortunately, the patch never made it upstream; no idea > >>> why. > >>> Copying the author and Greg to get additional feedback. > >>> > >>> Guenter > >>> > >>> [1] https://lkml.org/lkml/2008/10/26/79 > >>> [2] https://lkml.org/lkml/2012/4/16/218 > >> > >> > >> 2 years ago? I have no idea what was up with that, sorry... > >> > > > > Ok, but do you have comments on the patch itself in its current version ? > > > > Guenter > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
Aborting a search does not sound like a correct solution. How does a higher level user (eg for_each_pci_dev) know that a search was aborted and decide whether it should try again, assuming it would be ok repeating the action on the devices visited the first time? Francesco On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck wrote: > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: >> >> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: >>> >>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev->p; + + if (p && klist_node_attached(>knode_bus)) + klist_put_last(>knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). >>> Hi Francesco, >>> >>> I replaced the BUG_ON with WARN_ON; still crashes. >>> >>> Anyway, the problem seems to be known. I found two related exchanges. >>> >>> [1] describes pretty much the same problem. I don't see if/where it was >>> ever fixed, though. >>> >>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14, >>> so I had to make some adjustments in klist_iter_init_node. Resulting >>> patch is below. With this patch, the problem is gone. It is not perfect, >>> as it aborts the loop if it encounters a deleted kobject, but it is >>> better >>> than nothing. Unfortunately, the patch never made it upstream; no idea >>> why. >>> Copying the author and Greg to get additional feedback. >>> >>> Guenter >>> >>> [1] https://lkml.org/lkml/2008/10/26/79 >>> [2] https://lkml.org/lkml/2012/4/16/218 >> >> >> 2 years ago? I have no idea what was up with that, sorry... >> > > Ok, but do you have comments on the patch itself in its current version ? > > Guenter > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev->p; + + if (p && klist_node_attached(>knode_bus)) + klist_put_last(>knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... Ok, but do you have comments on the patch itself in its current version ? Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: > On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: > > I have been using an x86 platform. > > When I started working on it I got early crashes until I added the > > check for p not NULL in > > > > +void bus_release_device(struct device *dev) > > +{ > > + struct device_private *p = dev->p; > > + > > + if (p && klist_node_attached(>knode_bus)) > > + klist_put_last(>knode_bus); > > +} > > + > > > > Maybe on powerpc *p is overriden between device_del and device_release? > > > > Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are > > treated as WARN_ONs in the current klist code. > > The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I > > ran into it without the second patch (but only when I ran my module > > and tests). > > > Hi Francesco, > > I replaced the BUG_ON with WARN_ON; still crashes. > > Anyway, the problem seems to be known. I found two related exchanges. > > [1] describes pretty much the same problem. I don't see if/where it was > ever fixed, though. > > [2] is a patch to fix the problem. It did not apply cleanly to 3.14, > so I had to make some adjustments in klist_iter_init_node. Resulting > patch is below. With this patch, the problem is gone. It is not perfect, > as it aborts the loop if it encounters a deleted kobject, but it is better > than nothing. Unfortunately, the patch never made it upstream; no idea why. > Copying the author and Greg to get additional feedback. > > Guenter > > [1] https://lkml.org/lkml/2008/10/26/79 > [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... Ok, but do you have comments on the patch itself in its current version ? Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
Aborting a search does not sound like a correct solution. How does a higher level user (eg for_each_pci_dev) know that a search was aborted and decide whether it should try again, assuming it would be ok repeating the action on the devices visited the first time? Francesco On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck li...@roeck-us.net wrote: On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... Ok, but do you have comments on the patch itself in its current version ? Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Thu, May 22, 2014 at 09:19:40AM -0700, Francesco Ruggeri wrote: Aborting a search does not sound like a correct solution. How does a higher level user (eg for_each_pci_dev) know that a search was aborted and decide whether it should try again, assuming it would be ok repeating the action on the devices visited the first time? Agreed, it is less than desirable. I would consider this to be a secondary problem, though, the immediate problem being the crash. One possible solution might be to have the various functions return error codes (ERR_PTR), but that would be quite invasive as well. I really think we need input from Greg and, if the solution touches the PCI subsystem, from Bjorn Helgaas to find an acceptable solution to that problem. Guenter Francesco On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck li...@roeck-us.net wrote: On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... Ok, but do you have comments on the patch itself in its current version ? Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Thu, May 22, 2014 at 12:22:40AM -0700, Guenter Roeck wrote: On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 2 years ago? I have no idea what was up with that, sorry... Ok, but do you have comments on the patch itself in its current version ? I have no idea, and at the moment, no time to look at this at all, sorry. Feel free to work on it and verify if it is a valid fix or not for this issue and let me know. thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: > I have been using an x86 platform. > When I started working on it I got early crashes until I added the > check for p not NULL in > > +void bus_release_device(struct device *dev) > +{ > + struct device_private *p = dev->p; > + > + if (p && klist_node_attached(>knode_bus)) > + klist_put_last(>knode_bus); > +} > + > > Maybe on powerpc *p is overriden between device_del and device_release? > > Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are > treated as WARN_ONs in the current klist code. > The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I > ran into it without the second patch (but only when I ran my module > and tests). > Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 >From 2bf95c4a05a91dbe7168b53d4405dee3a0912a98 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke Date: Mon, 16 Apr 2012 15:06:25 +0200 Subject: [PATCH] driver core: check start node in klist_iter_init_node klist_iter_init_node() takes a node as a start argument. However, this node might not be valid anymore. This patch updates the klist_iter_init_node() and dependent functions to return an error if so. All calling functions have been audited to check for a return code here. Signed-off-by: Hannes Reinecke Cc: Greg Kroah-Hartmann Cc: Kay Sievers [groeck: ported to 3.14] Signed-off-by: Guenter Roeck --- drivers/base/bus.c | 46 +- drivers/base/class.c | 32 drivers/base/driver.c | 18 +++--- include/linux/device.h | 10 +- include/linux/klist.h |2 +- lib/klist.c| 15 +++ 6 files changed, 77 insertions(+), 46 deletions(-) diff --git a/drivers/base/bus.c b/drivers/base/bus.c index fbad61b..9d6af80 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -305,11 +305,13 @@ int bus_for_each_dev(struct bus_type *bus, struct device *start, if (!bus || !bus->p || (start && !start->p)) return -EINVAL; - klist_iter_init_node(>p->klist_devices, , -(start ? >p->knode_bus : NULL)); - while ((dev = next_device()) && !error) - error = fn(dev, data); - klist_iter_exit(); + error = klist_iter_init_node(>p->klist_devices, , +(start ? >p->knode_bus : NULL)); + if (!error) { + while ((dev = next_device()) && !error) + error = fn(dev, data); + klist_iter_exit(); + } return error; } EXPORT_SYMBOL_GPL(bus_for_each_dev); @@ -339,8 +341,10 @@ struct device *bus_find_device(struct bus_type *bus, if (!bus || !bus->p) return NULL; - klist_iter_init_node(>p->klist_devices, , -(start ? >p->knode_bus : NULL)); + if (klist_iter_init_node(>p->klist_devices, , +(start ? >p->knode_bus : NULL)) < 0) + return NULL; + while ((dev = next_device())) if (match(dev, data) && get_device(dev)) break; @@ -393,7 +397,9 @@ struct device *subsys_find_device_by_id(struct bus_type *subsys, unsigned int id return NULL; if (hint) { - klist_iter_init_node(>p->klist_devices, , >p->knode_bus); + if (klist_iter_init_node(>p->klist_devices, , +>p->knode_bus) < 0) + return NULL; dev = next_device(); if (dev && dev->id == id && get_device(dev)) { klist_iter_exit(); @@ -455,11 +461,13 @@ int bus_for_each_drv(struct bus_type *bus, struct device_driver *start, if (!bus) return -EINVAL; - klist_iter_init_node(>p->klist_drivers, , -start ? >p->knode_bus : NULL); - while ((drv = next_driver()) && !error) - error = fn(drv, data); - klist_iter_exit(); + error = klist_iter_init_node(>p->klist_drivers, , +start ? >p->knode_bus : NULL); + if (!error) { + while ((drv = next_driver()) && !error) +
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote: > Hi Guenter, > thank you for your reply. I will check out the changes that you pointed to. > The problem we are seeing is a race condition between for_each_pci_dev > (or similar) and device_unregisters. I am not sure if use of the new > lock should be extended to all code using for_each_pci_dev as well. > > pci_scan is a kernel thread that I used for testing purposes, to > mimick the dynamics that we saw in our crashes in > edac_pci_clear_parity_errors: > > for (;;) { > pci_dev = NULL; > while ((pci_dev = pci_get_device(PCI_ANY_ID, > PCI_ANY_ID, pci_dev)) != NULL) > ; > } > > It keeps traversing klist_devices in pci_bus_type using > bus_find_device, costantly resuming its search for the next element > starting from the one it got in the previous round. > There are several loops of this kind in linux. In case of this thread > no action is taken on the elements as they are "found". > > The race condition occurs when bus_find_device resumes its search from > a device that has been unregistered. Because device_unregister resets > klist_bus in the device, bus_find device cannot resume from where it > left off in the klist. > The sequence is device_unregister, device_del, bus_remove_device, > klist_del(>p->knode_bus.). > Problem is confirmed to exist in 3.14, and can be reproduced easily with the following dummy driver, courtesy to Francesco. I added usleep_range() to make it easier to reproduce. It took only about half a dozen hot insertion/removal events to make it happen. Here are the tracebacks: [ cut here ] WARNING: at /home/p2020/linux-freescale/include/linux/kref.h:47 Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core] CPU: 0 PID: 2641 Comm: pci_scan Not tainted 3.14.4-juniper-00422-gf428c34 #47 task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000 NIP: c04e0988 LR: c02baa28 CTR: c0268ca4 REGS: e73e7da0 TRAP: 0700 Not tainted (3.14.4-juniper-00422-gf428c34) MSR: 00029000 CR: 24038382 XER: GPR00: c0268b38 e73e7e50 e7ce8ea0 e7c96f94 e73e7e58 e725a264 c0268a38 eedaa2c0 GPR08: 0002 0001 00021000 2403d382 c00576f8 e7377750 GPR16: GPR24: f170f000 c056c33c c0268a38 e73e7ea8 eedaa2c0 NIP [c04e0988] klist_iter_init_node.part.0+0xc/0x684 LR [c02baa28] bus_find_device+0x48/0xac Call Trace: [e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94 [e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48 [e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan] [e73e7ee0] [c00577bc] kthread+0xc4/0xd8 [e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64 and: [ cut here ] WARNING: at /home/p2020/linux-freescale/lib/klist.c:189 Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core] CPU: 0 PID: 2641 Comm: pci_scan Tainted: GW 3.14.4-juniper-00422-gf428c34 #47 task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000 NIP: c04d7ad0 LR: c04d7be4 CTR: c0268ca4 REGS: e73e7d30 TRAP: 0700 Tainted: GW (3.14.4-juniper-00422-gf428c34) MSR: 00029000 CR: 24038382 XER: GPR00: c04d7be4 e73e7de0 e7ce8ea0 e725a264 e73e7e58 e725a264 c0268a38 eedaa2c0 GPR08: 0002 0001 0001 00021000 24038384 c00576f8 e7377750 GPR16: GPR24: f170f000 c02ba364 e725a258 e725a258 e73e7e58 NIP [c04d7ad0] klist_release+0x20/0xec LR [c04d7be4] klist_dec_and_del+0x48/0x5c Call Trace: [e73e7e10] [c04d7be4] klist_dec_and_del+0x48/0x5c [e73e7e20] [c04d7c3c] klist_next+0x44/0x138 [e73e7e40] [c02ba444] next_device+0x10/0x34 [e73e7e50] [c02baa30] bus_find_device+0x50/0xac [e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94 [e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48 [e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan] [e73e7ee0] [c00577bc] kthread+0xc4/0xd8 [e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64 Francesco, I'll test the patches you sent me next. Guenter --- /* * PCI scan test driver */ #include #include #include #include #include #include #include #include #include static struct task_struct *pci_scan_task = NULL; static int pci_scan(void *unused) { for (;;) { struct pci_dev *dev = NULL; while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) usleep_range(1000, 2000); schedule(); if (kthread_should_stop()) break; } return 0; } static int __init pci_scan_init(void) { pci_scan_task = kthread_create(pci_scan, NULL, "pci_scan"); if (!pci_scan_task) return -ENODEV;
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote: Hi Guenter, thank you for your reply. I will check out the changes that you pointed to. The problem we are seeing is a race condition between for_each_pci_dev (or similar) and device_unregisters. I am not sure if use of the new lock should be extended to all code using for_each_pci_dev as well. pci_scan is a kernel thread that I used for testing purposes, to mimick the dynamics that we saw in our crashes in edac_pci_clear_parity_errors: for (;;) { pci_dev = NULL; while ((pci_dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pci_dev)) != NULL) ; } It keeps traversing klist_devices in pci_bus_type using bus_find_device, costantly resuming its search for the next element starting from the one it got in the previous round. There are several loops of this kind in linux. In case of this thread no action is taken on the elements as they are found. The race condition occurs when bus_find_device resumes its search from a device that has been unregistered. Because device_unregister resets klist_bus in the device, bus_find device cannot resume from where it left off in the klist. The sequence is device_unregister, device_del, bus_remove_device, klist_del(dev-p-knode_bus.). Problem is confirmed to exist in 3.14, and can be reproduced easily with the following dummy driver, courtesy to Francesco. I added usleep_range() to make it easier to reproduce. It took only about half a dozen hot insertion/removal events to make it happen. Here are the tracebacks: [ cut here ] WARNING: at /home/p2020/linux-freescale/include/linux/kref.h:47 Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core] CPU: 0 PID: 2641 Comm: pci_scan Not tainted 3.14.4-juniper-00422-gf428c34 #47 task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000 NIP: c04e0988 LR: c02baa28 CTR: c0268ca4 REGS: e73e7da0 TRAP: 0700 Not tainted (3.14.4-juniper-00422-gf428c34) MSR: 00029000 CE,EE,ME CR: 24038382 XER: GPR00: c0268b38 e73e7e50 e7ce8ea0 e7c96f94 e73e7e58 e725a264 c0268a38 eedaa2c0 GPR08: 0002 0001 00021000 2403d382 c00576f8 e7377750 GPR16: GPR24: f170f000 c056c33c c0268a38 e73e7ea8 eedaa2c0 NIP [c04e0988] klist_iter_init_node.part.0+0xc/0x684 LR [c02baa28] bus_find_device+0x48/0xac Call Trace: [e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94 [e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48 [e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan] [e73e7ee0] [c00577bc] kthread+0xc4/0xd8 [e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64 and: [ cut here ] WARNING: at /home/p2020/linux-freescale/lib/klist.c:189 Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core] CPU: 0 PID: 2641 Comm: pci_scan Tainted: GW 3.14.4-juniper-00422-gf428c34 #47 task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000 NIP: c04d7ad0 LR: c04d7be4 CTR: c0268ca4 REGS: e73e7d30 TRAP: 0700 Tainted: GW (3.14.4-juniper-00422-gf428c34) MSR: 00029000 CE,EE,ME CR: 24038382 XER: GPR00: c04d7be4 e73e7de0 e7ce8ea0 e725a264 e73e7e58 e725a264 c0268a38 eedaa2c0 GPR08: 0002 0001 0001 00021000 24038384 c00576f8 e7377750 GPR16: GPR24: f170f000 c02ba364 e725a258 e725a258 e73e7e58 NIP [c04d7ad0] klist_release+0x20/0xec LR [c04d7be4] klist_dec_and_del+0x48/0x5c Call Trace: [e73e7e10] [c04d7be4] klist_dec_and_del+0x48/0x5c [e73e7e20] [c04d7c3c] klist_next+0x44/0x138 [e73e7e40] [c02ba444] next_device+0x10/0x34 [e73e7e50] [c02baa30] bus_find_device+0x50/0xac [e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94 [e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48 [e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan] [e73e7ee0] [c00577bc] kthread+0xc4/0xd8 [e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64 Francesco, I'll test the patches you sent me next. Guenter --- /* * PCI scan test driver */ #include linux/delay.h #include linux/kernel.h #include linux/module.h #include linux/sched.h #include linux/kprobes.h #include linux/kallsyms.h #include linux/kthread.h #include linux/pci.h #include linux/pcieport_if.h static struct task_struct *pci_scan_task = NULL; static int pci_scan(void *unused) { for (;;) { struct pci_dev *dev = NULL; while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) usleep_range(1000, 2000); schedule(); if (kthread_should_stop()) break; } return 0; } static int __init pci_scan_init(void) {
Re: pci: kernel crash in bus_find_device
On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: I have been using an x86 platform. When I started working on it I got early crashes until I added the check for p not NULL in +void bus_release_device(struct device *dev) +{ + struct device_private *p = dev-p; + + if (p klist_node_attached(p-knode_bus)) + klist_put_last(p-knode_bus); +} + Maybe on powerpc *p is overriden between device_del and device_release? Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are treated as WARN_ONs in the current klist code. The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I ran into it without the second patch (but only when I ran my module and tests). Hi Francesco, I replaced the BUG_ON with WARN_ON; still crashes. Anyway, the problem seems to be known. I found two related exchanges. [1] describes pretty much the same problem. I don't see if/where it was ever fixed, though. [2] is a patch to fix the problem. It did not apply cleanly to 3.14, so I had to make some adjustments in klist_iter_init_node. Resulting patch is below. With this patch, the problem is gone. It is not perfect, as it aborts the loop if it encounters a deleted kobject, but it is better than nothing. Unfortunately, the patch never made it upstream; no idea why. Copying the author and Greg to get additional feedback. Guenter [1] https://lkml.org/lkml/2008/10/26/79 [2] https://lkml.org/lkml/2012/4/16/218 From 2bf95c4a05a91dbe7168b53d4405dee3a0912a98 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke h...@suse.de Date: Mon, 16 Apr 2012 15:06:25 +0200 Subject: [PATCH] driver core: check start node in klist_iter_init_node klist_iter_init_node() takes a node as a start argument. However, this node might not be valid anymore. This patch updates the klist_iter_init_node() and dependent functions to return an error if so. All calling functions have been audited to check for a return code here. Signed-off-by: Hannes Reinecke h...@suse.de Cc: Greg Kroah-Hartmann gre...@linuxfoundation.org Cc: Kay Sievers k...@vrfy.org [groeck: ported to 3.14] Signed-off-by: Guenter Roeck li...@roeck-us.net --- drivers/base/bus.c | 46 +- drivers/base/class.c | 32 drivers/base/driver.c | 18 +++--- include/linux/device.h | 10 +- include/linux/klist.h |2 +- lib/klist.c| 15 +++ 6 files changed, 77 insertions(+), 46 deletions(-) diff --git a/drivers/base/bus.c b/drivers/base/bus.c index fbad61b..9d6af80 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -305,11 +305,13 @@ int bus_for_each_dev(struct bus_type *bus, struct device *start, if (!bus || !bus-p || (start !start-p)) return -EINVAL; - klist_iter_init_node(bus-p-klist_devices, i, -(start ? start-p-knode_bus : NULL)); - while ((dev = next_device(i)) !error) - error = fn(dev, data); - klist_iter_exit(i); + error = klist_iter_init_node(bus-p-klist_devices, i, +(start ? start-p-knode_bus : NULL)); + if (!error) { + while ((dev = next_device(i)) !error) + error = fn(dev, data); + klist_iter_exit(i); + } return error; } EXPORT_SYMBOL_GPL(bus_for_each_dev); @@ -339,8 +341,10 @@ struct device *bus_find_device(struct bus_type *bus, if (!bus || !bus-p) return NULL; - klist_iter_init_node(bus-p-klist_devices, i, -(start ? start-p-knode_bus : NULL)); + if (klist_iter_init_node(bus-p-klist_devices, i, +(start ? start-p-knode_bus : NULL)) 0) + return NULL; + while ((dev = next_device(i))) if (match(dev, data) get_device(dev)) break; @@ -393,7 +397,9 @@ struct device *subsys_find_device_by_id(struct bus_type *subsys, unsigned int id return NULL; if (hint) { - klist_iter_init_node(subsys-p-klist_devices, i, hint-p-knode_bus); + if (klist_iter_init_node(subsys-p-klist_devices, i, +hint-p-knode_bus) 0) + return NULL; dev = next_device(i); if (dev dev-id == id get_device(dev)) { klist_iter_exit(i); @@ -455,11 +461,13 @@ int bus_for_each_drv(struct bus_type *bus, struct device_driver *start, if (!bus) return -EINVAL; - klist_iter_init_node(bus-p-klist_drivers, i, -start ? start-p-knode_bus : NULL); - while ((drv = next_driver(i)) !error) - error = fn(drv, data); - klist_iter_exit(i); + error = klist_iter_init_node(bus-p-klist_drivers, i, +start ?
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote: > Hi Guenter, > thank you for your reply. I will check out the changes that you pointed to. > The problem we are seeing is a race condition between for_each_pci_dev > (or similar) and device_unregisters. I am not sure if use of the new > lock should be extended to all code using for_each_pci_dev as well. > > pci_scan is a kernel thread that I used for testing purposes, to > mimick the dynamics that we saw in our crashes in > edac_pci_clear_parity_errors: > > for (;;) { > pci_dev = NULL; > while ((pci_dev = pci_get_device(PCI_ANY_ID, > PCI_ANY_ID, pci_dev)) != NULL) > ; > } > > It keeps traversing klist_devices in pci_bus_type using > bus_find_device, costantly resuming its search for the next element > starting from the one it got in the previous round. > There are several loops of this kind in linux. In case of this thread > no action is taken on the elements as they are "found". > > The race condition occurs when bus_find_device resumes its search from > a device that has been unregistered. Because device_unregister resets > klist_bus in the device, bus_find device cannot resume from where it > left off in the klist. > The sequence is device_unregister, device_del, bus_remove_device, > klist_del(>p->knode_bus.). > Hmmm ... sounds more like a generic problem, not specifically related to pci. Essentially everything calling bus_find_device() with a starting device which has been removed (though only pci and scsi seem to be doing that in practice). Can you reproduce the problem with the latest kernel ? Also, can you send me the entire file with the kernel thread you mentioned above ? Maybe I can reproduce the problem here. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
Hi Guenter, thank you for your reply. I will check out the changes that you pointed to. The problem we are seeing is a race condition between for_each_pci_dev (or similar) and device_unregisters. I am not sure if use of the new lock should be extended to all code using for_each_pci_dev as well. pci_scan is a kernel thread that I used for testing purposes, to mimick the dynamics that we saw in our crashes in edac_pci_clear_parity_errors: for (;;) { pci_dev = NULL; while ((pci_dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pci_dev)) != NULL) ; } It keeps traversing klist_devices in pci_bus_type using bus_find_device, costantly resuming its search for the next element starting from the one it got in the previous round. There are several loops of this kind in linux. In case of this thread no action is taken on the elements as they are "found". The race condition occurs when bus_find_device resumes its search from a device that has been unregistered. Because device_unregister resets klist_bus in the device, bus_find device cannot resume from where it left off in the klist. The sequence is device_unregister, device_del, bus_remove_device, klist_del(>p->knode_bus.). Francesco -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 12:17:57PM -0700, Francesco Ruggeri wrote: > I posted this about a week ago but I did not get any replies. > Re-trying. > > While traversing devices on pci_bus_type I ran into the crash below. > The immediate cause of the crash is that bus_find_device is trying to resume > a scan starting from a device that has been unregistered (and whose knode_bus > has already been klist_del' ed). > The main issue seems to be that when resuming a scan the caller should > be holding a > reference to the klist_node, but instead it relies on holding a > reference to the device. > I played with a couple of narrow fixes, but a clean solution would > affect quite a bit of code. > > Has anybody run into this before? > Hi Francesco, I may be missing something, but I don't find a pci_scan symbol in the 3.4 kernel. Also, the process name suggests that you may possibly trigger pci rescans from user space. Both suggest that you may possibly run third party code in your kernel. Either case, I ran into similar problems myself with pci rescans triggered from user space. The 3.4 kernel has no synchronization for rescans triggered from user space with those triggered from the kernel. In a nutshell, when triggering rescans and removals from user space you must ensure that only one such rescan/removal is active at any given time. Under no circumstances trigger rescans from user space if a rescan can also be triggered from the kernel. Obviously that also applies if rescans can be triggered multiple times in parallel by some third party kernel module. Maybe that explains your problem ? The problem has been addressed recently with commit 9d16947 (PCI: Add global pci_lock_rescan_remove) and several subsequent patches. Guenter > Thanks, > Francesco Ruggeri > > > [ cut here ] > WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/include/linux/kref.h:41 > klist_iter_init_node+0x30/0x38() > Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO) > macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6 > nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG > xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O) > 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw > iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw > ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon > kvm_amd kvm > Pid: 6861, comm: pci_scan_0 Tainted: P O > 3.4.43.Ar-1797671.flbocafruggeri #1 > Call Trace: > [] warn_slowpath_common+0x80/0x98 > [] ? pci_do_find_bus+0x49/0x49 > [] warn_slowpath_null+0x15/0x17 > [] klist_iter_init_node+0x30/0x38 > [] bus_find_device+0x48/0x90 > [] pci_get_dev_by_id+0x5e/0x81 > [] pci_get_subsys+0x5c/0x7f > [] pci_get_device+0x11/0x13 > [] pci_scan+0x39/0x8a [pci_scan] > [] ? init_module+0x3c/0x3c [pci_scan] > [] kthread+0x84/0x8c > [] kernel_thread_helper+0x4/0x10 > [] ? __init_kthread_worker+0x37/0x37 > [] ? gs_change+0xb/0xb > ---[ end trace 79cea1ec476672fe ]--- > [ cut here ] > WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/lib/klist.c:189 > klist_release+0x2b/0xeb() > Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO) > macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6 > nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG > xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O) > 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw > iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw > ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon > kvm_amd kvm > Pid: 6861, comm: pci_scan_0 Tainted: PW O > 3.4.43.Ar-1797671.flbocafruggeri #1 > Call Trace: > [] warn_slowpath_common+0x80/0x98 > [] ? bus_get_device_klist+0x10/0x10 > [] warn_slowpath_null+0x15/0x17 > [] klist_release+0x2b/0xeb > [] klist_dec_and_del+0x1e/0x25 > [] klist_next+0x35/0xc9 > [] ? pci_do_find_bus+0x49/0x49 > [] next_device+0x9/0x19 > [] bus_find_device+0x6c/0x90 > [] pci_get_dev_by_id+0x5e/0x81 > [] pci_get_subsys+0x5c/0x7f > [] pci_get_device+0x11/0x13 > [] pci_scan+0x39/0x8a [pci_scan] > [] ? init_module+0x3c/0x3c [pci_scan] > [] kthread+0x84/0x8c > [] kernel_thread_helper+0x4/0x10 > [] ? __init_kthread_worker+0x37/0x37 > [] ? gs_change+0xb/0xb > ---[ end trace 79cea1ec476672ff ]--- > general protection fault: [#1] PREEMPT SMP > CPU 1 > Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO) > macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6 > nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG > xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O) > 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw > iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw > ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon > kvm_amd kvm
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 12:17:57PM -0700, Francesco Ruggeri wrote: I posted this about a week ago but I did not get any replies. Re-trying. While traversing devices on pci_bus_type I ran into the crash below. The immediate cause of the crash is that bus_find_device is trying to resume a scan starting from a device that has been unregistered (and whose knode_bus has already been klist_del' ed). The main issue seems to be that when resuming a scan the caller should be holding a reference to the klist_node, but instead it relies on holding a reference to the device. I played with a couple of narrow fixes, but a clean solution would affect quite a bit of code. Has anybody run into this before? Hi Francesco, I may be missing something, but I don't find a pci_scan symbol in the 3.4 kernel. Also, the process name suggests that you may possibly trigger pci rescans from user space. Both suggest that you may possibly run third party code in your kernel. Either case, I ran into similar problems myself with pci rescans triggered from user space. The 3.4 kernel has no synchronization for rescans triggered from user space with those triggered from the kernel. In a nutshell, when triggering rescans and removals from user space you must ensure that only one such rescan/removal is active at any given time. Under no circumstances trigger rescans from user space if a rescan can also be triggered from the kernel. Obviously that also applies if rescans can be triggered multiple times in parallel by some third party kernel module. Maybe that explains your problem ? The problem has been addressed recently with commit 9d16947 (PCI: Add global pci_lock_rescan_remove) and several subsequent patches. Guenter Thanks, Francesco Ruggeri [ cut here ] WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/include/linux/kref.h:41 klist_iter_init_node+0x30/0x38() Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO) macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6 nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O) 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon kvm_amd kvm Pid: 6861, comm: pci_scan_0 Tainted: P O 3.4.43.Ar-1797671.flbocafruggeri #1 Call Trace: [81029dc4] warn_slowpath_common+0x80/0x98 [811b57f1] ? pci_do_find_bus+0x49/0x49 [81029df1] warn_slowpath_null+0x15/0x17 [813a43ce] klist_iter_init_node+0x30/0x38 [8120e57e] bus_find_device+0x48/0x90 [811b5908] pci_get_dev_by_id+0x5e/0x81 [811b5a6a] pci_get_subsys+0x5c/0x7f [811b5a9e] pci_get_device+0x11/0x13 [a00b2087] pci_scan+0x39/0x8a [pci_scan] [a00b204e] ? init_module+0x3c/0x3c [pci_scan] [81040e6e] kthread+0x84/0x8c [813c8b14] kernel_thread_helper+0x4/0x10 [81040dea] ? __init_kthread_worker+0x37/0x37 [813c8b10] ? gs_change+0xb/0xb ---[ end trace 79cea1ec476672fe ]--- [ cut here ] WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/lib/klist.c:189 klist_release+0x2b/0xeb() Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO) macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6 nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O) 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon kvm_amd kvm Pid: 6861, comm: pci_scan_0 Tainted: PW O 3.4.43.Ar-1797671.flbocafruggeri #1 Call Trace: [81029dc4] warn_slowpath_common+0x80/0x98 [8120de13] ? bus_get_device_klist+0x10/0x10 [81029df1] warn_slowpath_null+0x15/0x17 [813a440e] klist_release+0x2b/0xeb [813a44ec] klist_dec_and_del+0x1e/0x25 [813a4528] klist_next+0x35/0xc9 [811b57f1] ? pci_do_find_bus+0x49/0x49 [8120deb3] next_device+0x9/0x19 [8120e5a2] bus_find_device+0x6c/0x90 [811b5908] pci_get_dev_by_id+0x5e/0x81 [811b5a6a] pci_get_subsys+0x5c/0x7f [811b5a9e] pci_get_device+0x11/0x13 [a00b2087] pci_scan+0x39/0x8a [pci_scan] [a00b204e] ? init_module+0x3c/0x3c [pci_scan] [81040e6e] kthread+0x84/0x8c [813c8b14] kernel_thread_helper+0x4/0x10 [81040dea] ? __init_kthread_worker+0x37/0x37 [813c8b10] ? gs_change+0xb/0xb ---[ end trace 79cea1ec476672ff ]--- general protection fault: [#1] PREEMPT SMP CPU 1 Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
Re: pci: kernel crash in bus_find_device
Hi Guenter, thank you for your reply. I will check out the changes that you pointed to. The problem we are seeing is a race condition between for_each_pci_dev (or similar) and device_unregisters. I am not sure if use of the new lock should be extended to all code using for_each_pci_dev as well. pci_scan is a kernel thread that I used for testing purposes, to mimick the dynamics that we saw in our crashes in edac_pci_clear_parity_errors: for (;;) { pci_dev = NULL; while ((pci_dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pci_dev)) != NULL) ; } It keeps traversing klist_devices in pci_bus_type using bus_find_device, costantly resuming its search for the next element starting from the one it got in the previous round. There are several loops of this kind in linux. In case of this thread no action is taken on the elements as they are found. The race condition occurs when bus_find_device resumes its search from a device that has been unregistered. Because device_unregister resets klist_bus in the device, bus_find device cannot resume from where it left off in the klist. The sequence is device_unregister, device_del, bus_remove_device, klist_del(dev-p-knode_bus.). Francesco -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: kernel crash in bus_find_device
On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote: Hi Guenter, thank you for your reply. I will check out the changes that you pointed to. The problem we are seeing is a race condition between for_each_pci_dev (or similar) and device_unregisters. I am not sure if use of the new lock should be extended to all code using for_each_pci_dev as well. pci_scan is a kernel thread that I used for testing purposes, to mimick the dynamics that we saw in our crashes in edac_pci_clear_parity_errors: for (;;) { pci_dev = NULL; while ((pci_dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pci_dev)) != NULL) ; } It keeps traversing klist_devices in pci_bus_type using bus_find_device, costantly resuming its search for the next element starting from the one it got in the previous round. There are several loops of this kind in linux. In case of this thread no action is taken on the elements as they are found. The race condition occurs when bus_find_device resumes its search from a device that has been unregistered. Because device_unregister resets klist_bus in the device, bus_find device cannot resume from where it left off in the klist. The sequence is device_unregister, device_del, bus_remove_device, klist_del(dev-p-knode_bus.). Hmmm ... sounds more like a generic problem, not specifically related to pci. Essentially everything calling bus_find_device() with a starting device which has been removed (though only pci and scsi seem to be doing that in practice). Can you reproduce the problem with the latest kernel ? Also, can you send me the entire file with the kernel thread you mentioned above ? Maybe I can reproduce the problem here. Thanks, Guenter -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/