Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 16:44 -0700, ysgrifennodd Andrew Morton: > Let me restore the words from my earlier email which you removed so that > you could say that: > > For you the driver author to make assumptions about what's happening > inside pci_set_mwi() is a layering violation. Maybe the bridge got > hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI > error. For example, take a look at the various implementations of > pci_ops.read() around the place - various of them can fail for various > reasons. Let me repeat what I said before. As a driver author I do not care. It doesn't matter if it failed because it is not supported or because a pink elephant went for a dance on the PCI bus. > Now it could be that an appropriate solution is to make pci_set_mwi() > return only 0 or 1, and to generate a warning from within pci_set_mwi() > if some unexpected error happens. In which case it is legitimate for > callers to not check for errors. That would be my belief, and ditto for a lot of these other functions - even the correctly __must_check ones like pci_set_master should do the error reporting in the set_master() function etc not in every driver. That gives us a single consistent printk and avoids missing them out or bloat. Alan - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 17:16 -0700, ysgrifennodd David Brownell: > Signed-off-by: David Brownell <[EMAIL PROTECTED]> Acked-by: Alan Cox <[EMAIL PROTECTED]> > > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars( > void pci_disable_device(struct pci_dev *dev); > void pci_set_master(struct pci_dev *dev); > #define HAVE_PCI_SET_MWI > -int __must_check pci_set_mwi(struct pci_dev *dev); > +int pci_set_mwi(struct pci_dev *dev); > void pci_clear_mwi(struct pci_dev *dev); > void pci_intx(struct pci_dev *dev, int enable); > int pci_set_dma_mask(struct pci_dev *dev, u64 mask); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 18:10 -0700, ysgrifennodd Andrew Morton: > Question is, should pci_set_mwi() ever return -EFOO? I guess it should, in > the case where setting the line size didn't work out. It does no harm, no driver will ever check anything but 0 v !0 because the handling is no different in either case. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sunday 15 October 2006 6:10 pm, Andrew Morton wrote: > - A printk if something went bad Where "bad" would I hope exclude cases where the device doesn't support MWI ... that is, the "go faster if you can" advice from the driver, where it isn't an error to run into the common case of _this_ implementation not happening to be able to issue MWI cycles. The other cases, where something actually went wrong, would be reasonable for emitting KERN_ERR or KERN_WARNING messages. - Dave - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Mon, 16 Oct 2006 10:44:30 +1000 Paul Mackerras <[EMAIL PROTECTED]> wrote: > Andrew Morton writes: > > > Let me restore the words from my earlier email which you removed so that > > you could say that: > > > > For you the driver author to make assumptions about what's happening > > inside pci_set_mwi() is a layering violation. Maybe the bridge got > > hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI > > error. For example, take a look at the various implementations of > > pci_ops.read() around the place - various of them can fail for various > > reasons. > > Maybe aliens are firing a ray-gun at the card. I think it's > fundamentally wrong for the driver to deny service completely because > of a maybe. > > Either there was a transient error that only affected the attempt to > set MWI, in which case a printk (inside pci_set_mwi!) is appropriate, > and we carry on. Or there is a persistent error condition, in which > case the driver will see something else fail soon enough - something > that the driver actually needs to have working in order to operate - > and fail at that point. > > For the driver to stop and refuse to go any further because of an > error in pci_set_mwi has far more disadvantages than advantages. > Sure. So I think what we're needing in this case is: - A modified version of Willy's patch which returns 0 if MWI was enabled, 1 if MWI isn't available. - A printk if something went bad It appears that the various functions which try to match the line sizes already have printks if something went wrong, but they're using KERN_DEBUG facility level and that warning would dupliate the new one in pci_set_mwi(). And although the various implementations of pci_read_config_foo() and pci_write_config_foo() can return error codes, we have so many instances where we're not checking it, I don't think it's practical to try to start doing that everywhere. - drop the __must_check. Question is, should pci_set_mwi() ever return -EFOO? I guess it should, in the case where setting the line size didn't work out. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Andrew Morton writes: > Let me restore the words from my earlier email which you removed so that > you could say that: > > For you the driver author to make assumptions about what's happening > inside pci_set_mwi() is a layering violation. Maybe the bridge got > hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI > error. For example, take a look at the various implementations of > pci_ops.read() around the place - various of them can fail for various > reasons. Maybe aliens are firing a ray-gun at the card. I think it's fundamentally wrong for the driver to deny service completely because of a maybe. Either there was a transient error that only affected the attempt to set MWI, in which case a printk (inside pci_set_mwi!) is appropriate, and we carry on. Or there is a persistent error condition, in which case the driver will see something else fail soon enough - something that the driver actually needs to have working in order to operate - and fail at that point. For the driver to stop and refuse to go any further because of an error in pci_set_mwi has far more disadvantages than advantages. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, 15 Oct 2006 17:16:35 -0700 David Brownell <[EMAIL PROTECTED]> wrote: > > > You, the driver author _do not know_ what pci_set_mwi() does at present, on > > all platforms, nor do you know what it does in the future. > > I know that it enables MWI accesses ... or fails. Beyond that, there > should be no reason to care. If the hardware can use a lower-overhead > type of PCI bus cycle, I want it to do so. If not, no sweat. > There are two reasons why it can fail: 1: The bus doesn't support MWI. Here, the caller doesn't care. 2: The bus _does_ support MWI, but the attempt to enable it failed. Here we very much do care, because we're losing performance. > > > This is not a terribly important issue, and it is far from the worst case > > of missed error-checking which we have in there. > > The reason I think it's important enough to continue this discussion is > that as it currently stands, it's a good example of a **BAD** interface > design ... since it's pointlessly marked as must_check. (See appended > patch to fix that issue.) It's important to continue this discussion so that certain principles can be set and agreed to. Because we have a *lot* of unchecked errors in there. We would benefit from setting guidelines establishing - Which sorts of errors should be handled in callers - Which sorts of errors should be handled (ie: just reported) in callees - Which sorts of errors should be handled in neither callers nor callees (are there any of these?) - Whether is it ever legitimate for a caller to not check the return code from a callee which can return -EFOO. (I suspect not - it probably indicates a misdesign in the callee, as in this case). - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
> > If the drivers doesn't care and if it makes no difference to performance > > then just delete the call to pci_set_mwi(). > > > > But if MWI _does_ make a difference to performance then we should tell > > someone that it isn't working rather than silently misbehaving? To repeat: it's not "misbehaving" since support for MWI cycles is completely optional. It'd be more interesting to get messages in the cases that it can be enabled, since typically it can't be. > That sounds like we need a printk inside pci_set_mwi then, rather than > adding a printk to every single caller. Maybe wrapped in #ifdef CONFIG_SPAM_MY_KERNEL_LOG_MESSAGES ... :) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
> > I agree that set_mwo() should set MWI if possible, and fail cleanly > > if it couldn't (for whatever reason). Thing is, choosing to treat > > that as an error must be the _driver's_ choice ... it'd be wrong to force > > that policy into the _interface_ by forcing must_check etc. > > No. If pci_set_mwi() detects an unexpected error then the driver should > take some action: report it, recover from it, fail to load, etc. If the > driver fails to do any of this then it's a buggy driver. But what is an "unexpected" "error"? Not every fault that's unexpected is an error; consider a page fault. Consider also kfree(NULL). The same motivation for drivers not needing to validate that parameter is behind arguing that device drivers should not need to poke around in PCI config space to verify that the device supports MWI; and should have the flexibility to ignore the return code, just like kfree() returns no value. > You, the driver author _do not know_ what pci_set_mwi() does at present, on > all platforms, nor do you know what it does in the future. I know that it enables MWI accesses ... or fails. Beyond that, there should be no reason to care. If the hardware can use a lower-overhead type of PCI bus cycle, I want it to do so. If not, no sweat. > This is not a terribly important issue, and it is far from the worst case > of missed error-checking which we have in there. The reason I think it's important enough to continue this discussion is that as it currently stands, it's a good example of a **BAD** interface design ... since it's pointlessly marked as must_check. (See appended patch to fix that issue.) If you wouldn't want all callers of kfree() to say "if (ptr) kfree(ptr);"; why then would anyone want to demand ... read the pci config space byte (and cleanly handle errors) ... ... check for the MWI bit ... ... if it's set (so we "expect" this next call to succeed) then: ... call pci_set_mwi() ... ... test set_mwi() value ... ... ignore that value ... where the first three lines duplicate code _inside_ pci_set_mwi(), and the last two lines are obviously a pure waste of code and effort. I'd want to know the criteria by which "if(ptr)" is judged a waste of effort in all callers, but that more extensive PCI configspace logic was not... - Dave CUT HERE Remove bogus annotation of pci_set_mwi() as __must_check. It's completely reasonable for drivers to not care about the return code, when all they're doing is enabling an optional performance-improving mechanism that's often not even available. Signed-off-by: David Brownell <[EMAIL PROTECTED]> --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars( void pci_disable_device(struct pci_dev *dev); void pci_set_master(struct pci_dev *dev); #define HAVE_PCI_SET_MWI -int __must_check pci_set_mwi(struct pci_dev *dev); +int pci_set_mwi(struct pci_dev *dev); void pci_clear_mwi(struct pci_dev *dev); void pci_intx(struct pci_dev *dev, int enable); int pci_set_dma_mask(struct pci_dev *dev, u64 mask); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Mon, 16 Oct 2006 10:00:25 +1000 Paul Mackerras <[EMAIL PROTECTED]> wrote: > Andrew Morton writes: > > > If the drivers doesn't care and if it makes no difference to performance > > then just delete the call to pci_set_mwi(). > > > > But if MWI _does_ make a difference to performance then we should tell > > someone that it isn't working rather than silently misbehaving? > > That sounds like we need a printk inside pci_set_mwi then, rather than > adding a printk to every single caller. > I think so, yes. That's a good solution to a lot of these things. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Andrew Morton writes: > If the drivers doesn't care and if it makes no difference to performance > then just delete the call to pci_set_mwi(). > > But if MWI _does_ make a difference to performance then we should tell > someone that it isn't working rather than silently misbehaving? That sounds like we need a printk inside pci_set_mwi then, rather than adding a printk to every single caller. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Mon, 16 Oct 2006 01:02:40 +0100 Alan Cox <[EMAIL PROTECTED]> wrote: > Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton: > > No. If pci_set_mwi() detects an unexpected error then the driver should > > take some action: report it, recover from it, fail to load, etc. If the > > driver fails to do any of this then it's a buggy driver. > > Wrong and there are several drivers in the kernel that are proof of > this. Let me restore the words from my earlier email which you removed so that you could say that: For you the driver author to make assumptions about what's happening inside pci_set_mwi() is a layering violation. Maybe the bridge got hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI error. For example, take a look at the various implementations of pci_ops.read() around the place - various of them can fail for various reasons. > > You, the driver author _do not know_ what pci_set_mwi() does at present, on > > all platforms, nor do you know what it does in the future. For you the > > You don't care. It isn't an error for set_mwi to fail. In fact the only > reason set_mwi even needs to bother with a return code is that some > chips want you to set other config private to the device if it is > available and active. > Let me restore the words from my earlier email which you removed which address that: Now it could be that an appropriate solution is to make pci_set_mwi() return only 0 or 1, and to generate a warning from within pci_set_mwi() if some unexpected error happens. In which case it is legitimate for callers to not check for errors. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton: > No. If pci_set_mwi() detects an unexpected error then the driver should > take some action: report it, recover from it, fail to load, etc. If the > driver fails to do any of this then it's a buggy driver. Wrong and there are several drivers in the kernel that are proof of this. > You, the driver author _do not know_ what pci_set_mwi() does at present, on > all platforms, nor do you know what it does in the future. For you the You don't care. It isn't an error for set_mwi to fail. In fact the only reason set_mwi even needs to bother with a return code is that some chips want you to set other config private to the device if it is available and active. Alan - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, 15 Oct 2006 15:45:58 -0700 David Brownell <[EMAIL PROTECTED]> wrote: > > In that case its interface is misdesigned, because it doesn't discriminate > > between "yes-it-does/no-it-doesn't" (which we don't want to report, because > > either is expected and legitimate) and "something screwed up", which we do > > want to report, because it is always unexpected. > > You mis-understand. It's completely legit for the driver not to care. > > I agree that set_mwo() should set MWI if possible, and fail cleanly > if it couldn't (for whatever reason). Thing is, choosing to treat > that as an error must be the _driver's_ choice ... it'd be wrong to force > that policy into the _interface_ by forcing must_check etc. No. If pci_set_mwi() detects an unexpected error then the driver should take some action: report it, recover from it, fail to load, etc. If the driver fails to do any of this then it's a buggy driver. You, the driver author _do not know_ what pci_set_mwi() does at present, on all platforms, nor do you know what it does in the future. For you the driver author to make assumptions about what's happening inside pci_set_mwi() is a layering violation. Maybe the bridge got hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI error. For example, take a look at the various implementations of pci_ops.read() around the place - various of them can fail for various reasons. Now it could be that an appropriate solution is to make pci_set_mwi() return only 0 or 1, and to generate a warning from within pci_set_mwi() if some unexpected error happens. In which case it is legitimate for callers to not check for errors. This is not a terribly important issue, and it is far from the worst case of missed error-checking which we have in there. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
> > Most drivers should be able to say "enable MWI if possible, but > > don't worry if it's not possible". Only a few controllers need > > additional setup to make MWI actually work ... if they couldn't > > do that setup, that'd be worth a warning before they backed off > > to run in a non-MWI mode. > > > > So the semantics of pci_set_mwi() are "try to set MWI if this > platform/device supports it". Not what I said ... that's what the _driver_ usually wants to do, which is different from the step implemented by set_mwi(). What Alan Cox said is a better paraphrase: > MWI is an "extra cheese" option not a "no pizza" case Or "sorry, that car is not available in olive, just burgundy." Not: > In that case its interface is misdesigned, because it doesn't discriminate > between "yes-it-does/no-it-doesn't" (which we don't want to report, because > either is expected and legitimate) and "something screwed up", which we do > want to report, because it is always unexpected. You mis-understand. It's completely legit for the driver not to care. I agree that set_mwo() should set MWI if possible, and fail cleanly if it couldn't (for whatever reason). Thing is, choosing to treat that as an error must be the _driver's_ choice ... it'd be wrong to force that policy into the _interface_ by forcing must_check etc. - Dave - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 10:45 -0700, ysgrifennodd Andrew Morton: > If the drivers doesn't care and if it makes no difference to performance > then just delete the call to pci_set_mwi(). > > But if MWI _does_ make a difference to performance then we should tell > someone that it isn't working rather than silently misbehaving? It isn't misbehaving it just isn't available. MWI is rather different to say pci_set_master() in that it makes a lot of sense for many drivers to ask for it but not care about the results. Something like pci_set_master failing is a big problem and has to be handled (although as we often don't use BIOS PCI services we see fake success in some cases). MWI is an "extra cheese" option not a "no pizza" case Alan - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, 15 Oct 2006 12:16:31 -0700 David Brownell <[EMAIL PROTECTED]> wrote: > (From Alan Cox:) > > The underlying bug is that someone marked pci_set_mwi must-check, that's > > wrong for most of the drivers that use it. If you remove the must check > > annotation from it then the problem and a thousand other spurious > > warnings go away. > > Yes, there seems to be abuse of this new "must_check" feature. > > > (From Andrew Morton:) > > But if MWI _does_ make a difference to performance then we should tell > > someone that it isn't working rather than silently misbehaving? > > Thing is, a "difference to performance (alone)" != "misbehavior". > > If it affected correctness, then a warning would be appropriate. > > Most drivers should be able to say "enable MWI if possible, but > don't worry if it's not possible". Only a few controllers need > additional setup to make MWI actually work ... if they couldn't > do that setup, that'd be worth a warning before they backed off > to run in a non-MWI mode. > So the semantics of pci_set_mwi() are "try to set MWI if this platform/device supports it". In that case its interface is misdesigned, because it doesn't discriminate between "yes-it-does/no-it-doesn't" (which we don't want to report, because either is expected and legitimate) and "something screwed up", which we do want to report, because it is always unexpected. So an appropriate return value protocol would be 0: No error, unable to set MWI 1: No error, able to set MWI -EFOO: Error - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
(From Alan Cox:) > The underlying bug is that someone marked pci_set_mwi must-check, that's > wrong for most of the drivers that use it. If you remove the must check > annotation from it then the problem and a thousand other spurious > warnings go away. Yes, there seems to be abuse of this new "must_check" feature. (From Andrew Morton:) > But if MWI _does_ make a difference to performance then we should tell > someone that it isn't working rather than silently misbehaving? Thing is, a "difference to performance (alone)" != "misbehavior". If it affected correctness, then a warning would be appropriate. Most drivers should be able to say "enable MWI if possible, but don't worry if it's not possible". Only a few controllers need additional setup to make MWI actually work ... if they couldn't do that setup, that'd be worth a warning before they backed off to run in a non-MWI mode. - Dave - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, 15 Oct 2006 07:54:41 -0600 Matthew Wilcox <[EMAIL PROTECTED]> wrote: > On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote: > > It seems to have stopped happening. I don't know why. > > Argh. Possibly a mistake during the bisect procedure? I don't think so - I spent a lot of time fiddling with that because it was so bizarre to have two patches each of which caused the same failure. Something has changed, perhaps config: the failure is a bit different now (happens earlier). > > gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch > > still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg > > I believe that; I was going to generate a new patch for that yesterday, > but got distracted trying to debug your other problem. I'll put out a > new version of that patch later today. ok.. Add plenty of debug printks to it. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, 15 Oct 2006 07:57:56 -0600 Matthew Wilcox <[EMAIL PROTECTED]> wrote: > On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote: > > Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell: > > > Since it's not an error, there should be no such printk ... which > > > is exactly how it's coded above. > > > > The underlying bug is that someone marked pci_set_mwi must-check, that's > > wrong for most of the drivers that use it. If you remove the must check > > annotation from it then the problem and a thousand other spurious > > warnings go away. > > There's only about 20 users of pci_set_mwi ... about 12 of them seem to > check it, one of them uses a variable called > compiler_warning_pointless_fix which leaves about 7 warnings to be > removed by removing the __must_check. > > However, I do believe the __must_check should be removed. For example, > the LSI 53c1030 has *nothing* to be done if setting MWI fails. It just > doesn't work, and the device copes. If the drivers doesn't care and if it makes no difference to performance then just delete the call to pci_set_mwi(). But if MWI _does_ make a difference to performance then we should tell someone that it isn't working rather than silently misbehaving? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote: > Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell: > > Since it's not an error, there should be no such printk ... which > > is exactly how it's coded above. > > The underlying bug is that someone marked pci_set_mwi must-check, that's > wrong for most of the drivers that use it. If you remove the must check > annotation from it then the problem and a thousand other spurious > warnings go away. There's only about 20 users of pci_set_mwi ... about 12 of them seem to check it, one of them uses a variable called compiler_warning_pointless_fix which leaves about 7 warnings to be removed by removing the __must_check. However, I do believe the __must_check should be removed. For example, the LSI 53c1030 has *nothing* to be done if setting MWI fails. It just doesn't work, and the device copes. It's not like Tulip or sym53c8xx where there are additional bits to be set or cleared in control registers. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sun, Oct 15, 2006 at 12:08:09AM -0700, David Brownell wrote: > > But the only effect of returning EINVAL is a printk (for this particular > > driver): > > > > /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) > > */ > > retval = pci_set_mwi(pdev); > > if (!retval) > > ehci_dbg(ehci, "MWI active\n"); > > Erm, I've lost context here but it's completely legit for hardware > to NOT support MWI, so it is in no way an error if it's not set. > (Memory-Write-Invalidate is just a more efficient way to write data > that may be cached; if the device can't issue those cycles, there's > no loss of correctness.) > > Since it's not an error, there should be no such printk ... which > is exactly how it's coded above. > > Who is issuing the printk on a non-error code path?? Er, that would be the EHCI driver, which you wrote ... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell: > Since it's not an error, there should be no such printk ... which > is exactly how it's coded above. The underlying bug is that someone marked pci_set_mwi must-check, that's wrong for most of the drivers that use it. If you remove the must check annotation from it then the problem and a thousand other spurious warnings go away. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote: > It seems to have stopped happening. I don't know why. Argh. Possibly a mistake during the bisect procedure? > gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch > still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg I believe that; I was going to generate a new patch for that yesterday, but got distracted trying to debug your other problem. I'll put out a new version of that patch later today. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
> But the only effect of returning EINVAL is a printk (for this particular > driver): > > /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */ > retval = pci_set_mwi(pdev); > if (!retval) > ehci_dbg(ehci, "MWI active\n"); Erm, I've lost context here but it's completely legit for hardware to NOT support MWI, so it is in no way an error if it's not set. (Memory-Write-Invalidate is just a more efficient way to write data that may be cached; if the device can't issue those cycles, there's no loss of correctness.) Since it's not an error, there should be no such printk ... which is exactly how it's coded above. Who is issuing the printk on a non-error code path?? - Dave - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sat, 14 Oct 2006 21:20:01 -0600 Matthew Wilcox <[EMAIL PROTECTED]> wrote: > On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote: > > On Sat, 14 Oct 2006 08:02:49 -0600 > > Matthew Wilcox <[EMAIL PROTECTED]> wrote: > > > > > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote: > > > > Bisection shows that this patch > > > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) > > > > breaks > > > > suspend-to-disk on my Vaio. It writes the suspend image and gets to the > > > > point where it's supposed to power down, but doesn't. > > > > > > How odd. What driver is calling pci_set_mwi() on the suspend path? > > > > ehci_pci_reinit(). I stuck a dump_stack() in there. See > > http://userweb.kernel.org/~akpm/s5000342.jpg > > Thanks for the picture; that's really helpful. > > I see. We hibernate all the devices then wake them all back up again. > No doubt there's a good reason for this. > > Still doesn't make much sense, though. As far as I can see, the only > consequence of this particular patch is that 1) we do an additional read > from PCI_COMMAND and 2) we can return -EINVAL in one additional case. > But the only effect of returning EINVAL is a printk (for this particular > driver): > > /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */ > retval = pci_set_mwi(pdev); > if (!retval) > ehci_dbg(ehci, "MWI active\n"); > > ehci_port_power(ehci, 0); > > return 0; > > So even if we return EINVAL ... big deal. > > Is it possible reading PCI_COMMAND too quickly after writing it causes > a foul-up? That would be weird ... > > so I suppose there's a few things I can ask you to try: It seems to have stopped happening. I don't know why. gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote: > On Sat, 14 Oct 2006 08:02:49 -0600 > Matthew Wilcox <[EMAIL PROTECTED]> wrote: > > > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote: > > > Bisection shows that this patch > > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) > > > breaks > > > suspend-to-disk on my Vaio. It writes the suspend image and gets to the > > > point where it's supposed to power down, but doesn't. > > > > How odd. What driver is calling pci_set_mwi() on the suspend path? > > ehci_pci_reinit(). I stuck a dump_stack() in there. See > http://userweb.kernel.org/~akpm/s5000342.jpg Thanks for the picture; that's really helpful. I see. We hibernate all the devices then wake them all back up again. No doubt there's a good reason for this. Still doesn't make much sense, though. As far as I can see, the only consequence of this particular patch is that 1) we do an additional read from PCI_COMMAND and 2) we can return -EINVAL in one additional case. But the only effect of returning EINVAL is a printk (for this particular driver): /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */ retval = pci_set_mwi(pdev); if (!retval) ehci_dbg(ehci, "MWI active\n"); ehci_port_power(ehci, 0); return 0; So even if we return EINVAL ... big deal. Is it possible reading PCI_COMMAND too quickly after writing it causes a foul-up? That would be weird ... so I suppose there's a few things I can ask you to try: 1. Stop reading the register back altogether. This should revert the behaviour to the prepatch state: - pci_read_config_word(dev, PCI_COMMAND, &cmd); +// pci_read_config_word(dev, PCI_COMMAND, &cmd); 2. Put an mdelay(1); before that line 3. Change the last line to just return 0. - return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL; + return 0; > > What drivers do you have loaded on the Vaio? > > sony:/home/akpm> lsmod I don't see any of the other drivers calling pci_set_mwi, so i guess we're looking at the right suspect. I feel rather guilty about the amount of time you're spending on this; any bugs you want me to look at as penance? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Sat, 14 Oct 2006 08:02:49 -0600 Matthew Wilcox <[EMAIL PROTECTED]> wrote: > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote: > > Bisection shows that this patch > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks > > suspend-to-disk on my Vaio. It writes the suspend image and gets to the > > point where it's supposed to power down, but doesn't. > > How odd. What driver is calling pci_set_mwi() on the suspend path? ehci_pci_reinit(). I stuck a dump_stack() in there. See http://userweb.kernel.org/~akpm/s5000342.jpg > What drivers do you have loaded on the Vaio? sony:/home/akpm> lsmod Module Size Used by ipw2200 163184 0 sonypi 21484 0 autofs419908 1 hidp 16192 2 l2cap 21764 5 hidp bluetooth 49060 2 hidp,l2cap sunrpc154172 1 ip_conntrack_netbios_ns 3328 0 ipt_REJECT 4736 1 xt_state2496 2 ip_conntrack 51020 2 ip_conntrack_netbios_ns,xt_state nfnetlink 7128 1 ip_conntrack xt_tcpudp 3392 4 iptable_filter 3264 1 ip_tables 12616 1 iptable_filter x_tables 15428 4 ipt_REJECT,xt_state,xt_tcpudp,ip_tables video 16836 0 sony_acpi 7312 0 sbs15968 0 i2c_ec 5504 1 sbs button 7184 0 battery10692 0 asus_acpi 17564 0 backlight 6464 2 sony_acpi,asus_acpi ac 5636 0 nvram 8072 0 ohci1394 33264 0 ehci_hcd 30088 0 ieee1394 291896 1 ohci1394 uhci_hcd 22092 0 sg 33628 0 joydev 9920 0 evbug 3392 0 snd_hda_intel 18968 0 snd_hda_codec 161536 1 snd_hda_intel snd_seq_dummy 4228 0 snd_seq_oss31744 0 snd_seq_midi_event 7360 1 snd_seq_oss snd_seq48208 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event snd_seq_device 8524 3 snd_seq_dummy,snd_seq_oss,snd_seq ieee80211 30920 1 ipw2200 snd_pcm_oss41504 0 snd_mixer_oss 16640 1 snd_pcm_oss ieee80211_crypt 6016 1 ieee80211 snd_pcm74632 3 snd_hda_intel,snd_hda_codec,snd_pcm_oss snd_timer 21316 2 snd_seq,snd_pcm snd50980 9 snd_hda_intel,snd_hda_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer soundcore 7968 1 snd snd_page_alloc 10376 2 snd_hda_intel,snd_pcm piix9604 0 [permanent] i2c_i8017820 0 pcspkr 3136 0 i2c_core 21840 2 i2c_ec,i2c_i801 generic 5252 0 [permanent] ext3 127688 1 jbd52712 1 ext3 ide_disk 16000 0 ide_core 114780 3 piix,generic,ide_disk - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote: > Bisection shows that this patch > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks > suspend-to-disk on my Vaio. It writes the suspend image and gets to the > point where it's supposed to power down, but doesn't. How odd. What driver is calling pci_set_mwi() on the suspend path? What drivers do you have loaded on the Vaio? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote: > On Fri, 06 Oct 2006 13:05:18 -0600 > Matthew Wilcox <[EMAIL PROTECTED]> wrote: > > > Since some devices may not implement the MWI bit, we should check that > > the write did set it and return an error if it didn't. > > > > Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]> > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index a544997..3d041f4 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -900,13 +900,17 @@ #endif > > return rc; > > > > pci_read_config_word(dev, PCI_COMMAND, &cmd); > > - if (! (cmd & PCI_COMMAND_INVALIDATE)) { > > - pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", > > pci_name(dev)); > > - cmd |= PCI_COMMAND_INVALIDATE; > > - pci_write_config_word(dev, PCI_COMMAND, cmd); > > - } > > - > > - return 0; > > + if (cmd & PCI_COMMAND_INVALIDATE) > > + return 0; > > + > > + pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev)); > > + cmd |= PCI_COMMAND_INVALIDATE; > > + pci_write_config_word(dev, PCI_COMMAND, cmd); > > + > > + /* read result from hardware (in case bit refused to enable) */ > > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > > + > > + return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL; > > } > > > > /** > > Bisection shows that this patch > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks > suspend-to-disk on my Vaio. It writes the suspend image and gets to the > point where it's supposed to power down, but doesn't. > > After a manual power-cycle it successfully resumes from disk, but > networking (at least) is dead. Ok, I'll drop this from my tree too. Matthew, let me know whn you have a revised patch you wish to have me include. thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
On Fri, 06 Oct 2006 13:05:18 -0600 Matthew Wilcox <[EMAIL PROTECTED]> wrote: > Since some devices may not implement the MWI bit, we should check that > the write did set it and return an error if it didn't. > > Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index a544997..3d041f4 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -900,13 +900,17 @@ #endif > return rc; > > pci_read_config_word(dev, PCI_COMMAND, &cmd); > - if (! (cmd & PCI_COMMAND_INVALIDATE)) { > - pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", > pci_name(dev)); > - cmd |= PCI_COMMAND_INVALIDATE; > - pci_write_config_word(dev, PCI_COMMAND, cmd); > - } > - > - return 0; > + if (cmd & PCI_COMMAND_INVALIDATE) > + return 0; > + > + pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev)); > + cmd |= PCI_COMMAND_INVALIDATE; > + pci_write_config_word(dev, PCI_COMMAND, cmd); > + > + /* read result from hardware (in case bit refused to enable) */ > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > + > + return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL; > } > > /** Bisection shows that this patch (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks suspend-to-disk on my Vaio. It writes the suspend image and gets to the point where it's supposed to power down, but doesn't. After a manual power-cycle it successfully resumes from disk, but networking (at least) is dead. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
Matthew Wilcox wrote: Since some devices may not implement the MWI bit, we should check that the write did set it and return an error if it didn't. Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]> ACK, though (as it seems you are doing) you should audit pci_set_mwi() users to make sure behavior matches up with this implementation Jeff - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] [PCI] Check that MWI bit really did get set
Since some devices may not implement the MWI bit, we should check that the write did set it and return an error if it didn't. Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a544997..3d041f4 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -900,13 +900,17 @@ #endif return rc; pci_read_config_word(dev, PCI_COMMAND, &cmd); - if (! (cmd & PCI_COMMAND_INVALIDATE)) { - pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev)); - cmd |= PCI_COMMAND_INVALIDATE; - pci_write_config_word(dev, PCI_COMMAND, cmd); - } - - return 0; + if (cmd & PCI_COMMAND_INVALIDATE) + return 0; + + pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev)); + cmd |= PCI_COMMAND_INVALIDATE; + pci_write_config_word(dev, PCI_COMMAND, cmd); + + /* read result from hardware (in case bit refused to enable) */ + pci_read_config_word(dev, PCI_COMMAND, &cmd); + + return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL; } /** - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html