Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
On Wed, 2007-10-17 at 08:31 -0700, Stephen Hemminger wrote: > Please don't use double underscore, for this function name. There is no > reason to not make it a normal API call. > > The sky2 fix I am working on will use napi_synchronize as well. Allright. A compiler barrier in the !SMP case makes sense, but I would still want an smp_mb() before the test_bit. I think it's a bug in synchronize_irq not to have it. Cheers, Ben. > --- a/include/linux/netdevice.h 2007-10-16 16:48:20.0 -0700 > +++ b/include/linux/netdevice.h 2007-10-17 08:29:55.0 -0700 > @@ -407,6 +407,24 @@ static inline void napi_enable(struct na > clear_bit(NAPI_STATE_SCHED, &n->state); > } > > +#ifdef CONFIG_SMP > +/** > + * napi_synchronize - wait until NAPI is not running > + * @n: napi context > + * > + * Wait until NAPI is done being scheduled on this context. > + * Any outstanding processing completes but > + * does not disable future activations. > + */ > +static inline void napi_synchronize(const struct napi_struct *n) > +{ > + while (test_bit(NAPI_STATE_SCHED, &n->state)) > + msleep(1); > +} > +#else > +# define napi_synchronize(n) barrier() > +#endif > + > /* > * The DEVICE structure. > * Actually, this whole structure is a big mistake. It mixes I/O - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
Please don't use double underscore, for this function name. There is no reason to not make it a normal API call. The sky2 fix I am working on will use napi_synchronize as well. --- a/include/linux/netdevice.h 2007-10-16 16:48:20.0 -0700 +++ b/include/linux/netdevice.h 2007-10-17 08:29:55.0 -0700 @@ -407,6 +407,24 @@ static inline void napi_enable(struct na clear_bit(NAPI_STATE_SCHED, &n->state); } +#ifdef CONFIG_SMP +/** + * napi_synchronize - wait until NAPI is not running + * @n: napi context + * + * Wait until NAPI is done being scheduled on this context. + * Any outstanding processing completes but + * does not disable future activations. + */ +static inline void napi_synchronize(const struct napi_struct *n) +{ + while (test_bit(NAPI_STATE_SCHED, &n->state)) + msleep(1); +} +#else +# define napi_synchronize(n) barrier() +#endif + /* * The DEVICE structure. * Actually, this whole structure is a big mistake. It mixes I/O - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
\ > > Note: unfortunately, Jeff already picked up the EMAC patch without > > waiting for this to be sorted out (oops...). So if you agree with > > this patch, it would be nice to have it go in quickly or maybe via > > Jeff's tree to avoid breakage ? Not terribly important tho. > > > Sorry, I thought that was the way everybody was headed. With the driver > broken /anyway/, I just sorta threw it into the pile of fixes. > > It's upstream now, so let me know if you want to revert or move forward > from here... Let's just move forward. I can always re-fix the driver :-) Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
Benjamin Herrenschmidt wrote: net: Add __napi_synchronize() to sync with napi poll The EMAC driver which needs to handle multiple devices with one NAPI instance implements its own per-channel disable bit. However, when setting such a bit, it needs to synchronize with the poller (that is make sure that any pending poller instance has completed, or is started late enough to see that disable bit). This implements a low level __napi_synchronize() function to acheive that. The underscores are to emphasis the low level aspect of it and to discourage driver writers who don't know what they are doing to use it (to please DaveM :-) Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- Switched to do a cpu_relax() spin instead and only on SMP. Not that we have an smp_mb() in there which synchronize_irq() lacks. I believe this is a bug in synchronize_irq() which I'll handle separately. Note: unfortunately, Jeff already picked up the EMAC patch without waiting for this to be sorted out (oops...). So if you agree with this patch, it would be nice to have it go in quickly or maybe via Jeff's tree to avoid breakage ? Not terribly important tho. Sorry, I thought that was the way everybody was headed. With the driver broken /anyway/, I just sorta threw it into the pile of fixes. It's upstream now, so let me know if you want to revert or move forward from here... Jeff - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
net: Add __napi_synchronize() to sync with napi poll The EMAC driver which needs to handle multiple devices with one NAPI instance implements its own per-channel disable bit. However, when setting such a bit, it needs to synchronize with the poller (that is make sure that any pending poller instance has completed, or is started late enough to see that disable bit). This implements a low level __napi_synchronize() function to acheive that. The underscores are to emphasis the low level aspect of it and to discourage driver writers who don't know what they are doing to use it (to please DaveM :-) Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- Switched to do a cpu_relax() spin instead and only on SMP. Not that we have an smp_mb() in there which synchronize_irq() lacks. I believe this is a bug in synchronize_irq() which I'll handle separately. Note: unfortunately, Jeff already picked up the EMAC patch without waiting for this to be sorted out (oops...). So if you agree with this patch, it would be nice to have it go in quickly or maybe via Jeff's tree to avoid breakage ? Not terribly important tho. Index: linux-work/include/linux/netdevice.h === --- linux-work.orig/include/linux/netdevice.h 2007-10-17 12:39:40.0 +1000 +++ linux-work/include/linux/netdevice.h2007-10-17 12:40:59.0 +1000 @@ -394,6 +394,23 @@ static inline void napi_disable(struct n } /** + * __napi_synchronize - synchronize with a concurrent poll + * @n: napi context + * + * Synchronizes with a concurrent poll. Not to be used in normal + * drivers, mostly useful if you end up with multiple interfaces + * on one NAPI instance. + */ +static inline void __napi_synchronize(struct napi_struct *n) +{ +#ifdef CONFIG_SMP + smp_mb(); + while (test_bit(NAPI_STATE_SCHED, &n->state)) + cpu_relax(); +#endif /* CONFIG_SMP */ +} + +/** * napi_enable - enable NAPI scheduling * @n: napi context * - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
napi: use non-interruptible sleep in napi_disable The current napi_disable() uses msleep_interruptible() but doesn't (and can't) exit in case there's a signal, thus ending up doing a hot spin without a cpu_relax. Use uninterruptible sleep instead. Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- Index: linux-work/include/linux/netdevice.h === --- linux-work.orig/include/linux/netdevice.h 2007-10-17 12:39:16.0 +1000 +++ linux-work/include/linux/netdevice.h2007-10-17 12:45:00.0 +1000 @@ -390,7 +390,7 @@ static inline void napi_complete(struct static inline void napi_disable(struct napi_struct *n) { while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) - msleep_interruptible(1); + msleep(1); } /** - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
> So this is really just like synchronize_irq()? Using msleep is bogus > because you want to spin, you are only waiting for a softirq on the other > cpu to finish. If you wait for a whole millisecond and sleep that > is far longer than the napi routine should take. > > You could even optimize it like synchronize_irq() for the non-SMP case. It's just like synchronize_irq() indeed. I used the mlseep() just like napi_disable() mostly because I use it in a very similar context, for disabling my sub-channels on things like link change etc... where I need to reconfigure parts of the chip. I prefer sleeping in my case but I agree that if somebody else was going to use for something else more performance critical, it might be an issue. On the other hand, spinning will not be nice for my usage scenario :-) I agree about the SMP optimisation though again, in my usage pattern, it's very unimportant (similar code path as napi_disable) I'll respin later today though. Cheers, Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
On Tue, 16 Oct 2007 15:49:52 +1000 Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote: > net: Add __napi_sycnhronize() to sync with napi poll > > The EMAC driver which needs to handle multiple devices with one > NAPI instance implements its own per-channel disable bit. However, > when setting such a bit, it needs to synchronize with the poller > (that is make sure that any pending poller instance has completed, > or is started late enough to see that disable bit). > > This implements a low level __napi_synchronize() function to acheive > that. The underscores are to emphasis the low level aspect of it and > to discourage driver writers who don't know what they are doing to > use it (to please DaveM :-) > > Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> > --- > > (Use correct address for Stephen this time) > > If the approach is accepted, I would like to have this merged now > so the EMAC patch to make it work again can follow :-) > > Note: I use msleep_interruptible(1); just like napi_disable(). However > I'm not too happy that the "hot" loop that results of a pending signal > here will spin without even a cpu_relax ... what do you guys think would > be the best way to handle this ? So this is really just like synchronize_irq()? Using msleep is bogus because you want to spin, you are only waiting for a softirq on the other cpu to finish. If you wait for a whole millisecond and sleep that is far longer than the napi routine should take. You could even optimize it like synchronize_irq() for the non-SMP case. -- Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]> Date: Tue, 16 Oct 2007 17:37:03 +1000 > > On Tue, 2007-10-16 at 14:06 +0800, Herbert Xu wrote: > > Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote: > > > > > > Note: I use msleep_interruptible(1); just like napi_disable(). However > > > I'm not too happy that the "hot" loop that results of a pending signal > > > here will spin without even a cpu_relax ... what do you guys think would > > > be the best way to handle this ? > > > > Well since the loop does not check signals at all, it should > > just use msleep. > > > > Granted the process will end up in the D state and contribute > > to the load average. But if this loop executes long enough > > for that to be noticed then we've got bigger problems to worry > > about. > > If Dave & Stephen agree, I'll send a patch changing napi_disable() too > then. I agree with the msleep() change. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
On Tue, 2007-10-16 at 14:06 +0800, Herbert Xu wrote: > Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote: > > > > Note: I use msleep_interruptible(1); just like napi_disable(). However > > I'm not too happy that the "hot" loop that results of a pending signal > > here will spin without even a cpu_relax ... what do you guys think would > > be the best way to handle this ? > > Well since the loop does not check signals at all, it should > just use msleep. > > Granted the process will end up in the D state and contribute > to the load average. But if this loop executes long enough > for that to be noticed then we've got bigger problems to worry > about. If Dave & Stephen agree, I'll send a patch changing napi_disable() too then. Cheers, Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote: > > Note: I use msleep_interruptible(1); just like napi_disable(). However > I'm not too happy that the "hot" loop that results of a pending signal > here will spin without even a cpu_relax ... what do you guys think would > be the best way to handle this ? Well since the loop does not check signals at all, it should just use msleep. Granted the process will end up in the D state and contribute to the load average. But if this loop executes long enough for that to be noticed then we've got bigger problems to worry about. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
net: Add __napi_sycnhronize() to sync with napi poll The EMAC driver which needs to handle multiple devices with one NAPI instance implements its own per-channel disable bit. However, when setting such a bit, it needs to synchronize with the poller (that is make sure that any pending poller instance has completed, or is started late enough to see that disable bit). This implements a low level __napi_synchronize() function to acheive that. The underscores are to emphasis the low level aspect of it and to discourage driver writers who don't know what they are doing to use it (to please DaveM :-) Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- (Use correct address for Stephen this time) If the approach is accepted, I would like to have this merged now so the EMAC patch to make it work again can follow :-) Note: I use msleep_interruptible(1); just like napi_disable(). However I'm not too happy that the "hot" loop that results of a pending signal here will spin without even a cpu_relax ... what do you guys think would be the best way to handle this ? Index: linux-work/include/linux/netdevice.h === --- linux-work.orig/include/linux/netdevice.h 2007-10-16 15:27:27.0 +1000 +++ linux-work/include/linux/netdevice.h2007-10-16 15:27:38.0 +1000 @@ -394,6 +394,21 @@ static inline void napi_disable(struct n } /** + * __napi_synchronize - synchronize with a concurrent poll + * @n: napi context + * + * Synchronizes with a concurrent poll. Not to be used in normal + * drivers, mostly useful if you end up with multiple interfaces + * on one NAPI instance. + */ +static inline void __napi_synchronize(struct napi_struct *n) +{ + smp_mb(); + while (test_bit(NAPI_STATE_SCHED, &n->state)) + msleep_interruptible(1); +} + +/** * napi_enable - enable NAPI scheduling * @n: napi context * - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] net: Add __napi_sycnhronize() to sync with napi poll
net: Add __napi_sycnhronize() to sync with napi poll The EMAC driver which needs to handle multiple devices with one NAPI instance implements its own per-channel disable bit. However, when setting such a bit, it needs to synchronize with the poller (that is make sure that any pending poller instance has completed, or is started late enough to see that disable bit). This implements a low level __napi_synchronize() function to acheive that. The underscores are to emphasis the low level aspect of it and to discourage driver writers who don't know what they are doing to use it (to please DaveM :-) Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- If the approach is accepted, I would like to have this merged now so the EMAC patch to make it work again can follow :-) Note: I use msleep_interruptible(1); just like napi_disable(). However I'm not too happy that the "hot" loop that results of a pending signal here will spin without even a cpu_relax ... what do you guys think would be the best way to handle this ? Index: linux-work/include/linux/netdevice.h === --- linux-work.orig/include/linux/netdevice.h 2007-10-16 15:27:27.0 +1000 +++ linux-work/include/linux/netdevice.h2007-10-16 15:27:38.0 +1000 @@ -394,6 +394,21 @@ static inline void napi_disable(struct n } /** + * __napi_synchronize - synchronize with a concurrent poll + * @n: napi context + * + * Synchronizes with a concurrent poll. Not to be used in normal + * drivers, mostly useful if you end up with multiple interfaces + * on one NAPI instance. + */ +static inline void __napi_synchronize(struct napi_struct *n) +{ + smp_mb(); + while (test_bit(NAPI_STATE_SCHED, &n->state)) + msleep_interruptible(1); +} + +/** * napi_enable - enable NAPI scheduling * @n: napi context * - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html