Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-23 Thread Paul E. McKenney
On Wed, Sep 23, 2015 at 07:57:53PM +0200, Patrick Marlier wrote:
> 
> On 09/22/2015 10:50 PM, Paul E. McKenney wrote:
> >On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:
> >>On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:
> >
> >[ . . . ]
> >
> >>>Paul,
> >>>
> >>>This sounds good to me. It should fix the performance issue (will
> >>>check with my benchmark).
> >>
> >>Thank you, looking forward to seeing the results!
> >
> >Just following up -- how is the benchmarking going?
> 
> Note that in my module I am using the kernel version 3.16.0-31 (I
> ported your change).
> Here the results of my benchmark that tests rculist in the case of
> read only.
> 
> # 1st column : The number of threads
> # 2nd : ops/s the original version
> # 3rd : ops/s your version with lockless_dereference
> 1 883923 1747554
> 2 1741441 3543062
> 3 2462360 5103647
> 4 3437273 7176608
> 6 5155803 9812348
> 8 6718111 13330050
> 10 8519227 17458294
> 12 9773632 20298897
> 14 11555198 23191424
> 16 11643264 25125712
> 
> I get same performance with my patch.
> So indeed it fixes the performance problem I was seeing.

Nice, thank you!!!

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-23 Thread Patrick Marlier


On 09/22/2015 10:50 PM, Paul E. McKenney wrote:

On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:

On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:


[ . . . ]


Paul,

This sounds good to me. It should fix the performance issue (will
check with my benchmark).


Thank you, looking forward to seeing the results!


Just following up -- how is the benchmarking going?


Note that in my module I am using the kernel version 3.16.0-31 (I ported 
your change).
Here the results of my benchmark that tests rculist in the case of read 
only.


# 1st column : The number of threads
# 2nd : ops/s the original version
# 3rd : ops/s your version with lockless_dereference
1 883923 1747554
2 1741441 3543062
3 2462360 5103647
4 3437273 7176608
6 5155803 9812348
8 6718111 13330050
10 8519227 17458294
12 9773632 20298897
14 11555198 23191424
16 11643264 25125712

I get same performance with my patch.
So indeed it fixes the performance problem I was seeing.

Thanks a lot!
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-23 Thread Patrick Marlier


On 09/22/2015 10:50 PM, Paul E. McKenney wrote:

On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:

On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:


[ . . . ]


Paul,

This sounds good to me. It should fix the performance issue (will
check with my benchmark).


Thank you, looking forward to seeing the results!


Just following up -- how is the benchmarking going?


Note that in my module I am using the kernel version 3.16.0-31 (I ported 
your change).
Here the results of my benchmark that tests rculist in the case of read 
only.


# 1st column : The number of threads
# 2nd : ops/s the original version
# 3rd : ops/s your version with lockless_dereference
1 883923 1747554
2 1741441 3543062
3 2462360 5103647
4 3437273 7176608
6 5155803 9812348
8 6718111 13330050
10 8519227 17458294
12 9773632 20298897
14 11555198 23191424
16 11643264 25125712

I get same performance with my patch.
So indeed it fixes the performance problem I was seeing.

Thanks a lot!
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-23 Thread Paul E. McKenney
On Wed, Sep 23, 2015 at 07:57:53PM +0200, Patrick Marlier wrote:
> 
> On 09/22/2015 10:50 PM, Paul E. McKenney wrote:
> >On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:
> >>On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:
> >
> >[ . . . ]
> >
> >>>Paul,
> >>>
> >>>This sounds good to me. It should fix the performance issue (will
> >>>check with my benchmark).
> >>
> >>Thank you, looking forward to seeing the results!
> >
> >Just following up -- how is the benchmarking going?
> 
> Note that in my module I am using the kernel version 3.16.0-31 (I
> ported your change).
> Here the results of my benchmark that tests rculist in the case of
> read only.
> 
> # 1st column : The number of threads
> # 2nd : ops/s the original version
> # 3rd : ops/s your version with lockless_dereference
> 1 883923 1747554
> 2 1741441 3543062
> 3 2462360 5103647
> 4 3437273 7176608
> 6 5155803 9812348
> 8 6718111 13330050
> 10 8519227 17458294
> 12 9773632 20298897
> 14 11555198 23191424
> 16 11643264 25125712
> 
> I get same performance with my patch.
> So indeed it fixes the performance problem I was seeing.

Nice, thank you!!!

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-22 Thread Paul E. McKenney
On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:
> On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:

[ . . . ]

> > Paul,
> > 
> > This sounds good to me. It should fix the performance issue (will
> > check with my benchmark).
> 
> Thank you, looking forward to seeing the results!

Just following up -- how is the benchmarking going?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-22 Thread Paul E. McKenney
On Sun, Sep 13, 2015 at 09:10:24AM -0700, Paul E. McKenney wrote:
> On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:

[ . . . ]

> > Paul,
> > 
> > This sounds good to me. It should fix the performance issue (will
> > check with my benchmark).
> 
> Thank you, looking forward to seeing the results!

Just following up -- how is the benchmarking going?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-13 Thread Paul E. McKenney
On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:
> 
> On 09/12/2015 01:05 AM, Paul E. McKenney wrote:
> >On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:
> >>On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> >>>On Mon, 18 May 2015 12:06:47 +1000
> >>>NeilBrown  wrote:
> >>>
> >>>
> >struct mddev {
> >...
> > struct list_headdisks;
> >...}
> >
> >struct list_head {
> >  struct list_head *next, *prev;
> >};
> >
> >The tricky thing is that "list_entry_rcu" before and after the patch is
> >reading the same thing.
> 
> No it isn't.
> Before the patch it is passed the address of the 'next' field.  After the
> patch it is passed the contents of the 'next' field.
> >>>
> >>>Right.
> >>>
> 
> 
> >
> >However in your case, the change I proposed is probably wrong I trust
> >you on this side. :) What's your proposal to fix it with the rculist 
> >patch?
> 
> What needs fixing?  I don't see anything broken.
> 
> Maybe there is something in this "rculist patch" that I'm missing.  Can 
> you
> point me at it?
> 
> >>>
> >>>Probably some debugging tool like sparse notices that the assignment
> >>>isn't a true list entry and complains about it. In other words, I think
> >>>the real fix is to fix the debugging tool to ignore this, because the
> >>>code is correct, and this is a false positive failure, and is causing
> >>>more harm than good, because people are sending out broken patches due
> >>>to it.
> >>
> >>OK, finally did the history trawling that I should have done to begin with.
> >>
> >>Back in 2010, Arnd added the __rcu pointer checking in sparse.
> >>But the RCU list primitives were used on non-RCU-protected lists, so
> >>some casting pain was required to avoid sparse complaints.  (Keep in
> >>mind that the list_head structure does not mark ->next with __rcu.)
> >>Arnd's workaround was to copy the pointer to the local stack, casting
> >>it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
> >>traversal of an RCU-protected pointer.
> >>
> >>This of course resulted in an extraneous load from the stack, which
> >>Patrick noticed in his performance work, and which motivated him to send
> >>the patches.
> >>
> >>Perhaps what I should do is create an rcu_dereference_nocheck() for use
> >>in list traversals, that omits the sparse checking.  That should get rid
> >>of both the sparse warnings and the strange casts.
> >>
> >>The code in md probably needs to change in any case, as otherwise we are
> >>invoking rcu_dereference_whatever() on a full struct list_head rather
> >>than on a single pointer.  Or am I missing something here?
> >
> >Finally getting back to this one...
> >
> >I switched to lockless_dereference() instead of rcu_dereference_raw(),
> >and am running it through the testing gamut.  Patrick, are you OK with
> >this change?
> 
> Paul,
> 
> This sounds good to me. It should fix the performance issue (will
> check with my benchmark).

Thank you, looking forward to seeing the results!

> I think for drivers/md/bitmap.c:next_active_rdev() the problem was
> fixed but do you know if it also fixed for
> net/netfilter/core.c:nf_hook_slow()?

It does appear so.  The statement now reads:

elem = list_entry_rcu(state->hook_list, struct nf_hook_ops, list);

And ->hook_list is defined as follows:

struct list_head *hook_list;

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-13 Thread Patrick Marlier


On 09/12/2015 01:05 AM, Paul E. McKenney wrote:

On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:

On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:

On Mon, 18 May 2015 12:06:47 +1000
NeilBrown  wrote:



struct mddev {
...
struct list_headdisks;
...}

struct list_head {
  struct list_head *next, *prev;
};

The tricky thing is that "list_entry_rcu" before and after the patch is
reading the same thing.


No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


Right.






However in your case, the change I proposed is probably wrong I trust
you on this side. :) What's your proposal to fix it with the rculist patch?


What needs fixing?  I don't see anything broken.

Maybe there is something in this "rculist patch" that I'm missing.  Can you
point me at it?



Probably some debugging tool like sparse notices that the assignment
isn't a true list entry and complains about it. In other words, I think
the real fix is to fix the debugging tool to ignore this, because the
code is correct, and this is a false positive failure, and is causing
more harm than good, because people are sending out broken patches due
to it.


OK, finally did the history trawling that I should have done to begin with.

Back in 2010, Arnd added the __rcu pointer checking in sparse.
But the RCU list primitives were used on non-RCU-protected lists, so
some casting pain was required to avoid sparse complaints.  (Keep in
mind that the list_head structure does not mark ->next with __rcu.)
Arnd's workaround was to copy the pointer to the local stack, casting
it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
traversal of an RCU-protected pointer.

This of course resulted in an extraneous load from the stack, which
Patrick noticed in his performance work, and which motivated him to send
the patches.

Perhaps what I should do is create an rcu_dereference_nocheck() for use
in list traversals, that omits the sparse checking.  That should get rid
of both the sparse warnings and the strange casts.

The code in md probably needs to change in any case, as otherwise we are
invoking rcu_dereference_whatever() on a full struct list_head rather
than on a single pointer.  Or am I missing something here?


Finally getting back to this one...

I switched to lockless_dereference() instead of rcu_dereference_raw(),
and am running it through the testing gamut.  Patrick, are you OK with
this change?


Paul,

This sounds good to me. It should fix the performance issue (will check 
with my benchmark).
I think for drivers/md/bitmap.c:next_active_rdev() the problem was fixed 
but do you know if it also fixed for net/netfilter/core.c:nf_hook_slow()?


Thanks.
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-13 Thread Patrick Marlier


On 09/12/2015 01:05 AM, Paul E. McKenney wrote:

On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:

On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:

On Mon, 18 May 2015 12:06:47 +1000
NeilBrown  wrote:



struct mddev {
...
struct list_headdisks;
...}

struct list_head {
  struct list_head *next, *prev;
};

The tricky thing is that "list_entry_rcu" before and after the patch is
reading the same thing.


No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


Right.






However in your case, the change I proposed is probably wrong I trust
you on this side. :) What's your proposal to fix it with the rculist patch?


What needs fixing?  I don't see anything broken.

Maybe there is something in this "rculist patch" that I'm missing.  Can you
point me at it?



Probably some debugging tool like sparse notices that the assignment
isn't a true list entry and complains about it. In other words, I think
the real fix is to fix the debugging tool to ignore this, because the
code is correct, and this is a false positive failure, and is causing
more harm than good, because people are sending out broken patches due
to it.


OK, finally did the history trawling that I should have done to begin with.

Back in 2010, Arnd added the __rcu pointer checking in sparse.
But the RCU list primitives were used on non-RCU-protected lists, so
some casting pain was required to avoid sparse complaints.  (Keep in
mind that the list_head structure does not mark ->next with __rcu.)
Arnd's workaround was to copy the pointer to the local stack, casting
it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
traversal of an RCU-protected pointer.

This of course resulted in an extraneous load from the stack, which
Patrick noticed in his performance work, and which motivated him to send
the patches.

Perhaps what I should do is create an rcu_dereference_nocheck() for use
in list traversals, that omits the sparse checking.  That should get rid
of both the sparse warnings and the strange casts.

The code in md probably needs to change in any case, as otherwise we are
invoking rcu_dereference_whatever() on a full struct list_head rather
than on a single pointer.  Or am I missing something here?


Finally getting back to this one...

I switched to lockless_dereference() instead of rcu_dereference_raw(),
and am running it through the testing gamut.  Patrick, are you OK with
this change?


Paul,

This sounds good to me. It should fix the performance issue (will check 
with my benchmark).
I think for drivers/md/bitmap.c:next_active_rdev() the problem was fixed 
but do you know if it also fixed for net/netfilter/core.c:nf_hook_slow()?


Thanks.
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-13 Thread Paul E. McKenney
On Sun, Sep 13, 2015 at 12:06:33PM +0200, Patrick Marlier wrote:
> 
> On 09/12/2015 01:05 AM, Paul E. McKenney wrote:
> >On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:
> >>On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> >>>On Mon, 18 May 2015 12:06:47 +1000
> >>>NeilBrown  wrote:
> >>>
> >>>
> >struct mddev {
> >...
> > struct list_headdisks;
> >...}
> >
> >struct list_head {
> >  struct list_head *next, *prev;
> >};
> >
> >The tricky thing is that "list_entry_rcu" before and after the patch is
> >reading the same thing.
> 
> No it isn't.
> Before the patch it is passed the address of the 'next' field.  After the
> patch it is passed the contents of the 'next' field.
> >>>
> >>>Right.
> >>>
> 
> 
> >
> >However in your case, the change I proposed is probably wrong I trust
> >you on this side. :) What's your proposal to fix it with the rculist 
> >patch?
> 
> What needs fixing?  I don't see anything broken.
> 
> Maybe there is something in this "rculist patch" that I'm missing.  Can 
> you
> point me at it?
> 
> >>>
> >>>Probably some debugging tool like sparse notices that the assignment
> >>>isn't a true list entry and complains about it. In other words, I think
> >>>the real fix is to fix the debugging tool to ignore this, because the
> >>>code is correct, and this is a false positive failure, and is causing
> >>>more harm than good, because people are sending out broken patches due
> >>>to it.
> >>
> >>OK, finally did the history trawling that I should have done to begin with.
> >>
> >>Back in 2010, Arnd added the __rcu pointer checking in sparse.
> >>But the RCU list primitives were used on non-RCU-protected lists, so
> >>some casting pain was required to avoid sparse complaints.  (Keep in
> >>mind that the list_head structure does not mark ->next with __rcu.)
> >>Arnd's workaround was to copy the pointer to the local stack, casting
> >>it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
> >>traversal of an RCU-protected pointer.
> >>
> >>This of course resulted in an extraneous load from the stack, which
> >>Patrick noticed in his performance work, and which motivated him to send
> >>the patches.
> >>
> >>Perhaps what I should do is create an rcu_dereference_nocheck() for use
> >>in list traversals, that omits the sparse checking.  That should get rid
> >>of both the sparse warnings and the strange casts.
> >>
> >>The code in md probably needs to change in any case, as otherwise we are
> >>invoking rcu_dereference_whatever() on a full struct list_head rather
> >>than on a single pointer.  Or am I missing something here?
> >
> >Finally getting back to this one...
> >
> >I switched to lockless_dereference() instead of rcu_dereference_raw(),
> >and am running it through the testing gamut.  Patrick, are you OK with
> >this change?
> 
> Paul,
> 
> This sounds good to me. It should fix the performance issue (will
> check with my benchmark).

Thank you, looking forward to seeing the results!

> I think for drivers/md/bitmap.c:next_active_rdev() the problem was
> fixed but do you know if it also fixed for
> net/netfilter/core.c:nf_hook_slow()?

It does appear so.  The statement now reads:

elem = list_entry_rcu(state->hook_list, struct nf_hook_ops, list);

And ->hook_list is defined as follows:

struct list_head *hook_list;

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-11 Thread Paul E. McKenney
On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:
> On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> > On Mon, 18 May 2015 12:06:47 +1000
> > NeilBrown  wrote:
> > 
> > 
> > > > struct mddev {
> > > > ...
> > > > struct list_headdisks;
> > > > ...}
> > > > 
> > > > struct list_head {
> > > >  struct list_head *next, *prev;
> > > > };
> > > > 
> > > > The tricky thing is that "list_entry_rcu" before and after the patch is 
> > > > reading the same thing.
> > > 
> > > No it isn't.
> > > Before the patch it is passed the address of the 'next' field.  After the
> > > patch it is passed the contents of the 'next' field.
> > 
> > Right.
> > 
> > > 
> > > 
> > > > 
> > > > However in your case, the change I proposed is probably wrong I trust 
> > > > you on this side. :) What's your proposal to fix it with the rculist 
> > > > patch?
> > > 
> > > What needs fixing?  I don't see anything broken.
> > > 
> > > Maybe there is something in this "rculist patch" that I'm missing.  Can 
> > > you
> > > point me at it?
> > > 
> > 
> > Probably some debugging tool like sparse notices that the assignment
> > isn't a true list entry and complains about it. In other words, I think
> > the real fix is to fix the debugging tool to ignore this, because the
> > code is correct, and this is a false positive failure, and is causing
> > more harm than good, because people are sending out broken patches due
> > to it.
> 
> OK, finally did the history trawling that I should have done to begin with.
> 
> Back in 2010, Arnd added the __rcu pointer checking in sparse.
> But the RCU list primitives were used on non-RCU-protected lists, so
> some casting pain was required to avoid sparse complaints.  (Keep in
> mind that the list_head structure does not mark ->next with __rcu.)
> Arnd's workaround was to copy the pointer to the local stack, casting
> it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
> traversal of an RCU-protected pointer.
> 
> This of course resulted in an extraneous load from the stack, which
> Patrick noticed in his performance work, and which motivated him to send
> the patches.
> 
> Perhaps what I should do is create an rcu_dereference_nocheck() for use
> in list traversals, that omits the sparse checking.  That should get rid
> of both the sparse warnings and the strange casts.
> 
> The code in md probably needs to change in any case, as otherwise we are
> invoking rcu_dereference_whatever() on a full struct list_head rather
> than on a single pointer.  Or am I missing something here?

Finally getting back to this one...

I switched to lockless_dereference() instead of rcu_dereference_raw(),
and am running it through the testing gamut.  Patrick, are you OK with
this change?

Thanx, Paul



rculist: Make list_entry_rcu() use lockless_dereference()

The current list_entry_rcu() implementation copies the pointer to a stack
variable, then invokes rcu_dereference_raw() on it.  This results in an
additional store-load pair.  Now, most compilers will emit normal store
and load instructions, which might seem to be of negligible overhead,
but this results in a load-hit-store situation that can cause surprisingly
long pipeline stalls, even on modern microprocessors.  The problem is
that it takes time for the store to get the store buffer updated, which
can delay the subsequent load, which immediately follows.

This commit therefore switches to the lockless_dereference() primitive,
which does not expect the __rcu annotations (that are anyway not present
in the list_head structure) and which, like rcu_dereference_raw(),
does not check for an enclosing RCU read-side critical section.
Most importantly, it does not copy the pointer, thus avoiding the
load-hit-store overhead.

Signed-off-by: Patrick Marlier 
[ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ]
Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 17c6b1f84a77..5ed540986019 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -247,10 +247,7 @@ static inline void list_splice_init_rcu(struct list_head 
*list,
  * primitives such as list_add_rcu() as long as it's guarded by 
rcu_read_lock().
  */
 #define list_entry_rcu(ptr, type, member) \
-({ \
-   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
-   container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
-})
+   container_of(lockless_dereference(ptr), type, member)
 
 /**
  * Where are list_empty_rcu() and list_first_entry_rcu()?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-09-11 Thread Paul E. McKenney
On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:
> On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> > On Mon, 18 May 2015 12:06:47 +1000
> > NeilBrown  wrote:
> > 
> > 
> > > > struct mddev {
> > > > ...
> > > > struct list_headdisks;
> > > > ...}
> > > > 
> > > > struct list_head {
> > > >  struct list_head *next, *prev;
> > > > };
> > > > 
> > > > The tricky thing is that "list_entry_rcu" before and after the patch is 
> > > > reading the same thing.
> > > 
> > > No it isn't.
> > > Before the patch it is passed the address of the 'next' field.  After the
> > > patch it is passed the contents of the 'next' field.
> > 
> > Right.
> > 
> > > 
> > > 
> > > > 
> > > > However in your case, the change I proposed is probably wrong I trust 
> > > > you on this side. :) What's your proposal to fix it with the rculist 
> > > > patch?
> > > 
> > > What needs fixing?  I don't see anything broken.
> > > 
> > > Maybe there is something in this "rculist patch" that I'm missing.  Can 
> > > you
> > > point me at it?
> > > 
> > 
> > Probably some debugging tool like sparse notices that the assignment
> > isn't a true list entry and complains about it. In other words, I think
> > the real fix is to fix the debugging tool to ignore this, because the
> > code is correct, and this is a false positive failure, and is causing
> > more harm than good, because people are sending out broken patches due
> > to it.
> 
> OK, finally did the history trawling that I should have done to begin with.
> 
> Back in 2010, Arnd added the __rcu pointer checking in sparse.
> But the RCU list primitives were used on non-RCU-protected lists, so
> some casting pain was required to avoid sparse complaints.  (Keep in
> mind that the list_head structure does not mark ->next with __rcu.)
> Arnd's workaround was to copy the pointer to the local stack, casting
> it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
> traversal of an RCU-protected pointer.
> 
> This of course resulted in an extraneous load from the stack, which
> Patrick noticed in his performance work, and which motivated him to send
> the patches.
> 
> Perhaps what I should do is create an rcu_dereference_nocheck() for use
> in list traversals, that omits the sparse checking.  That should get rid
> of both the sparse warnings and the strange casts.
> 
> The code in md probably needs to change in any case, as otherwise we are
> invoking rcu_dereference_whatever() on a full struct list_head rather
> than on a single pointer.  Or am I missing something here?

Finally getting back to this one...

I switched to lockless_dereference() instead of rcu_dereference_raw(),
and am running it through the testing gamut.  Patrick, are you OK with
this change?

Thanx, Paul



rculist: Make list_entry_rcu() use lockless_dereference()

The current list_entry_rcu() implementation copies the pointer to a stack
variable, then invokes rcu_dereference_raw() on it.  This results in an
additional store-load pair.  Now, most compilers will emit normal store
and load instructions, which might seem to be of negligible overhead,
but this results in a load-hit-store situation that can cause surprisingly
long pipeline stalls, even on modern microprocessors.  The problem is
that it takes time for the store to get the store buffer updated, which
can delay the subsequent load, which immediately follows.

This commit therefore switches to the lockless_dereference() primitive,
which does not expect the __rcu annotations (that are anyway not present
in the list_head structure) and which, like rcu_dereference_raw(),
does not check for an enclosing RCU read-side critical section.
Most importantly, it does not copy the pointer, thus avoiding the
load-hit-store overhead.

Signed-off-by: Patrick Marlier 
[ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ]
Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 17c6b1f84a77..5ed540986019 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -247,10 +247,7 @@ static inline void list_splice_init_rcu(struct list_head 
*list,
  * primitives such as list_add_rcu() as long as it's guarded by 
rcu_read_lock().
  */
 #define list_entry_rcu(ptr, type, member) \
-({ \
-   typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
-   container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
-})
+   container_of(lockless_dereference(ptr), type, member)
 
 /**
  * Where are list_empty_rcu() and list_first_entry_rcu()?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a 

Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-20 Thread NeilBrown
On Wed, 20 May 2015 06:28:35 -0700 "Paul E. McKenney"
 wrote:

> On Wed, May 20, 2015 at 03:09:19PM +1000, NeilBrown wrote:
> > On Tue, 19 May 2015 15:07:25 -0700 "Paul E. McKenney"
> >  wrote:
> > 
> > 
> > > The code in md probably needs to change in any case, as otherwise we are
> > > invoking rcu_dereference_whatever() on a full struct list_head rather
> > > than on a single pointer.  Or am I missing something here?
> > 
> > I think it would be
> >rcu_dereference_whatever(>disks)
> > 
> > I don't know what you mean by "on a full struct list_head", but there is
> > nothing actually being dereferenced here - right?  Just pointer arithmetic 
> > on
> > 'mddev'.
> 
> It really does dereference.  Strange but true.

Well... your the expert.  But without an lvalue, I can't see it.


> 
> > I should probably just
> > 
> > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > index 2bc56e2a3526..b1d237bf8b3b 100644
> > --- a/drivers/md/bitmap.c
> > +++ b/drivers/md/bitmap.c
> > @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
> > *rdev, struct mddev *mdde
> > rcu_read_lock();
> > if (rdev == NULL)
> > /* start at the beginning */
> > -   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> > +   rdev = list_entry(>disks, struct md_rdev, same_set);
> > else {
> > /* release the previous rdev and start from there. */
> > rdev_dec_pending(rdev, mddev);
> > 
> > as there really are no RCU issues with getting that address.  Maybe I should
> > move it outside the rcu_read_lock() just to be blatant but that would
> > make the code a lot more clumsy as the rdev_dec_pending must be inside the
> > rcu_read_lock..
> > 
> > So this.
> 
> Fair enough -- if you aren't using RCU, there is really no point in using
> the RCU API.  I will drop this patch from my tree.  You are pushing yours,
> I am guessing?

Excellent guess :-)
Hopefully for the next -rc.

Thanks,
NeilBrown


> 
>   Thanx, Paul
> 
> > Thanks,
> > NeilBrown
> > 
> > From: NeilBrown 
> > Date: Wed, 20 May 2015 15:05:09 +1000
> > Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.
> > 
> > Evaluating  ">disks" is simple pointer arithmetic, so
> > it does not need 'rcu' annotations - no dereferencing is happening.
> > 
> > Also enhance the comment to explain that 'rdev' in that case
> > is not actually a pointer to an rdev.
> > 
> > Reported-by: Patrick Marlier 
> > Signed-off-by: NeilBrown 
> > 
> > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > index 2bc56e2a3526..135a0907e9de 100644
> > --- a/drivers/md/bitmap.c
> > +++ b/drivers/md/bitmap.c
> > @@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct 
> > md_rdev *rdev, struct mddev *mdde
> >  * nr_pending is 0 and In_sync is clear, the entries we return will
> >  * still be in the same position on the list when we re-enter
> >  * list_for_each_entry_continue_rcu.
> > +*
> > +* Note that if entered with 'rdev == NULL' to start at the
> > +* beginning, we temporarily assign 'rdev' to an address which
> > +* isn't really an rdev, but which can be used by
> > +* list_for_each_entry_continue_rcu() to find the first entry.
> >  */
> > rcu_read_lock();
> > if (rdev == NULL)
> > /* start at the beginning */
> > -   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> > +   rdev = list_entry(>disks, struct md_rdev, same_set);
> > else {
> > /* release the previous rdev and start from there. */
> > rdev_dec_pending(rdev, mddev);
> 



pgpjY5cGI4jNr.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-20 Thread Paul E. McKenney
On Wed, May 20, 2015 at 03:09:19PM +1000, NeilBrown wrote:
> On Tue, 19 May 2015 15:07:25 -0700 "Paul E. McKenney"
>  wrote:
> 
> 
> > The code in md probably needs to change in any case, as otherwise we are
> > invoking rcu_dereference_whatever() on a full struct list_head rather
> > than on a single pointer.  Or am I missing something here?
> 
> I think it would be
>rcu_dereference_whatever(>disks)
> 
> I don't know what you mean by "on a full struct list_head", but there is
> nothing actually being dereferenced here - right?  Just pointer arithmetic on
> 'mddev'.

It really does dereference.  Strange but true.

> I should probably just
> 
> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> index 2bc56e2a3526..b1d237bf8b3b 100644
> --- a/drivers/md/bitmap.c
> +++ b/drivers/md/bitmap.c
> @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
> *rdev, struct mddev *mdde
>   rcu_read_lock();
>   if (rdev == NULL)
>   /* start at the beginning */
> - rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> + rdev = list_entry(>disks, struct md_rdev, same_set);
>   else {
>   /* release the previous rdev and start from there. */
>   rdev_dec_pending(rdev, mddev);
> 
> as there really are no RCU issues with getting that address.  Maybe I should
> move it outside the rcu_read_lock() just to be blatant but that would
> make the code a lot more clumsy as the rdev_dec_pending must be inside the
> rcu_read_lock..
> 
> So this.

Fair enough -- if you aren't using RCU, there is really no point in using
the RCU API.  I will drop this patch from my tree.  You are pushing yours,
I am guessing?

Thanx, Paul

> Thanks,
> NeilBrown
> 
> From: NeilBrown 
> Date: Wed, 20 May 2015 15:05:09 +1000
> Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.
> 
> Evaluating  ">disks" is simple pointer arithmetic, so
> it does not need 'rcu' annotations - no dereferencing is happening.
> 
> Also enhance the comment to explain that 'rdev' in that case
> is not actually a pointer to an rdev.
> 
> Reported-by: Patrick Marlier 
> Signed-off-by: NeilBrown 
> 
> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> index 2bc56e2a3526..135a0907e9de 100644
> --- a/drivers/md/bitmap.c
> +++ b/drivers/md/bitmap.c
> @@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
> *rdev, struct mddev *mdde
>* nr_pending is 0 and In_sync is clear, the entries we return will
>* still be in the same position on the list when we re-enter
>* list_for_each_entry_continue_rcu.
> +  *
> +  * Note that if entered with 'rdev == NULL' to start at the
> +  * beginning, we temporarily assign 'rdev' to an address which
> +  * isn't really an rdev, but which can be used by
> +  * list_for_each_entry_continue_rcu() to find the first entry.
>*/
>   rcu_read_lock();
>   if (rdev == NULL)
>   /* start at the beginning */
> - rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> + rdev = list_entry(>disks, struct md_rdev, same_set);
>   else {
>   /* release the previous rdev and start from there. */
>   rdev_dec_pending(rdev, mddev);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-20 Thread Paul E. McKenney
On Wed, May 20, 2015 at 03:09:19PM +1000, NeilBrown wrote:
 On Tue, 19 May 2015 15:07:25 -0700 Paul E. McKenney
 paul...@linux.vnet.ibm.com wrote:
 
 
  The code in md probably needs to change in any case, as otherwise we are
  invoking rcu_dereference_whatever() on a full struct list_head rather
  than on a single pointer.  Or am I missing something here?
 
 I think it would be
rcu_dereference_whatever(mddev-disks)
 
 I don't know what you mean by on a full struct list_head, but there is
 nothing actually being dereferenced here - right?  Just pointer arithmetic on
 'mddev'.

It really does dereference.  Strange but true.

 I should probably just
 
 diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
 index 2bc56e2a3526..b1d237bf8b3b 100644
 --- a/drivers/md/bitmap.c
 +++ b/drivers/md/bitmap.c
 @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
 *rdev, struct mddev *mdde
   rcu_read_lock();
   if (rdev == NULL)
   /* start at the beginning */
 - rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 + rdev = list_entry(mddev-disks, struct md_rdev, same_set);
   else {
   /* release the previous rdev and start from there. */
   rdev_dec_pending(rdev, mddev);
 
 as there really are no RCU issues with getting that address.  Maybe I should
 move it outside the rcu_read_lock() just to be blatant but that would
 make the code a lot more clumsy as the rdev_dec_pending must be inside the
 rcu_read_lock..
 
 So this.

Fair enough -- if you aren't using RCU, there is really no point in using
the RCU API.  I will drop this patch from my tree.  You are pushing yours,
I am guessing?

Thanx, Paul

 Thanks,
 NeilBrown
 
 From: NeilBrown ne...@suse.de
 Date: Wed, 20 May 2015 15:05:09 +1000
 Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.
 
 Evaluating  mddev-disks is simple pointer arithmetic, so
 it does not need 'rcu' annotations - no dereferencing is happening.
 
 Also enhance the comment to explain that 'rdev' in that case
 is not actually a pointer to an rdev.
 
 Reported-by: Patrick Marlier patrick.marl...@gmail.com
 Signed-off-by: NeilBrown ne...@suse.de
 
 diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
 index 2bc56e2a3526..135a0907e9de 100644
 --- a/drivers/md/bitmap.c
 +++ b/drivers/md/bitmap.c
 @@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
 *rdev, struct mddev *mdde
* nr_pending is 0 and In_sync is clear, the entries we return will
* still be in the same position on the list when we re-enter
* list_for_each_entry_continue_rcu.
 +  *
 +  * Note that if entered with 'rdev == NULL' to start at the
 +  * beginning, we temporarily assign 'rdev' to an address which
 +  * isn't really an rdev, but which can be used by
 +  * list_for_each_entry_continue_rcu() to find the first entry.
*/
   rcu_read_lock();
   if (rdev == NULL)
   /* start at the beginning */
 - rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 + rdev = list_entry(mddev-disks, struct md_rdev, same_set);
   else {
   /* release the previous rdev and start from there. */
   rdev_dec_pending(rdev, mddev);


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-20 Thread NeilBrown
On Wed, 20 May 2015 06:28:35 -0700 Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:

 On Wed, May 20, 2015 at 03:09:19PM +1000, NeilBrown wrote:
  On Tue, 19 May 2015 15:07:25 -0700 Paul E. McKenney
  paul...@linux.vnet.ibm.com wrote:
  
  
   The code in md probably needs to change in any case, as otherwise we are
   invoking rcu_dereference_whatever() on a full struct list_head rather
   than on a single pointer.  Or am I missing something here?
  
  I think it would be
 rcu_dereference_whatever(mddev-disks)
  
  I don't know what you mean by on a full struct list_head, but there is
  nothing actually being dereferenced here - right?  Just pointer arithmetic 
  on
  'mddev'.
 
 It really does dereference.  Strange but true.

Well... your the expert.  But without an lvalue, I can't see it.


 
  I should probably just
  
  diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
  index 2bc56e2a3526..b1d237bf8b3b 100644
  --- a/drivers/md/bitmap.c
  +++ b/drivers/md/bitmap.c
  @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
  *rdev, struct mddev *mdde
  rcu_read_lock();
  if (rdev == NULL)
  /* start at the beginning */
  -   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  +   rdev = list_entry(mddev-disks, struct md_rdev, same_set);
  else {
  /* release the previous rdev and start from there. */
  rdev_dec_pending(rdev, mddev);
  
  as there really are no RCU issues with getting that address.  Maybe I should
  move it outside the rcu_read_lock() just to be blatant but that would
  make the code a lot more clumsy as the rdev_dec_pending must be inside the
  rcu_read_lock..
  
  So this.
 
 Fair enough -- if you aren't using RCU, there is really no point in using
 the RCU API.  I will drop this patch from my tree.  You are pushing yours,
 I am guessing?

Excellent guess :-)
Hopefully for the next -rc.

Thanks,
NeilBrown


 
   Thanx, Paul
 
  Thanks,
  NeilBrown
  
  From: NeilBrown ne...@suse.de
  Date: Wed, 20 May 2015 15:05:09 +1000
  Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.
  
  Evaluating  mddev-disks is simple pointer arithmetic, so
  it does not need 'rcu' annotations - no dereferencing is happening.
  
  Also enhance the comment to explain that 'rdev' in that case
  is not actually a pointer to an rdev.
  
  Reported-by: Patrick Marlier patrick.marl...@gmail.com
  Signed-off-by: NeilBrown ne...@suse.de
  
  diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
  index 2bc56e2a3526..135a0907e9de 100644
  --- a/drivers/md/bitmap.c
  +++ b/drivers/md/bitmap.c
  @@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct 
  md_rdev *rdev, struct mddev *mdde
   * nr_pending is 0 and In_sync is clear, the entries we return will
   * still be in the same position on the list when we re-enter
   * list_for_each_entry_continue_rcu.
  +*
  +* Note that if entered with 'rdev == NULL' to start at the
  +* beginning, we temporarily assign 'rdev' to an address which
  +* isn't really an rdev, but which can be used by
  +* list_for_each_entry_continue_rcu() to find the first entry.
   */
  rcu_read_lock();
  if (rdev == NULL)
  /* start at the beginning */
  -   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  +   rdev = list_entry(mddev-disks, struct md_rdev, same_set);
  else {
  /* release the previous rdev and start from there. */
  rdev_dec_pending(rdev, mddev);
 



pgpjY5cGI4jNr.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-19 Thread NeilBrown
On Tue, 19 May 2015 15:07:25 -0700 "Paul E. McKenney"
 wrote:


> The code in md probably needs to change in any case, as otherwise we are
> invoking rcu_dereference_whatever() on a full struct list_head rather
> than on a single pointer.  Or am I missing something here?

I think it would be
   rcu_dereference_whatever(>disks)

I don't know what you mean by "on a full struct list_head", but there is
nothing actually being dereferenced here - right?  Just pointer arithmetic on
'mddev'.

I should probably just

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..b1d237bf8b3b 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
+   rdev = list_entry(>disks, struct md_rdev, same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);

as there really are no RCU issues with getting that address.  Maybe I should
move it outside the rcu_read_lock() just to be blatant but that would
make the code a lot more clumsy as the rdev_dec_pending must be inside the
rcu_read_lock..

So this.

Thanks,
NeilBrown

From: NeilBrown 
Date: Wed, 20 May 2015 15:05:09 +1000
Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.

Evaluating  ">disks" is simple pointer arithmetic, so
it does not need 'rcu' annotations - no dereferencing is happening.

Also enhance the comment to explain that 'rdev' in that case
is not actually a pointer to an rdev.

Reported-by: Patrick Marlier 
Signed-off-by: NeilBrown 

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..135a0907e9de 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
 * nr_pending is 0 and In_sync is clear, the entries we return will
 * still be in the same position on the list when we re-enter
 * list_for_each_entry_continue_rcu.
+*
+* Note that if entered with 'rdev == NULL' to start at the
+* beginning, we temporarily assign 'rdev' to an address which
+* isn't really an rdev, but which can be used by
+* list_for_each_entry_continue_rcu() to find the first entry.
 */
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
+   rdev = list_entry(>disks, struct md_rdev, same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);


pgpMbWdBiMk4u.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-19 Thread Paul E. McKenney
On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> On Mon, 18 May 2015 12:06:47 +1000
> NeilBrown  wrote:
> 
> 
> > > struct mddev {
> > > ...
> > >   struct list_headdisks;
> > > ...}
> > > 
> > > struct list_head {
> > >  struct list_head *next, *prev;
> > > };
> > > 
> > > The tricky thing is that "list_entry_rcu" before and after the patch is 
> > > reading the same thing.
> > 
> > No it isn't.
> > Before the patch it is passed the address of the 'next' field.  After the
> > patch it is passed the contents of the 'next' field.
> 
> Right.
> 
> > 
> > 
> > > 
> > > However in your case, the change I proposed is probably wrong I trust 
> > > you on this side. :) What's your proposal to fix it with the rculist 
> > > patch?
> > 
> > What needs fixing?  I don't see anything broken.
> > 
> > Maybe there is something in this "rculist patch" that I'm missing.  Can you
> > point me at it?
> > 
> 
> Probably some debugging tool like sparse notices that the assignment
> isn't a true list entry and complains about it. In other words, I think
> the real fix is to fix the debugging tool to ignore this, because the
> code is correct, and this is a false positive failure, and is causing
> more harm than good, because people are sending out broken patches due
> to it.

OK, finally did the history trawling that I should have done to begin with.

Back in 2010, Arnd added the __rcu pointer checking in sparse.
But the RCU list primitives were used on non-RCU-protected lists, so
some casting pain was required to avoid sparse complaints.  (Keep in
mind that the list_head structure does not mark ->next with __rcu.)
Arnd's workaround was to copy the pointer to the local stack, casting
it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
traversal of an RCU-protected pointer.

This of course resulted in an extraneous load from the stack, which
Patrick noticed in his performance work, and which motivated him to send
the patches.

Perhaps what I should do is create an rcu_dereference_nocheck() for use
in list traversals, that omits the sparse checking.  That should get rid
of both the sparse warnings and the strange casts.

The code in md probably needs to change in any case, as otherwise we are
invoking rcu_dereference_whatever() on a full struct list_head rather
than on a single pointer.  Or am I missing something here?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-19 Thread NeilBrown
On Tue, 19 May 2015 15:07:25 -0700 Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:


 The code in md probably needs to change in any case, as otherwise we are
 invoking rcu_dereference_whatever() on a full struct list_head rather
 than on a single pointer.  Or am I missing something here?

I think it would be
   rcu_dereference_whatever(mddev-disks)

I don't know what you mean by on a full struct list_head, but there is
nothing actually being dereferenced here - right?  Just pointer arithmetic on
'mddev'.

I should probably just

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..b1d237bf8b3b 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
+   rdev = list_entry(mddev-disks, struct md_rdev, same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);

as there really are no RCU issues with getting that address.  Maybe I should
move it outside the rcu_read_lock() just to be blatant but that would
make the code a lot more clumsy as the rdev_dec_pending must be inside the
rcu_read_lock..

So this.

Thanks,
NeilBrown

From: NeilBrown ne...@suse.de
Date: Wed, 20 May 2015 15:05:09 +1000
Subject: [PATCH] md/bitmap: remove rcu annotation from pointer arithmetic.

Evaluating  mddev-disks is simple pointer arithmetic, so
it does not need 'rcu' annotations - no dereferencing is happening.

Also enhance the comment to explain that 'rdev' in that case
is not actually a pointer to an rdev.

Reported-by: Patrick Marlier patrick.marl...@gmail.com
Signed-off-by: NeilBrown ne...@suse.de

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..135a0907e9de 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -177,11 +177,16 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
 * nr_pending is 0 and In_sync is clear, the entries we return will
 * still be in the same position on the list when we re-enter
 * list_for_each_entry_continue_rcu.
+*
+* Note that if entered with 'rdev == NULL' to start at the
+* beginning, we temporarily assign 'rdev' to an address which
+* isn't really an rdev, but which can be used by
+* list_for_each_entry_continue_rcu() to find the first entry.
 */
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
+   rdev = list_entry(mddev-disks, struct md_rdev, same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);


pgpMbWdBiMk4u.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-19 Thread Paul E. McKenney
On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
 On Mon, 18 May 2015 12:06:47 +1000
 NeilBrown ne...@suse.de wrote:
 
 
   struct mddev {
   ...
 struct list_headdisks;
   ...}
   
   struct list_head {
struct list_head *next, *prev;
   };
   
   The tricky thing is that list_entry_rcu before and after the patch is 
   reading the same thing.
  
  No it isn't.
  Before the patch it is passed the address of the 'next' field.  After the
  patch it is passed the contents of the 'next' field.
 
 Right.
 
  
  
   
   However in your case, the change I proposed is probably wrong I trust 
   you on this side. :) What's your proposal to fix it with the rculist 
   patch?
  
  What needs fixing?  I don't see anything broken.
  
  Maybe there is something in this rculist patch that I'm missing.  Can you
  point me at it?
  
 
 Probably some debugging tool like sparse notices that the assignment
 isn't a true list entry and complains about it. In other words, I think
 the real fix is to fix the debugging tool to ignore this, because the
 code is correct, and this is a false positive failure, and is causing
 more harm than good, because people are sending out broken patches due
 to it.

OK, finally did the history trawling that I should have done to begin with.

Back in 2010, Arnd added the __rcu pointer checking in sparse.
But the RCU list primitives were used on non-RCU-protected lists, so
some casting pain was required to avoid sparse complaints.  (Keep in
mind that the list_head structure does not mark -next with __rcu.)
Arnd's workaround was to copy the pointer to the local stack, casting
it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
traversal of an RCU-protected pointer.

This of course resulted in an extraneous load from the stack, which
Patrick noticed in his performance work, and which motivated him to send
the patches.

Perhaps what I should do is create an rcu_dereference_nocheck() for use
in list traversals, that omits the sparse checking.  That should get rid
of both the sparse warnings and the strange casts.

The code in md probably needs to change in any case, as otherwise we are
invoking rcu_dereference_whatever() on a full struct list_head rather
than on a single pointer.  Or am I missing something here?

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Patrick Marlier


On 05/18/2015 03:53 PM, Patrick Marlier wrote:

On Mon, May 18, 2015 at 4:06 AM, NeilBrown  wrote:

On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
 wrote:

On 05/13/2015 04:58 AM, NeilBrown wrote:

On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  wrote:

On Tue, 12 May 2015 15:46:26 -0700
"Paul E. McKenney"  wrote:
What comes after this is:

list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
if (rdev->raid_disk >= 0 &&

Now the original code had:

rdev = list_entry_rcu(>disks, struct md_rdev, same_set);

Where >disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member)\
for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
 >member != (head);\
 pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

Thus the first use of pos is pos->member.next or:

mddev->disks.next

But now you converted it to rdev = mddev->disks.next, which means the
first use is:

pos = mddev->disks.next->next

I think you are skipping the first element here.



struct mddev {
...
   struct list_headdisks;
...}

struct list_head {
  struct list_head *next, *prev;
};

The tricky thing is that "list_entry_rcu" before and after the patch is
reading the same thing.


No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


Here I meant "list_entry_rcu" (in include/linux/rculist.h) not the
change to drivers/md/bitmap.c.



However in your case, the change I proposed is probably wrong I trust
you on this side. :) What's your proposal to fix it with the rculist patch?


What needs fixing?  I don't see anything broken.

Maybe there is something in this "rculist patch" that I'm missing.  Can you
point me at it?


Do not apply the patch on drivers/md/bitmap.c but only on
include/linux/rculist.h and you will see that the compilation fails.


Here an example of the compilation error in drivers/md/bitmap.c

In file included from include/linux/sched.h:17:0,
 from include/linux/blkdev.h:4,
 from drivers/md/bitmap.c:18:
drivers/md/bitmap.c: In function ‘next_active_rdev’:
include/linux/compiler.h:469:24: error: lvalue required as unary ‘&’ operand
  (volatile typeof(x) *)&(x); })
^
include/linux/kernel.h:800:49: note: in definition of macro ‘container_of’
  const typeof( ((type *)0)->member ) *__mptr = (ptr); \
 ^
include/linux/compiler.h:470:26: note: in expansion of macro ‘__ACCESS_ONCE’
 #define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
  ^
include/linux/rcupdate.h:630:26: note: in expansion of macro ‘ACCESS_ONCE’
  typeof(p) _p1 = ACCESS_ONCE(p); \
  ^
include/linux/rcupdate.h:587:48: note: in expansion of macro 
‘lockless_dereference’

  typeof(*p) *p1 = (typeof(*p) *__force)lockless_dereference(p); \
^
include/linux/rcupdate.h:723:2: note: in expansion of macro 
‘__rcu_dereference_check’

  __rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)
  ^
include/linux/rcupdate.h:746:32: note: in expansion of macro 
‘rcu_dereference_check’
 #define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ 
needed? @@@*/

^
include/linux/rculist.h:251:28: note: in expansion of macro 
‘rcu_dereference_raw’

  container_of((typeof(ptr))rcu_dereference_raw(ptr), type, member); \
^
drivers/md/bitmap.c:184:10: note: in expansion of macro ‘list_entry_rcu’
   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
  ^

Thanks.
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Patrick Marlier
On Mon, May 18, 2015 at 4:06 AM, NeilBrown  wrote:
> On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
>  wrote:
>> On 05/13/2015 04:58 AM, NeilBrown wrote:
>> > On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  
>> > wrote:
>> >> On Tue, 12 May 2015 15:46:26 -0700
>> >> "Paul E. McKenney"  wrote:
>> >> What comes after this is:
>> >>
>> >>list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
>> >>if (rdev->raid_disk >= 0 &&
>> >>
>> >> Now the original code had:
>> >>
>> >>rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
>> >>
>> >> Where >disks would return the address of the disks field of
>> >> mddev which is a list head. Then it would get the 'same_set' offset,
>> >> which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
>> >> isn't used, as the list_for_each_entry_continue_rcu() has:
>> >>
>> >> #define list_for_each_entry_continue_rcu(pos, head, member)   
>> >>  \
>> >>for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
>> >> >member != (head);\
>> >> pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>> >>
>> >> Thus the first use of pos is pos->member.next or:
>> >>
>> >>mddev->disks.next
>> >>
>> >> But now you converted it to rdev = mddev->disks.next, which means the
>> >> first use is:
>> >>
>> >>pos = mddev->disks.next->next
>> >>
>> >> I think you are skipping the first element here.
>>
>>
>> struct mddev {
>> ...
>>   struct list_headdisks;
>> ...}
>>
>> struct list_head {
>>  struct list_head *next, *prev;
>> };
>>
>> The tricky thing is that "list_entry_rcu" before and after the patch is
>> reading the same thing.
>
> No it isn't.
> Before the patch it is passed the address of the 'next' field.  After the
> patch it is passed the contents of the 'next' field.

Here I meant "list_entry_rcu" (in include/linux/rculist.h) not the
change to drivers/md/bitmap.c.


>> However in your case, the change I proposed is probably wrong I trust
>> you on this side. :) What's your proposal to fix it with the rculist patch?
>
> What needs fixing?  I don't see anything broken.
>
> Maybe there is something in this "rculist patch" that I'm missing.  Can you
> point me at it?

Do not apply the patch on drivers/md/bitmap.c but only on
include/linux/rculist.h and you will see that the compilation fails.

--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Steven Rostedt
On Mon, 18 May 2015 12:06:47 +1000
NeilBrown  wrote:


> > struct mddev {
> > ...
> > struct list_headdisks;
> > ...}
> > 
> > struct list_head {
> >  struct list_head *next, *prev;
> > };
> > 
> > The tricky thing is that "list_entry_rcu" before and after the patch is 
> > reading the same thing.
> 
> No it isn't.
> Before the patch it is passed the address of the 'next' field.  After the
> patch it is passed the contents of the 'next' field.

Right.

> 
> 
> > 
> > However in your case, the change I proposed is probably wrong I trust 
> > you on this side. :) What's your proposal to fix it with the rculist patch?
> 
> What needs fixing?  I don't see anything broken.
> 
> Maybe there is something in this "rculist patch" that I'm missing.  Can you
> point me at it?
> 

Probably some debugging tool like sparse notices that the assignment
isn't a true list entry and complains about it. In other words, I think
the real fix is to fix the debugging tool to ignore this, because the
code is correct, and this is a false positive failure, and is causing
more harm than good, because people are sending out broken patches due
to it.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Patrick Marlier
On Mon, May 18, 2015 at 4:06 AM, NeilBrown ne...@suse.de wrote:
 On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
 patrick.marl...@gmail.com wrote:
 On 05/13/2015 04:58 AM, NeilBrown wrote:
  On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org 
  wrote:
  On Tue, 12 May 2015 15:46:26 -0700
  Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
  What comes after this is:
 
 list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
 if (rdev-raid_disk = 0 
 
  Now the original code had:
 
 rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 
  Where mddev-disks would return the address of the disks field of
  mddev which is a list head. Then it would get the 'same_set' offset,
  which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
  isn't used, as the list_for_each_entry_continue_rcu() has:
 
  #define list_for_each_entry_continue_rcu(pos, head, member)   
   \
 for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
  pos-member != (head);\
  pos = list_entry_rcu(pos-member.next, typeof(*pos), member))
 
  Thus the first use of pos is pos-member.next or:
 
 mddev-disks.next
 
  But now you converted it to rdev = mddev-disks.next, which means the
  first use is:
 
 pos = mddev-disks.next-next
 
  I think you are skipping the first element here.


 struct mddev {
 ...
   struct list_headdisks;
 ...}

 struct list_head {
  struct list_head *next, *prev;
 };

 The tricky thing is that list_entry_rcu before and after the patch is
 reading the same thing.

 No it isn't.
 Before the patch it is passed the address of the 'next' field.  After the
 patch it is passed the contents of the 'next' field.

Here I meant list_entry_rcu (in include/linux/rculist.h) not the
change to drivers/md/bitmap.c.


 However in your case, the change I proposed is probably wrong I trust
 you on this side. :) What's your proposal to fix it with the rculist patch?

 What needs fixing?  I don't see anything broken.

 Maybe there is something in this rculist patch that I'm missing.  Can you
 point me at it?

Do not apply the patch on drivers/md/bitmap.c but only on
include/linux/rculist.h and you will see that the compilation fails.

--
Pat
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Steven Rostedt
On Mon, 18 May 2015 12:06:47 +1000
NeilBrown ne...@suse.de wrote:


  struct mddev {
  ...
  struct list_headdisks;
  ...}
  
  struct list_head {
   struct list_head *next, *prev;
  };
  
  The tricky thing is that list_entry_rcu before and after the patch is 
  reading the same thing.
 
 No it isn't.
 Before the patch it is passed the address of the 'next' field.  After the
 patch it is passed the contents of the 'next' field.

Right.

 
 
  
  However in your case, the change I proposed is probably wrong I trust 
  you on this side. :) What's your proposal to fix it with the rculist patch?
 
 What needs fixing?  I don't see anything broken.
 
 Maybe there is something in this rculist patch that I'm missing.  Can you
 point me at it?
 

Probably some debugging tool like sparse notices that the assignment
isn't a true list entry and complains about it. In other words, I think
the real fix is to fix the debugging tool to ignore this, because the
code is correct, and this is a false positive failure, and is causing
more harm than good, because people are sending out broken patches due
to it.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-18 Thread Patrick Marlier


On 05/18/2015 03:53 PM, Patrick Marlier wrote:

On Mon, May 18, 2015 at 4:06 AM, NeilBrown ne...@suse.de wrote:

On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
patrick.marl...@gmail.com wrote:

On 05/13/2015 04:58 AM, NeilBrown wrote:

On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org wrote:

On Tue, 12 May 2015 15:46:26 -0700
Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
What comes after this is:

list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
if (rdev-raid_disk = 0 

Now the original code had:

rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);

Where mddev-disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member)\
for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
 pos-member != (head);\
 pos = list_entry_rcu(pos-member.next, typeof(*pos), member))

Thus the first use of pos is pos-member.next or:

mddev-disks.next

But now you converted it to rdev = mddev-disks.next, which means the
first use is:

pos = mddev-disks.next-next

I think you are skipping the first element here.



struct mddev {
...
   struct list_headdisks;
...}

struct list_head {
  struct list_head *next, *prev;
};

The tricky thing is that list_entry_rcu before and after the patch is
reading the same thing.


No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


Here I meant list_entry_rcu (in include/linux/rculist.h) not the
change to drivers/md/bitmap.c.



However in your case, the change I proposed is probably wrong I trust
you on this side. :) What's your proposal to fix it with the rculist patch?


What needs fixing?  I don't see anything broken.

Maybe there is something in this rculist patch that I'm missing.  Can you
point me at it?


Do not apply the patch on drivers/md/bitmap.c but only on
include/linux/rculist.h and you will see that the compilation fails.


Here an example of the compilation error in drivers/md/bitmap.c

In file included from include/linux/sched.h:17:0,
 from include/linux/blkdev.h:4,
 from drivers/md/bitmap.c:18:
drivers/md/bitmap.c: In function ‘next_active_rdev’:
include/linux/compiler.h:469:24: error: lvalue required as unary ‘’ operand
  (volatile typeof(x) *)(x); })
^
include/linux/kernel.h:800:49: note: in definition of macro ‘container_of’
  const typeof( ((type *)0)-member ) *__mptr = (ptr); \
 ^
include/linux/compiler.h:470:26: note: in expansion of macro ‘__ACCESS_ONCE’
 #define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
  ^
include/linux/rcupdate.h:630:26: note: in expansion of macro ‘ACCESS_ONCE’
  typeof(p) _p1 = ACCESS_ONCE(p); \
  ^
include/linux/rcupdate.h:587:48: note: in expansion of macro 
‘lockless_dereference’

  typeof(*p) *p1 = (typeof(*p) *__force)lockless_dereference(p); \
^
include/linux/rcupdate.h:723:2: note: in expansion of macro 
‘__rcu_dereference_check’

  __rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)
  ^
include/linux/rcupdate.h:746:32: note: in expansion of macro 
‘rcu_dereference_check’
 #define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ 
needed? @@@*/

^
include/linux/rculist.h:251:28: note: in expansion of macro 
‘rcu_dereference_raw’

  container_of((typeof(ptr))rcu_dereference_raw(ptr), type, member); \
^
drivers/md/bitmap.c:184:10: note: in expansion of macro ‘list_entry_rcu’
   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  ^

Thanks.
--
Pat
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-17 Thread NeilBrown
On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
 wrote:

> 
> 
> On 05/13/2015 04:58 AM, NeilBrown wrote:
> > On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  
> > wrote:
> >
> >> On Tue, 12 May 2015 15:46:26 -0700
> >> "Paul E. McKenney"  wrote:
> >>
> >>> From: Patrick Marlier 
> >>>
> >>> Signed-off-by: Patrick Marlier 
> >>> Signed-off-by: Paul E. McKenney 
> >>> ---
> >>>   drivers/md/bitmap.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> >>> index 2bc56e2a3526..32901772e4ee 100644
> >>> --- a/drivers/md/bitmap.c
> >>> +++ b/drivers/md/bitmap.c
> >>> @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct 
> >>> md_rdev *rdev, struct mddev *mdde
> >>>   rcu_read_lock();
> >>>   if (rdev == NULL)
> >>>   /* start at the beginning */
> >>> - rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> >>> + rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
> >>> same_set);
> >>
> >> Hmm, this changes the semantics.
> >>
> >> The original code looks nasty, I first thought it was broken, but it
> >> seems to work out of sheer luck (or clever hack)
> >
> > Definitely a clever hack - no question of "luck" here :-)
> >
> > It might makes sense to change it to use list_for_each_entry_from_rcu()
> >
> >if (rdev == NULL)
> >   rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
> >else {
> >   rdev_dec_pending(rdev, mddev);
> >   rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, 
> > same_set);
> >}
> >list_for_each_entry_from_rcu(rdev, )
> >
> > but there isn't a "list_next_entry_rcu"
> >
> >
> > Also, it would have been polity to at least 'cc' them Maintainer of this 
> > code
> > in the original patch - no?
> 
> Sure my bad. I hesitated to CC maintainers. I was almost sure that it 
> will be rejected so I wanted to avoid noise.

Well... If the subject has contained the magic string "RFC" I might have been
less concerned.
But there have been enough times that people have changed md without telling
me, and thereby broken it, that I'd much rather  see the patch than not.


> 
> 
> >
> > Thanks,
> > NeilBrown
> >
> >>
> >>>   else {
> >>>   /* release the previous rdev and start from there. */
> >>>   rdev_dec_pending(rdev, mddev);
> >>
> >>
> >> What comes after this is:
> >>
> >>list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
> >>if (rdev->raid_disk >= 0 &&
> >>
> >> Now the original code had:
> >>
> >>rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> >>
> >> Where >disks would return the address of the disks field of
> >> mddev which is a list head. Then it would get the 'same_set' offset,
> >> which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
> >> isn't used, as the list_for_each_entry_continue_rcu() has:
> >>
> >> #define list_for_each_entry_continue_rcu(pos, head, member)
> >> \
> >>for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
> >> >member != (head);\
> >> pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> >>
> >> Thus the first use of pos is pos->member.next or:
> >>
> >>mddev->disks.next
> >>
> >> But now you converted it to rdev = mddev->disks.next, which means the
> >> first use is:
> >>
> >>pos = mddev->disks.next->next
> >>
> >> I think you are skipping the first element here.
> 
> 
> struct mddev {
> ...
>   struct list_headdisks;
> ...}
> 
> struct list_head {
>  struct list_head *next, *prev;
> };
> 
> The tricky thing is that "list_entry_rcu" before and after the patch is 
> reading the same thing.

No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


> 
> However in your case, the change I proposed is probably wrong I trust 
> you on this side. :) What's your proposal to fix it with the rculist patch?

What needs fixing?  I don't see anything broken.

Maybe there is something in this "rculist patch" that I'm missing.  Can you
point me at it?

Thanks,
NeilBrown


> 
> PS: In the rculist patch I proposed, I avoid the store and the atomic 
> reload in the stack variable __ptr. (yeap, the 
> rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly 
> do & on the parameter).
> 
> Thanks.
> --
> Pat
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



pgp3v3UEoN5r7.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-17 Thread NeilBrown
On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
patrick.marl...@gmail.com wrote:

 
 
 On 05/13/2015 04:58 AM, NeilBrown wrote:
  On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org 
  wrote:
 
  On Tue, 12 May 2015 15:46:26 -0700
  Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
 
  From: Patrick Marlier patrick.marl...@gmail.com
 
  Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
  Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
  ---
drivers/md/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
  index 2bc56e2a3526..32901772e4ee 100644
  --- a/drivers/md/bitmap.c
  +++ b/drivers/md/bitmap.c
  @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct 
  md_rdev *rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
  - rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  + rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
  same_set);
 
  Hmm, this changes the semantics.
 
  The original code looks nasty, I first thought it was broken, but it
  seems to work out of sheer luck (or clever hack)
 
  Definitely a clever hack - no question of luck here :-)
 
  It might makes sense to change it to use list_for_each_entry_from_rcu()
 
 if (rdev == NULL)
rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, same_set);
 else {
rdev_dec_pending(rdev, mddev);
rdev = list_next_entry_rcu(rdev-same_set.next, struct md_rdev, 
  same_set);
 }
 list_for_each_entry_from_rcu(rdev, )
 
  but there isn't a list_next_entry_rcu
 
 
  Also, it would have been polity to at least 'cc' them Maintainer of this 
  code
  in the original patch - no?
 
 Sure my bad. I hesitated to CC maintainers. I was almost sure that it 
 will be rejected so I wanted to avoid noise.

Well... If the subject has contained the magic string RFC I might have been
less concerned.
But there have been enough times that people have changed md without telling
me, and thereby broken it, that I'd much rather  see the patch than not.


 
 
 
  Thanks,
  NeilBrown
 
 
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);
 
 
  What comes after this is:
 
 list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
 if (rdev-raid_disk = 0 
 
  Now the original code had:
 
 rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 
  Where mddev-disks would return the address of the disks field of
  mddev which is a list head. Then it would get the 'same_set' offset,
  which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
  isn't used, as the list_for_each_entry_continue_rcu() has:
 
  #define list_for_each_entry_continue_rcu(pos, head, member)
  \
 for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
  pos-member != (head);\
  pos = list_entry_rcu(pos-member.next, typeof(*pos), member))
 
  Thus the first use of pos is pos-member.next or:
 
 mddev-disks.next
 
  But now you converted it to rdev = mddev-disks.next, which means the
  first use is:
 
 pos = mddev-disks.next-next
 
  I think you are skipping the first element here.
 
 
 struct mddev {
 ...
   struct list_headdisks;
 ...}
 
 struct list_head {
  struct list_head *next, *prev;
 };
 
 The tricky thing is that list_entry_rcu before and after the patch is 
 reading the same thing.

No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.


 
 However in your case, the change I proposed is probably wrong I trust 
 you on this side. :) What's your proposal to fix it with the rculist patch?

What needs fixing?  I don't see anything broken.

Maybe there is something in this rculist patch that I'm missing.  Can you
point me at it?

Thanks,
NeilBrown


 
 PS: In the rculist patch I proposed, I avoid the store and the atomic 
 reload in the stack variable __ptr. (yeap, the 
 rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly 
 do  on the parameter).
 
 Thanks.
 --
 Pat
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/



pgp3v3UEoN5r7.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-16 Thread Patrick Marlier



On 05/13/2015 04:58 AM, NeilBrown wrote:

On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  wrote:


On Tue, 12 May 2015 15:46:26 -0700
"Paul E. McKenney"  wrote:


From: Patrick Marlier 

Signed-off-by: Patrick Marlier 
Signed-off-by: Paul E. McKenney 
---
  drivers/md/bitmap.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..32901772e4ee 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
+   rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
same_set);


Hmm, this changes the semantics.

The original code looks nasty, I first thought it was broken, but it
seems to work out of sheer luck (or clever hack)


Definitely a clever hack - no question of "luck" here :-)

It might makes sense to change it to use list_for_each_entry_from_rcu()

   if (rdev == NULL)
  rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
   else {
  rdev_dec_pending(rdev, mddev);
  rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, same_set);
   }
   list_for_each_entry_from_rcu(rdev, )

but there isn't a "list_next_entry_rcu"


Also, it would have been polity to at least 'cc' them Maintainer of this code
in the original patch - no?


Sure my bad. I hesitated to CC maintainers. I was almost sure that it 
will be rejected so I wanted to avoid noise.





Thanks,
NeilBrown




else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);



What comes after this is:

list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
if (rdev->raid_disk >= 0 &&

Now the original code had:

   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);

Where >disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
 >member != (head); \
 pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

Thus the first use of pos is pos->member.next or:

   mddev->disks.next

But now you converted it to rdev = mddev->disks.next, which means the
first use is:

   pos = mddev->disks.next->next

I think you are skipping the first element here.



struct mddev {
...
struct list_headdisks;
...}

struct list_head {
struct list_head *next, *prev;
};

The tricky thing is that "list_entry_rcu" before and after the patch is 
reading the same thing.


However in your case, the change I proposed is probably wrong I trust 
you on this side. :) What's your proposal to fix it with the rculist patch?


PS: In the rculist patch I proposed, I avoid the store and the atomic 
reload in the stack variable __ptr. (yeap, the 
rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly 
do & on the parameter).


Thanks.
--
Pat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-16 Thread Patrick Marlier



On 05/13/2015 04:58 AM, NeilBrown wrote:

On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org wrote:


On Tue, 12 May 2015 15:46:26 -0700
Paul E. McKenney paul...@linux.vnet.ibm.com wrote:


From: Patrick Marlier patrick.marl...@gmail.com

Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
---
  drivers/md/bitmap.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..32901772e4ee 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
+   rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
same_set);


Hmm, this changes the semantics.

The original code looks nasty, I first thought it was broken, but it
seems to work out of sheer luck (or clever hack)


Definitely a clever hack - no question of luck here :-)

It might makes sense to change it to use list_for_each_entry_from_rcu()

   if (rdev == NULL)
  rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, same_set);
   else {
  rdev_dec_pending(rdev, mddev);
  rdev = list_next_entry_rcu(rdev-same_set.next, struct md_rdev, same_set);
   }
   list_for_each_entry_from_rcu(rdev, )

but there isn't a list_next_entry_rcu


Also, it would have been polity to at least 'cc' them Maintainer of this code
in the original patch - no?


Sure my bad. I hesitated to CC maintainers. I was almost sure that it 
will be rejected so I wanted to avoid noise.





Thanks,
NeilBrown




else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);



What comes after this is:

list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
if (rdev-raid_disk = 0 

Now the original code had:

   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);

Where mddev-disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
 pos-member != (head); \
 pos = list_entry_rcu(pos-member.next, typeof(*pos), member))

Thus the first use of pos is pos-member.next or:

   mddev-disks.next

But now you converted it to rdev = mddev-disks.next, which means the
first use is:

   pos = mddev-disks.next-next

I think you are skipping the first element here.



struct mddev {
...
struct list_headdisks;
...}

struct list_head {
struct list_head *next, *prev;
};

The tricky thing is that list_entry_rcu before and after the patch is 
reading the same thing.


However in your case, the change I proposed is probably wrong I trust 
you on this side. :) What's your proposal to fix it with the rculist patch?


PS: In the rculist patch I proposed, I avoid the store and the atomic 
reload in the stack variable __ptr. (yeap, the 
rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly 
do  on the parameter).


Thanks.
--
Pat
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-13 Thread Paul E. McKenney
On Wed, May 13, 2015 at 12:58:39PM +1000, NeilBrown wrote:
> On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  wrote:
> 
> > On Tue, 12 May 2015 15:46:26 -0700
> > "Paul E. McKenney"  wrote:
> > 
> > > From: Patrick Marlier 
> > > 
> > > Signed-off-by: Patrick Marlier 
> > > Signed-off-by: Paul E. McKenney 
> > > ---
> > >  drivers/md/bitmap.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > > index 2bc56e2a3526..32901772e4ee 100644
> > > --- a/drivers/md/bitmap.c
> > > +++ b/drivers/md/bitmap.c
> > > @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct 
> > > md_rdev *rdev, struct mddev *mdde
> > >   rcu_read_lock();
> > >   if (rdev == NULL)
> > >   /* start at the beginning */
> > > - rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> > > + rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
> > > same_set);
> > 
> > Hmm, this changes the semantics.
> > 
> > The original code looks nasty, I first thought it was broken, but it
> > seems to work out of sheer luck (or clever hack)
> 
> Definitely a clever hack - no question of "luck" here :-)
> 
> It might makes sense to change it to use list_for_each_entry_from_rcu()
> 
>   if (rdev == NULL)
>  rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
>   else {
>  rdev_dec_pending(rdev, mddev);
>  rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, 
> same_set);
>   }
>   list_for_each_entry_from_rcu(rdev, )
> 
> but there isn't a "list_next_entry_rcu"

Patrick, this one is for you.  As are all of the questions from Steven.

> Also, it would have been polity to at least 'cc' them Maintainer of this code
> in the original patch - no?

Rest assured that this set is not going to -tip without at least an
Acked-by from at least one of the maintainers.  In some cases, I will
interpret silence as assent, but this is not one of those cases.  ;-)

Thanx, Paul

> Thanks,
> NeilBrown
> 
> > 
> > >   else {
> > >   /* release the previous rdev and start from there. */
> > >   rdev_dec_pending(rdev, mddev);
> > 
> > 
> > What comes after this is:
> > 
> > list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
> > if (rdev->raid_disk >= 0 &&
> > 
> > Now the original code had:
> > 
> >   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> > 
> > Where >disks would return the address of the disks field of
> > mddev which is a list head. Then it would get the 'same_set' offset,
> > which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
> > isn't used, as the list_for_each_entry_continue_rcu() has:
> > 
> > #define list_for_each_entry_continue_rcu(pos, head, member) 
> > \
> > for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
> >  >member != (head);\
> >  pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> > 
> > Thus the first use of pos is pos->member.next or:
> > 
> >   mddev->disks.next
> > 
> > But now you converted it to rdev = mddev->disks.next, which means the
> > first use is:
> > 
> >   pos = mddev->disks.next->next
> > 
> > I think you are skipping the first element here.
> > 
> > -- Steve
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-13 Thread Paul E. McKenney
On Wed, May 13, 2015 at 12:58:39PM +1000, NeilBrown wrote:
 On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org wrote:
 
  On Tue, 12 May 2015 15:46:26 -0700
  Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
  
   From: Patrick Marlier patrick.marl...@gmail.com
   
   Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
   Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
   ---
drivers/md/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
   
   diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
   index 2bc56e2a3526..32901772e4ee 100644
   --- a/drivers/md/bitmap.c
   +++ b/drivers/md/bitmap.c
   @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct 
   md_rdev *rdev, struct mddev *mdde
 rcu_read_lock();
 if (rdev == NULL)
 /* start at the beginning */
   - rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
   + rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
   same_set);
  
  Hmm, this changes the semantics.
  
  The original code looks nasty, I first thought it was broken, but it
  seems to work out of sheer luck (or clever hack)
 
 Definitely a clever hack - no question of luck here :-)
 
 It might makes sense to change it to use list_for_each_entry_from_rcu()
 
   if (rdev == NULL)
  rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, same_set);
   else {
  rdev_dec_pending(rdev, mddev);
  rdev = list_next_entry_rcu(rdev-same_set.next, struct md_rdev, 
 same_set);
   }
   list_for_each_entry_from_rcu(rdev, )
 
 but there isn't a list_next_entry_rcu

Patrick, this one is for you.  As are all of the questions from Steven.

 Also, it would have been polity to at least 'cc' them Maintainer of this code
 in the original patch - no?

Rest assured that this set is not going to -tip without at least an
Acked-by from at least one of the maintainers.  In some cases, I will
interpret silence as assent, but this is not one of those cases.  ;-)

Thanx, Paul

 Thanks,
 NeilBrown
 
  
 else {
 /* release the previous rdev and start from there. */
 rdev_dec_pending(rdev, mddev);
  
  
  What comes after this is:
  
  list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
  if (rdev-raid_disk = 0 
  
  Now the original code had:
  
rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  
  Where mddev-disks would return the address of the disks field of
  mddev which is a list head. Then it would get the 'same_set' offset,
  which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
  isn't used, as the list_for_each_entry_continue_rcu() has:
  
  #define list_for_each_entry_continue_rcu(pos, head, member) 
  \
  for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
   pos-member != (head);\
   pos = list_entry_rcu(pos-member.next, typeof(*pos), member))
  
  Thus the first use of pos is pos-member.next or:
  
mddev-disks.next
  
  But now you converted it to rdev = mddev-disks.next, which means the
  first use is:
  
pos = mddev-disks.next-next
  
  I think you are skipping the first element here.
  
  -- Steve
 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread NeilBrown
On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  wrote:

> On Tue, 12 May 2015 15:46:26 -0700
> "Paul E. McKenney"  wrote:
> 
> > From: Patrick Marlier 
> > 
> > Signed-off-by: Patrick Marlier 
> > Signed-off-by: Paul E. McKenney 
> > ---
> >  drivers/md/bitmap.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > index 2bc56e2a3526..32901772e4ee 100644
> > --- a/drivers/md/bitmap.c
> > +++ b/drivers/md/bitmap.c
> > @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
> > *rdev, struct mddev *mdde
> > rcu_read_lock();
> > if (rdev == NULL)
> > /* start at the beginning */
> > -   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> > +   rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
> > same_set);
> 
> Hmm, this changes the semantics.
> 
> The original code looks nasty, I first thought it was broken, but it
> seems to work out of sheer luck (or clever hack)

Definitely a clever hack - no question of "luck" here :-)

It might makes sense to change it to use list_for_each_entry_from_rcu()

  if (rdev == NULL)
 rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
  else {
 rdev_dec_pending(rdev, mddev);
 rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, same_set);
  }
  list_for_each_entry_from_rcu(rdev, )

but there isn't a "list_next_entry_rcu"


Also, it would have been polity to at least 'cc' them Maintainer of this code
in the original patch - no?

Thanks,
NeilBrown

> 
> > else {
> > /* release the previous rdev and start from there. */
> > rdev_dec_pending(rdev, mddev);
> 
> 
> What comes after this is:
> 
>   list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
>   if (rdev->raid_disk >= 0 &&
> 
> Now the original code had:
> 
>   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> 
> Where >disks would return the address of the disks field of
> mddev which is a list head. Then it would get the 'same_set' offset,
> which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
> isn't used, as the list_for_each_entry_continue_rcu() has:
> 
> #define list_for_each_entry_continue_rcu(pos, head, member)   \
>   for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
>>member != (head);\
>pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> 
> Thus the first use of pos is pos->member.next or:
> 
>   mddev->disks.next
> 
> But now you converted it to rdev = mddev->disks.next, which means the
> first use is:
> 
>   pos = mddev->disks.next->next
> 
> I think you are skipping the first element here.
> 
> -- Steve



pgpOeTxNWol3n.pgp
Description: OpenPGP digital signature


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread Steven Rostedt
On Tue, 12 May 2015 15:46:26 -0700
"Paul E. McKenney"  wrote:

> From: Patrick Marlier 
> 
> Signed-off-by: Patrick Marlier 
> Signed-off-by: Paul E. McKenney 
> ---
>  drivers/md/bitmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> index 2bc56e2a3526..32901772e4ee 100644
> --- a/drivers/md/bitmap.c
> +++ b/drivers/md/bitmap.c
> @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
> *rdev, struct mddev *mdde
>   rcu_read_lock();
>   if (rdev == NULL)
>   /* start at the beginning */
> - rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
> + rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
> same_set);

Hmm, this changes the semantics.

The original code looks nasty, I first thought it was broken, but it
seems to work out of sheer luck (or clever hack)

>   else {
>   /* release the previous rdev and start from there. */
>   rdev_dec_pending(rdev, mddev);


What comes after this is:

list_for_each_entry_continue_rcu(rdev, >disks, same_set) {
if (rdev->raid_disk >= 0 &&

Now the original code had:

  rdev = list_entry_rcu(>disks, struct md_rdev, same_set);

Where >disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
 >member != (head);\
 pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

Thus the first use of pos is pos->member.next or:

  mddev->disks.next

But now you converted it to rdev = mddev->disks.next, which means the
first use is:

  pos = mddev->disks.next->next

I think you are skipping the first element here.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread Paul E. McKenney
From: Patrick Marlier 

Signed-off-by: Patrick Marlier 
Signed-off-by: Paul E. McKenney 
---
 drivers/md/bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..32901772e4ee 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(>disks, struct md_rdev, same_set);
+   rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);
-- 
1.8.1.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread Steven Rostedt
On Tue, 12 May 2015 15:46:26 -0700
Paul E. McKenney paul...@linux.vnet.ibm.com wrote:

 From: Patrick Marlier patrick.marl...@gmail.com
 
 Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
 Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
 ---
  drivers/md/bitmap.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
 index 2bc56e2a3526..32901772e4ee 100644
 --- a/drivers/md/bitmap.c
 +++ b/drivers/md/bitmap.c
 @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
 *rdev, struct mddev *mdde
   rcu_read_lock();
   if (rdev == NULL)
   /* start at the beginning */
 - rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 + rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
 same_set);

Hmm, this changes the semantics.

The original code looks nasty, I first thought it was broken, but it
seems to work out of sheer luck (or clever hack)

   else {
   /* release the previous rdev and start from there. */
   rdev_dec_pending(rdev, mddev);


What comes after this is:

list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
if (rdev-raid_disk = 0 

Now the original code had:

  rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);

Where mddev-disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
 pos-member != (head);\
 pos = list_entry_rcu(pos-member.next, typeof(*pos), member))

Thus the first use of pos is pos-member.next or:

  mddev-disks.next

But now you converted it to rdev = mddev-disks.next, which means the
first use is:

  pos = mddev-disks.next-next

I think you are skipping the first element here.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread Paul E. McKenney
From: Patrick Marlier patrick.marl...@gmail.com

Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
---
 drivers/md/bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..32901772e4ee 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
*rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
-   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
+   rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
same_set);
else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);
-- 
1.8.1.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-12 Thread NeilBrown
On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt rost...@goodmis.org wrote:

 On Tue, 12 May 2015 15:46:26 -0700
 Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
 
  From: Patrick Marlier patrick.marl...@gmail.com
  
  Signed-off-by: Patrick Marlier patrick.marl...@gmail.com
  Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
  ---
   drivers/md/bitmap.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
  index 2bc56e2a3526..32901772e4ee 100644
  --- a/drivers/md/bitmap.c
  +++ b/drivers/md/bitmap.c
  @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev 
  *rdev, struct mddev *mdde
  rcu_read_lock();
  if (rdev == NULL)
  /* start at the beginning */
  -   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
  +   rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, 
  same_set);
 
 Hmm, this changes the semantics.
 
 The original code looks nasty, I first thought it was broken, but it
 seems to work out of sheer luck (or clever hack)

Definitely a clever hack - no question of luck here :-)

It might makes sense to change it to use list_for_each_entry_from_rcu()

  if (rdev == NULL)
 rdev = list_entry_rcu(mddev-disks.next, struct md_rdev, same_set);
  else {
 rdev_dec_pending(rdev, mddev);
 rdev = list_next_entry_rcu(rdev-same_set.next, struct md_rdev, same_set);
  }
  list_for_each_entry_from_rcu(rdev, )

but there isn't a list_next_entry_rcu


Also, it would have been polity to at least 'cc' them Maintainer of this code
in the original patch - no?

Thanks,
NeilBrown

 
  else {
  /* release the previous rdev and start from there. */
  rdev_dec_pending(rdev, mddev);
 
 
 What comes after this is:
 
   list_for_each_entry_continue_rcu(rdev, mddev-disks, same_set) {
   if (rdev-raid_disk = 0 
 
 Now the original code had:
 
   rdev = list_entry_rcu(mddev-disks, struct md_rdev, same_set);
 
 Where mddev-disks would return the address of the disks field of
 mddev which is a list head. Then it would get the 'same_set' offset,
 which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
 isn't used, as the list_for_each_entry_continue_rcu() has:
 
 #define list_for_each_entry_continue_rcu(pos, head, member)   \
   for (pos = list_entry_rcu(pos-member.next, typeof(*pos), member); \
pos-member != (head);\
pos = list_entry_rcu(pos-member.next, typeof(*pos), member))
 
 Thus the first use of pos is pos-member.next or:
 
   mddev-disks.next
 
 But now you converted it to rdev = mddev-disks.next, which means the
 first use is:
 
   pos = mddev-disks.next-next
 
 I think you are skipping the first element here.
 
 -- Steve



pgpOeTxNWol3n.pgp
Description: OpenPGP digital signature