Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Paul E. McKenney
On Thu, Jun 18, 2015 at 01:10:22PM -0400, Steven Rostedt wrote:
> On Thu, 18 Jun 2015 09:54:07 -0700
> "Paul E. McKenney"  wrote:
> 
> > Yep, I have to frequently remind them that most projects need to support
> > old compilers.  And I did point out that the commentary at the end
> > of that document would not encourage adoption of C11.  They of course
> > felt this was unfair of me, so I have to thank you both for proving the
> > correctness of my reply to them.  Although you guys didn't use quite as
> > many swear words as I would have expected.  ;-)
> 
> I could add a few more if you would like ;-)

I bet!  ;-)

> What's their issue? Is there some kind of benchmark war going on
> between different compilers? Where they want want to prove they can
> produce the absolute fastest code possible, but only use single
> threaded apps and screw those that must support multi-threaded
> applications.
> 
> My phone and my camera have multicore systems. Single threaded is not
> the way of the future. Which ever compiler makes it easier to write
> multi-threaded applications is going to win, regardless of how well a
> compiler can claim they optimize code the best for a single threaded
> app.

Their viewpoint is that they have produced syntax for marking shared
variables and also for marking accesses to shared variables, and that for
code that is not using those markings, anything goes.  They need frequent
reminders of the need to accommodate pre-existing code, and usually don't
take such reminders very well.  Ditto for projects such as the Linux
kernel to support pre-C11 compilers, and that need production-quality
compiler support.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Steven Rostedt
On Thu, 18 Jun 2015 09:54:07 -0700
"Paul E. McKenney"  wrote:

> Yep, I have to frequently remind them that most projects need to support
> old compilers.  And I did point out that the commentary at the end
> of that document would not encourage adoption of C11.  They of course
> felt this was unfair of me, so I have to thank you both for proving the
> correctness of my reply to them.  Although you guys didn't use quite as
> many swear words as I would have expected.  ;-)

I could add a few more if you would like ;-)


What's their issue? Is there some kind of benchmark war going on
between different compilers? Where they want want to prove they can
produce the absolute fastest code possible, but only use single
threaded apps and screw those that must support multi-threaded
applications.

My phone and my camera have multicore systems. Single threaded is not
the way of the future. Which ever compiler makes it easier to write
multi-threaded applications is going to win, regardless of how well a
compiler can claim they optimize code the best for a single threaded
app.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Paul E. McKenney
On Thu, Jun 18, 2015 at 12:40:15PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 18, 2015 at 11:40:14AM +0200, Ingo Molnar wrote:
> 
> > In what retarded use-case do unasked for speculative writes even make any 
> > sense 
> > beyond as a sadistic tool to make parallel, threaded code even more 
> > fragile??

Well, these are the compiler guys we are talking about, so why would
you expect otherwise?  Sorry, couldn't resist...  ;-)

They believe that all parallel threaded code should mark either all
shared variables (but get memory barriers on all normal accesses) or all
accesses.  Their use case is that they would like to be able to continue
carrying out single-threaded-safe optimizations on concurrent code.  And
speculative writes of that form are just fine in single-threaded code.

And if you think -that- is bad, you should have seen the screaming and
shouting when they were prohibited from stomping on unrelated variables.
They used to do "write widening", where they might use a vector unit
to do a store, then fix up the adjacent variables that were clobbered
by a too-wide write.  You just can't make this stuff up!

Anyway, I am having an ongoing discussion with them on handling
pre-existing code.  It has been an "interesting" discussion, particularly
the parts involving rcu_dereference().

> So what worries me most is the "Takeaways" from the document:
> 
>   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
> 
> Those read like a 'fuck you' to 30 years of concurrent code in C.

Pretty much.  I already told them that, and they will hopefully do better
in the next version.  Some of them were annoyed at me for telling them
what they could do with their proposal to eliminate thread-local storage
(TLS, similar to Linux kernel per-CPU variables), though JF Bastien was
co-author with me on that paper.  But perhaps he felt the need to kiss
up to the people who were annoyed by the refusal to go along with their
TLS-removal urge.

> Sure, its nice and all that they finally have something that's
> standardized, and this might be an option for new projects (in reality
> it might only really be an option in another 5-10 years).
> 
> But the active encouragement to break existing code is utterly fucked.

Yep, I have to frequently remind them that most projects need to support
old compilers.  And I did point out that the commentary at the end
of that document would not encourage adoption of C11.  They of course
felt this was unfair of me, so I have to thank you both for proving the
correctness of my reply to them.  Although you guys didn't use quite as
many swear words as I would have expected.  ;-)

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Peter Zijlstra
On Thu, Jun 18, 2015 at 11:40:14AM +0200, Ingo Molnar wrote:

> In what retarded use-case do unasked for speculative writes even make any 
> sense 
> beyond as a sadistic tool to make parallel, threaded code even more fragile??

So what worries me most is the "Takeaways" from the document:

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

Those read like a 'fuck you' to 30 years of concurrent code in C.

Sure, its nice and all that they finally have something that's
standardized, and this might be an option for new projects (in reality
it might only really be an option in another 5-10 years).

But the active encouragement to break existing code is utterly fucked.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> > > I doubt there's a single OS kernel (that supports SMP configurations) 
> > > that 
> > > does not rely on a whole host of 'undefined' behaviour.
> > 
> > An alternative approach would be a compiler switch (or similar) that 
> > changed 
> > the default atomic access from SC to relaxed.  Then shared variables could 
> > be 
> > marked atomic, and normal C code could be used to access them, but without 
> > the 
> > compiler emitting memory barriers all over the place (yes, even on x86).
> 
> See, I don;'t think that is a realistic approach. Who is going to audit our 
> ~16 
> million lines of code to mark all shared variables? Or all the other existing 
> code bases that rely on this behaviour?

Sidenote: we are well beyond 19 million lines meanwhile.

But generating speculative writes unless the compiler can prove it's not shared 
memory are crazy. Who on earth argues they are sane?

In what retarded use-case do unasked for speculative writes even make any sense 
beyond as a sadistic tool to make parallel, threaded code even more fragile??

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 11:02:14AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 17, 2015 at 07:11:40PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
> > > The point of std::atomic<> (and of the equivalent C11 syntax) is to
> > > force the compiler to suppress optimizations that are unsafe for shared
> > > variables.  We get more or less the same effect with volatile, protests
> > > from compiler people notwithstanding.
> > > 
> > > I often tell the compiler guys that they have to expect make -some-
> > > concessions for being 30 years late to the concurrency party, but
> > > it nevertheless makes sense to future-proof our code where it is
> > > reasonable to do so.
> > 
> > Right, so in that regards I would request the compiler option (and or
> > #pragma) that disables all the out-of-thin-air nonsense.
> 
> OK.  What is the form of the #pragma?  If it focuses on a specific
> access, we are likely to get a lot of pushback.

I didn't have anything specific in mind; other than

#pragma no_speculative_stores_ever

Which would forbid all these retarded 'optimizations' for the entire
translation unit.

> > Because while they hide behind their undefined behaviour, the fact is
> > that all of their machines for the past 30 odd years have been relying
> > on this 'undefined' behaviour to work. This being the machines they've
> > been typing their useless specs on :-)
> 
> Maybe I can scare them into doing all their work on UP systems.  ;-)
> 
> Interestingly enough, LLVM is taking a slightly different approach.
> Rather than invoke undefined behavior, they say that data races result
> in random bits being loaded.  Not that it makes much difference to the
> health and well-being of the software, mind you...

I'm not sure I follow that argument.

> > I doubt there's a single OS kernel (that supports SMP configurations)
> > that does not rely on a whole host of 'undefined' behaviour.
> 
> An alternative approach would be a compiler switch (or similar) that
> changed the default atomic access from SC to relaxed.  Then shared
> variables could be marked atomic, and normal C code could be used to
> access them, but without the compiler emitting memory barriers all over
> the place (yes, even on x86).

See, I don;'t think that is a realistic approach. Who is going to audit
our ~16 million lines of code to mark all shared variables? Or all the
other existing code bases that rely on this behaviour?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Paul E. McKenney
On Thu, Jun 18, 2015 at 12:40:15PM +0200, Peter Zijlstra wrote:
 On Thu, Jun 18, 2015 at 11:40:14AM +0200, Ingo Molnar wrote:
 
  In what retarded use-case do unasked for speculative writes even make any 
  sense 
  beyond as a sadistic tool to make parallel, threaded code even more 
  fragile??

Well, these are the compiler guys we are talking about, so why would
you expect otherwise?  Sorry, couldn't resist...  ;-)

They believe that all parallel threaded code should mark either all
shared variables (but get memory barriers on all normal accesses) or all
accesses.  Their use case is that they would like to be able to continue
carrying out single-threaded-safe optimizations on concurrent code.  And
speculative writes of that form are just fine in single-threaded code.

And if you think -that- is bad, you should have seen the screaming and
shouting when they were prohibited from stomping on unrelated variables.
They used to do write widening, where they might use a vector unit
to do a store, then fix up the adjacent variables that were clobbered
by a too-wide write.  You just can't make this stuff up!

Anyway, I am having an ongoing discussion with them on handling
pre-existing code.  It has been an interesting discussion, particularly
the parts involving rcu_dereference().

 So what worries me most is the Takeaways from the document:
 
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
 
 Those read like a 'fuck you' to 30 years of concurrent code in C.

Pretty much.  I already told them that, and they will hopefully do better
in the next version.  Some of them were annoyed at me for telling them
what they could do with their proposal to eliminate thread-local storage
(TLS, similar to Linux kernel per-CPU variables), though JF Bastien was
co-author with me on that paper.  But perhaps he felt the need to kiss
up to the people who were annoyed by the refusal to go along with their
TLS-removal urge.

 Sure, its nice and all that they finally have something that's
 standardized, and this might be an option for new projects (in reality
 it might only really be an option in another 5-10 years).
 
 But the active encouragement to break existing code is utterly fucked.

Yep, I have to frequently remind them that most projects need to support
old compilers.  And I did point out that the commentary at the end
of that document would not encourage adoption of C11.  They of course
felt this was unfair of me, so I have to thank you both for proving the
correctness of my reply to them.  Although you guys didn't use quite as
many swear words as I would have expected.  ;-)

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Steven Rostedt
On Thu, 18 Jun 2015 09:54:07 -0700
Paul E. McKenney paul...@linux.vnet.ibm.com wrote:

 Yep, I have to frequently remind them that most projects need to support
 old compilers.  And I did point out that the commentary at the end
 of that document would not encourage adoption of C11.  They of course
 felt this was unfair of me, so I have to thank you both for proving the
 correctness of my reply to them.  Although you guys didn't use quite as
 many swear words as I would have expected.  ;-)

I could add a few more if you would like ;-)


What's their issue? Is there some kind of benchmark war going on
between different compilers? Where they want want to prove they can
produce the absolute fastest code possible, but only use single
threaded apps and screw those that must support multi-threaded
applications.

My phone and my camera have multicore systems. Single threaded is not
the way of the future. Which ever compiler makes it easier to write
multi-threaded applications is going to win, regardless of how well a
compiler can claim they optimize code the best for a single threaded
app.

-- Steve


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Paul E. McKenney
On Thu, Jun 18, 2015 at 01:10:22PM -0400, Steven Rostedt wrote:
 On Thu, 18 Jun 2015 09:54:07 -0700
 Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
 
  Yep, I have to frequently remind them that most projects need to support
  old compilers.  And I did point out that the commentary at the end
  of that document would not encourage adoption of C11.  They of course
  felt this was unfair of me, so I have to thank you both for proving the
  correctness of my reply to them.  Although you guys didn't use quite as
  many swear words as I would have expected.  ;-)
 
 I could add a few more if you would like ;-)

I bet!  ;-)

 What's their issue? Is there some kind of benchmark war going on
 between different compilers? Where they want want to prove they can
 produce the absolute fastest code possible, but only use single
 threaded apps and screw those that must support multi-threaded
 applications.
 
 My phone and my camera have multicore systems. Single threaded is not
 the way of the future. Which ever compiler makes it easier to write
 multi-threaded applications is going to win, regardless of how well a
 compiler can claim they optimize code the best for a single threaded
 app.

Their viewpoint is that they have produced syntax for marking shared
variables and also for marking accesses to shared variables, and that for
code that is not using those markings, anything goes.  They need frequent
reminders of the need to accommodate pre-existing code, and usually don't
take such reminders very well.  Ditto for projects such as the Linux
kernel to support pre-C11 compilers, and that need production-quality
compiler support.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 11:02:14AM -0700, Paul E. McKenney wrote:
 On Wed, Jun 17, 2015 at 07:11:40PM +0200, Peter Zijlstra wrote:
  On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
   The point of std::atomic (and of the equivalent C11 syntax) is to
   force the compiler to suppress optimizations that are unsafe for shared
   variables.  We get more or less the same effect with volatile, protests
   from compiler people notwithstanding.
   
   I often tell the compiler guys that they have to expect make -some-
   concessions for being 30 years late to the concurrency party, but
   it nevertheless makes sense to future-proof our code where it is
   reasonable to do so.
  
  Right, so in that regards I would request the compiler option (and or
  #pragma) that disables all the out-of-thin-air nonsense.
 
 OK.  What is the form of the #pragma?  If it focuses on a specific
 access, we are likely to get a lot of pushback.

I didn't have anything specific in mind; other than

#pragma no_speculative_stores_ever

Which would forbid all these retarded 'optimizations' for the entire
translation unit.

  Because while they hide behind their undefined behaviour, the fact is
  that all of their machines for the past 30 odd years have been relying
  on this 'undefined' behaviour to work. This being the machines they've
  been typing their useless specs on :-)
 
 Maybe I can scare them into doing all their work on UP systems.  ;-)
 
 Interestingly enough, LLVM is taking a slightly different approach.
 Rather than invoke undefined behavior, they say that data races result
 in random bits being loaded.  Not that it makes much difference to the
 health and well-being of the software, mind you...

I'm not sure I follow that argument.

  I doubt there's a single OS kernel (that supports SMP configurations)
  that does not rely on a whole host of 'undefined' behaviour.
 
 An alternative approach would be a compiler switch (or similar) that
 changed the default atomic access from SC to relaxed.  Then shared
 variables could be marked atomic, and normal C code could be used to
 access them, but without the compiler emitting memory barriers all over
 the place (yes, even on x86).

See, I don;'t think that is a realistic approach. Who is going to audit
our ~16 million lines of code to mark all shared variables? Or all the
other existing code bases that rely on this behaviour?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

   I doubt there's a single OS kernel (that supports SMP configurations) 
   that 
   does not rely on a whole host of 'undefined' behaviour.
  
  An alternative approach would be a compiler switch (or similar) that 
  changed 
  the default atomic access from SC to relaxed.  Then shared variables could 
  be 
  marked atomic, and normal C code could be used to access them, but without 
  the 
  compiler emitting memory barriers all over the place (yes, even on x86).
 
 See, I don;'t think that is a realistic approach. Who is going to audit our 
 ~16 
 million lines of code to mark all shared variables? Or all the other existing 
 code bases that rely on this behaviour?

Sidenote: we are well beyond 19 million lines meanwhile.

But generating speculative writes unless the compiler can prove it's not shared 
memory are crazy. Who on earth argues they are sane?

In what retarded use-case do unasked for speculative writes even make any sense 
beyond as a sadistic tool to make parallel, threaded code even more fragile??

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-18 Thread Peter Zijlstra
On Thu, Jun 18, 2015 at 11:40:14AM +0200, Ingo Molnar wrote:

 In what retarded use-case do unasked for speculative writes even make any 
 sense 
 beyond as a sadistic tool to make parallel, threaded code even more fragile??

So what worries me most is the Takeaways from the document:

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

Those read like a 'fuck you' to 30 years of concurrent code in C.

Sure, its nice and all that they finally have something that's
standardized, and this might be an option for new projects (in reality
it might only really be an option in another 5-10 years).

But the active encouragement to break existing code is utterly fucked.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 07:11:40PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
> > The point of std::atomic<> (and of the equivalent C11 syntax) is to
> > force the compiler to suppress optimizations that are unsafe for shared
> > variables.  We get more or less the same effect with volatile, protests
> > from compiler people notwithstanding.
> > 
> > I often tell the compiler guys that they have to expect make -some-
> > concessions for being 30 years late to the concurrency party, but
> > it nevertheless makes sense to future-proof our code where it is
> > reasonable to do so.
> 
> Right, so in that regards I would request the compiler option (and or
> #pragma) that disables all the out-of-thin-air nonsense.

OK.  What is the form of the #pragma?  If it focuses on a specific
access, we are likely to get a lot of pushback.

> Because while they hide behind their undefined behaviour, the fact is
> that all of their machines for the past 30 odd years have been relying
> on this 'undefined' behaviour to work. This being the machines they've
> been typing their useless specs on :-)

Maybe I can scare them into doing all their work on UP systems.  ;-)

Interestingly enough, LLVM is taking a slightly different approach.
Rather than invoke undefined behavior, they say that data races result
in random bits being loaded.  Not that it makes much difference to the
health and well-being of the software, mind you...

> I doubt there's a single OS kernel (that supports SMP configurations)
> that does not rely on a whole host of 'undefined' behaviour.

An alternative approach would be a compiler switch (or similar) that
changed the default atomic access from SC to relaxed.  Then shared
variables could be marked atomic, and normal C code could be used to
access them, but without the compiler emitting memory barriers all over
the place (yes, even on x86).

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
> The point of std::atomic<> (and of the equivalent C11 syntax) is to
> force the compiler to suppress optimizations that are unsafe for shared
> variables.  We get more or less the same effect with volatile, protests
> from compiler people notwithstanding.
> 
> I often tell the compiler guys that they have to expect make -some-
> concessions for being 30 years late to the concurrency party, but
> it nevertheless makes sense to future-proof our code where it is
> reasonable to do so.

Right, so in that regards I would request the compiler option (and or
#pragma) that disables all the out-of-thin-air nonsense.

Because while they hide behind their undefined behaviour, the fact is
that all of their machines for the past 30 odd years have been relying
on this 'undefined' behaviour to work. This being the machines they've
been typing their useless specs on :-)

I doubt there's a single OS kernel (that supports SMP configurations)
that does not rely on a whole host of 'undefined' behaviour.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 08:42:45AM -0700, Paul E. McKenney wrote:
> > I would very much prefer a compiler switch that instructs the compiler
> > to not do bloody stupid things like this instead of marking every other
> > load/store in the kernel with volatile.
> 
> I would of course be good with such a compiler switch, though my earlier
> attempts to negotiate one were unsuccessful.  But I don't believe that we
> discussed a switch to specifically prohibit only use of to-be-stored-into
> variables as temporary scratch space.  The trick is finding restrictions
> that are useful, but that don't imply -O0.

I would request on that disables all the 'stores from thin air'
'optimizations'. IOW assume everything is shared memory and concurrent
unless you can prove its not so. For example a local stack variable that
does not escape scope.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 05:49:27PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
> > > I did leave off the READ/WRITE ONCE stuff, because I could not come up
> > > with a scenario where it makes a difference -- I appreciate paranoia,
> > > but I also think we should not overdo the thing.
> > 
> > I can only conclude that you have not read this document:
> > 
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
> > 
> > Specifically, please keep in mind that unless you mark either the variable
> > or the memory access, the compiler is within its rights to assume that
> > there are no concurrent accesses to that variable.  For but one example,
> > if you do a normal store to a given variable, then the compiler is
> > within its rights to use that variable as temporary storage prior to
> > that store.  And yes, you can reasonably argue that no sane compiler
> > would store something else to s->sequence given that it could free up
> > a register by storing the incremented value, but the fact remains that
> > you have given it permission to do so if it wants.
> 
> This is the "Optimizations without Atomics" section you're referring to.
> 
> It has words such as: "if the compiler can prove they are not accessed
> in other threads concurrently" and "This requires escape analysis: the
> compiler must see the full scope of the memory location 'p', or must
> know that leaf functions don't capture 'p' and aren't used concurrently,
> for this optimization to be valid."

Yes, but...  These qualifiers apply only in cases where the code at hand
is not writing to the variable.

In this case, it is legal for other threads to be concurrently reading
the variable.  After all, if the value of the variable is not changing,
then the various read-side optimizations have no effect -- even the silly
ones, like reading the variable one bit at a time.  If the compiler
introduces a write into code that was only reading the variable, then
it is the compiler's job to ensure that no other thread is reading it.
If the compiler were to introduce a write to such a variable that other
threads might be reading, then the compiler would have introduced a data
race, which would mean that the compiler introduced undefined behavior
to correct code.  The compiler therefore must be extremely careful when
introducing a write to a variable that is only read by the code at hand.
(Though there are still some "interesting" escape clauses for the compiler
if the variable in question is allocated on the stack and the function
can return without transferring control to some function in some other
translation unit.)

Recall that a data race occurs when there are multiple concurrent normal
accesses to a given normal variable, at least one of which is a write.
Here normal accesses and normal variables are those that are not marked
as atomic (in the C11 sense).  Accesses and variables marked as volatile
also disable most (perhaps all) of the dangerous optimizations that lead
to the undefined behavior.  That said, many compiler people hate volatile,
and will therefore automatically argue that it is useless in a misguided
attempt to convince people not to use it.  :-/

On the other hand, if the code is already writing to the variable (as it
is in the s->sequence++ case), then if there are any concurrent accesses,
the code -already- contains a data race.  This data race invokes undefined
behavior in the code as written, so the compiler is within its rights to
do anything at all, even spawn the proverbial game of rogue.  A somewhat
more reasonable compiler is within its rights to assume that no one is
concurrently reading any normal variable to which the code does a normal
write.  The compiler is therefore within its rights to use that variable
as scratch storage at any time between the most recent read and the write
in question.

> But then it starts weasel wording and saying that the lack of
> std::atomic<> usage implies a lack of concurrency, to which I strongly
> object.

Heh!

The point of std::atomic<> (and of the equivalent C11 syntax) is to
force the compiler to suppress optimizations that are unsafe for shared
variables.  We get more or less the same effect with volatile, protests
from compiler people notwithstanding.

I often tell the compiler guys that they have to expect make -some-
concessions for being 30 years late to the concurrency party, but
it nevertheless makes sense to future-proof our code where it is
reasonable to do so.

All that aside, I agree that "s->sequence++" is relatively low
priority, given that the compiler can easily free up a register by
storing the actual value.  But that might well be a failure of
imagination on my part.

> Esp. seeing how -ffreestanding does not have access to any of the atomic
> stuff since its library bits and not language bits (something which I've
> often said was a 

Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
> > I did leave off the READ/WRITE ONCE stuff, because I could not come up
> > with a scenario where it makes a difference -- I appreciate paranoia,
> > but I also think we should not overdo the thing.
> 
> I can only conclude that you have not read this document:
> 
>   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
> 
> Specifically, please keep in mind that unless you mark either the variable
> or the memory access, the compiler is within its rights to assume that
> there are no concurrent accesses to that variable.  For but one example,
> if you do a normal store to a given variable, then the compiler is
> within its rights to use that variable as temporary storage prior to
> that store.  And yes, you can reasonably argue that no sane compiler
> would store something else to s->sequence given that it could free up
> a register by storing the incremented value, but the fact remains that
> you have given it permission to do so if it wants.

This is the "Optimizations without Atomics" section you're referring to.

It has words such as: "if the compiler can prove they are not accessed
in other threads concurrently" and "This requires escape analysis: the
compiler must see the full scope of the memory location 'p', or must
know that leaf functions don't capture 'p' and aren't used concurrently,
for this optimization to be valid."

But then it starts weasel wording and saying that the lack of
std::atomic<> usage implies a lack of concurrency, to which I strongly
object.

Esp. seeing how -ffreestanding does not have access to any of the atomic
stuff since its library bits and not language bits (something which I've
often said was a failure in the spec).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 05:11:09PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
> > > I did leave off the READ/WRITE ONCE stuff, because I could not come up
> > > with a scenario where it makes a difference -- I appreciate paranoia,
> > > but I also think we should not overdo the thing.
> > 
> > I can only conclude that you have not read this document:
> > 
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
> 
> This would be correct.
> 
> > Specifically, please keep in mind that unless you mark either the variable
> > or the memory access, the compiler is within its rights to assume that
> > there are no concurrent accesses to that variable.  For but one example,
> > if you do a normal store to a given variable, then the compiler is
> > within its rights to use that variable as temporary storage prior to
> > that store.  And yes, you can reasonably argue that no sane compiler
> > would store something else to s->sequence given that it could free up
> > a register by storing the incremented value, but the fact remains that
> > you have given it permission to do so if it wants.
> 
> Argh *grmbl*, that's bloody insane!

You expected me to argue with that statement?  ;-)

> So I get the re-loading, I get the tearing, but this random intermittent
> values (somewhat related to stores out of thin air) is completely
> bonkers.
> 
> I would very much prefer a compiler switch that instructs the compiler
> to not do bloody stupid things like this instead of marking every other
> load/store in the kernel with volatile.

I would of course be good with such a compiler switch, though my earlier
attempts to negotiate one were unsuccessful.  But I don't believe that we
discussed a switch to specifically prohibit only use of to-be-stored-into
variables as temporary scratch space.  The trick is finding restrictions
that are useful, but that don't imply -O0.

Any GCC or LLVM folks on the list?

> Note that if GCC were to actually do something like this, the kernel
> would already be broken, because I'm very sure we did not consider/audit
> it for this.

An accident waiting to happen, given that both GCC and the Linux kernel
are moving targets.  :-/

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
> > I did leave off the READ/WRITE ONCE stuff, because I could not come up
> > with a scenario where it makes a difference -- I appreciate paranoia,
> > but I also think we should not overdo the thing.
> 
> I can only conclude that you have not read this document:
> 
>   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

This would be correct.

> Specifically, please keep in mind that unless you mark either the variable
> or the memory access, the compiler is within its rights to assume that
> there are no concurrent accesses to that variable.  For but one example,
> if you do a normal store to a given variable, then the compiler is
> within its rights to use that variable as temporary storage prior to
> that store.  And yes, you can reasonably argue that no sane compiler
> would store something else to s->sequence given that it could free up
> a register by storing the incremented value, but the fact remains that
> you have given it permission to do so if it wants.

Argh *grmbl*, that's bloody insane!

So I get the re-loading, I get the tearing, but this random intermittent
values (somewhat related to stores out of thin air) is completely
bonkers.

I would very much prefer a compiler switch that instructs the compiler
to not do bloody stupid things like this instead of marking every other
load/store in the kernel with volatile.

Note that if GCC were to actually do something like this, the kernel
would already be broken, because I'm very sure we did not consider/audit
it for this.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
> > Color me slow and stupid.  Maybe due to reviewing a patch too early in
> > the morning, who knows?
> > 
> > There is nothing above that prevents the compiler and the CPU from
> > reordering the assignments to X and Y with the increment of s->sequence++.
> > One fix would be as follows:
> > 
> > static inline void raw_write_seqcount_barrier(seqcount_t *s)
> > {
> > smp_wmb();
> > s->sequence++;
> > smp_wmb();
> > s->sequence++;
> > smp_wmb();
> > }
> > 
> > Of course, this assumes that the accesses surrounding the call to
> > raw_write_seqcount_barrier() are writes.  If they can be a reads,
> > the two added smp_wmb() calls need to be full barriers.
> 
> I have updated the Changelog to hopefully explain things better.
> 
> I did leave off the READ/WRITE ONCE stuff, because I could not come up
> with a scenario where it makes a difference -- I appreciate paranoia,
> but I also think we should not overdo the thing.

I can only conclude that you have not read this document:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

Specifically, please keep in mind that unless you mark either the variable
or the memory access, the compiler is within its rights to assume that
there are no concurrent accesses to that variable.  For but one example,
if you do a normal store to a given variable, then the compiler is
within its rights to use that variable as temporary storage prior to
that store.  And yes, you can reasonably argue that no sane compiler
would store something else to s->sequence given that it could free up
a register by storing the incremented value, but the fact remains that
you have given it permission to do so if it wants.

Thanx, Paul

> ---
> Subject: seqcount: Introduce raw_write_seqcount_barrier()
> From: Peter Zijlstra 
> Date: Thu Jun 11 12:35:48 CEST 2015
> 
> Introduce raw_write_seqcount_barrier(), a new construct that can be
> used to provide write barrier semantics in seqcount read loops instead
> of the usual consistency guarantee.
> 
> raw_write_seqcount_barier() is equivalent to:
> 
>   raw_write_seqcount_begin();
>   raw_write_seqcount_end();
> 
> But avoids issueing two back-to-back smp_wmb() instructions.
> 
> This construct works because the read side will 'stall' when observing
> odd values. This means that -- referring to the example in the comment
> below -- even though there is no (matching) read barrier between the
> loads of X and Y, we cannot observe !x && !y, because:
> 
>  - if we observe Y == false we must observe the first sequence
>increment, which makes us loop, until
> 
>  - we observe !(seq & 1) -- the second sequence increment -- at which
>time we must also observe T == true.
> 
> Cc: Al Viro 
> Cc: Linus Torvalds 
> Cc: Paul McKenney 
> Suggested-by: Oleg Nesterov 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  include/linux/seqlock.h |   42 ++
>  1 file changed, 42 insertions(+)
> 
> --- a/include/linux/seqlock.h
> +++ b/include/linux/seqlock.h
> @@ -233,6 +233,47 @@ static inline void raw_write_seqcount_en
>   s->sequence++;
>  }
> 
> +/**
> + * raw_write_seqcount_barrier - do a seq write barrier
> + * @s: pointer to seqcount_t
> + *
> + * This can be used to provide an ordering guarantee instead of the
> + * usual consistency guarantee. It is one wmb cheaper, because we can
> + * collapse the two back-to-back wmb()s.
> + *
> + *  seqcount_t seq;
> + *  bool X = true, Y = false;
> + *
> + *  void read(void)
> + *  {
> + *  bool x, y;
> + *
> + *  do {
> + *  int s = read_seqcount_begin();
> + *
> + *  x = X; y = Y;
> + *
> + *  } while (read_seqcount_retry(, s));
> + *
> + *  BUG_ON(!x && !y);
> + *  }
> + *
> + *  void write(void)
> + *  {
> + *  Y = true;
> + *
> + *  raw_write_seqcount_barrier(seq);
> + *
> + *  X = false;
> + *  }
> + */
> +static inline void raw_write_seqcount_barrier(seqcount_t *s)
> +{
> + s->sequence++;
> + smp_wmb();
> + s->sequence++;
> +}
> +
>  /*
>   * raw_write_seqcount_latch - redirect readers to even/odd copy
>   * @s: pointer to seqcount_t
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
> Color me slow and stupid.  Maybe due to reviewing a patch too early in
> the morning, who knows?
> 
> There is nothing above that prevents the compiler and the CPU from
> reordering the assignments to X and Y with the increment of s->sequence++.
> One fix would be as follows:
> 
>   static inline void raw_write_seqcount_barrier(seqcount_t *s)
>   {
>   smp_wmb();
>   s->sequence++;
>   smp_wmb();
>   s->sequence++;
>   smp_wmb();
>   }
> 
> Of course, this assumes that the accesses surrounding the call to
> raw_write_seqcount_barrier() are writes.  If they can be a reads,
> the two added smp_wmb() calls need to be full barriers.

I have updated the Changelog to hopefully explain things better.

I did leave off the READ/WRITE ONCE stuff, because I could not come up
with a scenario where it makes a difference -- I appreciate paranoia,
but I also think we should not overdo the thing.

---
Subject: seqcount: Introduce raw_write_seqcount_barrier()
From: Peter Zijlstra 
Date: Thu Jun 11 12:35:48 CEST 2015

Introduce raw_write_seqcount_barrier(), a new construct that can be
used to provide write barrier semantics in seqcount read loops instead
of the usual consistency guarantee.

raw_write_seqcount_barier() is equivalent to:

raw_write_seqcount_begin();
raw_write_seqcount_end();

But avoids issueing two back-to-back smp_wmb() instructions.

This construct works because the read side will 'stall' when observing
odd values. This means that -- referring to the example in the comment
below -- even though there is no (matching) read barrier between the
loads of X and Y, we cannot observe !x && !y, because:

 - if we observe Y == false we must observe the first sequence
   increment, which makes us loop, until

 - we observe !(seq & 1) -- the second sequence increment -- at which
   time we must also observe T == true.

Cc: Al Viro 
Cc: Linus Torvalds 
Cc: Paul McKenney 
Suggested-by: Oleg Nesterov 
Signed-off-by: Peter Zijlstra (Intel) 
---
 include/linux/seqlock.h |   42 ++
 1 file changed, 42 insertions(+)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -233,6 +233,47 @@ static inline void raw_write_seqcount_en
s->sequence++;
 }
 
+/**
+ * raw_write_seqcount_barrier - do a seq write barrier
+ * @s: pointer to seqcount_t
+ *
+ * This can be used to provide an ordering guarantee instead of the
+ * usual consistency guarantee. It is one wmb cheaper, because we can
+ * collapse the two back-to-back wmb()s.
+ *
+ *  seqcount_t seq;
+ *  bool X = true, Y = false;
+ *
+ *  void read(void)
+ *  {
+ *  bool x, y;
+ *
+ *  do {
+ *  int s = read_seqcount_begin();
+ *
+ *  x = X; y = Y;
+ *
+ *  } while (read_seqcount_retry(, s));
+ *
+ *  BUG_ON(!x && !y);
+ *  }
+ *
+ *  void write(void)
+ *  {
+ *  Y = true;
+ *
+ *  raw_write_seqcount_barrier(seq);
+ *
+ *  X = false;
+ *  }
+ */
+static inline void raw_write_seqcount_barrier(seqcount_t *s)
+{
+   s->sequence++;
+   smp_wmb();
+   s->sequence++;
+}
+
 /*
  * raw_write_seqcount_latch - redirect readers to even/odd copy
  * @s: pointer to seqcount_t
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 07:11:40PM +0200, Peter Zijlstra wrote:
 On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
  The point of std::atomic (and of the equivalent C11 syntax) is to
  force the compiler to suppress optimizations that are unsafe for shared
  variables.  We get more or less the same effect with volatile, protests
  from compiler people notwithstanding.
  
  I often tell the compiler guys that they have to expect make -some-
  concessions for being 30 years late to the concurrency party, but
  it nevertheless makes sense to future-proof our code where it is
  reasonable to do so.
 
 Right, so in that regards I would request the compiler option (and or
 #pragma) that disables all the out-of-thin-air nonsense.

OK.  What is the form of the #pragma?  If it focuses on a specific
access, we are likely to get a lot of pushback.

 Because while they hide behind their undefined behaviour, the fact is
 that all of their machines for the past 30 odd years have been relying
 on this 'undefined' behaviour to work. This being the machines they've
 been typing their useless specs on :-)

Maybe I can scare them into doing all their work on UP systems.  ;-)

Interestingly enough, LLVM is taking a slightly different approach.
Rather than invoke undefined behavior, they say that data races result
in random bits being loaded.  Not that it makes much difference to the
health and well-being of the software, mind you...

 I doubt there's a single OS kernel (that supports SMP configurations)
 that does not rely on a whole host of 'undefined' behaviour.

An alternative approach would be a compiler switch (or similar) that
changed the default atomic access from SC to relaxed.  Then shared
variables could be marked atomic, and normal C code could be used to
access them, but without the compiler emitting memory barriers all over
the place (yes, even on x86).

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
 Color me slow and stupid.  Maybe due to reviewing a patch too early in
 the morning, who knows?
 
 There is nothing above that prevents the compiler and the CPU from
 reordering the assignments to X and Y with the increment of s-sequence++.
 One fix would be as follows:
 
   static inline void raw_write_seqcount_barrier(seqcount_t *s)
   {
   smp_wmb();
   s-sequence++;
   smp_wmb();
   s-sequence++;
   smp_wmb();
   }
 
 Of course, this assumes that the accesses surrounding the call to
 raw_write_seqcount_barrier() are writes.  If they can be a reads,
 the two added smp_wmb() calls need to be full barriers.

I have updated the Changelog to hopefully explain things better.

I did leave off the READ/WRITE ONCE stuff, because I could not come up
with a scenario where it makes a difference -- I appreciate paranoia,
but I also think we should not overdo the thing.

---
Subject: seqcount: Introduce raw_write_seqcount_barrier()
From: Peter Zijlstra pet...@infradead.org
Date: Thu Jun 11 12:35:48 CEST 2015

Introduce raw_write_seqcount_barrier(), a new construct that can be
used to provide write barrier semantics in seqcount read loops instead
of the usual consistency guarantee.

raw_write_seqcount_barier() is equivalent to:

raw_write_seqcount_begin();
raw_write_seqcount_end();

But avoids issueing two back-to-back smp_wmb() instructions.

This construct works because the read side will 'stall' when observing
odd values. This means that -- referring to the example in the comment
below -- even though there is no (matching) read barrier between the
loads of X and Y, we cannot observe !x  !y, because:

 - if we observe Y == false we must observe the first sequence
   increment, which makes us loop, until

 - we observe !(seq  1) -- the second sequence increment -- at which
   time we must also observe T == true.

Cc: Al Viro v...@zeniv.linux.org.uk
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Paul McKenney paul...@linux.vnet.ibm.com
Suggested-by: Oleg Nesterov o...@redhat.com
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
---
 include/linux/seqlock.h |   42 ++
 1 file changed, 42 insertions(+)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -233,6 +233,47 @@ static inline void raw_write_seqcount_en
s-sequence++;
 }
 
+/**
+ * raw_write_seqcount_barrier - do a seq write barrier
+ * @s: pointer to seqcount_t
+ *
+ * This can be used to provide an ordering guarantee instead of the
+ * usual consistency guarantee. It is one wmb cheaper, because we can
+ * collapse the two back-to-back wmb()s.
+ *
+ *  seqcount_t seq;
+ *  bool X = true, Y = false;
+ *
+ *  void read(void)
+ *  {
+ *  bool x, y;
+ *
+ *  do {
+ *  int s = read_seqcount_begin(seq);
+ *
+ *  x = X; y = Y;
+ *
+ *  } while (read_seqcount_retry(seq, s));
+ *
+ *  BUG_ON(!x  !y);
+ *  }
+ *
+ *  void write(void)
+ *  {
+ *  Y = true;
+ *
+ *  raw_write_seqcount_barrier(seq);
+ *
+ *  X = false;
+ *  }
+ */
+static inline void raw_write_seqcount_barrier(seqcount_t *s)
+{
+   s-sequence++;
+   smp_wmb();
+   s-sequence++;
+}
+
 /*
  * raw_write_seqcount_latch - redirect readers to even/odd copy
  * @s: pointer to seqcount_t
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
 On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
  Color me slow and stupid.  Maybe due to reviewing a patch too early in
  the morning, who knows?
  
  There is nothing above that prevents the compiler and the CPU from
  reordering the assignments to X and Y with the increment of s-sequence++.
  One fix would be as follows:
  
  static inline void raw_write_seqcount_barrier(seqcount_t *s)
  {
  smp_wmb();
  s-sequence++;
  smp_wmb();
  s-sequence++;
  smp_wmb();
  }
  
  Of course, this assumes that the accesses surrounding the call to
  raw_write_seqcount_barrier() are writes.  If they can be a reads,
  the two added smp_wmb() calls need to be full barriers.
 
 I have updated the Changelog to hopefully explain things better.
 
 I did leave off the READ/WRITE ONCE stuff, because I could not come up
 with a scenario where it makes a difference -- I appreciate paranoia,
 but I also think we should not overdo the thing.

I can only conclude that you have not read this document:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

Specifically, please keep in mind that unless you mark either the variable
or the memory access, the compiler is within its rights to assume that
there are no concurrent accesses to that variable.  For but one example,
if you do a normal store to a given variable, then the compiler is
within its rights to use that variable as temporary storage prior to
that store.  And yes, you can reasonably argue that no sane compiler
would store something else to s-sequence given that it could free up
a register by storing the incremented value, but the fact remains that
you have given it permission to do so if it wants.

Thanx, Paul

 ---
 Subject: seqcount: Introduce raw_write_seqcount_barrier()
 From: Peter Zijlstra pet...@infradead.org
 Date: Thu Jun 11 12:35:48 CEST 2015
 
 Introduce raw_write_seqcount_barrier(), a new construct that can be
 used to provide write barrier semantics in seqcount read loops instead
 of the usual consistency guarantee.
 
 raw_write_seqcount_barier() is equivalent to:
 
   raw_write_seqcount_begin();
   raw_write_seqcount_end();
 
 But avoids issueing two back-to-back smp_wmb() instructions.
 
 This construct works because the read side will 'stall' when observing
 odd values. This means that -- referring to the example in the comment
 below -- even though there is no (matching) read barrier between the
 loads of X and Y, we cannot observe !x  !y, because:
 
  - if we observe Y == false we must observe the first sequence
increment, which makes us loop, until
 
  - we observe !(seq  1) -- the second sequence increment -- at which
time we must also observe T == true.
 
 Cc: Al Viro v...@zeniv.linux.org.uk
 Cc: Linus Torvalds torva...@linux-foundation.org
 Cc: Paul McKenney paul...@linux.vnet.ibm.com
 Suggested-by: Oleg Nesterov o...@redhat.com
 Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
 ---
  include/linux/seqlock.h |   42 ++
  1 file changed, 42 insertions(+)
 
 --- a/include/linux/seqlock.h
 +++ b/include/linux/seqlock.h
 @@ -233,6 +233,47 @@ static inline void raw_write_seqcount_en
   s-sequence++;
  }
 
 +/**
 + * raw_write_seqcount_barrier - do a seq write barrier
 + * @s: pointer to seqcount_t
 + *
 + * This can be used to provide an ordering guarantee instead of the
 + * usual consistency guarantee. It is one wmb cheaper, because we can
 + * collapse the two back-to-back wmb()s.
 + *
 + *  seqcount_t seq;
 + *  bool X = true, Y = false;
 + *
 + *  void read(void)
 + *  {
 + *  bool x, y;
 + *
 + *  do {
 + *  int s = read_seqcount_begin(seq);
 + *
 + *  x = X; y = Y;
 + *
 + *  } while (read_seqcount_retry(seq, s));
 + *
 + *  BUG_ON(!x  !y);
 + *  }
 + *
 + *  void write(void)
 + *  {
 + *  Y = true;
 + *
 + *  raw_write_seqcount_barrier(seq);
 + *
 + *  X = false;
 + *  }
 + */
 +static inline void raw_write_seqcount_barrier(seqcount_t *s)
 +{
 + s-sequence++;
 + smp_wmb();
 + s-sequence++;
 +}
 +
  /*
   * raw_write_seqcount_latch - redirect readers to even/odd copy
   * @s: pointer to seqcount_t
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
 On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
  I did leave off the READ/WRITE ONCE stuff, because I could not come up
  with a scenario where it makes a difference -- I appreciate paranoia,
  but I also think we should not overdo the thing.
 
 I can only conclude that you have not read this document:
 
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

This would be correct.

 Specifically, please keep in mind that unless you mark either the variable
 or the memory access, the compiler is within its rights to assume that
 there are no concurrent accesses to that variable.  For but one example,
 if you do a normal store to a given variable, then the compiler is
 within its rights to use that variable as temporary storage prior to
 that store.  And yes, you can reasonably argue that no sane compiler
 would store something else to s-sequence given that it could free up
 a register by storing the incremented value, but the fact remains that
 you have given it permission to do so if it wants.

Argh *grmbl*, that's bloody insane!

So I get the re-loading, I get the tearing, but this random intermittent
values (somewhat related to stores out of thin air) is completely
bonkers.

I would very much prefer a compiler switch that instructs the compiler
to not do bloody stupid things like this instead of marking every other
load/store in the kernel with volatile.

Note that if GCC were to actually do something like this, the kernel
would already be broken, because I'm very sure we did not consider/audit
it for this.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 05:11:09PM +0200, Peter Zijlstra wrote:
 On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
  On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
   I did leave off the READ/WRITE ONCE stuff, because I could not come up
   with a scenario where it makes a difference -- I appreciate paranoia,
   but I also think we should not overdo the thing.
  
  I can only conclude that you have not read this document:
  
  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
 
 This would be correct.
 
  Specifically, please keep in mind that unless you mark either the variable
  or the memory access, the compiler is within its rights to assume that
  there are no concurrent accesses to that variable.  For but one example,
  if you do a normal store to a given variable, then the compiler is
  within its rights to use that variable as temporary storage prior to
  that store.  And yes, you can reasonably argue that no sane compiler
  would store something else to s-sequence given that it could free up
  a register by storing the incremented value, but the fact remains that
  you have given it permission to do so if it wants.
 
 Argh *grmbl*, that's bloody insane!

You expected me to argue with that statement?  ;-)

 So I get the re-loading, I get the tearing, but this random intermittent
 values (somewhat related to stores out of thin air) is completely
 bonkers.
 
 I would very much prefer a compiler switch that instructs the compiler
 to not do bloody stupid things like this instead of marking every other
 load/store in the kernel with volatile.

I would of course be good with such a compiler switch, though my earlier
attempts to negotiate one were unsuccessful.  But I don't believe that we
discussed a switch to specifically prohibit only use of to-be-stored-into
variables as temporary scratch space.  The trick is finding restrictions
that are useful, but that don't imply -O0.

Any GCC or LLVM folks on the list?

 Note that if GCC were to actually do something like this, the kernel
 would already be broken, because I'm very sure we did not consider/audit
 it for this.

An accident waiting to happen, given that both GCC and the Linux kernel
are moving targets.  :-/

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 08:42:45AM -0700, Paul E. McKenney wrote:
  I would very much prefer a compiler switch that instructs the compiler
  to not do bloody stupid things like this instead of marking every other
  load/store in the kernel with volatile.
 
 I would of course be good with such a compiler switch, though my earlier
 attempts to negotiate one were unsuccessful.  But I don't believe that we
 discussed a switch to specifically prohibit only use of to-be-stored-into
 variables as temporary scratch space.  The trick is finding restrictions
 that are useful, but that don't imply -O0.

I would request on that disables all the 'stores from thin air'
'optimizations'. IOW assume everything is shared memory and concurrent
unless you can prove its not so. For example a local stack variable that
does not escape scope.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 09:37:31AM -0700, Paul E. McKenney wrote:
 The point of std::atomic (and of the equivalent C11 syntax) is to
 force the compiler to suppress optimizations that are unsafe for shared
 variables.  We get more or less the same effect with volatile, protests
 from compiler people notwithstanding.
 
 I often tell the compiler guys that they have to expect make -some-
 concessions for being 30 years late to the concurrency party, but
 it nevertheless makes sense to future-proof our code where it is
 reasonable to do so.

Right, so in that regards I would request the compiler option (and or
#pragma) that disables all the out-of-thin-air nonsense.

Because while they hide behind their undefined behaviour, the fact is
that all of their machines for the past 30 odd years have been relying
on this 'undefined' behaviour to work. This being the machines they've
been typing their useless specs on :-)

I doubt there's a single OS kernel (that supports SMP configurations)
that does not rely on a whole host of 'undefined' behaviour.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Paul E. McKenney
On Wed, Jun 17, 2015 at 05:49:27PM +0200, Peter Zijlstra wrote:
 On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
  On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
   I did leave off the READ/WRITE ONCE stuff, because I could not come up
   with a scenario where it makes a difference -- I appreciate paranoia,
   but I also think we should not overdo the thing.
  
  I can only conclude that you have not read this document:
  
  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
  
  Specifically, please keep in mind that unless you mark either the variable
  or the memory access, the compiler is within its rights to assume that
  there are no concurrent accesses to that variable.  For but one example,
  if you do a normal store to a given variable, then the compiler is
  within its rights to use that variable as temporary storage prior to
  that store.  And yes, you can reasonably argue that no sane compiler
  would store something else to s-sequence given that it could free up
  a register by storing the incremented value, but the fact remains that
  you have given it permission to do so if it wants.
 
 This is the Optimizations without Atomics section you're referring to.
 
 It has words such as: if the compiler can prove they are not accessed
 in other threads concurrently and This requires escape analysis: the
 compiler must see the full scope of the memory location 'p', or must
 know that leaf functions don't capture 'p' and aren't used concurrently,
 for this optimization to be valid.

Yes, but...  These qualifiers apply only in cases where the code at hand
is not writing to the variable.

In this case, it is legal for other threads to be concurrently reading
the variable.  After all, if the value of the variable is not changing,
then the various read-side optimizations have no effect -- even the silly
ones, like reading the variable one bit at a time.  If the compiler
introduces a write into code that was only reading the variable, then
it is the compiler's job to ensure that no other thread is reading it.
If the compiler were to introduce a write to such a variable that other
threads might be reading, then the compiler would have introduced a data
race, which would mean that the compiler introduced undefined behavior
to correct code.  The compiler therefore must be extremely careful when
introducing a write to a variable that is only read by the code at hand.
(Though there are still some interesting escape clauses for the compiler
if the variable in question is allocated on the stack and the function
can return without transferring control to some function in some other
translation unit.)

Recall that a data race occurs when there are multiple concurrent normal
accesses to a given normal variable, at least one of which is a write.
Here normal accesses and normal variables are those that are not marked
as atomic (in the C11 sense).  Accesses and variables marked as volatile
also disable most (perhaps all) of the dangerous optimizations that lead
to the undefined behavior.  That said, many compiler people hate volatile,
and will therefore automatically argue that it is useless in a misguided
attempt to convince people not to use it.  :-/

On the other hand, if the code is already writing to the variable (as it
is in the s-sequence++ case), then if there are any concurrent accesses,
the code -already- contains a data race.  This data race invokes undefined
behavior in the code as written, so the compiler is within its rights to
do anything at all, even spawn the proverbial game of rogue.  A somewhat
more reasonable compiler is within its rights to assume that no one is
concurrently reading any normal variable to which the code does a normal
write.  The compiler is therefore within its rights to use that variable
as scratch storage at any time between the most recent read and the write
in question.

 But then it starts weasel wording and saying that the lack of
 std::atomic usage implies a lack of concurrency, to which I strongly
 object.

Heh!

The point of std::atomic (and of the equivalent C11 syntax) is to
force the compiler to suppress optimizations that are unsafe for shared
variables.  We get more or less the same effect with volatile, protests
from compiler people notwithstanding.

I often tell the compiler guys that they have to expect make -some-
concessions for being 30 years late to the concurrency party, but
it nevertheless makes sense to future-proof our code where it is
reasonable to do so.

All that aside, I agree that s-sequence++ is relatively low
priority, given that the compiler can easily free up a register by
storing the actual value.  But that might well be a failure of
imagination on my part.

 Esp. seeing how -ffreestanding does not have access to any of the atomic
 stuff since its library bits and not language bits (something which I've
 often said was a failure in the spec).

Agreed, given that atomics can almost always be 

Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-17 Thread Peter Zijlstra
On Wed, Jun 17, 2015 at 07:57:12AM -0700, Paul E. McKenney wrote:
 On Wed, Jun 17, 2015 at 02:29:24PM +0200, Peter Zijlstra wrote:
  I did leave off the READ/WRITE ONCE stuff, because I could not come up
  with a scenario where it makes a difference -- I appreciate paranoia,
  but I also think we should not overdo the thing.
 
 I can only conclude that you have not read this document:
 
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
 
 Specifically, please keep in mind that unless you mark either the variable
 or the memory access, the compiler is within its rights to assume that
 there are no concurrent accesses to that variable.  For but one example,
 if you do a normal store to a given variable, then the compiler is
 within its rights to use that variable as temporary storage prior to
 that store.  And yes, you can reasonably argue that no sane compiler
 would store something else to s-sequence given that it could free up
 a register by storing the incremented value, but the fact remains that
 you have given it permission to do so if it wants.

This is the Optimizations without Atomics section you're referring to.

It has words such as: if the compiler can prove they are not accessed
in other threads concurrently and This requires escape analysis: the
compiler must see the full scope of the memory location 'p', or must
know that leaf functions don't capture 'p' and aren't used concurrently,
for this optimization to be valid.

But then it starts weasel wording and saying that the lack of
std::atomic usage implies a lack of concurrency, to which I strongly
object.

Esp. seeing how -ffreestanding does not have access to any of the atomic
stuff since its library bits and not language bits (something which I've
often said was a failure in the spec).
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-12 Thread Oleg Nesterov
On 06/11, Paul E. McKenney wrote:
>
> > > + *  seqcount_t seq;
> > > + *  bool X = true, Y = false;
> > > + *
> > > + *  void read(void)
> > > + *  {
> > > + *  bool x, y;
> > > + *
> > > + *  do {
> > > + *  int s = read_seqcount_begin();
> > > + *
> > > + *  x = X; y = Y;
> > > + *
> > > + *  } while (read_seqcount_retry(, s));
> > > + *
> > > + *  BUG_ON(!x && !y);
> > > + *  }
> > > + *
> > > + *  void write(void)
> > > + *  {
> > > + *  Y = true;
> > > + *
> > > + *  write_seqcount_begin(seq);
> > > + *  write_seqcount_end(seq);
> > > + *
> > > + *  X = false;
> > > + *  }
> >
> > > +static inline void raw_write_seqcount_barrier(seqcount_t *s)
> > > +{
> > > + s->sequence++;
> > > + smp_wmb();
> > > + s->sequence++;
> > > +}
> > > +
> > >  /*
> > >   * raw_write_seqcount_latch - redirect readers to even/odd copy
> > >   * @s: pointer to seqcount_t
> >
> > Looks good otherwise.
> >
> > Reviewed-by: Paul E. McKenney 
>
> Color me slow and stupid.  Maybe due to reviewing a patch too early in
> the morning, who knows?
>
> There is nothing above that prevents the compiler and the CPU from
> reordering the assignments to X and Y with the increment of s->sequence++.

Yes, but this doesn't matter, I think. The writer does

Y = true;
1st_increment;

wmb();

2nd_increment;
X = false;

and we do not care about reordering before or after wmnb() at all. But we
rely on the fact that 1st_increment can not be reordered with "X = false",
and that "Y = true" can not be reordered with the 2nd_increment.


And another simple "proof" is that  seqcount_barrier() is equivalent to
write_seqcount_begin() + + write_seqcount_end() and thus the code above
is correct, or the ACQUIRE/RELEASE semantics of seqcount_t is broken ;)

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-12 Thread Peter Zijlstra
On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
> Color me slow and stupid.  Maybe due to reviewing a patch too early in
> the morning, who knows?
> 
> There is nothing above that prevents the compiler and the CPU from
> reordering the assignments to X and Y with the increment of s->sequence++.

That's actually fine. As long as we observe an odd value the read side
will repeat.

> Of course, this assumes that the accesses surrounding the call to
> raw_write_seqcount_barrier() are writes.  

Which is why its got both write and barrier in the name :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-12 Thread Peter Zijlstra
On Thu, Jun 11, 2015 at 02:45:57PM -0700, Paul E. McKenney wrote:
 Color me slow and stupid.  Maybe due to reviewing a patch too early in
 the morning, who knows?
 
 There is nothing above that prevents the compiler and the CPU from
 reordering the assignments to X and Y with the increment of s-sequence++.

That's actually fine. As long as we observe an odd value the read side
will repeat.

 Of course, this assumes that the accesses surrounding the call to
 raw_write_seqcount_barrier() are writes.  

Which is why its got both write and barrier in the name :-)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-12 Thread Oleg Nesterov
On 06/11, Paul E. McKenney wrote:

   + *  seqcount_t seq;
   + *  bool X = true, Y = false;
   + *
   + *  void read(void)
   + *  {
   + *  bool x, y;
   + *
   + *  do {
   + *  int s = read_seqcount_begin(seq);
   + *
   + *  x = X; y = Y;
   + *
   + *  } while (read_seqcount_retry(seq, s));
   + *
   + *  BUG_ON(!x  !y);
   + *  }
   + *
   + *  void write(void)
   + *  {
   + *  Y = true;
   + *
   + *  write_seqcount_begin(seq);
   + *  write_seqcount_end(seq);
   + *
   + *  X = false;
   + *  }
 
   +static inline void raw_write_seqcount_barrier(seqcount_t *s)
   +{
   + s-sequence++;
   + smp_wmb();
   + s-sequence++;
   +}
   +
/*
 * raw_write_seqcount_latch - redirect readers to even/odd copy
 * @s: pointer to seqcount_t
 
  Looks good otherwise.
 
  Reviewed-by: Paul E. McKenney paul...@linux.vnet.ibm.com

 Color me slow and stupid.  Maybe due to reviewing a patch too early in
 the morning, who knows?

 There is nothing above that prevents the compiler and the CPU from
 reordering the assignments to X and Y with the increment of s-sequence++.

Yes, but this doesn't matter, I think. The writer does

Y = true;
1st_increment;

wmb();

2nd_increment;
X = false;

and we do not care about reordering before or after wmnb() at all. But we
rely on the fact that 1st_increment can not be reordered with X = false,
and that Y = true can not be reordered with the 2nd_increment.


And another simple proof is that  seqcount_barrier() is equivalent to
write_seqcount_begin() + + write_seqcount_end() and thus the code above
is correct, or the ACQUIRE/RELEASE semantics of seqcount_t is broken ;)

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Paul E. McKenney
On Thu, Jun 11, 2015 at 08:33:41AM -0700, Paul E. McKenney wrote:
> On Thu, Jun 11, 2015 at 02:46:47PM +0200, Peter Zijlstra wrote:
> > Introduce raw_write_seqcount_barrier(), a new construct that can be
> > used to provide write barrier semantics in seqcount read loops instead
> > of the usual consistency guarantee.
> > 
> > Cc: Al Viro 
> > Cc: Linus Torvalds 
> > Cc: Paul McKenney 
> > Suggested-by: Oleg Nesterov 
> > Signed-off-by: Peter Zijlstra (Intel) 
> > ---
> >  include/linux/seqlock.h |   42 ++
> >  1 file changed, 42 insertions(+)
> > 
> > --- a/include/linux/seqlock.h
> > +++ b/include/linux/seqlock.h
> > @@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
> > s->sequence++;
> >  }
> > 
> > +/**
> > + * raw_write_seqcount_barrier - do a seq write barrier
> > + * @s: pointer to seqcount_t
> > + *
> > + * This can be used to provide an ordering guarantee instead of the
> > + * usual consistency guarantee. It is one wmb cheaper, because we can
> > + * collapse the two back-to-back wmb()s.
> > + *
> > + *  seqcount_t seq;
> > + *  bool X = true, Y = false;
> > + *
> > + *  void read(void)
> > + *  {
> > + *  bool x, y;
> > + *
> > + *  do {
> > + *  int s = read_seqcount_begin();
> > + *
> > + *  x = X; y = Y;
> > + *
> > + *  } while (read_seqcount_retry(, s));
> > + *
> > + *  BUG_ON(!x && !y);
> > + *  }
> > + *
> > + *  void write(void)
> > + *  {
> > + *  Y = true;
> > + *
> > + *  write_seqcount_begin(seq);
> > + *  write_seqcount_end(seq);
> > + *
> > + *  X = false;
> > + *  }
> 
> So when using this, write() would instead look like this?
> 
>   void write(void)
>   {
>   Y = true;
>   raw_write_seqcount_barrier(seq);
>   X = false;
>   }
> 
> I suggest calling this out explicitly.  Agreed, it should be obvious,
> but some poor sot is going to be reading this at 3AM local time after
> a couple days of no sleep, in which case obvious might not be so obvious.
> 
> I also would suggest READ_ONCE() and WRITE_ONCE() to keep the compiler
> trickiness down to a dull roar.  Understood, it is hard to make anything
> bad happen in this case, but small changes could result in badness.
> 
> > + */
> > +static inline void raw_write_seqcount_barrier(seqcount_t *s)
> > +{
> > +   s->sequence++;
> > +   smp_wmb();
> > +   s->sequence++;
> > +}
> > +
> >  /*
> >   * raw_write_seqcount_latch - redirect readers to even/odd copy
> >   * @s: pointer to seqcount_t
> 
> Looks good otherwise.
> 
> Reviewed-by: Paul E. McKenney 

Color me slow and stupid.  Maybe due to reviewing a patch too early in
the morning, who knows?

There is nothing above that prevents the compiler and the CPU from
reordering the assignments to X and Y with the increment of s->sequence++.
One fix would be as follows:

static inline void raw_write_seqcount_barrier(seqcount_t *s)
{
smp_wmb();
s->sequence++;
smp_wmb();
s->sequence++;
smp_wmb();
}

Of course, this assumes that the accesses surrounding the call to
raw_write_seqcount_barrier() are writes.  If they can be a reads,
the two added smp_wmb() calls need to be full barriers.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Paul E. McKenney
On Thu, Jun 11, 2015 at 02:46:47PM +0200, Peter Zijlstra wrote:
> Introduce raw_write_seqcount_barrier(), a new construct that can be
> used to provide write barrier semantics in seqcount read loops instead
> of the usual consistency guarantee.
> 
> Cc: Al Viro 
> Cc: Linus Torvalds 
> Cc: Paul McKenney 
> Suggested-by: Oleg Nesterov 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  include/linux/seqlock.h |   42 ++
>  1 file changed, 42 insertions(+)
> 
> --- a/include/linux/seqlock.h
> +++ b/include/linux/seqlock.h
> @@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
>   s->sequence++;
>  }
> 
> +/**
> + * raw_write_seqcount_barrier - do a seq write barrier
> + * @s: pointer to seqcount_t
> + *
> + * This can be used to provide an ordering guarantee instead of the
> + * usual consistency guarantee. It is one wmb cheaper, because we can
> + * collapse the two back-to-back wmb()s.
> + *
> + *  seqcount_t seq;
> + *  bool X = true, Y = false;
> + *
> + *  void read(void)
> + *  {
> + *  bool x, y;
> + *
> + *  do {
> + *  int s = read_seqcount_begin();
> + *
> + *  x = X; y = Y;
> + *
> + *  } while (read_seqcount_retry(, s));
> + *
> + *  BUG_ON(!x && !y);
> + *  }
> + *
> + *  void write(void)
> + *  {
> + *  Y = true;
> + *
> + *  write_seqcount_begin(seq);
> + *  write_seqcount_end(seq);
> + *
> + *  X = false;
> + *  }

So when using this, write() would instead look like this?

void write(void)
{
Y = true;
raw_write_seqcount_barrier(seq);
X = false;
}

I suggest calling this out explicitly.  Agreed, it should be obvious,
but some poor sot is going to be reading this at 3AM local time after
a couple days of no sleep, in which case obvious might not be so obvious.

I also would suggest READ_ONCE() and WRITE_ONCE() to keep the compiler
trickiness down to a dull roar.  Understood, it is hard to make anything
bad happen in this case, but small changes could result in badness.

> + */
> +static inline void raw_write_seqcount_barrier(seqcount_t *s)
> +{
> + s->sequence++;
> + smp_wmb();
> + s->sequence++;
> +}
> +
>  /*
>   * raw_write_seqcount_latch - redirect readers to even/odd copy
>   * @s: pointer to seqcount_t

Looks good otherwise.

Reviewed-by: Paul E. McKenney 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Peter Zijlstra
Introduce raw_write_seqcount_barrier(), a new construct that can be
used to provide write barrier semantics in seqcount read loops instead
of the usual consistency guarantee.

Cc: Al Viro 
Cc: Linus Torvalds 
Cc: Paul McKenney 
Suggested-by: Oleg Nesterov 
Signed-off-by: Peter Zijlstra (Intel) 
---
 include/linux/seqlock.h |   42 ++
 1 file changed, 42 insertions(+)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
s->sequence++;
 }
 
+/**
+ * raw_write_seqcount_barrier - do a seq write barrier
+ * @s: pointer to seqcount_t
+ *
+ * This can be used to provide an ordering guarantee instead of the
+ * usual consistency guarantee. It is one wmb cheaper, because we can
+ * collapse the two back-to-back wmb()s.
+ *
+ *  seqcount_t seq;
+ *  bool X = true, Y = false;
+ *
+ *  void read(void)
+ *  {
+ *  bool x, y;
+ *
+ *  do {
+ *  int s = read_seqcount_begin();
+ *
+ *  x = X; y = Y;
+ *
+ *  } while (read_seqcount_retry(, s));
+ *
+ *  BUG_ON(!x && !y);
+ *  }
+ *
+ *  void write(void)
+ *  {
+ *  Y = true;
+ *
+ *  write_seqcount_begin(seq);
+ *  write_seqcount_end(seq);
+ *
+ *  X = false;
+ *  }
+ */
+static inline void raw_write_seqcount_barrier(seqcount_t *s)
+{
+   s->sequence++;
+   smp_wmb();
+   s->sequence++;
+}
+
 /*
  * raw_write_seqcount_latch - redirect readers to even/odd copy
  * @s: pointer to seqcount_t


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Peter Zijlstra
Introduce raw_write_seqcount_barrier(), a new construct that can be
used to provide write barrier semantics in seqcount read loops instead
of the usual consistency guarantee.

Cc: Al Viro v...@zeniv.linux.org.uk
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Paul McKenney paul...@linux.vnet.ibm.com
Suggested-by: Oleg Nesterov o...@redhat.com
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
---
 include/linux/seqlock.h |   42 ++
 1 file changed, 42 insertions(+)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
s-sequence++;
 }
 
+/**
+ * raw_write_seqcount_barrier - do a seq write barrier
+ * @s: pointer to seqcount_t
+ *
+ * This can be used to provide an ordering guarantee instead of the
+ * usual consistency guarantee. It is one wmb cheaper, because we can
+ * collapse the two back-to-back wmb()s.
+ *
+ *  seqcount_t seq;
+ *  bool X = true, Y = false;
+ *
+ *  void read(void)
+ *  {
+ *  bool x, y;
+ *
+ *  do {
+ *  int s = read_seqcount_begin(seq);
+ *
+ *  x = X; y = Y;
+ *
+ *  } while (read_seqcount_retry(seq, s));
+ *
+ *  BUG_ON(!x  !y);
+ *  }
+ *
+ *  void write(void)
+ *  {
+ *  Y = true;
+ *
+ *  write_seqcount_begin(seq);
+ *  write_seqcount_end(seq);
+ *
+ *  X = false;
+ *  }
+ */
+static inline void raw_write_seqcount_barrier(seqcount_t *s)
+{
+   s-sequence++;
+   smp_wmb();
+   s-sequence++;
+}
+
 /*
  * raw_write_seqcount_latch - redirect readers to even/odd copy
  * @s: pointer to seqcount_t


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Paul E. McKenney
On Thu, Jun 11, 2015 at 08:33:41AM -0700, Paul E. McKenney wrote:
 On Thu, Jun 11, 2015 at 02:46:47PM +0200, Peter Zijlstra wrote:
  Introduce raw_write_seqcount_barrier(), a new construct that can be
  used to provide write barrier semantics in seqcount read loops instead
  of the usual consistency guarantee.
  
  Cc: Al Viro v...@zeniv.linux.org.uk
  Cc: Linus Torvalds torva...@linux-foundation.org
  Cc: Paul McKenney paul...@linux.vnet.ibm.com
  Suggested-by: Oleg Nesterov o...@redhat.com
  Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
  ---
   include/linux/seqlock.h |   42 ++
   1 file changed, 42 insertions(+)
  
  --- a/include/linux/seqlock.h
  +++ b/include/linux/seqlock.h
  @@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
  s-sequence++;
   }
  
  +/**
  + * raw_write_seqcount_barrier - do a seq write barrier
  + * @s: pointer to seqcount_t
  + *
  + * This can be used to provide an ordering guarantee instead of the
  + * usual consistency guarantee. It is one wmb cheaper, because we can
  + * collapse the two back-to-back wmb()s.
  + *
  + *  seqcount_t seq;
  + *  bool X = true, Y = false;
  + *
  + *  void read(void)
  + *  {
  + *  bool x, y;
  + *
  + *  do {
  + *  int s = read_seqcount_begin(seq);
  + *
  + *  x = X; y = Y;
  + *
  + *  } while (read_seqcount_retry(seq, s));
  + *
  + *  BUG_ON(!x  !y);
  + *  }
  + *
  + *  void write(void)
  + *  {
  + *  Y = true;
  + *
  + *  write_seqcount_begin(seq);
  + *  write_seqcount_end(seq);
  + *
  + *  X = false;
  + *  }
 
 So when using this, write() would instead look like this?
 
   void write(void)
   {
   Y = true;
   raw_write_seqcount_barrier(seq);
   X = false;
   }
 
 I suggest calling this out explicitly.  Agreed, it should be obvious,
 but some poor sot is going to be reading this at 3AM local time after
 a couple days of no sleep, in which case obvious might not be so obvious.
 
 I also would suggest READ_ONCE() and WRITE_ONCE() to keep the compiler
 trickiness down to a dull roar.  Understood, it is hard to make anything
 bad happen in this case, but small changes could result in badness.
 
  + */
  +static inline void raw_write_seqcount_barrier(seqcount_t *s)
  +{
  +   s-sequence++;
  +   smp_wmb();
  +   s-sequence++;
  +}
  +
   /*
* raw_write_seqcount_latch - redirect readers to even/odd copy
* @s: pointer to seqcount_t
 
 Looks good otherwise.
 
 Reviewed-by: Paul E. McKenney paul...@linux.vnet.ibm.com

Color me slow and stupid.  Maybe due to reviewing a patch too early in
the morning, who knows?

There is nothing above that prevents the compiler and the CPU from
reordering the assignments to X and Y with the increment of s-sequence++.
One fix would be as follows:

static inline void raw_write_seqcount_barrier(seqcount_t *s)
{
smp_wmb();
s-sequence++;
smp_wmb();
s-sequence++;
smp_wmb();
}

Of course, this assumes that the accesses surrounding the call to
raw_write_seqcount_barrier() are writes.  If they can be a reads,
the two added smp_wmb() calls need to be full barriers.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 11/18] seqcount: Introduce raw_write_seqcount_barrier()

2015-06-11 Thread Paul E. McKenney
On Thu, Jun 11, 2015 at 02:46:47PM +0200, Peter Zijlstra wrote:
 Introduce raw_write_seqcount_barrier(), a new construct that can be
 used to provide write barrier semantics in seqcount read loops instead
 of the usual consistency guarantee.
 
 Cc: Al Viro v...@zeniv.linux.org.uk
 Cc: Linus Torvalds torva...@linux-foundation.org
 Cc: Paul McKenney paul...@linux.vnet.ibm.com
 Suggested-by: Oleg Nesterov o...@redhat.com
 Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
 ---
  include/linux/seqlock.h |   42 ++
  1 file changed, 42 insertions(+)
 
 --- a/include/linux/seqlock.h
 +++ b/include/linux/seqlock.h
 @@ -233,6 +233,48 @@ static inline void raw_write_seqcount_en
   s-sequence++;
  }
 
 +/**
 + * raw_write_seqcount_barrier - do a seq write barrier
 + * @s: pointer to seqcount_t
 + *
 + * This can be used to provide an ordering guarantee instead of the
 + * usual consistency guarantee. It is one wmb cheaper, because we can
 + * collapse the two back-to-back wmb()s.
 + *
 + *  seqcount_t seq;
 + *  bool X = true, Y = false;
 + *
 + *  void read(void)
 + *  {
 + *  bool x, y;
 + *
 + *  do {
 + *  int s = read_seqcount_begin(seq);
 + *
 + *  x = X; y = Y;
 + *
 + *  } while (read_seqcount_retry(seq, s));
 + *
 + *  BUG_ON(!x  !y);
 + *  }
 + *
 + *  void write(void)
 + *  {
 + *  Y = true;
 + *
 + *  write_seqcount_begin(seq);
 + *  write_seqcount_end(seq);
 + *
 + *  X = false;
 + *  }

So when using this, write() would instead look like this?

void write(void)
{
Y = true;
raw_write_seqcount_barrier(seq);
X = false;
}

I suggest calling this out explicitly.  Agreed, it should be obvious,
but some poor sot is going to be reading this at 3AM local time after
a couple days of no sleep, in which case obvious might not be so obvious.

I also would suggest READ_ONCE() and WRITE_ONCE() to keep the compiler
trickiness down to a dull roar.  Understood, it is hard to make anything
bad happen in this case, but small changes could result in badness.

 + */
 +static inline void raw_write_seqcount_barrier(seqcount_t *s)
 +{
 + s-sequence++;
 + smp_wmb();
 + s-sequence++;
 +}
 +
  /*
   * raw_write_seqcount_latch - redirect readers to even/odd copy
   * @s: pointer to seqcount_t

Looks good otherwise.

Reviewed-by: Paul E. McKenney paul...@linux.vnet.ibm.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/