On Wed, Feb 13, 2013 at 08:16:32AM -0700, Ian Lepore wrote:
> On Tue, 2013-02-12 at 22:34 +0200, Konstantin Belousov wrote:
> > On Tue, Feb 12, 2013 at 09:03:39AM -0700, Ian Lepore wrote:
> > > On Sun, 2013-02-10 at 12:37 +0200, Konstantin Belousov wrote:
> > > > On Sat, Feb 09, 2013 at 02:47:06PM +0100, Jilles Tjoelker wrote:
> > > > > On Wed, Feb 06, 2013 at 05:58:30PM +0200, Konstantin Belousov wrote:
> > > > > > On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote:
> > > > > > > I'd like feedback on the attached patch, which adds support to our
> > > > > > > time_pps_fetch() implementation for the blocking behaviors described
> > > > > > > in section 3.4.3 of RFC 2783.  The existing implementation can only
> > > > > > > return the most recently captured data without blocking.  These
> > > > > > > changes add the ability to block (forever or with timeout) until a
> > > > > > > new event occurs.
> > > > > 
> > > > > > > Index: sys/kern/kern_tc.c
> > > > > > > ===================================================================
> > > > > > > --- sys/kern/kern_tc.c    (revision 246337)
> > > > > > > +++ sys/kern/kern_tc.c    (working copy)
> > > > > > > @@ -1446,6 +1446,50 @@
> > > > > > >   * RFC 2783 PPS-API implementation.
> > > > > > >   */
> > > > > > >  
> > > > > > > +static int
> > > > > > > +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps)
> > > > > > > +{
> > > > > > > [snip]
> > > > > > > +         aseq = pps->ppsinfo.assert_sequence;
> > > > > > > +         cseq = pps->ppsinfo.clear_sequence;
> > > > > > > +         while (aseq == pps->ppsinfo.assert_sequence &&
> > > > > > > +             cseq == pps->ppsinfo.clear_sequence) {
> > > > > > Note that compilers are allowed to optimize these accesses even across
> > > > > > the sequence point, which is the tsleep() call. Only accesses to
> > > > > > volatile objects are forbidden to be rearranged.
> > > > > 
> > > > > > I suggest adding volatile casts to pps in the loop condition.
> > > > > 
> > > > > The memory pointed to by pps is global (other code may have a pointer
> > > > > to it); therefore, the compiler must assume that the tsleep() call
> > > > > (which invokes code in a different compilation unit) may modify it.
> > > > > 
> > > > > Because volatile does not make concurrent access by multiple threads
> > > > > defined either, adding it here only seems to slow down the code
> > > > > (potentially).
> > > > The volatile qualifier guarantees that the compiler indeed reloads the
> > > > value on each read access. Conceptually, tsleep() does not modify or even
> > > > access the checked fields, and the compiler is allowed to discover this
> > > > by whatever means (LTO?).
> > > > 
> > > > Moreover, the standard says that an implementation is allowed to not
> > > > evaluate part of an expression if no side effects are produced, even by
> > > > calling a function.
> > > > 
> > > > I agree that for practical purposes, the _currently_ used compilers
> > > > should treat the tsleep() call as a sequence point. But then the volatile
> > > > qualifier cast applied to the given access would not change the generated
> > > > code either.
> > > > 
> > > 
> > > Doesn't this then imply that essentially every driver has this problem,
> > > and for that matter, every sequence of code anywhere in the base
> > > involving "loop while repeatedly sleeping, then waking and checking the
> > > state of some data for changes"?  I sure haven't seen that many volatile
> > > qualifiers scattered around the code.
> > 
> > No, it does not imply that every driver has this problem.
> > A typical driver provides mutual exclusion for access to
> > the shared data, which means using locks. Locks include the necessary
> > barriers to ensure the visibility of the changes, in particular the
> > compiler barriers.
> 
> Ohhhh.   I had never considered that using mutexes had other side
> effects.  So is there a correct MI way to invoke the right barrier magic
> in a situation like this?

My belief is that you do not need a barrier there.  The only (slightly)
problematic issue is the purely theoretical possibility that a
very smart compiler would omit the reload step.  A volatile qualifier
on the dereference in the loop condition should close this, as I described
in my very first reply.
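
To make the suggestion concrete, here is a minimal userland sketch of the
volatile-cast idea, not the actual kern_tc.c change.  Everything named here
is an assumption made for illustration: struct seq_state stands in for the
sequence counters in struct pps_state, the wait callback stands in for
tsleep(), and VOLATILE_READ_UINT and fake_wait are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sequence counters in struct pps_state. */
struct seq_state {
	unsigned int assert_sequence;
	unsigned int clear_sequence;
};

/*
 * Read *p through a volatile-qualified lvalue, so the compiler has to
 * emit a real load each time instead of reusing a previously read value.
 */
#define	VOLATILE_READ_UINT(p)	(*(volatile unsigned int *)(p))

/* Stand-in for tsleep(): returns 0 on wakeup, non-zero on error/timeout. */
typedef int (*wait_fn)(void *arg);

/* Sleep repeatedly until either counter differs from its initial value. */
static int
wait_for_event(struct seq_state *s, wait_fn wait, void *arg)
{
	unsigned int aseq, cseq;
	int err = 0;

	assert(s != NULL && wait != NULL);
	aseq = VOLATILE_READ_UINT(&s->assert_sequence);
	cseq = VOLATILE_READ_UINT(&s->clear_sequence);
	while (aseq == VOLATILE_READ_UINT(&s->assert_sequence) &&
	    cseq == VOLATILE_READ_UINT(&s->clear_sequence)) {
		err = wait(arg);
		if (err != 0)
			break;
	}
	return (err);
}

/* Demo waiter: pretends an assert event arrives during the third sleep. */
struct fake_waiter {
	struct seq_state *state;
	int calls;
};

static int
fake_wait(void *arg)
{
	struct fake_waiter *fw = arg;

	fw->calls++;
	if (fw->calls == 3)
		fw->state->assert_sequence++;
	return (0);
}
```

Calling wait_for_event(&st, fake_wait, &fw) with fw.state pointing at st
returns once fake_wait has bumped assert_sequence.  The cast affects only the
loads in the loop condition; it does not order them against other memory
accesses and is not a substitute for a lock.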
