On Wed, Feb 13, 2013 at 08:16:32AM -0700, Ian Lepore wrote: > On Tue, 2013-02-12 at 22:34 +0200, Konstantin Belousov wrote: > > On Tue, Feb 12, 2013 at 09:03:39AM -0700, Ian Lepore wrote: > > > On Sun, 2013-02-10 at 12:37 +0200, Konstantin Belousov wrote: > > > > On Sat, Feb 09, 2013 at 02:47:06PM +0100, Jilles Tjoelker wrote: > > > > > On Wed, Feb 06, 2013 at 05:58:30PM +0200, Konstantin Belousov wrote: > > > > > > On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote: > > > > > > > I'd like feedback on the attached patch, which adds support to our > > > > > > > time_pps_fetch() implementation for the blocking behaviors > > > > > > > described in > > > > > > > section 3.4.3 of RFC 2783. The existing implementation can only > > > > > > > return > > > > > > > the most recently captured data without blocking. These changes > > > > > > > add the > > > > > > > ability to block (forever or with timeout) until a new event > > > > > > > occurs. > > > > > > > > > > > > Index: sys/kern/kern_tc.c > > > > > > > =================================================================== > > > > > > > --- sys/kern/kern_tc.c (revision 246337) > > > > > > > +++ sys/kern/kern_tc.c (working copy) > > > > > > > @@ -1446,6 +1446,50 @@ > > > > > > > * RFC 2783 PPS-API implementation. > > > > > > > */ > > > > > > > > > > > > > > +static int > > > > > > > +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps) > > > > > > > +{ > > > > > > > [snip] > > > > > > > + aseq = pps->ppsinfo.assert_sequence; > > > > > > > + cseq = pps->ppsinfo.clear_sequence; > > > > > > > + while (aseq == pps->ppsinfo.assert_sequence && > > > > > > > + cseq == pps->ppsinfo.clear_sequence) { > > > > > > Note that compilers are allowed to optimize these accesses even over > > > > > > the sequential point, which is the tsleep() call. Only accesses to > > > > > > volatile objects are forbidden to be rearranged. > > > > > > > > > > > I suggest to add volatile casts to pps in the loop condition. > > > > > > > > > > The memory pointed to by pps is global (other code may have a pointer > > > > > to > > > > > it); therefore, the compiler must assume that the tsleep() call (which > > > > > invokes code in a different compilation unit) may modify it. > > > > > > > > > > Because volatile does not make concurrent access by multiple threads > > > > > defined either, adding it here only seems to slow down the code > > > > > (potentially). > > > > The volatile guarantees that the compiler indeed reloads the value on > > > > read access. Conceptually, the tsleep() does not modify or even access > > > > the checked fields, and compiler is allowed to note this by whatever > > > > methods (LTO ?). > > > > > > > > More, the standard says that an implementation is allowed to not > > > > evaluate > > > > part of the expression if no side effects are produced, even by calling > > > > a function. > > > > > > > > I agree that for practical means, the _currently_ used compilers should > > > > consider the tsleep() call as the sequential point. But then the > > > > volatile > > > > qualifier cast applied for the given access would not change the code as > > > > well. > > > > > > > > > > Doesn't this then imply that essentially every driver has this problem, > > > and for that matter, every sequence of code anywhere in the base > > > involving "loop while repeatedly sleeping, then waking and checking the > > > state of some data for changes"? I sure haven't seen that many volatile > > > qualifiers scattered around the code. > > > > No, it does not imply that every driver has this problem. > > A typical driver provides the mutual exclusion for access of > > the shared data, which means using locks. Locks include neccessary > > barries to ensure the visibility of the changes, in particular the > > compiler barriers. > > Ohhhh. I had never considered that using mutexes had other side > effects. So is there a correct MI way to invoke the right barrier magic > in a situation like this?
My belief is that you do not need a barrier there. The only (slightly) problematic issue there is a purely theoretical possibility that a very smart compiler would omit the reload step. The volatile qualifier for the dererefence in the loop condition should close this, as I described in the very first reply.
pgpmciXFSJ0R5.pgp
Description: PGP signature