Hey Ed

Yes, I actually experimented with moving the new interface to hrtime_t 
instead of clock_t, but that involved adding conversion statements to 
all previous consumers of cv_timedwait(), all of which had been designed 
with clock_t in mind.
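
To give a sense of what such conversion statements look like, here is a
minimal user-space sketch (not code from the project). The sk_ names and
typedefs are stand-ins invented for this sketch in place of hrtime_t and
clock_t, and hz is passed in explicitly rather than read from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel types; invented for this sketch. */
typedef int64_t sk_hrtime_t;	/* nanoseconds, in the spirit of hrtime_t */
typedef long sk_clock_t;	/* clock ticks, in the spirit of clock_t */

#define	SK_NANOSEC	1000000000LL

/*
 * Convert a relative interval in nanoseconds to clock ticks, rounding up
 * so the wait is never shorter than requested. Assumes hz divides one
 * second evenly; otherwise the result is only approximate.
 */
static sk_clock_t
sk_ns_to_ticks(sk_hrtime_t ns, int hz)
{
	assert(ns >= 0 && hz > 0);
	sk_hrtime_t ns_per_tick = SK_NANOSEC / hz;
	return ((sk_clock_t)((ns + ns_per_tick - 1) / ns_per_tick));
}

/* Convert a tick count back to nanoseconds for a given hz. */
static sk_hrtime_t
sk_ticks_to_ns(sk_clock_t ticks, int hz)
{
	assert(hz > 0);
	return ((sk_hrtime_t)ticks * (SK_NANOSEC / hz));
}
```

With hz at 100, a 50-tick delta is 500 ms; with hz at 1000 the same 50
ticks is only 50 ms, so every caller that hardcodes tick counts would need
a statement like the above.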

If a new interface with that characteristic is necessary or desired, I 
believe it should be introduced so that folks can gradually migrate 
their code to it: not by having someone change only the immediate routine 
that consumes the interface, but with a full understanding of how 
their code handles time.

Thanks,
Rafael


Edward Pilatowicz wrote:
> hey rafael,
> 
> for the new cv_* functions the delta is still specified by a clock_t,
> which iirc is still subject to the value of "hz".  many driver writers
> incorrectly assume that "hz" is always 100.  hence when some
> brave/foolish person comes along and sets "hires_tick" or a custom "hz"
> value, things break.  getting to my point, if we're introducing new
> interfaces to allow driver writers to simplify their drivers, perhaps we
> could move away from having them deal with clock_t values altogether?
> did you consider creating the new interfaces to allow the caller to
> simply specify the requested delta directly in
> NANOSEC/MICROSEC/MILLISEC/SEC values?
> 
> ed
> 
> On Tue, Sep 01, 2009 at 02:22:45PM -0700, Jerry Gilliam wrote:
>> I am sponsoring the following fast-track on behalf of Rafael Vanoni,
>> with a time-out of 09/09/2009.  The project desires
>> minor/major binding, plus micro/patch binding for one
>> interface, as specified.
>>
>> -------------------------------------
>>
>> Template Version: @(#)onepager.txt 1.35 07/11/07 SMI
>> Copyright 2007 Sun Microsystems
>>
>> 1. Introduction
>>    1.1. Project/Component Working Name:
>>        Tickless Kernel Architecture / lbolt decoupling
>>
>>    1.2. Name of Document Author/Supplier:
>>        Rafael Vanoni Polanczyk (rafael.vanoni at sun.com)
>>
>>    1.3. Date of This Document:
>>     08/04/09
>>
>>     1.3.1. Date this project was conceived:
>>         07/01/09
>>
>>    1.4. Name of Major Document Customer(s)/Consumer(s):
>>     1.4.1. The PAC or CPT you expect to review your project:
>>         Solaris PAC
>>     1.4.2. The ARC(s) you expect to review your project:
>>     1.4.3. The Director/VP who is "Sponsoring" this project:
>>         Greg.Lavender at Sun.COM
>>     1.4.4. The name of your business unit:
>>         Systems
>>
>>    1.5. Email Aliases:
>>         1.5.1. Responsible Manager: darrin.johnson at sun.com
>>         1.5.2. Responsible Engineer: rafael.vanoni at sun.com
>>         1.5.3. Marketing Manager: mike.mulkey at sun.com
>>         1.5.4. Interest List: tickless-dev at opensolaris.org
>>
>>
>> 2. Project Summary
>>    2.1. Project Description:
>>        The tickless project aims at implementing the services provided by the
>>        clock cyclic in an event-driven fashion. The first sub-project is the
>>        decoupling of the lbolt and lbolt64 variables from clock(). These two
>>        variables are incremented at each firing of the clock cyclic and provide
>>        a time reference to the system. They are being replaced by two routines
>>        that are backed by gethrtime(): the existing ddi_get_lbolt() and
>>        the new ddi_get_lbolt64(), introduced as a migration path for existing
>>        non-DDI compliant consumers.
>>
>>        This project also presents a solution to minimize the usage of the DDI
>>        lbolt routines through new interfaces, and a method to prevent any
>>        performance impact from migrating inexpensive references to variables
>>        to routine calls. These are described in detail in section 4.1.
>>
>>
>> 4. Technical Description:
>>     4.1. Details:
>>     lbolt and lbolt64 variables will be replaced by two routines,
>>     ddi_get_lbolt() and ddi_get_lbolt64(), which are backed by a hardware
>>     counter to provide the same service in an event-driven way.
>>
>>        Among the major consumers of the lbolt service are the cv_timedwait()
>>        and cv_timedwait_sig() routines, which require lbolt to form one of
>>        their arguments (an absolute value of time) and once again internally
>>        to decompose it into a relative time. This project is introducing two
>>        new routines, cv_reltimedwait() and cv_reltimedwait_sig(), which will
>>        perform the same service as the previously mentioned routines but
>>        simply receive a relative time, not requiring lbolt at all. These new
>>        routines will also have a new argument of type time_res_t to inform
>>        the underlying timeout system as to how accurately the given timeout
>>        must expire. This will allow the kernel to anticipate or defer such
>>        timeouts when possible, allowing the system to stay idle for longer
>>        periods of time.
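
As a rough illustration of the difference (a user-space mock, not the
kernel implementation), the sketch below counts how often a mock lbolt is
read under each calling pattern. All sk_ names are hypothetical. In a
driver, going by the signatures in this proposal, the two call shapes
would be cv_timedwait(cvp, mp, ddi_get_lbolt() + delta) and
cv_reltimedwait(cvp, mp, delta, TR_CLOCK_TICK).

```c
#include <assert.h>

typedef long sk_clock_t;		/* stand-in for clock_t */

static sk_clock_t sk_lbolt = 5000;	/* mock "ticks since boot" */
static int sk_lbolt_reads = 0;		/* how many times lbolt was consulted */

static sk_clock_t
sk_get_lbolt(void)
{
	sk_lbolt_reads++;
	return (sk_lbolt);
}

/*
 * Mock of an absolute-deadline wait: it has to read lbolt again to turn
 * the deadline back into a delta. Returns the ticks it would sleep.
 */
static sk_clock_t
sk_timedwait_abs(sk_clock_t deadline)
{
	sk_clock_t delta = deadline - sk_get_lbolt();
	return (delta > 0 ? delta : 0);
}

/* Mock of a relative wait: the delta is used as given, no lbolt needed. */
static sk_clock_t
sk_timedwait_rel(sk_clock_t delta)
{
	return (delta > 0 ? delta : 0);
}

/* Caller pattern for the absolute interface: one more lbolt read. */
static sk_clock_t
sk_caller_abs(sk_clock_t delta)
{
	return (sk_timedwait_abs(sk_get_lbolt() + delta));
}

/* Caller pattern for the relative interface. */
static sk_clock_t
sk_caller_rel(sk_clock_t delta)
{
	return (sk_timedwait_rel(delta));
}
```

Both patterns sleep for the same number of ticks in this mock; the
absolute one reads lbolt twice to get there, the relative one not at all.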
>>
>>        Some consumers of the lbolt and lbolt64 variables may have implicit
>>        dependencies on the cheapness of reading a memory location, which will
>>        be exposed when they are migrated to a gethrtime()-backed routine. In
>>        such cases, migrating references to lbolt and lbolt64 to ddi_get_lbolt()
>>        and ddi_get_lbolt64() will have a negative performance impact. To address
>>        this case, our project will perform the lbolt service in a hybrid way,
>>        switching from event to cyclic driven when the DDI lbolt routines are
>>        being heavily used. This cyclic mode will reprogram a timer that
>>        expires at each clock tick and increments an internal (lbolt-like)
>>        variable, whose value is returned to the consumer. This cyclic will only
>>        be activated during periods of heavy load, and will switch itself off
>>        when the activity subsides.
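
The switching logic can be pictured roughly as follows. This is only a
sketch under assumptions: the thresholds, the per-tick call counter and
every sk_ name are invented here, and the proposal does not say how
"heavily used" is measured. The high-resolution time is passed in as a
parameter so the example stays self-contained.

```c
#include <assert.h>
#include <stdint.h>

enum sk_mode { SK_EVENT, SK_CYCLIC };

struct sk_lbolt_state {
	enum sk_mode	mode;
	int64_t		ticks;	/* last tick value handed out */
	unsigned	calls;	/* calls seen during the current tick */
};

#define	SK_CALLS_HI	1000u	/* hypothetical: switch to cyclic above this */
#define	SK_CALLS_LO	10u	/* hypothetical: switch back below this */

/*
 * Event mode derives ticks from the high-resolution time on every call.
 * Cyclic mode just returns the variable maintained by sk_cyclic_fire().
 */
static int64_t
sk_get_lbolt64(struct sk_lbolt_state *s, int64_t hr_ns, int hz)
{
	assert(hz > 0);
	s->calls++;
	if (s->mode == SK_CYCLIC)
		return (s->ticks);

	int64_t now = hr_ns / (1000000000LL / hz);
	if (now != s->ticks) {		/* new tick: restart the call count */
		s->ticks = now;
		s->calls = 1;
	}
	if (s->calls > SK_CALLS_HI)	/* heavy use: let the cyclic take over */
		s->mode = SK_CYCLIC;
	return (now);
}

/* Fires once per clock tick; only does work while cyclic mode is active. */
static void
sk_cyclic_fire(struct sk_lbolt_state *s)
{
	if (s->mode != SK_CYCLIC)
		return;
	s->ticks++;
	if (s->calls < SK_CALLS_LO)	/* activity subsided: switch off */
		s->mode = SK_EVENT;
	s->calls = 0;
}
```

The point of the design is that a cheap variable read is only paid for
with a periodic timer while callers are actually hammering the interface.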
>>
>>     The decision to remove the lbolt and lbolt64 variables was made during
>>     design review, and a consensus was reached on the basis that, since
>>     we're reaching the end of a major release, this is the right moment to
>>     obsolete these. The side effects and cost of maintaining such symbols
>>     outweigh the benefits. However, this decision can be re-evaluated in
>>     case the negative impact on 3rd party modules during the development
>>     release is greater than expected. We're working with ISV and RPE to
>>     minimize the impact pro-actively.
>>
>>     4.2. Bug/RFE Number(s):
>>          6860030 tickless clock requires a clock() decoupled lbolt / lbolt64
>>
>>     4.5. Interfaces:
>>         This project is adding the following interfaces to the DDI:
>>
>>         int64_t ddi_get_lbolt64(void);
>>
>>         clock_t cv_reltimedwait(kcondvar_t *cvp, kmutex_t *mp, clock_t delta,
>>             time_res_t res);
>>
>>         clock_t cv_reltimedwait_sig(kcondvar_t *cvp, kmutex_t *mp, clock_t
>>             delta, time_res_t res);
>>
>>         With time_res_t defined as
>>
>>         enum time_res {
>>             TR_NANOSEC,
>>             TR_MICROSEC,
>>             TR_MILLISEC,
>>             TR_SEC,
>>             TR_CLOCK_TICK,
>>             TR_COUNT
>>         };
>>
>>         typedef enum time_res time_res_t;
>>
>>         In addition to that, the lbolt and lbolt64 variables (which are
>>         *private* symbols known to be used by non-DDI compliant modules) are
>>         being removed. 3rd party modules that are not brought up to speed
>>         will fail to load.
>>
>>     In summary:
>>
>>     Interface                Commitment  Comments
>>     -----------------------------------------------------------------------
>>     ddi_get_lbolt64()        Public/DDI  return lbolt64
>>     cv_reltimedwait(9F)      Public/DDI  cv_timedwait(9F), relative time
>>     cv_reltimedwait_sig(9F)  Public/DDI  cv_timedwait_sig(9F), relative time
>>     lbolt                    Obsolete    commonly referenced kernel symbol
>>     lbolt64                  Obsolete    commonly referenced kernel symbol
>>
>>     We also plan on backporting the ddi_get_lbolt64() interface to Solaris
>>     10 Update 9 to extend the migration path for S10 users who would like
>>     to update their modules before moving to Solaris Nevada or the next
>>     version of Solaris. These users already have ddi_get_lbolt() but
>>     currently lack the 64-bit version of it. This backport will have
>>     patch release binding.
>>
>>
>>     4.6. Doc Impact:
>>         6868417 updates for tickless kernel/lbolt decoupling (6860030)
>>
>>         Updates to the 'Writing Device Drivers' document are necessary; the
>>         project team is in contact with the documentation group to address
>>         these.
>>
>>
>> 5. Reference Documents:
>>     This project is being developed through OpenSolaris; our project pages
>>     and alias contain all the necessary information:
>>         http://opensolaris.org/os/project/tickless/
>>         http://opensolaris.org/os/project/tickless/tasks/lbolt/
>>         tickless-dev at opensolaris.org
>>
>>
>> 6. Resources and Schedule:
>>     6.5. ARC review type: Fast track
>>     6.6. ARC Exposure: open
>>
>>
>>
>>
>>
>> Updates to existing man pages:
>> ------------------------------
>>
>> drv_getparm.9f
>>
>> PARAMETERS
>>      ...
>>
>>      LBOLT     Read the value of lbolt. lbolt is a  clock_t  that    |
>>                represents the number of clock ticks since system     |
>>                boot. No special treatment is  applied  when          |
>>                this  value  overflows  the  maximum  value of the
>>                signed integral type clock_t.  When  this  occurs,
>>                its value will be negative, and its magnitude will
>>                be decreasing until it again passes zero.  It  can
>>                ...
>>
>>
>>
>>
>> drv_hztousec.9f
>>
>> DESCRIPTION
>>      The drv_hztousec() function converts into  microseconds  the
>>      time expressed by hertz, which is in system clock ticks.
>>
>>      The length of time the system has been up since boot can be    |
>>      retrieved by calling ddi_get_lbolt(9F), which will return a    |
>>      value of type clock_t containing the number of clock ticks
>>      since boot. Drivers often use this value before and after an
>>      I/O request to measure the amount of time it took the device to
>>      process the request. The drv_hztousec() function can be used
>>      by the driver to convert the reading from clock ticks  to  a
>>      known unit of time.
>>
>>
>>
>>
>> Intro.9f
>>
>> Kernel Functions for Drivers                            Intro(9F)
>>
>>      ddi_get_instance                  Solaris DDI
>>      ddi_get_kt_did                    Solaris DDI
>>      ddi_get_lbolt                     Solaris DDI
>>      ddi_get_lbolt64                   Solaris DDI            +
>>      ddi_get_name                      Solaris DDI
>>      ...
>>
>>
>>
>>
>> Updated ddi_get_lbolt.9f:
>> -------------------------
>>
>> Kernel Functions for Drivers                    ddi_get_lbolt(9F)
>>
>> NAME
>>      ddi_get_lbolt - returns the number of clock ticks since boot    |
>>
>> SYNOPSIS
>>      #include <sys/types.h>
>>      #include <sys/ddi.h>
>>      #include <sys/sunddi.h>
>>
>>      clock_t ddi_get_lbolt(void);
>>
>> INTERFACE LEVEL
>>      Solaris DDI specific (Solaris DDI).
>>
>> DESCRIPTION
>>      ddi_get_lbolt() returns a value that represents the number        |
>>      of clock ticks since the system booted.  This value is        |
>>      used  as  a  counter  or timer  inside  the  system kernel.
>>      The tick frequency can be determined  by using drv_usectohz(9F)
>>      which converts microseconds into clock ticks.
>>
>>
>> RETURN VALUES
>>      ddi_get_lbolt() returns the number of clock ticks since boot    |
>>      in clock_t type.
>>
>> CONTEXT
>>       This routine can be called from any context.
>>
>> SEE ALSO
>>      ddi_get_lbolt64(9F), ddi_get_time(9F), drv_getparm(9F),
>>      drv_usectohz(9F)
>>
>>
>>
>>
>> New man page for ddi_get_lbolt64():
>> -----------------------------------
>>
>> Kernel Functions for Drivers                    ddi_get_lbolt64(9F)
>>
>> NAME
>>      ddi_get_lbolt64 - returns the number of clock ticks since boot
>>      in int64_t type
>>
>> SYNOPSIS
>>      #include <sys/types.h>
>>      #include <sys/ddi.h>
>>      #include <sys/sunddi.h>
>>
>>      int64_t ddi_get_lbolt64(void);
>>
>> INTERFACE LEVEL
>>      Solaris DDI specific (Solaris DDI).
>>
>> DESCRIPTION
>>      ddi_get_lbolt64() returns a value that represents the number
>>      of clock ticks since the system booted.  This value is
>>      used  as  a  counter  or timer  inside  the  system kernel. It is
>>      essentially the same value returned by ddi_get_lbolt(9F), but in a
>>      longer data type that will not wrap for 2.9 billion years.
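
The 2.9 billion year figure can be checked with simple arithmetic,
assuming hz is 100 (the man page text does not state the tick rate behind
the number). The helper name below is made up for this check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Years until a tick counter with the given maximum value wraps, for a
 * given tick rate. Uses a 365.25-day year; the result is approximate.
 */
static double
sk_years_until_wrap(int64_t max_ticks, int hz)
{
	assert(hz > 0);
	const double secs_per_year = 365.25 * 24.0 * 60.0 * 60.0;
	return ((double)max_ticks / hz / secs_per_year);
}
```

Under that assumption a 64-bit counter lasts roughly 2.9 billion years,
and about a tenth of that at hz = 1000. A 32-bit signed counter at
hz = 100 lasts well under a year, which is why drivers on 32-bit clock_t
have to think about wrapping at all.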
>>
>> RETURN VALUES
>>      ddi_get_lbolt64() returns the number of clock ticks since boot
>>      in int64_t type.
>>
>> CONTEXT
>>       This routine can be called from any context.
>>
>> SEE ALSO
>>      ddi_get_lbolt(9F), ddi_get_time(9F)
>>
>>      Writing Device Drivers
>>
>>       STREAMS Programming Guide
>>
>> SunOS 5.11          Last change: 29 Jul 2009                    1
>>
>>
>> Updates to condvar(9f):
>> ----------------------
>>
>> Kernel Functions for Drivers                          condvar(9F)
>>
>> NAME
>>      condvar,   cv_init,    cv_destroy,    cv_wait,    cv_signal,
>>      cv_broadcast,  cv_wait_sig, cv_timedwait, cv_timedwait_sig,
>>      cv_reltimedwait, cv_reltimedwait_sig - condition variable
>>      routines
>>
>> SYNOPSIS
>>      #include <sys/ksynch.h>
>>
>>      void cv_init(kcondvar_t *cvp, char *name, kcv_type_t type, void *arg);
>>
>>      void cv_destroy(kcondvar_t *cvp);
>>
>>      void cv_wait(kcondvar_t *cvp, kmutex_t *mp);
>>
>>      void cv_signal(kcondvar_t *cvp);
>>
>>      void cv_broadcast(kcondvar_t *cvp);
>>
>>      int cv_wait_sig(kcondvar_t *cvp, kmutex_t *mp);
>>
>>      clock_t cv_timedwait(kcondvar_t *cvp, kmutex_t *mp, clock_t timeout);
>>
>>      clock_t cv_timedwait_sig(kcondvar_t *cvp, kmutex_t *mp,
>>          clock_t timeout);
>>
>> |    clock_t cv_reltimedwait(kcondvar_t *cvp, kmutex_t *mp, clock_t delta,
>> |        time_res_t resolution);
>>
>> |    clock_t cv_reltimedwait_sig(kcondvar_t *cvp, kmutex_t *mp,
>> |        clock_t delta, time_res_t resolution);
>>
>> INTERFACE LEVEL
>>      Solaris DDI specific (Solaris DDI).
>>
>> PARAMETERS
>>      cvp        A pointer to an abstract data type kcondvar_t.
>>
>>      mp         A pointer to a mutual exclusion lock  (kmutex_t),
>>                 initialized  by  mutex_init(9F)  and  held by the
>>                 caller.
>>
>>      name       Descriptive string. This is obsolete  and  should
>>                 be NULL. (Non-NULL strings are legal, but they're
>>                 a waste of kernel memory.)
>>
>> SunOS 5.11          Last change: 02 Aug 2009                    1
>>
>>      type       The constant CV_DRIVER.
>>
>>      arg        A type-specific argument, drivers should pass arg
>>                 as NULL.
>>
>>      timeout    A  time,  in  absolute  ticks  since  boot,  when
>>                 cv_timedwait()   or   cv_timedwait_sig()   should
>>                 return.
>>
>> |    delta      A time, in relative ticks, after which cv_reltimedwait()
>> |               or cv_reltimedwait_sig() should return.
>> |
>> |    resolution A flag that specifies how accurate the relative
>> |               time interval should be. Possible values are
>> |               TR_NANOSEC, TR_MICROSEC, TR_MILLISEC, TR_SEC or
>> |               TR_CLOCK_TICK, the latter indicating that the interval
>> |               should be aligned to system clock ticks. This
>> |               information allows the system to anticipate or
>> |               defer the timeout expiration in order to batch process
>> |               similarly expiring events, allowing the system to
>> |               stay idle for longer periods of time and enhancing
>> |               its power efficiency.
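
One way to picture how a resolution hint lets timeouts be batched is to
round each expiry up to the granularity the caller asked for. The sketch
below is an assumption about the idea, not the proposed implementation;
the sk_ names and the mapping from resolution to nanoseconds are
hypothetical, and it only shows deferral (rounding up), not anticipation.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the shape of the proposed time_res enum; names are stand-ins. */
enum sk_time_res {
	SK_TR_NANOSEC,
	SK_TR_MICROSEC,
	SK_TR_MILLISEC,
	SK_TR_SEC,
	SK_TR_CLOCK_TICK,
	SK_TR_COUNT
};

/* Granularity, in nanoseconds, assumed for each resolution value. */
static int64_t
sk_res_to_ns(enum sk_time_res res, int hz)
{
	switch (res) {
	case SK_TR_MICROSEC:	return (1000LL);
	case SK_TR_MILLISEC:	return (1000000LL);
	case SK_TR_SEC:		return (1000000000LL);
	case SK_TR_CLOCK_TICK:	return (1000000000LL / hz);
	case SK_TR_NANOSEC:
	default:		return (1LL);
	}
}

/*
 * Defer an expiry to the next multiple of the requested granularity, so
 * that timeouts landing in the same window can share one wakeup.
 */
static int64_t
sk_align_expiry(int64_t expiry_ns, enum sk_time_res res, int hz)
{
	assert(expiry_ns >= 0 && hz > 0);
	int64_t g = sk_res_to_ns(res, hz);
	return ((expiry_ns + g - 1) / g * g);
}
```

Two timeouts due at 1.2 ms and 1.9 ms, both tagged with millisecond
resolution, end up on the same 2 ms boundary here, so the system would
wake once instead of twice.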
>>
>>
>> DESCRIPTION
>>      Condition variables are a standard form of thread synchroni-
>>      zation.  They  are designed to be used with mutual exclusion
>>      locks (mutexes). The associated mutex is used to ensure that
>>      a  condition  can  be checked atomically and that the thread
>>      can block on the associated condition variable without miss-
>>      ing  either  a  change to the condition or a signal that the
>>      condition has changed. Condition variables must be  initial-
>>      ized  by  calling cv_init(), and must be deallocated by cal-
>>      ling cv_destroy().
>>
>>      The usual use of condition variables is to check a condition
>>      (for  example, device state, data structure reference count,
>>      etc.) while holding a mutex which keeps other  threads  from
>>      changing  the  condition.  If the condition is such that the
>>      thread should block, cv_wait() is called with a related con-
>>      dition  variable and the mutex. At some later point in time,
>>      another thread would acquire the mutex,  set  the  condition
>>      such  that the previous thread can be unblocked, unblock the
>>      previous thread with cv_signal() or cv_broadcast(), and then
>>      release the mutex.
>>
>>      cv_wait() suspends the calling thread and  exits  the  mutex
>>      atomically so that another thread which holds the mutex can-
>>      not signal on the  condition  variable  until  the  blocking
>>      thread  is  blocked.  Before  returning,  the mutex is reac-
>>      quired.
>>
>>      cv_signal() signals the  condition  and  wakes  one  blocked
>>      thread.  All  blocked  threads  can  be unblocked by calling
>>      cv_broadcast(). cv_signal() and cv_broadcast() can be called
>>      by  a  thread even if it does not hold the mutex passed into
>>      cv_wait(), though holding the mutex is necessary  to  ensure
>>      predictable scheduling.
>>
>>      The function  cv_wait_sig()  is  similar  to  cv_wait()  but
>>      returns  0  if a signal (for example, by kill(2)) is sent to
>>      the thread. In any case,  the  mutex  is  reacquired  before
>>      returning.
>>
>>      The function cv_timedwait() is similar to cv_wait(),  except
>>      that  it  returns  -1  without  the condition being signaled
>>      after the timeout time has been reached.
>>
>>      The function cv_timedwait_sig() is similar to cv_timedwait()
>>      and  cv_wait_sig(),  except  that  it returns -1 without the
>>      condition being signaled after the  timeout  time  has  been
>>      reached,  or 0 if a signal (for example, by kill(2)) is sent
>>      to the thread.
>>
>>      For both cv_timedwait() and cv_timedwait_sig(), time  is  in
>>      absolute  clock  ticks  since  the  last  system reboot. The
>>      current time may be found by calling ddi_get_lbolt(9F).
>>
>> |     The cv_reltimedwait() function is similar to cv_timedwait(),
>> |     except that it takes a relative time value as argument and
>> |     it also takes an additional argument to specify the accuracy
>> |     of such interval. cv_reltimedwait_sig() is analogous to
>> |     cv_timedwait_sig(), but takes the same arguments as
>> |     cv_reltimedwait().
>>
>> RETURN VALUES
>>      0        For cv_wait_sig(), cv_timedwait_sig() and
>>               cv_reltimedwait_sig(), indicates
>>               that the condition was not necessarily signaled and
>>               the function  returned  because  a  signal  (as  in
>>               kill(2)) was pending.
>>
>> |     -1       For cv_timedwait(), cv_timedwait_sig(),
>> |              cv_reltimedwait() and cv_reltimedwait_sig() indicates
>>               that the condition was not necessarily signaled and
>>               the function returned because the timeout time  was
>>               reached.
>>
>> |     >0       For cv_wait_sig(), cv_timedwait(), cv_timedwait_sig(),
>> |               cv_reltimedwait() or cv_reltimedwait_sig()
>> |                indicates that the condition was
>>               met and the function returned  due  to  a  call  to
>>               cv_signal()  or  cv_broadcast(), or due to a prema-
>>               ture wakeup (see NOTES).
>>
>> CONTEXT
>>      These functions can be called from user, kernel or interrupt
>>      context.  In most cases, however, cv_wait(), cv_timedwait(),
>> |     cv_wait_sig(), cv_timedwait_sig(), cv_reltimedwait() and
>> |     cv_reltimedwait_sig() should not be called
>>      from  interrupt  context,  and cannot be called from a high-
>>      level interrupt context.
>>
>>      If    cv_wait(),    cv_timedwait(),    cv_wait_sig(),
>> |     cv_timedwait_sig(), cv_reltimedwait() or cv_reltimedwait_sig()
>> |     are used from interrupt context, lower-priority interrupts
>>      will not be serviced during the wait.
>>      This  means  that if the thread that will eventually perform
>>      the wakeup becomes blocked on  anything  that  requires  the
>>      lower-priority interrupt, the system will hang.
>>
>>      For example, the thread that will  perform  the  wakeup  may
>>      need  to  first  allocate memory. This memory allocation may
>>      require waiting  for  paging  I/O  to  complete,  which  may
>>      require  a  lower-priority  disk  or network interrupt to be
>>      serviced. In general,  situations  like  this  are  hard  to
>>      predict,  so  it  is advisable to avoid waiting on condition
>>      variables or semaphores in an interrupt context.
>>
>> EXAMPLES
>>      Example 1 Waiting for a Flag Value in a Driver's Unit
>>
>>      Here the condition being waited for is a  flag  value  in  a
>>      driver's  unit  structure. The condition variable is also in
>>      the unit structure, and the flag  word  is  protected  by  a
>>      mutex in the unit structure.
>>
>>             mutex_enter(&un->un_lock);
>>             while (un->un_flag & UNIT_BUSY)
>>               cv_wait(&un->un_cv, &un->un_lock);
>>             un->un_flag |= UNIT_BUSY;
>>             mutex_exit(&un->un_lock);
>>
>>      Example 2 Unblocking Threads Blocked by the Code in  Example
>>      1
>>
>>      At some later point in time, another  thread  would  execute
>>      the  following  to  unblock any threads blocked by the above
>>      code.
>>
>>        mutex_enter(&un->un_lock);
>>        un->un_flag &= ~UNIT_BUSY;
>>        cv_broadcast(&un->un_cv);
>>        mutex_exit(&un->un_lock);
>>
>> NOTES
>> |     It is possible for cv_wait(), cv_wait_sig(), cv_timedwait(),
>> |     cv_timedwait_sig(), cv_reltimedwait() and cv_reltimedwait_sig()
>> |     to return prematurely, that is, not
>>      due to a call to cv_signal() or cv_broadcast(). This  occurs
>>      most commonly in the case of cv_wait_sig(),
>> |     cv_timedwait_sig() and cv_reltimedwait_sig() when the thread
>> |     is stopped and restarted
>>      by  job  control signals or by a debugger, but can happen in
>>      other cases as well, even for  cv_wait().  Code  that  calls
>>      these  functions must always recheck the reason for blocking
>>      and call again if the reason for blocking is still true.
>>
>> |     If your driver needs to wait on  behalf  of  processes  that
>> |     have  real-time  constraints, use cv_timedwait() or cv_reltimedwait()
>> |     rather than
>>      delay(9F). The delay() function calls timeout(9F), which can
>>      be subject to priority inversions.
>>
>>      Not  all  threads  can  receive  signals  from  user   level
>>      processes. In cases where such reception is impossible (such
>>      as  during  execution  of   close(9E)   due   to   exit(2)),
>>      cv_wait_sig()  behaves  as cv_wait(), cv_timedwait_sig()
>> |     behaves as cv_timedwait() and cv_reltimedwait_sig() behaves as
>> |     cv_reltimedwait().
>>      To  avoid  unkillable  processes,
>>      users of these functions may need to protect against waiting
>>      indefinitely  for  events  that   might   not   occur.   The
>>      ddi_can_receive_sig(9F)  function is provided to detect when
>>      signal reception is possible.
>>
>> SEE ALSO
>>      kill(2),     ddi_can_receive_sig(9F),     ddi_get_lbolt(9F),
>> |     ddi_get_lbolt64(9F), mutex(9F), mutex_init(9F)
>>
>>      Writing Device Drivers
>>
>> SunOS 5.11          Last change: 02 Aug 2009                    5
