On 22.06.2012 11:25, Konstantin Belousov wrote:
On Fri, Jun 22, 2012 at 11:08:50AM +0300, Alexander Motin wrote:
On 06/22/12 10:06, Konstantin Belousov wrote:
Author: kib
Date: Fri Jun 22 07:06:40 2012
New Revision: 237433
URL: http://svn.freebsd.org/changeset/base/237433

Log:
   Implement mechanism to export some kernel timekeeping data to
   usermode, using shared page.  The structures and functions have vdso
   prefix, to indicate the intended location of the code in some future.

   The versioned per-algorithm data is exported in the format of struct
   vdso_timehands, which mostly repeats the content of in-kernel struct
   timehands. Usermode reading of the structure can be lockless.
   Compatibility export for 32bit processes on 64bit host is also
   provided. Kernel also provides usermode with indication about
   currently used timecounter, so that libc can fall back to syscall if
   configured timecounter is unknown to usermode code.

   The shared data updates are initiated both from the tc_windup(), where
   a fast task is queued to do the update, and from sysctl handlers which
   change timecounter. A manual override switch
   kern.timecounter.fast_gettime allows to turn off the mechanism.

   Only x86 architectures export the real algorithm data, and there, only
   for tsc timecounter. HPET counters page could be exported as well, but
   I prefer to not further glue the kernel and libc ABI there until
   proper vdso-based solution is developed.

   Minimal stubs neccessary for non-x86 architectures to still compile
   are provided.

   Discussed with:      bde
   Reviewed by: jhb
   Tested by:   flo
   MFC after:   1 month


@@ -1360,6 +1367,7 @@ tc_windup(void)
  #endif

        timehands = th;
+       taskqueue_enqueue_fast(taskqueue_fast,&tc_windup_push_vdso_task);
  }

  /* Report or change the active timecounter hardware. */

This taskqueue_enqueue_fast() will schedule extra thread to run each
time hardclock() fires. One thread may be not a big problem, but
together with callout swi and possible other threads woken up there it
will wake up several other CPU cores from sleep just to put them back in
few microseconds. Now davide@ and me are trying to fix that by avoiding
callout SWI use for simple tasks. Please, let's not create another
problem same time.

The patch was public for quite a time. If you have some comments about
it, it would be much more productive to let me know about them before
the commit, not after.

I'm sorry, I haven't seen it. My mad.

Anyway, what is your proposal for 'let's not create another problem
same time' part of the message ? It was discussed, as a possibility,
to have permanent mapping for the shared page in the KVA and to perform
lock-less update of the struct vdso_timehands directly from hardclock
handler. My opinion was that amount of work added by tc_windup
eventhandler is not trivial, so it is better to be postponed to
less critical context. It also slightly more safe to not perform
lockless update for vdso_timehands, since otherwise module load which
register exec handler could cause transient gettimeofday() failure
in usermode.

This might boil down to the fact that tc_windup function is called
too often, in fact. Also, packing execution of tc_windup eventhandler
together with the clock swi is fine from my POV.

I have nothing against using shared pages. On UP system I would probably have not so much against several threads. But on SMP system it will cause at least one, but in many cases two extra CPUs to be woken up. There are two or more threads to run on hardclock(): this taskqueue, callout swi and some thread(s) woken from callouts. Scheduler has no idea how heavy they are. So it will try to move each of them to separate idle CPU. Does the amount of work done in event handlers worth extra Watts consumed by rapidly waking CPUs? As quite rare person running FreeBSD on laptop, I am not sure. I am not sure even that on desktop/server this won't kill all benefit of fast clocks by limiting TurboBoost.

--
Alexander Motin
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to