Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Sat, 23 Jun 2012, Alexander Motin wrote: On 06/23/12 18:26, Bruce Evans wrote: On Sat, 23 Jun 2012, Konstantin Belousov wrote: On Sat, Jun 23, 2012 at 03:17:57PM +0200, Marius Strobl wrote: So apart from introducing code to constantly synchronize the TICK counters, using the timecounters on the host busses also seems to be the only viable solution for userland. The latter should be doable but is long-winded as besides duplicating portions of the corresponding device drivers in userland, it probably also means to get some additional infrastructure like being able to memory map registers for devices on the nexus(4) level in place ...

There is little point in optimizations to avoid syscalls for hardware. On x86, a syscall takes 100-400 nsec extra, so if the hardware takes 500-2000 nsec then reducing the total time by 100-400 nsec is not very useful.

Just out of curiosity I've run my own binuptime() micro-benchmarks:
- on Core i5-650: TSC 11ns, HPET 433ns, ACPI-fast 515ns, i8254 3736ns

The TSC is surprisingly fast and the others are depressingly slow, although about the fastest I've seen for bus-based timecounters. On Athlon64, rdtsc() takes 6.5 cycles, but I thought all P-state invariant TSCs took 40 cycles. rdtsc() takes 65 cycles on FreeBSD x86 cluster machines (core2 Xeon), except on freefall (P4(?) Xeon). I hardly believe 11ns. That's 44 cycles at 4GHz. IIRC, the Athlon64 at 2.2GHz took 29nsec for binuptime() last time I measured it (long ago, when it still had the statistics counter pessimization).

- on dual-socket Xeon E5645: TSC 15ns, HPET 580ns, ACPI-fast 1118ns, i8254 3911ns

I think it could be useful to have that small benchmark in base kernel.

I think kib put one in src/tools for userland. I mostly use a userland one. Except for the TSC, the overhead for the kernel parts can be estimated accurately from userland, since it is so large. This is more normal slowness for ACPI-[!]fast.
freefall still uses ACPI-fast and it takes a minimum of 1396 and an average of 1729nsec from userland (load average 1.3). Other x86 cluster machines now use TSC-[s]low, and it takes a minimum of 481 and an average of 533nsec (now the swing from 481 to 533 is given by its gratuitous impreciseness and not by system load).

BTW, the i8254 timecounter can be made about 3/2 times faster if anyone cared, by reading only the low 8 bits of the timer. This would require running clock interrupts at >= 4kHz so that the top 8 bits are rarely needed (great for a tickless kernel :-), or maybe by using a fuzzier timer to determine when the top bits are needed. At ~2500ns, it would be only slightly slower than the slowest ACPI-fast, and faster than ACPI-safe. OTOH, I have measured i8254 timer reads taking 138000ns (on UP with interrupts disabled) on a system where they normally take only 4000ns. Apparently the ISA bus waits for other bus activity (DMA?) for that long. Does this happen for other buses? Extra bridges for ISA can't help. ...

The new timeout code to support tickless kernels looks like it will give large pessimizations unless the timecounter is fast. Instead of using the tick counter (1 atomic increment on every clock tick) and some getbinuptime() calls in places like select(), it uses the hardware timecounter via binuptime() in most places (since without a tick counter and without clock interrupts updating the timehands periodically, it takes a hardware timecounter read to determine the time). So callout_reset() might start taking thousands of nsec per call, depending on how slow the timecounter is. The fix is probably to use a fuzzy time for long timeouts and to discourage use of short timeouts and/or to turn them into long or fuzzy timeouts so that they are not very useful.

The new timeout code is still in active development and optimization was not the first priority yet. My idea was to use the much faster getbinuptime() for periods above, let's say, 100ms.
You would need to run non-tickless with a clock interrupt frequency of >= 10Hz to keep getbinuptime() working. Seems like a bad thing to aim for. Better not use bintimes at all. I would try using pseudo-ticks, where the tick counter is advanced on every not-very-periodic clock interrupt and at some other times when you know that clock interrupts have been stopped, and maybe at other interesting places (all interrupts and all syscalls?). Only call binuptime() every few thousand pseudo-ticks to prevent long-term drift. Timeouts would become longer and fuzzier than now, but that is a feature (it inhibits using them for busy-waiting). You know when you scheduled clock interrupts and can advance the tick counter to represent the interval between clock interrupts fairly accurately (say to within 10%). The fuzziness comes mainly from not scheduling clock interrupts very often, so that for example when something asks for a sleep of 1 tick
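The pseudo-tick bookkeeping described above could look roughly like this in portable C. All names here are hypothetical illustrations, not FreeBSD code: the counter is advanced by the interval that was actually scheduled, and only occasionally resynchronized against an expensive hardware timecounter read.

```c
#include <stdint.h>

#define RESYNC_INTERVAL	4096		/* pseudo-ticks between hard resyncs */
#define NS_PER_TICK	1000000		/* nominal 1 ms tick */

/* Hypothetical pseudo-tick state (illustration only). */
struct pseudo_ticks {
	uint64_t ticks;		/* fuzzy tick counter */
	uint64_t last_sync;	/* tick value at last hard resync */
};

/*
 * Called from each clock interrupt with the interval that was actually
 * scheduled (a tickless kernel schedules irregular intervals).  The
 * counter advances by the scheduled interval, rounded to whole ticks,
 * with no timecounter hardware read; accuracy is bounded by the
 * scheduling error (~10% per the thread).
 */
static void
pseudo_tick_advance(struct pseudo_ticks *pt, uint64_t interval_ns)
{
	pt->ticks += (interval_ns + NS_PER_TICK / 2) / NS_PER_TICK;
}

/*
 * Every few thousand pseudo-ticks, correct long-term drift from a real
 * (expensive) hardware timecounter read, given as nanoseconds of
 * uptime.  Returns 1 if a resync happened.
 */
static int
pseudo_tick_maybe_resync(struct pseudo_ticks *pt, uint64_t hw_uptime_ns)
{
	if (pt->ticks - pt->last_sync < RESYNC_INTERVAL)
		return (0);
	pt->ticks = hw_uptime_ns / NS_PER_TICK;
	pt->last_sync = pt->ticks;
	return (1);
}
```

The design trade-off is exactly the one Bruce names: timeouts get fuzzier between resyncs, in exchange for removing the per-callout hardware read.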
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Fri, Jun 22, 2012 at 10:48:17AM +0300, Konstantin Belousov wrote: On Fri, Jun 22, 2012 at 09:34:56AM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though? Generally speaking, tsc state among different CPU after boot is not synchronized, you are right. Kernel has somewhat doubtful test which verifies whether the after-boot state of tsc looks good. If the test fails, TSC is not enabled by default as timecounter, and then usermode follows kernel policy and falls back to slow syscall. So we err on the safe side. I tested this on Core i7 2xxx, where the test (usually) passes. Okay, so for x86 the TSCs are not used as timecounters by either the kernel or userland in the SMP case if they don't appear to be synchronized, correct? While you are there. do you have comments about sparc64 TICK counter ? On SMP, the counter of BSP is used by IPI. Is it unavoidable ? The TICK counters are per-core and not synchronized by the hardware. We synchronize APs with the BSP on bring-up but they drift over time and the initial synchronization might not be perfect in the first place. At least in the past, drifting TICK counters caused all sorts of issues and strange behavior in FreeBSD when used as timecounter in the SMP case. 
If my understanding of the above is right, as is this still rules them out as timecounters for userland. Linux has some complex code (based on equivalent code originating in their ia64 port) for constantly synchronizing the TICK counters. In order to avoid that complexity and overhead, what I do in FreeBSD in the SMP case is to (ab)use counters (either intended for that purpose or bus cycle counters probably intended for debugging the hardware during development) available in the various host-to-foo bridges so it doesn't matter which CPU they are read by. This works just fine except for pre-PCI-Express based USIIIi machines, where the bus cycle counters are broken. That's where the TICK counter is always read from the BSP using an IPI in the SMP case. The latter is done as sched_bind(9) isn't possible with td_critnest > 1 according to information from jhb@ and mav@. So apart from introducing code to constantly synchronize the TICK counters, using the timecounters on the host busses also seems to be the only viable solution for userland. The latter should be doable but is long-winded as besides duplicating portions of the corresponding device drivers in userland, it probably also means to get some additional infrastructure like being able to memory map registers for devices on the nexus(4) level in place ...

Marius
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Sat, Jun 23, 2012 at 03:17:57PM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 10:48:17AM +0300, Konstantin Belousov wrote: On Fri, Jun 22, 2012 at 09:34:56AM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though? Generally speaking, tsc state among different CPU after boot is not synchronized, you are right. Kernel has somewhat doubtful test which verifies whether the after-boot state of tsc looks good. If the test fails, TSC is not enabled by default as timecounter, and then usermode follows kernel policy and falls back to slow syscall. So we err on the safe side. I tested this on Core i7 2xxx, where the test (usually) passes. Okay, so for x86 the TSCs are not used as timecounters by either the kernel or userland in the SMP case if they don't appear to be synchronized, correct? Correct as for now. But this is bug and not a feature. The tscs shall be synchronized, or skew tables calculated instead of refusing to use it. While you are there. do you have comments about sparc64 TICK counter ? On SMP, the counter of BSP is used by IPI. Is it unavoidable ? The TICK counters are per-core and not synchronized by the hardware. 
We synchronize APs with the BSP on bring-up but they drift over time and the initial synchronization might not be perfect in the first place. At least in the past, drifting TICK counters caused all sorts of issues and strange behavior in FreeBSD when used as timecounter in the SMP case. If my understanding of the above is right, as is this still rules them out as timecounters for userland. Linux has some complex code (based on equivalent code originating in their ia64 port) for constantly synchronizing the TICK counters. In order to avoid that complexity and overhead, what I do in FreeBSD in the SMP case is to (ab)use counters (either intended for that purpose or bus cycle counters probably intended for debugging the hardware during development) available in the various host-to-foo bridges so it doesn't matter which CPU they are read by. This works just fine except for pre-PCI-Express based USIIIi machines, where the bus cycle counters are broken. That's where the TICK counter is always read from the BSP using an IPI in the SMP case. The latter is done as sched_bind(9) isn't possible with td_critnest > 1 according to information from jhb@ and mav@. So apart from introducing code to constantly synchronize the TICK counters, using the timecounters on the host busses also seems to be the only viable solution for userland. The latter should be doable but is long-winded as besides duplicating portions of the corresponding device drivers in userland, it probably also means to get some additional infrastructure like being able to memory map registers for devices on the nexus(4) level in place ... Understand. I do plan eventually to map HPET counters page into usermode on x86. Also, as I noted above, some code to synchronize per-package counters would be useful for x86, so it might be developed with multi-arch usage in mind.
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Sat, 23 Jun 2012, Konstantin Belousov wrote: On Sat, Jun 23, 2012 at 03:17:57PM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 10:48:17AM +0300, Konstantin Belousov wrote: On Fri, Jun 22, 2012 at 09:34:56AM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though? Generally speaking, tsc state among different CPU after boot is not synchronized, you are right. Kernel has somewhat doubtful test which verifies whether the after-boot state of tsc looks good. If the test fails, TSC is not enabled by default as timecounter, and then usermode follows kernel policy and falls back to slow syscall. So we err on the safe side. I tested this on Core i7 2xxx, where the test (usually) passes. Okay, so for x86 the TSCs are not used as timecounters by either the kernel or userland in the SMP case if they don't appear to be synchronized, correct? Correct as for now. But this is bug and not a feature. The tscs shall be synchronized, or skew tables calculated instead of refusing to use it. While you are there. do you have comments about sparc64 TICK counter ? On SMP, the counter of BSP is used by IPI. Is it unavoidable ? The TICK counters are per-core and not synchronized by the hardware. 
We synchronize APs with the BSP on bring-up but they drift over time and the initial synchronization might not be perfect in the first place. At least in the past, drifting TICK counters caused all sorts of issues and strange behavior in FreeBSD when used as timecounter in the SMP case. If my understanding of the above is right, as is this still rules them out as timecounters for userland. Linux has some complex code (based on equivalent code originating in their ia64 port) for constantly synchronizing the TICK counters. In order to avoid that complexity and overhead, what I do in FreeBSD in the SMP case is to (ab)use counters (either intended

Attempted synchronization of TSCs is left out for the same reason on x86. Except some half-baked synchronization for a home made time function in dtrace (dtrace_gethrtime() on amd64 and i386) crept in.

for that purpose or bus cycle counters probably intended for debugging the hardware during development) available in the various host-to-foo bridges so it doesn't matter which CPU they are read by. This works just fine except for pre-PCI-Express based USIIIi machines, where the bus cycle counters are broken. That's where the TICK counter is always read from the BSP using an IPI in the SMP case. The latter is done as sched_bind(9) isn't possible with td_critnest > 1 according to information from jhb@ and mav@.

How can it work fine? Buses are too slow. On x86, ACPI-fast takes 700-1900 nsec on machines that I've tested (mostly pre-PCIe ones). HPET seems to be only slightly faster (maybe 500 nsec).

So apart from introducing code to constantly synchronize the TICK counters, using the timecounters on the host busses also seems to be the only viable solution for userland.
The latter should be doable but is long-winded as besides duplicating portions of the corresponding device drivers in userland, it probably also means to get some additional infrastructure like being able to memory map registers for devices on the nexus(4) level in place ...

There is little point in optimizations to avoid syscalls for hardware. On x86, a syscall takes 100-400 nsec extra, so if the hardware takes 500-2000 nsec then reducing the total time by 100-400 nsec is not very useful.

Understand. I do plan eventually to map HPET counters page into usermode on x86.

This should be left out too.

Also, as I noted above, some code to synchronize per-package counters would be useful for x86, so it might be developed with multi-arch usage in mind.

It's only worth synchronizing fast timecounter hardware so that it can be used in more cases. It probably needs to be non-bus based to be fast. That means the TSC on x86. The new timeout code to support tickless kernels looks like it will give large pessimizations unless the timecounter is fast. Instead of using the tick counter (1 atomic increment on every clock tick) and some getbinuptime() calls in places like select(), it uses the hardware timecounter via binuptime() in most places (since without a tick counter and without clock
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On 06/23/12 18:26, Bruce Evans wrote: On Sat, 23 Jun 2012, Konstantin Belousov wrote: On Sat, Jun 23, 2012 at 03:17:57PM +0200, Marius Strobl wrote: So apart from introducing code to constantly synchronize the TICK counters, using the timecounters on the host busses also seems to be the only viable solution for userland. The latter should be doable but is long-winded as besides duplicating portions of the corresponding device drivers in userland, it probably also means to get some additional infrastructure like being able to memory map registers for devices on the nexus(4) level in place ...

There is little point in optimizations to avoid syscalls for hardware. On x86, a syscall takes 100-400 nsec extra, so if the hardware takes 500-2000 nsec then reducing the total time by 100-400 nsec is not very useful.

Just out of curiosity I've run my own binuptime() micro-benchmarks:
- on Core i5-650: TSC 11ns, HPET 433ns, ACPI-fast 515ns, i8254 3736ns
- on dual-socket Xeon E5645: TSC 15ns, HPET 580ns, ACPI-fast 1118ns, i8254 3911ns

I think it could be useful to have that small benchmark in base kernel.

Understand. I do plan eventually to map HPET counters page into usermode on x86.

This should be left out too.

Also, as I noted above, some code to synchronize per-package counters would be useful for x86, so it might be developed with multi-arch usage in mind.

It's only worth synchronizing fast timecounter hardware so that it can be used in more cases. It probably needs to be non-bus based to be fast. That means the TSC on x86. The new timeout code to support tickless kernels looks like it will give large pessimizations unless the timecounter is fast.
Instead of using the tick counter (1 atomic increment on every clock tick) and some getbinuptime() calls in places like select(), it uses the hardware timecounter via binuptime() in most places (since without a tick counter and without clock interrupts updating the timehands periodically, it takes a hardware timecounter read to determine the time). So callout_reset() might start taking thousands of nsec per call, depending on how slow the timecounter is. The fix is probably to use a fuzzy time for long timeouts and to discourage use of short timeouts and/or to turn them into long or fuzzy timeouts so that they are not very useful.

The new timeout code is still in active development and optimization was not the first priority yet. My idea was to use the much faster getbinuptime() for periods above, let's say, 100ms. Legacy ticks-oriented callout_reset() functions are by default not supposed to provide sub-tick resolution, and with some assumptions could use getbinuptime(). For the new interfaces it depends on the caller how it gets the present time. I understand that an integer tick counter is as fast as anything can ever be. But sorry, a 32-bit counter doesn't fit the present goals. To have more, we need some artificial atomicity -- exactly what getbinuptime() implements. What I would like to see there is tc_tick removal, so that tc_windup() is called on every hardclock tick. With new tick-independent callout interfaces we probably won't need to increase HZ so high any more, while this simplification would make the precision of ticks and getbinuptime() equal, addressing some of your valid arguments against the latter.

-- Alexander Motin
svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
Author: kib
Date: Fri Jun 22 07:13:30 2012
New Revision: 237434
URL: http://svn.freebsd.org/changeset/base/237434

Log:
  Use struct vdso_timehands data to implement fast gettimeofday(2) and
  clock_gettime(2) functions if supported. The speedup seen in
  microbenchmarks is in range 4x-7x depending on the hardware.

  Only amd64 and i386 architectures are supported. Libc uses rdtsc and
  kernel data to calculate current time, if enabled by kernel.

  Hopefully, this code is going to migrate into vdso in some future.

  Discussed with:	bde
  Reviewed by:	jhb
  Tested by:	flo
  MFC after:	1 month

Added:
  head/lib/libc/amd64/sys/__vdso_gettc.c   (contents, props changed)
  head/lib/libc/i386/sys/__vdso_gettc.c   (contents, props changed)
  head/lib/libc/sys/__vdso_gettimeofday.c   (contents, props changed)
  head/lib/libc/sys/clock_gettime.c   (contents, props changed)
  head/lib/libc/sys/gettimeofday.c   (contents, props changed)
Modified:
  head/lib/libc/amd64/sys/Makefile.inc
  head/lib/libc/gen/aux.c
  head/lib/libc/i386/sys/Makefile.inc
  head/lib/libc/include/libc_private.h
  head/lib/libc/sys/Makefile.inc

Modified: head/lib/libc/amd64/sys/Makefile.inc
==============================================================================
--- head/lib/libc/amd64/sys/Makefile.inc	Fri Jun 22 07:06:40 2012	(r237433)
+++ head/lib/libc/amd64/sys/Makefile.inc	Fri Jun 22 07:13:30 2012	(r237434)
@@ -1,7 +1,8 @@
 #	from: Makefile.inc,v 1.1 1993/09/03 19:04:23 jtc Exp
 # $FreeBSD$
 
-SRCS+=	amd64_get_fsbase.c amd64_get_gsbase.c amd64_set_fsbase.c amd64_set_gsbase.c
+SRCS+=	amd64_get_fsbase.c amd64_get_gsbase.c amd64_set_fsbase.c \
+	amd64_set_gsbase.c __vdso_gettc.c
 
 MDASM=	vfork.S brk.S cerror.S exect.S getcontext.S pipe.S ptrace.S \
 	reboot.S sbrk.S setlogin.S sigreturn.S

Added: head/lib/libc/amd64/sys/__vdso_gettc.c
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/lib/libc/amd64/sys/__vdso_gettc.c	Fri Jun 22 07:13:30 2012	(r237434)
@@ -0,0 +1,49 @@
+/*-
+ * Copyright (c) 2012 Konstantin Belousov k...@freebsd.org
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/types.h>
+#include <sys/time.h>
+#include <sys/vdso.h>
+#include <machine/cpufunc.h>
+
+static u_int
+__vdso_gettc_low(const struct vdso_timehands *th)
+{
+	uint32_t rv;
+
+	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
+	    : "=a" (rv) : "c" (th->th_x86_shift) : "edx");
+	return (rv);
+}
+
+u_int
+__vdso_gettc(const struct vdso_timehands *th)
+{
+
+	return (th->th_x86_shift > 0 ? __vdso_gettc_low(th) : rdtsc32());
+}

Modified: head/lib/libc/gen/aux.c
==============================================================================
--- head/lib/libc/gen/aux.c	Fri Jun 22 07:06:40 2012	(r237433)
+++ head/lib/libc/gen/aux.c	Fri Jun 22 07:13:30 2012	(r237434)
@@ -66,6 +66,7 @@ __init_elf_aux_vector(void)
 static pthread_once_t aux_once = PTHREAD_ONCE_INIT;
 static int pagesize, osreldate, canary_len, ncpus, pagesizes_len;
 static char *canary, *pagesizes;
+static void *timekeep;
 
 static void
 init_aux(void)
@@ -101,6 +102,10 @@ init_aux(void)
 	case AT_NCPUS:
 		ncpus = aux->a_un.a_val;
 		break;
+
+	case AT_TIMEKEEP:
+		timekeep = aux->a_un.a_ptr;
+		break;
 	}
 }
@@ -163,6 +168,16 @@ _elf_aux_info(int aux, void *buf, int bu
 		} else
 			res = EINVAL;
 		break;
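For readers without x86 assembly, the committed __vdso_gettc()/__vdso_gettc_low() pair computes the raw 64-bit TSC shifted right by th_x86_shift, truncated to the timecounter's 32-bit width; the "rdtsc; shrd" form simply performs that shift directly on the edx:eax register pair without building a 64-bit temporary. A portable restatement of the arithmetic (a sketch, not the committed code):

```c
#include <stdint.h>

/*
 * Portable equivalent of what __vdso_gettc()/__vdso_gettc_low()
 * compute: the 64-bit TSC value shifted right by th_x86_shift,
 * truncated to the 32-bit timecounter width.  The shift is used on
 * hardware where the raw TSC runs too fast for the timecounter math.
 */
static uint32_t
gettc_from_tsc(uint64_t tsc, int shift)
{
	return ((uint32_t)(tsc >> shift));
}
```

With shift 0 this is exactly rdtsc32() (the low 32 bits of the TSC), which is why the committed code can skip the shrd path in that case.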
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though?

Marius
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Fri, Jun 22, 2012 at 09:34:56AM +0200, Marius Strobl wrote: On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though?

Generally speaking, tsc state among different CPU after boot is not synchronized, you are right. Kernel has somewhat doubtful test which verifies whether the after-boot state of tsc looks good. If the test fails, TSC is not enabled by default as timecounter, and then usermode follows kernel policy and falls back to slow syscall. So we err on the safe side. I tested this on Core i7 2xxx, where the test (usually) passes.

The test we currently have fails for me at least on single-package Nehalems, where the counter should be located on the uncore part. This indicates some brokenness in the code, but I did not investigate the cause. Code can be developed which adjusts the tsc MSRs to be in sync. Or, the rdtscp instruction can be used, which allows handling counter skew in usermode in a race-free manner.

While you are there, do you have comments about the sparc64 TICK counter? On SMP, the counter of the BSP is used by IPI. Is it unavoidable?
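The skew-table idea mentioned here can be sketched as follows. Everything in this fragment is hypothetical (FreeBSD has no such table): the scheme keeps a per-CPU offset measured against a reference CPU at boot, and rdtscp, which returns the CPU id atomically with the counter value, lets usermode pick the right offset without racing against migration.

```c
#include <stdint.h>

#define MAXCPU	64

/*
 * Hypothetical skew table: per-CPU TSC offset relative to CPU 0,
 * measured once at boot (or recalibrated periodically).  With rdtscp
 * the cpuid comes back atomically with the counter, so the thread can
 * be migrated between the read and the correction without using a
 * stale offset for the value it actually read.
 */
static int64_t tsc_skew[MAXCPU];

static uint64_t
tsc_corrected(uint64_t raw_tsc, unsigned cpuid)
{
	/* Unsigned wraparound makes negative skews work out correctly. */
	return (raw_tsc - (uint64_t)tsc_skew[cpuid]);
}
```

On real hardware raw_tsc and cpuid would come from a single rdtscp instruction; here they are plain parameters so the arithmetic can be shown portably.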
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On 22 Jun 2012, at 08:34, Marius Strobl wrote: I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though?

As long as the initial value is set on every context switch, it only matters that the TSC is monotonic and increments at an approximately constant rate. It is also possible to set the TSC value, but that's less useful in this context. The one thing to be careful about is the fact that certain power saving states will affect the speed at which the TSC increments, and so it is important to update the ticks-per-second value whenever a core goes into a low power state. This is more or less the same approach used by Xen, so most of the issues have been ironed out: Oracle complained to CPU vendors about a few corner cases and, because Oracle customers tend to buy a lot of expensive Xeon and Opteron chips, they were fixed quite promptly.

David
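Updating the ticks-per-second value that David describes amounts to refreshing a scale factor whenever the TSC frequency changes. A common way to apply it without a 64-bit divide on every read is 32.32 fixed point; this is a sketch of the arithmetic, not Xen's or FreeBSD's actual code, and it assumes deltas short enough (a few seconds at GHz rates) that the multiply does not overflow.

```c
#include <stdint.h>

/*
 * 32.32 fixed-point nanoseconds-per-TSC-tick.  Must be refreshed
 * whenever a frequency transition (P-state change on non-invariant
 * TSC hardware) alters the rate at which the counter increments.
 */
struct tsc_scale {
	uint64_t ns_per_tick_frac;	/* 2^32 * 1e9 / freq_hz */
};

static void
tsc_scale_update(struct tsc_scale *ts, uint64_t freq_hz)
{
	/* 1e9 << 32 fits in 64 bits (~4.3e18 < 2^64). */
	ts->ns_per_tick_frac = (1000000000ULL << 32) / freq_hz;
}

static uint64_t
tsc_delta_to_ns(const struct tsc_scale *ts, uint64_t delta)
{
	/* Valid for short deltas; delta * frac overflows past ~2^32 ticks. */
	return ((delta * ts->ns_per_tick_frac) >> 32);
}
```

If the scale is not refreshed on a frequency transition, every converted delta is wrong by the ratio of the old and new frequencies, which is exactly the failure mode being warned about.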
Re: svn commit: r237434 - in head/lib/libc: amd64/sys gen i386/sys include sys
On Friday, June 22, 2012 3:34:56 am Marius Strobl wrote: On Fri, Jun 22, 2012 at 07:13:31AM +, Konstantin Belousov wrote: Author: kib Date: Fri Jun 22 07:13:30 2012 New Revision: 237434 URL: http://svn.freebsd.org/changeset/base/237434 Log: Use struct vdso_timehands data to implement fast gettimeofday(2) and clock_gettime(2) functions if supported. The speedup seen in microbenchmarks is in range 4x-7x depending on the hardware. Only amd64 and i386 architectures are supported. Libc uses rdtsc and kernel data to calculate current time, if enabled by kernel. I don't know much about x86 CPUs but is my understanding correct that TSCs are not synchronized in any way across CPUs, i.e. reading it on different CPUs may result in time going backwards etc., which is okay for this application though?

Hmm, in practice I have found that on modern x86 CPUs (Penryn and later) the TSC is in fact synchronized across packages at work. At least, when I measure skew across packages it appears to be consistent with the time it would take a write to propagate from one to the other.

-- John Baldwin