Re: [GSoc] Timeconter Performance Improvements

2011-03-27 Thread Warner Losh

On Mar 26, 2011, at 8:43 AM, Jing Huang wrote:

 Hi,
 
 Thank you all sincerely. Following your guidance, I read the
 TSC specification in the Intel manual and learned the following about
 the TSC hardware:
 
 Processor families increment the time-stamp counter differently:
   • For Pentium M processors (family [06H], models [09H, 0DH]); for
     Pentium 4 processors, Intel Xeon processors (family [0FH], models
     [00H, 01H, or 02H]); and for P6 family processors: the time-stamp
     counter increments with every internal processor clock cycle.
 
   • For Pentium 4 processors, Intel Xeon processors (family [0FH], models
     [03H and higher]); for Intel Core Solo and Intel Core Duo processors
     (family [06H], model [0EH]); for the Intel Xeon processor 5100 series
     and Intel Core 2 Duo processors (family [06H], model [0FH]); for Intel
     Core 2 and Intel Xeon processors (family [06H], display_model [17H]);
     for Intel Atom processors (family [06H], display_model [1CH]): the
     time-stamp counter increments at a constant rate.
 
 Maybe we could implement gettimeofday as follows. First, use cpuid
 to find the family and model of the current CPU. If the CPU supports a
 constant TSC, we look up the shared page and calculate the precise
 time in user mode. If the platform lacks an invariant TSC, we just
 fall back to a syscall. So I think a single global shared page may be
 proper.

I think that the userspace portion should be more like:

int kernel_time_type SECTION(shared);
struct tsc_goo tsc_time_data SECTION(shared);

switch (kernel_time_type) {
case 1:
	/* code to use tsc_time_data to return time */
	break;
default:
	/* call the kernel */
	break;
}

I think we should avoid hard-coding lists of CPU families in userland.  The 
kernel init routines will decide, based on the CPU type and other stuff if this 
optimization can be done.  This would allow the kernel to update to support new 
CPU types without needing to churn libc.
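
As a rough illustration of that split (the kernel decides, userland only dispatches), here is a minimal sketch in C. The struct layout, the field names other than kernel_time_type, and fast_time_ns() are hypothetical and not an existing FreeBSD interface. The TSC value is passed in as an argument so that only the dispatch and the arithmetic are shown:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values the kernel could store in kernel_time_type. */
enum { TIME_TYPE_SYSCALL = 0, TIME_TYPE_TSC = 1 };

/* Hypothetical contents of the shared page, filled in by the kernel. */
struct shared_time_page {
	int      kernel_time_type; /* chosen by the kernel init routines */
	uint64_t tsc_base;         /* TSC value at the last kernel update */
	uint64_t ns_base;          /* time in ns that matches tsc_base */
	uint64_t tsc_hz;           /* TSC ticks per second */
};

/*
 * Returns 1 and stores the time in *out_ns when the fast path applies.
 * Returns 0 when the caller has to fall back to the syscall.
 */
int
fast_time_ns(const struct shared_time_page *sp, uint64_t tsc_now,
    uint64_t *out_ns)
{
	uint64_t delta;

	assert(sp != NULL && out_ns != NULL);
	switch (sp->kernel_time_type) {
	case TIME_TYPE_TSC:
		if (sp->tsc_hz == 0)
			return (0);
		delta = tsc_now - sp->tsc_base;
		/*
		 * Convert whole seconds and the remainder separately; the
		 * remainder multiply cannot overflow while tsc_hz stays
		 * below roughly 18 GHz.
		 */
		*out_ns = sp->ns_base +
		    (delta / sp->tsc_hz) * 1000000000ULL +
		    (delta % sp->tsc_hz) * 1000000000ULL / sp->tsc_hz;
		return (1);
	default:
		return (0);
	}
}
```

A libc gettimeofday() would read the TSC itself, call something like this, and issue the real syscall whenever it returns 0. Supporting a new CPU type then only requires the kernel to store a different kernel_time_type.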

Warner

P.S.  The SECTION(shared) notation above just means that the variables are in 
the shared page.

 
 

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: [GSoc] Timeconter Performance Improvements

2011-03-27 Thread Mark Tinguely

On 3/27/2011 5:32 PM, Warner Losh wrote:


I think that the userspace portion should be more like:

int kernel_time_type SECTION(shared);
struct tsc_goo tsc_time_data SECTION(shared);

switch (kernel_time_type) {
case 1:
	/* code to use tsc_time_data to return time */
	break;
default:
	/* call the kernel */
	break;
}

I think we should avoid hard-coding lists of CPU families in userland.  The 
kernel init routines will decide, based on the CPU type and other stuff if this 
optimization can be done.  This would allow the kernel to update to support new 
CPU types without needing to churn libc.

Warner

P.S.  The SECTION(shared) notation above just means that the variables are in 
the shared page.




If a user process can perform an rfork(2) or rfork_thread(3) with the RFMEM
option, then can't the same page table be active on multiple processors?
Mapping per-CPU page(s) to fixed user address(es) would only hold the
information of the CPU that was switched to last.


x86 architectures use a segment pointer to keep the kernel's per-CPU
information current.



--Mark Tinguely.




Re: [GSoc] Timeconter Performance Improvements

2011-03-27 Thread Julian Elischer

On 3/27/11 3:32 PM, Warner Losh wrote:


I think that the userspace portion should be more like:

int kernel_time_type SECTION(shared);
struct tsc_goo tsc_time_data SECTION(shared);

switch (kernel_time_type) {
case 1:
	/* code to use tsc_time_data to return time */
	break;
default:
	/* call the kernel */
	break;
}

I think we should avoid hard-coding lists of CPU families in userland.  The 
kernel init routines will decide, based on the CPU type and other stuff if this 
optimization can be done.  This would allow the kernel to update to support new 
CPU types without needing to churn libc.

Warner

P.S.  The SECTION(shared) notation above just means that the variables are in 
the shared page.


As has been mentioned here and there, the gold-standard way of doing
this is for the kernel to export a special memory region in ELF format
that can be linked to, with exported kernel-sanctioned code snippets
specially tailored for the CPU/OS/binary-format in question. There is
no real security risk to this, but the potential upsides are great.




Re: [GSoc] Timeconter Performance Improvements

2011-03-27 Thread Warner Losh

On Mar 27, 2011, at 10:29 PM, Julian Elischer wrote:

 On 3/27/11 3:32 PM, Warner Losh wrote:
 I think that the userspace portion should be more like:
 
 int kernel_time_type SECTION(shared);
 struct tsc_goo tsc_time_data SECTION(shared);
 
 switch (kernel_time_type) {
 case 1:
 	/* code to use tsc_time_data to return time */
 	break;
 default:
 	/* call the kernel */
 	break;
 }
 
 I think we should avoid hard-coding lists of CPU families in userland.  The 
 kernel init routines will decide, based on the CPU type and other stuff if 
 this optimization can be done.  This would allow the kernel to update to 
 support new CPU types without needing to churn libc.
 
 Warner
 
 P.S.  The SECTION(shared) notation above just means that the variables are 
 in the shared page.
 
 As has been mentioned here and there, the gold-standard way of doing
 this is for the kernel to export a special memory region in ELF format
 that can be linked to, with exported kernel-sanctioned code snippets
 specially tailored for the CPU/OS/binary-format in question. There is
 no real security risk to this, but the potential upsides are great.

You'll have to map multiple pages if you do this: one for the data that has to
be exported from the kernel and one for the executable code.  I don't think
this is necessarily the gold standard at all.  I think it is overkill that
we'll grow to regret.

With my method, you'll have the code 100% in userland, where it belongs.  If
you want to map CPU-type-specific code, add it to ld.so.

Warner

 

Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Peter Jeremy
On 2011-Mar-25 08:18:38 -0400, John Baldwin j...@freebsd.org wrote:
For modern Intel CPUs you can just assume that the TSCs are in sync across 
packages.  They also have invariant TSC's meaning that the frequency doesn't 
change.

Synchronised P-state invariant TSCs vastly simplify the problem but
not everyone has them.  Should the fallback be more complexity to
support per-CPU TSC counts and varying frequencies or a fallback to
reading the time via a syscall?

I believe we already have a shared page (it holds the signal trampoline now)
for at least the x86 platform (probably some others as well).

r217151 for amd64 and r217400 for ppc.  It doesn't appear to be
supported on other platforms.  My reading of the code is that there is
a single shared page used by all processes/CPUs.  In order to support
non-synchronised TSCs, this would need to be changed to per-CPU.

-- 
Peter Jeremy




Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Kostik Belousov
On Sat, Mar 26, 2011 at 11:16:46PM +1100, Peter Jeremy wrote:
 On 2011-Mar-25 08:18:38 -0400, John Baldwin j...@freebsd.org wrote:
 For modern Intel CPUs you can just assume that the TSCs are in sync across 
 packages.  They also have invariant TSC's meaning that the frequency doesn't 
 change.
 
 Synchronised P-state invariant TSCs vastly simplify the problem but
 not everyone has them.  Should the fallback be more complexity to
 support per-CPU TSC counts and varying frequencies or a fallback to
 reading the time via a syscall?
 
 I believe we already have a shared page (it holds the signal trampoline now)
 for at least the x86 platform (probably some others as well).
 
 r217151 for amd64 and r217400 for ppc.  It doesn't appear to be
 supported on other platforms.  My reading of the code is that there is
 a single shared page used by all processes/CPUs.  In order to support
 non-synchronised TSCs, this would need to be changed to per-CPU.
Not necessarily. If you have a reliable way to access the proper private
per-CPU page from the array, then you could use the same method
to access the array in the single page.

IMO, a per-CPU page in the process address space at the same address
for all pages is too costly. I think we can target modern hardware
for user-mode TSC; this is the kind of machine that is used for
benchmarks anyway.




Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread John Baldwin
On Saturday, March 26, 2011 08:16:46 am Peter Jeremy wrote:
 On 2011-Mar-25 08:18:38 -0400, John Baldwin j...@freebsd.org wrote:
 For modern Intel CPUs you can just assume that the TSCs are in sync across
 packages.  They also have invariant TSC's meaning that the frequency
 doesn't change.
 
 Synchronised P-state invariant TSCs vastly simplify the problem but
 not everyone has them.  Should the fallback be more complexity to
 support per-CPU TSC counts and varying frequencies or a fallback to
 reading the time via a syscall?

I think we should just fallback to a syscall in that case.  We will also need 
to do that if the TSC is not used as the timecounter (or always duplicate the 
ntp_adjtime() work we do for the current timecounter for the TSC timecounter).

Doing this easy case may give us the most bang for the buck, and it is also a 
good first milestone.  Once that is in place we can decide what the value is 
in extending it to support harder variations.

One thing we do need to think about is if the shared page should just export a
fixed set of global data, or if it should export routines.  The latter 
approach is more complex, but it makes the ABI boundary between userland and 
the kernel more friendly to future changes.  I believe Linux does the latter 
approach?

-- 
John Baldwin


Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Jing Huang
Hi,

Thank you all sincerely. Following your guidance, I read the
TSC specification in the Intel manual and learned the following about
the TSC hardware:

Processor families increment the time-stamp counter differently:
   • For Pentium M processors (family [06H], models [09H, 0DH]); for Pentium 4
processors, Intel Xeon processors (family [0FH], models [00H, 01H, or 02H]);
and for P6 family processors: the time-stamp counter increments with every
internal processor clock cycle.

   • For Pentium 4 processors, Intel Xeon processors (family [0FH], models
[03H and higher]); for Intel Core Solo and Intel Core Duo processors
(family [06H], model [0EH]); for the Intel Xeon processor 5100 series and
Intel Core 2 Duo processors (family [06H], model [0FH]); for Intel Core 2
and Intel Xeon processors (family [06H], display_model [17H]); for Intel
Atom processors (family [06H], display_model [1CH]): the time-stamp counter
increments at a constant rate.

Maybe we could implement gettimeofday as follows. First, use cpuid
to find the family and model of the current CPU. If the CPU supports a
constant TSC, we look up the shared page and calculate the precise
time in user mode. If the platform lacks an invariant TSC, we just
fall back to a syscall. So I think a single global shared page may be
proper.
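
To make the family/model check concrete, here is a small C sketch that decodes the CPUID leaf 1 signature (the EAX value) into the display family and display model used in the list above, and then applies that list. The function and struct names are made up for illustration, the table only covers the processors quoted above, and whether such a check belongs in userland or in the kernel is a separate question:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cpu_sig {
	unsigned family; /* display family */
	unsigned model;  /* display model */
};

/* Decode the EAX value returned by CPUID leaf 1. */
struct cpu_sig
decode_cpu_signature(uint32_t eax)
{
	unsigned base_family = (eax >> 8) & 0xf;
	unsigned base_model = (eax >> 4) & 0xf;
	unsigned ext_family = (eax >> 20) & 0xff;
	unsigned ext_model = (eax >> 16) & 0xf;
	struct cpu_sig s;

	/* The extended family only counts when the base family is 0FH. */
	s.family = (base_family == 0xf) ? base_family + ext_family :
	    base_family;
	/* The extended model only counts for base families 06H and 0FH. */
	s.model = (base_family == 0x6 || base_family == 0xf) ?
	    (ext_model << 4) | base_model : base_model;
	return (s);
}

/* 1 if the quoted list says the TSC ticks at a constant rate, else 0. */
int
tsc_is_constant_rate(const struct cpu_sig *s)
{
	assert(s != NULL);
	if (s->family == 0x0f)
		return (s->model >= 0x03);
	if (s->family == 0x06)
		return (s->model == 0x0e || s->model == 0x0f ||
		    s->model == 0x17 || s->model == 0x1c);
	return (0);
}
```

For example, a signature of 0x00010676 decodes to family 06H, display model 17H, which the list places in the constant-rate group.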




Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Kostik Belousov
On Sat, Mar 26, 2011 at 10:12:32AM -0400, John Baldwin wrote:
 On Saturday, March 26, 2011 08:16:46 am Peter Jeremy wrote:
  On 2011-Mar-25 08:18:38 -0400, John Baldwin j...@freebsd.org wrote:
  For modern Intel CPUs you can just assume that the TSCs are in sync across
  packages.  They also have invariant TSC's meaning that the frequency
  doesn't change.
  
  Synchronised P-state invariant TSCs vastly simplify the problem but
  not everyone has them.  Should the fallback be more complexity to
  support per-CPU TSC counts and varying frequencies or a fallback to
  reading the time via a syscall?
 
 I think we should just fallback to a syscall in that case.  We will also need 
 to do that if the TSC is not used as the timecounter (or always duplicate the 
 ntp_adjtime() work we do for the current timecounter for the TSC timecounter).
 
 Doing this easy case may give us the most bang for the buck, and it is also a 
 good first milestone.  Once that is in place we can decide what the value is 
 in extending it to support harder variations.
 
 One thing we do need to think about is if the shared page should just export a
 fixed set of global data, or if it should export routines.  The latter 
 approach is more complex, but it makes the ABI boundary between userland and 
 the kernel more friendly to future changes.  I believe Linux does the latter 
 approach?
Linux uses a so-called vdso, which is linked into the process.

I think that the effort to implement a vdso is approximately equal to the
effort required to implement timecounters in user mode. On the
other hand, with a vdso we could properly annotate signal trampolines
with unwind info, which is also a big win.




Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Warner Losh

On Mar 26, 2011, at 8:12 AM, John Baldwin wrote:

 On Saturday, March 26, 2011 08:16:46 am Peter Jeremy wrote:
 On 2011-Mar-25 08:18:38 -0400, John Baldwin j...@freebsd.org wrote:
 For modern Intel CPUs you can just assume that the TSCs are in sync across
 packages.  They also have invariant TSC's meaning that the frequency
 doesn't change.
 
 Synchronised P-state invariant TSCs vastly simplify the problem but
 not everyone has them.  Should the fallback be more complexity to
 support per-CPU TSC counts and varying frequencies or a fallback to
 reading the time via a syscall?
 
 I think we should just fallback to a syscall in that case.  We will also need 
 to do that if the TSC is not used as the timecounter (or always duplicate the 
 ntp_adjtime() work we do for the current timecounter for the TSC timecounter).

Logically, the code should look like:

	if (can_do_fast_time)
		do_the_fast_time
	else
		call the kernel

We can expand what can or can't do the fast time later once we get the basics 
working.

 Doing this easy case may give us the most bang for the buck, and it is also a 
 good first milestone.  Once that is in place we can decide what the value is 
 in extending it to support harder variations.

Agreed.

 One thing we do need to think about is if the shared page should just export a
 fixed set of global data, or if it should export routines.  The latter 
 approach is more complex, but it makes the ABI boundary between userland and 
 the kernel more friendly to future changes.  I believe Linux does the latter 
 approach?

There's nothing that says we can't couple this with loading a cpu-specific 
shared library, which would also insulate things.

Having a single page of both data and code strikes me as unwise.  Having one of 
each wouldn't be too bad.

Warner


Re: [GSoc] Timeconter Performance Improvements

2011-03-26 Thread Julian Elischer

On 3/25/11 1:24 AM, Peter Jeremy wrote:


This approach suffers from a race condition between the CPUID
instruction and accessing the appropriate shared page - there is the
potential for an interrupt causing the process to be switched to a
different CPU, resulting in an incorrect page being accessed.



The shared page(s) can be in the form of an ELF module that is linked
with the process at load time.

That way you can put CPU-specific code snippets there as well.
When using a shared page to modify the TSC value read, one also needs to
temporarily lock the CPU you are on between the time you read the
calibration value and the time you read the TSC. A user process has only
limited ability to do that.






Re: [GSoc] Timeconter Performance Improvements

2011-03-25 Thread Peter Jeremy
On 2011-Mar-24 17:00:02 +0800, Jing Huang jing.huang@gmail.com wrote:
 In this scenario, I plan to use both tsc and shared memory to
calculate precise time in user mode. The shared memory includes
system_time, tsc_system_time and factor_tsc-system_time.

This sounds like a reasonable approach to me.  Note that once we
implement a shared page, there is probably a variety of other
information we could usefully place on that page.

SunOS 4.x included a page of shared memory per CPU.  This was mapped
as an array (indexed by CPU number) at one address and the page
reflecting the current CPU was additionally mapped at another fixed
address.  This allowed a process to refer both to data on its own CPU
and to data on any other CPU in the system.

 We also consider the CPU frequency, because the TSC is
related to it. When the kernel changes the CPU frequency, the shared
memory should be updated subsequently.

Two issues with this, particularly on x86 without invariant TSC:
- looking up the current CPU frequency may not be a cheap operation
- the reported CPU frequency appears to be just an approximate value,
  rather than the actual TSC frequency.

On 2011-Mar-24 21:34:35 +0800, Jing Huang jing.huang@gmail.com wrote:
As far as I know, the TSC is CPU-specific. If the process is running on
a multi-core platform, we must consider the switching problem. One
way is to let the kernel take care of this: when switching to another
CPU, the kernel will reset the shared memory according to the new CPU.

I'm not sure what the cost of managing this page mapping will be.

The second way is to use the CPUID instruction to get the info of the
current CPU, which can be executed in user mode as well. At the same
time, the kernel maintains shared memory for each CPU. When
gettimeofday is invoked, the function will calculate the precise time
with the current CPU's shared memory.

This approach suffers from a race condition between the CPUID
instruction and accessing the appropriate shared page - there is the
potential for an interrupt causing the process to be switched to a
different CPU, resulting in an incorrect page being accessed.
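
One way to sidestep that race, sketched here under stated assumptions, is to obtain the counter value and the CPU number in a single step, so that both always come from the same CPU. On x86 the RDTSCP instruction returns the TSC together with the contents of IA32_TSC_AUX; the sketch assumes (hypothetically) that the kernel loads each CPU's index into that register and exports a per-CPU table. The struct and function names are invented for illustration, and the sample is passed in as plain values so the lookup can be followed without the instruction itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU calibration entry kept up to date by the kernel. */
struct percpu_tsc {
	uint64_t tsc_base; /* this CPU's TSC at the last update */
	uint64_t ns_base;  /* time in ns that matches tsc_base */
	uint64_t tsc_hz;   /* this CPU's TSC frequency, 0 if unusable */
};

/* Counter value and CPU index, assumed to be sampled atomically. */
struct tsc_sample {
	uint64_t tsc;
	uint32_t cpu;
};

/*
 * Returns 1 and stores the time in *out_ns, or 0 if the caller should
 * fall back to the syscall.  Because the CPU index travels with the
 * counter value, a migration after the sample cannot pair the counter
 * with another CPU's entry.
 */
int
percpu_time_ns(const struct percpu_tsc *tbl, size_t ncpu,
    struct tsc_sample s, uint64_t *out_ns)
{
	const struct percpu_tsc *e;
	uint64_t delta;

	assert(tbl != NULL && out_ns != NULL);
	if (s.cpu >= ncpu || tbl[s.cpu].tsc_hz == 0)
		return (0);
	e = &tbl[s.cpu];
	delta = s.tsc - e->tsc_base;
	*out_ns = e->ns_base +
	    (delta / e->tsc_hz) * 1000000000ULL +
	    (delta % e->tsc_hz) * 1000000000ULL / e->tsc_hz;
	return (1);
}
```

The kernel updating an entry while it is being read is a separate problem that this sketch does not handle.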

-- 
Peter Jeremy




Re: [GSoc] Timeconter Performance Improvements

2011-03-25 Thread John Baldwin
On Thursday, March 24, 2011 9:34:35 am Jing Huang wrote:
 Hi,
 
 Thanks for your reply. That was just my self-introduction :) I want
 to borrow the shared memory idea from KVM; I do not want to port the
 whole of KVM :)  But for this project, there are some basic problems.
 
 As far as I know, the TSC is CPU-specific. If the process is running on
 a multi-core platform, we must consider the switching problem. One
 way is to let the kernel take care of this: when switching to another
 CPU, the kernel will reset the shared memory according to the new CPU.
 The second way is to use the CPUID instruction to get the info of the
 current CPU, which can be executed in user mode as well. At the same
 time, the kernel maintains shared memory for each CPU. When
 gettimeofday is invoked, the function will calculate the precise time
 with the current CPU's shared memory.
 
 I don't know which is better. Are there other problems I need to deal with?

For modern Intel CPUs you can just assume that the TSCs are in sync across
packages.  They also have invariant TSCs, meaning that the frequency doesn't
change.  You can easily export a copy of the current 'timehands' structure
when the TSC is used as the timecounter and then just reimplement bintime() in
userland.  This assumes you use the TSC as the kernel's timecounter, but you
really need to do that so that ntp_adjtime() is taken into account, etc.

That will give a very fast and very cheap timecounter.
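
To give a feel for what reimplementing bintime() in userland might involve, here is a much simplified C sketch. The struct is a hypothetical cut-down stand-in for the kernel's timehands, not its real layout. It assumes a generation number that is 0 while an update is in progress, and a scale factor of 2^64 divided by the counter frequency, so that scale * delta is a binary fraction of a second. Memory barriers are omitted, and the counter value is passed in rather than read from the TSC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seconds plus a 64-bit binary fraction of a second. */
struct user_bintime {
	uint64_t sec;
	uint64_t frac;
};

/* Hypothetical snapshot exported by the kernel in the shared page. */
struct user_timehands {
	volatile uint32_t generation; /* 0 while the kernel is updating */
	uint64_t counter_mask;        /* valid bits of the counter */
	uint64_t offset_count;        /* counter value at the last update */
	uint64_t scale;               /* 2^64 / counter frequency */
	struct user_bintime offset;   /* time at offset_count */
};

/*
 * Returns 1 if *bt holds a consistent reading, 0 if the snapshot was
 * being updated and the caller should retry.
 */
int
user_bintime_read(const struct user_timehands *th, uint64_t counter_now,
    struct user_bintime *bt)
{
	uint32_t gen;
	uint64_t delta, add;

	assert(th != NULL && bt != NULL);
	gen = th->generation;
	if (gen == 0)
		return (0);
	delta = (counter_now - th->offset_count) & th->counter_mask;
	add = th->scale * delta; /* only valid while delta is under a second */
	*bt = th->offset;
	bt->frac += add;
	if (bt->frac < add)      /* the fraction wrapped: carry into seconds */
		bt->sec++;
	return (gen == th->generation);
}
```

A caller would loop until the function returns 1, and would fall back to the syscall whenever the TSC is not the kernel's timecounter.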

I believe we already have a shared page (it holds the signal trampoline now)
for at least the x86 platform (probably some others as well).

-- 
John Baldwin