Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Russell King
On Thu, Mar 17, 2005 at 02:44:57PM -0500, Albert Cahalan wrote:
> Does the ARM kernel provide a special page of code for
> apps to execute? If not, then ARM is irrelevant.

No.  However, I was responding to your suggestion that supporting
self modifying code in the kernel is trivial.

> Doesn't ARM always have an MMU? If you have an MMU, then
> it is no problem to have one single page of non-XIP code
> for this purpose.

No.  You also have a big misconception about how we map system memory.
We have 1MB mappings, and replacing 1MB of code/data (which would
equate to half a kernel) would completely negate the whole point of
XIP.

> Supposing that you do support the vsyscall hack and you don't
> have an MMU, you can just place the tiny code fragment on the
> stack (or anywhere else) when an exec is performed.
> 
> So, as far as I can see, ARM is fully capable of supporting this.



-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Albert Cahalan
On Thu, 2005-03-17 at 16:55 +, Russell King wrote:
> On Tue, Mar 15, 2005 at 10:23:54AM -0500, Albert Cahalan wrote:
> > On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
> > > On Mon, 14 Mar 2005, Albert Cahalan wrote:
> > > 
> > > > When the vsyscall page is created, copy the one needed function
> > > > into it. The kernel is already self-modifying in many places; this
> > > > is nothing new.
> > > 
> > > AFAIK this will only works on ia32 and x86_64 and not definitely not
> > > on ia64. Who knows about the other platforms 
> > 
> > I'll bet it does work fine on IA-64. If it didn't, you would
> > be unable to load the kernel or load an executable.
> > 
> > I know it works for PowerPC. You'll need an isync instruction
> > of course. You may also want a sync instruction and some code
> > to invalidate the cache.
> > 
> > Setting up the page content should be a 1-time operation done
> > at boot. Check your processor manuals as needed.
> 
> Won't work on ARM.  We have XIP kernels, which prevents the use of
> self-modifying code.

Does the ARM kernel provide a special page of code for
apps to execute? If not, then ARM is irrelevant.

Doesn't ARM always have an MMU? If you have an MMU, then
it is no problem to have one single page of non-XIP code
for this purpose.

Supposing that you do support the vsyscall hack and you don't
have an MMU, you can just place the tiny code fragment on the
stack (or anywhere else) when an exec is performed.

So, as far as I can see, ARM is fully capable of supporting this.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Russell King
On Tue, Mar 15, 2005 at 10:23:54AM -0500, Albert Cahalan wrote:
> On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
> > On Mon, 14 Mar 2005, Albert Cahalan wrote:
> > 
> > > When the vsyscall page is created, copy the one needed function
> > > into it. The kernel is already self-modifying in many places; this
> > > is nothing new.
> > 
> > AFAIK this will only works on ia32 and x86_64 and not definitely not
> > on ia64. Who knows about the other platforms 
> 
> I'll bet it does work fine on IA-64. If it didn't, you would
> be unable to load the kernel or load an executable.
> 
> I know it works for PowerPC. You'll need an isync instruction
> of course. You may also want a sync instruction and some code
> to invalidate the cache.
> 
> Setting up the page content should be a 1-time operation done
> at boot. Check your processor manuals as needed.

Won't work on ARM.  We have XIP kernels, which prevents the use of
self-modifying code.

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Ulrich Windl
On 15 Mar 2005 at 10:25, john stultz wrote:

> On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
> > Note that similarities exist between the posix clock and the time sources.
> > Will all time sources be exportable as posix clocks?
> 
> At this point I'm not familiar enough with the posix clocks interface to
> say, although its probably outside the scope of the initial timeofday
> rework.

I'd be happy to see the required POSIX clocks at nanosecond resolution for the 
initial version. Add-Ons may follow later.

> 
> Do you have a link that might explain the posix clocks spec and its
> intent?

There's a book named like "POSIX.4: Programming for the real world" by Bill 
Gallmeister (I think).

Regards,
Ulrich

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Ulrich Windl
On 15 Mar 2005 at 10:25, john stultz wrote:

 On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
  Note that similarities exist between the posix clock and the time sources.
  Will all time sources be exportable as posix clocks?
 
 At this point I'm not familiar enough with the posix clocks interface to
 say, although its probably outside the scope of the initial timeofday
 rework.

I'd be happy to see the required POSIX clocks at nanosecond resolution for the 
initial version. Add-Ons may follow later.

 
 Do you have a link that might explain the posix clocks spec and its
 intent?

There's a book named like POSIX.4: Programming for the real world by Bill 
Gallmeister (I think).

Regards,
Ulrich

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Russell King
On Tue, Mar 15, 2005 at 10:23:54AM -0500, Albert Cahalan wrote:
 On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
  On Mon, 14 Mar 2005, Albert Cahalan wrote:
  
   When the vsyscall page is created, copy the one needed function
   into it. The kernel is already self-modifying in many places; this
   is nothing new.
  
  AFAIK this will only works on ia32 and x86_64 and not definitely not
  on ia64. Who knows about the other platforms 
 
 I'll bet it does work fine on IA-64. If it didn't, you would
 be unable to load the kernel or load an executable.
 
 I know it works for PowerPC. You'll need an isync instruction
 of course. You may also want a sync instruction and some code
 to invalidate the cache.
 
 Setting up the page content should be a 1-time operation done
 at boot. Check your processor manuals as needed.

Won't work on ARM.  We have XIP kernels, which prevents the use of
self-modifying code.

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Albert Cahalan
On Thu, 2005-03-17 at 16:55 +, Russell King wrote:
 On Tue, Mar 15, 2005 at 10:23:54AM -0500, Albert Cahalan wrote:
  On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
   On Mon, 14 Mar 2005, Albert Cahalan wrote:
   
When the vsyscall page is created, copy the one needed function
into it. The kernel is already self-modifying in many places; this
is nothing new.
   
   AFAIK this will only works on ia32 and x86_64 and not definitely not
   on ia64. Who knows about the other platforms 
  
  I'll bet it does work fine on IA-64. If it didn't, you would
  be unable to load the kernel or load an executable.
  
  I know it works for PowerPC. You'll need an isync instruction
  of course. You may also want a sync instruction and some code
  to invalidate the cache.
  
  Setting up the page content should be a 1-time operation done
  at boot. Check your processor manuals as needed.
 
 Won't work on ARM.  We have XIP kernels, which prevents the use of
 self-modifying code.

Does the ARM kernel provide a special page of code for
apps to execute? If not, then ARM is irrelevant.

Doesn't ARM always have an MMU? If you have an MMU, then
it is no problem to have one single page of non-XIP code
for this purpose.

Supposing that you do support the vsyscall hack and you don't
have an MMU, you can just place the tiny code fragment on the
stack (or anywhere else) when an exec is performed.

So, as far as I can see, ARM is fully capable of supporting this.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-17 Thread Russell King
On Thu, Mar 17, 2005 at 02:44:57PM -0500, Albert Cahalan wrote:
 Does the ARM kernel provide a special page of code for
 apps to execute? If not, then ARM is irrelevant.

No.  However, I was responding to your suggestion that supporting
self modifying code in the kernel is trivial.

 Doesn't ARM always have an MMU? If you have an MMU, then
 it is no problem to have one single page of non-XIP code
 for this purpose.

No.  You also have a big misconception about how we map system memory.
We have 1MB mappings, and replacing 1MB of code/data (which would
equate to half a kernel) would completely negate the whole point of
XIP.

 Supposing that you do support the vsyscall hack and you don't
 have an MMU, you can just place the tiny code fragment on the
 stack (or anywhere else) when an exec is performed.
 
 So, as far as I can see, ARM is fully capable of supporting this.

cough

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread George Anzinger
john stultz wrote:
On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
Note that similarities exist between the posix clock and the time sources.
Will all time sources be exportable as posix clocks?

At this point I'm not familiar enough with the posix clocks interface to
say, although its probably outside the scope of the initial timeofday
rework.
I do think we need to consider the needs of that subsystem.  Clock wise, it 
makes a monotonic and a real time clock available to the user.  The real time 
clock is just a timespec version of the timeval gettimeofday clock.  At the 
current time, the monotonic clock is the real time clock plus wall_to_monotonic. 
 All that is rather simple and straight forward, an I don't recommend adding 
any other clocks unless there is a real need.

The interesting thing is that the posix timers are based on the posix clocks 
which are base on wall_clock, and the jiffies clock which is what runs the 
timers.  In order to make sense of timer requests it is neccessary to, 
atomically, grab all three clocks (i.e. wall_clock aka gettimeofday, 
wall_to_monotonic, and jiffies with the jiffies offset).  The code can then 
figure out when a timer needs to expire in jiffies time in order to expire at a 
given wall or monotonic time.  Currently the xtime_time sequence lock is used to 
do this.

Another issue that posix timers brings forward is the need to know when the 
clock is set.  This is needed to cause timers that were requested to expire at 
some absolute wall_time to do so even if time is set while they are running.  A 
word on how this is done is in order...

Since the processing of a clock set by the posix timers code may, in fact, allow 
the time to be set more than once before the affected timers are adjusted (or 
rather to avoid the locking rats nest not allowing this would cause), the 
wall_to_monotonic value is exploited.  In particular, a clock setting changes 
this value by the exact amount that time was adjusted.  So, each posix timer 
carries the value of wall_to_monotonic that was in use when the timer was 
started.  The clock_was_set code uses this to compute the clock movement and 
thus the adjustment needed to make the timer expire at the right time.

What this translates to in the new code is a) the need for a way to atomically 
get all the key times (wall, monotonic, jiffie) and b) access to a value that 
will allow it to compute the amount of time a clock set, or a series of clock 
settings, changed time by.  Of course, it also needs the clock_was_set() notify 
call.
Do you have a link that might explain the posix clocks spec and its
intent?
Well, there is my signature :)  Really, on the high-res-timers project site you 
want to download the support patch.  In there, among other things, is a set of 
man pages on posix clocks & timers.  The patch applies to any kernel and just 
adds a new set of directories off of Documentation.
--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread john stultz
On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
> Note that similarities exist between the posix clock and the time sources.
> Will all time sources be exportable as posix clocks?

At this point I'm not familiar enough with the posix clocks interface to
say, although its probably outside the scope of the initial timeofday
rework.

Do you have a link that might explain the posix clocks spec and its
intent?

thanks
-john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread Chris Friesen
Albert Cahalan wrote:
I know it works for PowerPC. You'll need an isync instruction
of course. You may also want a sync instruction and some code
to invalidate the cache.
For PPC you'll want to flush the dcache, then invalidate the icache. 
This will ensure that it works on all processors.

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread Albert Cahalan
On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
> On Mon, 14 Mar 2005, Albert Cahalan wrote:
> 
> > When the vsyscall page is created, copy the one needed function
> > into it. The kernel is already self-modifying in many places; this
> > is nothing new.
> 
> AFAIK this will only works on ia32 and x86_64 and not definitely not
> on ia64. Who knows about the other platforms 

I'll bet it does work fine on IA-64. If it didn't, you would
be unable to load the kernel or load an executable.

I know it works for PowerPC. You'll need an isync instruction
of course. You may also want a sync instruction and some code
to invalidate the cache.

Setting up the page content should be a 1-time operation done
at boot. Check your processor manuals as needed.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread Albert Cahalan
On Mon, 2005-03-14 at 19:22 -0800, Christoph Lameter wrote:
 On Mon, 14 Mar 2005, Albert Cahalan wrote:
 
  When the vsyscall page is created, copy the one needed function
  into it. The kernel is already self-modifying in many places; this
  is nothing new.
 
 AFAIK this will only works on ia32 and x86_64 and not definitely not
 on ia64. Who knows about the other platforms 

I'll bet it does work fine on IA-64. If it didn't, you would
be unable to load the kernel or load an executable.

I know it works for PowerPC. You'll need an isync instruction
of course. You may also want a sync instruction and some code
to invalidate the cache.

Setting up the page content should be a 1-time operation done
at boot. Check your processor manuals as needed.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread Chris Friesen
Albert Cahalan wrote:
I know it works for PowerPC. You'll need an isync instruction
of course. You may also want a sync instruction and some code
to invalidate the cache.
For PPC you'll want to flush the dcache, then invalidate the icache. 
This will ensure that it works on all processors.

Chris
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread john stultz
On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
 Note that similarities exist between the posix clock and the time sources.
 Will all time sources be exportable as posix clocks?

At this point I'm not familiar enough with the posix clocks interface to
say, although its probably outside the scope of the initial timeofday
rework.

Do you have a link that might explain the posix clocks spec and its
intent?

thanks
-john

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-15 Thread George Anzinger
john stultz wrote:
On Mon, 2005-03-14 at 21:37 -0800, Christoph Lameter wrote:
Note that similarities exist between the posix clock and the time sources.
Will all time sources be exportable as posix clocks?

At this point I'm not familiar enough with the posix clocks interface to
say, although its probably outside the scope of the initial timeofday
rework.
I do think we need to consider the needs of that subsystem.  Clock wise, it 
makes a monotonic and a real time clock available to the user.  The real time 
clock is just a timespec version of the timeval gettimeofday clock.  At the 
current time, the monotonic clock is the real time clock plus wall_to_monotonic. 
 All that is rather simple and straight forward, an I don't recommend adding 
any other clocks unless there is a real need.

The interesting thing is that the posix timers are based on the posix clocks 
which are base on wall_clock, and the jiffies clock which is what runs the 
timers.  In order to make sense of timer requests it is neccessary to, 
atomically, grab all three clocks (i.e. wall_clock aka gettimeofday, 
wall_to_monotonic, and jiffies with the jiffies offset).  The code can then 
figure out when a timer needs to expire in jiffies time in order to expire at a 
given wall or monotonic time.  Currently the xtime_time sequence lock is used to 
do this.

Another issue that posix timers brings forward is the need to know when the 
clock is set.  This is needed to cause timers that were requested to expire at 
some absolute wall_time to do so even if time is set while they are running.  A 
word on how this is done is in order...

Since the processing of a clock set by the posix timers code may, in fact, allow 
the time to be set more than once before the affected timers are adjusted (or 
rather to avoid the locking rats nest not allowing this would cause), the 
wall_to_monotonic value is exploited.  In particular, a clock setting changes 
this value by the exact amount that time was adjusted.  So, each posix timer 
carries the value of wall_to_monotonic that was in use when the timer was 
started.  The clock_was_set code uses this to compute the clock movement and 
thus the adjustment needed to make the timer expire at the right time.

What this translates to in the new code is a) the need for a way to atomically 
get all the key times (wall, monotonic, jiffie) and b) access to a value that 
will allow it to compute the amount of time a clock set, or a series of clock 
settings, changed time by.  Of course, it also needs the clock_was_set() notify 
call.
Do you have a link that might explain the posix clocks spec and its
intent?
Well, there is my signature :)  Really, on the high-res-timers project site you 
want to download the support patch.  In there, among other things, is a set of 
man pages on posix clocks  timers.  The patch applies to any kernel and just 
adds a new set of directories off of Documentation.
--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Fri, 11 Mar 2005, john stultz wrote:

> +/* cyc2ns():
> + *   Uses the timesource and ntp ajdustment interval to
> + *   convert cycle_ts to nanoseconds.
> + *   If rem is not null, it stores the remainder of the
> + *   calculation there.
> + *
> + */

This function is called in critical paths and it would be very important
to optimize it further.

> +static inline nsec_t cyc2ns(struct timesource_t* ts, int ntp_adj, cycle_t 
> cycles, cycle_t* rem)
> +{
> + u64 ret;
> + ret = (u64)cycles;
> + ret *= (ts->mult + ntp_adj);

This only changes when nt_adj changes. Maybe maintain the sum separately?

> + if (unlikely(rem)) {
> + /* XXX clean this up later!
> +  *  buf for now relax, we only calc
> +  *  remainders at interrupt time
> +  */
> + u64 remainder = ret & ((1 << ts->shift) -1);
> + do_div(remainder, ts->mult);
> + *rem = remainder;

IA64 does not do remainder processing (maybe I just do not understand
this...) but this seems to be not necessay if one uses 64 bit values that
are properly shifted?

> + }
> + ret >>= ts->shift;
> + return (nsec_t)ret;
> +}

The whole function could simply be:

#define cyc2ns(cycles, ts) (cycles*ts->current_factor) >> ts->shift
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
Note that similarities exist between the posix clock and the time sources.
Will all time sources be exportable as posix clocks?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, Albert Cahalan wrote:

> When the vsyscall page is created, copy the one needed function
> into it. The kernel is already self-modifying in many places; this
> is nothing new.

AFAIK this will only works on ia32 and x86_64 and not definitely not
on ia64. Who knows about the other platforms 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Albert Cahalan
On Mon, 2005-03-14 at 12:27 -0800, Matt Mackall wrote:
> On Mon, Mar 14, 2005 at 12:04:07PM -0800, john stultz wrote:
> > > > > > > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > > > > > > +{
> > > > > > > > +   switch (ts->type) {
> > > > > > > > +   case TIMESOURCE_MMIO_32:
> > > > > > > > +   return (cycle_t)readl(ts->mmio_ptr);
> > > > > > > > +   case TIMESOURCE_MMIO_64:
> > > > > > > > +   return (cycle_t)readq(ts->mmio_ptr);
> > > > > > > > +   case TIMESOURCE_CYCLES:
> > > > > > > > +   return (cycle_t)get_cycles();
> > > > > > > > +   default:/* case: TIMESOURCE_FUNCTION */
> > > > > > > > +   return ts->read_fnct();
> > > > > > > > +   }
> > > > > > > > +}
> > > Well where we'd read an MMIO address, we'd simply set read_fnct to
> > > generic_timesource_mmio32 or so. And that function just does the read.
> > > So both that function and read_timesource become one-liners and we
> > > drop the conditional branches in the switch.
> > 
> > However the vsyscall/fsyscall bits cannot call in-kernel functions (as
> > they execute in userspace or a sudo-userspace). As it stands now in my
> > design TIMESOURCE_FUNCTION timesources will not be usable for
> > vsyscall/fsyscall implementations, so I'm not sure if that's doable.
> > 
> > I'd be interested you've got a way around that.
> 
> We can either stick all the generic mmio timer functions in the
> vsyscall page (they're tiny) or leave the vsyscall using type/ptr but
> have the kernel internally use only the function pointer. Someone
> who's more familiar with the vsyscall timer code should chime in here.

When the vsyscall page is created, copy the one needed function
into it. The kernel is already self-modifying in many places; this
is nothing new.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, Matt Mackall wrote:
> We can either stick all the generic mmio timer functions in the
> vsyscall page (they're tiny) or leave the vsyscall using type/ptr but
> have the kernel internally use only the function pointer. Someone
> who's more familiar with the vsyscall timer code should chime in here.

No we cannot do any function calls in a fastcall path on ia64. The current
design is ok. Why duplicate the functionality with additional indirect
function calls? Plus an indirect function  calls stalls pipelines on some
processors and will limit the performance of gettimeofday.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 16:28 -0800, Christoph Lameter wrote:
> On Mon, 14 Mar 2005, john stultz wrote:
> 
> > Huh. So if I understand you properly, all timesources should have valid
> > read_fnct pointers that return the cycle value, however we'll still
> > preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
> > externally?
> >
> > Hmm. I'm a little cautious, as I really want to make the vsyscall
> > gettimeofday and regular do_gettimeofday be a similar as possible to
> > avoid some of the bugs we've seen between different gettimeofday
> > implementations. However I'm not completely against the idea.
> >
> > Christoph: Do you have any thoughts on this?
> 
> Sorry to be late to the party. It would be a weird implementation to have
> two ways to obtain time for each timesource. Also would be even more a
> headache to maintain than the existing fastcall vs. fullcall.

That's my feeling as well, unless a more convincing argument comes up.

thanks
-john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, john stultz wrote:

> Huh. So if I understand you properly, all timesources should have valid
> read_fnct pointers that return the cycle value, however we'll still
> preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
> externally?
>
> Hmm. I'm a little cautious, as I really want to make the vsyscall
> gettimeofday and regular do_gettimeofday be a similar as possible to
> avoid some of the bugs we've seen between different gettimeofday
> implementations. However I'm not completely against the idea.
>
> Christoph: Do you have any thoughts on this?

Sorry to be late to the party. It would be a weird implementation to have
two ways to obtain time for each timesource. Also would be even more a
headache to maintain than the existing fastcall vs. fullcall.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter


On Fri, 11 Mar 2005, john stultz wrote:

> +/* get_lowres_timestamp():
> + *   Returns a low res timestamp.
> + *   (ie: the value of system_time as  calculated at
> + *   the last invocation of timeofday_periodic_hook() )
> + */
> +nsec_t get_lowres_timestamp(void)
> +{
> + nsec_t ret;
> + unsigned long seq;
> + do {
> + seq = read_seqbegin(_time_lock);
> +
> + /* quickly grab system_time*/
> + ret = system_time;
> +
> + } while (read_seqretry(_time_lock, seq));
> +
> + return ret;
> +}

On 64 bit platforms this could simply be a macro accessing "system time".

> +/* do_gettimeofday():
> + *   Returns the time of day
> + */
> +void do_gettimeofday(struct timeval *tv)
> +{
> + nsec_t wall, sys;
> + unsigned long seq;
> +
> + /* atomically read wall and sys time */
> + do {
> + seq = read_seqbegin(_time_lock);
> +
> + wall = wall_time_offset;
> + sys = __monotonic_clock();
> +
> + } while (read_seqretry(_time_lock, seq));
> +
> + /* add them and convert to timeval */
> + *tv = ns2timeval(wall+sys);
> +}
> +EXPORT_SYMBOL(do_gettimeofday);

Good.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread George Anzinger
john stultz wrote:
On Sat, 2005-03-12 at 16:49 -0800, Matt Mackall wrote:
~


+   /* finally, update legacy time values */
+   write_seqlock_irqsave(_lock, x_flags);
+   xtime = ns2timespec(system_time + wall_time_offset);
+   wall_to_monotonic = ns2timespec(wall_time_offset);
+   wall_to_monotonic.tv_sec = -wall_to_monotonic.tv_sec;
+   wall_to_monotonic.tv_nsec = -wall_to_monotonic.tv_nsec;
+   /* XXX - should jiffies be updated here? */
Excellent question. 

Indeed.  Currently jiffies is used as both a interrupt counter and a
time unit, and I'm trying make it just the former. If I emulate it then
it stops functioning as a interrupt counter, and if I don't then I'll
probably break assumptions about jiffies being a time unit. So I'm not
sure which is the easiest path to go until all the users of jiffies are
audited for intent. 
Really?  Who counts interrupts???  The timer code treats jiffies as a unit of 
time.  You will need to rewrite that to make it otherwise.  But then you have 
another problem.  To correctly function, times need to expire on time (hay how 
bout that) not some time later.  To do this we need an interrupt source.  To 
this point in time, the jiffies interrupt has been the indication that one or 
more timer may have expired.  While we don't need to "count" the interrupts, we 
DO need them to expire the timers AND they need to be on time.

~
--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Matt Mackall
On Mon, Mar 14, 2005 at 12:04:07PM -0800, john stultz wrote:
> > > > > > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > > > > > +{
> > > > > > > + switch (ts->type) {
> > > > > > > + case TIMESOURCE_MMIO_32:
> > > > > > > + return (cycle_t)readl(ts->mmio_ptr);
> > > > > > > + case TIMESOURCE_MMIO_64:
> > > > > > > + return (cycle_t)readq(ts->mmio_ptr);
> > > > > > > + case TIMESOURCE_CYCLES:
> > > > > > > + return (cycle_t)get_cycles();
> > > > > > > + default:/* case: TIMESOURCE_FUNCTION */
> > > > > > > + return ts->read_fnct();
> > > > > > > + }
> > > > > > > +}
> > Well where we'd read an MMIO address, we'd simply set read_fnct to
> > generic_timesource_mmio32 or so. And that function just does the read.
> > So both that function and read_timesource become one-liners and we
> > drop the conditional branches in the switch.
> 
> However the vsyscall/fsyscall bits cannot call in-kernel functions (as
> they execute in userspace or a sudo-userspace). As it stands now in my
> design TIMESOURCE_FUNCTION timesources will not be usable for
> vsyscall/fsyscall implementations, so I'm not sure if that's doable.
> 
> I'd be interested you've got a way around that.

We can either stick all the generic mmio timer functions in the
vsyscall page (they're tiny) or leave the vsyscall using type/ptr but
have the kernel internally use only the function pointer. Someone
who's more familiar with the vsyscall timer code should chime in here.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 11:51 -0800, Matt Mackall wrote:
> On Mon, Mar 14, 2005 at 11:43:21AM -0800, john stultz wrote:
> > On Mon, 2005-03-14 at 11:29 -0800, Matt Mackall wrote:
> > > On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
> > > > 
> > > > > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > > > > +{
> > > > > > +   switch (ts->type) {
> > > > > > +   case TIMESOURCE_MMIO_32:
> > > > > > +   return (cycle_t)readl(ts->mmio_ptr);
> > > > > > +   case TIMESOURCE_MMIO_64:
> > > > > > +   return (cycle_t)readq(ts->mmio_ptr);
> > > > > > +   case TIMESOURCE_CYCLES:
> > > > > > +   return (cycle_t)get_cycles();
> > > > > > +   default:/* case: TIMESOURCE_FUNCTION */
> > > > > > +   return ts->read_fnct();
> > > > > > +   }
> > > > > > +}
> > > > > 
> > > > > Wouldn't it be better to change read_fnct to take a timesource * and
> > > > > then change all the other guys to generic_timesource_ helper
> > > > > functions? This does away with the switch and makes it trivial to add
> > > > > new generic sources. Change mmio_ptr to void *private.
> > > > 
> > > > Not sure if I totally understand this, but originally I just had a read
> > > > function, but to allow this framework to function w/ ia64 fsyscalls (and
> > > > likely other arches vsyscalls) we need to pass the raw mmio pointers.
> > > > Thus the timesource type and switch idea was taken from the time
> > > > interpolator code.
> > > 
> > > Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
> > > I think we'd rather always call read_fnct with generic helpers than hit 
> > > this
> > > switch every time.
> > 
> > Huh. So if I understand you properly, all timesources should have valid
> > read_fnct pointers that return the cycle value, however we'll still
> > preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
> > externally?
> 
> Well where we'd read an MMIO address, we'd simply set read_fnct to
> generic_timesource_mmio32 or so. And that function just does the read.
> So both that function and read_timesource become one-liners and we
> drop the conditional branches in the switch.

However the vsyscall/fsyscall bits cannot call in-kernel functions (as
they execute in userspace or a sudo-userspace). As it stands now in my
design TIMESOURCE_FUNCTION timesources will not be usable for
vsyscall/fsyscall implementations, so I'm not sure if that's doable.

I'd be interested you've got a way around that.

thanks
-john


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Matt Mackall
On Mon, Mar 14, 2005 at 11:43:21AM -0800, john stultz wrote:
> On Mon, 2005-03-14 at 11:29 -0800, Matt Mackall wrote:
> > On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
> > > 
> > > > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > > > +{
> > > > > + switch (ts->type) {
> > > > > + case TIMESOURCE_MMIO_32:
> > > > > + return (cycle_t)readl(ts->mmio_ptr);
> > > > > + case TIMESOURCE_MMIO_64:
> > > > > + return (cycle_t)readq(ts->mmio_ptr);
> > > > > + case TIMESOURCE_CYCLES:
> > > > > + return (cycle_t)get_cycles();
> > > > > + default:/* case: TIMESOURCE_FUNCTION */
> > > > > + return ts->read_fnct();
> > > > > + }
> > > > > +}
> > > > 
> > > > Wouldn't it be better to change read_fnct to take a timesource * and
> > > > then change all the other guys to generic_timesource_ helper
> > > > functions? This does away with the switch and makes it trivial to add
> > > > new generic sources. Change mmio_ptr to void *private.
> > > 
> > > Not sure if I totally understand this, but originally I just had a read
> > > function, but to allow this framework to function w/ ia64 fsyscalls (and
> > > likely other arches vsyscalls) we need to pass the raw mmio pointers.
> > > Thus the timesource type and switch idea was taken from the time
> > > interpolator code.
> > 
> > Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
> > I think we'd rather always call read_fnct with generic helpers than hit this
> > switch every time.
> 
> Huh. So if I understand you properly, all timesources should have valid
> read_fnct pointers that return the cycle value, however we'll still
> preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
> externally?

Well where we'd read an MMIO address, we'd simply set read_fnct to
generic_timesource_mmio32 or so. And that function just does the read.
So both that function and read_timesource become one-liners and we
drop the conditional branches in the switch.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 11:29 -0800, Matt Mackall wrote:
> On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
> > 
> > > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > > +{
> > > > +   switch (ts->type) {
> > > > +   case TIMESOURCE_MMIO_32:
> > > > +   return (cycle_t)readl(ts->mmio_ptr);
> > > > +   case TIMESOURCE_MMIO_64:
> > > > +   return (cycle_t)readq(ts->mmio_ptr);
> > > > +   case TIMESOURCE_CYCLES:
> > > > +   return (cycle_t)get_cycles();
> > > > +   default:/* case: TIMESOURCE_FUNCTION */
> > > > +   return ts->read_fnct();
> > > > +   }
> > > > +}
> > > 
> > > Wouldn't it be better to change read_fnct to take a timesource * and
> > > then change all the other guys to generic_timesource_ helper
> > > functions? This does away with the switch and makes it trivial to add
> > > new generic sources. Change mmio_ptr to void *private.
> > 
> > Not sure if I totally understand this, but originally I just had a read
> > function, but to allow this framework to function w/ ia64 fsyscalls (and
> > likely other arches vsyscalls) we need to pass the raw mmio pointers.
> > Thus the timesource type and switch idea was taken from the time
> > interpolator code.
> 
> Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
> I think we'd rather always call read_fnct with generic helpers than hit this
> switch every time.

Huh. So if I understand you properly, all timesources should have valid
read_fnct pointers that return the cycle value, however we'll still
preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
externally?

Hmm. I'm a little cautious, as I really want to make the vsyscall
gettimeofday and regular do_gettimeofday be a similar as possible to
avoid some of the bugs we've seen between different gettimeofday
implementations. However I'm not completely against the idea.

Christoph: Do you have any thoughts on this?


> > > > +   if (time_suspend_state != TIME_RUNNING) {
> > > > +   printk(KERN_INFO "timeofday_suspend_hook: ACK! called 
> > > > while we're suspended!");
> > > 
> > > Line length. Perhaps BUG_ON instead.
> > 
> > Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
> > though. 
> 
> Well there's a trade-off here. If it's something that should never
> happen and you only printk, you may never get a failure report
> (especially at KERN_INFO). It's good to be accomodating of external
> errors, but catching internal should-never-happen errors is important.

Fair enough. 


> > > Excellent question. 
> > 
> > Indeed.  Currently jiffies is used as both a interrupt counter and a
> > time unit, and I'm trying make it just the former. If I emulate it then
> > it stops functioning as a interrupt counter, and if I don't then I'll
> > probably break assumptions about jiffies being a time unit. So I'm not
> > sure which is the easiest path to go until all the users of jiffies are
> > audited for intent. 
> 
> Post this as a separate thread. There are various thoughts floating
> around on this already.

I'm a little busy with other things today, but I'll try to stir up a
discussion on this soon. 

thanks
-john


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Matt Mackall
On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
> 
> > > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > > +{
> > > + switch (ts->type) {
> > > + case TIMESOURCE_MMIO_32:
> > > + return (cycle_t)readl(ts->mmio_ptr);
> > > + case TIMESOURCE_MMIO_64:
> > > + return (cycle_t)readq(ts->mmio_ptr);
> > > + case TIMESOURCE_CYCLES:
> > > + return (cycle_t)get_cycles();
> > > + default:/* case: TIMESOURCE_FUNCTION */
> > > + return ts->read_fnct();
> > > + }
> > > +}
> > 
> > Wouldn't it be better to change read_fnct to take a timesource * and
> > then change all the other guys to generic_timesource_ helper
> > functions? This does away with the switch and makes it trivial to add
> > new generic sources. Change mmio_ptr to void *private.
> 
> Not sure if I totally understand this, but originally I just had a read
> function, but to allow this framework to function w/ ia64 fsyscalls (and
> likely other arches vsyscalls) we need to pass the raw mmio pointers.
> Thus the timesource type and switch idea was taken from the time
> interpolator code.

Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
I think we'd rather always call read_fnct with generic helpers than hit this
switch every time.

> > > + if (time_suspend_state != TIME_RUNNING) {
> > > + printk(KERN_INFO "timeofday_suspend_hook: ACK! called while 
> > > we're suspended!");
> > 
> > Line length. Perhaps BUG_ON instead.
> 
> Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
> though. 

Well there's a trade-off here. If it's something that should never
happen and you only printk, you may never get a failure report
(especially at KERN_INFO). It's good to be accomodating of external
errors, but catching internal should-never-happen errors is important.

> > Excellent question. 
> 
> Indeed.  Currently jiffies is used as both a interrupt counter and a
> time unit, and I'm trying make it just the former. If I emulate it then
> it stops functioning as a interrupt counter, and if I don't then I'll
> probably break assumptions about jiffies being a time unit. So I'm not
> sure which is the easiest path to go until all the users of jiffies are
> audited for intent. 

Post this as a separate thread. There are various thoughts floating
around on this already.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Sat, 2005-03-12 at 16:49 -0800, Matt Mackall wrote:
> On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
> > +struct timesource_t timesource_jiffies = {
> > +   .name = "jiffies",
> > +   .priority = 0, /* lowest priority*/
> > +   .type = TIMESOURCE_FUNCTION,
> > +   .read_fnct = jiffies_read,
> > +   .mask = (cycle_t)~0,
> 
> Not sure this is right. The type of 0 is 'int' and the ~ will happen
> before the cast to a potentially longer type.

Good point. I'll change it to Andreas' suggestion of (type)-1.


> > +   .mult = NSEC_PER_SEC/HZ,
> 
> Does rounding matter here? Alpha has HZ of 1024, so this comes out to
> 976562.5.

Actually, there are probably a number of places where I need to be
better with rounding. Its a good idea there as well.

> > +static inline cycle_t read_timesource(struct timesource_t* ts)
> > +{
> > +   switch (ts->type) {
> > +   case TIMESOURCE_MMIO_32:
> > +   return (cycle_t)readl(ts->mmio_ptr);
> > +   case TIMESOURCE_MMIO_64:
> > +   return (cycle_t)readq(ts->mmio_ptr);
> > +   case TIMESOURCE_CYCLES:
> > +   return (cycle_t)get_cycles();
> > +   default:/* case: TIMESOURCE_FUNCTION */
> > +   return ts->read_fnct();
> > +   }
> > +}
> 
> Wouldn't it be better to change read_fnct to take a timesource * and
> then change all the other guys to generic_timesource_ helper
> functions? This does away with the switch and makes it trivial to add
> new generic sources. Change mmio_ptr to void *private.

Not sure if I totally understand this, but originally I just had a read
function, but to allow this framework to function w/ ia64 fsyscalls (and
likely other arches vsyscalls) we need to pass the raw mmio pointers.
Thus the timesource type and switch idea was taken from the time
interpolator code.


> > @@ -467,6 +468,7 @@
> > pidhash_init();
> > init_timers();
> > softirq_init();
> > +   timeofday_init();
> > time_init();
> 
> Can we push time_init inside of timeofday_init?

Ideally, yea, but this way is cleaner until all the arches are converted
to the new timeofday code.

> > +/* Chapter 5: Kernel Variables [RFC 1589 pg. 28] */
> > +/* 5.1 Interface Variables */
> > +static int ntp_status  = STA_UNSYNC;   /* status */
> > +static long ntp_offset;/* usec */
> > +static long ntp_constant   = 2;/* ntp magic? */
> > +static long ntp_maxerror   = NTP_PHASE_LIMIT;  /* usec */
> > +static long ntp_esterror   = NTP_PHASE_LIMIT;  /* usec */
> > +static const long ntp_tolerance= MAXFREQ;  /* shifted ppm 
> > */
> > +static const long ntp_precision= 1;/* constant */
> > +
> > +/* 5.2 Phase-Lock Loop Variables */
> > +static long ntp_freq;  /* shifted ppm 
> > */
> > +static long ntp_reftime;   /* sec */
> 
> You present a nice argument for not using tabs except at the beginning
> of the line.

Yea, I should have caught that earlier. I have the tab length set to 4
in my editor. Sorry. 


> > +#define MILLION 100
> 
> Still a magic number despite being a define. Very meta. Unused.

Should have been yanked along with ntp_scale(). Good catch.

> > +/* int ntp_advance(nsec_t interval):
> > + * Periodic hook which increments NTP state machine by interval.
> > + *  Returns the signed PPM adjustment to be used for the next interval.
> > + * This is ntp_hardclock in the RFC.
> 
> Why is it not ntp_hardclock here?

I'm not sure if ntp_hardclock is a very good name. Since we're advancing
the state machine, ntp_advance() seems to be more clear to me. However
if the NTP folks care enough then I'll be fine with changing it.

> > +   /* decrement singleshot offset interval */
> > +   ss_offset_len =- interval;
> 
> Eh?

Gah! Great catch! I would have never seen that terrible typo!



> > +   /* bound the adjustment to MAXPHASE/MINSEC */
> > +   if (tmp > (MAXPHASE / MINSEC) << SHIFT_UPDATE)
> > +   tmp = (MAXPHASE / MINSEC) << SHIFT_UPDATE;
> > +   if (tmp < -(MAXPHASE / MINSEC) << SHIFT_UPDATE)
> > +   tmp = -(MAXPHASE / MINSEC) << SHIFT_UPDATE;
> 
> max, min?
> 
> > +   /* Make sure offset is bounded by MAXPHASE */
> > +   if (tmp > MAXPHASE)
> > +   tmp = MAXPHASE;
> > +   if (tmp < -MAXPHASE)
> > +   tmp = -MAXPHASE;
> 
> max, min.

Good idea. 


> > +int do_adjtimex(struct timex *tx)
> > +{
> > +   do_gettimeofday(>time); /* set timex->time*/
> 
> Oh. Move the cap check back here..

Will do.

> > +   if (time_suspend_state != TIME_RUNNING) {
> > +   printk(KERN_INFO "timeofday_suspend_hook: ACK! called while 
> > we're suspended!");
> 
> Line length. Perhaps BUG_ON instead.

Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
though. 


> > +   /* finally, update legacy time values */
> > +   write_seqlock_irqsave(_lock, x_flags);
> > +   xtime = 

Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Sat, 2005-03-12 at 16:49 -0800, Matt Mackall wrote:
 On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
  +struct timesource_t timesource_jiffies = {
  +   .name = jiffies,
  +   .priority = 0, /* lowest priority*/
  +   .type = TIMESOURCE_FUNCTION,
  +   .read_fnct = jiffies_read,
  +   .mask = (cycle_t)~0,
 
 Not sure this is right. The type of 0 is 'int' and the ~ will happen
 before the cast to a potentially longer type.

Good point. I'll change it to Andreas' suggestion of (type)-1.


  +   .mult = NSEC_PER_SEC/HZ,
 
 Does rounding matter here? Alpha has HZ of 1024, so this comes out to
 976562.5.

Actually, there are probably a number of places where I need to be
better with rounding. Its a good idea there as well.

  +static inline cycle_t read_timesource(struct timesource_t* ts)
  +{
  +   switch (ts-type) {
  +   case TIMESOURCE_MMIO_32:
  +   return (cycle_t)readl(ts-mmio_ptr);
  +   case TIMESOURCE_MMIO_64:
  +   return (cycle_t)readq(ts-mmio_ptr);
  +   case TIMESOURCE_CYCLES:
  +   return (cycle_t)get_cycles();
  +   default:/* case: TIMESOURCE_FUNCTION */
  +   return ts-read_fnct();
  +   }
  +}
 
 Wouldn't it be better to change read_fnct to take a timesource * and
 then change all the other guys to generic_timesource_foo helper
 functions? This does away with the switch and makes it trivial to add
 new generic sources. Change mmio_ptr to void *private.

Not sure if I totally understand this, but originally I just had a read
function, but to allow this framework to function w/ ia64 fsyscalls (and
likely other arches vsyscalls) we need to pass the raw mmio pointers.
Thus the timesource type and switch idea was taken from the time
interpolator code.


  @@ -467,6 +468,7 @@
  pidhash_init();
  init_timers();
  softirq_init();
  +   timeofday_init();
  time_init();
 
 Can we push time_init inside of timeofday_init?

Ideally, yea, but this way is cleaner until all the arches are converted
to the new timeofday code.

  +/* Chapter 5: Kernel Variables [RFC 1589 pg. 28] */
  +/* 5.1 Interface Variables */
  +static int ntp_status  = STA_UNSYNC;   /* status */
  +static long ntp_offset;/* usec */
  +static long ntp_constant   = 2;/* ntp magic? */
  +static long ntp_maxerror   = NTP_PHASE_LIMIT;  /* usec */
  +static long ntp_esterror   = NTP_PHASE_LIMIT;  /* usec */
  +static const long ntp_tolerance= MAXFREQ;  /* shifted ppm 
  */
  +static const long ntp_precision= 1;/* constant */
  +
  +/* 5.2 Phase-Lock Loop Variables */
  +static long ntp_freq;  /* shifted ppm 
  */
  +static long ntp_reftime;   /* sec */
 
 You present a nice argument for not using tabs except at the beginning
 of the line.

Yea, I should have caught that earlier. I have the tab length set to 4
in my editor. Sorry. 


  +#define MILLION 100
 
 Still a magic number despite being a define. Very meta. Unused.

Should have been yanked along with ntp_scale(). Good catch.

  +/* int ntp_advance(nsec_t interval):
  + * Periodic hook which increments NTP state machine by interval.
  + *  Returns the signed PPM adjustment to be used for the next interval.
  + * This is ntp_hardclock in the RFC.
 
 Why is it not ntp_hardclock here?

I'm not sure if ntp_hardclock is a very good name. Since we're advancing
the state machine, ntp_advance() seems to be more clear to me. However
if the NTP folks care enough then I'll be fine with changing it.

  +   /* decrement singleshot offset interval */
  +   ss_offset_len =- interval;
 
 Eh?

Gah! Great catch! I would have never seen that terrible typo!



  +   /* bound the adjustment to MAXPHASE/MINSEC */
  +   if (tmp  (MAXPHASE / MINSEC)  SHIFT_UPDATE)
  +   tmp = (MAXPHASE / MINSEC)  SHIFT_UPDATE;
  +   if (tmp  -(MAXPHASE / MINSEC)  SHIFT_UPDATE)
  +   tmp = -(MAXPHASE / MINSEC)  SHIFT_UPDATE;
 
 max, min?
 
  +   /* Make sure offset is bounded by MAXPHASE */
  +   if (tmp  MAXPHASE)
  +   tmp = MAXPHASE;
  +   if (tmp  -MAXPHASE)
  +   tmp = -MAXPHASE;
 
 max, min.

Good idea. 


  +int do_adjtimex(struct timex *tx)
  +{
  +   do_gettimeofday(tx-time); /* set timex-time*/
 
 Oh. Move the cap check back here..

Will do.

  +   if (time_suspend_state != TIME_RUNNING) {
  +   printk(KERN_INFO timeofday_suspend_hook: ACK! called while 
  we're suspended!);
 
 Line length. Perhaps BUG_ON instead.

Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
though. 


  +   /* finally, update legacy time values */
  +   write_seqlock_irqsave(xtime_lock, x_flags);
  +   xtime = ns2timespec(system_time + wall_time_offset);
  +   wall_to_monotonic = ns2timespec(wall_time_offset);
  +   wall_to_monotonic.tv_sec = -wall_to_monotonic.tv_sec;
  +   wall_to_monotonic.tv_nsec = 

Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Matt Mackall
On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
 
   +static inline cycle_t read_timesource(struct timesource_t* ts)
   +{
   + switch (ts-type) {
   + case TIMESOURCE_MMIO_32:
   + return (cycle_t)readl(ts-mmio_ptr);
   + case TIMESOURCE_MMIO_64:
   + return (cycle_t)readq(ts-mmio_ptr);
   + case TIMESOURCE_CYCLES:
   + return (cycle_t)get_cycles();
   + default:/* case: TIMESOURCE_FUNCTION */
   + return ts-read_fnct();
   + }
   +}
  
  Wouldn't it be better to change read_fnct to take a timesource * and
  then change all the other guys to generic_timesource_foo helper
  functions? This does away with the switch and makes it trivial to add
  new generic sources. Change mmio_ptr to void *private.
 
 Not sure if I totally understand this, but originally I just had a read
 function, but to allow this framework to function w/ ia64 fsyscalls (and
 likely other arches vsyscalls) we need to pass the raw mmio pointers.
 Thus the timesource type and switch idea was taken from the time
 interpolator code.

Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
I think we'd rather always call read_fnct with generic helpers than hit this
switch every time.

   + if (time_suspend_state != TIME_RUNNING) {
   + printk(KERN_INFO timeofday_suspend_hook: ACK! called while 
   we're suspended!);
  
  Line length. Perhaps BUG_ON instead.
 
 Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
 though. 

Well there's a trade-off here. If it's something that should never
happen and you only printk, you may never get a failure report
(especially at KERN_INFO). It's good to be accomodating of external
errors, but catching internal should-never-happen errors is important.

  Excellent question. 
 
 Indeed.  Currently jiffies is used as both a interrupt counter and a
 time unit, and I'm trying make it just the former. If I emulate it then
 it stops functioning as a interrupt counter, and if I don't then I'll
 probably break assumptions about jiffies being a time unit. So I'm not
 sure which is the easiest path to go until all the users of jiffies are
 audited for intent. 

Post this as a separate thread. There are various thoughts floating
around on this already.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 11:29 -0800, Matt Mackall wrote:
 On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:
  
+static inline cycle_t read_timesource(struct timesource_t* ts)
+{
+   switch (ts-type) {
+   case TIMESOURCE_MMIO_32:
+   return (cycle_t)readl(ts-mmio_ptr);
+   case TIMESOURCE_MMIO_64:
+   return (cycle_t)readq(ts-mmio_ptr);
+   case TIMESOURCE_CYCLES:
+   return (cycle_t)get_cycles();
+   default:/* case: TIMESOURCE_FUNCTION */
+   return ts-read_fnct();
+   }
+}
   
   Wouldn't it be better to change read_fnct to take a timesource * and
   then change all the other guys to generic_timesource_foo helper
   functions? This does away with the switch and makes it trivial to add
   new generic sources. Change mmio_ptr to void *private.
  
  Not sure if I totally understand this, but originally I just had a read
  function, but to allow this framework to function w/ ia64 fsyscalls (and
  likely other arches vsyscalls) we need to pass the raw mmio pointers.
  Thus the timesource type and switch idea was taken from the time
  interpolator code.
 
 Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
 I think we'd rather always call read_fnct with generic helpers than hit this
 switch every time.

Huh. So if I understand you properly, all timesources should have valid
read_fnct pointers that return the cycle value, however we'll still
preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
externally?

Hmm. I'm a little cautious, as I really want to make the vsyscall
gettimeofday and regular do_gettimeofday be a similar as possible to
avoid some of the bugs we've seen between different gettimeofday
implementations. However I'm not completely against the idea.

Christoph: Do you have any thoughts on this?


+   if (time_suspend_state != TIME_RUNNING) {
+   printk(KERN_INFO timeofday_suspend_hook: ACK! called 
while we're suspended!);
   
   Line length. Perhaps BUG_ON instead.
  
  Eh, its not fatal to BUG_ON seems a bit harsh. I'll fix the line length
  though. 
 
 Well there's a trade-off here. If it's something that should never
 happen and you only printk, you may never get a failure report
 (especially at KERN_INFO). It's good to be accomodating of external
 errors, but catching internal should-never-happen errors is important.

Fair enough. 


   Excellent question. 
  
  Indeed.  Currently jiffies is used as both a interrupt counter and a
  time unit, and I'm trying make it just the former. If I emulate it then
  it stops functioning as a interrupt counter, and if I don't then I'll
  probably break assumptions about jiffies being a time unit. So I'm not
  sure which is the easiest path to go until all the users of jiffies are
  audited for intent. 
 
 Post this as a separate thread. There are various thoughts floating
 around on this already.

I'm a little busy with other things today, but I'll try to stir up a
discussion on this soon. 

thanks
-john


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 11:51 -0800, Matt Mackall wrote:
 On Mon, Mar 14, 2005 at 11:43:21AM -0800, john stultz wrote:
  On Mon, 2005-03-14 at 11:29 -0800, Matt Mackall wrote:
   On Mon, Mar 14, 2005 at 10:42:45AM -0800, john stultz wrote:

  +static inline cycle_t read_timesource(struct timesource_t* ts)
  +{
  +   switch (ts-type) {
  +   case TIMESOURCE_MMIO_32:
  +   return (cycle_t)readl(ts-mmio_ptr);
  +   case TIMESOURCE_MMIO_64:
  +   return (cycle_t)readq(ts-mmio_ptr);
  +   case TIMESOURCE_CYCLES:
  +   return (cycle_t)get_cycles();
  +   default:/* case: TIMESOURCE_FUNCTION */
  +   return ts-read_fnct();
  +   }
  +}
 
 Wouldn't it be better to change read_fnct to take a timesource * and
 then change all the other guys to generic_timesource_foo helper
 functions? This does away with the switch and makes it trivial to add
 new generic sources. Change mmio_ptr to void *private.

Not sure if I totally understand this, but originally I just had a read
function, but to allow this framework to function w/ ia64 fsyscalls (and
likely other arches vsyscalls) we need to pass the raw mmio pointers.
Thus the timesource type and switch idea was taken from the time
interpolator code.
   
   Well for vsyscall, we can leave the mmio_ptr and type. But in-kernel,
   I think we'd rather always call read_fnct with generic helpers than hit 
   this
   switch every time.
  
  Huh. So if I understand you properly, all timesources should have valid
  read_fnct pointers that return the cycle value, however we'll still
  preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
  externally?
 
 Well where we'd read an MMIO address, we'd simply set read_fnct to
 generic_timesource_mmio32 or so. And that function just does the read.
 So both that function and read_timesource become one-liners and we
 drop the conditional branches in the switch.

However the vsyscall/fsyscall bits cannot call in-kernel functions (as
they execute in userspace or a sudo-userspace). As it stands now in my
design TIMESOURCE_FUNCTION timesources will not be usable for
vsyscall/fsyscall implementations, so I'm not sure if that's doable.

I'd be interested you've got a way around that.

thanks
-john


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread George Anzinger
john stultz wrote:
On Sat, 2005-03-12 at 16:49 -0800, Matt Mackall wrote:
~


+   /* finally, update legacy time values */
+   write_seqlock_irqsave(xtime_lock, x_flags);
+   xtime = ns2timespec(system_time + wall_time_offset);
+   wall_to_monotonic = ns2timespec(wall_time_offset);
+   wall_to_monotonic.tv_sec = -wall_to_monotonic.tv_sec;
+   wall_to_monotonic.tv_nsec = -wall_to_monotonic.tv_nsec;
+   /* XXX - should jiffies be updated here? */
Excellent question. 

Indeed.  Currently jiffies is used as both a interrupt counter and a
time unit, and I'm trying make it just the former. If I emulate it then
it stops functioning as a interrupt counter, and if I don't then I'll
probably break assumptions about jiffies being a time unit. So I'm not
sure which is the easiest path to go until all the users of jiffies are
audited for intent. 
Really?  Who counts interrupts???  The timer code treats jiffies as a unit of 
time.  You will need to rewrite that to make it otherwise.  But then you have 
another problem.  To correctly function, times need to expire on time (hay how 
bout that) not some time later.  To do this we need an interrupt source.  To 
this point in time, the jiffies interrupt has been the indication that one or 
more timer may have expired.  While we don't need to count the interrupts, we 
DO need them to expire the timers AND they need to be on time.

~
--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter


On Fri, 11 Mar 2005, john stultz wrote:

 +/* get_lowres_timestamp():
 + *   Returns a low res timestamp.
 + *   (ie: the value of system_time as  calculated at
 + *   the last invocation of timeofday_periodic_hook() )
 + */
 +nsec_t get_lowres_timestamp(void)
 +{
 + nsec_t ret;
 + unsigned long seq;
 + do {
 + seq = read_seqbegin(system_time_lock);
 +
 + /* quickly grab system_time*/
 + ret = system_time;
 +
 + } while (read_seqretry(system_time_lock, seq));
 +
 + return ret;
 +}

On 64 bit platforms this could simply be a macro accessing system time.

 +/* do_gettimeofday():
 + *   Returns the time of day
 + */
 +void do_gettimeofday(struct timeval *tv)
 +{
 + nsec_t wall, sys;
 + unsigned long seq;
 +
 + /* atomically read wall and sys time */
 + do {
 + seq = read_seqbegin(system_time_lock);
 +
 + wall = wall_time_offset;
 + sys = __monotonic_clock();
 +
 + } while (read_seqretry(system_time_lock, seq));
 +
 + /* add them and convert to timeval */
 + *tv = ns2timeval(wall+sys);
 +}
 +EXPORT_SYMBOL(do_gettimeofday);

Good.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, john stultz wrote:

 Huh. So if I understand you properly, all timesources should have valid
 read_fnct pointers that return the cycle value, however we'll still
 preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
 externally?

 Hmm. I'm a little cautious, as I really want to make the vsyscall
 gettimeofday and regular do_gettimeofday be a similar as possible to
 avoid some of the bugs we've seen between different gettimeofday
 implementations. However I'm not completely against the idea.

 Christoph: Do you have any thoughts on this?

Sorry to be late to the party. It would be a weird implementation to have
two ways to obtain time for each timesource. Also would be even more a
headache to maintain than the existing fastcall vs. fullcall.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread john stultz
On Mon, 2005-03-14 at 16:28 -0800, Christoph Lameter wrote:
 On Mon, 14 Mar 2005, john stultz wrote:
 
  Huh. So if I understand you properly, all timesources should have valid
  read_fnct pointers that return the cycle value, however we'll still
  preserve the type and mmio_ptr so fsyscall/vsyscall bits can use them
  externally?
 
  Hmm. I'm a little cautious, as I really want to make the vsyscall
  gettimeofday and regular do_gettimeofday be a similar as possible to
  avoid some of the bugs we've seen between different gettimeofday
  implementations. However I'm not completely against the idea.
 
  Christoph: Do you have any thoughts on this?
 
 Sorry to be late to the party. It would be a weird implementation to have
 two ways to obtain time for each timesource. Also would be even more a
 headache to maintain than the existing fastcall vs. fullcall.

That's my feeling as well, unless a more convincing argument comes up.

thanks
-john

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, Matt Mackall wrote:
 We can either stick all the generic mmio timer functions in the
 vsyscall page (they're tiny) or leave the vsyscall using type/ptr but
 have the kernel internally use only the function pointer. Someone
 who's more familiar with the vsyscall timer code should chime in here.

No we cannot do any function calls in a fastcall path on ia64. The current
design is ok. Why duplicate the functionality with additional indirect
function calls? Plus an indirect function  calls stalls pipelines on some
processors and will limit the performance of gettimeofday.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Albert Cahalan
On Mon, 2005-03-14 at 12:27 -0800, Matt Mackall wrote:
 On Mon, Mar 14, 2005 at 12:04:07PM -0800, john stultz wrote:
+static inline cycle_t read_timesource(struct timesource_t* ts)
+{
+   switch (ts-type) {
+   case TIMESOURCE_MMIO_32:
+   return (cycle_t)readl(ts-mmio_ptr);
+   case TIMESOURCE_MMIO_64:
+   return (cycle_t)readq(ts-mmio_ptr);
+   case TIMESOURCE_CYCLES:
+   return (cycle_t)get_cycles();
+   default:/* case: TIMESOURCE_FUNCTION */
+   return ts-read_fnct();
+   }
+}
   Well where we'd read an MMIO address, we'd simply set read_fnct to
   generic_timesource_mmio32 or so. And that function just does the read.
   So both that function and read_timesource become one-liners and we
   drop the conditional branches in the switch.
  
  However the vsyscall/fsyscall bits cannot call in-kernel functions (as
  they execute in userspace or a sudo-userspace). As it stands now in my
  design TIMESOURCE_FUNCTION timesources will not be usable for
  vsyscall/fsyscall implementations, so I'm not sure if that's doable.
  
  I'd be interested you've got a way around that.
 
 We can either stick all the generic mmio timer functions in the
 vsyscall page (they're tiny) or leave the vsyscall using type/ptr but
 have the kernel internally use only the function pointer. Someone
 who's more familiar with the vsyscall timer code should chime in here.

When the vsyscall page is created, copy the one needed function
into it. The kernel is already self-modifying in many places; this
is nothing new.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Mon, 14 Mar 2005, Albert Cahalan wrote:

 When the vsyscall page is created, copy the one needed function
 into it. The kernel is already self-modifying in many places; this
 is nothing new.

AFAIK this will only works on ia32 and x86_64 and not definitely not
on ia64. Who knows about the other platforms 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
Note that similarities exist between the posix clock and the time sources.
Will all time sources be exportable as posix clocks?


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-14 Thread Christoph Lameter
On Fri, 11 Mar 2005, john stultz wrote:

 +/* cyc2ns():
 + *   Uses the timesource and ntp ajdustment interval to
 + *   convert cycle_ts to nanoseconds.
 + *   If rem is not null, it stores the remainder of the
 + *   calculation there.
 + *
 + */

This function is called in critical paths and it would be very important
to optimize it further.

 +static inline nsec_t cyc2ns(struct timesource_t* ts, int ntp_adj, cycle_t 
 cycles, cycle_t* rem)
 +{
 + u64 ret;
 + ret = (u64)cycles;
 + ret *= (ts-mult + ntp_adj);

This only changes when nt_adj changes. Maybe maintain the sum separately?

 + if (unlikely(rem)) {
 + /* XXX clean this up later!
 +  *  buf for now relax, we only calc
 +  *  remainders at interrupt time
 +  */
 + u64 remainder = ret  ((1  ts-shift) -1);
 + do_div(remainder, ts-mult);
 + *rem = remainder;

IA64 does not do remainder processing (maybe I just do not understand
this...) but this seems to be not necessay if one uses 64 bit values that
are properly shifted?

 + }
 + ret = ts-shift;
 + return (nsec_t)ret;
 +}

The whole function could simply be:

#define cyc2ns(cycles, ts) (cycles*ts-current_factor)  ts-shift
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-12 Thread Andreas Schwab
Matt Mackall <[EMAIL PROTECTED]> writes:

> On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
>> +struct timesource_t timesource_jiffies = {
>> +.name = "jiffies",
>> +.priority = 0, /* lowest priority*/
>> +.type = TIMESOURCE_FUNCTION,
>> +.read_fnct = jiffies_read,
>> +.mask = (cycle_t)~0,
>
> Not sure this is right. The type of 0 is 'int' and the ~ will happen
> before the cast to a potentially longer type.

If you want an all-one value for any unsigned type then (type)-1 is the
most reliable way.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-12 Thread Matt Mackall
On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
> +struct timesource_t timesource_jiffies = {
> + .name = "jiffies",
> + .priority = 0, /* lowest priority*/
> + .type = TIMESOURCE_FUNCTION,
> + .read_fnct = jiffies_read,
> + .mask = (cycle_t)~0,

Not sure this is right. The type of 0 is 'int' and the ~ will happen
before the cast to a potentially longer type.

> + .mult = NSEC_PER_SEC/HZ,

Does rounding matter here? Alpha has HZ of 1024, so this comes out to
976562.5.

> +struct timesource_t {
> + char* name;
> + int priority;
> + enum {
> + TIMESOURCE_FUNCTION,
> + TIMESOURCE_CYCLES,
> + TIMESOURCE_MMIO_32,
> + TIMESOURCE_MMIO_64
> + } type;
> + cycle_t (*read_fnct)(void);
> + void __iomem* mmio_ptr;

Convention is * goes next to the variable name rather than the type.

> +/* XXX - this should go somewhere better! */
> +#ifndef readq
> +static inline unsigned long long readq(void __iomem *addr)

Somewhere in asm-generic..

> +static inline cycle_t read_timesource(struct timesource_t* ts)
> +{
> + switch (ts->type) {
> + case TIMESOURCE_MMIO_32:
> + return (cycle_t)readl(ts->mmio_ptr);
> + case TIMESOURCE_MMIO_64:
> + return (cycle_t)readq(ts->mmio_ptr);
> + case TIMESOURCE_CYCLES:
> + return (cycle_t)get_cycles();
> + default:/* case: TIMESOURCE_FUNCTION */
> + return ts->read_fnct();
> + }
> +}

Wouldn't it be better to change read_fnct to take a timesource * and
then change all the other guys to generic_timesource_ helper
functions? This does away with the switch and makes it trivial to add
new generic sources. Change mmio_ptr to void *private.

> @@ -467,6 +468,7 @@
>   pidhash_init();
>   init_timers();
>   softirq_init();
> + timeofday_init();
>   time_init();

Can we push time_init inside of timeofday_init?

> +/* Chapter 5: Kernel Variables [RFC 1589 pg. 28] */
> +/* 5.1 Interface Variables */
> +static int ntp_status= STA_UNSYNC;   /* status */
> +static long ntp_offset;  /* usec */
> +static long ntp_constant = 2;/* ntp magic? */
> +static long ntp_maxerror = NTP_PHASE_LIMIT;  /* usec */
> +static long ntp_esterror = NTP_PHASE_LIMIT;  /* usec */
> +static const long ntp_tolerance  = MAXFREQ;  /* shifted ppm 
> */
> +static const long ntp_precision  = 1;/* constant */
> +
> +/* 5.2 Phase-Lock Loop Variables */
> +static long ntp_freq;/* shifted ppm 
> */
> +static long ntp_reftime; /* sec */

You present a nice argument for not using tabs except at the beginning
of the line.

> +#define MILLION 100

Still a magic number despite being a define. Very meta. Unused.

> +/* int ntp_advance(nsec_t interval):
> + *   Periodic hook which increments NTP state machine by interval.
> + *  Returns the signed PPM adjustment to be used for the next interval.
> + *   This is ntp_hardclock in the RFC.

Why is it not ntp_hardclock here?

> + */
> +int ntp_advance(nsec_t interval)
> +{
> + static u64 interval_sum=0;

Spaces.

> + /* decrement singleshot offset interval */
> + ss_offset_len =- interval;

Eh?

> + /* bound the adjustment to MAXPHASE/MINSEC */
> + if (tmp > (MAXPHASE / MINSEC) << SHIFT_UPDATE)
> + tmp = (MAXPHASE / MINSEC) << SHIFT_UPDATE;
> + if (tmp < -(MAXPHASE / MINSEC) << SHIFT_UPDATE)
> + tmp = -(MAXPHASE / MINSEC) << SHIFT_UPDATE;

max, min?

> + /* Make sure offset is bounded by MAXPHASE */
> + if (tmp > MAXPHASE)
> + tmp = MAXPHASE;
> + if (tmp < -MAXPHASE)
> + tmp = -MAXPHASE;

max, min.

> + if ((ntp_status & STA_FLL) && (interval >= MINSEC)) {
> + long damping;
> + tmp = (offset / interval); /* ppm (usec/sec)*/

(unnecessary parens)

> +/* int ntp_adjtimex(struct timex* tx)
> + *   Interface to change NTP state machine
> + */
> +int ntp_adjtimex(struct timex* tx)
> +{
> + long save_offset;
> + int result;
> + unsigned long flags;
> +
> +/*=[Sanity checking]===*/
> + /* Check capabilities if we're trying to modify something */
> + if (tx->modes && !capable(CAP_SYS_TIME))
> + return -EPERM;

This is already done in do_adjtimex.

> + /* clear everything */
> + ntp_status |= STA_UNSYNC;
> + ntp_maxerror = NTP_PHASE_LIMIT;
> + ntp_esterror = NTP_PHASE_LIMIT;
> + ss_offset_len=0;
> + singleshot_adj=0;
> + tick_adj=0;
> + offset_adj =0;

Spacing.

> +/*[Nanosecond based variables]

This comment style is weird. Kill the trailing dashes at least.

> +static enum { TIME_RUNNING, TIME_SUSPENDED } time_suspend_state = 
> TIME_RUNNING;

Insert some 

Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-12 Thread Matt Mackall
On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
 +struct timesource_t timesource_jiffies = {
 + .name = jiffies,
 + .priority = 0, /* lowest priority*/
 + .type = TIMESOURCE_FUNCTION,
 + .read_fnct = jiffies_read,
 + .mask = (cycle_t)~0,

Not sure this is right. The type of 0 is 'int' and the ~ will happen
before the cast to a potentially longer type.

 + .mult = NSEC_PER_SEC/HZ,

Does rounding matter here? Alpha has HZ of 1024, so this comes out to
976562.5.

 +struct timesource_t {
 + char* name;
 + int priority;
 + enum {
 + TIMESOURCE_FUNCTION,
 + TIMESOURCE_CYCLES,
 + TIMESOURCE_MMIO_32,
 + TIMESOURCE_MMIO_64
 + } type;
 + cycle_t (*read_fnct)(void);
 + void __iomem* mmio_ptr;

Convention is * goes next to the variable name rather than the type.

 +/* XXX - this should go somewhere better! */
 +#ifndef readq
 +static inline unsigned long long readq(void __iomem *addr)

Somewhere in asm-generic..

 +static inline cycle_t read_timesource(struct timesource_t* ts)
 +{
 + switch (ts-type) {
 + case TIMESOURCE_MMIO_32:
 + return (cycle_t)readl(ts-mmio_ptr);
 + case TIMESOURCE_MMIO_64:
 + return (cycle_t)readq(ts-mmio_ptr);
 + case TIMESOURCE_CYCLES:
 + return (cycle_t)get_cycles();
 + default:/* case: TIMESOURCE_FUNCTION */
 + return ts-read_fnct();
 + }
 +}

Wouldn't it be better to change read_fnct to take a timesource * and
then change all the other guys to generic_timesource_foo helper
functions? This does away with the switch and makes it trivial to add
new generic sources. Change mmio_ptr to void *private.

 @@ -467,6 +468,7 @@
   pidhash_init();
   init_timers();
   softirq_init();
 + timeofday_init();
   time_init();

Can we push time_init inside of timeofday_init?

 +/* Chapter 5: Kernel Variables [RFC 1589 pg. 28] */
 +/* 5.1 Interface Variables */
 +static int ntp_status= STA_UNSYNC;   /* status */
 +static long ntp_offset;  /* usec */
 +static long ntp_constant = 2;/* ntp magic? */
 +static long ntp_maxerror = NTP_PHASE_LIMIT;  /* usec */
 +static long ntp_esterror = NTP_PHASE_LIMIT;  /* usec */
 +static const long ntp_tolerance  = MAXFREQ;  /* shifted ppm 
 */
 +static const long ntp_precision  = 1;/* constant */
 +
 +/* 5.2 Phase-Lock Loop Variables */
 +static long ntp_freq;/* shifted ppm 
 */
 +static long ntp_reftime; /* sec */

You present a nice argument for not using tabs except at the beginning
of the line.

 +#define MILLION 100

Still a magic number despite being a define. Very meta. Unused.

 +/* int ntp_advance(nsec_t interval):
 + *   Periodic hook which increments NTP state machine by interval.
 + *  Returns the signed PPM adjustment to be used for the next interval.
 + *   This is ntp_hardclock in the RFC.

Why is it not ntp_hardclock here?

 + */
 +int ntp_advance(nsec_t interval)
 +{
 + static u64 interval_sum=0;

Spaces.

 + /* decrement singleshot offset interval */
 + ss_offset_len =- interval;

Eh?

 + /* bound the adjustment to MAXPHASE/MINSEC */
 + if (tmp  (MAXPHASE / MINSEC)  SHIFT_UPDATE)
 + tmp = (MAXPHASE / MINSEC)  SHIFT_UPDATE;
 + if (tmp  -(MAXPHASE / MINSEC)  SHIFT_UPDATE)
 + tmp = -(MAXPHASE / MINSEC)  SHIFT_UPDATE;

max, min?

 + /* Make sure offset is bounded by MAXPHASE */
 + if (tmp  MAXPHASE)
 + tmp = MAXPHASE;
 + if (tmp  -MAXPHASE)
 + tmp = -MAXPHASE;

max, min.

 + if ((ntp_status  STA_FLL)  (interval = MINSEC)) {
 + long damping;
 + tmp = (offset / interval); /* ppm (usec/sec)*/

(unnecessary parens)

 +/* int ntp_adjtimex(struct timex* tx)
 + *   Interface to change NTP state machine
 + */
 +int ntp_adjtimex(struct timex* tx)
 +{
 + long save_offset;
 + int result;
 + unsigned long flags;
 +
 +/*=[Sanity checking]===*/
 + /* Check capabilities if we're trying to modify something */
 + if (tx-modes  !capable(CAP_SYS_TIME))
 + return -EPERM;

This is already done in do_adjtimex.

 + /* clear everything */
 + ntp_status |= STA_UNSYNC;
 + ntp_maxerror = NTP_PHASE_LIMIT;
 + ntp_esterror = NTP_PHASE_LIMIT;
 + ss_offset_len=0;
 + singleshot_adj=0;
 + tick_adj=0;
 + offset_adj =0;

Spacing.

 +/*[Nanosecond based variables]

This comment style is weird. Kill the trailing dashes at least.

 +static enum { TIME_RUNNING, TIME_SUSPENDED } time_suspend_state = 
 TIME_RUNNING;

Insert some line breaks.

 + /* convert timespec to ns */
 + nsec_t newtime = timespec2ns(tv);

 + /* clear NTP settings */
 

Re: [RFC][PATCH] new timeofday core subsystem (v. A3)

2005-03-12 Thread Andreas Schwab
Matt Mackall [EMAIL PROTECTED] writes:

 On Fri, Mar 11, 2005 at 05:24:15PM -0800, john stultz wrote:
 +struct timesource_t timesource_jiffies = {
 +.name = jiffies,
 +.priority = 0, /* lowest priority*/
 +.type = TIMESOURCE_FUNCTION,
 +.read_fnct = jiffies_read,
 +.mask = (cycle_t)~0,

 Not sure this is right. The type of 0 is 'int' and the ~ will happen
 before the cast to a potentially longer type.

If you want an all-one value for any unsigned type then (type)-1 is the
most reliable way.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/