Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Thu, 2005-08-25 at 02:45 +0200, Roman Zippel wrote:
> Hi,
>
> On Wed, 24 Aug 2005, john stultz wrote:
>
> > Ok, well, I'm still at a loss for understanding how this avoids my
> > concern about time inconsistencies.
>
> Let's take a simple example to demonstrate the difference between system
> time and reference time.
[snip]
> 17     17000   16974   8   -26
> 18     18000   17958   8   -42
> 19     19000   18942   8   -58
> 20     20000   19926   8   -74
>
> let's assume we're late with the update by 10 cycles
> (gettimeofday=19926+10*8=20006), so a change to the mult also requires a
> adjustment of the system time:
>
> 20+10  20000   19916   9   -84
>
> so gettimeofday=19916+10*9=20006

Hey Roman,
  Thanks for your patient persistence. The light bulb finally clicked
on for me last night. I'll start playing with the idea and get back to
you.

thanks again,
-john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-24 at 18:44 -0700, George Anzinger wrote:
> > Ok, so you're forcing gettimeofday to be interval aware, so it's applying
> > different fixed NTP adjustments to different chunks of the current
> > interval. The issue of course, if you're using fixed adjustments, is
> > that you have to have n ntp adjustments for n intervals, or you have to
> > apply the same ntp adjustment to multiple intervals.
>
> Uh, are you saying that one ntpd call can set up several different
> adjustments?

Well, it allows for frequency adjustments, tick adjustments, and offset
adjustments in a single call or just the singleshot (adjtime)
adjustment. However it does not give multiple scaling factors for
different intervals, so you are correct there.

> I was assuming that any given call would set up either a
> fixed adjustment for ever or a fixed adjustment to be applied for a
> fixed number of ticks (or until so much correcting was done, which, in
> the end is the same thing at this point in the code).
>
> If ntpd has to come back to change the adjustment, I am assuming that
> some kernel action can be taken at that time to sync the xtime clock and
> the gettimeofday reading of it. I.e. we would only have to keep track
> of one adjustment with a possible pre specified end time.

Well, I guess a component of the adjustment would end at a specified
time, that's true.

> > > I would argue that only two terms are needed here regardless of
> > > how late a tick is. This is because, I would expect the ntp system call
> > > to sync the two clocks. This means in your example, the ntp call would
> > > have been made at, or prior to the timer interrupt at 2 and this is the
> > > same edge that gettimeofday is to use to start applying the correction.
> >
> > If you argue that we only need two adjustments, why not argue for only
> > one? You're saying have one adjustment that you apply for the first
> > tick's worth of time, and a second adjustment that you apply for the
> > following N ticks' worth of time in the interval. Why the odd base
> > case?
>
> Correct me if I am wrong here, but I am assuming that ntpd can ask for
> an adjustment of X amount which the kernel changes into N adjustments of
> X/N amount spread over the next N ticks.

No, sorry, you are correct there, I was confusing things.

It may work, and I had considered a similar idea when developing my
solution, but it seemed far too ugly and complicated. But that could
have just been my fault. :)

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
john stultz wrote:
> On Wed, 2005-08-24 at 16:46 -0700, George Anzinger wrote:
> > john stultz wrote:
> > > On Tue, 2005-08-23 at 17:29 -0700, George Anzinger wrote:
> > > > Roman Zippel wrote:
> > > > > Hi,
> > > > >
> > > > > On Tue, 23 Aug 2005, john stultz wrote:
> > > > >
> > > > > > I'm assuming gettimeofday()/clock_gettime() looks something like:
> > > > > >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
> > > > >
> > > > > Where did you get the ntp_adj from? It's not in my example.
> > > > > gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> > > > > error) >> shift". The difference between system time and reference
> > > > > time is really important. gettimeofday() returns the system time, NTP
> > > > > controls the reference time and these two are synchronized regularly.
> > > > > I didn't see that anywhere in your example.
> > > >
> > > > If I read your example right, the problem is when the NTP adjustment
> > > > changes while the two clocks are out of sync (because of a late tick).
> > >
> > > Not quite. The issue that I'm trying to describe is that if we
> > > inconsistently calculate time intervals in gettimeofday and the timer
> > > interrupt, we have the possibility for time inconsistencies.
> > >
> > > The trivial example using the current code would be something like:
> > >
> > > Again with my 2 cyc per tick clock, HZ=1000.
> > >
> > > gettimeofday():
> > >     xtime + offset_ns
> > >
> > > timer_interrupt:
> > >     xtime += tick_length + ntp_adj
> > >     offset_ns = 0
> > >
> > > 0: gettimeofday: 0 + 0 = 0 ns
> > > 1: gettimeofday: 0 + 500k ns = 500k ns
> > > 2: gettimeofday: 0 + 1M ns = 1M ns
> > > 2: timer_interrupt:
> > > 2: gettimeofday: 1M ns + 0 ns = 1M ns
> > > 3: gettimeofday: 1M ns + 500k ns = 1.5M ns
> > > 4: gettimeofday: 1M ns + 1M ns = 2M ns
> > > 4: timer_interrupt (using -500ppm adjustment)
> > > 4: gettimeofday: 1,999,500 ns + 0 ns = 1,999,500 ns
> >
> > At point 4 you are introducing a NEW ntp adjustment. This, I submit,
> > needs to actually have been introduced to the system prior to the
> > interrupt at point 2 with the first xtime change at point 4. However,
> > gettimeofday() should be aware of it from the interrupt at point 2 and
> > be doing corrections from that time forward. Thus when the point 4
> > interrupt happens xtime will be the same as the gettimeofday a ns earlier.
>
> Yes, clearly a forward knowledge of the NTP adjustment is necessary for
> gettimeofday(), because after the NTP adjustment has been accumulated
> into xtime, there's nothing left for gettimeofday to adjust (it's already
> been applied). :)
>
> > Likewise, gettimeofday() needs to know when to stop applying the correction
> > so that if a tick is late, it will apply the correction only for those
> > times that it was needed. This could be done by figuring the offset
> > thusly:
> >
> > offset = (offset from last tick to end of ntp period * ntp_adj1) +
> >          (offset from end of ntp period to now)
>
> Well, in my example, the ntp_adjustment is a fixed nanosecond offset, so
> it would be added to the nanosecond offset from the last tick (which is
> how the current code works). If you are doing scaling (as you have in
> the equation above), then the problem goes away, since you can apply the
> adjustment consistently through any interval.

Until the end of the correction time...

> > I suppose it is possible that the latter part of the offset is also
> > under a different ntp correction which would mean a "* ntp_adj2" is
> > needed.
>
> Ok, so you're forcing gettimeofday to be interval aware, so it's applying
> different fixed NTP adjustments to different chunks of the current
> interval. The issue of course, if you're using fixed adjustments, is
> that you have to have n ntp adjustments for n intervals, or you have to
> apply the same ntp adjustment to multiple intervals.

Uh, are you saying that one ntpd call can set up several different
adjustments? I was assuming that any given call would set up either a
fixed adjustment for ever or a fixed adjustment to be applied for a
fixed number of ticks (or until so much correcting was done, which, in
the end is the same thing at this point in the code).

If ntpd has to come back to change the adjustment, I am assuming that
some kernel action can be taken at that time to sync the xtime clock and
the gettimeofday reading of it. I.e. we would only have to keep track
of one adjustment with a possible pre specified end time.

> > I would argue that only two terms are needed here regardless of
> > how late a tick is. This is because, I would expect the ntp system call
> > to sync the two clocks. This means in your example, the ntp call would
> > have been made at, or prior to the timer interrupt at 2 and this is the
> > same edge that gettimeofday is to use to start applying the correction.
>
> If you argue that we only need two adjustments, why not argue for only
> one? You're saying have one adjustment that you apply for the first
> tick's worth of time, and a second adjustment that you apply for the
> following N ticks' worth of time in the interval. Why the odd base
> case?

Correct me if I am wrong here, but I am assuming that ntpd can ask for
an adjustment of X amount which the kernel changes into N adjustments of
X/N amount spread over the ne
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Wed, 24 Aug 2005, john stultz wrote:

> Ok, well, I'm still at a loss for understanding how this avoids my
> concern about time inconsistencies.

Let's take a simple example to demonstrate the difference between system
time and reference time.
NTP tells us to update the reference time by 1000 units every tick and a
single tick consists of 123 cycles, so the initial multiplier is 8. This
means after 1 tick the system time is 984 and off by -16:

time (ticks)  reference time  system time  mult  error
0             0               0            8     0
1             1000            984          8     -16
2             2000            1968         8     -32
3             3000            2952         8     -48
4             4000            3936         9     -64

the error is now big enough, so we speed up system time:

5             5000            5043         9     43
6             6000            6150         8     150

and slow it down again:

7             7000            7134         8     134
8             8000            8118         8     118
9             9000            9102         8     102
10            10000           10086        8     86
11            11000           11070        8     70
12            12000           12054        8     54
13            13000           13038        8     38
14            14000           14022        8     22
15            15000           15006        8     6
16            16000           15990        8     -10
17            17000           16974        8     -26
18            18000           17958        8     -42
19            19000           18942        8     -58
20            20000           19926        8     -74

let's assume we're late with the update by 10 cycles
(gettimeofday=19926+10*8=20006), so a change to the mult also requires a
adjustment of the system time:

20+10         20000           19916        9     -84

so gettimeofday=19916+10*9=20006

21            21000           21023        9     23
22            22000           22130        8     130

now add a single adjustment of 500 to the reference time:

23            23500           23114        11    -386
24            24500           24467        8     -33

A detail which is missing now in my example code is that we actually
should look ahead to the next update, so that multiplier is immediately
adjusted and the error above would never exceed 123/2 unless an update is
delayed.

It's really not that difficult :), it's just important to understand the
difference between reference time and system time. All the NTP adjustments
are done to the reference time and we manipulate the speed of the system
clock to keep it close. The latter has _nothing_ to do with NTP so I don't
want to see anything called like ntp_adj there.

bye, Roman
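The arithmetic in Roman's table can be replayed with a short script. This is only a minimal sketch: the names simulate() and rebase_for_new_mult() are hypothetical, and the multiplier schedule is read off the table's "mult" column rather than derived, since the message does not spell out the exact rule for choosing it.

```python
CYCLES_PER_TICK = 123   # cycles in one tick, from the example
REFERENCE_STEP = 1000   # units NTP adds to the reference time per tick


def simulate(mults, reference=0, system=0, start_tick=0, one_shot=None):
    """Replay the table: mults[i] is the multiplier used for the i-th tick
    after start_tick (taken from the 'mult' column, not computed here).
    one_shot maps a tick number to an extra reference-time adjustment."""
    one_shot = one_shot or {}
    rows = [(start_tick, reference, system, system - reference)]
    for tick, mult in enumerate(mults, start=start_tick + 1):
        reference += REFERENCE_STEP + one_shot.get(tick, 0)
        system += CYCLES_PER_TICK * mult
        rows.append((tick, reference, system, system - reference))
    return rows  # (tick, reference time, system time, error)


def rebase_for_new_mult(system, cycles_late, old_mult, new_mult):
    """When the update runs cycles_late cycles after the tick, move the
    system time base so that the reading at this instant is unchanged."""
    reading_now = system + cycles_late * old_mult
    return reading_now - cycles_late * new_mult


# Ticks 0..20 with the multipliers from the table.
first = simulate([8] * 4 + [9] * 2 + [8] * 14)
# Late update at tick 20: switch from mult 8 to mult 9, 10 cycles late.
base = rebase_for_new_mult(first[-1][2], 10, 8, 9)
# Ticks 21..24, including the single +500 reference adjustment at tick 23.
second = simulate([9, 9, 8, 11], reference=20000, system=base,
                  start_tick=20, one_shot={23: 500})
```

The point of rebase_for_new_mult() is that gettimeofday() reads the same value immediately before and immediately after the multiplier changes, which is the property the "20+10" row demonstrates.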
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-24 at 16:46 -0700, George Anzinger wrote:
> john stultz wrote:
> > On Tue, 2005-08-23 at 17:29 -0700, George Anzinger wrote:
> > > Roman Zippel wrote:
> > > > Hi,
> > > >
> > > > On Tue, 23 Aug 2005, john stultz wrote:
> > > >
> > > > > I'm assuming gettimeofday()/clock_gettime() looks something like:
> > > > >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
> > > >
> > > > Where did you get the ntp_adj from? It's not in my example.
> > > > gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> > > > error) >> shift". The difference between system time and reference
> > > > time is really important. gettimeofday() returns the system time, NTP
> > > > controls the reference time and these two are synchronized regularly.
> > > > I didn't see that anywhere in your example.
> > >
> > > If I read your example right, the problem is when the NTP adjustment
> > > changes while the two clocks are out of sync (because of a late tick).
> >
> > Not quite. The issue that I'm trying to describe is that if we
> > inconsistently calculate time intervals in gettimeofday and the timer
> > interrupt, we have the possibility for time inconsistencies.
> >
> > The trivial example using the current code would be something like:
> >
> > Again with my 2 cyc per tick clock, HZ=1000.
> >
> > gettimeofday():
> >     xtime + offset_ns
> >
> > timer_interrupt:
> >     xtime += tick_length + ntp_adj
> >     offset_ns = 0
> >
> > 0: gettimeofday: 0 + 0 = 0 ns
> > 1: gettimeofday: 0 + 500k ns = 500k ns
> > 2: gettimeofday: 0 + 1M ns = 1M ns
> > 2: timer_interrupt:
> > 2: gettimeofday: 1M ns + 0 ns = 1M ns
> > 3: gettimeofday: 1M ns + 500k ns = 1.5M ns
> > 4: gettimeofday: 1M ns + 1M ns = 2M ns
> > 4: timer_interrupt (using -500ppm adjustment)
> > 4: gettimeofday: 1,999,500 ns + 0 ns = 1,999,500 ns
>
> At point 4 you are introducing a NEW ntp adjustment. This, I submit,
> needs to actually have been introduced to the system prior to the
> interrupt at point 2 with the first xtime change at point 4. However,
> gettimeofday() should be aware of it from the interrupt at point 2 and
> be doing corrections from that time forward. Thus when the point 4
> interrupt happens xtime will be the same as the gettimeofday a ns earlier.

Yes, clearly a forward knowledge of the NTP adjustment is necessary for
gettimeofday(), because after the NTP adjustment has been accumulated
into xtime, there's nothing left for gettimeofday to adjust (it's already
been applied). :)

> Likewise, gettimeofday() needs to know when to stop applying the correction
> so that if a tick is late, it will apply the correction only for those
> times that it was needed. This could be done by figuring the offset
> thusly:
>
> offset = (offset from last tick to end of ntp period * ntp_adj1) +
>          (offset from end of ntp period to now)

Well, in my example, the ntp_adjustment is a fixed nanosecond offset, so
it would be added to the nanosecond offset from the last tick (which is
how the current code works). If you are doing scaling (as you have in
the equation above), then the problem goes away, since you can apply the
adjustment consistently through any interval.

> I suppose it is possible that the latter part of the offset is also
> under a different ntp correction which would mean a "* ntp_adj2" is
> needed.

Ok, so you're forcing gettimeofday to be interval aware, so it's applying
different fixed NTP adjustments to different chunks of the current
interval. The issue of course, if you're using fixed adjustments, is
that you have to have n ntp adjustments for n intervals, or you have to
apply the same ntp adjustment to multiple intervals.

> I would argue that only two terms are needed here regardless of
> how late a tick is. This is because, I would expect the ntp system call
> to sync the two clocks. This means in your example, the ntp call would
> have been made at, or prior to the timer interrupt at 2 and this is the
> same edge that gettimeofday is to use to start applying the correction.

If you argue that we only need two adjustments, why not argue for only
one? You're saying have one adjustment that you apply for the first
tick's worth of time, and a second adjustment that you apply for the
following N ticks' worth of time in the interval. Why the odd base
case?

> > > It would appear that gettimeofday would need to know that the NTP
> > > adjustment is changing (and to what). It would also appear that this
> > > is known by the ntp code and could be made available to gettimeofday.
> > > If it is changing due to an NTP call, that system call, itself,
> > > should/must force synchronization. So the only case gettimeofday needs
> > > to worry/know about is that an adjustment is to change at time X to
> > > value Y. Also, me thinks there is only one such change that can be
> > > present at any given time.
> >
> > Well, in many arches gettimeofday() works around the above issu
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
john stultz wrote:
> On Tue, 2005-08-23 at 17:29 -0700, George Anzinger wrote:
> > Roman Zippel wrote:
> > > Hi,
> > >
> > > On Tue, 23 Aug 2005, john stultz wrote:
> > >
> > > > I'm assuming gettimeofday()/clock_gettime() looks something like:
> > > >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
> > >
> > > Where did you get the ntp_adj from? It's not in my example.
> > > gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> > > error) >> shift". The difference between system time and reference
> > > time is really important. gettimeofday() returns the system time, NTP
> > > controls the reference time and these two are synchronized regularly.
> > > I didn't see that anywhere in your example.
> >
> > If I read your example right, the problem is when the NTP adjustment
> > changes while the two clocks are out of sync (because of a late tick).
>
> Not quite. The issue that I'm trying to describe is that if we
> inconsistently calculate time intervals in gettimeofday and the timer
> interrupt, we have the possibility for time inconsistencies.
>
> The trivial example using the current code would be something like:
>
> Again with my 2 cyc per tick clock, HZ=1000.
>
> gettimeofday():
>     xtime + offset_ns
>
> timer_interrupt:
>     xtime += tick_length + ntp_adj
>     offset_ns = 0
>
> 0: gettimeofday: 0 + 0 = 0 ns
> 1: gettimeofday: 0 + 500k ns = 500k ns
> 2: gettimeofday: 0 + 1M ns = 1M ns
> 2: timer_interrupt:
> 2: gettimeofday: 1M ns + 0 ns = 1M ns
> 3: gettimeofday: 1M ns + 500k ns = 1.5M ns
> 4: gettimeofday: 1M ns + 1M ns = 2M ns
> 4: timer_interrupt (using -500ppm adjustment)
> 4: gettimeofday: 1,999,500 ns + 0 ns = 1,999,500 ns

At point 4 you are introducing a NEW ntp adjustment. This, I submit,
needs to actually have been introduced to the system prior to the
interrupt at point 2 with the first xtime change at point 4. However,
gettimeofday() should be aware of it from the interrupt at point 2 and
be doing corrections from that time forward. Thus when the point 4
interrupt happens xtime will be the same as the gettimeofday a ns earlier.

Likewise, gettimeofday() needs to know when to stop applying the correction
so that if a tick is late, it will apply the correction only for those
times that it was needed. This could be done by figuring the offset
thusly:

offset = (offset from last tick to end of ntp period * ntp_adj1) +
         (offset from end of ntp period to now)

I suppose it is possible that the latter part of the offset is also
under a different ntp correction which would mean a "* ntp_adj2" is
needed. I would argue that only two terms are needed here regardless of
how late a tick is. This is because, I would expect the ntp system call
to sync the two clocks. This means in your example, the ntp call would
have been made at, or prior to the timer interrupt at 2 and this is the
same edge that gettimeofday is to use to start applying the correction.

> > It would appear that gettimeofday would need to know that the NTP
> > adjustment is changing (and to what). It would also appear that this
> > is known by the ntp code and could be made available to gettimeofday.
> > If it is changing due to an NTP call, that system call, itself,
> > should/must force synchronization. So the only case gettimeofday needs
> > to worry/know about is that an adjustment is to change at time X to
> > value Y. Also, me thinks there is only one such change that can be
> > present at any given time.
>
> Well, in many arches gettimeofday() works around the above issue by
> capping the offset_ns value as such:

I think this may have been done with only usec gettimeofday. Now that
we have clock_gettime() returning nsec, we need to be a bit more careful.

> gettimeofday:
>     xtime + min(offset_ns, tick_len + ntp_adj)
>
> The problem with this is that when we have lost or late ticks, or if we
> are using dynamic ticks you have granularity problems.

--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
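George's two-term offset can be sketched roughly as follows. This is an illustration only: two_term_offset() and scaled() are hypothetical names, and the adjustments are treated as ppm scale factors (the "scaling" reading of his formula), which is an assumption rather than something the message fixes.

```python
def scaled(ns, ppm):
    """Stretch or shrink a nanosecond span by a parts-per-million factor."""
    return ns + ns * ppm // 1_000_000


def two_term_offset(last_tick_ns, now_ns, period_end_ns, adj1_ppm, adj2_ppm=0):
    """offset = (last tick .. end of ntp period) scaled by adj1
              + (end of ntp period .. now) scaled by adj2.
    With adj2_ppm=0 the second term is the plain unadjusted span."""
    # Clamp the boundary into [last_tick_ns, now_ns] so that a period that
    # ended before the last tick, or has not ended yet, is handled too.
    boundary = min(max(period_end_ns, last_tick_ns), now_ns)
    return (scaled(boundary - last_tick_ns, adj1_ppm)
            + scaled(now_ns - boundary, adj2_ppm))


# A tick that is half a tick late: the -500ppm correction ended at 1 ms,
# the reading is taken at 1.5 ms after the last tick.
late = two_term_offset(0, 1_500_000, 1_000_000, -500)
```

Only one boundary is tracked, which matches the argument that a single pending change "at time X to value Y" is enough; more than one pending change would need more terms.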
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-24 at 21:49 +0200, Roman Zippel wrote:
> On Wed, 24 Aug 2005, john stultz wrote:
>
> > from your example:
> > > // at init: system_update = update_cycles * mult;
> > > system_time += system_update;
> >
> > and:
> > > error = system_time - (xtime.tv_nsec << shift);
> >
> > This doesn't seem to make sense with the above. Could you clarify?
>
> The example here doesn't keep the complete system time, just enough to
> compute the difference.

Hey Roman,

Ok, well, I'm still at a loss for understanding how this avoids my
concern about time inconsistencies. However, I don't want to burn any
more of your patience explaining it, so in the hopes of making some
productive outcome, I'm going to take a step back, pull the most trivial
and uncontroversial cleanups and fixes in my patches and try to send
them to Andrew one by one.

Hopefully that will give me a chance to spend some time and understand
your suggestions (or maybe allow someone else to express your
suggestions differently) and think of alternate solutions without
feeling like I'm constantly running into walls.

Again, I really do appreciate the time you've spent giving me feedback.

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-23 at 17:29 -0700, George Anzinger wrote:
> Roman Zippel wrote:
> > Hi,
> >
> > On Tue, 23 Aug 2005, john stultz wrote:
> >
> > > I'm assuming gettimeofday()/clock_gettime() looks something like:
> > >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
> >
> > Where did you get the ntp_adj from? It's not in my example.
> > gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> > error) >> shift". The difference between system time and reference
> > time is really important. gettimeofday() returns the system time, NTP
> > controls the reference time and these two are synchronized regularly.
> > I didn't see that anywhere in your example.
>
> If I read your example right, the problem is when the NTP adjustment
> changes while the two clocks are out of sync (because of a late tick).

Not quite. The issue that I'm trying to describe is that if we
inconsistently calculate time intervals in gettimeofday and the timer
interrupt, we have the possibility for time inconsistencies.

The trivial example using the current code would be something like:

Again with my 2 cyc per tick clock, HZ=1000.

gettimeofday():
    xtime + offset_ns

timer_interrupt:
    xtime += tick_length + ntp_adj
    offset_ns = 0

0: gettimeofday: 0 + 0 = 0 ns
1: gettimeofday: 0 + 500k ns = 500k ns
2: gettimeofday: 0 + 1M ns = 1M ns
2: timer_interrupt:
2: gettimeofday: 1M ns + 0 ns = 1M ns
3: gettimeofday: 1M ns + 500k ns = 1.5M ns
4: gettimeofday: 1M ns + 1M ns = 2M ns
4: timer_interrupt (using -500ppm adjustment)
4: gettimeofday: 1,999,500 ns + 0 ns = 1,999,500 ns

> It would appear that gettimeofday would need to know that the NTP
> adjustment is changing (and to what). It would also appear that this
> is known by the ntp code and could be made available to gettimeofday.
> If it is changing due to an NTP call, that system call, itself,
> should/must force synchronization. So the only case gettimeofday needs
> to worry/know about is that an adjustment is to change at time X to
> value Y. Also, me thinks there is only one such change that can be
> present at any given time.

Well, in many arches gettimeofday() works around the above issue by
capping the offset_ns value as such:

gettimeofday:
    xtime + min(offset_ns, tick_len + ntp_adj)

The problem with this is that when we have lost or late ticks, or if we
are using dynamic ticks you have granularity problems.

thanks
-john
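The fixed-nanosecond example, and the capping workaround, can be replayed in a few lines. This is a sketch under the example's own assumptions (2 cycles per tick, HZ=1000); the class TickClock and its method names are made up for illustration and are not kernel interfaces.

```python
TICK_NS = 1_000_000                       # tick_length at HZ=1000
CYCLES_PER_TICK = 2                       # the "2 cyc per tick" clock
NS_PER_CYCLE = TICK_NS // CYCLES_PER_TICK


class TickClock:
    """Toy model: xtime plus an unscaled nanosecond offset since the tick."""

    def __init__(self):
        self.xtime = 0
        self.last_tick_cycle = 0

    def gettimeofday(self, cycle, cap=None):
        offset_ns = (cycle - self.last_tick_cycle) * NS_PER_CYCLE
        if cap is not None:
            # The workaround: min(offset_ns, tick_len + ntp_adj)
            offset_ns = min(offset_ns, cap)
        return self.xtime + offset_ns

    def timer_interrupt(self, cycle, ntp_adj=0):
        self.xtime += TICK_NS + ntp_adj   # fixed ns adjustment per tick
        self.last_tick_cycle = cycle      # i.e. offset_ns = 0


# Uncapped: the reading steps backwards across the interrupt at cycle 4.
clock = TickClock()
clock.timer_interrupt(2)
before = clock.gettimeofday(4)
clock.timer_interrupt(4, ntp_adj=-500)
after = clock.gettimeofday(4)

# Capped: consistent at cycle 4, but a late tick makes the reading stall.
capped = TickClock()
capped.timer_interrupt(2)
cap = TICK_NS - 500
on_time = capped.gettimeofday(4, cap=cap)
half_tick_late = capped.gettimeofday(5, cap=cap)
```

The capped clock never runs backwards here, but while the tick is late it returns the same value for every call, which is the granularity problem mentioned above.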
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Wed, 24 Aug 2005, john stultz wrote:

> from your example:
> > // at init: system_update = update_cycles * mult;
> > system_time += system_update;
>
> and:
> > error = system_time - (xtime.tv_nsec << shift);
>
> This doesn't seem to make sense with the above. Could you clarify?

The example here doesn't keep the complete system time, just enough to
compute the difference.

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-24 at 20:48 +0200, Roman Zippel wrote:
> Hi,
>
> On Wed, 24 Aug 2005, john stultz wrote:
>
> > Ok, so then to clarify the above (as you mention gettimeofday uses
> > system_time), would your gettimeofday look something like:
> >
> > gettimeofday():
> >     return (system_time + (cycle_offset * mult) + error)>> shift
> >
> > ?
>
> No.
>
> reference_time = xtime;
> system_time = xtime + error >> shift;
> gettimeofday = system_time + (cycle_offset * mult) >> shift;

Eh? In your example code from before you look to be keeping the
system_time and error values in shifted nsec units.

from your example:
> // at init: system_update = update_cycles * mult;
> system_time += system_update;

and:
> error = system_time - (xtime.tv_nsec << shift);

This doesn't seem to make sense with the above. Could you clarify?

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Wed, 24 Aug 2005, john stultz wrote:

> Ok, so then to clarify the above (as you mention gettimeofday uses
> system_time), would your gettimeofday look something like:
>
> gettimeofday():
>     return (system_time + (cycle_offset * mult) + error)>> shift
>
> ?

No.

reference_time = xtime;
system_time = xtime + error >> shift;
gettimeofday = system_time + (cycle_offset * mult) >> shift;

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-24 at 01:54 +0200, Roman Zippel wrote:
> Hi,
>
> On Tue, 23 Aug 2005, john stultz wrote:
>
> > I'm assuming gettimeofday()/clock_gettime() looks something like:
> >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
>
> Where did you get the ntp_adj from? It's not in my example.
> gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> error) >> shift". The difference between system time and reference
> time is really important. gettimeofday() returns the system time, NTP
> controls the reference time and these two are synchronized regularly.
> I didn't see that anywhere in your example.

Ok, so then to clarify the above (as you mention gettimeofday uses
system_time), would your gettimeofday look something like:

gettimeofday():
    return (system_time + (cycle_offset * mult) + error)>> shift

?

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Wed, 24 Aug 2005, Ulrich Windl wrote:

> I'm having a problem with your wording: NTP _does_ control the "system time"
> (system clock), because it's the only clock it can use. The "reference time"
> is usually remote or elsewhere (multiple sources). Local NTP does not control
> the remote reference time(s).

I'm open to better wording suggestions, but this is from the kernel
perspective and the ntp daemon has as much control over the kernel time
as the remote server has control over the ntp daemon (and basically also
the other way around).
Every entity has its own idea of time and uses something else as
reference. The ntp daemon uses the remote server as reference time and
the kernel gets a reference time from the ntp daemon. The kernel can now
either just jump in regular intervals to that reference time or it
modifies the speed of the system time to keep close to it.
It's really the kernel that modifies the system clock based on the
parameters from the ntp daemon.

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On 24 Aug 2005 at 1:54, Roman Zippel wrote:

[...]
> error) >> shift". The difference between system time and reference
> time is really important. gettimeofday() returns the system time, NTP
> controls the reference time and these two are synchronized regularly.
[...]

Roman,

I'm having a problem with your wording: NTP _does_ control the "system time"
(system clock), because it's the only clock it can use. The "reference time"
is usually remote or elsewhere (multiple sources). Local NTP does not control
the remote reference time(s).

Regards,
Ulrich
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Roman Zippel wrote:
> Hi,
>
> On Tue, 23 Aug 2005, john stultz wrote:
>
> > I'm assuming gettimeofday()/clock_gettime() looks something like:
> >    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift
>
> Where did you get the ntp_adj from? It's not in my example.
> gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
> error) >> shift". The difference between system time and reference
> time is really important. gettimeofday() returns the system time, NTP
> controls the reference time and these two are synchronized regularly.
> I didn't see that anywhere in your example.

John,

If I read your example right, the problem is when the NTP adjustment
changes while the two clocks are out of sync (because of a late tick).
It would appear that gettimeofday would need to know that the NTP
adjustment is changing (and to what). It would also appear that this
is known by the ntp code and could be made available to gettimeofday.
If it is changing due to an NTP call, that system call, itself,
should/must force synchronization. So the only case gettimeofday needs
to worry/know about is that an adjustment is to change at time X to
value Y. Also, me thinks there is only one such change that can be
present at any given time.

Hope this helps...

--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Tue, 23 Aug 2005, john stultz wrote:

> I'm assuming gettimeofday()/clock_gettime() looks something like:
>    xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift

Where did you get the ntp_adj from? It's not in my example.
gettimeofday() was in the previous mail: "xtime + (cycle_offset * mult +
error) >> shift". The difference between system time and reference
time is really important. gettimeofday() returns the system time, NTP
controls the reference time and these two are synchronized regularly.
I didn't see that anywhere in your example.

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-23 at 23:34 +0200, Roman Zippel wrote:
> Hi,
>
> On Tue, 23 Aug 2005, john stultz wrote:
>
> > In the case above, you're accumulating in fixed cycle intervals. This
> > does avoid having to do the mult/shift combo each interrupt, however,
> > since you do not accumulate the entire interval, there is some
> > sub-tick remainder in cycle_offset. We have to ensure that that sub-tick
> > remainder is accumulated at the next interrupt using the same ntp
> > adjustment it would use in a call to gettimeofday() just prior to this
> > interrupt.
>
> Look closer and you'll notice that the cycle_offset remainder isn't lost. :)

I'm not saying it's lost, but that it is accumulated differently than how
it would be used in gettimeofday().

So a somewhat lengthy and exaggerated example with a clock that has
2 cycles per millisecond, HZ=1000 and a 0ppm adjustment to begin with.

I'm assuming gettimeofday()/clock_gettime() looks something like:
   xtime + (get_cycles()-last_update)*(mult+ntp_adj)>>shift

Using (1,000,000, 1) for the (mult,shift) pair.

So first, the easy case:

time(0): gettimeofday: 0 + (0 - 0)*(1M+0)>>1 = 0 ns
time(1): gettimeofday: 0 + (1 - 0)*(1M+0)>>1 = 0.5M ns
time(2): gettimeofday: 0 + (2 - 0)*(1M+0)>>1 = 1M ns
time(2): interrupt
time(2): gettimeofday: 1M + (2 - 2)*(1M+0)>>1 = 1M ns
time(3): gettimeofday: 1M + (3 - 2)*(1M+0)>>1 = 1.5M ns

Now, let's look at how we deal with ticks that arrive late:

time(6): gettimeofday: 2M + (6 - 4)*(1M+0)>>1 = 3M ns
time(7): gettimeofday: 2M + (7 - 4)*(1M+0)>>1 = 3.5M ns
time(7): interrupt
time(7): gettimeofday: 3M + (7 - 6)*(1M+0)>>1 = 3.5M ns
time(8): gettimeofday: 3M + (8 - 6)*(1M+0)>>1 = 4M ns

So everything looks ducky. Now on to when we make NTP adjustments.

time(11): gettimeofday: 5M + (11 - 10)*(1M+0)>>1 = 5.5M ns
time(12): gettimeofday: 5M + (12 - 10)*(1M+0)>>1 = 6M ns
time(12): interrupt (set ntp_adj = 1000 ~= 500ppm)
time(12): gettimeofday: 6M + (12 - 12)*(1M+ 1000)>>1 = 6M ns
time(13): gettimeofday: 6M + (13 - 12)*(1M+ 1000)>>1 = 6,500,500 ns
time(14): gettimeofday: 6M + (14 - 12)*(1M+ 1000)>>1 = 7,001,000 ns

Still doing fine. Now let's look at doing NTP adjustments while ticks
arrive late:

time(15): gettimeofday: 7,001k + (15 - 14)*(1M+ 1000)>>1 = 7,501,500 ns
time(16): gettimeofday: 7,001k + (16 - 14)*(1M+ 1000)>>1 = 8,002,000 ns
time(17): gettimeofday: 7,001k + (17 - 14)*(1M+ 1000)>>1 = 8,502,500 ns
time(17): interrupt, (set ntp_adj = 0ppm)
time(17): gettimeofday: 8,002k + (17 - 16)*(1M+ 0)>>1 = 8,502,000 ns

And bang, we have a 500 ns time inconsistency! And that was only with a
tick arriving 1/2 a tick late. I've dealt with systems that on occasion
miss 30ms worth of ticks due to SMI craziness.

This is why I accumulate the entire interval with NTP adjustments
consistently between the timer tick and gettimeofday.

Right now I'm not sure how to work around this issue with your proposal,
but let me know if you have an idea or I'm missing some other subtlety.

thanks
-john
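The late-tick case in the example above can be replayed in a few lines of C. This is a minimal sketch under stated assumptions, not code from the patch set: `struct clock_model`, `read_time()`, `tick()` and `late_tick_step()` are hypothetical names, the tick accumulates a fixed two cycles with the adjustment that was in force, and the shift is parenthesised explicitly because in C `+` binds tighter than `>>`.

```c
#include <stdint.h>

/* Hypothetical model of the clock state used in the example. */
struct clock_model {
    int64_t xtime;        /* accumulated time in ns */
    int64_t last_update;  /* cycle count that xtime corresponds to */
    int64_t mult;         /* base multiplier */
    int64_t ntp_adj;      /* NTP adjustment added to mult */
    int     shift;
};

/* xtime + (((now - last_update) * (mult + ntp_adj)) >> shift) */
static int64_t read_time(const struct clock_model *c, int64_t now)
{
    int64_t delta = now - c->last_update;
    return c->xtime + ((delta * (c->mult + c->ntp_adj)) >> c->shift);
}

/*
 * Accumulate exactly one tick of cycles_per_tick cycles using the
 * adjustment currently in force, then switch to new_adj.
 */
static void tick(struct clock_model *c, int64_t cycles_per_tick,
                 int64_t new_adj)
{
    c->xtime += (cycles_per_tick * (c->mult + c->ntp_adj)) >> c->shift;
    c->last_update += cycles_per_tick;
    c->ntp_adj = new_adj;
}

/*
 * Replays time(17): read the clock, take the late tick that also
 * resets ntp_adj to 0, read again. Returns how far time stepped back.
 */
static int64_t late_tick_step(void)
{
    struct clock_model c = { 7001000, 14, 1000000, 1000, 1 };
    int64_t before = read_time(&c, 17);  /* 8,502,500 ns */
    tick(&c, 2, 0);                      /* xtime = 8,002,000, last_update = 16 */
    int64_t after = read_time(&c, 17);   /* 8,502,000 ns */
    return before - after;
}
```

The one leftover cycle is read with `ntp_adj = 1000` before the interrupt and with `ntp_adj = 0` after it, which is where the 500 ns come from.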
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Tue, 23 Aug 2005, john stultz wrote:

> In the case above, you're accumulating in fixed cycle intervals. This
> does avoid having to do the mult/shift combo each interrupt, however,
> since you do not accumulate the entire interval, there is some
> sub-tick remainder in cycle_offset. We have to ensure that that sub-tick
> remainder is accumulated at the next interrupt using the same ntp
> adjustment it would use in a call to gettimeofday() just prior to this
> interrupt.

Look closer and you'll notice that the cycle_offset remainder isn't lost. :)

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-23 at 13:30 +0200, Roman Zippel wrote:
> On Mon, 22 Aug 2005, john stultz wrote:
>
> > The reason why we calculate the interval_length in the continuous
> > timesource case is because we are not assuming anything about the
> > frequency that the timekeeping_periodic_hook() is called.
>
> The problem with your patch is that it doesn't allow making such
> assumptions.
> Anyway, it's rather simple, if you want to update the time asynchronously:
>
>       cycle_offset = get_cycles() - last_update;
>
>       while (cycle_offset >= update_cycles) {
>               cycle_offset -= update_cycles;
>               last_update += update_cycles;
>               // at init: system_update = update_cycles * mult;
>               system_time += system_update;
>               xtime += [tick_nsec, time_adj];
>       }

Hmm. An issue cropped up when I started working on this: it seems it's
prone to time inconsistencies.

One of the big issues with my work is that we consistently accumulate
time in the exact same manner that we use it when calculating
gettimeofday.

That is:

gettimeofday():
        xtime + cyc2ns(timesource, ntp_adj, cycle_delta)

periodic_hook():
        interval = cyc2ns(timesource, ntp_adj, cycle_delta)
        xtime += interval
        ...

Since we accumulate the entire interval using the same ntp_adjustment,
we ensure that time will not go briefly backwards around a call to
periodic_hook().

In the case above, you're accumulating in fixed cycle intervals. This
does avoid having to do the mult/shift combo each interrupt, however,
since you do not accumulate the entire interval, there is some
sub-tick remainder in cycle_offset. We have to ensure that that sub-tick
remainder is accumulated at the next interrupt using the same ntp
adjustment it would use in a call to gettimeofday() just prior to this
interrupt.

Not yet sure how to get around that issue. I'll keep working on it, and
maybe you might be able to shed some light on it?

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-23 at 13:30 +0200, Roman Zippel wrote:
> Hi,
>
> On Mon, 22 Aug 2005, john stultz wrote:
>
> > The reason why we calculate the interval_length in the continuous
> > timesource case is because we are not assuming anything about the
> > frequency that the timekeeping_periodic_hook() is called.
>
> The problem with your patch is that it doesn't allow making such
> assumptions.
> Anyway, it's rather simple, if you want to update the time asynchronously:
>
>       cycle_offset = get_cycles() - last_update;
>
>       while (cycle_offset >= update_cycles) {
>               cycle_offset -= update_cycles;
>               last_update += update_cycles;
>               // at init: system_update = update_cycles * mult;
>               system_time += system_update;
>               xtime += [tick_nsec, time_adj];
>       }
>
>       error = system_time - (xtime.tv_nsec << shift);
>
>       if (abs(error) > update_cycles/2) {
>               mult_adj = (error +- update_cycles/2) / update_cycles;
>               mult += mult_adj;
>               system_update += mult_adj * update_cycles;
>               system_time -= mult_adj * cycle_offset;
>               error -= mult_adj * cycle_offset;
>       }
>
>       if (xtime.tv_nsec + (error >> shift) > NSEC_PER_SEC) {
>               system_time -= NSEC_PER_SEC << shift;
>               second_overflow();
>       }

AH! Ok, now I get it. Forgive me for being so dense, but code is just so
much more concrete and understandable. Let me take a swing at
integrating some of this idea into my code and then we can go around
again. :)

> The last one may become a bit of a challenge to keep as much as possible
> code common without abusing the preprocessor too much. In any case some
> functions will differ completely anyway, especially gettimeofday will be
> optimized differently depending on the arch/clock requirements, OTOH
> introducing a common gettimeofday (that would even require a 64bit
> divide) would be a huge mistake.

I'd always want to allow for arch specific implementations, but there
are many cases where the code is doing the exact same thing, so I'd like
to at least consolidate those users. No divides in the hot-path are
necessary.

Thanks again for the review and patience. I really do appreciate it.

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Mon, 22 Aug 2005, john stultz wrote:

> The reason why we calculate the interval_length in the continuous
> timesource case is because we are not assuming anything about the
> frequency that the timekeeping_periodic_hook() is called.

The problem with your patch is that it doesn't allow making such
assumptions.
Anyway, it's rather simple, if you want to update the time asynchronously:

        cycle_offset = get_cycles() - last_update;

        while (cycle_offset >= update_cycles) {
                cycle_offset -= update_cycles;
                last_update += update_cycles;
                // at init: system_update = update_cycles * mult;
                system_time += system_update;
                xtime += [tick_nsec, time_adj];
        }

        error = system_time - (xtime.tv_nsec << shift);

        if (abs(error) > update_cycles/2) {
                mult_adj = (error +- update_cycles/2) / update_cycles;
                mult += mult_adj;
                system_update += mult_adj * update_cycles;
                system_time -= mult_adj * cycle_offset;
                error -= mult_adj * cycle_offset;
        }

        if (xtime.tv_nsec + (error >> shift) > NSEC_PER_SEC) {
                system_time -= NSEC_PER_SEC << shift;
                second_overflow();
        }

Since we usually don't have to adjust for the error all at once, it
should be possible to precalculate some of it in
adjtimex/second_overflow and turn mult_adj into a mult_adj_shift.
I didn't really check the math here in detail, so there should be enough
errors left :), but I hope it's enough to show the idea (especially how
to do it without mult/divide).

There are now variations of this possible, the initial cycle_offset can
be constant, this happens if it's regularly called from an interrupt
(and it's sufficient for UP systems).
We could also completely ignore the error, so that the core calculation
of the above results in the familiar:

        xtime += [tick_nsec, time_adj];
        if (xtime.tv_nsec > NSEC_PER_SEC)
                second_overflow();

Another variation would be useful for ppc64 (or maybe any 64bit arch,
but ppc64 has already the matching gettimeofday). In this case we don't
use a timespec based xtime and don't scale it to ns, but use 64bit
values instead scaled to seconds.

The last one may become a bit of a challenge to keep as much as possible
code common without abusing the preprocessor too much. In any case some
functions will differ completely anyway, especially gettimeofday will be
optimized differently depending on the arch/clock requirements, OTOH
introducing a common gettimeofday (that would even require a 64bit
divide) would be a huge mistake.

bye, Roman
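The step in the pseudo-code that keeps readings continuous is `system_time -= mult_adj * cycle_offset`: when the multiplier changes while some cycles are still unaccumulated, the base is shifted so that `system_time + cycle_offset * mult` is the same just before and just after the change. Below is a minimal sketch of only that step, with hypothetical names (`struct sys_clock`, `sys_read()`, `sys_adjust_mult()`) and unscaled integers; it deliberately leaves out the error tracking and the accumulation loop.

```c
#include <stdint.h>

/* Hypothetical, stripped-down system clock: a base plus a multiplier. */
struct sys_clock {
    int64_t system_time;  /* time at last_update, in scaled units */
    int64_t mult;         /* scaled units per cycle */
};

/* Reading for a given number of cycles since last_update. */
static int64_t sys_read(const struct sys_clock *c, int64_t cycle_offset)
{
    return c->system_time + cycle_offset * c->mult;
}

/*
 * Change the multiplier by mult_adj while cycle_offset cycles are still
 * pending. The base is moved by the opposite amount so that the reading
 * at this instant does not jump:
 *   (system_time - mult_adj * off) + off * (mult + mult_adj)
 *     == system_time + off * mult
 */
static void sys_adjust_mult(struct sys_clock *c, int64_t mult_adj,
                            int64_t cycle_offset)
{
    c->mult += mult_adj;
    c->system_time -= mult_adj * cycle_offset;
}
```

With the figures Roman uses elsewhere in this thread (base 19926, mult 8, update 10 cycles late), the reading is 19926 + 10*8 = 20006 before the change and 19916 + 10*9 = 20006 after raising mult to 9.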
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Mon, 2005-08-22 at 01:19 +0200, Roman Zippel wrote:
> Hi,
>
> On Fri, 19 Aug 2005, john stultz wrote:
>
> > timekeeping_periodic_hook():
> >
> >     /* get ntp adjusted interval length*/
> >     interval_length = get_timesource_interval(ppm)
>
> Here starts the problem, this requires more expensive math than necessary,
> as every time you first have to scale the values.

Hmmm. I feel like we're mixing signals. See below for more on this.

> Let's take a standard PIT timer as an example. With HZ=100 we program it
> with 11932, for simplicity let's assume this corresponds to 10^7ns and
> scale this by 2^8. This means the timer multiplier is initially 214549,
> this updates the system time by 214549*11932 and the reference time by
> 10^7*2^8 every tick. We can now just ignore the error or as soon as it
> exceeds 11932/2 we increase/decrease the multiplier. The error calculation
> is rather simple, usually just adds and shifts, only if the error exceeds
> 2*11932 it gets a little more complicated, but even here the possible
> divide is avoidable.

I feel like we're talking about different problems. Which reference
clock (other than the system clock) are you wanting to increment at the
tick time? Do you mean the ntp time_offset value? A little bit of pseudo
code might go a long way in helping me understand your solution.

Also I'm not sure how this connects to the continuous timesource
situation where we cannot assume timer ticks are never lost or late.

> The gettimeofday would then basically be "xtime + (cycle_offset * mult +
> error_offset) / 2^8". Depending on the update frequency and the required
> precision it's even possible to keep this within 32bit. The ntp part stays
> pretty much the same and the time source can add anything it wants on top
> of that. The basic math is also pretty much the same so we can generate
> most of the code depending on various parameters.

Again, I must not be understanding what you're suggesting. Above where
you called get_timesource_interval(ppm) too expensive, what you're
suggesting here is almost exactly what get_timesource_interval(ppm)
would do.

In my timeofday patches, it's called cyc2ns() and gettimeofday looks
like:

        xtime + cyc2ns(timesource, ntp_adjustment, cycle_delta)

Where cyc2ns does:

        (cycle_delta * (timesource->mult + ntp_adjustment))>>timesource->shift

The reason why we calculate the interval_length in the continuous
timesource case is because we are not assuming anything about the
frequency that the timekeeping_periodic_hook() is called.

Again, I'm really wanting to address your concerns, but I still do not
really understand the specific objections.

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Fri, 19 Aug 2005, john stultz wrote:

> timekeeping_periodic_hook():
>
>     /* get ntp adjusted interval length*/
>     interval_length = get_timesource_interval(ppm)

Here starts the problem, this requires more expensive math than necessary,
as every time you first have to scale the values.
Let's take a standard PIT timer as an example. With HZ=100 we program it
with 11932, for simplicity let's assume this corresponds to 10^7ns and
scale this by 2^8. This means the timer multiplier is initially 214549,
this updates the system time by 214549*11932 and the reference time by
10^7*2^8 every tick. We can now just ignore the error or as soon as it
exceeds 11932/2 we increase/decrease the multiplier. The error calculation
is rather simple, usually just adds and shifts, only if the error exceeds
2*11932 it gets a little more complicated, but even here the possible
divide is avoidable.

The gettimeofday would then basically be "xtime + (cycle_offset * mult +
error_offset) / 2^8". Depending on the update frequency and the required
precision it's even possible to keep this within 32bit. The ntp part stays
pretty much the same and the time source can add anything it wants on top
of that. The basic math is also pretty much the same so we can generate
most of the code depending on various parameters.

bye, Roman
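The PIT figures above can be checked with plain integer arithmetic. The sketch below only reproduces the numbers given in the mail (11932 cycles per tick, 10^7 ns per tick, scaling by 2^8); the macro and function names are hypothetical, and the threshold test assumes the rule stated in the mail, that the multiplier is changed once the accumulated error exceeds 11932/2. That threshold makes sense because changing the multiplier by 1 changes the per-tick system step by exactly 11932 scaled units.

```c
#include <stdint.h>

#define PIT_CYCLES_PER_TICK 11932LL     /* PIT reload value at HZ=100 */
#define TICK_NS             10000000LL  /* assumed 10^7 ns per tick */
#define SCALE_SHIFT         8           /* values are scaled by 2^8 */

/* Reference time advance per tick: 10^7 * 2^8 = 2,560,000,000. */
static int64_t reference_step(void)
{
    return TICK_NS << SCALE_SHIFT;
}

/* Initial multiplier, truncated: 2,560,000,000 / 11932 = 214549. */
static int64_t initial_mult(void)
{
    return reference_step() / PIT_CYCLES_PER_TICK;
}

/* System time advance per tick: 214549 * 11932 = 2,559,998,668. */
static int64_t system_step(void)
{
    return initial_mult() * PIT_CYCLES_PER_TICK;
}

/* Scaled error gained per tick: 1332, i.e. about 5.2 ns. */
static int64_t error_per_tick(void)
{
    return reference_step() - system_step();
}

/* Ticks until the accumulated error first exceeds 11932/2. */
static int ticks_until_adjust(void)
{
    int64_t error = 0;
    int ticks = 0;

    while (error <= PIT_CYCLES_PER_TICK / 2) {
        error += error_per_tick();
        ticks++;
    }
    return ticks;
}
```

Under these assumptions the truncated multiplier runs slow by 1332 scaled units per tick, so the half-step threshold of 5966 is crossed on the fifth tick.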
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Fri, 2005-08-19 at 02:27 +0200, Roman Zippel wrote:
> On Tue, 16 Aug 2005, john stultz wrote:
> > Maybe to focus this productively, I'll try to step back and outline the
> > goals at a high level and you can address those.
> >
> > My Assumptions:
> > 1. adjtimex() sets/gets NTP state values
> > 2. Every tick we adjust those state values
> > 3. Every tick we use those values to make a nanosecond adjustment to
> > time.
> > 4. Those state values are otherwise unused.
> >
> > Goals:
> > 1. Isolate NTP code to clean up the tick based timekeeping, reducing the
> > spaghetti-like code interactions.
> > 2. Add interfaces to allow for continuous, rather than tick based,
> > adjustments (much how ppc64 does currently, only shareable).
>
> Cleaning up the code would be nice, but that shouldn't be the priority
> right now, first we should get the math right.
> I looked a bit more on this aspect of your patch and I think it's overly
> complex even for continuous time sources. You can reduce the complexity
> by updating the clock in more regular intervals.

I feel in some ways I do this (inside the second overflow loop), but
maybe I'm misunderstanding you.

> What basically is needed is to update in constant intervals (n cycles) a
> reference time controlled via NTP and the system time. The difference
> between those two can be used to adjust the cycle multiplier for the next
> n cycles to speed up or slow down the system clock.
> Calculating the offset in constant intervals makes the math a lot simpler,
> basically the current code is just a special case of that, where it
> directly updates the system time from the reference time at every tick.
> (In the end the differences between tick based and continuous sources may
> be even smaller than your current patches suggest. :) )

That would be great! So, would you mind helping me scratch out some
pseudo code for your idea?

Currently we have something like:
===
do_adjtimex():
    set ntp_status/maxerror/esterror/constant values
    set ntp_freq
    set ntp_tick
    if (singleshot_mode):
        set ntp_adjtime_offset
    else:
        set ntp_offset
        if appropriate, adjust ntp_freq

timer_interrupt():
    if (second_overflow):
        adjust ntp_maxerror/status
        /* calculate per tick phase adjustment
           using ntp_offset and ntp_freq */
        sub_offset = math(ntp_offset)
        ntp_offset -= sub_offset
        phase_adj = math(sub_offset)
        phase_adj += math(ntp_freq)
        leapsecond_stuff()

    tick_adjustment = 0;
    /* calculate singleshot adjustment */
    if (ntp_adjtime_offset):
        adj = min(ntp_adjtime_offset, tick_adj)
        ntp_adjtime_offset -= adj
        tick_adjustment += adj

    /* calculate the phase adjustment */
    phase += phase_adj
    if (phase > UNIT):
        phase -= UNIT
        tick_adjustment += UNIT

    xtime += ntp_tick + tick_adjustment

gettimeofday():
    return xtime + hardware_offset()


For continuous timesources, I'd like to see something like:
===
do_adjtimex():
    no changes, only the addition of
    ntp_tick_ppm = calculate_ppm(ntp_tick)

timekeeping_periodic_hook():

    /* get ntp adjusted interval length */
    interval_length = get_timesource_interval(ppm)

    /* accumulate the NTP adjusted interval */
    xtime += interval_length

    /* inform NTP state machine that we have
       applied the last calculated adjustment for
       the interval length */
    ntp_interval += interval_length
    while (ntp_interval > SECOND):

        /* just like second_overflow */
        adjust ntp_maxerror/status

        /* calculate the offset ppm adjustment */
        sub_offset = math(ntp_offset)
        ntp_offset -= sub_offset
        offset_ppm = math(sub_offset)

        /* same thing for single shot ntp_adjtime_offset */
        sub_ss_offset = math(ntp_adjtime_offset)
        ntp_adjtime_offset -= sub_ss_offset
        ss_offset_ppm = math(sub_ss_offset)

        /* sum up the ppm adjustments into a single ntp adjustment */
        ppm = offset_ppm + ntp_freq + ss_offset_ppm + ntp_tick_ppm

        leapsecond_stuff()

do_gettimeofday():
    interval = get_timesource_interval(ppm)
    return xtime + interval


Now could you adapt this to better show me what you're thinking of?

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Tue, 16 Aug 2005, john stultz wrote:

> If they are private clock variables, why are they in the generic
> timer.c? Everyone is using it in exactly the same way, no? Why do you
> oppose having the adjustment and phase values behind an ntp_function()
> interface?

These values belong to the clock, NTP specifies the speed of the clock via
tick/frequency, these variables specify the current state of the clock.
If we assume there is only one system clock, we need only one set of
those, so leaving them in timer.c doesn't really hurt.
How these variables are updated depends on the clock, so separating them
doesn't make much sense.

> Maybe to focus this productively, I'll try to step back and outline the
> goals at a high level and you can address those.
>
> My Assumptions:
> 1. adjtimex() sets/gets NTP state values
> 2. Every tick we adjust those state values
> 3. Every tick we use those values to make a nanosecond adjustment to
> time.
> 4. Those state values are otherwise unused.
>
> Goals:
> 1. Isolate NTP code to clean up the tick based timekeeping, reducing the
> spaghetti-like code interactions.
> 2. Add interfaces to allow for continuous, rather than tick based,
> adjustments (much how ppc64 does currently, only shareable).

Cleaning up the code would be nice, but that shouldn't be the priority
right now, first we should get the math right.
I looked a bit more on this aspect of your patch and I think it's overly
complex even for continuous time sources. You can reduce the complexity
by updating the clock in more regular intervals.
What basically is needed is to update in constant intervals (n cycles) a
reference time controlled via NTP and the system time. The difference
between those two can be used to adjust the cycle multiplier for the next
n cycles to speed up or slow down the system clock.
Calculating the offset in constant intervals makes the math a lot simpler,
basically the current code is just a special case of that, where it
directly updates the system time from the reference time at every tick.
(In the end the differences between tick based and continuous sources may
be even smaller than your current patches suggest. :) )

bye, Roman
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Roman Zippel wrote:
~

The thing that worries me about this function is that it does everything
in usec. We are using nsec in xtime now and I wonder if it would not be
more accurate to do the math in nsecs. Even tick size (tick_nsec) does
not translate well to usec, it currently being 999849 nsecs.

George

> ---
>
>  kernel/time.c  |    3 ++-
>  kernel/timer.c |   53 +
>  2 files changed, 55 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/kernel/time.c
> ===
> --- linux-2.6.orig/kernel/time.c        2005-07-13 03:18:04.0 +0200
> +++ linux-2.6/kernel/time.c     2005-08-16 01:37:20.0 +0200
> @@ -366,8 +366,9 @@ int do_adjtimex(struct timex *txc)
>              } /* txc->modes & ADJ_OFFSET */
>              if (txc->modes & ADJ_TICK) {
>                  tick_usec = txc->tick;
> -                tick_nsec = TICK_USEC_TO_NSEC(tick_usec);
>              }
> +            if (txc->modes & (ADJ_FREQUENCY|ADJ_OFFSET|ADJ_TICK))
> +                    time_recalc();
>          } /* txc->modes */
>  leave:  if ((time_status & (STA_UNSYNC|STA_CLOCKERR)) != 0
>              || ((time_status & (STA_PPSFREQ|STA_PPSTIME)) != 0
> Index: linux-2.6/kernel/timer.c
> ===
> --- linux-2.6.orig/kernel/timer.c       2005-07-13 03:18:04.0 +0200
> +++ linux-2.6/kernel/timer.c    2005-08-16 23:10:53.0 +0200
> @@ -559,6 +559,7 @@ found:
>   */
>  unsigned long tick_usec = TICK_USEC;   /* USER_HZ period (usec) */
>  unsigned long tick_nsec = TICK_NSEC;   /* ACTHZ period (nsec) */
> +unsigned long tick_nsec2 = TICK_NSEC;
>
>  /*
>   * The current time
> @@ -569,6 +570,7 @@ unsigned long tick_nsec = TICK_NSEC; /*
>   * the usual normalization.
>   */
>  struct timespec xtime __attribute__ ((aligned (16)));
> +struct timespec xtime2 __attribute__ ((aligned (16)));
>  struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
>
>  EXPORT_SYMBOL(xtime);
> @@ -596,6 +598,33 @@ static long time_adj; /* tick adjust (
>  long time_reftime;     /* time at last adjustment (s) */
>  long time_adjust;
>  long time_next_adjust;
> +static long time_adj2, time_adj2_cur, time_freq_adj2, time_freq_phase2, time_phase2;
> +
> +void time_recalc(void)
> +{
> +        long f, t;
> +        tick_nsec = TICK_USEC_TO_NSEC(tick_usec);

This leaves bits on the floor. Is it not possible to do this whole
calculation in nanoseconds? Currently, for example, tick_nsec is
999849...

> +
> +        t = time_freq >> (SHIFT_USEC + 8);
> +        if (t) {
> +                time_freq -= t << (SHIFT_USEC + 8);
> +                t *= 1000 << 8;
> +        }
> +        f = time_freq * 125;
> +        t += tick_usec * USER_HZ * 1000 + (f >> (SHIFT_USEC - 3));
> +        f &= (1 << (SHIFT_USEC - 3)) - 1;
> +        tick_nsec2 = t / HZ;
> +        f += (t % HZ) << (SHIFT_USEC - 3);
> +        f <<= 5;
> +        time_adj2 = f / HZ;
> +        time_freq_adj2 = f % HZ;
> +
> +        printk("tr: %ld.%09ld(%ld,%ld,%ld,%ld) - %ld.%09ld(%ld,%ld,%ld)\n",
> +                xtime.tv_sec, xtime.tv_sec,
> +                tick_nsec, time_freq, time_offset, time_next_adjust,
> +                xtime2.tv_sec, xtime2.tv_nsec,
> +                tick_nsec2, time_adj2, time_freq_adj2);
> +}
>
>  /*
>   * this routine handles the overflow of the microsecond field
> @@ -739,6 +768,16 @@ static void second_overflow(void)
>  #endif
>  }
>
> +static void second_overflow2(void)
> +{
> +        time_adj2_cur = time_adj2;
> +        time_freq_phase2 += time_freq_adj2;
> +        if (time_freq_phase2 > HZ) {
> +                time_freq_phase2 -= HZ;
> +                time_adj2_cur++;
> +        }
> +}
> +
>  /* in the NTP reference this is called "hardclock()" */
>  static void update_wall_time_one_tick(void)
>  {
> @@ -786,6 +825,20 @@ static void update_wall_time_one_tick(vo
>                  time_adjust = time_next_adjust;
>                  time_next_adjust = 0;
>          }
> +
> +        delta_nsec = tick_nsec2;
> +        time_phase2 += time_adj2_cur;
> +        if (time_phase2 >= (1 << (SHIFT_USEC + 2))) {
> +                long ltemp = time_phase2 >> (SHIFT_USEC + 2);
> +                time_phase2 -= ltemp << (SHIFT_USEC + 2);
> +                delta_nsec += ltemp;
> +        }
> +        xtime2.tv_nsec += delta_nsec;
> +        if (xtime2.tv_nsec >= NSEC_PER_SEC) {
> +                xtime2.tv_nsec -= NSEC_PER_SEC;
> +                xtime2.tv_sec++;
> +                second_overflow2();
> +        }
>  }
>
>  /*

--
George Anzinger george@mvista.com
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/
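The quoted patch avoids "leaving bits on the floor" in one place: time_recalc() splits a quantity f into f / HZ and f % HZ, and second_overflow2() carries the remainder forward in a phase variable instead of discarding it. The sketch below shows that general remainder-carrying technique in isolation. It is not Roman's code: `struct frac_acc` and its helpers are hypothetical names, it assumes a non-negative total, and it carries on `>=` where the patch compares with `>`.

```c
/* Hypothetical accumulator that spreads `total` evenly over `denom` steps. */
struct frac_acc {
    long quot;   /* whole part handed out on every step */
    long rem;    /* remainder of total / denom */
    long phase;  /* remainder accumulated so far, always < denom */
    long denom;
};

static void frac_init(struct frac_acc *a, long total, long denom)
{
    a->quot = total / denom;
    a->rem = total % denom;
    a->phase = 0;
    a->denom = denom;
}

/* Amount to apply on this step: quot, plus 1 whenever the phase wraps. */
static long frac_step(struct frac_acc *a)
{
    long out = a->quot;

    a->phase += a->rem;
    if (a->phase >= a->denom) {
        a->phase -= a->denom;
        out++;
    }
    return out;
}

/* Sum of the first `steps` step values for a given total and denom. */
static long frac_sum(long total, long denom, long steps)
{
    struct frac_acc a;
    long sum = 0;
    long i;

    frac_init(&a, total, denom);
    for (i = 0; i < steps; i++)
        sum += frac_step(&a);
    return sum;
}
```

Over a full period of `denom` steps the step values add up to `total` exactly, whereas applying only the truncated quotient would lose `total % denom` units per period.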
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 17 Aug 2005, Ulrich Windl wrote:

> whatever the implementation is, at some point there must exist an interface
> to get and set "normal time", free of any jumps and jitter. That "frontend
> time" will be used as a base of correction. Basically that means time should
> be as monotonic and jitter free as possible for any measurement interval you
> like.

The interpolator provides such a time as xtime + offset and will
self-tune to be as accurate as possible given fluctuations of the timer
interrupt. It will even adapt to NTP interval variations.
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On 16 Aug 2005 at 18:17, john stultz wrote:

[...]
> Maybe to focus this productively, I'll try to step back and outline the
> goals at a high level and you can address those.
>
> My Assumptions:
> 1. adjtimex() sets/gets NTP state values

One of the greatest mistakes in the past which still affects us now was
the decision to piggy-back ntp_adjtime and ntp_gettime on top of
adjtime() and thus creating adjtimex(). Only to save a system-call
number or two. WE REALLY SHOULD GET RID OF THAT going back to Linux
0.something.

> 2. Every tick we adjust those state values

... which require it.

> 3. Every tick we use those values to make a nanosecond adjustment to
> time.

...or even more frequent. In my code I tried to scale the tick
interpolation as well, thus effectively making adjustments even within
timer ticks (so far the theory...). I was assuming however that ticks
and interpolation clocks are derived from one single source and would
"float" the same way relative to each other.

> 4. Those state values are otherwise unused.

What is "otherwise"? Outside the "NTP clock model", or "between ticks"?

>
> Goals:
> 1. Isolate NTP code to clean up the tick based timekeeping, reducing the
> spaghetti-like code interactions.

First you need a new clock model that's compatible with NTP. Then you
can consider how to implement the NTP stuff. So the clock even without
NTP has to be strictly monotonic for any interval it is read, be it
nanoseconds, microseconds, milliseconds, seconds, minutes, hours, days,
... The clock delta (=increase of time) over time should be as constant
as possible (i.e. time shouldn't go up like stairs).

> 2. Add interfaces to allow for continuous, rather than tick based,
> adjustments (much how ppc64 does currently, only shareable).

Adjustments to the clock _model_ are asynchronous by definition, while
adjustments to the clock itself are, well, periodic. Whatever the
period.

Maybe this helps and can be agreed on.

Regards,
Ulrich
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On 16 Aug 2005 at 11:25, Christoph Lameter wrote:

> You mentioned that the NTP code has some issues with time interpolation
> at the KS. This is due to the NTP layer not being aware of actual time
> differences between timer interrupts that the interpolator knows about. If
> the NTP layer would be aware of the actual intervals measured by the
> timesource (or interpolator) then presumably time could be adjusted in a
> more accurate way.

Hi,

whatever the implementation is, at some point there must exist an
interface to get and set "normal time", free of any jumps and jitter.
That "frontend time" will be used as a base of correction. Basically
that means time should be as monotonic and jitter free as possible for
any measurement interval you like. Otherwise when extrapolating the
time-error, it (NTP) will try to overcompensate (or undercompensate),
making the whole thing unstable.

Here's a sample from some ancient NTP distribution (pre-nanosecond), but
you'll get the idea what to check:

more util/jitter.c
/*
 * This program can be used to calibrate the clock reading jitter of a
 * particular CPU and operating system. It first tickles every element
 * of an array, in order to force pages into memory, then repeatedly calls
 * gettimeofday() and, finally, writes out the time values for later
 * analysis. From this you can determine the jitter and if the clock ever
 * runs backwards.
 */
#include <sys/time.h>
#include <stdio.h>

#define NBUF 20002

void
main()
{
        struct timeval ts, tr;
        struct timezone tzp;
        long temp, j, i, gtod[NBUF];

        gettimeofday(&ts, &tzp);

        /*
         * Force pages into memory
         */
        for (i = 0; i < NBUF; i ++)
                gtod[i] = 0;

        /*
         * Construct gtod array
         */
        for (i = 0; i < NBUF; i ++) {
                gettimeofday(&tr, &tzp);
                gtod[i] = (tr.tv_sec - ts.tv_sec) * 1000000 + tr.tv_usec;
        }

        /*
         * Write out gtod array for later processing with S
         */
        for (i = 0; i < NBUF - 2; i++) {
                /*
                printf("%lu\n", gtod[i]);
                */
                gtod[i] = gtod[i + 1] - gtod[i];
                printf("%lu\n", gtod[i]);
        }

        /*
         * Sort the gtod array and display deciles
         */
        for (i = 0; i < NBUF - 2; i++) {
                for (j = 0; j <= i; j++) {
                        if (gtod[j] > gtod[i]) {
                                temp = gtod[j];
                                gtod[j] = gtod[i];
                                gtod[i] = temp;
                        }
                }
        }
        fprintf(stderr, "First rank\n");
        for (i = 0; i < 10; i++)
                fprintf(stderr, "%10ld%10ld\n", i, gtod[i]);
        fprintf(stderr, "Last rank\n");
        for (i = NBUF - 12; i < NBUF - 2; i++)
                fprintf(stderr, "%10ld%10ld\n", i, gtod[i]);
}

Regards,
Ulrich
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-17 at 02:28 +0200, Roman Zippel wrote:
> Let's look at the example patch below. I played a little with some code
> and this just demonstrates an accurate conversion of the tick/freq values
> into the internal values in ns resolution. It does a little more work
> ahead, but the interrupt code becomes simpler and most important it
> doesn't require any expensive 64bit math and you can't get it much more
> accurate than that. The current gettimeofday code for tick based sources
> is really cheap and I'd like to keep that (i.e. free of 64bit math). The
> accuracy can and should be fixed (the change to timespec wasn't really a
> major improvement, as it introduced new rounding errors).

Hmm. It could really use some comments, but it looks interesting. Let me
continue reading it and play around with it some more.

> The other thing the example demonstrates is the interface from NTP to
> timer code. The NTP code provides the basic parameters and then leaves it
> to the clock implementation how they apply. The adjustment and phase
> variables are really private variables.

If they are private clock variables, why are they in the generic
timer.c? Everyone is using it in exactly the same way, no? Why do you
oppose having the adjustment and phase values behind an ntp_function()
interface?

Maybe to focus this productively, I'll try to step back and outline the
goals at a high level and you can address those.

My Assumptions:
1. adjtimex() sets/gets NTP state values
2. Every tick we adjust those state values
3. Every tick we use those values to make a nanosecond adjustment to
   time.
4. Those state values are otherwise unused.

Goals:
1. Isolate NTP code to clean up the tick based timekeeping, reducing the
   spaghetti-like code interactions.
2. Add interfaces to allow for continuous, rather than tick based,
   adjustments (much how ppc64 does currently, only shareable).

thanks
-john
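The four assumptions above can be pictured as a small interface that hides the NTP state behind functions. What follows is a rough sketch under stated assumptions, not code from either patch: the names ntp_sketch, ntp_sketch_set() and ntp_sketch_advance() are invented here, the 500 ppm slew cap is an illustrative constant, and the 64-bit arithmetic is used for clarity even though avoiding it is one of Roman's stated concerns.

```c
#include <stdint.h>

/* Illustrative cap on how fast an offset is slewed out (500 ppm). */
#define SKETCH_MAX_SLEW_PPB 500000

/* Hypothetical NTP state, only touched through the functions below. */
struct ntp_sketch {
	int64_t freq_ppb;	/* frequency correction, parts per billion */
	int64_t offset_ns;	/* remaining one-shot offset to slew out */
};

/* Assumption 1: an adjtimex()-like call sets the state values. */
static void ntp_sketch_set(struct ntp_sketch *s, int64_t freq_ppb,
			   int64_t offset_ns)
{
	s->freq_ppb = freq_ppb;
	s->offset_ns = offset_ns;
}

/*
 * Assumptions 2 and 3: advance the state by interval_ns and return the
 * nanosecond adjustment to apply for that interval. The interval can be
 * one tick or any other length, which is what would allow continuous
 * (non tick based) callers.
 */
static int64_t ntp_sketch_advance(struct ntp_sketch *s, int64_t interval_ns)
{
	int64_t freq_adj = interval_ns * s->freq_ppb / 1000000000;
	int64_t max_slew = interval_ns * SKETCH_MAX_SLEW_PPB / 1000000000;
	int64_t slew = s->offset_ns;

	if (slew > max_slew)
		slew = max_slew;
	if (slew < -max_slew)
		slew = -max_slew;

	s->offset_ns -= slew;
	return freq_adj + slew;
}
```

With a 1 ms interval, a +100 ppm frequency correction and a pending offset of 1500 ns, each of the first three calls returns 600 ns (100 ns of frequency correction plus 500 ns of slew), after which only the 100 ns frequency term remains.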
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Mon, 15 Aug 2005, john stultz wrote:

> > Please provide the right abstractions, e.g. leave the gettimeofday
> > implementation to the timesource and use some helper (template) functions
> > to do the actual work. Basically as long as you have a cycle_t in the
> > common code something is really wrong, this code belongs in the continuous
> > clock template.
>
> I'm not sure I agree. By pushing all the gettimeofday logic behind the
> timesource or clock class you describe above, you end up with lots of
> duplicated and error prone code.

That's where template code comes in, so that the drivers need only to add
little code themselves. The point is that every time source has different
requirements and if we want efficient implementations, too much common code
doesn't really help. It's a tradeoff.

Let's look at the example patch below. I played a little with some code
and this just demonstrates an accurate conversion of the tick/freq values
into the internal values in ns resolution. It does a little more work
ahead, but the interrupt code becomes simpler and most important it
doesn't require any expensive 64bit math and you can't get it much more
accurate than that. The current gettimeofday code for tick based sources
is really cheap and I'd like to keep that (i.e. free of 64bit math). The
accuracy can and should be fixed (the change to timespec wasn't really a
major improvement, as it introduced new rounding errors).

The other thing the example demonstrates is the interface from NTP to
timer code. The NTP code provides the basic parameters and then leaves it
to the clock implementation how they apply. The adjustment and phase
variables are really private variables.

In the code below it's rather easily possible to make HZ another parameter
and you can have clocks running at different frequencies (e.g. to implement
dynamic ticks). A low frequency timer provides the wall clock and a separate
timer takes care of the kernel timer.
The code below needs of course a little more work, currently I use it to
collect some data on how the current code behaves. I'll add the adjustment
code and then I'll see how it compares to it.

> > This also allows better implementations, e.g. gettimeofday can be done in
> > a single step instead of two using a single lock instead of two.
>
> This is a mischaracterization. In most cases the continuous
> gettimeofday is done in a single step with a single lock. Although it
> does have the flexibility to allow for more complex setups, as well as
> the ability to opt out and use the existing tick based code.

You have it the wrong way around. In the general case you need two locks
and only in some cases can you optimize one away. To evaluate the
complexity of the design you really have to look at the general case for
each component. You're rather focused on just the best cases.

bye, Roman

---
 kernel/time.c  |    3 ++-
 kernel/timer.c |   53 +
 2 files changed, 55 insertions(+), 1 deletion(-)

Index: linux-2.6/kernel/time.c
===
--- linux-2.6.orig/kernel/time.c 2005-07-13 03:18:04.0 +0200
+++ linux-2.6/kernel/time.c 2005-08-16 01:37:20.0 +0200
@@ -366,8 +366,9 @@ int do_adjtimex(struct timex *txc)
 	} /* txc->modes & ADJ_OFFSET */
 	if (txc->modes & ADJ_TICK) {
 		tick_usec = txc->tick;
-		tick_nsec = TICK_USEC_TO_NSEC(tick_usec);
 	}
+	if (txc->modes & (ADJ_FREQUENCY|ADJ_OFFSET|ADJ_TICK))
+		time_recalc();
 	} /* txc->modes */
 leave:	if ((time_status & (STA_UNSYNC|STA_CLOCKERR)) != 0
 	    || ((time_status & (STA_PPSFREQ|STA_PPSTIME)) != 0
Index: linux-2.6/kernel/timer.c
===
--- linux-2.6.orig/kernel/timer.c 2005-07-13 03:18:04.0 +0200
+++ linux-2.6/kernel/timer.c 2005-08-16 23:10:53.0 +0200
@@ -559,6 +559,7 @@ found:
  */
 unsigned long tick_usec = TICK_USEC;		/* USER_HZ period (usec) */
 unsigned long tick_nsec = TICK_NSEC;		/* ACTHZ period (nsec) */
+unsigned long tick_nsec2 = TICK_NSEC;
 
 /*
  * The current time
@@ -569,6 +570,7 @@ unsigned long tick_nsec = TICK_NSEC;/*
  * the usual normalization.
  */
 struct timespec xtime __attribute__ ((aligned (16)));
+struct timespec xtime2 __attribute__ ((aligned (16)));
 struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
 
 EXPORT_SYMBOL(xtime);
@@ -596,6 +598,33 @@ static long time_adj; /* tick adjust (
 long time_reftime;			/* time at last adjustment (s) */
 long time_adjust;
 long time_next_adjust;
+static long time_adj2, time_adj2_cur, time_freq_adj2, time_freq_phase2, time_phase2;
+
+void time_recalc(void)
+{
+	long f, t;
+	tick_nsec = TICK_USEC_TO_NSEC(tick_usec);
+
+
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 16 Aug 2005, john stultz wrote:

> That is why I'm suggesting time_interpolator users to move to my code
> (when they're ready, of course :).

Both are basically timesources. That is why I would suggest you upgrade
the interpolators to timesources. Doing that would enable a gradual
transition instead of a cutover to a new time subsystem. It should also
ensure that the gains we have made in terms of accuracy of time will be
preserved in the new system. And the code would be able to use the
existing proven code that already allows system time with nanosecond
precision.
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-16 at 17:14 -0700, Christoph Lameter wrote:
> On Tue, 16 Aug 2005, john stultz wrote:
>
> > This is basically what I do in my patch. I directly apply the NTP
> > adjustment to the timesource interval, and periodically increment the
> > NTP state machine by the timesource interval when we accumulate it.
>
> Is there some way to tell the NTP code how much the time_interpolator time
> deviates from xtime?
>
> If the NTP code would use getnstimeofday or
> do_gettimeofday then it would already get interpolated time.

That seems a bit backwards, no?

> The curious issue in the current arrangement is that the interpolator
> knows much more accurately how much time has passed between interrupts
> than the timer interrupt but it has no time to make that information
> available to the NTP code.

That is why I'm suggesting time_interpolator users to move to my code
(when they're ready, of course :).

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 16 Aug 2005, john stultz wrote:

> This is basically what I do in my patch. I directly apply the NTP
> adjustment to the timesource interval, and periodically increment the
> NTP state machine by the timesource interval when we accumulate it.

Is there some way to tell the NTP code how much the time_interpolator time
deviates from xtime?

If the NTP code would use getnstimeofday or do_gettimeofday then it would
already get interpolated time.

The curious issue in the current arrangement is that the interpolator
knows much more accurately how much time has passed between interrupts
than the timer interrupt but it has no time to make that information
available to the NTP code.
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-16 at 11:25 -0700, Christoph Lameter wrote:
> You mentioned that the NTP code has some issues with time interpolation
> at the KS. This is due to the NTP layer not being aware of actual time
> differences between timer interrupts that the interpolator knows about.

My understanding of the issue was that when NTP makes an adjustment, it
only affects xtime, and any difference between the adjusted time and the
interpolator's time was just accumulated in the interpolator's offset.
That then, to my understanding, required the bit about adjusting the
interpolator frequency to be slower than what we expect so negative
offsets can be applied.

Looking at it closer, it may very well work, but it does seem to be
addressing the issue somewhat indirectly.

> If the NTP layer would be aware of the actual intervals measured by the
> timesource (or interpolator) then presumably time could be adjusted in a
> more accurate way.

This is basically what I do in my patch. I directly apply the NTP
adjustment to the timesource interval, and periodically increment the
NTP state machine by the timesource interval when we accumulate it.

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Mon, 15 Aug 2005, john stultz wrote:

> Sorry. It was subtle, but after thinking more about your arguments, I've
> stepped back from my earlier goals of replacing the timekeeping code for
> all arches and instead I've decided to just focus on allowing
> architectures that would duplicate code using a continuous timesource
> use a common code base.

That's great!

> Think of it more as a replacement for the time_interpolator code (which
> thanks to Christoph Lameter, it is quite influenced by).

I have no objection to replacing the time_interpolator code if the
timesources provide a superset of functionality. Rename time_interpolator
to timesource (including all currently existing interpolator definitions,
which will become time sources) and modify/add fields to be able to
satisfy your requirements.

The interpolator compensations may become unnecessary if the upper layers
can deal with discrepancies between timer interrupts and actual intervals
occurring between these interrupts and if the upper layer can adjust the
time source in use.

You mentioned that the NTP code has some issues with time interpolation
at the KS. This is due to the NTP layer not being aware of actual time
differences between timer interrupts that the interpolator knows about. If
the NTP layer would be aware of the actual intervals measured by the
timesource (or interpolator) then presumably time could be adjusted in a
more accurate way.
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Tue, 2005-08-16 at 00:14 +0200, Roman Zippel wrote:
> On Wed, 10 Aug 2005, john stultz wrote:
>
> > Here's the next rev in my rework of the current timekeeping subsystem.
> > No major changes, only some cleanups and further splitting the larger
> > patches into smaller ones.
> >
> > The goal of this patch set is to provide a simplified and streamlined
> > common timekeeping infrastructure that architectures can optionally use
> > to avoid duplicating code with other architectures.
>
> It's still the same old abstraction. Let's try it in OO terms why it's the
> wrong one. What basically is needed is something like this:
>
> base clock -> continuous clock -> clock implementation
>            -> tick clock       ->
>
> The base clock class is an abstract class which provides the basic time
> services like get time, set time...
> The continuous clock class and tick clock class are also abstract classes,
> but provide basic template functions, which can be used by the actual
> implementations to do most of the work.
>
> What you do with your patches is to just provide an abstract class for
> continuous clocks and tick based clocks have to emulate a continuous clock.

Sorry. It was subtle, but after thinking more about your arguments, I've
stepped back from my earlier goals of replacing the timekeeping code for
all arches and instead I've decided to just focus on allowing
architectures that would duplicate code using a continuous timesource
use a common code base.

Think of it more as a replacement for the time_interpolator code (which
thanks to Christoph Lameter, it is quite influenced by).

So in that way the "abstract class" will just be the current interface
of:
1. do_gettimeofday()
2. do_settimeofday()
3. getnstimeofday()
4. periodic hook (update_wall_time)
5. init code

To that I'd like to add
6. do_monotonic_clock()
which I've just added an implementation of for tick based systems.

Then in the tick based class, nothing changes (except for the new
do_monotonic_clock implementation).

And in the continuous timesource class, it uses my generic-tod code.

> Please provide the right abstractions, e.g. leave the gettimeofday
> implementation to the timesource and use some helper (template) functions
> to do the actual work. Basically as long as you have a cycle_t in the
> common code something is really wrong, this code belongs in the continuous
> clock template.

I'm not sure I agree. By pushing all the gettimeofday logic behind the
timesource or clock class you describe above, you end up with lots of
duplicated and error prone code. That's the issue I'm trying to avoid
between the different arches.

Additionally, the current i386 timer_opts code (which I'm to blame for)
does almost exactly this at the timesource level, and while it did allow
for alternate timesources to be easily used, it caused a large amount of
almost duplicate code with slightly differing behavior, and has made
changes like dynamic ticks difficult to do correctly.

It was for this reason (along with Christoph's proddings, due to the
fsyscall requirements) that the timesource structure only provides an
abstraction to a free running counter instead of a stateful structure
with function pointers that return the timeofday.

Now, this does not limit any arch from doing their own thing and
implementing their own "timeofday abstract class". I'm just trying to
provide a correct and clean infrastructure for the arches that could use
a continuous timesource.

> This also allows better implementations, e.g. gettimeofday can be done in
> a single step instead of two using a single lock instead of two.

This is a mischaracterization. In most cases the continuous
gettimeofday is done in a single step with a single lock. Although it
does have the flexibility to allow for more complex setups, as well as
the ability to opt out and use the existing tick based code.

thanks
-john
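The design argued for above, where a timesource is nothing more than a free running counter and the time-of-day logic lives once in common code, can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch set: the struct and function names are invented, the mult/shift scaling is one common way to convert cycles to nanoseconds, locking is left out entirely, and fake_counter_read() stands in for a hardware counter.

```c
#include <stdint.h>

typedef uint64_t cycle_t;

/* Hypothetical, simplified timesource: just a counter plus scaling. */
struct timesource_sketch {
	cycle_t (*read)(void);	/* read the free running counter */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/* Time-of-day state owned by the common code, not by the timesource. */
struct timekeeper_sketch {
	const struct timesource_sketch *ts;
	cycle_t last_cycles;	/* counter value at the last accumulation */
	uint64_t base_ns;	/* time (ns) accumulated up to last_cycles */
};

/* Convert a cycle delta to ns; overflows for very large deltas. */
static uint64_t cyc2ns(const struct timesource_sketch *ts, cycle_t cycles)
{
	return (cycles * ts->mult) >> ts->shift;
}

/* Periodic hook: fold the elapsed interval into base_ns. */
static void sketch_accumulate(struct timekeeper_sketch *tk)
{
	cycle_t now = tk->ts->read();

	tk->base_ns += cyc2ns(tk->ts, now - tk->last_cycles);
	tk->last_cycles = now;
}

/* Generic read path: accumulated time plus the not yet accumulated part. */
static uint64_t sketch_gettime_ns(const struct timekeeper_sketch *tk)
{
	cycle_t now = tk->ts->read();

	return tk->base_ns + cyc2ns(tk->ts, now - tk->last_cycles);
}

/* Stand-in counter for experiments; a real one would read hardware. */
static cycle_t fake_counter_value;

static cycle_t fake_counter_read(void)
{
	return fake_counter_value;
}
```

Because the timesource carries no time-of-day state of its own, adding a new counter only means supplying read(), mult and shift; the objection in the thread is that this fixed shape may not suit every clock equally well.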
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Wed, 10 Aug 2005, john stultz wrote:

> Here's the next rev in my rework of the current timekeeping subsystem.
> No major changes, only some cleanups and further splitting the larger
> patches into smaller ones.
>
> The goal of this patch set is to provide a simplified and streamlined
> common timekeeping infrastructure that architectures can optionally use
> to avoid duplicating code with other architectures.

It's still the same old abstraction. Let's try it in OO terms why it's the
wrong one. What basically is needed is something like this:

base clock -> continuous clock -> clock implementation
           -> tick clock       ->

The base clock class is an abstract class which provides the basic time
services like get time, set time...
The continuous clock class and tick clock class are also abstract classes,
but provide basic template functions, which can be used by the actual
implementations to do most of the work.

What you do with your patches is to just provide an abstract class for
continuous clocks and tick based clocks have to emulate a continuous clock.
Please provide the right abstractions, e.g. leave the gettimeofday
implementation to the timesource and use some helper (template) functions
to do the actual work. Basically as long as you have a cycle_t in the
common code something is really wrong, this code belongs in the continuous
clock template.
This also allows better implementations, e.g. gettimeofday can be done in
a single step instead of two using a single lock instead of two.

bye, Roman
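The class hierarchy described above maps onto C as a struct of function pointers (the base clock) embedded in two "template" structs. The sketch below is only one possible reading of that description: every name in it is hypothetical, none of it comes from the posted patches, and locking is ignored.

```c
#include <stdint.h>

/* "Base clock": the abstract time services every clock provides. */
struct clock_sketch {
	uint64_t (*get_time_ns)(struct clock_sketch *clk);
	void (*set_time_ns)(struct clock_sketch *clk, uint64_t ns);
};

/* "Tick clock" template: time only moves when a tick is delivered. */
struct tick_clock_sketch {
	struct clock_sketch base;	/* first member, so casts are valid */
	uint64_t now_ns;
	uint32_t tick_ns;
};

static uint64_t tick_get(struct clock_sketch *clk)
{
	return ((struct tick_clock_sketch *)clk)->now_ns;
}

static void tick_set(struct clock_sketch *clk, uint64_t ns)
{
	((struct tick_clock_sketch *)clk)->now_ns = ns;
}

static void tick_clock_init(struct tick_clock_sketch *tc, uint32_t tick_ns)
{
	tc->base.get_time_ns = tick_get;
	tc->base.set_time_ns = tick_set;
	tc->now_ns = 0;
	tc->tick_ns = tick_ns;
}

/* Called from the (hypothetical) timer interrupt. */
static void tick_clock_tick(struct tick_clock_sketch *tc)
{
	tc->now_ns += tc->tick_ns;
}

/* "Continuous clock" template: cycle counts stay private to it. */
struct cont_clock_sketch {
	struct clock_sketch base;	/* first member, so casts are valid */
	uint64_t (*read_cycles)(void);	/* supplied by the implementation */
	uint64_t base_ns, base_cycles;
	uint32_t ns_per_cycle;
};

static uint64_t cont_get(struct clock_sketch *clk)
{
	struct cont_clock_sketch *cc = (struct cont_clock_sketch *)clk;

	return cc->base_ns +
	       (cc->read_cycles() - cc->base_cycles) * cc->ns_per_cycle;
}

static void cont_set(struct clock_sketch *clk, uint64_t ns)
{
	struct cont_clock_sketch *cc = (struct cont_clock_sketch *)clk;

	cc->base_ns = ns;
	cc->base_cycles = cc->read_cycles();
}

static void cont_clock_init(struct cont_clock_sketch *cc,
			    uint64_t (*read_cycles)(void),
			    uint32_t ns_per_cycle)
{
	cc->base.get_time_ns = cont_get;
	cc->base.set_time_ns = cont_set;
	cc->read_cycles = read_cycles;
	cc->ns_per_cycle = ns_per_cycle;
	cc->base_ns = 0;
	cc->base_cycles = read_cycles();
}

/* Stand-in counter so the continuous clock can be tried out. */
static uint64_t demo_cycles;

static uint64_t demo_read_cycles(void)
{
	return demo_cycles;
}
```

Callers only ever see struct clock_sketch, so a tick clock never has to pretend to be a continuous one, and the cycle arithmetic stays inside the continuous template rather than in common code.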
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On 10 Aug 2005 at 22:32, Lee Revell wrote:

> On Wed, 2005-08-10 at 19:13 -0700, john stultz wrote:
> > All,
> > Here's the next rev in my rework of the current timekeeping subsystem.
> > No major changes, only some cleanups and further splitting the larger
> > patches into smaller ones.
>
> Last I heard this made gettimeofday() 20% slower on x86. Is this still
> the case?

If it's only 20% for an increase in resolution of 10%, it's quite good ;-)

Regards,
Ulrich

>
> Lee
>
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-10 at 19:39 -0700, john stultz wrote:
> Ah, I've got a patch on my laptop that takes that down to ~2% or less.
> I didn't include it in this patch set but I'll work to get it
> integrated before the next release. Sorry about that.
>
> If you have any suggestions for further performance improvements,
> please let me know.

2% sounds reasonable to me. I just don't think we can afford 20%
because so many stupid apps bang on gettimeofday() constantly.

Lee
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-10 at 22:32 -0400, Lee Revell wrote:
> On Wed, 2005-08-10 at 19:13 -0700, john stultz wrote:
> > All,
> > Here's the next rev in my rework of the current timekeeping subsystem.
> > No major changes, only some cleanups and further splitting the larger
> > patches into smaller ones.
>
> Last I heard this made gettimeofday() 20% slower on x86. Is this still
> the case?

Ah, I've got a patch on my laptop that takes that down to ~2% or less.
I didn't include it in this patch set but I'll work to get it
integrated before the next release. Sorry about that.

If you have any suggestions for further performance improvements,
please let me know.

thanks
-john
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
On Wed, 2005-08-10 at 19:13 -0700, john stultz wrote:
> All,
> Here's the next rev in my rework of the current timekeeping subsystem.
> No major changes, only some cleanups and further splitting the larger
> patches into smaller ones.

Last I heard this made gettimeofday() 20% slower on x86. Is this still
the case?

Lee
[RFC - 0/9] Generic timekeeping subsystem (v. B5)
All,
Here's the next rev in my rework of the current timekeeping subsystem.
No major changes, only some cleanups and further splitting the larger
patches into smaller ones.

The goal of this patch set is to provide a simplified and streamlined
common timekeeping infrastructure that architectures can optionally use
to avoid duplicating code with other architectures.

This generic timekeeping subsystem is designed around systems that have
continuous timesources to ensure correctness and avoid interpolation
errors. Additionally, it allows the timekeeping to function correctly
independently of timer interrupts.

For systems that do not have a continuous timesource, no changes are
necessary; the existing tick-based timekeeping still remains. This code
just avoids needless duplication in the arches that do.

For another description of the rework, see here:
http://lwn.net/Articles/120850/
(Many thanks to the LWN team for that easy to understand writeup!)

I'd like to thank the following people who have contributed ideas,
criticism, testing and code that has helped shape this work:
George Anzinger, Nish Aravamudan, Max Asbock, Dominik Brodowski, Darren
Hart, Christoph Lameter, Matt Mackal, Keith Mannthey, Ingo Oeser, Martin
Schwidefsky, Frank Sorenson, Ulrich Windl, Darrick Wong, Roman Zippel
and any others whom I've accidentally forgotten.

thanks
-john