Re: FW: [RFC] A more general timeout specification
On Thu, Sep 01, 2005 at 04:32:49PM +0200, Roman Zippel wrote:
> On Thu, 1 Sep 2005, Joe Korty wrote:
> > Kernel time sucks. It is just a single clock, it may not have
> > the attributes of the clock that the user really wished to use.
>
> Wrong. The kernel time is simple and effective for almost all users.
> We are talking about _timeouts_ here, what fancy "attributes" does that
> need that are just not overkill?

The name should be changed from 'struct timeout' to something like
'struct timeevent'.

Joe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: FW: [RFC] A more general timeout specification
Hi,

On Thu, 1 Sep 2005, Joe Korty wrote:

> On Thu, Sep 01, 2005 at 11:19:51AM +0200, Roman Zippel wrote:
>
> > You still didn't explain what's the point in choosing
> > different clock sources for a _timeout_.
>
> Well, if CLOCK_REALTIME is set forward by a minute,
> timers & timeout specified against that clock will expire
> a minute earlier than expected.

That just rather suggests that the pthread API is broken as usual.
(No other possible user was mentioned so far.)

> That doesn't happen with
> CLOCK_MONOTONIC. Applications should have the ability
> to select what they want to happen in this case (ie,
> whether the timeout/timer has to happen at a particular
> wall-clock time, say 2pm, or if the interval aspects of
> the timer/timeout are more important). Applications
> get this if they have the ability to specify the clock
> their timer or timeout is specified against.

So set up a timer that goes off at that time and interrupts the
operation. There is no need to overload the operation itself with an
overly complex timeout specification.

> The purpose of CLOCK_MONOTONIC is to provide an even,
> unchanging progression of advancing time. That is, any two
> intervals on this time-line of the same measured length
> actually represent, as close as possible, the same length
> of time.
>
> CLOCK_MONOTONIC should get adjustments only to bring its
> frequency back into line (but currently gets more than this
> in Linux). CLOCK_REALTIME should and does get adjustments
> for frequency and then gets further, temporary speedups
> or slowdown to bring its absolute value back into line.

That would make a rather useless CLOCK_MONOTONIC. The basic problem is
that it would be very hard to specify the time without exactly knowing
its frequency: the larger the time difference, the larger the time skew
would be compared to CLOCK_REALTIME, and without an atomic clock in your
computer you have no way of knowing which one is "real".
So in practice it's easier to advance CLOCK_MONOTONIC/CLOCK_REALTIME
equally and only apply time jumps to CLOCK_REALTIME.

bye, Roman
Re: FW: [RFC] A more general timeout specification
On Thu, 2005-09-01 at 16:32 +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 1 Sep 2005, Joe Korty wrote:
>
> > > When you convert a user time to kernel time you can
> > > automatically validate
> >
> > Kernel time sucks. It is just a single clock, it may not have
> > the attributes of the clock that the user really wished to use.
>
> Wrong. The kernel time is simple and effective for almost all users.
> We are talking about _timeouts_ here, what fancy "attributes" does that
> need that are just not overkill?

How do you feel about posix clocks?

Daniel
Re: FW: [RFC] A more general timeout specification
On Thu, 2005-09-01 at 16:32 +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 1 Sep 2005, Joe Korty wrote:
>
> > > When you convert a user time to kernel time you can
> > > automatically validate
> >
> > Kernel time sucks. It is just a single clock, it may not have
> > the attributes of the clock that the user really wished to use.
>
> Wrong. The kernel time is simple and effective for almost all users.
> We are talking about _timeouts_ here, what fancy "attributes" does that
> need that are just not overkill?

Or rather, posix timers?

Daniel
Re: FW: [RFC] A more general timeout specification
Hi,

On Thu, 1 Sep 2005, Joe Korty wrote:

> > When you convert a user time to kernel time you can
> > automatically validate
>
> Kernel time sucks. It is just a single clock, it may not have
> the attributes of the clock that the user really wished to use.

Wrong. The kernel time is simple and effective for almost all users.
We are talking about _timeouts_ here, what fancy "attributes" does that
need that are just not overkill?

bye, Roman
Re: FW: [RFC] A more general timeout specification
On Thu, Sep 01, 2005 at 01:50:33AM +0200, Roman Zippel wrote:
> When you convert a user time to kernel time you can
> automatically validate

Kernel time sucks. It is just a single clock, it may not have
the attributes of the clock that the user really wished to use.

Joe
Re: FW: [RFC] A more general timeout specification
On Thu, Sep 01, 2005 at 11:22:32AM +0200, Roman Zippel wrote:
> For a timeout? Please get real.
> If you need more precision, use a dedicated timer API, but don't make the
> general case more complex for the 99.99% of other users.

Struct timeout is just a struct timespec + a bit for absolute/relative
+ a field for clock specification. What's so complex about that?
It captures everything needed to specify time, from here to the
end of time.

Joe
Re: FW: [RFC] A more general timeout specification
On Thu, Sep 01, 2005 at 11:19:51AM +0200, Roman Zippel wrote:
> You still didn't explain what's the point in choosing
> different clock sources for a _timeout_.

Well, if CLOCK_REALTIME is set forward by a minute,
timers & timeouts specified against that clock will expire
a minute earlier than expected. That doesn't happen with
CLOCK_MONOTONIC. Applications should have the ability
to select what they want to happen in this case (ie,
whether the timeout/timer has to happen at a particular
wall-clock time, say 2pm, or if the interval aspects of
the timer/timeout are more important). Applications
get this if they have the ability to specify the clock
their timer or timeout is specified against.

Also ... (I am going off the deep end here) ...

The purpose of CLOCK_REALTIME is to track wall clock time.
That means it can be sped up, slowed down, or even be
force-fed a new time to make it match.

The purpose of CLOCK_MONOTONIC is to provide an even,
unchanging progression of advancing time. That is, any two
intervals on this time-line of the same measured length
actually represent, as close as possible, the same length
of time.

CLOCK_MONOTONIC should get adjustments only to bring its
frequency back into line (but currently gets more than this
in Linux). CLOCK_REALTIME should and does get adjustments
for frequency and then gets further, temporary speedups
or slowdown to bring its absolute value back into line.

Note that there is no need for the two clocks to track each
other in any way, as Linux currently goes to lengths to do.

I know Linux does not implement the above definition of
CLOCK_MONOTONIC; however, I would like an interface where,
if the day comes when time is properly handled, applications
can take advantage of it.

Joe
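The "set forward by a minute" case above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name elapsed_until_expiry() and the starting clock values are hypothetical and not taken from the patch. Both clocks tick once per simulated second, and only the wall clock is stepped.

```c
/*
 * Toy model: two clocks advance one tick per simulated second.
 * At second `step_at` the wall clock (realtime) is stepped forward by
 * `step_s` seconds; the monotonic clock is never stepped.
 * Returns how many true seconds pass before a timeout of `timeout_s`
 * seconds expires when measured against the chosen clock.
 * Assumes timeout_s >= 0.
 */
long elapsed_until_expiry(int against_realtime, long timeout_s,
                          long step_at, long step_s)
{
    long mono = 1000;        /* arbitrary starting value */
    long real = 1125584000;  /* arbitrary wall-clock starting value */
    long deadline = (against_realtime ? real : mono) + timeout_s;

    for (long t = 0; ; t++) {
        if ((against_realtime ? real : mono) >= deadline)
            return t;
        mono++;
        real++;
        if (t + 1 == step_at)
            real += step_s;  /* administrator sets the wall clock forward */
    }
}
```

With a 300 second timeout and a 60 second forward step, the timeout measured against the wall clock fires after 240 true seconds, while the one measured against the monotonic clock fires after the full 300.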
RE: FW: [RFC] A more general timeout specification
Hi,

On Thu, 1 Sep 2005, Perez-Gonzalez, Inaky wrote:

> >You still didn't explain what's the point in choosing different clock
> >sources for a _timeout_.
>
> The same reasons that compel to have CLOCK_REALTIME or
> CLOCK_MONOTONIC, for example. Or the need to time out on a
> high resolution clock.
>
> A certain application might have a need for a 10ms timeout,
> but another one might have it on 100us--modern CPUs make that
> more than possible. The precision of your time source permeates
> to the precision of your timeout.

Please give me a realistic and non-broken example.
We can add lots of stuff to the kernel, because it _might_ be needed, but
we (usually) don't if it hurts the general case, just adds bloat and
userspace can achieve the same thing via different means.

> [of course, now at the end it is still kernel time, but the
> ongoing revamp work on timers will change some of that, one
> way or another].

That doesn't mean it has to be exported via every single kernel API
that allows a time to be specified.

> >You didn't answer my other question, let's assume we add such a timeout
> >structure, what's wrong with converting it to kernel time (which would
> >automatically validate it).
>
> And again, that's what at the end this API is doing, converting it to
> kernel time.

No, it's not doing this at the validation point.

> Give it a more "human" specification (timespec) and gets the job done.
> No need to care on how long a jiffy is today in this system, no need
> to replicate endlessly the conversion code, which happens to be
> non-trivial (for the absolute time case--but still way more trivial
> than userspace asking the kernel for the time, computing a relative
> shift and dealing with the skews that preemption at a Murphy moment
> could cause).
>
> It is mostly the same as schedule_timeout(), but it takes the sleep
> time in a more general format. As every other API, it is designed so
> that the caller doesn't need to care or know about the gory details
> on how it has to be converted.

Sorry, but I don't get what you're talking about. What has the user space
concept of time to do with how the kernel finally handles a timeout?
More specifically, why does the first require a new API in the kernel to
deal with all kinds of timeouts?

bye, Roman
RE: FW: [RFC] A more general timeout specification
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> Hmm, I cannot think of more ways to specify a timeout than how
>> long I want to wait (relative) or until when (absolute) and which
>> is the reference clock. And they don't seem broken to me, common
>> sense, in any case. Do you have any examples?
>
>You still didn't explain what's the point in choosing different clock
>sources for a _timeout_.

The same reasons that compel to have CLOCK_REALTIME or
CLOCK_MONOTONIC, for example. Or the need to time out on a
high resolution clock.

A certain application might have a need for a 10ms timeout,
but another one might have it on 100us--modern CPUs make that
more than possible. The precision of your time source permeates
to the precision of your timeout.

[of course, now at the end it is still kernel time, but the
ongoing revamp work on timers will change some of that, one
way or another].

>> Different versions of the same function that do relative, absolute.
>> If I keep going that way, the reason becomes:
>>
>> sys_mutex_lock
>> sys_mutex_lock_timed_relative_clock_realtime
>> sys_mutex_lock_timed_absolute_clock_realtime
>> sys_mutex_lock_timed_relative_clock_monotonic
>> sys_mutex_lock_timed_absolute_clock_monotonic
>> sys_mutex_lock_timed_relative_clock_monotonic_highres
>> sys_mutex_lock_timed_absolute_clock_monotonic_highres
>
>Hiding it behind an API makes it better?

It certainly cuts out cruft and, to my not-so-trained eye,
makes it cleaner and easier to maintain.

>You didn't answer my other question, let's assume we add such a timeout
>structure, what's wrong with converting it to kernel time (which would
>automatically validate it).

And again, that's what at the end this API is doing, converting it to
kernel time. Give it a more "human" specification (timespec) and gets
the job done. No need to care on how long a jiffy is today in this
system, no need to replicate endlessly the conversion code, which
happens to be non-trivial (for the absolute time case--but still way
more trivial than userspace asking the kernel for the time, computing
a relative shift and dealing with the skews that preemption at a
Murphy moment could cause).

It is mostly the same as schedule_timeout(), but it takes the sleep
time in a more general format. As every other API, it is designed so
that the caller doesn't need to care or know about the gory details
on how it has to be converted.

-- Inaky
RE: FW: [RFC] A more general timeout specification
Hi,

On Wed, 31 Aug 2005, Daniel Walker wrote:

> > What "more versions" are you talking about? When you convert a user time
> > to kernel time you can automatically validate it and later you can use
> > standard kernel APIs, so you don't have to add even more API bloat.
>
> What's kernel time? Are you talking about jiffies? The whole point of
> multiple clocks is to allow for different degrees of precision.

For a timeout? Please get real.
If you need more precision, use a dedicated timer API, but don't make the
general case more complex for the 99.99% of other users.

bye, Roman
RE: FW: [RFC] A more general timeout specification
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> Hmm, I cannot think of more ways to specify a timeout than how
> long I want to wait (relative) or until when (absolute) and which
> is the reference clock. And they don't seem broken to me, common
> sense, in any case. Do you have any examples?

You still didn't explain what's the point in choosing different clock
sources for a _timeout_.

> Different versions of the same function that do relative, absolute.
> If I keep going that way, the reason becomes:
>
> sys_mutex_lock
> sys_mutex_lock_timed_relative_clock_realtime
> sys_mutex_lock_timed_absolute_clock_realtime
> sys_mutex_lock_timed_relative_clock_monotonic
> sys_mutex_lock_timed_absolute_clock_monotonic
> sys_mutex_lock_timed_relative_clock_monotonic_highres
> sys_mutex_lock_timed_absolute_clock_monotonic_highres

Hiding it behind an API makes it better?
You didn't answer my other question, let's assume we add such a timeout
structure, what's wrong with converting it to kernel time (which would
automatically validate it).

bye, Roman
RE: FW: [RFC] A more general timeout specification
On Thu, 2005-09-01 at 01:50 +0200, Roman Zippel wrote:
> What "more versions" are you talking about? When you convert a user time
> to kernel time you can automatically validate it and later you can use
> standard kernel APIs, so you don't have to add even more API bloat.

What's kernel time? Are you talking about jiffies? The whole point of
multiple clocks is to allow for different degrees of precision.

Daniel
RE: FW: [RFC] A more general timeout specification
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> I cannot produce (top of my head) any other POSIX API calls that
>> allow you to specify another clock source, but they are there,
>> somewhere. If I am to introduce a new API, I better make it
>> flexible enough so that other subsystems can use it for more stuff
>> other than...
>
>So we have to deal at kernel level with every broken timeout specification
>that comes along?

Hmm, I cannot think of more ways to specify a timeout than how
long I want to wait (relative) or until when (absolute) and which
is the reference clock. And they don't seem broken to me, common
sense, in any case. Do you have any examples?

In any case, like it or not, POSIX is what almost every application
uses to talk to the kernel.

>> ...adding more versions that add complexity and duplicate
>> code in many different places (user-to-kernel copy, syscall entry
>> points, timespec validation). And the minute you add a clock_id
>> you can steal some bits for specifying absolute/relative (or vice
>> versa), so it is almost a win-win situation.
>
>What "more versions" are you talking about? When you convert a user time
>to kernel time you can automatically validate it and later you can use
>standard kernel APIs, so you don't have to add even more API bloat.

The versions you were talking about:

>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>...
>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

Different versions of the same function that do relative, absolute.
If I keep going that way, the reason becomes:

sys_mutex_lock
sys_mutex_lock_timed_relative_clock_realtime
sys_mutex_lock_timed_absolute_clock_realtime
sys_mutex_lock_timed_relative_clock_monotonic
sys_mutex_lock_timed_absolute_clock_monotonic
sys_mutex_lock_timed_relative_clock_monotonic_highres
sys_mutex_lock_timed_absolute_clock_monotonic_highres

s/mutex_lock/ with whatever system call that takes a timeout you
want and keep adding combinations. On each of those, check for validity
of the __user pointer, copy it, validate the timespec.

[admittedly I am stretching the point with the different clock types].

So where is the problem on unifying all that handling? You are still
not offering any constructive criticism to solve the issue that now
the syscalls take relative timeouts vs the absolutes we need.

-- Inaky
RE: FW: [RFC] A more general timeout specification
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> I cannot produce (top of my head) any other POSIX API calls that
> allow you to specify another clock source, but they are there,
> somewhere. If I am to introduce a new API, I better make it
> flexible enough so that other subsystems can use it for more stuff
> other than...

So we have to deal at kernel level with every broken timeout specification
that comes along?

> >Why is not sufficient to just add a relative/absolute version,
> >which convert the time at entry to kernel time?
>
> ...adding more versions that add complexity and duplicate
> code in many different places (user-to-kernel copy, syscall entry
> points, timespec validation). And the minute you add a clock_id
> you can steal some bits for specifying absolute/relative (or vice
> versa), so it is almost a win-win situation.

What "more versions" are you talking about? When you convert a user time
to kernel time you can automatically validate it and later you can use
standard kernel APIs, so you don't have to add even more API bloat.

bye, Roman
RE: FW: [RFC] A more general timeout specification
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> Usefulness: (see the rationale in the patch), but in a nutshell;
>> most POSIX timeout specs have to be absolute in CLOCK_REALTIME
>> (eg: pthread_mutex_timedlock()). Current kernel needs the timeout
>> relative, so glibc calls the kernel/however gets the time, computes
>> relative times and syscalls. Race conditions, overhead...etc.
>>
>> This mechanism supports both. That's why it is more general.
>
>Your patch basically only mentions fusyn, why does it need multiple clock
>sources?

I cannot produce (top of my head) any other POSIX API calls that
allow you to specify another clock source, but they are there,
somewhere. If I am to introduce a new API, I better make it
flexible enough so that other subsystems can use it for more stuff
other than...

>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

...adding more versions that add complexity and duplicate
code in many different places (user-to-kernel copy, syscall entry
points, timespec validation). And the minute you add a clock_id
you can steal some bits for specifying absolute/relative (or vice
versa), so it is almost a win-win situation.

To summarize: thought about that, but it is fugly and not too
practical.

Consider also that this allows you to write extensions to POSIX or your
own user-level APIs that could allow (following the fusyn example)
you to wait on a mutex with a timeout based off a monotonic clock,
if you need it (or something that makes more sense than this--highres
comes to mind).

-- Inaky
RE: FW: [RFC] A more general timeout specification
>From: Christopher Friesen [mailto:[EMAIL PROTECTED]
>Perez-Gonzalez, Inaky wrote:
>
>>>I can get the first sleep. Suppose I oversleep by X nanoseconds. I
>>>wake, and get an opaque timeout back. How do I ask for the new wake
>>>time to be "endtime + INTERVAL"?
>>
>>
>> endtime.ts += INTERVAL
>> [we all know opaque is relative too]
>
>Heh. Okay, then what are the rules about what I'm allowed to do with
>endtime? Joe mentioned there was a bit in there somewhere to denote
>absolute time.

Well, it doesn't really matter. The bit in endtime.clock_id (highest,
AFAIR) says if it is absolute or not, but because adding a relative
value to a value maintains its condition (absolute or relative), it
is not a concern. Just add it.

Unless I am missing something really basic, of course.

>> Or better, use itimers :)
>
>I was actually thinking in terms of implementing itimers on top of your
>new API.

Heh, got me.

-- Inaky
Re: FW: [RFC] A more general timeout specification
Perez-Gonzalez, Inaky wrote:

>>I can get the first sleep. Suppose I oversleep by X nanoseconds. I
>>wake, and get an opaque timeout back. How do I ask for the new wake
>>time to be "endtime + INTERVAL"?
>
> endtime.ts += INTERVAL
> [we all know opaque is relative too]

Heh. Okay, then what are the rules about what I'm allowed to do with
endtime? Joe mentioned there was a bit in there somewhere to denote
absolute time.

> Or better, use itimers :)

I was actually thinking in terms of implementing itimers on top of your
new API.

Chris
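The "endtime.ts += INTERVAL" idea can be sketched in userspace with the existing POSIX clock_nanosleep() call and TIMER_ABSTIME, rather than the proposed timeout_sleep(). The helper names timespec_add_ns() and run_periodic() are hypothetical. Each deadline is derived from the previous deadline, not from the time the task actually woke, so oversleeping by X nanoseconds does not accumulate.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Add a non-negative number of nanoseconds, keeping tv_nsec normalized. */
void timespec_add_ns(struct timespec *ts, long ns)
{
    ts->tv_sec += ns / NSEC_PER_SEC;
    ts->tv_nsec += ns % NSEC_PER_SEC;
    if (ts->tv_nsec >= NSEC_PER_SEC) {
        ts->tv_sec++;
        ts->tv_nsec -= NSEC_PER_SEC;
    }
}

/*
 * Wake `ticks` times, INTERVAL nanoseconds apart, on CLOCK_MONOTONIC.
 * Returns 0 on success or an errno value; the final absolute deadline
 * is stored through last_deadline.
 */
int run_periodic(long interval_ns, int ticks, struct timespec *last_deadline)
{
    struct timespec next;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return errno;

    for (int i = 0; i < ticks; i++) {
        int err;

        timespec_add_ns(&next, interval_ns);   /* endtime += INTERVAL */
        do {
            /* Absolute sleep: restarting after a signal reuses `next`. */
            err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                  &next, NULL);
        } while (err == EINTR);
        if (err != 0)
            return err;
    }
    *last_deadline = next;
    return 0;
}
```

Because the sleep is absolute, an interrupted sleep is simply restarted with the same deadline, which is the property the opaque-cookie scheme is after.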
RE: FW: [RFC] A more general timeout specification
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> >Why is that needed in a _general_ timeout API? What exactly makes it so
> >useful for everyone and not just more complex for everyone?
>
> Because if a system call gets a timeout specification it needs to
> verify its correctness first. Instead of doing that at the point
> where it goes to sleep, that could be deep in an atomic section,
> we provide a separate function [timeout_validate()] which is the
> one you mention, to do that.
>
> Usefulness: (see the rationale in the patch), but in a nutshell;
> most POSIX timeout specs have to be absolute in CLOCK_REALTIME
> (eg: pthread_mutex_timedlock()). Current kernel needs the timeout
> relative, so glibc calls the kernel/however gets the time, computes
> relative times and syscalls. Race conditions, overhead...etc.
>
> This mechanism supports both. That's why it is more general.

Your patch basically only mentions fusyn, why does it need multiple clock
sources? Why is not sufficient to just add a relative/absolute version,
which convert the time at entry to kernel time?

bye, Roman
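For reference, the userspace conversion described in the quoted text (absolute POSIX deadline turned into a relative timeout before the syscall) might look roughly like the following. This is a sketch, not glibc's code; abs_to_rel() is a hypothetical name. The race mentioned above sits between reading `now` and entering the kernel: if the thread is preempted there, the relative value is already stale when the sleep starts.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Convert an absolute deadline into a timeout relative to `now`.
 * Both values must come from the same clock. A deadline that has
 * already passed yields a zero timeout.
 */
struct timespec abs_to_rel(struct timespec deadline, struct timespec now)
{
    struct timespec rel;

    rel.tv_sec = deadline.tv_sec - now.tv_sec;
    rel.tv_nsec = deadline.tv_nsec - now.tv_nsec;
    if (rel.tv_nsec < 0) {          /* borrow one second */
        rel.tv_sec--;
        rel.tv_nsec += NSEC_PER_SEC;
    }
    if (rel.tv_sec < 0) {           /* deadline is in the past */
        rel.tv_sec = 0;
        rel.tv_nsec = 0;
    }
    return rel;
}
```

Passing the absolute value straight through to the kernel removes this step, and with it the window between sampling the clock and sleeping.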
RE: FW: [RFC] A more general timeout specification
>From: Christopher Friesen [mailto:[EMAIL PROTECTED]
>Joe Korty wrote:
>
>> The returned timeout struct has a bit used to mark the value as absolute. Thus
>> the caller treats the returned timeout as an opaque cookie that can be
>> reapplied to the next (or more likely, the to-be restarted) timeout.
>
>Okay, endtime is always absolute value of when it should have expired.
>But I think I see a problem with the opaque cookie scheme and repeating
>timeouts.
>
>Suppose I want to wake my application at INTERVAL nanoseconds from now
>on the MONOTONIC clock, then again every INTERVAL nanoseconds after that.

This API is not intended for your application to use directly, but
for kernel APIs that take sleeps from userspace (like
pthread_mutex_lock() and friends), so this scenario is not very likely.

Granted, sleep() can be implemented with it too, so...

>How do I do that with this API?
>
>I can get the first sleep. Suppose I oversleep by X nanoseconds. I
>wake, and get an opaque timeout back. How do I ask for the new wake
>time to be "endtime + INTERVAL"?

endtime.ts += INTERVAL
[we all know opaque is relative too]

Or better, use itimers :)

-- Inaky
RE: FW: [RFC] A more general timeout specification
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> +	flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
>> +	clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
>> +
>> +	result = -EINVAL;
>> +	if (flags & ~TIMEOUT_RELATIVE)
>> +		goto out;
>> +
>> +	/* someday, we should support *all* clocks available to us */
>> +	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
>> +		goto out;
>> +	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
>> +		goto out;
>
>Why is that needed in a _general_ timeout API? What exactly makes it so
>useful for everyone and not just more complex for everyone?

Because if a system call gets a timeout specification it needs to
verify its correctness first. Instead of doing that at the point
where it goes to sleep, that could be deep in an atomic section,
we provide a separate function [timeout_validate()] which is the
one you mention, to do that.

Usefulness: (see the rationale in the patch), but in a nutshell;
most POSIX timeout specs have to be absolute in CLOCK_REALTIME
(eg: pthread_mutex_timedlock()). Current kernel needs the timeout
relative, so glibc calls the kernel/however gets the time, computes
relative times and syscalls. Race conditions, overhead...etc.

This mechanism supports both. That's why it is more general.

-- Inaky
Re: FW: [RFC] A more general timeout specification
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> +	flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
> +	clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
> +
> +	result = -EINVAL;
> +	if (flags & ~TIMEOUT_RELATIVE)
> +		goto out;
> +
> +	/* someday, we should support *all* clocks available to us */
> +	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
> +		goto out;
> +	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
> +		goto out;

Why is that needed in a _general_ timeout API? What exactly makes it so
useful for everyone and not just more complex for everyone?

bye, Roman
Re: FW: [RFC] A more general timeout specification
Joe Korty wrote:

> The returned timeout struct has a bit used to mark the value as absolute. Thus
> the caller treats the returned timeout as an opaque cookie that can be
> reapplied to the next (or more likely, the to-be restarted) timeout.

Okay, endtime is always absolute value of when it should have expired.
But I think I see a problem with the opaque cookie scheme and repeating
timeouts.

Suppose I want to wake my application at INTERVAL nanoseconds from now
on the MONOTONIC clock, then again every INTERVAL nanoseconds after that.

How do I do that with this API?

I can get the first sleep. Suppose I oversleep by X nanoseconds. I
wake, and get an opaque timeout back. How do I ask for the new wake
time to be "endtime + INTERVAL"?

Chris
Re: FW: [RFC] A more general timeout specification
On Wed, Aug 31, 2005 at 03:20:03PM -0600, Christopher Friesen wrote:
> Perez-Gonzalez, Inaky wrote:
> >In this structure,
> >the user specifies:
> >whether the time is absolute, or relative to 'now'.
> >
> >Timeout_sleep has a return argument, endtime, which is also in
> >'struct timeout' format. If the input time was relative, then
> >it is converted to absolute and returned through this argument.
>
> Wouldn't it make more sense for the endtime to be returned in the same
> format (relative/absolute) as the original timer was specified? That
> way an application can set a new timer for "timeout + SLEEPTIME" and on
> average it will be reasonably accurate.
>
> In the proposed method, for endtime to be useful the app needs to check
> the current time, compare with the endtime, and figure out the delta.
> If you're going to force the app to do all that work anyway, the app may
> as well use absolute times.
>
> Chris

The returned timeout struct has a bit used to mark the value as absolute. Thus
the caller treats the returned timeout as an opaque cookie that can be
reapplied to the next (or more likely, the to-be restarted) timeout.

A general principle is, once a time has been converted to absolute, it should
never be converted back to relative time. To do so means the end-time starts
to drift from the original end-time.

Regards,
Joe
--
"Money can buy bandwidth, but latency is forever" -- John Mashey
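The drift described in the last paragraph can be shown with plain arithmetic. The following is a toy model, not code from the patch: end_after_restarts() is a hypothetical name, times are abstract ticks, and `lag` stands for the delay between computing the remaining time and actually re-arming the sleep.

```c
/*
 * A sleep of `rel` ticks starts at `start` and is interrupted at each
 * time in interrupts[0..n-1]. If keep_absolute is non-zero, every
 * restart reuses the original absolute end time. Otherwise the end
 * time is converted back to a relative "remaining" value, and the
 * sleep is re-armed `lag` ticks later, which pushes the end time out.
 * Returns the effective end time after all restarts.
 */
long long end_after_restarts(int keep_absolute, long long start,
                             long long rel, const long long *interrupts,
                             int n, long long lag)
{
    long long end = start + rel;    /* converted to absolute exactly once */

    for (int i = 0; i < n; i++) {
        if (keep_absolute)
            continue;               /* same end time, no drift */
        long long remaining = end - interrupts[i];
        end = interrupts[i] + lag + remaining;
    }
    return end;
}
```

With three interruptions and a lag of 5 ticks each, the relative variant ends 15 ticks late, while the absolute variant still ends at the original time.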
Re: FW: [RFC] A more general timeout specification
Perez-Gonzalez, Inaky wrote:

> In this structure,
> the user specifies:
> whether the time is absolute, or relative to 'now'.
>
> Timeout_sleep has a return argument, endtime, which is also in
> 'struct timeout' format. If the input time was relative, then
> it is converted to absolute and returned through this argument.

Wouldn't it make more sense for the endtime to be returned in the same
format (relative/absolute) as the original timer was specified? That
way an application can set a new timer for "timeout + SLEEPTIME" and on
average it will be reasonably accurate.

In the proposed method, for endtime to be useful the app needs to check
the current time, compare with the endtime, and figure out the delta.
If you're going to force the app to do all that work anyway, the app may
as well use absolute times.

Chris
Re: FW: [RFC] A more general timeout specification
On Wed, Aug 31, 2005 at 01:55:54PM -0700, Perez-Gonzalez, Inaky wrote:
> Hi Andrew
>
> This was developed by Joe Korty <[EMAIL PROTECTED]>, greatly
> enhancing something I had done before, so I am signing it out
> (although Joe should too, Joe?).

The fusyn (robust mutexes) project proposes the creation
of a more general data structure, 'struct timeout', for the
specification of timeouts in new services. In this structure,
the user specifies:

    a time, in timespec format.
    the clock the time is specified against (eg, CLOCK_MONOTONIC).
    whether the time is absolute, or relative to 'now'.

That is, all combinations of useful timeout attributes become
possible.

Also proposed are two new kernel routines for the manipulation
of timeouts:

    timeout_validate()
    timeout_sleep()

timeout_validate() error-checks the syntax of a timeout
argument and returns either zero or -EINVAL. By breaking
timeout_validate() out from timeout_sleep(), it becomes possible
to error check the timeout 'far away' from the places in the
code where we would actually do the timeout, as well as being
able to perform such checks only at those places we know the
timeout specification is coming from an unsafe source.

timeout_sleep() puts the caller to sleep until the
specified end time is in the past, as measured against
the given clock, or until the caller is awakened by other
means (such as wake_up_process()). Like schedule_timeout(),
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE must be set ahead
of time; if TASK_INTERRUPTIBLE is set then signals will also
break the caller out of the sleep.

timeout_sleep() returns either 0 (returned early) or -ETIMEDOUT
(returned due to timeout). It is up to the caller to resolve,
in the "returned early" case, why it returned early.

Timeout_sleep has a return argument, endtime, which is also in
'struct timeout' format. If the input time was relative, then
it is converted to absolute and returned through this argument.
This can be used when an early-terminated service must be
restarted and side effects of the early termination-n-restart
(such as end time drift) are to be avoided.

Signed-off-by: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
Signed-off-by: Joe Korty <[EMAIL PROTECTED]>

 2.6.12-rc4-jak/include/linux/time.h    |    6 +
 2.6.12-rc4-jak/include/linux/timeout.h |   48
 2.6.12-rc4-jak/kernel/posix-timers.c   |    7 +
 2.6.12-rc4-jak/kernel/timer.c          |  184 +
 4 files changed, 245 insertions(+)

diff -puNa include/linux/time.h~a.more.flexible.timeout.approach include/linux/time.h
--- 2.6.12-rc4/include/linux/time.h~a.more.flexible.timeout.approach 2005-05-18 13:53:14.204417169 -0400
+++ 2.6.12-rc4-jak/include/linux/time.h 2005-05-18 13:53:14.212416002 -0400
@@ -25,6 +25,8 @@ struct timezone {
  int tz_dsttime; /* type of dst correction */
 };
+#include
+
 #ifdef __KERNEL__
 /* Parameters used to convert the timespec values */
@@ -103,6 +105,10 @@ struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday (struct timespec *tv);
+extern long clock_gettime(int which, struct timespec *tp);
+
+extern int FASTCALL(abs_timespec_to_abs_jiffies (clockid_t clock, const struct timespec *tp, unsigned long *jp));
+extern int FASTCALL(rel_to_abs_timespec(clockid_t clock, const struct timespec *tsrel, struct timespec *tsabs));
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
diff -puNa /dev/null include/linux/timeout.h
--- /dev/null 2004-06-24 14:04:38.0 -0400
+++ 2.6.12-rc4-jak/include/linux/timeout.h 2005-05-18 13:53:14.212416002 -0400
@@ -0,0 +1,48 @@
+/*
+ * Extended timeout specification
+ *
+ * (C) 2002-2005 Intel Corp
+ * Inaky Perez-Gonzalez <[EMAIL PROTECTED]>.
+ *
+ * Licensed under the FSF's GNU Public License v2 or later.
+ *
+ * Generic extended timeout specification. Broken out by Joe Korty
+ * <[EMAIL PROTECTED]> from linux/time.h so that it can be included
+ * by userspace applications in conjunction with #include "time.h".
+ */
+
+#ifndef _LINUX_TIMEOUT_H
+#define _LINUX_TIMEOUT_H
+
+/* 'struct timeout' flag values. OR these into clock_id along with
+ * a clock specification such as CLOCK_REALTIME or CLOCK_MONOTONIC.
+ */
+enum {
+ TIMEOUT_RELATIVE = 0x1000, /* relative timeout */
+
+ TIMEOUT_FLAGS_MASK = 0xf000, /* flags mask for clock_id */
+ TIMEOUT_CLOCK_MASK = 0x0fff, /* clock mask for clock_id */
+};
+
+/* Magic values a 'struct timeout' pointer can have */
+
+#define TIMEOUT_MAX ((struct timeout *) ~0UL) /* never time out */
+#define TIMEOUT_NONE ((struct timeout *) 0UL) /* time out immediately */
+
+/**
+ * struct timeout - general timeout specification
+ *
+ * @clock_id: which clock source to use ORed with flags describing use.
+ * @ts: timespec fo
RE: FW: [RFC] A more general timeout specification
Hi John

>From: john stultz [mailto:[EMAIL PROTECTED]
>On Thu, 2005-07-28 at 18:52 -0700, Inaky Perez-Gonzalez wrote:
>> The main user of this new interface is to allow system calls to get
>> time specified in an absolute form (as most of POSIX states) and thus
>> avoid extra time conversion work.
...
>> http://groups-beta.google.com/groups?q=a+more+general+timeout+specification
...
>>
>> timeout_validate() error-checks the syntax of a timeout
>> argument and returns either zero or -EINVAL. By breaking
>> timeout_validate() out from timeout_sleep(), it becomes possible
>> to error check the timeout 'far away' from the places in the
>> code where we would actually do the timeout, as well as being
>> able to perform such checks only at those places we know the
>> timeout specification is coming from an unsafe source.
>
>using gettimeofday() so that part looks good. I'm not completely sold on
>why the validate interface is needed, but I didn't hear any objections
>from George, so I'd defer to those who deal more with those interfaces.

_validate() is mostly needed when we take a timeout specification
from user space (timeouts from kernel space are supposed to be ok).
We need to validate that the clock id passed is correct (existent),
that the 'struct timespec' is also legal (eg: nsec < 1000M), that
the flags are ok (relative/absolute), etc...

The idea is that in your code that uses this, once you copy the
'struct timeout' from user space, you check it for validity. Then
you can dive into any kind of code (atomic, sleep paths, whatever)
without having to code an error path from deep down for when the
user passed a bad timeout.

It sure makes it simpler :)

-- Inaky
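The checks listed above might look roughly like the following userspace sketch. This is not the patch's timeout_validate(), whose body is not shown in this thread. The flag values are copied from the posted include/linux/timeout.h, but the struct layout is an assumption (the posted header is cut off before the definition; only the field names clock_id and ts appear in its comment), and accepting only CLOCK_REALTIME and CLOCK_MONOTONIC is likewise an assumption. The magic pointers TIMEOUT_MAX and TIMEOUT_NONE are not handled here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Flag values as posted in include/linux/timeout.h. */
enum {
    TIMEOUT_RELATIVE   = 0x1000,  /* relative timeout */
    TIMEOUT_FLAGS_MASK = 0xf000,  /* flags mask for clock_id */
    TIMEOUT_CLOCK_MASK = 0x0fff,  /* clock mask for clock_id */
};

/* ASSUMED layout; field types are a guess for illustration. */
struct timeout {
    int clock_id;                 /* clock ORed with TIMEOUT_* flags */
    struct timespec ts;
};

/* Hypothetical validator: returns 0 if the specification is
 * well-formed, -EINVAL otherwise. */
int timeout_validate_sketch(const struct timeout *t)
{
    int clock = t->clock_id & TIMEOUT_CLOCK_MASK;
    int flags = t->clock_id & TIMEOUT_FLAGS_MASK;

    /* No bits outside the clock and flag fields. */
    if (t->clock_id & ~(TIMEOUT_FLAGS_MASK | TIMEOUT_CLOCK_MASK))
        return -EINVAL;
    /* Only the relative/absolute flag is defined. */
    if (flags & ~TIMEOUT_RELATIVE)
        return -EINVAL;
    /* The clock must exist (assumed set of known clocks). */
    if (clock != CLOCK_REALTIME && clock != CLOCK_MONOTONIC)
        return -EINVAL;
    /* The timespec must be legal: nsec < 1000M, nothing negative. */
    if (t->ts.tv_sec < 0 || t->ts.tv_nsec < 0 ||
        t->ts.tv_nsec >= 1000000000L)
        return -EINVAL;
    return 0;
}
```

A caller would run this once, right after copying the structure from user space, and could then pass the value into atomic or sleeping paths that have no error return for a malformed timeout.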
Re: FW: [RFC] A more general timeout specification
On Thu, 2005-07-28 at 18:52 -0700, Inaky Perez-Gonzalez wrote:
> The main user of this new interface is to allow system calls to get
> time specified in an absolute form (as most of POSIX states) and thus
> avoid extra time conversion work.
>
> There was a short thread about it, available at google groups
> (grepping for the subject).
>
> http://groups-beta.google.com/groups?q=a+more+general+timeout+specification
>
> Thanks!
>
> [ for comment only ]
>
> The fusyn (robust mutexes) project proposes the creation
> of a more general data structure, 'struct timeout', for the
> specification of timeouts in new services. In this structure,
> the user specifies:
>
>     a time, in timespec format.
>     the clock the time is specified against (eg, CLOCK_MONOTONIC).
>     whether the time is absolute, or relative to 'now'.
>
> That is, all combinations of useful timeout attributes become
> possible.
>
> Also proposed are two new kernel routines for the manipulation
> of timeouts:
>
>     timeout_validate()
>     timeout_sleep()
>
> timeout_validate() error-checks the syntax of a timeout
> argument and returns either zero or -EINVAL. By breaking
> timeout_validate() out from timeout_sleep(), it becomes possible
> to error check the timeout 'far away' from the places in the
> code where we would actually do the timeout, as well as being
> able to perform such checks only at those places we know the
> timeout specification is coming from an unsafe source.
>
> timeout_sleep() puts the caller to sleep until the
> specified end time is in the past, as measured against
> the given clock, or until the caller is awakened by other
> means (such as wake_up_process()). Like schedule_timeout(),
> TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE must be set ahead
> of time; if TASK_INTERRUPTIBLE is set then signals will also
> break the caller out of the sleep.
>
> timeout_sleep() returns either 0 (returned early) or -ETIMEDOUT
> (returned due to timeout). It is up to the caller to resolve,
> in the "returned early" case, why it returned early.
>
> Timeout_sleep has a return argument, endtime, which is also in
> 'struct timeout' format. If the input time was relative, then
> it is converted to absolute and returned through this argument.
> This can be used when an early-terminated service must be
> restarted and side effects of the early termination-n-restart
> (such as end time drift) are to be avoided.

Hey Inaky, Joe,

Sorry for the terribly slow response. I haven't really dealt too
much with the user-space interfaces for the posix clocks and timers, so
I'm no authority on it. Being able to use absolute values does seem very
important to me, as well as avoiding having glibc make the conversion
using gettimeofday() so that part looks good. I'm not completely sold on
why the validate interface is needed, but I didn't hear any objections
from George, so I'd defer to those who deal more with those interfaces.

Sorry for the lack of insight, although since I haven't heard much on
this recently, maybe this will help stir up the debate? Nish, do you
have any comments on this idea?

thanks
-john