Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 17.08.2005 [12:51:17 -0700], George Anzinger wrote: > Nishanth Aravamudan wrote: > ~ > >>IMNSHO we should not get too parental with kernel only interfaces. > >>Adding 1 is easy enough for the caller and even easier to explain in the > >>instructions (i.e. this call sleeps for X jiffies edges). This allows > >>the caller to do more if needed and, should he ever just want to sync to > >>the next jiffie he does not have to deal with backing out that +1. > > > > > >I don't want to be too parental either, but I also am trying to avoid > >code duplication. Lots of drivers basically do something like > >poll_event() does (or could do with some changes), i.e. looping a > >constant amount multiple times, checking something every so often. The > >patch was just a thought, though. I will keep evaluating drivers and see > >if it's a useful interface to have eventually. > > > >I guess I'm just concerned with making an unintuitive interface. As was > >brought up at OLS, drivers are a major source of bugs/buggy code. The > >simpler, more useful we can make interfaces, the better, I think. I'm > >not claiming you disagree, I just want to make my own motives clear. > >While fixing up the schedule_timeout() comment would make it clear what > >schedule_timeout() achieves, I'm not sure how useful such an interface > >is, if every caller adds 1 :) I need to mull it over, though... Lots to > >consider. I also, of course, want to stay flexible for the reasons you > >mention (letting the driver adjust the timeout as they expect to). > > I would leave the +1 alone and put in the correct documentation. This > way _more_ folks will be made aware of the mid jiffie issue. Far to > often we see (and let get in) patches that mess up user interfaces > around this issue. The recent changes to itimer come to mind... Ok, makes sense to me; does the following patch work for everybody? The wording is a bit awkward, but so is the issue :) Description: Fix schedule_timeout()'s comment, indicated the inter-tick rounding issue. Since the kernel does not keep track of an inter-tick position in jiffies, a caller which wishes to sleep for at least a certain number of jiffies should add its request to the *next* jiffies value (meaning add 1 to its relative request). Make that clear in the comment. Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]> --- diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c --- 2.6.13-rc6/kernel/timer.c 2005-08-09 15:22:57.0 -0700 +++ 2.6.13-rc6-dev/kernel/timer.c 2005-08-17 15:21:35.0 -0700 @@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon * schedule_timeout - sleep until timeout * @timeout: timeout value in jiffies * - * Make the current task sleep until @timeout jiffies have - * elapsed. The routine will return immediately unless - * the current task state has been set (see set_current_state()). + * Make the current task sleep until @timeout timer interrupts have + * occurred, meaning jiffies has incremented @timeout times and not + * necessarily that @timeout jiffies have elapsed. If the task wishes to + * sleep until (at least) @timeout jiffies have elapsed, then it should + * add 1 to its request. This is necessary, as the kernel does not keep + * track of an inter-jiffy position, so the caller must "round up" its + * request so that it begins at the next jiffy. The routine will return + * immediately unless the current task state has been set (see + * set_current_state()). * * You can set the task state as follows - * - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: ~ IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). I would leave the +1 alone and put in the correct documentation. This way _more_ folks will be made aware of the mid jiffie issue. Far to often we see (and let get in) patches that mess up user interfaces around this issue. The recent changes to itimer come to mind... ~ -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: ~ IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). I would leave the +1 alone and put in the correct documentation. This way _more_ folks will be made aware of the mid jiffie issue. Far to often we see (and let get in) patches that mess up user interfaces around this issue. The recent changes to itimer come to mind... ~ -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 17.08.2005 [12:51:17 -0700], George Anzinger wrote: Nishanth Aravamudan wrote: ~ IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). I would leave the +1 alone and put in the correct documentation. This way _more_ folks will be made aware of the mid jiffie issue. Far to often we see (and let get in) patches that mess up user interfaces around this issue. The recent changes to itimer come to mind... Ok, makes sense to me; does the following patch work for everybody? The wording is a bit awkward, but so is the issue :) Description: Fix schedule_timeout()'s comment, indicated the inter-tick rounding issue. Since the kernel does not keep track of an inter-tick position in jiffies, a caller which wishes to sleep for at least a certain number of jiffies should add its request to the *next* jiffies value (meaning add 1 to its relative request). Make that clear in the comment. Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED] --- diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c --- 2.6.13-rc6/kernel/timer.c 2005-08-09 15:22:57.0 -0700 +++ 2.6.13-rc6-dev/kernel/timer.c 2005-08-17 15:21:35.0 -0700 @@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon * schedule_timeout - sleep until timeout * @timeout: timeout value in jiffies * - * Make the current task sleep until @timeout jiffies have - * elapsed. The routine will return immediately unless - * the current task state has been set (see set_current_state()). + * Make the current task sleep until @timeout timer interrupts have + * occurred, meaning jiffies has incremented @timeout times and not + * necessarily that @timeout jiffies have elapsed. If the task wishes to + * sleep until (at least) @timeout jiffies have elapsed, then it should + * add 1 to its request. This is necessary, as the kernel does not keep + * track of an inter-jiffy position, so the caller must round up its + * request so that it begins at the next jiffy. The routine will return + * immediately unless the current task state has been set (see + * set_current_state()). * * You can set the task state as follows - * - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 16.08.2005 [17:39:11 -0700], George Anzinger wrote: > Nishanth Aravamudan wrote: > >On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: > > > >>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > >>timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > >>repeating times (see setitimer code) we need the actual time as we KNOW > >>where the jiffies edge is in the repeating case. The +1 is needed ONLY > >>for the initial time, not the repeating time. > >> > >> > >>See: > >>http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2 > > > > > >I followed that thread, George, but I think it's a different case with > >schedule_timeout() [maybe this indicates drivers/other users should > >maybe be using itimers, but I'll get to that in a sec]. > > I think I miss understood back then :). Ok, no problem. I was just thinking today about all the issues that were borught up... I really appreciate your feedback. > > > >With schedule_timeout(), we are just given a relative jiffies value. We > >have no context as to which task is requesting the delay, per se, > >meaning we don't (can't) know from the interface whether this is the > >first delay in a sequence, or a brand new one, without changing all > >users to have some sort of control structure. The callers of > >schedule_timeout() don't even get a pointer to the timer added > >internally. > > > >So, adding 1 to all sleeps seems like it might be reasonable, as looping > >sleeps probably need to use a different interface. I had worked a bit > >ago on something like poll_event() with the kernel-janitors group, which > >would abstract out the repeated sleeps. Basically wait_event() without > >wait-queues... Maybe we could make such an interface just use itimers? > >I've attached my old patch for poll_event(), just for reference. > > I think not. itimers is really pointed at a particular system call and > has resources in the task structure to do it. These would be hard to > share... Ok, I wasn't sure about that -- I'm not too familiar with the itimer code (maybe I'll take a look at it soon :) ) > >My point, I guess, is that in the schedule_timeout() case, we don't know > >where the jiffies edge is, as we either expire or receive a wait-queue > >event/signal, we never mod_timer() the internal timer... So we have to > >assume that we need to sleep the request. But maybe Roman's idea of > >sleeping a certain number of jiffy edges is sufficient. I am not yet > >convinced driver authors want/need such an interface, though, still > >thinking it over. > > IMNSHO we should not get too parental with kernel only interfaces. > Adding 1 is easy enough for the caller and even easier to explain in the > instructions (i.e. this call sleeps for X jiffies edges). This allows > the caller to do more if needed and, should he ever just want to sync to > the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). Thanks, Nish - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. I think I miss understood back then :). With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. I think not. itimers is really pointed at a particular system call and has resources in the task structure to do it. These would be hard to share... My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: > Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > repeating times (see setitimer code) we need the actual time as we KNOW > where the jiffies edge is in the repeating case. The +1 is needed ONLY > for the initial time, not the repeating time. > > > See: > http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. Thanks for all the feedback, Nish --- 2.6.11-kj-v/include/linux/delay.h 2005-03-01 23:37:51.0 -0800 +++ 2.6.11-kj/include/linux/delay.h 2005-03-04 16:03:24.0 -0800 @@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s msleep(seconds * 1000); } +/* + * poll_event* check if an @condition is true every @interval milliseconds, + * up to a @timeout maximum, if specified, else forever. + * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE + * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals + * Note that these interfaces are distinctly different than + * wait_event*() in two ways: + * 1) No wait-queues + * 2) Specify the time you want to sleep, not when you want to + * wake-up. Thus, callers should not add the current time to + * @timeout before calling, as it will be done by the macro. + */ + +/** + * poll_event - check a condition at regular intervals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + */ +#define poll_event(condition, interval)\ +do { \ + while (!(condition))\ + msleep(interval); \ +} while(0) + +/** + * poll_event_timeout - check a condition at regular intervals with a timeout + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * @timeout: Maximum total number of milliseconds to take + * returns: 0, if condition caused return + * -ETIME, if timeout + */ +#define poll_event_timeout(condition, interval, timeout) \ +({ \ + int __ret = 0; \ + unsigned long stop = jiffies + msecs_to_jiffies(timeout); \ + \ + for (;;) { \ + if (condition) \ + break; \ + msleep(interval); \ + if (time_after(jiffies, stop)) {\ + __ret = -ETIME; \ + break; \ + } \ + } \ + __ret; \ +}) + +/** + * poll_event_interruptible - check a condition at regular intervals, returning early on signals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * returns: 0, if condition caused return + * -EINTR, if a signal + */ +#define
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. Thanks for all the feedback, Nish --- 2.6.11-kj-v/include/linux/delay.h 2005-03-01 23:37:51.0 -0800 +++ 2.6.11-kj/include/linux/delay.h 2005-03-04 16:03:24.0 -0800 @@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s msleep(seconds * 1000); } +/* + * poll_event* check if an @condition is true every @interval milliseconds, + * up to a @timeout maximum, if specified, else forever. + * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE + * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals + * Note that these interfaces are distinctly different than + * wait_event*() in two ways: + * 1) No wait-queues + * 2) Specify the time you want to sleep, not when you want to + * wake-up. Thus, callers should not add the current time to + * @timeout before calling, as it will be done by the macro. + */ + +/** + * poll_event - check a condition at regular intervals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + */ +#define poll_event(condition, interval)\ +do { \ + while (!(condition))\ + msleep(interval); \ +} while(0) + +/** + * poll_event_timeout - check a condition at regular intervals with a timeout + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * @timeout: Maximum total number of milliseconds to take + * returns: 0, if condition caused return + * -ETIME, if timeout + */ +#define poll_event_timeout(condition, interval, timeout) \ +({ \ + int __ret = 0; \ + unsigned long stop = jiffies + msecs_to_jiffies(timeout); \ + \ + for (;;) { \ + if (condition) \ + break; \ + msleep(interval); \ + if (time_after(jiffies, stop)) {\ + __ret = -ETIME; \ + break; \ + } \ + } \ + __ret; \ +}) + +/** + * poll_event_interruptible - check a condition at regular intervals, returning early on signals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * returns: 0, if condition caused return + * -EINTR, if a signal + */ +#define
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. I think I miss understood back then :). With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. I think not. itimers is really pointed at a particular system call and has resources in the task structure to do it. These would be hard to share... My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 16.08.2005 [17:39:11 -0700], George Anzinger wrote: Nishanth Aravamudan wrote: On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. I think I miss understood back then :). Ok, no problem. I was just thinking today about all the issues that were borught up... I really appreciate your feedback. With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. I think not. itimers is really pointed at a particular system call and has resources in the task structure to do it. These would be hard to share... Ok, I wasn't sure about that -- I'm not too familiar with the itimer code (maybe I'll take a look at it soon :) ) My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). Thanks, Nish - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 8/4/05, George Anzinger wrote: > Nishanth Aravamudan wrote: > ~ > > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request > > (to account for this same issue, I believe, as POSIX demands no early > > return from nanosleep() calls). There are some other locations where > > similar > > > > + (t.tv_sec || t.tv_nsec) > > This is not the same as "always add 1". We don't do it this way just to > have fun with C. If you change schedule_timeout() to add the 1, > nanosleep() will need to do things differently to get the same behavior. > (And, YES users do pass in zero sleep times.) Fair enough. Will need to think about this more. Thanks, Nish - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 8/4/05, George Anzinger wrote: > Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > repeating times (see setitimer code) we need the actual time as we KNOW > where the jiffies edge is in the repeating case. The +1 is needed ONLY > for the initial time, not the repeating time. Please read the patch. I didn't touch timespec_to_jiffies() or timeval_to_jiffies(). Not sure why you think I did. I agree that we only need the initial time, my patch is no good. But it is hard for non-itimers, like schedule_timeout() callers, to provide an interface that only adds 1 to the initial request, since the callers currently pass in an absolute jiffies value. Thanks, Nish - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: ~ Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) This is not the same as "always add 1". We don't do it this way just to have fun with C. If you change schedule_timeout() to add the 1, nanosleep() will need to do things differently to get the same behavior. (And, YES users do pass in zero sleep times.) -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2 George Nishanth Aravamudan wrote: On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: Hi, On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + unsigned long expire_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + expire_jifs = MAX_SCHEDULE_TIMEOUT; + } else { + /* +* msecs_to_jiffies() is a unit conversion, which truncates +* (rounds down), so we need to add 1. +*/ + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + } + + expire_jifs = schedule_timeout(expire_jifs); + + /* +* don't need to add 1 here, even though there is truncation, +* because we will add 1 if/when the value is sent back in +*/ + return jiffies_to_msecs(expire_jifs); +} As I already mentioned for msleep_interruptible this is a really terrible interface. The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the process is immediately woken up again) makes the caller suspectible to timeout manipulations and requires constant reauditing, that no caller gets it wrong, so it's better to avoid this error source completely. After some thought today, I realized the +1 case is not specific to milliseconds. It's just that it's only being done *correctly* in the milliseconds case...I think ;) Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) rounding-ups occur. I'll fix those separately if this patch goes in. I change the one in sys_nanosleep() below to maintain the same latency as we currently have. I also screwed up my layout in the previous submission, sorry about that. Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Modify the callers of schedule_timeout() in timer.c to not add 1 themselves. Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]> --- timer.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.0 -0700 +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.0 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates -* (rounds down), so we need to add 1. +* (rounds down), so we need to add 1, but this is taken +* care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim if ((t.tv_nsec >= 10L) || (t.tv_nsec < 0) || (t.tv_sec < 0)) return -EINVAL; - expire = timespec_to_jiffies() + (t.tv_sec || t.tv_nsec); + expire = timespec_to_jiffies(); current->state = TASK_INTERRUPTIBLE; expire = schedule_timeout(expire); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; +
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2 George Nishanth Aravamudan wrote: On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: Hi, On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + unsigned long expire_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + expire_jifs = MAX_SCHEDULE_TIMEOUT; + } else { + /* +* msecs_to_jiffies() is a unit conversion, which truncates +* (rounds down), so we need to add 1. +*/ + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + } + + expire_jifs = schedule_timeout(expire_jifs); + + /* +* don't need to add 1 here, even though there is truncation, +* because we will add 1 if/when the value is sent back in +*/ + return jiffies_to_msecs(expire_jifs); +} As I already mentioned for msleep_interruptible this is a really terrible interface. The jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1) case (when the process is immediately woken up again) makes the caller suspectible to timeout manipulations and requires constant reauditing, that no caller gets it wrong, so it's better to avoid this error source completely. After some thought today, I realized the +1 case is not specific to milliseconds. It's just that it's only being done *correctly* in the milliseconds case...I think ;) snip Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) rounding-ups occur. I'll fix those separately if this patch goes in. I change the one in sys_nanosleep() below to maintain the same latency as we currently have. I also screwed up my layout in the previous submission, sorry about that. Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Modify the callers of schedule_timeout() in timer.c to not add 1 themselves. Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED] --- timer.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.0 -0700 +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.0 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(timer); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates -* (rounds down), so we need to add 1. +* (rounds down), so we need to add 1, but this is taken +* care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim if ((t.tv_nsec = 10L) || (t.tv_nsec 0) || (t.tv_sec 0)) return -EINVAL; - expire = timespec_to_jiffies(t) + (t.tv_sec || t.tv_nsec); + expire = timespec_to_jiffies(t); current-state = TASK_INTERRUPTIBLE; expire = schedule_timeout(expire); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; +
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
Nishanth Aravamudan wrote: ~ Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) This is not the same as always add 1. We don't do it this way just to have fun with C. If you change schedule_timeout() to add the 1, nanosleep() will need to do things differently to get the same behavior. (And, YES users do pass in zero sleep times.) -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 8/4/05, George Anzinger george@mvista.com wrote: Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. Please read the patch. I didn't touch timespec_to_jiffies() or timeval_to_jiffies(). Not sure why you think I did. I agree that we only need the initial time, my patch is no good. But it is hard for non-itimers, like schedule_timeout() callers, to provide an interface that only adds 1 to the initial request, since the callers currently pass in an absolute jiffies value. Thanks, Nish - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 8/4/05, George Anzinger george@mvista.com wrote: Nishanth Aravamudan wrote: ~ Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) This is not the same as always add 1. We don't do it this way just to have fun with C. If you change schedule_timeout() to add the 1, nanosleep() will need to do things differently to get the same behavior. (And, YES users do pass in zero sleep times.) Fair enough. Will need to think about this more. Thanks, Nish - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: > On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: > > Hi, > > > > On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: > > > > > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > > > +{ > > > + unsigned long expire_jifs; > > > + > > > + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { > > > + expire_jifs = MAX_SCHEDULE_TIMEOUT; > > > + } else { > > > + /* > > > + * msecs_to_jiffies() is a unit conversion, which truncates > > > + * (rounds down), so we need to add 1. > > > + */ > > > + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; > > > + } > > > + > > > + expire_jifs = schedule_timeout(expire_jifs); > > > + > > > + /* > > > + * don't need to add 1 here, even though there is truncation, > > > + * because we will add 1 if/when the value is sent back in > > > + */ > > > + return jiffies_to_msecs(expire_jifs); > > > +} > > > > As I already mentioned for msleep_interruptible this is a really terrible > > interface. > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the > > process is immediately woken up again) makes the caller suspectible to > > timeout manipulations and requires constant reauditing, that no caller > > gets it wrong, so it's better to avoid this error source completely. > > After some thought today, I realized the +1 case is not specific to > milliseconds. It's just that it's only being done *correctly* in the > milliseconds case...I think ;) > Description: Ensure that schedule_timeout() requests can not possibly > expire early in the timeout case, by adding the requested relative jiffy > value to the next value of jiffies. Currently, by adding to the current > value of jiffies, we might actually expire a jiffy too early (in a > corner case). Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) rounding-ups occur. I'll fix those separately if this patch goes in. I change the one in sys_nanosleep() below to maintain the same latency as we currently have. I also screwed up my layout in the previous submission, sorry about that. Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Modify the callers of schedule_timeout() in timer.c to not add 1 themselves. Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]> --- timer.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.0 -0700 +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.0 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates -* (rounds down), so we need to add 1. +* (rounds down), so we need to add 1, but this is taken +* care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim if ((t.tv_nsec >= 10L) || (t.tv_nsec < 0) || (t.tv_sec < 0)) return -EINVAL; - expire = timespec_to_jiffies() + (t.tv_sec || t.tv_nsec); + expire = timespec_to_jiffies(); current->state = TASK_INTERRUPTIBLE; expire = schedule_timeout(expire); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout && !signal_pending(current)) { set_current_state(TASK_INTERRUPTIBLE); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at
[UPDATE PATCH] push rounding up of relative request to schedule_timeout()
On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: Hi, On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + unsigned long expire_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + expire_jifs = MAX_SCHEDULE_TIMEOUT; + } else { + /* + * msecs_to_jiffies() is a unit conversion, which truncates + * (rounds down), so we need to add 1. + */ + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + } + + expire_jifs = schedule_timeout(expire_jifs); + + /* + * don't need to add 1 here, even though there is truncation, + * because we will add 1 if/when the value is sent back in + */ + return jiffies_to_msecs(expire_jifs); +} As I already mentioned for msleep_interruptible this is a really terrible interface. The jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1) case (when the process is immediately woken up again) makes the caller suspectible to timeout manipulations and requires constant reauditing, that no caller gets it wrong, so it's better to avoid this error source completely. After some thought today, I realized the +1 case is not specific to milliseconds. It's just that it's only being done *correctly* in the milliseconds case...I think ;) snip Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) rounding-ups occur. I'll fix those separately if this patch goes in. I change the one in sys_nanosleep() below to maintain the same latency as we currently have. I also screwed up my layout in the previous submission, sorry about that. Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Modify the callers of schedule_timeout() in timer.c to not add 1 themselves. Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED] --- timer.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.0 -0700 +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.0 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(timer); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates -* (rounds down), so we need to add 1. +* (rounds down), so we need to add 1, but this is taken +* care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim if ((t.tv_nsec = 10L) || (t.tv_nsec 0) || (t.tv_sec 0)) return -EINVAL; - expire = timespec_to_jiffies(t) + (t.tv_sec || t.tv_nsec); + expire = timespec_to_jiffies(t); current-state = TASK_INTERRUPTIBLE; expire = schedule_timeout(expire); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout !signal_pending(current)) { set_current_state(TASK_INTERRUPTIBLE); - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/