Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-17 Thread Nishanth Aravamudan
On 17.08.2005 [12:51:17 -0700], George Anzinger wrote:
> Nishanth Aravamudan wrote:
> ~
> >>IMNSHO we should not get too parental with kernel only interfaces. 
> >>Adding 1 is easy enough for the caller and even easier to explain in the 
> >>instructions (i.e. this call sleeps for X jiffies edges).  This allows 
> >>the caller to do more if needed and, should he ever just want to sync to 
> >>the next jiffie he does not have to deal with backing out that +1.
> >
> >
> >I don't want to be too parental either, but I also am trying to avoid
> >code duplication. Lots of drivers basically do something like
> >poll_event() does (or could do with some changes), i.e. looping a
> >constant amount multiple times, checking something every so often. The
> >patch was just a thought, though. I will keep evaluating drivers and see
> >if it's a useful interface to have eventually.
> >
> >I guess I'm just concerned with making an unintuitive interface. As was
> >brought up at OLS, drivers are a major source of bugs/buggy code. The
> >simpler, more useful we can make interfaces, the better, I think. I'm
> >not claiming you disagree, I just want to make my own motives clear.
> >While fixing up the schedule_timeout() comment would make it clear what
> >schedule_timeout() achieves, I'm not sure how useful such an interface
> >is, if every caller adds 1 :) I need to mull it over, though... Lots to
> >consider. I also, of course, want to stay flexible for the reasons you
> >mention (letting the driver adjust the timeout as they expect to).
> 
> I would leave the +1 alone and put in the correct documentation.  This 
> way _more_ folks will be made aware of the mid jiffie issue.  Far to 
> often we see (and let get in) patches that mess up user interfaces 
> around this issue.  The recent changes to itimer come to mind...

Ok, makes sense to me; does the following patch work for everybody? The
wording is a bit awkward, but so is the issue :)

Description: Fix schedule_timeout()'s comment, indicated the inter-tick
rounding issue. Since the kernel does not keep track of an inter-tick
position in jiffies, a caller which wishes to sleep for at least a
certain number of jiffies should add its request to the *next* jiffies
value (meaning add 1 to its relative request). Make that clear in the
comment.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>

---

diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c
--- 2.6.13-rc6/kernel/timer.c   2005-08-09 15:22:57.0 -0700
+++ 2.6.13-rc6-dev/kernel/timer.c   2005-08-17 15:21:35.0 -0700
@@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
  *
- * Make the current task sleep until @timeout jiffies have
- * elapsed. The routine will return immediately unless
- * the current task state has been set (see set_current_state()).
+ * Make the current task sleep until @timeout timer interrupts have
+ * occurred, meaning jiffies has incremented @timeout times and not
+ * necessarily that @timeout jiffies have elapsed. If the task wishes to
+ * sleep until (at least) @timeout jiffies have elapsed, then it should
+ * add 1 to its request. This is necessary, as the kernel does not keep
+ * track of an inter-jiffy position, so the caller must "round up" its
+ * request so that it begins at the next jiffy. The routine will return
+ * immediately unless the current task state has been set (see
+ * set_current_state()).
  *
  * You can set the task state as follows -
  *
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-17 Thread George Anzinger

Nishanth Aravamudan wrote:
~
IMNSHO we should not get too parental with kernel only interfaces. 
Adding 1 is easy enough for the caller and even easier to explain in the 
instructions (i.e. this call sleeps for X jiffies edges).  This allows 
the caller to do more if needed and, should he ever just want to sync to 
the next jiffie he does not have to deal with backing out that +1.



I don't want to be too parental either, but I also am trying to avoid
code duplication. Lots of drivers basically do something like
poll_event() does (or could do with some changes), i.e. looping a
constant amount multiple times, checking something every so often. The
patch was just a thought, though. I will keep evaluating drivers and see
if it's a useful interface to have eventually.

I guess I'm just concerned with making an unintuitive interface. As was
brought up at OLS, drivers are a major source of bugs/buggy code. The
simpler, more useful we can make interfaces, the better, I think. I'm
not claiming you disagree, I just want to make my own motives clear.
While fixing up the schedule_timeout() comment would make it clear what
schedule_timeout() achieves, I'm not sure how useful such an interface
is, if every caller adds 1 :) I need to mull it over, though... Lots to
consider. I also, of course, want to stay flexible for the reasons you
mention (letting the driver adjust the timeout as they expect to).


I would leave the +1 alone and put in the correct documentation.  This 
way _more_ folks will be made aware of the mid jiffie issue.  Far to 
often we see (and let get in) patches that mess up user interfaces 
around this issue.  The recent changes to itimer come to mind...



~
--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-17 Thread George Anzinger

Nishanth Aravamudan wrote:
~
IMNSHO we should not get too parental with kernel only interfaces. 
Adding 1 is easy enough for the caller and even easier to explain in the 
instructions (i.e. this call sleeps for X jiffies edges).  This allows 
the caller to do more if needed and, should he ever just want to sync to 
the next jiffie he does not have to deal with backing out that +1.



I don't want to be too parental either, but I also am trying to avoid
code duplication. Lots of drivers basically do something like
poll_event() does (or could do with some changes), i.e. looping a
constant amount multiple times, checking something every so often. The
patch was just a thought, though. I will keep evaluating drivers and see
if it's a useful interface to have eventually.

I guess I'm just concerned with making an unintuitive interface. As was
brought up at OLS, drivers are a major source of bugs/buggy code. The
simpler, more useful we can make interfaces, the better, I think. I'm
not claiming you disagree, I just want to make my own motives clear.
While fixing up the schedule_timeout() comment would make it clear what
schedule_timeout() achieves, I'm not sure how useful such an interface
is, if every caller adds 1 :) I need to mull it over, though... Lots to
consider. I also, of course, want to stay flexible for the reasons you
mention (letting the driver adjust the timeout as they expect to).


I would leave the +1 alone and put in the correct documentation.  This 
way _more_ folks will be made aware of the mid jiffie issue.  Far to 
often we see (and let get in) patches that mess up user interfaces 
around this issue.  The recent changes to itimer come to mind...



~
--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-17 Thread Nishanth Aravamudan
On 17.08.2005 [12:51:17 -0700], George Anzinger wrote:
 Nishanth Aravamudan wrote:
 ~
 IMNSHO we should not get too parental with kernel only interfaces. 
 Adding 1 is easy enough for the caller and even easier to explain in the 
 instructions (i.e. this call sleeps for X jiffies edges).  This allows 
 the caller to do more if needed and, should he ever just want to sync to 
 the next jiffie he does not have to deal with backing out that +1.
 
 
 I don't want to be too parental either, but I also am trying to avoid
 code duplication. Lots of drivers basically do something like
 poll_event() does (or could do with some changes), i.e. looping a
 constant amount multiple times, checking something every so often. The
 patch was just a thought, though. I will keep evaluating drivers and see
 if it's a useful interface to have eventually.
 
 I guess I'm just concerned with making an unintuitive interface. As was
 brought up at OLS, drivers are a major source of bugs/buggy code. The
 simpler, more useful we can make interfaces, the better, I think. I'm
 not claiming you disagree, I just want to make my own motives clear.
 While fixing up the schedule_timeout() comment would make it clear what
 schedule_timeout() achieves, I'm not sure how useful such an interface
 is, if every caller adds 1 :) I need to mull it over, though... Lots to
 consider. I also, of course, want to stay flexible for the reasons you
 mention (letting the driver adjust the timeout as they expect to).
 
 I would leave the +1 alone and put in the correct documentation.  This 
 way _more_ folks will be made aware of the mid jiffie issue.  Far to 
 often we see (and let get in) patches that mess up user interfaces 
 around this issue.  The recent changes to itimer come to mind...

Ok, makes sense to me; does the following patch work for everybody? The
wording is a bit awkward, but so is the issue :)

Description: Fix schedule_timeout()'s comment, indicated the inter-tick
rounding issue. Since the kernel does not keep track of an inter-tick
position in jiffies, a caller which wishes to sleep for at least a
certain number of jiffies should add its request to the *next* jiffies
value (meaning add 1 to its relative request). Make that clear in the
comment.

Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED]

---

diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c
--- 2.6.13-rc6/kernel/timer.c   2005-08-09 15:22:57.0 -0700
+++ 2.6.13-rc6-dev/kernel/timer.c   2005-08-17 15:21:35.0 -0700
@@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
  *
- * Make the current task sleep until @timeout jiffies have
- * elapsed. The routine will return immediately unless
- * the current task state has been set (see set_current_state()).
+ * Make the current task sleep until @timeout timer interrupts have
+ * occurred, meaning jiffies has incremented @timeout times and not
+ * necessarily that @timeout jiffies have elapsed. If the task wishes to
+ * sleep until (at least) @timeout jiffies have elapsed, then it should
+ * add 1 to its request. This is necessary, as the kernel does not keep
+ * track of an inter-jiffy position, so the caller must round up its
+ * request so that it begins at the next jiffy. The routine will return
+ * immediately unless the current task state has been set (see
+ * set_current_state()).
  *
  * You can set the task state as follows -
  *
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread Nishanth Aravamudan
On 16.08.2005 [17:39:11 -0700], George Anzinger wrote:
> Nishanth Aravamudan wrote:
> >On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
> >
> >>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
> >>timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
> >>repeating times (see setitimer code) we need the actual time as we KNOW 
> >>where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
> >>for the initial time, not the repeating time.
> >>
> >>
> >>See:
> >>http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2
> >
> >
> >I followed that thread, George, but I think it's a different case with
> >schedule_timeout() [maybe this indicates drivers/other users should
> >maybe be using itimers, but I'll get to that in a sec].
> 
> I think I miss understood back then :).

Ok, no problem. I was just thinking today about all the issues that were
borught up... I really appreciate your feedback.

> >
> >With schedule_timeout(), we are just given a relative jiffies value. We
> >have no context as to which task is requesting the delay, per se,
> >meaning we don't (can't) know from the interface whether this is the
> >first delay in a sequence, or a brand new one, without changing all
> >users to have some sort of control structure. The callers of
> >schedule_timeout() don't even get a pointer to the timer added
> >internally.
> >
> >So, adding 1 to all sleeps seems like it might be reasonable, as looping
> >sleeps probably need to use a different interface. I had worked a bit
> >ago on something like poll_event() with the kernel-janitors group, which
> >would abstract out the repeated sleeps. Basically wait_event() without
> >wait-queues... Maybe we could make such an interface just use itimers?
> >I've attached my old patch for poll_event(), just for reference.
> 
> I think not.  itimers is really pointed at a particular system call and 
> has resources in the task structure to do it.  These would be hard to 
> share...

Ok, I wasn't sure about that -- I'm not too familiar with the itimer
code (maybe I'll take a look at it soon :) )

> >My point, I guess, is that in the schedule_timeout() case, we don't know
> >where the jiffies edge is, as we either expire or receive a wait-queue
> >event/signal, we never mod_timer() the internal timer... So we have to
> >assume that we need to sleep the request. But maybe Roman's idea of
> >sleeping a certain number of jiffy edges is sufficient. I am not yet
> >convinced driver authors want/need such an interface, though, still
> >thinking it over.
> 
> IMNSHO we should not get too parental with kernel only interfaces. 
> Adding 1 is easy enough for the caller and even easier to explain in the 
> instructions (i.e. this call sleeps for X jiffies edges).  This allows 
> the caller to do more if needed and, should he ever just want to sync to 
> the next jiffie he does not have to deal with backing out that +1.

I don't want to be too parental either, but I also am trying to avoid
code duplication. Lots of drivers basically do something like
poll_event() does (or could do with some changes), i.e. looping a
constant amount multiple times, checking something every so often. The
patch was just a thought, though. I will keep evaluating drivers and see
if it's a useful interface to have eventually.

I guess I'm just concerned with making an unintuitive interface. As was
brought up at OLS, drivers are a major source of bugs/buggy code. The
simpler, more useful we can make interfaces, the better, I think. I'm
not claiming you disagree, I just want to make my own motives clear.
While fixing up the schedule_timeout() comment would make it clear what
schedule_timeout() achieves, I'm not sure how useful such an interface
is, if every caller adds 1 :) I need to mull it over, though... Lots to
consider. I also, of course, want to stay flexible for the reasons you
mention (letting the driver adjust the timeout as they expect to).

Thanks,
Nish
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread George Anzinger

Nishanth Aravamudan wrote:

On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:

Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
repeating times (see setitimer code) we need the actual time as we KNOW 
where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
for the initial time, not the repeating time.



See:
http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2



I followed that thread, George, but I think it's a different case with
schedule_timeout() [maybe this indicates drivers/other users should
maybe be using itimers, but I'll get to that in a sec].


I think I miss understood back then :).



With schedule_timeout(), we are just given a relative jiffies value. We
have no context as to which task is requesting the delay, per se,
meaning we don't (can't) know from the interface whether this is the
first delay in a sequence, or a brand new one, without changing all
users to have some sort of control structure. The callers of
schedule_timeout() don't even get a pointer to the timer added
internally.

So, adding 1 to all sleeps seems like it might be reasonable, as looping
sleeps probably need to use a different interface. I had worked a bit
ago on something like poll_event() with the kernel-janitors group, which
would abstract out the repeated sleeps. Basically wait_event() without
wait-queues... Maybe we could make such an interface just use itimers?
I've attached my old patch for poll_event(), just for reference.


I think not.  itimers is really pointed at a particular system call and 
has resources in the task structure to do it.  These would be hard to 
share...


My point, I guess, is that in the schedule_timeout() case, we don't know
where the jiffies edge is, as we either expire or receive a wait-queue
event/signal, we never mod_timer() the internal timer... So we have to
assume that we need to sleep the request. But maybe Roman's idea of
sleeping a certain number of jiffy edges is sufficient. I am not yet
convinced driver authors want/need such an interface, though, still
thinking it over.


IMNSHO we should not get too parental with kernel only interfaces. 
Adding 1 is easy enough for the caller and even easier to explain in the 
instructions (i.e. this call sleeps for X jiffies edges).  This allows 
the caller to do more if needed and, should he ever just want to sync to 
the next jiffie he does not have to deal with backing out that +1.




--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread Nishanth Aravamudan
On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
> Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
> timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
> repeating times (see setitimer code) we need the actual time as we KNOW 
> where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
> for the initial time, not the repeating time.
> 
> 
> See:
> http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2

I followed that thread, George, but I think it's a different case with
schedule_timeout() [maybe this indicates drivers/other users should
maybe be using itimers, but I'll get to that in a sec].

With schedule_timeout(), we are just given a relative jiffies value. We
have no context as to which task is requesting the delay, per se,
meaning we don't (can't) know from the interface whether this is the
first delay in a sequence, or a brand new one, without changing all
users to have some sort of control structure. The callers of
schedule_timeout() don't even get a pointer to the timer added
internally.

So, adding 1 to all sleeps seems like it might be reasonable, as looping
sleeps probably need to use a different interface. I had worked a bit
ago on something like poll_event() with the kernel-janitors group, which
would abstract out the repeated sleeps. Basically wait_event() without
wait-queues... Maybe we could make such an interface just use itimers?
I've attached my old patch for poll_event(), just for reference.

My point, I guess, is that in the schedule_timeout() case, we don't know
where the jiffies edge is, as we either expire or receive a wait-queue
event/signal, we never mod_timer() the internal timer... So we have to
assume that we need to sleep the request. But maybe Roman's idea of
sleeping a certain number of jiffy edges is sufficient. I am not yet
convinced driver authors want/need such an interface, though, still
thinking it over.

Thanks for all the feedback,

Nish


--- 2.6.11-kj-v/include/linux/delay.h   2005-03-01 23:37:51.0 -0800
+++ 2.6.11-kj/include/linux/delay.h 2005-03-04 16:03:24.0 -0800
@@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s
msleep(seconds * 1000);
 }
 
+/*
+ * poll_event* check if an @condition is true every @interval milliseconds,
+ * up to a @timeout maximum, if specified, else forever.
+ * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE
+ * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals
+ * Note that these interfaces are distinctly different than
+ * wait_event*() in two ways:
+ * 1) No wait-queues
+ * 2) Specify the time you want to sleep, not when you want to
+ * wake-up. Thus, callers should not add the current time to
+ * @timeout before calling, as it will be done by the macro.
+ */
+
+/**
+ * poll_event - check a condition at regular intervals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ */
+#define poll_event(condition, interval)\
+do {   \
+   while (!(condition))\
+   msleep(interval);   \
+} while(0)
+
+/**
+ * poll_event_timeout - check a condition at regular intervals with a timeout
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * @timeout: Maximum total number of milliseconds to take
+ * returns: 0, if condition caused return
+ * -ETIME, if timeout
+ */
+#define poll_event_timeout(condition, interval, timeout)   \
+({ \
+   int __ret = 0;  \
+   unsigned long stop = jiffies + msecs_to_jiffies(timeout);   \
+   \
+   for (;;) {  \
+   if (condition)  \
+   break;  \
+   msleep(interval);   \
+   if (time_after(jiffies, stop)) {\
+   __ret = -ETIME; \
+   break;  \
+   }   \
+   }   \
+   __ret;  \
+})
+
+/**
+ * poll_event_interruptible - check a condition at regular intervals, 
returning early on signals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * returns: 0, if condition caused return
+ * -EINTR, if a signal
+ */
+#define 

Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread Nishanth Aravamudan
On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
 Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
 timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
 repeating times (see setitimer code) we need the actual time as we KNOW 
 where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
 for the initial time, not the repeating time.
 
 
 See:
 http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2

I followed that thread, George, but I think it's a different case with
schedule_timeout() [maybe this indicates drivers/other users should
maybe be using itimers, but I'll get to that in a sec].

With schedule_timeout(), we are just given a relative jiffies value. We
have no context as to which task is requesting the delay, per se,
meaning we don't (can't) know from the interface whether this is the
first delay in a sequence, or a brand new one, without changing all
users to have some sort of control structure. The callers of
schedule_timeout() don't even get a pointer to the timer added
internally.

So, adding 1 to all sleeps seems like it might be reasonable, as looping
sleeps probably need to use a different interface. I had worked a bit
ago on something like poll_event() with the kernel-janitors group, which
would abstract out the repeated sleeps. Basically wait_event() without
wait-queues... Maybe we could make such an interface just use itimers?
I've attached my old patch for poll_event(), just for reference.

My point, I guess, is that in the schedule_timeout() case, we don't know
where the jiffies edge is, as we either expire or receive a wait-queue
event/signal, we never mod_timer() the internal timer... So we have to
assume that we need to sleep the request. But maybe Roman's idea of
sleeping a certain number of jiffy edges is sufficient. I am not yet
convinced driver authors want/need such an interface, though, still
thinking it over.

Thanks for all the feedback,

Nish


--- 2.6.11-kj-v/include/linux/delay.h   2005-03-01 23:37:51.0 -0800
+++ 2.6.11-kj/include/linux/delay.h 2005-03-04 16:03:24.0 -0800
@@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s
msleep(seconds * 1000);
 }
 
+/*
+ * poll_event* check if an @condition is true every @interval milliseconds,
+ * up to a @timeout maximum, if specified, else forever.
+ * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE
+ * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals
+ * Note that these interfaces are distinctly different than
+ * wait_event*() in two ways:
+ * 1) No wait-queues
+ * 2) Specify the time you want to sleep, not when you want to
+ * wake-up. Thus, callers should not add the current time to
+ * @timeout before calling, as it will be done by the macro.
+ */
+
+/**
+ * poll_event - check a condition at regular intervals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ */
+#define poll_event(condition, interval)\
+do {   \
+   while (!(condition))\
+   msleep(interval);   \
+} while(0)
+
+/**
+ * poll_event_timeout - check a condition at regular intervals with a timeout
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * @timeout: Maximum total number of milliseconds to take
+ * returns: 0, if condition caused return
+ * -ETIME, if timeout
+ */
+#define poll_event_timeout(condition, interval, timeout)   \
+({ \
+   int __ret = 0;  \
+   unsigned long stop = jiffies + msecs_to_jiffies(timeout);   \
+   \
+   for (;;) {  \
+   if (condition)  \
+   break;  \
+   msleep(interval);   \
+   if (time_after(jiffies, stop)) {\
+   __ret = -ETIME; \
+   break;  \
+   }   \
+   }   \
+   __ret;  \
+})
+
+/**
+ * poll_event_interruptible - check a condition at regular intervals, 
returning early on signals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * returns: 0, if condition caused return
+ * -EINTR, if a signal
+ */
+#define 

Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread George Anzinger

Nishanth Aravamudan wrote:

On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:

Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
repeating times (see setitimer code) we need the actual time as we KNOW 
where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
for the initial time, not the repeating time.



See:
http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2



I followed that thread, George, but I think it's a different case with
schedule_timeout() [maybe this indicates drivers/other users should
maybe be using itimers, but I'll get to that in a sec].


I think I miss understood back then :).



With schedule_timeout(), we are just given a relative jiffies value. We
have no context as to which task is requesting the delay, per se,
meaning we don't (can't) know from the interface whether this is the
first delay in a sequence, or a brand new one, without changing all
users to have some sort of control structure. The callers of
schedule_timeout() don't even get a pointer to the timer added
internally.

So, adding 1 to all sleeps seems like it might be reasonable, as looping
sleeps probably need to use a different interface. I had worked a bit
ago on something like poll_event() with the kernel-janitors group, which
would abstract out the repeated sleeps. Basically wait_event() without
wait-queues... Maybe we could make such an interface just use itimers?
I've attached my old patch for poll_event(), just for reference.


I think not.  itimers is really pointed at a particular system call and 
has resources in the task structure to do it.  These would be hard to 
share...


My point, I guess, is that in the schedule_timeout() case, we don't know
where the jiffies edge is, as we either expire or receive a wait-queue
event/signal, we never mod_timer() the internal timer... So we have to
assume that we need to sleep the request. But maybe Roman's idea of
sleeping a certain number of jiffy edges is sufficient. I am not yet
convinced driver authors want/need such an interface, though, still
thinking it over.


IMNSHO we should not get too parental with kernel only interfaces. 
Adding 1 is easy enough for the caller and even easier to explain in the 
instructions (i.e. this call sleeps for X jiffies edges).  This allows 
the caller to do more if needed and, should he ever just want to sync to 
the next jiffie he does not have to deal with backing out that +1.




--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-16 Thread Nishanth Aravamudan
On 16.08.2005 [17:39:11 -0700], George Anzinger wrote:
 Nishanth Aravamudan wrote:
 On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
 
 Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
 timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
 repeating times (see setitimer code) we need the actual time as we KNOW 
 where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
 for the initial time, not the repeating time.
 
 
 See:
 http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2
 
 
 I followed that thread, George, but I think it's a different case with
 schedule_timeout() [maybe this indicates drivers/other users should
 maybe be using itimers, but I'll get to that in a sec].
 
 I think I miss understood back then :).

Ok, no problem. I was just thinking today about all the issues that were
borught up... I really appreciate your feedback.

 
 With schedule_timeout(), we are just given a relative jiffies value. We
 have no context as to which task is requesting the delay, per se,
 meaning we don't (can't) know from the interface whether this is the
 first delay in a sequence, or a brand new one, without changing all
 users to have some sort of control structure. The callers of
 schedule_timeout() don't even get a pointer to the timer added
 internally.
 
 So, adding 1 to all sleeps seems like it might be reasonable, as looping
 sleeps probably need to use a different interface. I had worked a bit
 ago on something like poll_event() with the kernel-janitors group, which
 would abstract out the repeated sleeps. Basically wait_event() without
 wait-queues... Maybe we could make such an interface just use itimers?
 I've attached my old patch for poll_event(), just for reference.
 
 I think not.  itimers is really pointed at a particular system call and 
 has resources in the task structure to do it.  These would be hard to 
 share...

Ok, I wasn't sure about that -- I'm not too familiar with the itimer
code (maybe I'll take a look at it soon :) )

 My point, I guess, is that in the schedule_timeout() case, we don't know
 where the jiffies edge is, as we either expire or receive a wait-queue
 event/signal, we never mod_timer() the internal timer... So we have to
 assume that we need to sleep the request. But maybe Roman's idea of
 sleeping a certain number of jiffy edges is sufficient. I am not yet
 convinced driver authors want/need such an interface, though, still
 thinking it over.
 
 IMNSHO we should not get too parental with kernel only interfaces. 
 Adding 1 is easy enough for the caller and even easier to explain in the 
 instructions (i.e. this call sleeps for X jiffies edges).  This allows 
 the caller to do more if needed and, should he ever just want to sync to 
 the next jiffie he does not have to deal with backing out that +1.

I don't want to be too parental either, but I also am trying to avoid
code duplication. Lots of drivers basically do something like
poll_event() does (or could do with some changes), i.e. looping a
constant amount multiple times, checking something every so often. The
patch was just a thought, though. I will keep evaluating drivers and see
if it's a useful interface to have eventually.

I guess I'm just concerned with making an unintuitive interface. As was
brought up at OLS, drivers are a major source of bugs/buggy code. The
simpler, more useful we can make interfaces, the better, I think. I'm
not claiming you disagree, I just want to make my own motives clear.
While fixing up the schedule_timeout() comment would make it clear what
schedule_timeout() achieves, I'm not sure how useful such an interface
is, if every caller adds 1 :) I need to mull it over, though... Lots to
consider. I also, of course, want to stay flexible for the reasons you
mention (letting the driver adjust the timeout as they expect to).

Thanks,
Nish
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread Nish Aravamudan
On 8/4/05, George Anzinger  wrote:
> Nishanth Aravamudan wrote:
> ~
> > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
> > (to account for this same issue, I believe, as POSIX demands no early
> > return from nanosleep() calls). There are some other locations where
> > similar
> >
> >   + (t.tv_sec || t.tv_nsec)
> 
> This is not the same as "always add 1".  We don't do it this way just to
> have fun with C.  If you change schedule_timeout() to add the 1,
> nanosleep() will need to do things differently to get the same behavior.
>   (And, YES users do pass in zero sleep times.)

Fair enough. Will need to think about this more.

Thanks,
Nish
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread Nish Aravamudan
On 8/4/05, George Anzinger  wrote:
> Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and
> timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For
> repeating times (see setitimer code) we need the actual time as we KNOW
> where the jiffies edge is in the repeating case.  The +1 is needed ONLY
> for the initial time, not the repeating time.

Please read the patch. I didn't touch timespec_to_jiffies() or
timeval_to_jiffies(). Not sure why you think I did. I agree that we
only need the initial time, my patch is no good. But it is hard for
non-itimers, like schedule_timeout() callers, to provide an interface
that only adds 1 to the initial request, since the callers currently
pass in an absolute jiffies value.

Thanks,
Nish
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread George Anzinger

Nishanth Aravamudan wrote:
~

Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)


This is not the same as "always add 1".  We don't do it this way just to 
have fun with C.  If you change schedule_timeout() to add the 1, 
nanosleep() will need to do things differently to get the same behavior. 
 (And, YES users do pass in zero sleep times.)



--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread George Anzinger
Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
repeating times (see setitimer code) we need the actual time as we KNOW 
where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
for the initial time, not the repeating time.



See:
http://marc.theaimsgroup.com/?l=linux-kernel=112208357906156=2

George

Nishanth Aravamudan wrote:

On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:


On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:


Hi,

On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:



+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+   unsigned long expire_jifs;
+
+   if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+   expire_jifs = MAX_SCHEDULE_TIMEOUT;
+   } else {
+   /*
+* msecs_to_jiffies() is a unit conversion, which truncates
+* (rounds down), so we need to add 1.
+*/
+   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   }
+
+   expire_jifs = schedule_timeout(expire_jifs);
+
+   /*
+* don't need to add 1 here, even though there is truncation,
+* because we will add 1 if/when the value is sent back in
+*/
+   return jiffies_to_msecs(expire_jifs);
+}


As I already mentioned for msleep_interruptible this is a really terrible 
interface.
The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
process is immediately woken up again) makes the caller suspectible to 
timeout manipulations and requires constant reauditing, that no caller 
gets it wrong, so it's better to avoid this error source completely.


After some thought today, I realized the +1 case is not specific to
milliseconds. It's just that it's only being done *correctly* in the
milliseconds case...I think ;)






Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case).



Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)

rounding-ups occur. I'll fix those separately if this patch goes in.  I
change the one in sys_nanosleep() below to maintain the same latency as
we currently have. I also screwed up my layout in the previous
submission, sorry about that.

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case). Modify the callers of schedule_timeout() in timer.c to not
add 1 themselves.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>

---

 timer.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

--- 2.6.13-rc4-dev/kernel/timer.c   2005-08-01 12:31:53.0 -0700
+++ 2.6.13-rc5/kernel/timer.c   2005-08-03 22:05:16.0 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
}
}
 
-	expire = timeout + jiffies;

+   expire = timeout + jiffies + 1;
 
 	init_timer();

timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
} else {
/*
 * msecs_to_jiffies() is a unit conversion, which truncates
-* (rounds down), so we need to add 1.
+* (rounds down), so we need to add 1, but this is taken
+* care of by schedule_timeout() now.
 */
-   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   expire_jifs = msecs_to_jiffies(timeout_msecs);
}
 
 	expire_jifs = schedule_timeout(expire_jifs);

@@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
if ((t.tv_nsec >= 10L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
return -EINVAL;
 
-	expire = timespec_to_jiffies() + (t.tv_sec || t.tv_nsec);

+   expire = timespec_to_jiffies();
current->state = TASK_INTERRUPTIBLE;
expire = schedule_timeout(expire);
 
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time

  */
 void msleep(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout) {

set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+  

Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread George Anzinger
Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
repeating times (see setitimer code) we need the actual time as we KNOW 
where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
for the initial time, not the repeating time.



See:
http://marc.theaimsgroup.com/?l=linux-kernelm=112208357906156w=2

George

Nishanth Aravamudan wrote:

On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:


On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:


Hi,

On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:



+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+   unsigned long expire_jifs;
+
+   if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+   expire_jifs = MAX_SCHEDULE_TIMEOUT;
+   } else {
+   /*
+* msecs_to_jiffies() is a unit conversion, which truncates
+* (rounds down), so we need to add 1.
+*/
+   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   }
+
+   expire_jifs = schedule_timeout(expire_jifs);
+
+   /*
+* don't need to add 1 here, even though there is truncation,
+* because we will add 1 if/when the value is sent back in
+*/
+   return jiffies_to_msecs(expire_jifs);
+}


As I already mentioned for msleep_interruptible this is a really terrible 
interface.
The jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1) case (when the 
process is immediately woken up again) makes the caller suspectible to 
timeout manipulations and requires constant reauditing, that no caller 
gets it wrong, so it's better to avoid this error source completely.


After some thought today, I realized the +1 case is not specific to
milliseconds. It's just that it's only being done *correctly* in the
milliseconds case...I think ;)



snip


Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case).



Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)

rounding-ups occur. I'll fix those separately if this patch goes in.  I
change the one in sys_nanosleep() below to maintain the same latency as
we currently have. I also screwed up my layout in the previous
submission, sorry about that.

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case). Modify the callers of schedule_timeout() in timer.c to not
add 1 themselves.

Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED]

---

 timer.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

--- 2.6.13-rc4-dev/kernel/timer.c   2005-08-01 12:31:53.0 -0700
+++ 2.6.13-rc5/kernel/timer.c   2005-08-03 22:05:16.0 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
}
}
 
-	expire = timeout + jiffies;

+   expire = timeout + jiffies + 1;
 
 	init_timer(timer);

timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
} else {
/*
 * msecs_to_jiffies() is a unit conversion, which truncates
-* (rounds down), so we need to add 1.
+* (rounds down), so we need to add 1, but this is taken
+* care of by schedule_timeout() now.
 */
-   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   expire_jifs = msecs_to_jiffies(timeout_msecs);
}
 
 	expire_jifs = schedule_timeout(expire_jifs);

@@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
if ((t.tv_nsec = 10L) || (t.tv_nsec  0) || (t.tv_sec  0))
return -EINVAL;
 
-	expire = timespec_to_jiffies(t) + (t.tv_sec || t.tv_nsec);

+   expire = timespec_to_jiffies(t);
current-state = TASK_INTERRUPTIBLE;
expire = schedule_timeout(expire);
 
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time

  */
 void msleep(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout) {

set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+ 

Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread George Anzinger

Nishanth Aravamudan wrote:
~

Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)


This is not the same as always add 1.  We don't do it this way just to 
have fun with C.  If you change schedule_timeout() to add the 1, 
nanosleep() will need to do things differently to get the same behavior. 
 (And, YES users do pass in zero sleep times.)



--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread Nish Aravamudan
On 8/4/05, George Anzinger george@mvista.com wrote:
 Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and
 timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For
 repeating times (see setitimer code) we need the actual time as we KNOW
 where the jiffies edge is in the repeating case.  The +1 is needed ONLY
 for the initial time, not the repeating time.

Please read the patch. I didn't touch timespec_to_jiffies() or
timeval_to_jiffies(). Not sure why you think I did. I agree that we
only need the initial time, my patch is no good. But it is hard for
non-itimers, like schedule_timeout() callers, to provide an interface
that only adds 1 to the initial request, since the callers currently
pass in an absolute jiffies value.

Thanks,
Nish
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-04 Thread Nish Aravamudan
On 8/4/05, George Anzinger george@mvista.com wrote:
 Nishanth Aravamudan wrote:
 ~
  Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
  (to account for this same issue, I believe, as POSIX demands no early
  return from nanosleep() calls). There are some other locations where
  similar
 
+ (t.tv_sec || t.tv_nsec)
 
 This is not the same as always add 1.  We don't do it this way just to
 have fun with C.  If you change schedule_timeout() to add the 1,
 nanosleep() will need to do things differently to get the same behavior.
   (And, YES users do pass in zero sleep times.)

Fair enough. Will need to think about this more.

Thanks,
Nish
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-03 Thread Nishanth Aravamudan
On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:
> On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:
> > Hi,
> > 
> > On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:
> > 
> > > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> > > +{
> > > + unsigned long expire_jifs;
> > > +
> > > + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
> > > + expire_jifs = MAX_SCHEDULE_TIMEOUT;
> > > + } else {
> > > + /*
> > > +  * msecs_to_jiffies() is a unit conversion, which truncates
> > > +  * (rounds down), so we need to add 1.
> > > +  */
> > > + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
> > > + }
> > > +
> > > + expire_jifs = schedule_timeout(expire_jifs);
> > > +
> > > + /*
> > > +  * don't need to add 1 here, even though there is truncation,
> > > +  * because we will add 1 if/when the value is sent back in
> > > +  */
> > > + return jiffies_to_msecs(expire_jifs);
> > > +}
> > 
> > As I already mentioned for msleep_interruptible this is a really terrible 
> > interface.
> > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
> > process is immediately woken up again) makes the caller suspectible to 
> > timeout manipulations and requires constant reauditing, that no caller 
> > gets it wrong, so it's better to avoid this error source completely.
> 
> After some thought today, I realized the +1 case is not specific to
> milliseconds. It's just that it's only being done *correctly* in the
> milliseconds case...I think ;)



> Description: Ensure that schedule_timeout() requests can not possibly
> expire early in the timeout case, by adding the requested relative jiffy
> value to the next value of jiffies. Currently, by adding to the current
> value of jiffies, we might actually expire a jiffy too early (in a
> corner case).

Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)

rounding-ups occur. I'll fix those separately if this patch goes in.  I
change the one in sys_nanosleep() below to maintain the same latency as
we currently have. I also screwed up my layout in the previous
submission, sorry about that.

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case). Modify the callers of schedule_timeout() in timer.c to not
add 1 themselves.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>

---

 timer.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

--- 2.6.13-rc4-dev/kernel/timer.c   2005-08-01 12:31:53.0 -0700
+++ 2.6.13-rc5/kernel/timer.c   2005-08-03 22:05:16.0 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
}
}
 
-   expire = timeout + jiffies;
+   expire = timeout + jiffies + 1;
 
init_timer();
timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
} else {
/*
 * msecs_to_jiffies() is a unit conversion, which truncates
-* (rounds down), so we need to add 1.
+* (rounds down), so we need to add 1, but this is taken
+* care of by schedule_timeout() now.
 */
-   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   expire_jifs = msecs_to_jiffies(timeout_msecs);
}
 
expire_jifs = schedule_timeout(expire_jifs);
@@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
if ((t.tv_nsec >= 10L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
return -EINVAL;
 
-   expire = timespec_to_jiffies() + (t.tv_sec || t.tv_nsec);
+   expire = timespec_to_jiffies();
current->state = TASK_INTERRUPTIBLE;
expire = schedule_timeout(expire);
 
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time
  */
 void msleep(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
while (timeout) {
set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
while (timeout && !signal_pending(current)) {
set_current_state(TASK_INTERRUPTIBLE);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  

[UPDATE PATCH] push rounding up of relative request to schedule_timeout()

2005-08-03 Thread Nishanth Aravamudan
On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:
 On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:
  Hi,
  
  On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:
  
   +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
   +{
   + unsigned long expire_jifs;
   +
   + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
   + expire_jifs = MAX_SCHEDULE_TIMEOUT;
   + } else {
   + /*
   +  * msecs_to_jiffies() is a unit conversion, which truncates
   +  * (rounds down), so we need to add 1.
   +  */
   + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
   + }
   +
   + expire_jifs = schedule_timeout(expire_jifs);
   +
   + /*
   +  * don't need to add 1 here, even though there is truncation,
   +  * because we will add 1 if/when the value is sent back in
   +  */
   + return jiffies_to_msecs(expire_jifs);
   +}
  
  As I already mentioned for msleep_interruptible this is a really terrible 
  interface.
  The jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1) case (when the 
  process is immediately woken up again) makes the caller suspectible to 
  timeout manipulations and requires constant reauditing, that no caller 
  gets it wrong, so it's better to avoid this error source completely.
 
 After some thought today, I realized the +1 case is not specific to
 milliseconds. It's just that it's only being done *correctly* in the
 milliseconds case...I think ;)

snip

 Description: Ensure that schedule_timeout() requests can not possibly
 expire early in the timeout case, by adding the requested relative jiffy
 value to the next value of jiffies. Currently, by adding to the current
 value of jiffies, we might actually expire a jiffy too early (in a
 corner case).

Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

+ (t.tv_sec || t.tv_nsec)

rounding-ups occur. I'll fix those separately if this patch goes in.  I
change the one in sys_nanosleep() below to maintain the same latency as
we currently have. I also screwed up my layout in the previous
submission, sorry about that.

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case). Modify the callers of schedule_timeout() in timer.c to not
add 1 themselves.

Signed-off-by: Nishanth Aravamudan [EMAIL PROTECTED]

---

 timer.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

--- 2.6.13-rc4-dev/kernel/timer.c   2005-08-01 12:31:53.0 -0700
+++ 2.6.13-rc5/kernel/timer.c   2005-08-03 22:05:16.0 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
}
}
 
-   expire = timeout + jiffies;
+   expire = timeout + jiffies + 1;
 
init_timer(timer);
timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
} else {
/*
 * msecs_to_jiffies() is a unit conversion, which truncates
-* (rounds down), so we need to add 1.
+* (rounds down), so we need to add 1, but this is taken
+* care of by schedule_timeout() now.
 */
-   expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+   expire_jifs = msecs_to_jiffies(timeout_msecs);
}
 
expire_jifs = schedule_timeout(expire_jifs);
@@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
if ((t.tv_nsec = 10L) || (t.tv_nsec  0) || (t.tv_sec  0))
return -EINVAL;
 
-   expire = timespec_to_jiffies(t) + (t.tv_sec || t.tv_nsec);
+   expire = timespec_to_jiffies(t);
current-state = TASK_INTERRUPTIBLE;
expire = schedule_timeout(expire);
 
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time
  */
 void msleep(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
while (timeout) {
set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-   unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+   unsigned long timeout = msecs_to_jiffies(msecs);
 
while (timeout  !signal_pending(current)) {
set_current_state(TASK_INTERRUPTIBLE);

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/