On Wed, Aug 01, 2007 at 07:17:51PM -0700, Linus Torvalds wrote: > > > On Wed, 1 Aug 2007, Roman Zippel wrote: > > > > I'm not so sure about that. sched_clock() has to be fast, so many archs > > may want to continue to use jiffies. As soon as one does that one can also > > save a lot of computational overhead by using 32bit instead of 64bit. > > The question is then how easy that is possible. > > I have to say, it would be interesting to try to use 32-bit arithmetic. > > I also think it's likely a mistake to do a nanosecond resolution. That's > part of what forces us to 64 bits, and it's just not even an *interesting* > resolution.
I would add that I have been bothered by the 64-bit arithmetics when trying to see what could be improved in the code. In fact, it's very hard to optimize anything when you have arithmetics on integers larger than the CPU's, and gcc is known not to emit very good code in this situation (I remember it could not play with registers renaming, etc...). However, I undertand why Ingo chose to use 64 bits. It has the advantage that the numbers never wrap within 584 years. I'm well aware that it's very difficult to keep tasks ordered according to a key which can wrap. But if we consider that we don't need to be more precise than the return value from gettimeofday() that all applications use, we see that a bunch of microseconds is enough. 32 bits at the microsecond level wraps around every hour. We may accept to recompute all keys every hour. It's not that dramatic. The problem is how to detect that we will need to. I remember a trick used by Tim Schmielau in his jiffies64 patch for 2.4. He kept a copy of the highest bit of the lower word in the lowest bit of the higher word, and considered that the lower one could not wrap before we could check it. I liked this approach, which could be translated here in something like the following : Have all keys use 32-bit resolution, and monitor the 32nd bit. All tasks must have the same value in this bit, otherwise we consider that their keys have wrapped. The "current" value of this bit is copied somewhere. When we walk the tree and find a task with a key which does not have its 32nd bit equal to the current value, it means that this key has wrapped, so we have to use this information in our arithmetics. When all keys have their 32nd bit different from the "current" value, then we switch this value to reflect the new 32nd bit, and everything is in sync again. The only requirement is that no key wraps around before the "current" value is switched. This implies that no couple of tasks could have their keys distant by more than 31 bits (35 minutes), which seems reasonable. If we can recompute all tasks' keys when all of them have wrapped, then we do not have to store the "current" bit value anymore, and consider that it is always zero instead (I don't know if the code permits this). It is possible that using the 32nd bit to detect the wrapping may impose us to perform some computations on 33 bits. If this is the case, then it would be fine if we reduced the range to 31 bits, with all tasks distant from at most 30 bits (17 minutes). Also, I remember that the key is signed. I've never experimented with the tricks above on signed values, but we might be able to define something like this for the higher bits : 00 = positive, no wrap 01 = positive, wrapped 10 = negative, wrapped 11 = negative, no wrap I have no code to show, I just wanted to expose this idea. I know that if Ingo likes it, he will beat everyone at implementing it ;-) > Linus Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/