On Fri, 22 Jul 2005 at 11:13:21 -0700 (PDT), Linus Torvalds wrote: > Something like the following (totally untested) should make it be > non-lazy. It's going to slow down normal task switches, but might speed up > the "restoring FP context all the time" case. > > Chuck? This should work fine with or without your inline thing. Does it > make any difference?
Now that I am looking at the right output file -- yes, it does (after fixing one small bug.) Volanomark set a new record of 6125 messages/sec and the profile shows almost zero hits in math_state_restore. Since fxsave leaves the FPU state intact, there ought to be a better way to do this but it gets tricky. Maybe using the TSC to put a timestamp in every thread save area? when saving FPU state: put cpu# and timestamp in thread state info also store timestamp in per-cpu data on task switch: compare cpu# and timestamps for next task if equal, clear TS and set TS_USEDFPU when state becomes invalid for some reason: zero cpu's timestamp But the extra overhead might be too much in many cases. (Below is what I changed in the original patch, if anyone wants to try it.) + if (tsk_used_math(next_p)) should be: + if (!tsk_used_math(next_p)) __ Chuck - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/