Richard Henderson <r...@twiddle.net> writes:

> I used the following, almost certainly not appropriate for general
> application.
> [snip]

Thanks. It would be very useful to make GMP timing work with the kernel Linux running on ARM. Do you know if there are similar problems with, say, NetBSD?

I have checked in major ARM improvements over the last few days, inspired by your patches. I have an A9 but no A8, so I have optimised just for the former.

I have left the top-level mpn/arm code largely unmodified in order to keep supporting older v4 and v5 arch CPUs; new multiply code using umaal resides in the directory mpn/arm/v6. (Some new division code still in the forge will appear in mpn/arm/v5, due to its use of clz.)

The new code is carefully software pipelined, and mul_1 and addmul_1 run faster than both the old code and your patched code, at least on A9.

Could you please try it on A8 and see if it is at least as fast as your code there? If it is slower, we need to try to make innocent modifications that don't hurt A9, or, if that turns out to be hard, provide several functions and choose asm code based not only on architecture, but also on the exact core.

I would appreciate it if you timed all new and modified functions on A8.

If you have ARMv4 (e.g., StrongARM) and/or ARMv5 (e.g., XScale), I would appreciate it if you could check that they still work after the latest changes.

Do you know if there is a portable mechanism for recognising an ARM core, akin to x86's cpuid?

-- 
Torbjörn
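
To illustrate the per-core selection discussed above: one Linux-only (so not portable) way to tell an A8 from an A9 is to read the "CPU part" field of /proc/cpuinfo. The following is a minimal sketch under that assumption, not something taken from GMP. The function names cpuinfo_cpu_part, linux_cpu_part and arm_core_name are hypothetical. The part numbers 0xc08 (Cortex-A8) and 0xc09 (Cortex-A9) are only meaningful when the "CPU implementer" field is 0x41 (ARM), which this sketch does not check.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of the first "CPU part" field found in TEXT (text in the
   format of /proc/cpuinfo on ARM Linux), or -1 if there is no such field.  */
long
cpuinfo_cpu_part (const char *text)
{
  const char *p = strstr (text, "CPU part");
  const char *colon;
  char *end;
  long part;

  if (p == NULL)
    return -1;
  colon = strchr (p, ':');
  if (colon == NULL)
    return -1;
  part = strtol (colon + 1, &end, 0);   /* base 0 accepts the 0x prefix */
  if (end == colon + 1)                 /* no digits were converted */
    return -1;
  return part;
}

/* Read /proc/cpuinfo (Linux only) and return its "CPU part" value, or -1.  */
long
linux_cpu_part (void)
{
  char buf[4096];
  size_t n;
  FILE *f = fopen ("/proc/cpuinfo", "r");

  if (f == NULL)
    return -1;
  n = fread (buf, 1, sizeof buf - 1, f);
  fclose (f);
  buf[n] = '\0';
  return cpuinfo_cpu_part (buf);
}

/* Map an ARM Ltd. part number to a core name; only two cores are listed.  */
const char *
arm_core_name (long part)
{
  switch (part)
    {
    case 0xc08: return "cortex-a8";
    case 0xc09: return "cortex-a9";
    default:    return "unknown";
    }
}
```

Calling arm_core_name (linux_cpu_part ()) at configure time or at startup would give a string to select core-specific code by. On systems without /proc/cpuinfo, or on non-ARM hosts, it simply yields "unknown".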