Torbjorn Granlund <t...@gmplib.org> writes:

> It is OK for addmul_1, but our usage suffers from that they are on a
> tight critical path.

Hmm, if I understand you correctly, it is preferable if the cpu can
start doing the multiplication without any dependency on the carry from
the previous iteration, right? At least in theory, umaal could be
implemented in such a way.

> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0388i/DDI0388I_cortex_a9_r4p1_trm.pdf

Thanks! I have a couple of additional newbie questions:

1. What are the calling conventions?

2. What gcc flags should I use to be able to get uint64_t variables
   into neon registers?

(I'll look into that when I'm back at work on Monday, and I hope the
answers should be easy to find, so it's not urgent).

> I haven't played with Neon much. There are lots of instructions there
> which might be useful for us. At least lshift, lshiftc, rshift,
> popcount, hamdist, copyi, copyd, and com could be improved.

One difference to x86 simd (beyond style) is that there seem to be
several widening instructions, with 32-bit inputs and 64-bit outputs,
related to both multiplication and addition.

I've been looking primarily for operations useful for crypto, like wide
xor, shift/rotate, and other data shuffling. Or just using the
additional registers to store uint64_t variables would give a decent
speedup over using the regular registers, I imagine.

> Using Neon in a robust way might be a bit tricky, though. I have no
> idea how to determine if a CPU has Neon or not, and ARM has made most
> useful meta instructions supervisor-only.

For a start, I guess it could be a configure-time option (with no
fat-binary things). Either explicit, or set automatically based on,
e.g., Linux's /proc/cpuinfo, which lists the available cpu extensions.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
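
As a rough illustration of the automatic variant of the configure-time
check suggested above, here is a minimal sketch in C. It assumes the
usual 32-bit ARM Linux layout of /proc/cpuinfo, where a "Features" line
lists the extensions as space-separated words (something like
"Features : swp half thumb fastmult vfp edsp neon vfpv3 tls"). The
function names line_has_flag and cpuinfo_has_feature are hypothetical,
made up for this example, and are not part of GMP or its configure
machinery.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` occurs as a whole whitespace-separated word in
   `list`, otherwise 0.  A plain strstr is not enough, since "vfp"
   would then also match "vfpv3".  (Hypothetical helper.) */
static int
line_has_flag (const char *list, const char *flag)
{
  size_t n = strlen (flag);
  const char *p = list;

  if (n == 0)
    return 0;

  while ((p = strstr (p, flag)) != NULL)
    {
      int start_ok = (p == list) || p[-1] == ' ' || p[-1] == '\t';
      int end_ok = p[n] == '\0' || p[n] == ' ' || p[n] == '\t'
        || p[n] == '\n';
      if (start_ok && end_ok)
        return 1;
      p++;
    }
  return 0;
}

/* Scan a cpuinfo-style file for a "Features" line that lists `flag`.
   Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
   Assumes the Features line fits in the buffer.  (Hypothetical helper.) */
static int
cpuinfo_has_feature (const char *path, const char *flag)
{
  char buf[1024];
  int found = 0;
  FILE *f = fopen (path, "r");

  if (f == NULL)
    return -1;

  while (!found && fgets (buf, sizeof buf, f) != NULL)
    {
      const char *colon;

      if (strncmp (buf, "Features", 8) != 0)
        continue;
      colon = strchr (buf, ':');
      if (colon != NULL && line_has_flag (colon + 1, flag))
        found = 1;
    }

  fclose (f);
  return found;
}
```

A call such as cpuinfo_has_feature ("/proc/cpuinfo", "neon") would then
give the answer for the machine the check runs on. That is only
meaningful for native builds; when cross-compiling, the explicit option
would still be needed.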