> I'm trying to learn a bit of ppc assembly. Below is an implementation of
> _chacha_core. Seems to work, when tested on gcc112.fsffrance.org (just
> put the file in the powerpc64 directory and reconfigure). This machine
> is little-endian, I haven't yet tested on big-endian.

Great work. The implementation looks fine. I like the idea of using -16
instead of 16 for the rotate count: the vspltisw immediate is limited to
the range -16 to 15, and vrlw uses only the low-order 5 bits of the
count, which are the same for -16 and 16.

BTW, this implementation should work as is in big-endian mode without
any hassle, because lxvw4x/stxvw4x are endianness-aware when loading and
storing word values.

> Unfortunately I don't get any accurate benchmark numbers on that
> machine, but I think speedup may be on the order of 50%. It could likely
> be speedup further by processing 2, 3 or 4 blocks in parallel, similar to
> recent improvements for arm and x86_64. I'd like to do that after the
> simpler single-block function is properly merged.

I can benchmark the optimized core, but it could take me a few days to
get it done. In the meantime you may want to try Unicamp Minicloud
https://openpower.ic.unicamp.br/minicloud
or POWER Cloud at OSU
http://osuosl.org/services/powerdev
Unicamp Minicloud offers good POWER instances and would approve your
request within two days.

> I'm not sure where it fits under powerpc64. The code doesn't need any
> cryptographic extensions, but it depends on vector instructions as well
> as VSX registers (for the unaligned load and store instructions). So I'd
> need advice both on the directory hierarchy and compile time
> configuration, and appropriate runtime tests for fat builds.

VSX instructions were introduced in Power ISA v2.06, so since you use
the VSX instructions lxvw4x/stxvw4x, the minimum processor you are
targeting is POWER7. We could add a new configure option like
"--enable-power-vsx" that enables this optimization.
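
To make the point about the rotate count concrete, here is a minimal C
sketch. It is only a scalar model of a single 32-bit lane, not the
assembly under review, and the names vrlw_lane and check_rotate_counts
are made up for illustration. It assumes, as described above, that the
rotate uses only the low-order 5 bits of the count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of vrlw: only the
   low-order 5 bits of the count are used, so the effective rotate
   amount is the count modulo 32. */
static uint32_t vrlw_lane(uint32_t x, int32_t count)
{
    unsigned n = (uint32_t)count & 31u;
    /* The second mask keeps the shift amount in range when n == 0. */
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Self-check: -16 fits a 5-bit signed immediate (-16..15) while 16
   does not, yet both select the same rotate amount. */
static void check_rotate_counts(void)
{
    assert(((uint32_t)-16 & 31u) == 16u);
    assert(vrlw_lane(0x12345678u, -16) == vrlw_lane(0x12345678u, 16));
    assert(vrlw_lane(0x12345678u, -16) == 0x56781234u);
}
```

The same masking argument would let other out-of-range counts be
replaced by their value minus 32, but 16 is the only ChaCha rotate
count (16, 12, 8, 7) that falls outside -16 to 15.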