This patch adds an x86_64/AVX assembler implementation of the Serpent block cipher. The implementation is very similar to the SSE2 implementation and processes eight blocks in parallel. Because the new non-destructive three-operand syntax makes all move instructions unnecessary, it provides a small performance increase.
/* /me adds CPU with AVX to wishlist. */ <snip>
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
new file mode 100644
index 0000000..85ef6e7
--- /dev/null
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -0,0 +1,949 @@
+/*
+ * Glue Code for AVX assembler versions of Serpent Cipher
+ *
+ * Copyright (C) 2012 Johannes Goetzfried
+ * <[email protected]>
+ *
+ * Glue code based on twofish_avx_glue.c by:
Should be serpent_sse2_glue.c?
+ * Copyright (C) 2011 Jussi Kivilinna <[email protected]>
+ *
<snip>
+}, {
+ .cra_name = "ecb(serpent)",
+ .cra_driver_name = "ecb-serpent-avx",
+ .cra_priority = 400,
serpent_sse2_glue.c has priority 400 too, so you should increase priority here to 500.
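To illustrate why the tie matters, here is a minimal user-space sketch of priority-based selection. It is only a model, not kernel code: `struct alg_model` and `pick_alg` are hypothetical names, the "ecb-serpent-sse2" driver name is assumed, and the tie-breaking rule (first entry wins) is an assumption of this model rather than something to rely on.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the fields discussed above. */
struct alg_model {
    const char *name;         /* models .cra_name */
    const char *driver_name;  /* models .cra_driver_name */
    int priority;             /* models .cra_priority */
};

/*
 * Return the entry with the highest priority among those whose name
 * matches, or NULL if none matches. On equal priority the earlier
 * entry is kept (an assumption of this model).
 */
static const struct alg_model *pick_alg(const struct alg_model *algs,
                                        size_t n, const char *name)
{
    const struct alg_model *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(algs[i].name, name) != 0)
            continue;
        if (best == NULL || algs[i].priority > best->priority)
            best = &algs[i];
    }
    return best;
}
```

With both entries at 400 the model never prefers the AVX driver over an earlier SSE2 entry; at 500 the AVX driver is chosen regardless of order.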
...Actually, about duplicating the glue code: is it really needed? On x86_64, both the AVX and SSE2 versions process 8 blocks in parallel, so the glue code could easily be shared (as is done for SHA1 SSSE3/AVX).
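A rough user-space sketch of what shared glue could look like, assuming both back ends expose the same 8-way entry point. Every name here is hypothetical, and the stub routines only tag their output so the selected back end is visible; they do not implement Serpent.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE      16
#define PARALLEL_BLOCKS 8

/* Hypothetical signature shared by the SSE2 and AVX 8-way routines. */
typedef void (*enc_8way_fn)(uint8_t *dst, const uint8_t *src);

/* Stand-ins for the assembler routines (not real encryption). */
static void stub_enc_8way_sse2(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < PARALLEL_BLOCKS * BLOCK_SIZE; i++)
        dst[i] = src[i] ^ 0x01;
}

static void stub_enc_8way_avx(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < PARALLEL_BLOCKS * BLOCK_SIZE; i++)
        dst[i] = src[i] ^ 0x02;
}

/* Stand-in for the single-block fallback. */
static void stub_enc_1way(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        dst[i] = src[i] ^ 0x03;
}

/* Back end chosen once at init; the rest of the glue is shared. */
static enc_8way_fn enc_8way = stub_enc_8way_sse2;

static void glue_select_backend(int cpu_has_avx)
{
    enc_8way = cpu_has_avx ? stub_enc_8way_avx : stub_enc_8way_sse2;
}

/*
 * ECB-style walk: use the 8-way routine while at least eight blocks
 * remain, then finish block by block. Returns the number of 8-way
 * batches processed.
 */
static size_t glue_ecb_encrypt(uint8_t *dst, const uint8_t *src,
                               size_t nblocks)
{
    size_t batches = 0;

    while (nblocks >= PARALLEL_BLOCKS) {
        enc_8way(dst, src);
        dst += PARALLEL_BLOCKS * BLOCK_SIZE;
        src += PARALLEL_BLOCKS * BLOCK_SIZE;
        nblocks -= PARALLEL_BLOCKS;
        batches++;
    }
    for (; nblocks > 0; nblocks--) {
        stub_enc_1way(dst, src);
        dst += BLOCK_SIZE;
        src += BLOCK_SIZE;
    }
    return batches;
}
```

Only the function pointer differs between the two variants, so the block-walking logic would exist once instead of in two nearly identical files.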
-Jussi
