Re: [PATCH v2.1 4/7] crypto: GnuPG based MPI lib - additional sources (part 4)

2011-10-17 Thread Kasatkin, Dmitry
On Sat, Oct 15, 2011 at 3:34 AM, James Morris jmor...@namei.org wrote:
 On Fri, 14 Oct 2011, Dmitry Kasatkin wrote:

 +#if 0                                /* not yet ported to MPI */
 +
 +mpi_limb_t
 +mpihelp_udiv_w_sdiv(mpi_limp_t *rp,
 +                 mpi_limp_t *a1, mpi_limp_t *a0, mpi_limp_t *d)

 Drop this if it's not working.


 --
 James Morris
 jmor...@namei.org
 --
 To unsubscribe from this list: send the line unsubscribe 
 linux-security-module in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


It is there for completeness, and it will not even be compiled
without CONFIG_MPILIB_EXTRA.

Still remove?

- Dmitry
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2.1 1/7] crypto: GnuPG based MPI lib - source files (part 1)

2011-10-17 Thread Kasatkin, Dmitry
From Kernel Docbook

Similar to EXPORT_SYMBOL() except that the
symbols exported by EXPORT_SYMBOL_GPL() can
only be seen by modules with a
MODULE_LICENSE() that specifies a GPL
compatible license.  It implies that the function is considered
an internal implementation issue, and not really an interface.

"not really an interface"

Should it really be EXPORT_SYMBOL_GPL?

- Dmitry

On Sat, Oct 15, 2011 at 3:28 AM, James Morris jmor...@namei.org wrote:
 On Fri, 14 Oct 2011, Dmitry Kasatkin wrote:

 +MPI mpi_alloc(unsigned nlimbs)
 +{
 +     MPI a;
 +
 +     a = (MPI) kmalloc(sizeof *a, GFP_KERNEL);

 Generally, typedef structs are frowned upon in the kernel.  I'd prefer to
 see this (and any others) changed to a normal type.

 Also, kmalloc return values do not need to be cast, they're void *.

 +EXPORT_SYMBOL(mpi_alloc);

 New interfaces should be EXPORT_SYMBOL_GPL.


 --
 James Morris
 jmor...@namei.org
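
As an illustration of the two allocation comments above, here is a minimal userspace sketch. It assumes malloc()/calloc() as stand-ins for kmalloc() so that the fragment builds outside the kernel; the struct layout and the names mpi_alloc_sketch/mpi_free_sketch are hypothetical and are not the patch's actual definitions. It uses a plain struct instead of a typedef'd pointer type and assigns the void * result without a cast.

```c
#include <stdlib.h>

typedef unsigned long mpi_limb_t;

/* Hypothetical layout, for illustration only. */
struct mpi {
	unsigned int alloced;	/* number of limbs allocated in d */
	unsigned int nlimbs;	/* number of limbs currently in use */
	mpi_limb_t *d;		/* limb array, NULL when alloced == 0 */
};

static struct mpi *mpi_alloc_sketch(unsigned int nlimbs)
{
	struct mpi *a;

	/* No cast: void * converts implicitly to any object pointer in C. */
	a = malloc(sizeof(*a));
	if (!a)
		return NULL;

	a->d = NULL;
	if (nlimbs) {
		a->d = calloc(nlimbs, sizeof(*a->d));
		if (!a->d) {
			free(a);
			return NULL;
		}
	}
	a->alloced = nlimbs;
	a->nlimbs = 0;
	return a;
}

static void mpi_free_sketch(struct mpi *a)
{
	if (!a)
		return;
	free(a->d);
	free(a);
}
```

Writing sizeof(*a) rather than naming the type keeps the allocation correct if the type of a changes later.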



Re: [PATCH v2.1 1/7] crypto: GnuPG based MPI lib - source files (part 1)

2011-10-17 Thread Kasatkin, Dmitry
On Mon, Oct 17, 2011 at 12:11 PM, Kasatkin, Dmitry
dmitry.kasat...@intel.com wrote:
 From Kernel Docbook

    Similar to EXPORT_SYMBOL() except that the
    symbols exported by EXPORT_SYMBOL_GPL() can
    only be seen by modules with a
    MODULE_LICENSE() that specifies a GPL
    compatible license.  It implies that the function is considered
    an internal implementation issue, and not really an interface.

 not really an interface

 Should it really be EXPORT_SYMBOL_GPL?

 - Dmitry

 On Sat, Oct 15, 2011 at 3:28 AM, James Morris jmor...@namei.org wrote:
 On Fri, 14 Oct 2011, Dmitry Kasatkin wrote:

 +MPI mpi_alloc(unsigned nlimbs)
 +{
 +     MPI a;
 +
 +     a = (MPI) kmalloc(sizeof *a, GFP_KERNEL);

 Generally, typedef structs are frowned upon in the kernel.  I'd prefer to
 see this (and any others) changed to a normal type.

 Also, kmalloc return values do not need to be cast, they're void *.

 +EXPORT_SYMBOL(mpi_alloc);

 New interfaces should be EXPORT_SYMBOL_GPL.


 --
 James Morris
 jmor...@namei.org



Hello James,

Please also let me know about any other issues so that I can fix them as well...

Thanks!

- Dmitry


Re: [PATCH v2.1 1/7] crypto: GnuPG based MPI lib - source files (part 1)

2011-10-17 Thread David Howells
James Morris jmor...@namei.org wrote:

  +MPI mpi_alloc(unsigned nlimbs)
  +{
  +   MPI a;
  +
  +   a = (MPI) kmalloc(sizeof *a, GFP_KERNEL);
 
 Generally, typedef structs are frowned upon in the kernel.  I'd prefer to 
 see this (and any others) changed to a normal type.

In this case, however, it makes it easier to compare back to the original
code.

David


Re: [PATCH v2.1 1/7] crypto: GnuPG based MPI lib - source files (part 1)

2011-10-17 Thread Greg KH
On Mon, Oct 17, 2011 at 12:11:37PM +0300, Kasatkin, Dmitry wrote:
 From Kernel Docbook
 
 Similar to EXPORT_SYMBOL() except that the
 symbols exported by EXPORT_SYMBOL_GPL() can
 only be seen by modules with a
 MODULE_LICENSE() that specifies a GPL
 compatible license.  It implies that the function is considered
 an internal implementation issue, and not really an interface.
 
 not really an interface
 
 Should it really be EXPORT_SYMBOL_GPL?

Yes.


[PATCH 0/7] crypto: add SSE2-x86_64/i586 implementation of Serpent cipher

2011-10-17 Thread Jussi Kivilinna
This series adds an SSE2-optimized implementation of the Serpent cipher for the
x86_64 and i586 architectures. The i586 implementation processes four Serpent
blocks in parallel in SSE2 registers. The x86_64 implementation uses the extra
SSE2 registers available there for higher performance on out-of-order CPUs,
processing eight blocks in parallel.

The series depends on the previous testmgr/tcrypt patches in the twofish-asm-3way
series and also on the following patches:
  http://marc.info/?l=linux-crypto-vger&m=131827700228773&w=2
  http://marc.info/?l=linux-crypto-vger&m=131827699228759&w=2
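
To make the "four blocks in parallel" idea concrete, here is a minimal portable C sketch. It assumes (the cover letter does not say so explicitly) that the blocks are transposed so that each 128-bit register holds the same 32-bit word taken from four different blocks. A four-element array stands in for one SSE2 register, and all names are hypothetical.

```c
#include <stdint.h>

#define LANES 4		/* four 32-bit lanes per 128-bit register */
#define WORDS 4		/* a 128-bit Serpent block is four 32-bit words */

/* Stand-in for one 128-bit SSE2 register. */
struct vec4 {
	uint32_t lane[LANES];
};

/*
 * in[b][w] is word w of block b. After loading, regs[w].lane[b] holds
 * that same word, so regs[w] carries word w of all four blocks.
 */
static void load_transposed(struct vec4 regs[WORDS], uint32_t in[LANES][WORDS])
{
	int b, w;

	for (w = 0; w < WORDS; w++)
		for (b = 0; b < LANES; b++)
			regs[w].lane[b] = in[b][w];
}

/* One lane-wise XOR touches all four blocks at once (compare pxor). */
static void vec_xor(struct vec4 *dst, const struct vec4 *src)
{
	int b;

	for (b = 0; b < LANES; b++)
		dst->lane[b] ^= src->lane[b];
}

/* Inverse of load_transposed(): write the blocks back in normal order. */
static void store_transposed(uint32_t out[LANES][WORDS], struct vec4 regs[WORDS])
{
	int b, w;

	for (w = 0; w < WORDS; w++)
		for (b = 0; b < LANES; b++)
			out[b][w] = regs[w].lane[b];
}
```

With this layout, every word-level operation of the scalar cipher maps to one vector operation that advances four blocks together.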

---

Jussi Kivilinna (7):
  crypto: testmgr: add new serpent test vectors
  crypto: tcrypt: add test_acipher_speed
  crypto: tcrypt: add serpent speed tests
  crypto: serpent: export common functions for x86_64/i386-sse2 assembler implementations
  crypto: serpent: rename module from serpent to serpent_generic
  crypto: serpent: add 8-way parallel x86_64/SSE2 assembler implementation
  crypto: serpent: add 4-way parallel i586/SSE2 assembler implementation


 arch/x86/crypto/Makefile                     |    4
 arch/x86/crypto/serpent-sse2-i586-asm_32.S   |  639 ++
 arch/x86/crypto/serpent-sse2-x86_64-asm_64.S |  761 ++
 arch/x86/crypto/serpent_sse2_glue.c          |  719 +
 arch/x86/include/asm/serpent.h               |   64 ++
 crypto/Kconfig                               |   34 +
 crypto/Makefile                              |    4
 crypto/serpent.c                             |   44 +-
 crypto/tcrypt.c                              |  282 ++
 crypto/testmgr.c                             |   90 +++
 crypto/testmgr.h                             |  393 +
 include/crypto/serpent.h                     |   25 +
 12 files changed, 3037 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/crypto/serpent-sse2-i586-asm_32.S
 create mode 100644 arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
 create mode 100644 arch/x86/crypto/serpent_sse2_glue.c
 create mode 100644 arch/x86/include/asm/serpent.h
 create mode 100644 include/crypto/serpent.h


[PATCH 1/7] crypto: testmgr: add new serpent test vectors

2011-10-17 Thread Jussi Kivilinna
Add new serpent tests for serpent_sse2 x86_64/i586 8-way/4-way code paths.

Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 crypto/tcrypt.c  |2 
 crypto/testmgr.c |   30 
 crypto/testmgr.h |  393 ++
 3 files changed, 423 insertions(+), 2 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 0c4e80f..ac9e4d2 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -793,6 +793,8 @@ static int do_test(int m)
 
 	case 9:
 		ret += tcrypt_test("ecb(serpent)");
+		ret += tcrypt_test("cbc(serpent)");
+		ret += tcrypt_test("ctr(serpent)");
 		break;
 
 	case 10:
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index e91c1eb..49436b9 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -1675,6 +1675,21 @@ static const struct alg_test_desc alg_test_descs[] = {
 			}
 		}
 	}, {
+		.alg = "cbc(serpent)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = {
+				.enc = {
+					.vecs = serpent_cbc_enc_tv_template,
+					.count = SERPENT_CBC_ENC_TEST_VECTORS
+				},
+				.dec = {
+					.vecs = serpent_cbc_dec_tv_template,
+					.count = SERPENT_CBC_DEC_TEST_VECTORS
+				}
+			}
+		}
+	}, {
 		.alg = "cbc(twofish)",
 		.test = alg_test_skcipher,
 		.suite = {
@@ -1771,6 +1786,21 @@ static const struct alg_test_desc alg_test_descs[] = {
 			}
 		}
 	}, {
+		.alg = "ctr(serpent)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = {
+				.enc = {
+					.vecs = serpent_ctr_enc_tv_template,
+					.count = SERPENT_CTR_ENC_TEST_VECTORS
+				},
+				.dec = {
+					.vecs = serpent_ctr_dec_tv_template,
+					.count = SERPENT_CTR_DEC_TEST_VECTORS
+				}
+			}
+		}
+	}, {
 		.alg = "ctr(twofish)",
 		.test = alg_test_skcipher,
 		.suite = {
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 37b4d8f..ed4aec9 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -3096,12 +3096,18 @@ static struct cipher_testvec tf_ctr_dec_tv_template[] = {
  * Serpent test vectors.  These are backwards because Serpent writes
  * octet sequences in right-to-left mode.
  */
-#define SERPENT_ENC_TEST_VECTORS	4
-#define SERPENT_DEC_TEST_VECTORS	4
+#define SERPENT_ENC_TEST_VECTORS	5
+#define SERPENT_DEC_TEST_VECTORS	5
 
 #define TNEPRES_ENC_TEST_VECTORS	4
 #define TNEPRES_DEC_TEST_VECTORS	4
 
+#define SERPENT_CBC_ENC_TEST_VECTORS	1
+#define SERPENT_CBC_DEC_TEST_VECTORS	1
+
+#define SERPENT_CTR_ENC_TEST_VECTORS	2
+#define SERPENT_CTR_DEC_TEST_VECTORS	2
+
 static struct cipher_testvec serpent_enc_tv_template[] = {
 	{
 		.input	= "\x00\x01\x02\x03\x04\x05\x06\x07"
@@ -3140,6 +3146,50 @@ static struct cipher_testvec serpent_enc_tv_template[] = {
 		.result	= "\xdd\xd2\x6b\x98\xa5\xff\xd8\x2c"
 			  "\x05\x34\x5a\x9d\xad\xbf\xaf\x49",
 		.rlen	= 16,
+	}, { /* Generated with Crypto++ */
+		.key	= "\x85\x62\x3F\x1C\xF9\xD6\x1C\xF9"
+			  "\xD6\xB3\x90\x6D\x4A\x90\x6D\x4A"
+			  "\x27\x04\xE1\x27\x04\xE1\xBE\x9B"
+			  "\x78\xBE\x9B\x78\x55\x32\x0F\x55",
+		.klen	= 32,
+		.input	= "\x56\xED\x84\x1B\x8F\x26\xBD\x31"
+			  "\xC8\x5F\xF6\x6A\x01\x98\x0C\xA3"
+			  "\x3A\xD1\x45\xDC\x73\x0A\x7E\x15"
+			  "\xAC\x20\xB7\x4E\xE5\x59\xF0\x87"
+			  "\x1E\x92\x29\xC0\x34\xCB\x62\xF9"
+			  "\x6D\x04\x9B\x0F\xA6\x3D\xD4\x48"
+			  "\xDF\x76\x0D\x81\x18\xAF\x23\xBA"
+			  "\x51\xE8\x5C\xF3\x8A\x21\x95\x2C"
+			  "\xC3\x37\xCE\x65\xFC\x70\x07\x9E"
+			  "\x12\xA9\x40\xD7\x4B\xE2\x79\x10"
+			  "\x84\x1B\xB2\x26\xBD\x54\xEB\x5F"
+			  "\xF6\x8D\x01\x98\x2F\xC6\x3A\xD1"
+			  "\x68\xFF\x73\x0A\xA1\x15\xAC\x43"
+			  "\xDA\x4E\xE5\x7C\x13\x87\x1E\xB5"
+			  "\x29\xC0\x57\xEE\x62\xF9\x90\x04"
+ 

[PATCH 2/7] crypto: tcrypt: add test_acipher_speed

2011-10-17 Thread Jussi Kivilinna
Add test_acipher_speed for testing async block ciphers.

Also include tests for aes/des/des3/ede as these appear to have ablk_cipher
implementations available.

Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 crypto/tcrypt.c |  250 +++
 1 files changed, 250 insertions(+), 0 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index ac9e4d2..dd3a0f8 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -719,6 +719,207 @@ out:
 	crypto_free_ahash(tfm);
 }
 
+static inline int do_one_acipher_op(struct ablkcipher_request *req, int ret)
+{
+	if (ret == -EINPROGRESS || ret == -EBUSY) {
+		struct tcrypt_result *tr = req->base.data;
+
+		ret = wait_for_completion_interruptible(&tr->completion);
+		if (!ret)
+			ret = tr->err;
+		INIT_COMPLETION(tr->completion);
+	}
+
+	return ret;
+}
+
+static int test_acipher_jiffies(struct ablkcipher_request *req, int enc,
+				int blen, int sec)
+{
+	unsigned long start, end;
+	int bcount;
+	int ret;
+
+	for (start = jiffies, end = start + sec * HZ, bcount = 0;
+	     time_before(jiffies, end); bcount++) {
+		if (enc)
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_encrypt(req));
+		else
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_decrypt(req));
+
+		if (ret)
+			return ret;
+	}
+
+	pr_cont("%d operations in %d seconds (%ld bytes)\n",
+		bcount, sec, (long)bcount * blen);
+	return 0;
+}
+
+static int test_acipher_cycles(struct ablkcipher_request *req, int enc,
+			       int blen)
+{
+	unsigned long cycles = 0;
+	int ret = 0;
+	int i;
+
+	/* Warm-up run. */
+	for (i = 0; i < 4; i++) {
+		if (enc)
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_encrypt(req));
+		else
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_decrypt(req));
+
+		if (ret)
+			goto out;
+	}
+
+	/* The real thing. */
+	for (i = 0; i < 8; i++) {
+		cycles_t start, end;
+
+		start = get_cycles();
+		if (enc)
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_encrypt(req));
+		else
+			ret = do_one_acipher_op(req,
+						crypto_ablkcipher_decrypt(req));
+		end = get_cycles();
+
+		if (ret)
+			goto out;
+
+		cycles += end - start;
+	}
+
+out:
+	if (ret == 0)
+		pr_cont("1 operation in %lu cycles (%d bytes)\n",
+			(cycles + 4) / 8, blen);
+
+	return ret;
+}
+
+static void test_acipher_speed(const char *algo, int enc, unsigned int sec,
+			       struct cipher_speed_template *template,
+			       unsigned int tcount, u8 *keysize)
+{
+	unsigned int ret, i, j, iv_len;
+	struct tcrypt_result tresult;
+	const char *key;
+	char iv[128];
+	struct ablkcipher_request *req;
+	struct crypto_ablkcipher *tfm;
+	const char *e;
+	u32 *b_size;
+
+	if (enc == ENCRYPT)
+		e = "encryption";
+	else
+		e = "decryption";
+
+	pr_info("\ntesting speed of async %s %s\n", algo, e);
+
+	init_completion(&tresult.completion);
+
+	tfm = crypto_alloc_ablkcipher(algo, 0, 0);
+
+	if (IS_ERR(tfm)) {
+		pr_err("failed to load transform for %s: %ld\n", algo,
+		       PTR_ERR(tfm));
+		return;
+	}
+
+	req = ablkcipher_request_alloc(tfm, GFP_KERNEL);
+	if (!req) {
+		pr_err("tcrypt: skcipher: Failed to allocate request for %s\n",
+		       algo);
+		goto out;
+	}
+
+	ablkcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+					tcrypt_complete, &tresult);
+
+	i = 0;
+	do {
+		b_size = block_sizes;
+
+		do {
+			struct scatterlist sg[TVMEMSIZE];
+
+			if ((*keysize + *b_size) > TVMEMSIZE * PAGE_SIZE) {
+				pr_err("template (%u) too big for "
+				       "tvmem (%lu)\n", *keysize + *b_size,
+				       TVMEMSIZE * PAGE_SIZE);
+				goto out_free_req;
+			}
+
+
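
The measurement scheme used by test_acipher_cycles() above (four untimed warm-up runs, eight timed runs, and a mean rounded to nearest via "(cycles + 4) / 8") can be sketched in portable C as follows. This is a userspace illustration, not the kernel code: the callback types, the fake clock, and all names are hypothetical, and the fake clock only exists so the arithmetic can be checked deterministically.

```c
#include <stdint.h>

#define WARMUP_RUNS	4
#define TIMED_RUNS	8

typedef int (*op_fn)(void *arg);		/* returns 0 on success */
typedef uint64_t (*clock_fn)(void *arg);	/* returns a cycle count */

/* Run op() untimed a few times, then time it and average the cost. */
static int measure_cycles(op_fn op, clock_fn now, void *arg, uint64_t *mean)
{
	uint64_t total = 0;
	int ret, i;

	for (i = 0; i < WARMUP_RUNS; i++) {
		ret = op(arg);
		if (ret)
			return ret;
	}

	for (i = 0; i < TIMED_RUNS; i++) {
		uint64_t start, end;

		start = now(arg);
		ret = op(arg);
		end = now(arg);
		if (ret)
			return ret;

		total += end - start;
	}

	/* Adding half the divisor rounds to nearest instead of truncating. */
	*mean = (total + TIMED_RUNS / 2) / TIMED_RUNS;
	return 0;
}

/* Deterministic fake clock: every operation costs a fixed tick count. */
struct fake_clock {
	uint64_t ticks;
	uint64_t cost;
};

static uint64_t fake_now(void *arg)
{
	return ((struct fake_clock *)arg)->ticks;
}

static int fake_op(void *arg)
{
	struct fake_clock *c = arg;

	c->ticks += c->cost;
	return 0;
}
```

The warm-up runs keep one-time costs such as cold caches out of the timed samples.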

[PATCH 3/7] crypto: tcrypt: add serpent speed tests

2011-10-17 Thread Jussi Kivilinna
Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 crypto/tcrypt.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index dd3a0f8..5526065 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1292,6 +1292,21 @@ static int do_test(int m)
 				  speed_template_16_32);
 		break;
 
+	case 207:
+		test_cipher_speed("ecb(serpent)", ENCRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		test_cipher_speed("ecb(serpent)", DECRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		test_cipher_speed("cbc(serpent)", ENCRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		test_cipher_speed("cbc(serpent)", DECRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		test_cipher_speed("ctr(serpent)", ENCRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		test_cipher_speed("ctr(serpent)", DECRYPT, sec, NULL, 0,
+				  speed_template_16_32);
+		break;
+
 	case 300:
 		/* fall through */
 
@@ -1493,6 +1508,21 @@ static int do_test(int m)
 				   speed_template_8);
 		break;
 
+	case 503:
+		test_acipher_speed("ecb(serpent)", ENCRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		test_acipher_speed("ecb(serpent)", DECRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		test_acipher_speed("cbc(serpent)", ENCRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		test_acipher_speed("cbc(serpent)", DECRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		test_acipher_speed("ctr(serpent)", ENCRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		test_acipher_speed("ctr(serpent)", DECRYPT, sec, NULL, 0,
+				   speed_template_16_32);
+		break;
+
 	case 1000:
 		test_available();
 		break;



[PATCH 4/7] crypto: serpent: export common functions for x86_64/i386-sse2 assembler implementations

2011-10-17 Thread Jussi Kivilinna
Serpent SSE2 assembler implementations only provide 4-way/8-way parallel
functions and need setkey and one-block encrypt/decrypt functions.

CC: Dag Arne Osvik os...@ii.uib.no
Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 crypto/serpent.c |   41 ++---
 include/crypto/serpent.h |   25 +
 2 files changed, 47 insertions(+), 19 deletions(-)
 create mode 100644 include/crypto/serpent.h

diff --git a/crypto/serpent.c b/crypto/serpent.c
index b651a55..867ca93 100644
--- a/crypto/serpent.c
+++ b/crypto/serpent.c
@@ -21,16 +21,12 @@
 #include <asm/byteorder.h>
 #include <linux/crypto.h>
 #include <linux/types.h>
+#include <crypto/serpent.h>
 
 /* Key is padded to the maximum of 256 bits before round key generation.
  * Any key length <= 256 bits (32 bytes) is allowed by the algorithm.
  */
 
-#define SERPENT_MIN_KEY_SIZE		  0
-#define SERPENT_MAX_KEY_SIZE		 32
-#define SERPENT_EXPKEY_WORDS		132
-#define SERPENT_BLOCK_SIZE		 16
-
 #define PHI 0x9e3779b9UL
 
 #define keyiter(a,b,c,d,i,j) \
@@ -210,13 +206,7 @@
x1 ^= x4;   x3 ^= x4;   x4 = x0;   \
x4 ^= x2;
 
-struct serpent_ctx {
-	u32 expkey[SERPENT_EXPKEY_WORDS];
-};
-
-
-static int serpent_setkey(struct crypto_tfm *tfm, const u8 *key,
-			  unsigned int keylen)
+int serpent_setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen)
 {
 	struct serpent_ctx *ctx = crypto_tfm_ctx(tfm);
 	u32 *k = ctx->expkey;
@@ -359,12 +349,11 @@ static int serpent_setkey(struct crypto_tfm *tfm, const u8 *key,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(serpent_setkey);
 
-static void serpent_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
+void __serpent_encrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src)
 {
-	struct serpent_ctx *ctx = crypto_tfm_ctx(tfm);
-	const u32
-		*k = ctx->expkey;
+	const u32 *k = ctx->expkey;
 	const __le32 *s = (const __le32 *)src;
 	__le32	*d = (__le32 *)dst;
 	u32 r0, r1, r2, r3, r4;
@@ -418,12 +407,18 @@ static void serpent_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
 	d[2] = cpu_to_le32(r2);
 	d[3] = cpu_to_le32(r3);
 }
+EXPORT_SYMBOL_GPL(__serpent_encrypt);
 
-static void serpent_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
+static void serpent_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
 {
 	struct serpent_ctx *ctx = crypto_tfm_ctx(tfm);
-	const u32
-		*k = ((struct serpent_ctx *)ctx)->expkey;
+
+	__serpent_encrypt(ctx, dst, src);
+}
+
+void __serpent_decrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src)
+{
+	const u32 *k = ctx->expkey;
 	const __le32 *s = (const __le32 *)src;
 	__le32	*d = (__le32 *)dst;
 	u32 r0, r1, r2, r3, r4;
@@ -472,6 +467,14 @@ static void serpent_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
 	d[2] = cpu_to_le32(r1);
 	d[3] = cpu_to_le32(r4);
 }
+EXPORT_SYMBOL_GPL(__serpent_decrypt);
+
+static void serpent_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
+{
+	struct serpent_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	__serpent_decrypt(ctx, dst, src);
+}
 
 static struct crypto_alg serpent_alg = {
 	.cra_name		=	"serpent",
diff --git a/include/crypto/serpent.h b/include/crypto/serpent.h
new file mode 100644
index 000..40df885
--- /dev/null
+++ b/include/crypto/serpent.h
@@ -0,0 +1,25 @@
+/*
+ * Common values for serpent algorithms
+ */
+
+#ifndef _CRYPTO_SERPENT_H
+#define _CRYPTO_SERPENT_H
+
+#include <linux/types.h>
+#include <linux/crypto.h>
+
+#define SERPENT_MIN_KEY_SIZE		  0
+#define SERPENT_MAX_KEY_SIZE		 32
+#define SERPENT_EXPKEY_WORDS		132
+#define SERPENT_BLOCK_SIZE		 16
+
+struct serpent_ctx {
+	u32 expkey[SERPENT_EXPKEY_WORDS];
+};
+
+int serpent_setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen);
+
+void __serpent_encrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src);
+void __serpent_decrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src);
+
+#endif
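
The refactoring in this patch follows a common pattern: the core routine takes the bare context so that other code can call it directly, and the original handle-based entry point becomes a thin wrapper. A minimal userspace sketch of that pattern follows. Everything in it is hypothetical: toy_tfm stands in for struct crypto_tfm, the "cipher" is a plain XOR with the key and is not Serpent, and the core function is named toy_encrypt_ctx rather than using a double-underscore prefix, which is reserved in userspace C.

```c
#include <stdint.h>

#define TOY_BLOCK_SIZE 16

/* Per-key state, analogous in role to struct serpent_ctx. */
struct toy_ctx {
	uint8_t key[TOY_BLOCK_SIZE];
};

/* Stand-in for the framework handle that owns the context. */
struct toy_tfm {
	struct toy_ctx ctx;
};

static struct toy_ctx *toy_tfm_ctx(struct toy_tfm *tfm)
{
	return &tfm->ctx;
}

/*
 * Core routine: operates on the bare context, so an accelerated driver
 * can reuse it for leftover single blocks. Toy transform, not a cipher.
 */
static void toy_encrypt_ctx(struct toy_ctx *ctx, uint8_t *dst,
			    const uint8_t *src)
{
	int i;

	for (i = 0; i < TOY_BLOCK_SIZE; i++)
		dst[i] = src[i] ^ ctx->key[i];
}

/* Thin wrapper that keeps the handle-based signature for the framework. */
static void toy_encrypt(struct toy_tfm *tfm, uint8_t *dst, const uint8_t *src)
{
	toy_encrypt_ctx(toy_tfm_ctx(tfm), dst, src);
}
```

Both entry points produce identical output; only the way the context is obtained differs.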



[PATCH 5/7] crypto: serpent: rename module from serpent to serpent_generic

2011-10-17 Thread Jussi Kivilinna
Rename module from serpent.ko to serpent_generic.ko and add module alias. This
is to allow assembler implementation to autoload on 'modprobe serpent'. Also
add driver_name and priority for serpent cipher.

CC: Dag Arne Osvik os...@ii.uib.no
Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi

---

(I chose not to rename serpent.c to serpent_generic.c, as such a patch
triggers lots of checkpatch errors. I can provide rename and style-fix
patches if you want.)
---
 crypto/Makefile  |4 +++-
 crypto/serpent.c |3 +++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/crypto/Makefile b/crypto/Makefile
index fa8cbbb..dac8979 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -64,7 +64,9 @@ obj-$(CONFIG_CRYPTO_BLOWFISH) += blowfish_generic.o
 obj-$(CONFIG_CRYPTO_BLOWFISH_COMMON) += blowfish_common.o
 obj-$(CONFIG_CRYPTO_TWOFISH) += twofish_generic.o
 obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += twofish_common.o
-obj-$(CONFIG_CRYPTO_SERPENT) += serpent.o
+
+serpent_generic-y := serpent.o
+obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia.o
 obj-$(CONFIG_CRYPTO_CAST5) += cast5.o
diff --git a/crypto/serpent.c b/crypto/serpent.c
index 867ca93..eb61630 100644
--- a/crypto/serpent.c
+++ b/crypto/serpent.c
@@ -478,6 +478,8 @@ static void serpent_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
 
 static struct crypto_alg serpent_alg = {
 	.cra_name		=	"serpent",
+	.cra_driver_name	=	"serpent-generic",
+	.cra_priority		=	100,
 	.cra_flags		=	CRYPTO_ALG_TYPE_CIPHER,
 	.cra_blocksize		=	SERPENT_BLOCK_SIZE,
 	.cra_ctxsize		=	sizeof(struct serpent_ctx),
@@ -588,3 +590,4 @@ MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Serpent and tnepres (kerneli compatible serpent reversed) Cipher Algorithm");
 MODULE_AUTHOR("Dag Arne Osvik <os...@ii.uib.no>");
 MODULE_ALIAS("tnepres");
+MODULE_ALIAS("serpent");
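
The driver name and priority matter because several implementations can register under the same algorithm name, and a lookup by that name prefers the one with the highest priority. The selection rule can be sketched in userspace C as below. The table type and pick_alg() are hypothetical; "serpent", "serpent-generic" and priority 100 come from this patch, while any other entry a caller puts in the table is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced view of a registered algorithm. */
struct alg_entry {
	const char *name;		/* generic name, e.g. "serpent" */
	const char *driver_name;	/* unique implementation name */
	int priority;			/* higher wins among equal names */
};

/* Return the highest-priority entry registered under the given name. */
static const struct alg_entry *pick_alg(const struct alg_entry *algs,
					size_t n, const char *name)
{
	const struct alg_entry *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].name, name) != 0)
			continue;
		if (!best || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;	/* NULL if nothing is registered under that name */
}
```

With only the generic entry present, a request for "serpent" resolves to "serpent-generic"; once a higher-priority implementation is registered under the same name, it is chosen instead.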



[PATCH 6/7] crypto: serpent: add 8-way parallel x86_64/SSE2 assembler implementation

2011-10-17 Thread Jussi Kivilinna
This patch adds an x86_64/SSE2 assembler implementation of the Serpent cipher.
The assembler functions process data in eight-block chunks (two four-block SSE2
operations in parallel, to improve performance on out-of-order CPUs). The glue
code is based on the one from the AES-NI implementation, so requests from IRQ
context are redirected to cryptd.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

AMD Phenom II 1055T (fam:16, model:10):

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     1.03x   1.01x   1.03x   1.05x   1.00x   0.99x
64B     1.00x   1.01x   1.02x   1.04x   1.02x   1.01x
256B    2.34x   2.41x   0.99x   2.43x   2.39x   2.40x
1024B   2.51x   2.57x   1.00x   2.59x   2.56x   2.56x
8192B   2.50x   2.54x   1.00x   2.55x   2.57x   2.57x

Intel Celeron T1600 (fam:6, model:15, step:13):

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     0.97x   0.97x   1.01x   1.01x   1.01x   1.02x
64B     1.00x   1.00x   1.00x   1.02x   1.01x   1.01x
256B    3.41x   3.35x   1.00x   3.39x   3.42x   3.44x
1024B   3.75x   3.72x   0.99x   3.74x   3.75x   3.75x
8192B   3.70x   3.68x   0.99x   3.68x   3.69x   3.69x

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-sse2.txt
 http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-sse2.txt
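
The tables show essentially no gain at 16 and 64 bytes and the full gain from 256 bytes upward. One plausible reading (an assumption, since serpent_sse2_glue.c is not shown in this excerpt) is that the 8-way routine is only used when at least eight blocks (128 bytes) are available, with the one-block function handling anything smaller. The sketch below shows that chunking logic in userspace C; the toy transforms and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE	16
#define PARALLEL_BLOCKS	8

/* Counts how often each path was taken, for inspection only. */
struct call_stats {
	unsigned int wide_calls;
	unsigned int single_calls;
};

/* Toy stand-in for the 8-way assembler routine (not a cipher). */
static void toy_crypt_8way(uint8_t *dst, const uint8_t *src)
{
	size_t i;

	for (i = 0; i < BLOCK_SIZE * PARALLEL_BLOCKS; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Toy stand-in for the one-block fallback (not a cipher). */
static void toy_crypt_one(uint8_t *dst, const uint8_t *src)
{
	size_t i;

	for (i = 0; i < BLOCK_SIZE; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* ECB-style walk: wide chunks first, then leftover single blocks. */
static void ecb_crypt(uint8_t *dst, const uint8_t *src, size_t nbytes,
		      struct call_stats *st)
{
	const size_t wide = BLOCK_SIZE * PARALLEL_BLOCKS;

	while (nbytes >= wide) {
		toy_crypt_8way(dst, src);
		st->wide_calls++;
		src += wide;
		dst += wide;
		nbytes -= wide;
	}

	while (nbytes >= BLOCK_SIZE) {
		toy_crypt_one(dst, src);
		st->single_calls++;
		src += BLOCK_SIZE;
		dst += BLOCK_SIZE;
		nbytes -= BLOCK_SIZE;
	}
}
```

Under this reading, a 64-byte request never reaches the wide path, which would match the roughly 1.0x ratios in the first two rows.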

Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 arch/x86/crypto/Makefile                     |    2
 arch/x86/crypto/serpent-sse2-x86_64-asm_64.S |  761 ++
 arch/x86/crypto/serpent_sse2_glue.c          |  719 +
 arch/x86/include/asm/serpent.h               |   33 +
 crypto/Kconfig                               |   17 +
 crypto/testmgr.c                             |   60 ++
 6 files changed, 1592 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
 create mode 100644 arch/x86/crypto/serpent_sse2_glue.c
 create mode 100644 arch/x86/include/asm/serpent.h

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 3537d4b..12ebdbd 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o
 obj-$(CONFIG_CRYPTO_SALSA20_X86_64) += salsa20-x86_64.o
+obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o
 obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
 obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
 
@@ -26,6 +27,7 @@ blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
 twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o
 twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o
 salsa20-x86_64-y := salsa20-x86_64-asm_64.o salsa20_glue.o
+serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o
 
 aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o fpu.o
 
diff --git a/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S b/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
new file mode 100644
index 000..7f24a15
--- /dev/null
+++ b/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
@@ -0,0 +1,761 @@
+/*
+ * Serpent Cipher 8-way parallel algorithm (x86_64/SSE2)
+ *
+ * Copyright (C) 2011 Jussi Kivilinna jussi.kivili...@mbnet.fi
+ *
+ * Based on crypto/serpent.c by
+ *  Copyright (C) 2002 Dag Arne Osvik os...@ii.uib.no
+ *2003 Herbert Valerio Riedel h...@gnu.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
+ * USA
+ *
+ */
+
+.file "serpent-sse2-x86_64-asm_64.S"
+.text
+
+#define CTX %rdi
+
+/**
+  8-way SSE2 serpent
+ **/
+#define RA1 %xmm0
+#define RB1 %xmm1
+#define RC1 %xmm2
+#define RD1 %xmm3
+#define RE1 %xmm4
+
+#define RA2 %xmm5
+#define RB2 %xmm6
+#define RC2 %xmm7
+#define RD2 %xmm8
+#define RE2 %xmm9
+
+#define RNOT %xmm10
+
+#define RK0 %xmm11
+#define RK1 %xmm12
+#define RK2 %xmm13
+#define RK3 %xmm14
+
+#define 

[PATCH 7/7] crypto: serpent: add 4-way parallel i586/SSE2 assembler implementation

2011-10-17 Thread Jussi Kivilinna
This patch adds an i586/SSE2 assembler implementation of the Serpent cipher.
The assembler functions process data in four-block chunks.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

Intel Atom N270:

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16      0.95x   1.12x   1.02x   1.07x   0.97x   0.98x
64      1.73x   1.82x   1.08x   1.82x   1.72x   1.73x
256     2.08x   2.00x   1.04x   2.07x   1.99x   2.01x
1024    2.28x   2.18x   1.05x   2.23x   2.17x   2.20x
8192    2.28x   2.13x   1.05x   2.23x   2.18x   2.20x

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-sse2.txt
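
In the table, cbc-enc stays near 1.0x while cbc-dec tracks the ECB numbers. That is consistent with a general property of CBC: encrypting block i needs the ciphertext of block i-1, so encryption is serial, whereas during decryption every block-cipher call depends only on ciphertext that is already available. The sketch below illustrates this with a toy invertible 32-bit "block cipher" (XOR with a key followed by a rotate). It is not Serpent, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t v, unsigned int n)	/* 0 < n < 32 */
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t rotr32(uint32_t v, unsigned int n)	/* 0 < n < 32 */
{
	return (v >> n) | (v << (32 - n));
}

/* Toy invertible block functions; stand-ins for a real cipher. */
static uint32_t toy_enc_word(uint32_t p, uint32_t key)
{
	return rotl32(p ^ key, 5);
}

static uint32_t toy_dec_word(uint32_t c, uint32_t key)
{
	return rotr32(c, 5) ^ key;
}

/* Serial: each step consumes the ciphertext produced by the previous one. */
static void cbc_encrypt(uint32_t *c, const uint32_t *p, size_t n,
			uint32_t key, uint32_t iv)
{
	uint32_t prev = iv;
	size_t i;

	for (i = 0; i < n; i++) {
		c[i] = toy_enc_word(p[i] ^ prev, key);
		prev = c[i];
	}
}

/* c and p must not overlap in this sketch. */
static void cbc_decrypt(uint32_t *p, const uint32_t *c, size_t n,
			uint32_t key, uint32_t iv)
{
	size_t i;

	/* Independent per block: this loop could run N blocks at a time. */
	for (i = 0; i < n; i++)
		p[i] = toy_dec_word(c[i], key);

	/* Cheap chaining step: XOR with the previous ciphertext block. */
	for (i = 0; i < n; i++)
		p[i] ^= i ? c[i - 1] : iv;
}
```

Only the first loop of cbc_decrypt() contains block-cipher calls, and none of them depends on another, which is what a multi-block routine can exploit.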

Userspace test results:

Encryption/decryption of sse2-i586 vs generic on Intel Atom N270:
 encrypt: 2.35x
 decrypt: 2.54x

Encryption/decryption of sse2-i586 vs generic on AMD Phenom II:
 encrypt: 1.82x
 decrypt: 2.51x

Encryption/decryption of sse2-i586 vs generic on Intel Xeon E7330:
 encrypt: 2.99x
 decrypt: 3.48x

Signed-off-by: Jussi Kivilinna jussi.kivili...@mbnet.fi
---
 arch/x86/crypto/Makefile                   |    2
 arch/x86/crypto/serpent-sse2-i586-asm_32.S |  639
 arch/x86/include/asm/serpent.h             |   31 +
 crypto/Kconfig                             |   17 +
 4 files changed, 689 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/crypto/serpent-sse2-i586-asm_32.S

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 12ebdbd..2b0b963 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -5,6 +5,7 @@
 obj-$(CONFIG_CRYPTO_AES_586) += aes-i586.o
 obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
 obj-$(CONFIG_CRYPTO_SALSA20_586) += salsa20-i586.o
+obj-$(CONFIG_CRYPTO_SERPENT_SSE2_586) += serpent-sse2-i586.o
 
 obj-$(CONFIG_CRYPTO_AES_X86_64) += aes-x86_64.o
 obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
@@ -21,6 +22,7 @@ obj-$(CONFIG_CRYPTO_SHA1_SSSE3) += sha1-ssse3.o
 aes-i586-y := aes-i586-asm_32.o aes_glue.o
 twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
 salsa20-i586-y := salsa20-i586-asm_32.o salsa20_glue.o
+serpent-sse2-i586-y := serpent-sse2-i586-asm_32.o serpent_sse2_glue.o
 
 aes-x86_64-y := aes-x86_64-asm_64.o aes_glue.o
 blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
diff --git a/arch/x86/crypto/serpent-sse2-i586-asm_32.S b/arch/x86/crypto/serpent-sse2-i586-asm_32.S
new file mode 100644
index 000..6f2486d
--- /dev/null
+++ b/arch/x86/crypto/serpent-sse2-i586-asm_32.S
@@ -0,0 +1,639 @@
+/*
+ * Serpent Cipher 4-way parallel algorithm (i586/SSE2)
+ *
+ * Copyright (C) 2011 Jussi Kivilinna jussi.kivili...@mbnet.fi
+ *
+ * Based on crypto/serpent.c by
+ *  Copyright (C) 2002 Dag Arne Osvik os...@ii.uib.no
+ *2003 Herbert Valerio Riedel h...@gnu.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
+ * USA
+ *
+ */
+
+.file "serpent-sse2-i586-asm_32.S"
+.text
+
+#define arg_ctx 4
+#define arg_dst 8
+#define arg_src 12
+#define arg_xor 16
+
+/**
+  4-way SSE2 serpent
+ **/
+#define CTX %edx
+
+#define RA %xmm0
+#define RB %xmm1
+#define RC %xmm2
+#define RD %xmm3
+#define RE %xmm4
+
+#define RT0 %xmm5
+#define RT1 %xmm6
+
+#define RNOT %xmm7
+
+#define get_key(i, j, t) \
+	movd (4*(i)+(j))*4(CTX), t; \
+	pshufd $0, t, t;
+
+#define K(x0, x1, x2, x3, x4, i) \
+	get_key(i, 0, x4); \
+	get_key(i, 1, RT0); \
+	get_key(i, 2, RT1); \
+	pxor x4,		x0; \
+	pxor RT0,		x1; \
+	pxor RT1,		x2; \
+	get_key(i, 3, x4); \
+	pxor x4,		x3;
+
+#define LK(x0, x1, x2, x3, x4, i) \
+	movdqa x0,		x4; \
+	pslld $13,		x0; \
+	psrld $(32 - 13),	x4; \
+	por x4,			x0; \
+	pxor x0,		x1; \
+	movdqa x2,		x4; \
+	pslld $3,		x2; \
+	psrld $(32 - 3),	x4; \
+	por x4,			x2; \
+	pxor x2,		x1; \
+	movdqa x1,		x4; \
+	pslld $1,		x1; \