Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-05-06 Thread Huang, Ying
is /proc/cpuinfo of my testing machine. Best Regards, Huang Ying > also -- please drop the #define for R16 to %rsp ... it obfuscates more > than it helps anything. > > thanks > -dean > > On Wed, 30 Apr 2008, Sebastian Siewior wrote: > > > * Huang, Ying | 2008-04-25 11

Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-05-06 Thread Huang, Ying
Hi, Sebastian, On Wed, 2008-04-30 at 00:12 +0200, Sebastian Siewior wrote: > * Huang, Ying | 2008-04-25 11:11:17 [+0800]: > > >Hi, Sebastian, > Hi Huang, > > sorry for the delay. > > >I changed the patches to group the read or write together instead of > >i

[RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-11 Thread Huang Ying
soft_irq context, the general x86_64 implementation is used instead. Signed-off-by: Huang Ying --- arch/x86/crypto/aes_glue.c| 10 - arch/x86/include/asm/aes.h|9 + arch/x86/include/asm/cpufeature.h |1 drivers/crypto/Kconfig| 11 + drivers/crypto

Re: [RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-14 Thread Huang Ying
On Sat, 2008-12-13 at 03:57 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2008-12-12 12:08:46 [+0800]: > > >Add support to Intel AES-NI instructions for x86_64 platform. > > > >Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD) > >ins

Re: [RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-14 Thread Huang Ying
On Mon, 2008-12-15 at 11:38 +0800, Herbert Xu wrote: > On Mon, Dec 15, 2008 at 10:19:02AM +0800, Huang Ying wrote: > > > > The general x86 implementation is used as the fall back for new AES-NI > > based implementation. Because AES-NI can not be used in kernel soft_irq

Re: [RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-14 Thread Huang Ying
On Mon, 2008-12-15 at 13:21 +0800, Herbert Xu wrote: > On Mon, Dec 15, 2008 at 01:14:59PM +0800, Huang Ying wrote: > > > > The PadLock instructions don't use/touch SSE registers, but might cause > > DNA fault when CR0.TS is set. So it is sufficient just to clear

Re: [RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-16 Thread Huang Ying
ernel context which is not > engaging in any kernel FPU operations. Yes. This is a better solution with much better performance. How about hybridise b. and a.: f. if TS is clear, then use x86_64 implementation. Otherwise if user-space has touched the FPU, we save the state, if not then simpl

Re: [RFC PATCH crypto] AES: Add support to Intel AES-NI instructions

2008-12-16 Thread Huang Ying
On Wed, 2008-12-17 at 09:26 +0800, Herbert Xu wrote: > Huang Ying wrote: > > > > f. if TS is clear, then use x86_64 implementation. Otherwise if > > user-space has touched the FPU, we save the state, if not then simply > > clear TS. > > Well I'd rather avoi

Re: alg: cipher: Test 1 failed on encryption for aes-asm

2008-12-23 Thread Huang Ying
1] : 00 01 02 03 04 05 06 07 08 08 08 08 08 08 08 08 This patch fixes this bug via making crypto_aes_gen_tabs public and invoking it in aes_x86_64 and aes_generic. Signed-off-by: Huang Ying --- arch/x86/crypto/aes_glue.c |1 + crypto/aes_generic.c | 20 +++-

Re: alg: cipher: Test 1 failed on encryption for aes-asm

2008-12-23 Thread Huang Ying
On Tue, 2008-12-23 at 21:01 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2008-12-23 16:49:26 [+0800]: > > >If aes_x86_64 and aes_generic are compiled as builtin, the > >initialization order is undetermined. That is, aes_x86_64 may be > >initialized before aes_ge

[RFC PATCH crypto 1/4] AES-NI: Move key_length in struct crypto_aes_ctx to be the last field

2009-01-04 Thread Huang Ying
The Intel AES-NI AES acceleration instructions need key_enc and key_dec in struct crypto_aes_ctx to be 16-byte aligned; moving key_length to be the last field makes this easier. Signed-off-by: Huang Ying --- arch/x86/crypto/aes-i586-asm_32.S |6 +++--- arch/x86/crypto/aes-x86_64-asm_64.S
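The layout argument in the commit message above can be sketched in plain C. This is only an illustration under stated assumptions: `struct sketch_aes_ctx` and `SKETCH_AES_MAX_KEYLENGTH_U32` are hypothetical stand-ins, not the kernel definitions, and the array size of 60 words (15 round keys of 4 words each) is assumed. With `key_length` last, both key schedules start at offsets that are multiples of 16, so they are 16-byte aligned whenever the structure itself is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 60 words = 15 round keys x 4 words (the AES-256 maximum). */
#define SKETCH_AES_MAX_KEYLENGTH_U32 60

/* Hypothetical stand-in for struct crypto_aes_ctx, with key_length last. */
struct sketch_aes_ctx {
    uint32_t key_enc[SKETCH_AES_MAX_KEYLENGTH_U32];
    uint32_t key_dec[SKETCH_AES_MAX_KEYLENGTH_U32];
    uint32_t key_length;
};

/* Compile-time checks: both schedules start at 16-byte multiples. */
static_assert(offsetof(struct sketch_aes_ctx, key_enc) % 16 == 0,
              "key_enc offset must be a multiple of 16");
static_assert(offsetof(struct sketch_aes_ctx, key_dec) % 16 == 0,
              "key_dec offset must be a multiple of 16");

/* Runtime form of the same check; returns 1 when both offsets qualify. */
int sketch_keys_at_16_byte_offsets(void)
{
    return offsetof(struct sketch_aes_ctx, key_enc) % 16 == 0 &&
           offsetof(struct sketch_aes_ctx, key_dec) % 16 == 0;
}
```

If `key_length` came first instead, `key_enc` would start at offset 4 and would need explicit padding to reach a 16-byte boundary.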

[RFC PATCH crypto 2/4] AES-NI: Export x86 AES encrypt/decrypt functions

2009-01-04 Thread Huang Ying
with 16 bytes alignment requirement of AES-NI implementation. Signed-off-by: Huang Ying --- arch/x86/crypto/aes-i586-asm_32.S | 18 +- arch/x86/crypto/aes-x86_64-asm_64.S |6 ++ arch/x86/crypto/aes_glue.c | 20 arch/x86/include/asm/aes.h

[RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-04 Thread Huang Ying
. - The ablkcipher asynchronous mechanism is used to defer a crypto request to work queue context when the FPU state is in use by another kernel context. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 756 + arch

[RFC PATCH crypto 3/4] AES-NI: Make it possible to use blkcipher_walk for ablkcipher algorithm

2009-01-04 Thread Huang Ying
-off-by: Huang Ying --- crypto/blkcipher.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) --- a/crypto/blkcipher.c +++ b/crypto/blkcipher.c @@ -68,6 +68,16 @@ static inline u8 *blkcipher_get_spot(u8 return max(start, end_page); } +static inline unsigned int

Re: [RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-09 Thread Huang Ying
e FOO(aes-aesni) in conjunction > with cryptd(FOO(aes-aesni)). Yes. This is really a simple method. Best Regards, Huang Ying signature.asc Description: This is a digitally signed message part

Re: [RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-11 Thread Huang Ying
cipher automatically. With this method, we can allocate only one cryptd tfm internally, without a dedicated blkcipher tfm. If the kernel is using the FPU, the cryptd tfm is used; otherwise, the underlying blkcipher tfm is used directly. Best Regards, Huang Ying

Re: [RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-11 Thread Huang Ying
ipher_request for cryptd(*) for each incoming struct ablkcipher_request. But is there any better solution? Best Regards, Huang Ying

Re: [RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-12 Thread Huang Ying
On Mon, 2009-01-12 at 18:43 +0800, Herbert Xu wrote: > On Mon, Jan 12, 2009 at 02:55:10PM +0800, Huang Ying wrote: > > > > I use a "shell" cbc(aes) algorithm which chooses between > > cryptd(__cbc-aes-aesni) and __cbc-aes-aesni according to context. But > > th

Re: [RFC PATCH crypto 4/4] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-12 Thread Huang Ying
On Tue, 2009-01-13 at 10:39 +0800, Herbert Xu wrote: > On Tue, Jan 13, 2009 at 10:34:13AM +0800, Huang Ying wrote: > > > > static void ablk_complete(struct crypto_async_request *req, int err) > > { > > struct ablkcipher_request *ablk_req = ablkcipher_request_ca

Use cryptd(%s) as cryptd-ed algorithm name instead of %s

2009-01-13 Thread Huang Ying
version. Signed-off-by: Huang Ying --- crypto/cryptd.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -215,7 +215,9 @@ static struct crypto_instance *cryptd_al ctx->state = state; - memcpy(inst->alg.cra_nam
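The naming change in the subject line can be modelled in user space as follows. This is a hedged sketch, not the patch itself: `sketch_cryptd_name`, `sketch_is_cryptd_name` and the 64-byte `SKETCH_MAX_ALG_NAME` limit are illustrative assumptions. It shows wrapping an algorithm name as `cryptd(%s)` and rejecting names that would not fit in the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_ALG_NAME 64 /* assumed buffer size for this sketch */

/* Writes "cryptd(<alg_name>)" into out.
 * Returns 0 on success, -1 if the wrapped name would be truncated. */
int sketch_cryptd_name(char out[SKETCH_MAX_ALG_NAME], const char *alg_name)
{
    int n = snprintf(out, SKETCH_MAX_ALG_NAME, "cryptd(%s)", alg_name);

    return (n < 0 || n >= SKETCH_MAX_ALG_NAME) ? -1 : 0;
}

/* Returns 1 when name carries the "cryptd(" prefix, 0 otherwise. */
int sketch_is_cryptd_name(const char *name)
{
    return strncmp(name, "cryptd(", 7) == 0;
}
```

With a distinct wrapped name, a caller can ask for the cryptd-ed variant explicitly and tell it apart from the algorithm it wraps.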

Re: Use cryptd(%s) as cryptd-ed algorithm name instead of %s

2009-01-13 Thread Huang Ying
On Wed, 2009-01-14 at 14:53 +0800, Herbert Xu wrote: > On Wed, Jan 14, 2009 at 02:44:08PM +0800, Huang Ying wrote: > > Because: > > > > 1. if use %s, you can only request cryptd(), not > >cryptd(), because generated new algorithm instance has > >algori

[RFC PATCH crypto -v3 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-13 Thread Huang Ying
cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for the specified algorithm name. The newly allocated one is guaranteed to be a cryptd-ed ablkcipher, so the underlying blkcipher can be obtained via cryptd_ablkcipher_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 19

[RFC PATCH crypto -v3 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-13 Thread Huang Ying
are implementation. - AES key scheduling algorithm is re-implemented with higher performance. - The ablkcipher asynchronous mechanism is used to defer a crypto request to work queue context when the FPU state is in use by another kernel context. Signed-off-by: Huang Ying --- arch/x86/crypto/Makef

[PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 460 ++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

[PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for the specified algorithm name. The newly allocated one is guaranteed to be a cryptd-ed ablkcipher, so the underlying blkcipher can be obtained via cryptd_ablkcipher_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 30

Re: [PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
On Thu, 2009-01-15 at 16:47 +0800, Herbert Xu wrote: > On Thu, Jan 15, 2009 at 04:28:33PM +0800, Huang Ying wrote: > > > > + tfm = crypto_alloc_ablkcipher(cryptd_alg_name, type, mask); > > + BUG_ON(crypto_ablkcipher_tfm(tfm)->__crt_alg->cra_module != > > +

Re: [PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
On Thu, 2009-01-15 at 17:23 +0800, Herbert Xu wrote: > On Thu, Jan 15, 2009 at 05:21:47PM +0800, Huang Ying wrote: > > On Thu, 2009-01-15 at 16:47 +0800, Herbert Xu wrote: > > > On Thu, Jan 15, 2009 at 04:28:33PM +0800, Huang Ying wrote: > > > > > > >

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
e to break > out of the loop? > i.e. > > while (!err && (nbytes = walk.nbytes)) > > (if that's erroneous, it occurs in other places as well) It seems that this is a bug. But the similar code in geode-aes.c and padlock-aes.c seems to have the same bug. I think we should fix them too. Best Regards, Huang Ying

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
On Fri, 2009-01-16 at 09:53 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 09:20:58AM +0800, Huang Ying wrote: > > On Thu, 2009-01-15 at 17:47 +0800, roel kluin wrote: > > > > > > + kernel_fpu_begin(); > > > > + while ((nbytes = walk.nbytes))

[RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-15 Thread Huang Ying
, create a dedicated workqueue for the crypto subsystem. This way, chainiv can use this crypto workqueue too. I will implement it if you have no plan to do it yourself. Best Regards, Huang Ying

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
On Fri, 2009-01-16 at 11:26 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 10:37:02AM +0800, Huang Ying wrote: > > > > But after checking blkcipher_walk_done() in 2.6.28, If input argument > > err != 0 and walk->flags & BLKCIPHER_WALK_SLOW != 0, when > > blk

[PATCH crypto -v5 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
processing in cryptd_alloc_ablkcipher() Signed-off-by: Huang Ying --- crypto/cryptd.c | 33 + include/crypto/cryptd.h | 27 +++ 2 files changed, 60 insertions(+) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -12,6 +12,7

[PATCH crypto -v5 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 461 +++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

[PATCH crypto -v6 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
processing in cryptd_alloc_ablkcipher() Signed-off-by: Huang Ying --- crypto/cryptd.c | 35 +++ include/crypto/cryptd.h | 27 +++ 2 files changed, 62 insertions(+) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -12,6 +12,7

[PATCH crypto -v6 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 461 +++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-21 Thread Huang Ying
On Fri, 2009-01-16 at 11:31 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 11:10:36AM +0800, Huang Ying wrote: > > > > The scalability of current cryptd implementation is not good. So a > > per-CPU cryptd kthread implementation is necessary. The per-CPU kthread > >

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-21 Thread Huang Ying
On Thu, 2009-01-22 at 11:04 +0800, Herbert Xu wrote: > On Thu, Jan 22, 2009 at 10:32:17AM +0800, Huang Ying wrote: > > > > This is the first attempt to use a dedicated workqueue for crypto. It is > > not intended to be merged. Please send your comments, especially on >

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
Sorry for the late reply. Last week was the Chinese New Year holiday. On Sat, 2009-01-24 at 15:07 +0800, Andrew Morton wrote: > On Thu, 22 Jan 2009 10:32:17 +0800 Huang Ying wrote: > > > Use dedicated workqueue for crypto > > > > - A dedicated workqueue named kcrypto_wq is created.

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
ystem 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps --w cryptowq end -- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.1

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
On Thu, 2009-01-22 at 15:30 +0800, Herbert Xu wrote: > On Thu, Jan 22, 2009 at 03:15:58PM +0800, Huang Ying wrote: The only needed spin lock usage is cryptd_tfm_in_queue() now, I think we can protect that via RCU, what's your opinion? > > Yes. Except that, now we do not need a sp

[PATCH 3/3] crypto: Uses kcrypto_wq instead of keventd_wq in chainiv

2009-02-01 Thread Huang Ying
keventd_wq has a potential starvation problem, so use the dedicated kcrypto_wq instead. Signed-off-by: Huang Ying --- crypto/Kconfig |1 + crypto/chainiv.c |3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -56,6 +56,7 @@ config

[PATCH 1/3] crypto: Use dedicated workqueue for crypto subsystem

2009-02-01 Thread Huang Ying
A dedicated workqueue named kcrypto_wq is created to be used by the crypto subsystem. The system-wide shared keventd_wq is not suitable for encryption/decryption because of a potential starvation problem. Signed-off-by: Huang Ying --- crypto/Kconfig |3 +++ crypto/Makefile

[PATCH 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-01 Thread Huang Ying
594minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps --w end -- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance g

Re: [PATCH 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-03 Thread Huang Ying
On Tue, 2009-02-03 at 17:10 +0800, Andrew Morton wrote: > On Mon, 02 Feb 2009 14:42:20 +0800 Huang Ying wrote: > > > The original cryptd thread implementation has a scalability issue; this > > patch solves the issue with a per-CPU thread implementation. > > > > struct c

[PATCH -v2 1/3] crypto: Use dedicated workqueue for crypto subsystem

2009-02-09 Thread Huang Ying
Use dedicated workqueue for crypto subsystem A dedicated workqueue named kcrypto_wq is created to be used by the crypto subsystem. The system-wide shared keventd_wq is not suitable for encryption/decryption because of a potential starvation problem. Signed-off-by: Huang Ying --- crypto/Kconfig

[PATCH -v2 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-09 Thread Huang Ying
-- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.192. Signed-off-by: Huang Ying --- crypto/Kconfig |1 crypto/cryptd.c | 220 ++-- 2 files
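The gain figure quoted above is the saved time divided by the faster (with-workqueue) time. As a minimal sketch, assuming the two median elapsed times from the message as inputs (`sketch_relative_gain` is a hypothetical helper name, not part of the patch):

```c
/* Relative gain of the faster run over the slower one,
 * using the faster elapsed time as the base, as in the message above. */
double sketch_relative_gain(double without_s, double with_s)
{
    return (without_s - with_s) / with_s;
}
```

Calling `sketch_relative_gain(0.31, 0.26)` reproduces the quoted value of roughly 0.192. Dividing by the slower time instead would give about 0.161, so the choice of base matters when comparing such numbers.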

[PATCH -v2 3/3] crypto: Uses kcrypto_wq instead of keventd_wq in chainiv

2009-02-09 Thread Huang Ying
Uses kcrypto_wq instead of keventd_wq in chainiv keventd_wq has a potential starvation problem, so use the dedicated kcrypto_wq instead. Signed-off-by: Huang Ying --- crypto/Kconfig |1 + crypto/chainiv.c |3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) --- a/crypto/Kconfig +++ b

Bug of dm-crypt?

2009-02-26 Thread Huang Ying
ion: kcryptd_async_done. This makes my AES-NI cryptd usage panic. Do you think that is a bug? Best Regards, Huang Ying

[BUGFIX] dm-crypt: Fix a bug of async cryption complete function

2009-02-27 Thread Huang Ying
equest. Signed-off-by: Huang Ying --- drivers/md/dm-crypt.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -60,6 +60,8 @@ struct dm_crypt_io { }; struct dm_crypt_request { + struct ablkcipher_reques

Re: Bug of dm-crypt?

2009-02-27 Thread Huang Ying
Hi, Milan, On Fri, 2009-02-27 at 16:41 +0800, Milan Broz wrote: > Herbert Xu wrote: > > On Fri, Feb 27, 2009 at 01:31:56PM +0800, Huang Ying wrote: > >> I once heard from you that the only thing guaranteed in the > >> completion function of async ablkcipher cr

[PATCH 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-04 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate the underlying tfm in cryptd_alloc_ablkcipher, because crypto_alloc_ablkcipher() prefers the GENIV-encapsulated crypto over the raw one, while cryptd_alloc_ablkcipher needs the raw one. Signed-off-by: Huang Ying --- crypto

[PATCH 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-04 Thread Huang Ying
" template, which makes these operations to be invoked for each request. Signed-off-by: Huang Ying --- crypto/Kconfig |7 ++ crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 174 insertions(+) --- a/crypto/Kconfig

[PATCH 3/3] crypto: Add AES-NI support for more modes

2009-03-04 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 256 + crypto/Kconfig |1 2 files changed, 257 insertions(+) --- a/arch

[PATCH -v2 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-09 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate the underlying tfm in cryptd_alloc_ablkcipher, because crypto_alloc_ablkcipher() prefers the GENIV-encapsulated crypto over the raw one, while cryptd_alloc_ablkcipher needs the raw one. Signed-off-by: Huang Ying --- crypto

[PATCH -v2 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-09 Thread Huang Ying
" template, which makes these operations to be invoked for each request. v2: Make FPU mode invisible to user Signed-off-by: Huang Ying --- crypto/Kconfig |5 + crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 172

[PATCH -v2 3/3] crypto: Add AES-NI support for more modes

2009-03-09 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. v2: Add description of mode acceleration support in Kconfig Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 256 + crypto/Kconfig

[PATCH -v3 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-17 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate the underlying tfm in cryptd_alloc_ablkcipher, because crypto_alloc_ablkcipher() prefers the GENIV-encapsulated crypto over the raw one, while cryptd_alloc_ablkcipher needs the raw one. Signed-off-by: Huang Ying --- crypto

[PATCH -v3 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-17 Thread Huang Ying
" template, which makes these operations to be invoked for each request. v2: Make FPU mode invisible to end user Signed-off-by: Huang Ying --- crypto/Kconfig |5 + crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 172

[PATCH -v3 3/3] crypto: Add AES-NI support for more modes

2009-03-17 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. v2: Add description of mode acceleration support in Kconfig v3: Fix some bugs of CTR block size, LRW and XTS min/max key size. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 267

Accelerate GCM with PCLMULQDQ-NI

2009-03-18 Thread Huang Ying
that of AES-NI, that is, XMM registers are used. To accelerate GCM with it, I make the following design: 1. Implement ghash as an ahash algorithm, Use ghash in gcm implementation. 2. Provide a new implementation of ghash with PCLMULQDQ-NI. What do you think about that? Best Regards, Huang Ying

Re: Accelerate GCM with PCLMULQDQ-NI

2009-03-29 Thread Huang Ying
On Sun, 2009-03-29 at 15:43 +0800, Herbert Xu wrote: > On Wed, Mar 18, 2009 at 04:52:12PM +0800, Huang Ying wrote: > > > > To accelerate GCM with it, I make the following design: > > > > 1. Implement ghash as an ahash algorithm, Use ghash in gcm > > imp

Re: GCM benchmark

2009-04-09 Thread Huang Ying
On Thu, 2009-04-09 at 16:21 +0800, Herbert Xu wrote: > On Thu, Apr 09, 2009 at 03:50:21PM +0800, Huang Ying wrote: > > Hi, Herbert, > > > > I am working on GCM acceleration with Intel new PCLMULQDQ instructions > > now. Can you tell me how to do GCM benchma

[RFC 0/7] crypto: PCLMULQDQ accelerated GHASH

2009-06-11 Thread Huang Ying
Hi, Herbert, This patchset adds PCLMULQDQ accelerated GHASH. Because conversion from crypto_hash to crypto_shash has not been done, this patchset is not intended to be merged now. Please take a look at the general design. Best Regards, Huang Ying -- To unsubscribe from this list: send the line

[RFC 3/7] crypto: Add crypto_spawn_shash

2009-06-11 Thread Huang Ying
Needed to use shash in cryptd hash. Signed-off-by: Huang Ying --- crypto/shash.c |6 ++ include/crypto/algapi.h |8 2 files changed, 14 insertions(+) --- a/include/crypto/algapi.h +++ b/include/crypto/algapi.h @@ -240,6 +240,14 @@ static inline struct cipher_alg

[RFC 5/7] crypto: cryptd: Add support to access underlaying shash

2009-06-11 Thread Huang Ying
cryptd_alloc_ahash() will allocate a cryptd-ed ahash for the specified algorithm name. The newly allocated one is guaranteed to be a cryptd-ed ahash, so the underlying shash can be obtained via cryptd_ahash_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 35

[RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-11 Thread Huang Ying
GHASH is implemented as a shash algorithm. The actual implementation is copied from gcm.c. This makes it possible to add architecture/hardware accelerated GHASH implementation. Signed-off-by: Huang Ying --- crypto/Kconfig |7 + crypto/Makefile|2 crypto/ghash-generic.c

[RFC 4/7] crypto: use crypto_shash instead of crypto_hash in cryptd hash

2009-06-11 Thread Huang Ying
The crypto_hash interface has some issues and will be replaced by crypto_shash. This patch replaces crypto_hash in cryptd hash with crypto_shash. Signed-off-by: Huang Ying --- crypto/cryptd.c | 118 ++-- 1 file changed, 73 insertions(+), 45

[RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-11 Thread Huang Ying
This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c |7 --- arch/x86/include/asm/i387.h|7 +++ 2 files changed, 7 insertions(+), 7 deletions(-) --- a/arch

[RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-11 Thread Huang Ying
asynchronous interface. Signed-off-by: Huang Ying --- crypto/Kconfig |2 crypto/gcm.c | 531 +++-- 2 files changed, 367 insertions(+), 166 deletions(-) --- a/crypto/gcm.c +++ b/crypto/gcm.c @@ -12,6 +12,7 @@ #include #include #include

[RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-06-11 Thread Huang Ying
, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be deferred to the cryptd kernel thread. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile

[BUGFIX 1/3] crypto: Fix AES-NI cbc mode IV saving

2009-06-15 Thread Huang Ying
The original implementation of aesni_cbc_dec does not save the IV if input length % 4 == 0. This makes decryption of the next block fail. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_asm.S |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/arch/x86/crypto/aesni

[BUGFIX 2/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP flag in AES-NI accelerated ecb/cbc mode

2009-06-15 Thread Huang Ying
Because AES-NI instructions will touch XMM state, the corresponding code must be enclosed within kernel_fpu_begin/end, which use preempt_disable/enable. So sleeping must be prevented between kernel_fpu_begin/end. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c |4 1 file
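The fix described above amounts to masking the "may sleep" bit out of the request flags before entering the FPU section. A user-space sketch of that masking step follows; `SKETCH_REQ_MAY_SLEEP` is assumed to mirror CRYPTO_TFM_REQ_MAY_SLEEP, its value is an assumption of this example, and `sketch_flags_for_fpu_section` is a hypothetical helper rather than code from the patch.

```c
/* Assumption: stands in for CRYPTO_TFM_REQ_MAY_SLEEP; value chosen for the sketch. */
#define SKETCH_REQ_MAY_SLEEP 0x00000200u

/* Returns the request flags with the may-sleep bit cleared,
 * leaving every other flag bit untouched. */
unsigned int sketch_flags_for_fpu_section(unsigned int flags)
{
    return flags & ~SKETCH_REQ_MAY_SLEEP;
}
```

The point of clearing the bit is that code running with preemption disabled must not be told it is allowed to sleep, whatever the caller originally requested.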

[BUGFIX 3/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP from fpu template

2009-06-15 Thread Huang Ying
kernel_fpu_begin/end use preempt_disable/enable, so sleeping must be prevented between kernel_fpu_begin/end. Signed-off-by: Huang Ying --- arch/x86/crypto/fpu.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/x86/crypto/fpu.c +++ b/arch/x86/crypto/fpu.c @@ -48,7 +48,7

Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-17 Thread Huang Ying
ous #TS faults, while AES and PCLMUL need to check whether MMX/SSE registers are available. After some thinking, I think something like the following may be more appropriate: /* This may be useful for someone else */ static inline bool fpu_using(void) { return !(read_cr0() & X86_CR0_TS); }
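The check proposed above tests a single bit of CR0. Since read_cr0() is privileged, the following is only a user-space model: the CR0 value is passed in as a parameter, `sketch_fpu_using` is a hypothetical name, and `SKETCH_X86_CR0_TS` assumes TS is bit 3 of CR0 (0x8) as documented for x86.

```c
#include <stdbool.h>

/* Assumption: Task Switched flag is bit 3 of CR0. */
#define SKETCH_X86_CR0_TS 0x00000008UL

/* Models the proposed fpu_using(): true when TS is clear,
 * i.e. FPU/SSE instructions would not raise a device-not-available fault. */
bool sketch_fpu_using(unsigned long cr0)
{
    return !(cr0 & SKETCH_X86_CR0_TS);
}
```

For example, a CR0 value of 0x80050033 has TS clear and yields true, while 0x8005003B has TS set and yields false.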

Re: [RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-17 Thread Huang Ying
On Thu, 2009-06-18 at 04:04 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-11 15:10:26 [+0800]: > > >GHASH is implemented as a shash algorithm. The actual implementation > >is copied from gcm.c. This makes it possible to add > >architecture/ha

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-17 Thread Huang Ying
On Thu, 2009-06-18 at 04:47 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-11 15:10:28 [+0800]: > > >Remove the dedicated GHASH implementation in GCM, and use the GHASH > >digest algorithm instead. This will make GCM use hardware accelerated >

Re: [RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-18 Thread Huang Ying
On Thu, 2009-06-18 at 15:27 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-18 10:08:27 [+0800]: > > >On Thu, 2009-06-18 at 04:04 +0800, Sebastian Andrzej Siewior wrote: > >> >+#include > >> >+#include > >> >+#include

Re: [BUGFIX 2/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP flag in AES-NI accelerated ecb/cbc mode

2009-06-18 Thread Huang Ying
On Thu, 2009-06-18 at 19:40 +0800, Herbert Xu wrote: > On Mon, Jun 15, 2009 at 05:04:57PM +0800, Huang Ying wrote: > > Because AES-NI instructions will touch XMM state, corresponding code > > must be enclosed within kernel_fpu_begin/end, which used > > preempt_disable/enabl

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-21 Thread Huang Ying
On Sun, 2009-06-21 at 21:46 +0800, Herbert Xu wrote: > Huang Ying wrote: > > > > + ghash = crypto_alloc_ahash("ghash", 0, 0); > > + if (IS_ERR(ghash)) > > + return PTR_ERR(ghash); > > We should add this as an extra parameter t

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-21 Thread Huang Ying
On Mon, 2009-06-22 at 10:03 +0800, Herbert Xu wrote: > On Mon, Jun 22, 2009 at 09:41:16AM +0800, Huang Ying wrote: > > > > Can crypto_alloc_ahash("ghash",...) select among different ghash > > implementation automatically based on priority? I think > > crypto

Re: [RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-07-06 Thread Huang Ying
Hi, Herbert, On Sun, 2009-06-21 at 21:51 +0800, Herbert Xu wrote: > Huang Ying wrote: > > PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, > > carry-less multiplication. More information about PCLMULQDQ can be > > found at: > > > > http://

Re: [RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-07-06 Thread Huang Ying
ren't any remaining DIGEST algorithms :) > > I'll get onto hmac. Thank you. Will post the updated version after you have done with hmac. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...

[BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-03 Thread Huang Ying
with geniv, but the default geniv mode may not work with ctr(aes). As with rfc3686, this is fixed by specifying the geniv mode as "seqiv". Signed-off-by: Huang Ying --- crypto/ctr.c |2 ++ 1 file changed, 2 insertions(+) --- a/crypto/ctr.c +++ b/crypto/ctr.c @@ -219,6 +219,8 @@ static st

[PATCH -v2 1/5] crypto: Add GHASH digest algorithm for GCM

2009-08-03 Thread Huang Ying
GHASH is implemented as a shash algorithm. The actual implementation is copied from gcm.c. This makes it possible to add architecture/hardware accelerated GHASH implementation. v2: - Fix a bug in Makefile (Thanks Sebastian) - Some other minor fixes Signed-off-by: Huang Ying --- crypto

[PATCH -v2 3/5] crypto: cryptd: Add support to access underlaying shash

2009-08-03 Thread Huang Ying
cryptd_alloc_ahash() will allocate a cryptd-ed ahash for the specified algorithm name. The newly allocated one is guaranteed to be a cryptd-ed ahash, so the underlying shash can be obtained via cryptd_ahash_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 35

[PATCH -v2 4/5] x86: Move kernel_fpu_using to irq_is_fpu_using in asm/i387.h

2009-08-03 Thread Huang Ying
This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying CC: H. Peter Anvin --- arch/x86/crypto/aesni-intel_glue.c | 17 + arch/x86

[PATCH -v2 5/5] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-08-03 Thread Huang Ying
, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be deferred to the cryptd kernel thread. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile

[PATCH -v2 2/5] crypto: Use GHASH digest algorithm in GCM

2009-08-03 Thread Huang Ying
asynchronous interface. v2: - Add parameter to gcm_base to choose ghash implementation. - Fix a memory leak about gcm_zeros (Thanks Sebastian) - Some minor fixes Signed-off-by: Huang Ying --- crypto/Kconfig |2 +- crypto/gcm.c | 580

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-05 Thread Huang Ying
On Wed, 2009-08-05 at 17:45 +0800, Herbert Xu wrote: > On Mon, Aug 03, 2009 at 03:44:43PM +0800, Huang Ying wrote: > > When doing "modprobe tcrypt mode=10", the following error will show > > in dmesg. > > > > alg: skcipher: Failed to load transform for ctr(a

Re: [PATCH -v2 5/5] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-08-06 Thread Huang Ying
On Thu, 2009-08-06 at 15:17 +0800, Herbert Xu wrote: > On Mon, Aug 03, 2009 at 03:45:31PM +0800, Huang Ying wrote: > > PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, > > carry-less multiplication. More information about PCLMULQDQ can be > > fo

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-11 Thread Huang Ying
On Thu, 2009-08-06 at 10:12 +0800, Huang Ying wrote: > On Wed, 2009-08-05 at 17:45 +0800, Herbert Xu wrote: > > On Mon, Aug 03, 2009 at 03:44:43PM +0800, Huang Ying wrote: > > > When doing "modprobe tcrypt mode=10", the following error will show > > > in dmes

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-13 Thread Huang Ying
lly we can't use seqiv on raw counter mode because it cannot > guarantee IV uniqueness. I think reverting to chainiv is the safer > option. I see seqiv is used in rfc3686 mode, it means seqiv can not be used on raw counter mode but can be used for rfc3686? Best Regards, Huang

[PATCH -v3] x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h

2009-08-30 Thread Huang Ying
PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying CC: H. Peter Anvin --- arch/x86/crypto/aesni-intel_glue.c | 17 + arch/x86

[PATCH -v3] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-14 Thread Huang Ying
Hi, Herbert, The dependency to irq_fpu_usable has been merged by linus' tree. Best Regards, Huang Ying --> PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More inf

Re: [PATCH -v3] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-15 Thread Huang Ying
On Tue, 2009-09-15 at 22:42 +0800, Daniel Walker wrote: > On Tue, 2009-09-15 at 13:42 +0800, Huang Ying wrote: > > Hi, Herbert, > > > > The dependency to irq_fpu_usable has been merged by linus' tree. > >

[PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-15 Thread Huang Ying
, performance increase about 2x. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/ghash-clmulni-intel_asm.S | 157 + arch/x86/crypto/ghash-clmulni-intel_glue.c | 333 + arch/x86/include/asm/cpufeature.h

[BUGFIX] Fix irq_fpu_usable usage in aesni

2009-10-18 Thread Huang Ying
not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -82,7 +82,7 @@ static int

[BUGFIX for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-02 Thread Huang Ying
. - Expand complete_for_next_step(). Signed-off-by: Huang Ying --- crypto/gcm.c | 43 --- 1 file changed, 28 insertions(+), 15 deletions(-) --- a/crypto/gcm.c +++ b/crypto/gcm.c @@ -267,8 +267,7 @@ static int gcm_hash_final(struct aead_re return

[BUGFIX for crypto-dev] crypto, Fix irq_fpu_usable usage in clmulni-intel

2009-11-02 Thread Huang Ying
-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying --- arch/x86/crypto/ghash-clmulni-intel_glue.c |8 1 file changed, 4 insertions(+), 4 deletions(-) --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Huang Ying
> > > I'm happy to revisit this once inst.h exists. > > No reason to not do most of the change first though, the way i suggested > it. How about something as below? But it seems not appropriate to put these bits into i387.h, that is, to combine C and gas syntax. Best Reg
