Re: [PATCH v3 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation

2013-05-06 Thread Tim Chen
On Wed, 2013-05-01 at 12:52 -0700, Tim Chen wrote: > Currently the CRC-T10DIF checksum is computed using a generic table lookup > algorithm. By switching the checksum to PCLMULQDQ based computation, > we can speed up the computation by 8x for checksumming 512 bytes and > even mor
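
For readers skimming the thread, the "generic table lookup algorithm" mentioned in the cover letter can be sketched roughly as below. This is an illustrative sketch, not the kernel source: the names `t10dif_build_table`, `t10dif_generic` and `t10dif_table` are hypothetical, and the table is generated at run time here rather than stored as a constant array. It assumes the standard CRC-16/T10-DIF parameters (polynomial 0x8BB7, initial value 0, no bit reflection, no final XOR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16/T10-DIF generator polynomial (x^16 term implied). */
#define T10DIF_POLY 0x8BB7u

/* Hypothetical lookup table, filled in by t10dif_build_table(). */
static uint16_t t10dif_table[256];

/* Precompute the CRC of every possible byte value, MSB first. */
void t10dif_build_table(void)
{
    for (unsigned int i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);

        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ T10DIF_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
        t10dif_table[i] = crc;
    }
}

/*
 * Byte-at-a-time table lookup: one table access per input byte.
 * Pass 0 as the starting crc, or a previous result to continue a
 * running checksum over several buffers.
 */
uint16_t t10dif_generic(uint16_t crc, const unsigned char *buf, size_t len)
{
    /* Precondition: t10dif_build_table() has been called. */
    assert(t10dif_table[1] == T10DIF_POLY);

    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^ t10dif_table[((crc >> 8) ^ buf[i]) & 0xFF]);

    return crc;
}
```

The loop performs one dependent table lookup per byte, which is the serial bottleneck that a carry-less-multiply approach avoids by folding many bytes per step.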

Re: [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-05-01 Thread Tim Chen
On Tue, 2013-04-30 at 11:27 +0800, Herbert Xu wrote: > On Mon, Apr 29, 2013 at 01:40:30PM -0700, Tim Chen wrote: > > > > If I allocate the transform under the mod init instead, how can I make > > sure that the fast version is already registered if I have it compiled > >

[PATCH v3 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction

2013-05-01 Thread Tim Chen
ents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf Signed-off-by: Tim Chen --- arch/x86/crypto/crct10dif-pcl-asm_64.S | 643 + 1 file changed, 643 insertions(+) create mode 100644 arch/x86/crypto/crct10dif-pcl-asm_64.S diff --git a/arch/

[PATCH v3 4/4] Simple correctness and speed test for CRCT10DIF hash

2013-05-01 Thread Tim Chen
turbo off when running the speed test so the frequency governor will not tweak the frequency and affect the measurements. Signed-off-by: Tim Chen --- crypto/tcrypt.c | 8 crypto/testmgr.c | 10 ++ crypto/testmgr.h | 33 + 3 files changed, 51

[PATCH v3 3/4] Glue code to cast accelerated CRCT10DIF assembly as a crypto transform

2013-05-01 Thread Tim Chen
: Tim Chen --- arch/x86/crypto/Makefile | 2 + arch/x86/crypto/crct10dif-pclmul_glue.c | 157 crypto/Kconfig | 11 +++ 3 files changed, 170 insertions(+) create mode 100644 arch/x86/crypto/crct10dif-pclmul_glue.c diff --git

[PATCH v3 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-05-01 Thread Tim Chen
When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen --- crypto/Kconfig

[PATCH v3 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation

2013-05-01 Thread Tim Chen
Herbert Xu, Matthew Wilcox and Jussi Kivilinna who reviewed the patches and Keith Busch for testing version 1 of the patch set. Tim Chen (4): Wrap crc_t10dif function all to use crypto transform framework Accelerated CRC T10 DIF computation with PCLMULQDQ instruction Glue code to cast

Re: [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-04-29 Thread Tim Chen
On Sun, 2013-04-28 at 08:11 +0800, Herbert Xu wrote: > On Fri, Apr 26, 2013 at 09:44:17AM -0700, Tim Chen wrote: > > > > + old_tfm = crct10dif_tfm; > > + crc_t10dif_newalg = true; > > + /* make sure new alg flag is turned on before starting

Re: [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-04-26 Thread Tim Chen
On Fri, 2013-04-26 at 20:52 +0800, Herbert Xu wrote: > On Thu, Apr 25, 2013 at 10:28:30AM -0700, Tim Chen wrote: > > > > @@ -51,6 +54,98 @@ static const __u16 t10_dif_crc_table[256] = { > > 0xF0D8, 0x7B6F, 0x6C01, 0xE7B6, 0x42DD, 0xC96A, 0xDE04, 0x55B3 >

Re: [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-04-25 Thread Tim Chen
calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen --- crypto/Makefile | 1 + crypto/

[PATCH v2 4/4] Simple correctness and speed test for CRCT10DIF hash

2013-04-17 Thread Tim Chen
turbo off when running the speed test so the frequency governor will not tweak the frequency and affect the measurements. Signed-off-by: Tim Chen --- crypto/tcrypt.c | 8 crypto/testmgr.c | 10 ++ crypto/testmgr.h | 33 + 3 files changed, 51

[PATCH v2 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction

2013-04-17 Thread Tim Chen
ents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf Signed-off-by: Tim Chen --- arch/x86/crypto/crct10dif-pcl-asm_64.S | 643 + 1 file changed, 643 insertions(+) create mode 100644 arch/x86/crypto/crct10dif-pcl-asm_64.S diff --git a/arch/

[PATCH v2 3/4] Glue code to cast accelerated CRCT10DIF assembly as a crypto transform

2013-04-17 Thread Tim Chen
: Tim Chen --- arch/x86/crypto/Makefile | 2 + arch/x86/crypto/crct10dif-pclmul_glue.c | 153 crypto/Kconfig | 21 + 3 files changed, 176 insertions(+) create mode 100644 arch/x86/crypto/crct10dif-pclmul_glue.c diff

[PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-04-17 Thread Tim Chen
When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen --- include/linux/crc-t10dif.h

[PATCH v2 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation

2013-04-17 Thread Tim Chen
ths through crc t10dif computation. 4. Fix config dependencies of CRYPTO_CRCT10DIF. Thanks to Matthew and Jussi who reviewed the patches and Keith for testing version 1 of the patch set. Tim Chen (4): Wrap crc_t10dif function all to use crypto transform framework Accelerated CRC T10

Re: [PATCH 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction

2013-04-17 Thread Tim Chen
On Wed, 2013-04-17 at 20:58 +0300, Jussi Kivilinna wrote: > On 16.04.2013 19:20, Tim Chen wrote: > > This is the x86_64 CRC T10 DIF transform accelerated with the PCLMULQDQ > > instructions. Details discussing the implementation can be found in the > > paper: > >

Re: [PATCH 4/4] Simple correctness and speed test for CRCT10DIF hash

2013-04-17 Thread Tim Chen
On Wed, 2013-04-17 at 20:58 +0300, Jussi Kivilinna wrote: > On 16.04.2013 19:20, Tim Chen wrote: > > These are simple tests to do sanity check of CRC T10 DIF hash. The > > correctness of the transform can be checked with the command > > modprobe tcrypt mode=47 > >

[PATCH 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction

2013-04-16 Thread Tim Chen
323102.pdf Signed-off-by: Tim Chen Tested-by: Keith Busch --- arch/x86/crypto/crct10dif-pcl-asm_64.S | 659 + 1 file changed, 659 insertions(+) create mode 100644 arch/x86/crypto/crct10dif-pcl-asm_64.S diff --git a/arch/x86/crypto/crct10dif-pcl-asm_64.S b/arch/

[PATCH 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation

2013-04-16 Thread Tim Chen
. Will appreciate if you can consider merging this for the 3.10 kernel. Tim Tim Chen (4): Wrap crc_t10dif function all to use crypto transform framework Accelerated CRC T10 DIF computation with PCLMULQDQ instruction Glue code to cast accelerated CRCT10DIF assembly as a crypto transform

[PATCH 3/4] Glue code to cast accelerated CRCT10DIF assembly as a crypto transform

2013-04-16 Thread Tim Chen
: Tim Chen Tested-by: Keith Busch --- arch/x86/crypto/Makefile | 2 + arch/x86/crypto/crct10dif-pclmul_glue.c | 153 crypto/Kconfig | 21 + 3 files changed, 176 insertions(+) create mode 100644 arch/x86/crypto/crct10dif

[PATCH 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-04-16 Thread Tim Chen
When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen Tested-by: Keith Busch

[PATCH 4/4] Simple correctness and speed test for CRCT10DIF hash

2013-04-16 Thread Tim Chen
turbo off when running the speed test so the frequency governor will not tweak the frequency and affect the measurements. Signed-off-by: Tim Chen Tested-by: Keith Busch --- crypto/tcrypt.c | 8 crypto/testmgr.c | 10 ++ crypto/testmgr.h | 24 3 files