[PATCH v5 1/4] lib/raid6: Add log-of-2 table for RAID6 HW requiring disk position
The raid6_gfexp table represents {2}^n values for 0 <= n < 256. The
Linux async_tx framework passes values from raid6_gfexp as coefficients
for each source to the prep_dma_pq() callback of a DMA channel with PQ
capability. This creates a problem for RAID6 offload engines (such as
Broadcom SBA) which take the disk position (i.e. log of {2}) instead of
multiplicative coefficients from the raid6_gfexp table.

This patch adds the raid6_gflog table, which holds the log-of-2 value
for any given x such that 0 <= x < 256. For any given disk coefficient
x, the corresponding disk position is given by raid6_gflog[x]. The RAID6
offload engine driver can use this newly added raid6_gflog table to get
the disk position from the multiplicative coefficient.

Signed-off-by: Anup Patel
Reviewed-by: Scott Branden
Reviewed-by: Ray Jui
---
 include/linux/raid/pq.h |  1 +
 lib/raid6/mktables.c    | 20
 2 files changed, 21 insertions(+)

diff --git a/include/linux/raid/pq.h b/include/linux/raid/pq.h
index 4d57bba..30f9453 100644
--- a/include/linux/raid/pq.h
+++ b/include/linux/raid/pq.h
@@ -142,6 +142,7 @@ int raid6_select_algo(void);
 extern const u8 raid6_gfmul[256][256] __attribute__((aligned(256)));
 extern const u8 raid6_vgfmul[256][32] __attribute__((aligned(256)));
 extern const u8 raid6_gfexp[256] __attribute__((aligned(256)));
+extern const u8 raid6_gflog[256] __attribute__((aligned(256)));
 extern const u8 raid6_gfinv[256] __attribute__((aligned(256)));
 extern const u8 raid6_gfexi[256] __attribute__((aligned(256)));
 
diff --git a/lib/raid6/mktables.c b/lib/raid6/mktables.c
index 39787db..e824d08 100644
--- a/lib/raid6/mktables.c
+++ b/lib/raid6/mktables.c
@@ -125,6 +125,26 @@ int main(int argc, char *argv[])
 	printf("EXPORT_SYMBOL(raid6_gfexp);\n");
 	printf("#endif\n");
 
+	/* Compute log-of-2 table */
+	printf("\nconst u8 __attribute__((aligned(256)))\n"
+	       "raid6_gflog[256] =\n" "{\n");
+	for (i = 0; i < 256; i += 8) {
+		printf("\t");
+		for (j = 0; j < 8; j++) {
+			v = 255;
+			for (k = 0; k < 256; k++)
+				if (exptbl[k] == (i + j)) {
+					v = k;
+					break;
+				}
+			printf("0x%02x,%c", v, (j == 7) ? '\n' : ' ');
+		}
+	}
+	printf("};\n");
+	printf("#ifdef __KERNEL__\n");
+	printf("EXPORT_SYMBOL(raid6_gflog);\n");
+	printf("#endif\n");
+
 	/* Compute inverse table x^-1 == x^254 */
 	printf("\nconst u8 __attribute__((aligned(256)))\n"
 	       "raid6_gfinv[256] =\n" "{\n");
-- 
2.7.4
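For readers who want to check the table relationship outside the kernel
build, here is a minimal userspace sketch of the same search. It assumes
the RAID6 field polynomial 0x11d with generator {2} and mirrors the
convention that entry 255 of the exponent table is forced to 0. The
names gf_mul(), build_exp() and build_log() are illustrative only and
are not part of the patch.

```c
#include <stdint.h>

/* Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t v = 0;

	while (b) {
		if (b & 1)
			v ^= a;
		/* Multiply a by {2}, reducing when the top bit falls out. */
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
		b >>= 1;
	}
	return v;
}

/* exp_tbl[n] = {2}^n for n < 255; entry 255 is forced to 0. */
static void build_exp(uint8_t exp_tbl[256])
{
	uint8_t v = 1;
	int n;

	for (n = 0; n < 256; n++) {
		exp_tbl[n] = v;
		v = gf_mul(v, 2);
		if (v == 1)
			v = 0;	/* entry 255 is not a real power */
	}
}

/* log_tbl[x] = first k with exp_tbl[k] == x, or 255 if none is found. */
static void build_log(const uint8_t exp_tbl[256], uint8_t log_tbl[256])
{
	int x, k;

	for (x = 0; x < 256; x++) {
		uint8_t v = 255;

		for (k = 0; k < 256; k++)
			if (exp_tbl[k] == x) {
				v = (uint8_t)k;
				break;
			}
		log_tbl[x] = v;
	}
}
```

With these tables, a coefficient exp_tbl[n] handed down by async_tx maps
back to disk position n through log_tbl[exp_tbl[n]] for every n below
255.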
[PATCH v5 3/4] dmaengine: Add Broadcom SBA RAID driver
The Broadcom stream buffer accelerator (SBA) provides offloading
capabilities for RAID operations. This SBA offload engine is
accessible via the Broadcom SoC specific ring manager.

This patch adds the Broadcom SBA RAID driver, which provides one
DMA device with RAID capabilities using one or more Broadcom
SoC specific ring manager channels. The SBA RAID driver in its
current form implements memcpy, xor, and pq operations.

Signed-off-by: Anup Patel
Reviewed-by: Ray Jui
---
 drivers/dma/Kconfig        |   14 +
 drivers/dma/Makefile       |    1 +
 drivers/dma/bcm-sba-raid.c | 1785
 3 files changed, 1800 insertions(+)
 create mode 100644 drivers/dma/bcm-sba-raid.c

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 263495d..3d23597 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -99,6 +99,20 @@ config AXI_DMAC
 	  controller is often used in Analog Device's reference designs for FPGA
 	  platforms.
 
+config BCM_SBA_RAID
+	tristate "Broadcom SBA RAID engine support"
+	depends on (ARM64 && MAILBOX && RAID6_PQ) || COMPILE_TEST
+	select DMA_ENGINE
+	select DMA_ENGINE_RAID
+	select ASYNC_TX_DISABLE_XOR_VAL_DMA
+	select ASYNC_TX_DISABLE_PQ_VAL_DMA
+	default ARCH_BCM_IPROC
+	help
+	  Enable support for Broadcom SBA RAID Engine. The SBA RAID
+	  engine is available on most of the Broadcom iProc SoCs. It
+	  has the capability to offload memcpy, xor and pq computation
+	  for raid5/6.
+
 config COH901318
 	bool "ST-Ericsson COH901318 DMA support"
 	select DMA_ENGINE
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index a4fa336..ba96bdd 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/
 obj-$(CONFIG_AT_HDMAC) += at_hdmac.o
 obj-$(CONFIG_AT_XDMAC) += at_xdmac.o
 obj-$(CONFIG_AXI_DMAC) += dma-axi-dmac.o
+obj-$(CONFIG_BCM_SBA_RAID) += bcm-sba-raid.o
 obj-$(CONFIG_COH901318) += coh901318.o coh901318_lli.o
 obj-$(CONFIG_DMA_BCM2835) += bcm2835-dma.o
 obj-$(CONFIG_DMA_JZ4740) += dma-jz4740.o
diff --git a/drivers/dma/bcm-sba-raid.c b/drivers/dma/bcm-sba-raid.c
new file mode 100644
index 000..d6b927b
--- /dev/null
+++ b/drivers/dma/bcm-sba-raid.c
@@ -0,0 +1,1785 @@
+/*
+ * Copyright (C) 2017 Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/*
+ * Broadcom SBA RAID Driver
+ *
+ * The Broadcom stream buffer accelerator (SBA) provides offloading
+ * capabilities for RAID operations. The SBA offload engine is accessible
+ * via Broadcom SoC specific ring manager. Two or more offload engines
+ * can share the same Broadcom SoC specific ring manager. Because of this,
+ * the Broadcom SoC specific ring manager driver is implemented as a mailbox
+ * controller driver and offload engine drivers are mailbox clients.
+ *
+ * Typically, Broadcom SoC specific ring manager will implement a larger
+ * number of hardware rings over one or more SBA hardware devices. By
+ * design, the internal buffer size of SBA hardware device is limited
+ * but all offload operations supported by SBA can be broken down into
+ * multiple small requests and executed in parallel on multiple SBA
+ * hardware devices to achieve high throughput.
+ *
+ * The Broadcom SBA RAID driver does not require any register programming
+ * except submitting requests to SBA hardware device via mailbox channels.
+ * This driver implements a DMA device with one DMA channel using a set
+ * of mailbox channels provided by Broadcom SoC specific ring manager
+ * driver. To exploit parallelism (as described above), all DMA requests
+ * coming to SBA RAID DMA channel are broken down into smaller requests
+ * and submitted to multiple mailbox channels in round-robin fashion.
+ * To have more SBA DMA channels, we can create more SBA device nodes
+ * in Broadcom SoC specific DTS based on number of hardware rings supported
+ * by Broadcom SoC ring manager.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "dmaengine.h"
+
+/* SBA command related defines */
+#define SBA_TYPE_SHIFT 48
+#define SBA_TYPE_MASK GENMASK(1, 0)
+#define SBA_TYPE_A 0x0
+#define SBA_TYPE_B 0x2
+#define SBA_TYPE_C 0x3
+#define SBA_USER_DEF_SHIFT 32
+#define SBA_USER_DEF_MASK GENMASK(15, 0)
+#define SBA_R_MDATA_SHIFT 24
+#define SBA_R_MDATA_MASK GENMASK(7, 0)
+#define SBA_C_MDATA_MS_SHIFT 18
+#define SBA_C_MDATA_MS_MASK
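The shift/mask pairs above describe fields packed into a 64-bit SBA
command word. As a rough illustration of how such pairs are typically
used, the sketch below packs a value into a field. model_cmd_enc() is a
hypothetical helper written for this note; its body is an assumption and
is not copied from the driver. The GENMASK() values are expanded by hand
so the snippet builds in userspace.

```c
#include <stdint.h>

/* Field positions copied from the defines in the patch. */
#define SBA_TYPE_SHIFT		48
#define SBA_TYPE_MASK		0x3u	/* GENMASK(1, 0) */
#define SBA_TYPE_B		0x2
#define SBA_USER_DEF_SHIFT	32
#define SBA_USER_DEF_MASK	0xffffu	/* GENMASK(15, 0) */

/*
 * Hypothetical helper (not taken from the driver): clear the field at
 * 'shift' and store 'val', truncated to 'mask', in its place.
 */
static uint64_t model_cmd_enc(uint64_t cmd, uint32_t val,
			      uint32_t shift, uint32_t mask)
{
	cmd &= ~((uint64_t)mask << shift);
	cmd |= (uint64_t)(val & mask) << shift;
	return cmd;
}
```

Clearing the field before OR-ing in the new value makes the helper safe
to call repeatedly on the same command word, and masking the value keeps
an out-of-range input from spilling into neighbouring fields.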
[PATCH v5 4/4] dt-bindings: Add DT bindings document for Broadcom SBA RAID driver
This patch adds the DT bindings document for the newly added Broadcom
SBA RAID driver.

Signed-off-by: Anup Patel
Reviewed-by: Ray Jui
Reviewed-by: Scott Branden
---
 .../devicetree/bindings/dma/brcm,iproc-sba.txt | 29 ++
 1 file changed, 29 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt

diff --git a/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt b/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
new file mode 100644
index 000..092913a
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
@@ -0,0 +1,29 @@
+* Broadcom SBA RAID engine
+
+Required properties:
+- compatible: Should be one of the following
+	      "brcm,iproc-sba"
+	      "brcm,iproc-sba-v2"
+  The "brcm,iproc-sba" has support for only 6 PQ coefficients
+  The "brcm,iproc-sba-v2" has support for only 30 PQ coefficients
+- mboxes: List of phandle and mailbox channel specifiers
+
+Example:
+
+raid_mbox: mbox@6740 {
+	...
+	#mbox-cells = <3>;
+	...
+};
+
+raid0 {
+	compatible = "brcm,iproc-sba-v2";
+	mboxes = <&raid_mbox 0 0x1 0x>,
+		 <&raid_mbox 1 0x1 0x>,
+		 <&raid_mbox 2 0x1 0x>,
+		 <&raid_mbox 3 0x1 0x>,
+		 <&raid_mbox 4 0x1 0x>,
+		 <&raid_mbox 5 0x1 0x>,
+		 <&raid_mbox 6 0x1 0x>,
+		 <&raid_mbox 7 0x1 0x>;
+};
-- 
2.7.4
[PATCH v5 0/4] Broadcom SBA RAID support
The Broadcom SBA RAID is a stream-based device which provides
RAID5/6 offload.

It requires a SoC specific ring manager (such as the Broadcom FlexRM
ring manager) to provide a ring-based programming interface. Due to
this, the Broadcom SBA RAID driver (mailbox client) implements a
DMA device having one DMA channel using a set of mailbox channels
provided by the Broadcom SoC specific ring manager driver (mailbox
controller).

The Broadcom SBA RAID hardware requires PQ disk position instead
of PQ disk coefficient. To address this, we have added the raid6_gflog
table, which helps the driver convert a PQ disk coefficient to a PQ
disk position.

This patchset is based on Linux-4.10-rc2 and depends on patchset
"[PATCH v4 0/2] Broadcom FlexRM ring manager support"

It is also available at sba-raid-v5 branch of
https://github.com/Broadcom/arm64-linux.git

Changes since v4:
 - Removed dependency of bcm-sba-raid driver on kconfig option
   ASYNC_TX_ENABLE_CHANNEL_SWITCH
 - Select kconfig options ASYNC_TX_DISABLE_XOR_VAL_DMA and
   ASYNC_TX_DISABLE_PQ_VAL_DMA for bcm-sba-raid driver
 - Implemented device_prep_dma_interrupt() using dummy
   8-byte copy operation so that the dma_async_device_register()
   can set DMA_ASYNC_TX capability for the DMA device provided
   by bcm-sba-raid driver

Changes since v3:
 - Replaced SBA_ENC() with sba_cmd_enc() inline function
 - Use list_first_entry_or_null() wherever possible
 - Remove unwanted braces around loops wherever possible
 - Use lockdep_assert_held() where required

Changes since v2:
 - Dropped patch to handle DMA devices having support for fewer
   PQ coefficients in Linux Async Tx
 - Added work-around in bcm-sba-raid driver to handle unsupported
   PQ coefficients using multiple SBA requests

Changes since v1:
 - Dropped patch to add mbox_channel_device() API
 - Used GENMASK and BIT macros wherever possible in bcm-sba-raid driver
 - Replaced C_MDATA macros with static inline functions in
   bcm-sba-raid driver
 - Removed sba_alloc_chan_resources() callback in bcm-sba-raid driver
 - Used dev_err() instead of dev_info() wherever applicable
 - Removed call to sba_issue_pending() from sba_tx_submit() in
   bcm-sba-raid driver
 - Implemented SBA request chaining for handling (len > sba->req_size)
   in bcm-sba-raid driver
 - Implemented device_terminate_all() callback in bcm-sba-raid driver

Anup Patel (4):
  lib/raid6: Add log-of-2 table for RAID6 HW requiring disk position
  async_tx: Fix DMA_PREP_FENCE usage in do_async_gen_syndrome()
  dmaengine: Add Broadcom SBA RAID driver
  dt-bindings: Add DT bindings document for Broadcom SBA RAID driver

 .../devicetree/bindings/dma/brcm,iproc-sba.txt |   29 +
 crypto/async_tx/async_pq.c                     |    5 +-
 drivers/dma/Kconfig                            |   14 +
 drivers/dma/Makefile                           |    1 +
 drivers/dma/bcm-sba-raid.c                     | 1785
 include/linux/raid/pq.h                        |    1 +
 lib/raid6/mktables.c                           |   20 +
 7 files changed, 1852 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/brcm,iproc-sba.txt
 create mode 100644 drivers/dma/bcm-sba-raid.c

-- 
2.7.4
[PATCH v5 2/4] async_tx: Fix DMA_PREP_FENCE usage in do_async_gen_syndrome()
The DMA_PREP_FENCE flag is to be used when preparing a Tx descriptor
if the output of that Tx descriptor is to be used by the next/dependent
Tx descriptor.

DMA_PREP_FENCE will not be set correctly in do_async_gen_syndrome()
when calling dma->device_prep_dma_pq() under the following conditions:
1. ASYNC_TX_FENCE not set in submit->flags
2. DMA_PREP_FENCE not set in dma_flags
3. src_cnt (= (disks - 2)) is greater than dma_maxpq(dma, dma_flags)

This patch fixes DMA_PREP_FENCE usage in do_async_gen_syndrome(),
taking inspiration from the do_async_xor() implementation.

Signed-off-by: Anup Patel
Reviewed-by: Ray Jui
Reviewed-by: Scott Branden
---
 crypto/async_tx/async_pq.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index f83de99..56bd612 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -62,9 +62,6 @@ do_async_gen_syndrome(struct dma_chan *chan,
 	dma_addr_t dma_dest[2];
 	int src_off = 0;
 
-	if (submit->flags & ASYNC_TX_FENCE)
-		dma_flags |= DMA_PREP_FENCE;
-
 	while (src_cnt > 0) {
 		submit->flags = flags_orig;
 		pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
@@ -83,6 +80,8 @@ do_async_gen_syndrome(struct dma_chan *chan,
 			if (cb_fn_orig)
 				dma_flags |= DMA_PREP_INTERRUPT;
 		}
+		if (submit->flags & ASYNC_TX_FENCE)
+			dma_flags |= DMA_PREP_FENCE;
 
 		/* Drivers force forward progress in case they can not provide
 		 * a descriptor
-- 
2.7.4
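To make the three conditions concrete, here is a small userspace model
of the flag handling for the first chunk. It assumes, as condition 3
implies, that the loop marks every non-final chunk with ASYNC_TX_FENCE
in submit->flags. The MODEL_* constants and first_chunk_dma_flags() are
made up for this illustration and the bit values are arbitrary.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; values are arbitrary. */
#define MODEL_ASYNC_TX_FENCE	(1u << 0)
#define MODEL_DMA_PREP_FENCE	(1u << 1)

/*
 * Return the dma_flags that would be used for the first descriptor when
 * src_cnt sources are split into chunks of at most max_pq sources.
 * check_inside selects where the ASYNC_TX_FENCE test runs: inside the
 * loop body (patched code) or once before the loop (old code).
 */
static unsigned int first_chunk_dma_flags(int src_cnt, int max_pq,
					  unsigned int flags_orig,
					  bool check_inside)
{
	unsigned int dma_flags = 0;
	unsigned int submit_flags = flags_orig;

	/* Old placement: only the caller's original flags are seen. */
	if (!check_inside && (submit_flags & MODEL_ASYNC_TX_FENCE))
		dma_flags |= MODEL_DMA_PREP_FENCE;

	if (src_cnt > 0) {
		int pq_src_cnt = src_cnt < max_pq ? src_cnt : max_pq;

		submit_flags = flags_orig;
		/* More chunks follow, so the next one depends on this one. */
		if (src_cnt > pq_src_cnt)
			submit_flags |= MODEL_ASYNC_TX_FENCE;

		/* New placement: the per-chunk fence request is seen. */
		if (check_inside && (submit_flags & MODEL_ASYNC_TX_FENCE))
			dma_flags |= MODEL_DMA_PREP_FENCE;
	}
	return dma_flags;
}
```

With 10 sources, a limit of 4 per descriptor and no fence requested by
the caller, the old placement yields no fence on the first descriptor
while the new placement sets it.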
Re: [RFC PATCH v1 1/1] mm: zswap - Add crypto acomp/scomp framework support
On Wed, Feb 15, 2017 at 07:27:30PM +0530, Narayana Prasad Athreya wrote:
> > I assume all of these crypto_acomp_[compress|decompress] calls are
> > actually synchronous,
> > not asynchronous as the name suggests. Otherwise, this would blow up
> > quite spectacularly
> > since all the resources we use in the call get derefed/unmapped below.
> >
> > Could an async algorithm be implemented/used that would break this assumption?
>
> The callback is set to NULL using acomp_request_set_callback(). This implies
> synchronous mode of operation. So the underlying implementation must
> complete the operation synchronously.

This assumption is not correct. An asynchronous implementation, when it
finishes processing a request, will call acomp_request_complete(), which in
turn calls the callback. If the callback is set to NULL, this function will
dereference a NULL pointer.

Regards,
-- 
Giovanni
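The hazard can be modelled in plain C without any crypto API. In the
sketch below every name (model_req, model_wait, model_req_done,
model_backend_finish) is hypothetical and stands in for the request, the
caller's wait state, the completion callback and the backend's
completion path. Unlike a real backend, the model checks the pointer so
that it can report the problem instead of crashing.

```c
#include <stddef.h>

/* Hypothetical request type standing in for an async crypto request. */
typedef void (*model_complete_t)(void *data, int err);

struct model_req {
	model_complete_t complete;	/* invoked by the backend when done */
	void *data;			/* passed back to the callback */
};

/* Caller-side state that the callback fills in. */
struct model_wait {
	int done;
	int err;
};

/* Completion callback: record the result and mark the request finished. */
static void model_req_done(void *data, int err)
{
	struct model_wait *w = data;

	w->err = err;
	w->done = 1;
}

/*
 * What an asynchronous backend does when the work finishes. A real
 * backend would call the pointer unconditionally; the check here only
 * keeps the model from crashing and reports the bug as -1 instead.
 */
static int model_backend_finish(struct model_req *req, int err)
{
	if (req->complete == NULL)
		return -1;
	req->complete(req->data, err);
	return 0;
}
```

The point of the model is that a caller who wants synchronous behaviour
from a possibly asynchronous backend has to install a callback and wait
for `done`, and must not release the request's buffers before then.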
[PATCH 3/3] crypto: ccp - Add 3DES function on v5 CCPs
Wire up support for Triple DES in ECB mode. Signed-off-by: Gary R Hook --- drivers/crypto/ccp/Makefile |1 drivers/crypto/ccp/ccp-crypto-des3.c | 254 ++ drivers/crypto/ccp/ccp-crypto-main.c | 10 + drivers/crypto/ccp/ccp-crypto.h | 22 +++ drivers/crypto/ccp/ccp-dev-v3.c |1 drivers/crypto/ccp/ccp-dev-v5.c | 54 +++ drivers/crypto/ccp/ccp-dev.h | 14 ++ drivers/crypto/ccp/ccp-ops.c | 198 +++ include/linux/ccp.h | 57 +++- 9 files changed, 608 insertions(+), 3 deletions(-) create mode 100644 drivers/crypto/ccp/ccp-crypto-des3.c diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile index fd77225..563594a 100644 --- a/drivers/crypto/ccp/Makefile +++ b/drivers/crypto/ccp/Makefile @@ -14,4 +14,5 @@ ccp-crypto-objs := ccp-crypto-main.o \ ccp-crypto-aes-xts.o \ ccp-crypto-rsa.o \ ccp-crypto-aes-galois.o \ + ccp-crypto-des3.o \ ccp-crypto-sha.o diff --git a/drivers/crypto/ccp/ccp-crypto-des3.c b/drivers/crypto/ccp/ccp-crypto-des3.c new file mode 100644 index 000..5af7347 --- /dev/null +++ b/drivers/crypto/ccp/ccp-crypto-des3.c @@ -0,0 +1,254 @@ +/* + * AMD Cryptographic Coprocessor (CCP) DES3 crypto API support + * + * Copyright (C) 2016 Advanced Micro Devices, Inc. + * + * Author: Gary R Hook + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ccp-crypto.h" + +static int ccp_des3_complete(struct crypto_async_request *async_req, int ret) +{ + struct ablkcipher_request *req = ablkcipher_request_cast(async_req); + struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm); + struct ccp_des3_req_ctx *rctx = ablkcipher_request_ctx(req); + + if (ret) + return ret; + + if (ctx->u.des3.mode != CCP_DES3_MODE_ECB) + memcpy(req->info, rctx->iv, DES3_EDE_BLOCK_SIZE); + + return 0; +} + +static int ccp_des3_setkey(struct crypto_ablkcipher *tfm, const u8 *key, + unsigned int key_len) +{ + struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm)); + struct ccp_crypto_ablkcipher_alg *alg = + ccp_crypto_ablkcipher_alg(crypto_ablkcipher_tfm(tfm)); + u32 *flags = &tfm->base.crt_flags; + + + /* From des_generic.c: +* +* RFC2451: +* If the first two or last two independent 64-bit keys are +* equal (k1 == k2 or k2 == k3), then the DES3 operation is simply the +* same as DES. Implementers MUST reject keys that exhibit this +* property. +*/ + const u32 *K = (const u32 *)key; + + if (unlikely(!((K[0] ^ K[2]) | (K[1] ^ K[3])) || +!((K[2] ^ K[4]) | (K[3] ^ K[5]))) && +(*flags & CRYPTO_TFM_REQ_WEAK_KEY)) { + *flags |= CRYPTO_TFM_RES_WEAK_KEY; + return -EINVAL; + } + + /* It's not clear that there is any support for a keysize of 112. 
+* If needed, the caller should make K1 == K3 +*/ + ctx->u.des3.type = CCP_DES3_TYPE_168; + ctx->u.des3.mode = alg->mode; + ctx->u.des3.key_len = key_len; + + memcpy(ctx->u.des3.key, key, key_len); + sg_init_one(&ctx->u.des3.key_sg, ctx->u.des3.key, key_len); + + return 0; +} + +static int ccp_des3_crypt(struct ablkcipher_request *req, bool encrypt) +{ + struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm); + struct ccp_des3_req_ctx *rctx = ablkcipher_request_ctx(req); + struct scatterlist *iv_sg = NULL; + unsigned int iv_len = 0; + int ret; + + if (!ctx->u.des3.key_len) + return -EINVAL; + + if (((ctx->u.des3.mode == CCP_DES3_MODE_ECB) || +(ctx->u.des3.mode == CCP_DES3_MODE_CBC)) && + (req->nbytes & (DES3_EDE_BLOCK_SIZE - 1))) + return -EINVAL; + + if (ctx->u.des3.mode != CCP_DES3_MODE_ECB) { + if (!req->info) + return -EINVAL; + + memcpy(rctx->iv, req->info, DES3_EDE_BLOCK_SIZE); + iv_sg = &rctx->iv_sg; + iv_len = DES3_EDE_BLOCK_SIZE; + sg_init_one(iv_sg, rctx->iv, iv_len); + } + + memset(&rctx->cmd, 0, sizeof(rctx->cmd)); + INIT_LIST_HEAD(&rctx->cmd.entry); + rctx->cmd.engine = CCP_ENGINE_DES3; + rctx->cmd.u.des3.type = ctx->u.des3.type; + rctx->cmd.u.des3.mode = ctx->u.des3.mode; + rctx->cmd.u.des3.action = (encrypt) + ? CCP_DES3_ACTION_ENCRYPT + : CCP_DES3_ACTION_DECRYPT; + rctx->cmd.u.d
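The key check quoted from des_generic.c compares the three 64-bit
subkeys as pairs of 32-bit words. The same test can be written with
memcmp(), which some readers may find easier to follow. This is a
userspace sketch only; des3_key_is_degenerate() is a name invented here.
Note that the patch rejects such a key only when
CRYPTO_TFM_REQ_WEAK_KEY is also set in the tfm flags, which this sketch
does not model.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_DES_KEY_SIZE	8	/* one single-DES subkey, in bytes */

/*
 * Return true when K1 == K2 or K2 == K3, i.e. when the 3DES operation
 * collapses to single DES. K1 == K3 (two-key 3DES) is not flagged.
 */
static bool des3_key_is_degenerate(const uint8_t key[3 * MODEL_DES_KEY_SIZE])
{
	const uint8_t *k1 = key;
	const uint8_t *k2 = key + MODEL_DES_KEY_SIZE;
	const uint8_t *k3 = key + 2 * MODEL_DES_KEY_SIZE;

	return memcmp(k1, k2, MODEL_DES_KEY_SIZE) == 0 ||
	       memcmp(k2, k3, MODEL_DES_KEY_SIZE) == 0;
}
```

Leaving K1 == K3 alone matches the comment in the patch, which suggests
that callers who need a 112-bit key should supply it that way.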
[PATCH 2/3] crypto: ccp - Add support for AES GCM on v5 CCPs
A version 5 device provides the primitive commands required for AES GCM. This patch adds support for en/decryption. Signed-off-by: Gary R Hook --- drivers/crypto/ccp/Makefile|2 drivers/crypto/ccp/ccp-crypto-aes-galois.c | 257 drivers/crypto/ccp/ccp-crypto-main.c | 20 ++ drivers/crypto/ccp/ccp-crypto.h| 14 ++ drivers/crypto/ccp/ccp-ops.c | 252 +++ include/linux/ccp.h|9 + 6 files changed, 554 insertions(+) create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-galois.c diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile index 346ceb8..fd77225 100644 --- a/drivers/crypto/ccp/Makefile +++ b/drivers/crypto/ccp/Makefile @@ -12,4 +12,6 @@ ccp-crypto-objs := ccp-crypto-main.o \ ccp-crypto-aes.o \ ccp-crypto-aes-cmac.o \ ccp-crypto-aes-xts.o \ + ccp-crypto-rsa.o \ + ccp-crypto-aes-galois.o \ ccp-crypto-sha.o diff --git a/drivers/crypto/ccp/ccp-crypto-aes-galois.c b/drivers/crypto/ccp/ccp-crypto-aes-galois.c new file mode 100644 index 000..8bc18c9 --- /dev/null +++ b/drivers/crypto/ccp/ccp-crypto-aes-galois.c @@ -0,0 +1,257 @@ +/* + * AMD Cryptographic Coprocessor (CCP) AES GCM crypto API support + * + * Copyright (C) 2016 Advanced Micro Devices, Inc. + * + * Author: Gary R Hook + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ccp-crypto.h" + +#defineAES_GCM_IVSIZE 12 + +static int ccp_aes_gcm_complete(struct crypto_async_request *async_req, int ret) +{ + return ret; +} + +static int ccp_aes_gcm_setkey(struct crypto_aead *tfm, const u8 *key, + unsigned int key_len) +{ + struct ccp_ctx *ctx = crypto_aead_ctx(tfm); + + switch (key_len) { + case AES_KEYSIZE_128: + ctx->u.aes.type = CCP_AES_TYPE_128; + break; + case AES_KEYSIZE_192: + ctx->u.aes.type = CCP_AES_TYPE_192; + break; + case AES_KEYSIZE_256: + ctx->u.aes.type = CCP_AES_TYPE_256; + break; + default: + crypto_aead_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); + return -EINVAL; + } + + ctx->u.aes.mode = CCP_AES_MODE_GCM; + ctx->u.aes.key_len = key_len; + + memcpy(ctx->u.aes.key, key, key_len); + sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len); + + return 0; +} + +static int ccp_aes_gcm_setauthsize(struct crypto_aead *tfm, + unsigned int authsize) +{ + return 0; +} + +static int ccp_aes_gcm_crypt(struct aead_request *req, bool encrypt) +{ + struct crypto_aead *tfm = crypto_aead_reqtfm(req); + struct ccp_ctx *ctx = crypto_aead_ctx(tfm); + struct ccp_aes_req_ctx *rctx = aead_request_ctx(req); + struct scatterlist *iv_sg = NULL; + unsigned int iv_len = 0; + int i; + int ret = 0; + + if (!ctx->u.aes.key_len) + return -EINVAL; + + if (ctx->u.aes.mode != CCP_AES_MODE_GCM) + return -EINVAL; + + if (!req->iv) + return -EINVAL; + + /* +* 5 parts: +* plaintext/ciphertext input +* AAD +* key +* IV +* Destination+tag buffer +*/ + + /* According to the way AES GCM has been implemented here, +* per RFC 4106 it seems, the provided IV is fixed at 12 bytes, +* occupies the beginning of the IV array. Write a 32-bit +* integer after that (bytes 13-16) with a value of "1". 
+*/ + memcpy(rctx->iv, req->iv, AES_GCM_IVSIZE); + for (i = 0; i < 3; i++) + rctx->iv[i + AES_GCM_IVSIZE] = 0; + rctx->iv[AES_BLOCK_SIZE - 1] = 1; + + /* Set up a scatterlist for the IV */ + iv_sg = &rctx->iv_sg; + iv_len = AES_BLOCK_SIZE; + sg_init_one(iv_sg, rctx->iv, iv_len); + + /* The AAD + plaintext are concatenated in the src buffer */ + memset(&rctx->cmd, 0, sizeof(rctx->cmd)); + INIT_LIST_HEAD(&rctx->cmd.entry); + rctx->cmd.engine = CCP_ENGINE_AES; + rctx->cmd.u.aes.type = ctx->u.aes.type; + rctx->cmd.u.aes.mode = ctx->u.aes.mode; + rctx->cmd.u.aes.action = + (encrypt) ? CCP_AES_ACTION_ENCRYPT : CCP_AES_ACTION_DECRYPT; + rctx->cmd.u.aes.key = &ctx->u.aes.key_sg; + rctx->cmd.u.aes.key_len = ctx->u.aes.key_len; + rctx->cmd.u.aes.iv = iv_sg; + rctx->cmd.u.aes.iv_len = iv_len; + rctx->cmd.u.aes.src = req->src; + rctx->cmd.u.aes.
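The IV handling in ccp_aes_gcm_crypt() builds the initial counter block
used by GCM with a 96-bit IV: the 12 IV bytes, followed by a 32-bit
big-endian counter with the value 1. A standalone sketch of that layout
follows; gcm_build_j0() and the MODEL_* constants are names chosen for
this note and do not appear in the patch.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_GCM_IV_SIZE	12	/* 96-bit IV, as AES_GCM_IVSIZE */
#define MODEL_AES_BLOCK_SIZE	16

/* Build IV || 0x00000001 into the 16-byte block j0. */
static void gcm_build_j0(uint8_t j0[MODEL_AES_BLOCK_SIZE],
			 const uint8_t iv[MODEL_GCM_IV_SIZE])
{
	memcpy(j0, iv, MODEL_GCM_IV_SIZE);
	/* Upper three bytes of the big-endian counter are zero. */
	memset(j0 + MODEL_GCM_IV_SIZE, 0,
	       MODEL_AES_BLOCK_SIZE - MODEL_GCM_IV_SIZE - 1);
	/* Lowest counter byte carries the initial value 1. */
	j0[MODEL_AES_BLOCK_SIZE - 1] = 1;
}
```

This produces the same 16 bytes as the memcpy(), the three-iteration
zeroing loop and the final store in the patch.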
[PATCH 0/3] Support new function in the newer CCP
The following series implements new functions in a version 5
coprocessor. New features are:

 - Support for SHA-2 384-bit and 512-bit hashing
 - Support for AES GCM encryption
 - Support for 3DES encryption

---

Gary R Hook (3):
      crypto: ccp - Add SHA-2 384-/512-bit support
      crypto: ccp - Add support for AES GCM on v5 CCPs
      crypto: ccp - Add 3DES function on v5 CCPs

 drivers/crypto/ccp/Makefile                |    3
 drivers/crypto/ccp/ccp-crypto-aes-galois.c |  257 ++
 drivers/crypto/ccp/ccp-crypto-des3.c       |  254 ++
 drivers/crypto/ccp/ccp-crypto-main.c       |   30 ++
 drivers/crypto/ccp/ccp-crypto-sha.c        |   22 +
 drivers/crypto/ccp/ccp-crypto.h            |   44 ++
 drivers/crypto/ccp/ccp-dev-v3.c            |    1
 drivers/crypto/ccp/ccp-dev-v5.c            |   54 +++
 drivers/crypto/ccp/ccp-dev.h               |   14 +
 drivers/crypto/ccp/ccp-ops.c               |  522
 include/linux/ccp.h                        |   68
 11 files changed, 1263 insertions(+), 6 deletions(-)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-galois.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-des3.c

-- 
I'm pretty sure donuts would help.
[PATCH 1/3] crypto: ccp - Add SHA-2 384-/512-bit support
Incorporate 384-bit and 512-bit hashing for a version 5 CCP device Signed-off-by: Gary R Hook --- drivers/crypto/ccp/ccp-crypto-sha.c | 22 +++ drivers/crypto/ccp/ccp-crypto.h |8 ++-- drivers/crypto/ccp/ccp-ops.c| 72 +++ include/linux/ccp.h |2 + 4 files changed, 101 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/ccp/ccp-crypto-sha.c b/drivers/crypto/ccp/ccp-crypto-sha.c index 84a652b..6b46eea 100644 --- a/drivers/crypto/ccp/ccp-crypto-sha.c +++ b/drivers/crypto/ccp/ccp-crypto-sha.c @@ -146,6 +146,12 @@ static int ccp_do_sha_update(struct ahash_request *req, unsigned int nbytes, case CCP_SHA_TYPE_256: rctx->cmd.u.sha.ctx_len = SHA256_DIGEST_SIZE; break; + case CCP_SHA_TYPE_384: + rctx->cmd.u.sha.ctx_len = SHA384_DIGEST_SIZE; + break; + case CCP_SHA_TYPE_512: + rctx->cmd.u.sha.ctx_len = SHA512_DIGEST_SIZE; + break; default: /* Should never get here */ break; @@ -393,6 +399,22 @@ struct ccp_sha_def { .digest_size= SHA256_DIGEST_SIZE, .block_size = SHA256_BLOCK_SIZE, }, + { + .version= CCP_VERSION(5, 0), + .name = "sha384", + .drv_name = "sha384-ccp", + .type = CCP_SHA_TYPE_384, + .digest_size= SHA384_DIGEST_SIZE, + .block_size = SHA384_BLOCK_SIZE, + }, + { + .version= CCP_VERSION(5, 0), + .name = "sha512", + .drv_name = "sha512-ccp", + .type = CCP_SHA_TYPE_512, + .digest_size= SHA512_DIGEST_SIZE, + .block_size = SHA512_BLOCK_SIZE, + }, }; static int ccp_register_hmac_alg(struct list_head *head, diff --git a/drivers/crypto/ccp/ccp-crypto.h b/drivers/crypto/ccp/ccp-crypto.h index 8335b32..95cce27 100644 --- a/drivers/crypto/ccp/ccp-crypto.h +++ b/drivers/crypto/ccp/ccp-crypto.h @@ -137,9 +137,11 @@ struct ccp_aes_cmac_exp_ctx { u8 buf[AES_BLOCK_SIZE]; }; -/* SHA related defines */ -#define MAX_SHA_CONTEXT_SIZE SHA256_DIGEST_SIZE -#define MAX_SHA_BLOCK_SIZE SHA256_BLOCK_SIZE +/* SHA-related defines + * These values must be large enough to accommodate any variant + */ +#define MAX_SHA_CONTEXT_SIZE SHA512_DIGEST_SIZE +#define MAX_SHA_BLOCK_SIZE SHA512_BLOCK_SIZE 
struct ccp_sha_ctx { struct scatterlist opad_sg; diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c index f1396c3..0d82080 100644 --- a/drivers/crypto/ccp/ccp-ops.c +++ b/drivers/crypto/ccp/ccp-ops.c @@ -41,6 +41,20 @@ cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7), }; +static const __be64 ccp_sha384_init[SHA512_DIGEST_SIZE / sizeof(__be64)] = { + cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1), + cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3), + cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5), + cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7), +}; + +static const __be64 ccp_sha512_init[SHA512_DIGEST_SIZE / sizeof(__be64)] = { + cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1), + cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3), + cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5), + cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7), +}; + #defineCCP_NEW_JOBID(ccp) ((ccp->vdata->version == CCP_VERSION(3, 0)) ? \ ccp_gen_jobid(ccp) : 0) @@ -955,6 +969,18 @@ static int ccp_run_sha_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) return -EINVAL; block_size = SHA256_BLOCK_SIZE; break; + case CCP_SHA_TYPE_384: + if (cmd_q->ccp->vdata->version < CCP_VERSION(4, 0) + || sha->ctx_len < SHA384_DIGEST_SIZE) + return -EINVAL; + block_size = SHA384_BLOCK_SIZE; + break; + case CCP_SHA_TYPE_512: + if (cmd_q->ccp->vdata->version < CCP_VERSION(4, 0) + || sha->ctx_len < SHA512_DIGEST_SIZE) + return -EINVAL; + block_size = SHA512_BLOCK_SIZE; + break; default: return -EINVAL; } @@ -1042,6 +1068,21 @@ static int ccp_run_sha_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) sb_count = 1; ooffset = ioffset = 0; break; + case CCP_SHA_TYPE_384: + digest_size = SHA384_DIGEST_SIZE; + init = (void *) ccp_sha384_init; + ctx_size = SHA512_DIGEST_SIZE; + sb_count = 2; + ioffset = 0; + ooffset = 2 * CCP_SB_BYTES - SHA384_DIGEST_SIZE; + brea
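The SHA-384 case above sizes its context like SHA-512 (ctx_size is
SHA512_DIGEST_SIZE and sb_count is 2) and computes ooffset as
2 * CCP_SB_BYTES - SHA384_DIGEST_SIZE. The arithmetic can be checked
with the small sketch below. The storage-block size is passed in as a
parameter because CCP_SB_BYTES is not defined in the quoted hunks; the
value 32 used in the usage note is an assumption. sha_ctx_ooffset() and
sha_state_words() are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_SHA384_DIGEST_SIZE	48
#define MODEL_SHA512_DIGEST_SIZE	64

/* Mirror of the expression sb_count * CCP_SB_BYTES - digest_size. */
static size_t sha_ctx_ooffset(size_t sb_count, size_t sb_bytes,
			      size_t digest_size)
{
	return sb_count * sb_bytes - digest_size;
}

/* Number of 64-bit state words in the init tables of the patch. */
static size_t sha_state_words(void)
{
	return MODEL_SHA512_DIGEST_SIZE / sizeof(uint64_t);
}
```

Assuming 32-byte storage blocks, sha_ctx_ooffset(2, 32, 48) gives 16.
sha_state_words() gives 8, which is why ccp_sha384_init is declared with
SHA512_DIGEST_SIZE / sizeof(__be64) entries even though the SHA-384
digest itself is only 48 bytes.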
[PATCH v8 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
Update the unlz4 wrapper to work with the updated LZ4 kernel module version.

Signed-off-by: Sven Schmidt <4ssch...@informatik.uni-hamburg.de>
---
 lib/decompress_unlz4.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
index 036fc88..1b0baf3 100644
--- a/lib/decompress_unlz4.c
+++ b/lib/decompress_unlz4.c
@@ -72,7 +72,7 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 		error("NULL input pointer and missing fill function");
 		goto exit_1;
 	} else {
-		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		inp = large_malloc(LZ4_compressBound(uncomp_chunksize));
 		if (!inp) {
 			error("Could not allocate input buffer");
 			goto exit_1;
@@ -136,7 +136,7 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 			inp += 4;
 			size -= 4;
 		} else {
-			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+			if (chunksize > LZ4_compressBound(uncomp_chunksize)) {
 				error("chunk length is longer than allocated");
 				goto exit_2;
 			}
@@ -152,11 +152,14 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 			out_len -= dest_len;
 		} else
 			dest_len = out_len;
-		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+
+		ret = LZ4_decompress_fast(inp, outp, dest_len);
+		chunksize = ret;
 #else
 		dest_len = uncomp_chunksize;
-		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
-				&dest_len);
+
+		ret = LZ4_decompress_safe(inp, outp, chunksize, dest_len);
+		dest_len = ret;
 #endif
 		if (ret < 0) {
 			error("Decoding failed");
-- 
2.1.4
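The last hunk reflects a change in calling convention: the old helpers
returned 0 on success and passed the size back through a pointer, while
the new functions return the size directly (or a negative value on
error), which is why the wrapper now assigns ret to chunksize or
dest_len. The sketch below models that mapping without linking against
LZ4. model_copy_decode() is a stand-in that merely copies bytes, and
every name here is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* New-style decoder signature: returns bytes written, or < 0 on error. */
typedef int (*model_decode_fn)(const char *src, char *dst,
			       int src_len, int dst_capacity);

/* Stand-in "decoder" that just copies its input; this is NOT real LZ4. */
static int model_copy_decode(const char *src, char *dst,
			     int src_len, int dst_capacity)
{
	if (src_len < 0 || src_len > dst_capacity)
		return -1;
	memcpy(dst, src, (size_t)src_len);
	return src_len;
}

/*
 * Old-style adapter: returns 0 on success with the decoded size written
 * through *dst_len, or a negative value on failure (leaving *dst_len
 * untouched).
 */
static int model_old_style_decode(model_decode_fn decode,
				  const unsigned char *src, size_t src_len,
				  unsigned char *dst, size_t *dst_len)
{
	int n = decode((const char *)src, (char *)dst,
		       (int)src_len, (int)*dst_len);

	if (n < 0)
		return -1;
	*dst_len = (size_t)n;
	return 0;
}
```

Callers converted to the new convention drop the adapter and read the
size from the return value, as the patched unlz4() does.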
[PATCH v8 5/5] lib/lz4: Remove back-compat wrappers
Remove the functions introduced as wrappers for providing backwards compatibility to the prior LZ4 version. They're not needed anymore since there's no callers left. Signed-off-by: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> --- include/linux/lz4.h | 69 lib/lz4/lz4_compress.c | 22 --- lib/lz4/lz4_decompress.c | 42 - lib/lz4/lz4hc_compress.c | 23 4 files changed, 156 deletions(-) diff --git a/include/linux/lz4.h b/include/linux/lz4.h index 1b0f8ca..394e3d9 100644 --- a/include/linux/lz4.h +++ b/include/linux/lz4.h @@ -173,18 +173,6 @@ static inline int LZ4_compressBound(size_t isize) } /** - * lz4_compressbound() - For backwards compatibility; see LZ4_compressBound - * @isize: Size of the input data - * - * Return: Max. size LZ4 may output in a "worst case" szenario - * (data not compressible) - */ -static inline int lz4_compressbound(size_t isize) -{ - return LZ4_COMPRESSBOUND(isize); -} - -/** * LZ4_compress_default() - Compress data from source to dest * @source: source address of the original data * @dest: output buffer address of the compressed data @@ -257,20 +245,6 @@ int LZ4_compress_fast(const char *source, char *dest, int inputSize, int LZ4_compress_destSize(const char *source, char *dest, int *sourceSizePtr, int targetDestSize, void *wrkmem); -/* - * lz4_compress() - For backward compatibility, see LZ4_compress_default - * @src: source address of the original data - * @src_len: size of the original data - * @dst: output buffer address of the compressed data. This requires 'dst' - * of size LZ4_COMPRESSBOUND - * @dst_len: is the output size, which is returned after compress done - * @workmem: address of the working memory. 
- * - * Return: Success if return 0, Error if return < 0 - */ -int lz4_compress(const unsigned char *src, size_t src_len, unsigned char *dst, - size_t *dst_len, void *wrkmem); - /*- * Decompression Functions **/ @@ -346,34 +320,6 @@ int LZ4_decompress_safe(const char *source, char *dest, int compressedSize, int LZ4_decompress_safe_partial(const char *source, char *dest, int compressedSize, int targetOutputSize, int maxDecompressedSize); -/* - * lz4_decompress_unknownoutputsize() - For backwards compatibility, - * see LZ4_decompress_safe - * @src: source address of the compressed data - * @src_len: is the input size, therefore the compressed size - * @dest: output buffer address of the decompressed data - * which must be already allocated - * @dest_len: is the max size of the destination buffer, which is - * returned with actual size of decompressed data after decompress done - * - * Return: Success if return 0, Error if return (< 0) - */ -int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len, - unsigned char *dest, size_t *dest_len); - -/** - * lz4_decompress() - For backwards cocmpatibility, see LZ4_decompress_fast - * @src: source address of the compressed data - * @src_len: is the input size, which is returned after decompress done - * @dest: output buffer address of the decompressed data, - * which must be already allocated - * @actual_dest_len: is the size of uncompressed data, supposing it's known - * - * Return: Success if return 0, Error if return (< 0) - */ -int lz4_decompress(const unsigned char *src, size_t *src_len, - unsigned char *dest, size_t actual_dest_len); - /*- * LZ4 HC Compression **/ @@ -401,21 +347,6 @@ int LZ4_compress_HC(const char *src, char *dst, int srcSize, int dstCapacity, int compressionLevel, void *wrkmem); /** - * lz4hc_compress() - For backwards compatibility, see LZ4_compress_HC - * @src: source address of the original data - * @src_len: size of the original data - * @dst: output buffer address of the 
compressed data. This requires 'dst' - * of size LZ4_COMPRESSBOUND. - * @dst_len: is the output size, which is returned after compress done - * @wrkmem: address of the working memory. - * This requires 'workmem' of size LZ4HC_MEM_COMPRESS. - * - * Return : Success if return 0, Error if return (< 0) - */ -int lz4hc_compress(const unsigned char *src, size_t src_len, unsigned char *dst, - size_t *dst_len, void *wrkmem); - -/** * LZ4_resetStreamHC() - Init an allocated 'LZ4_streamHC_t' structure * @streamHCPtr: pointer to the 'LZ4_streamHC_t' structure * @compressionLevel: Recommended values are between 4 and 9, although any diff --git a/lib/lz4/lz4_compress.c b/lib/lz4/lz4_compress.c index 53f313f..cc7b6d4 100644 --- a/lib/lz4/lz4_compress.c +++ b/lib/lz4/lz4_compress.c @@ -936,27 +93
[PATCH v8 1/5] lib: Update LZ4 compressor module
Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet. The kernel module is inspired by the previous work by Chanho Min. The updated LZ4 module will not break existing code since the patchset contains the appropriate changes.

API changes:

New method LZ4_compress_fast, which differs from the variant available in the kernel by the new acceleration parameter, allowing compression ratio to be traded for compression speed and vice versa.

LZ4_decompress_fast is the respective decompression method, featuring a very fast decoder (multiple GB/s per core), able to reach RAM speed in multi-core systems. The decompressor can decompress data compressed with LZ4 fast as well as with the LZ4 HC (high compression) algorithm.

The functions LZ4_decompress_safe_partial and LZ4_compress_destSize were also added. The former decompresses partial blocks of data, while the latter reverses the usual logic by trying to compress as much data as possible from source into a destination of fixed size.

A set of streaming functions was also added, which allows compressing/decompressing data in multiple steps (so-called "streaming mode").

The methods lz4_compress and lz4_decompress_unknownoutputsize are now known as LZ4_compress_default and LZ4_decompress_safe, respectively. The old methods will be removed since there are no callers left in the code.
Signed-off-by: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> --- include/linux/lz4.h | 762 +++--- lib/lz4/Makefile |2 + lib/lz4/lz4_compress.c | 1161 +- lib/lz4/lz4_decompress.c | 705 ++-- lib/lz4/lz4defs.h| 338 -- lib/lz4/lz4hc_compress.c | 867 ++ 6 files changed, 2758 insertions(+), 1077 deletions(-) diff --git a/include/linux/lz4.h b/include/linux/lz4.h index 6b784c5..1b0f8ca 100644 --- a/include/linux/lz4.h +++ b/include/linux/lz4.h @@ -1,87 +1,717 @@ -#ifndef __LZ4_H__ -#define __LZ4_H__ -/* - * LZ4 Kernel Interface +/* LZ4 Kernel Interface * * Copyright (C) 2013, LG Electronics, Kyungsik Lee + * Copyright (C) 2016, Sven Schmidt <4ssch...@informatik.uni-hamburg.de> * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. + * + * This file is based on the original header file + * for LZ4 - Fast LZ compression algorithm. + * + * LZ4 - Fast LZ compression algorithm + * Copyright (C) 2011-2016, Yann Collet. + * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php) + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following disclaimer + * in the documentation and/or other materials provided with the + * distribution. + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * You can contact the author at : + * - LZ4 homepage : http://www.lz4.org + * - LZ4 source repository : https://github.com/lz4/lz4 */ -#define LZ4_MEM_COMPRESS (16384) -#define LZ4HC_MEM_COMPRESS (262144 + (2 * sizeof(unsigned char *))) +#ifndef __LZ4_H__ +#define __LZ4_H__ + +#include +#include/* memset, memcpy */ + +/*- + * CONSTANTS + **/ /* - * lz4_compressbound() - * Provides the maximum size that LZ4 may output in a "worst case" scenario - * (input data not compressible) + * LZ4_MEMORY_USAGE : + * Memory usage formula : N->2^N Bytes + * (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.) + * Increasing memory usage improves compression ratio + * Reduced memory usage can improve speed, due to cache effect + * Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache */ -static inline size_t lz4_compressbo
[PATCH v8 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
Update the crypto modules using LZ4 compression as well as the test cases in testmgr.h to work with the new LZ4 module version. Signed-off-by: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> --- crypto/lz4.c | 23 - crypto/lz4hc.c | 23 - crypto/testmgr.h | 142 +++ 3 files changed, 120 insertions(+), 68 deletions(-) diff --git a/crypto/lz4.c b/crypto/lz4.c index 99c1b2c..71eff9b 100644 --- a/crypto/lz4.c +++ b/crypto/lz4.c @@ -66,15 +66,13 @@ static void lz4_exit(struct crypto_tfm *tfm) static int __lz4_compress_crypto(const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen, void *ctx) { - size_t tmp_len = *dlen; - int err; + int out_len = LZ4_compress_default(src, dst, + slen, *dlen, ctx); - err = lz4_compress(src, slen, dst, &tmp_len, ctx); - - if (err < 0) + if (!out_len) return -EINVAL; - *dlen = tmp_len; + *dlen = out_len; return 0; } @@ -96,16 +94,13 @@ static int lz4_compress_crypto(struct crypto_tfm *tfm, const u8 *src, static int __lz4_decompress_crypto(const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen, void *ctx) { - int err; - size_t tmp_len = *dlen; - size_t __slen = slen; + int out_len = LZ4_decompress_safe(src, dst, slen, *dlen); - err = lz4_decompress_unknownoutputsize(src, __slen, dst, &tmp_len); - if (err < 0) - return -EINVAL; + if (out_len < 0) + return out_len; - *dlen = tmp_len; - return err; + *dlen = out_len; + return 0; } static int lz4_sdecompress(struct crypto_scomp *tfm, const u8 *src, diff --git a/crypto/lz4hc.c b/crypto/lz4hc.c index 75ffc4a..03a34a8 100644 --- a/crypto/lz4hc.c +++ b/crypto/lz4hc.c @@ -65,15 +65,13 @@ static void lz4hc_exit(struct crypto_tfm *tfm) static int __lz4hc_compress_crypto(const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen, void *ctx) { - size_t tmp_len = *dlen; - int err; + int out_len = LZ4_compress_HC(src, dst, slen, + *dlen, LZ4HC_DEFAULT_CLEVEL, ctx); - err = lz4hc_compress(src, slen, dst, &tmp_len, ctx); - - if (err < 0) + if (!out_len) return -EINVAL; - *dlen = tmp_len; + 
*dlen = out_len; return 0; } @@ -97,16 +95,13 @@ static int lz4hc_compress_crypto(struct crypto_tfm *tfm, const u8 *src, static int __lz4hc_decompress_crypto(const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen, void *ctx) { - int err; - size_t tmp_len = *dlen; - size_t __slen = slen; + int out_len = LZ4_decompress_safe(src, dst, slen, *dlen); - err = lz4_decompress_unknownoutputsize(src, __slen, dst, &tmp_len); - if (err < 0) - return -EINVAL; + if (out_len < 0) + return out_len; - *dlen = tmp_len; - return err; + *dlen = out_len; + return 0; } static int lz4hc_sdecompress(struct crypto_scomp *tfm, const u8 *src, diff --git a/crypto/testmgr.h b/crypto/testmgr.h index 9b656be..98d4be0 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -34498,31 +34498,62 @@ static struct hash_testvec bfin_crc_tv_template[] = { static struct comp_testvec lz4_comp_tv_template[] = { { - .inlen = 70, - .outlen = 45, - .input = "Join us now and share the software " - "Join us now and share the software ", - .output = "\xf0\x10\x4a\x6f\x69\x6e\x20\x75" - "\x73\x20\x6e\x6f\x77\x20\x61\x6e" - "\x64\x20\x73\x68\x61\x72\x65\x20" - "\x74\x68\x65\x20\x73\x6f\x66\x74" - "\x77\x0d\x00\x0f\x23\x00\x0b\x50" - "\x77\x61\x72\x65\x20", + .inlen = 255, + .outlen = 218, + .input = "LZ4 is lossless compression algorithm, providing" +" compression speed at 400 MB/s per core, scalable " +"with multi-cores CPU. It features an extremely fast " +"decoder, with speed in multiple GB/s per core, " +"typically reaching RAM speed limits on multi-core " +"systems.", + .output = "\xf9\x21\x4c\x5a\x34\x20\x69\x73\x20\x6c\x6f\x73\x73" + "\x6c\x65\x73\x73\x20\x63\x6f\x6d\x70\x72\x65\x73\x73" + "\x69\x6f\x6e\x20\x61\x6c\x67\x6f\x72\x69\x74\x68\x6d" + "\x2c\x20\x70\x72\x6f\x76\x69\x64\x69\x6e\x67\x21\x00" + "\xf0\x21\x73\x70\x65\x65\x64\x20\x61\x74\x20\x34\x30" + "\x30\x20\x4d\x4
[PATCH v8 0/5] Update LZ4 compressor module
This patchset updates the LZ4 compression module to a version based on LZ4 v1.7.3, which makes it possible to use the fast compression algorithm (LZ4 fast). LZ4 fast provides an "acceleration" parameter as a tradeoff between high compression ratio and high compression speed.

We want to use LZ4 fast in order to support compression in lustre and (mostly, based on that) investigate data reduction techniques on behalf of storage systems.

It will also be useful for other users of LZ4 compression, since LZ4 fast lets applications choose fast and/or high compression depending on the use case. For instance, ZRAM offers an LZ4 backend and could benefit from an updated LZ4 in the kernel.

LZ4 homepage: http://www.lz4.org/
LZ4 source repository: https://github.com/lz4/lz4
Source version: 1.7.3

Benchmark (taken from [1], Core i5-4300U @1.9GHz):
-------------|-------------|---------------|------
Compressor   | Compression | Decompression | Ratio
-------------|-------------|---------------|------
memcpy       |  4200 MB/s  |  4200 MB/s    | 1.000
LZ4 fast 50  |  1080 MB/s  |  2650 MB/s    | 1.375
LZ4 fast 17  |   680 MB/s  |  2220 MB/s    | 1.607
LZ4 fast 5   |   475 MB/s  |  1920 MB/s    | 1.886
LZ4 default  |   385 MB/s  |  1850 MB/s    | 2.101

[1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

[PATCH 1/5] lib: Update LZ4 compressor module
[PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
[PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
[PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
[PATCH 5/5] lib/lz4: Remove back-compat wrappers

Changes:

v8:
- Rewrote the architecture-dependent definitions in lz4defs.h, such as LZ4_read*, LZ4_write* and LZ4_NbCommonBytes, as proposed by Eric Biggers
- Added -O3 compiler flag to Makefile, also suggested by Eric
- Defined FORCE_INLINE, as used in upstream LZ4, as __always_inline and force-inlined most of the small functions in lz4defs.h
- lz4_decompress: Wrapped the EXPORT_SYMBOL and MODULE_* macros in a #ifdef STATIC the way suggested by Andrew Morton, fixing the breakage of CONFIG_KERNEL_LZ4 on x86 as reported by Arnd Bergmann

v7:
- Fixed errors reported by the Smatch tool
- Changed function documentation comments in lz4.h to match kernel-doc style
- Fixed a misbehaviour of LZ4HC caused by the wrong level of indentation of two for loops, introduced when I refactored the code style using checkpatch.pl (upstream LZ4 puts dozens of statements on a single line)
- Updated the crypto tests for LZ4 since they failed with the new code, which in turn made zram fail to allocate memory for LZ4

v6:
- Fixed LZ4_NBCOMMONBYTES() for 64-bit little endian
- Reset LZ4_MEMORY_USAGE to 14 (which is the value used in upstream LZ4 as well as in the previous kernel module)
- Fixed the double indentation in lz4defs.h and lz4.h
- Adjusted general styling issues in lz4defs.h (e.g. lines consisting of more than one instruction)
- Removed the architecture-dependent typedef to reg_t since upstream LZ4 just uses size_t and that works fine
- Changed error messages in pstore/platform.c:
  * LZ4_compress_default always returns 0 in case of an error (no need to print the return value)
  * LZ4_decompress_safe returns a negative error code (the return value _does_ matter)

v5:
- Added a fifth patch to remove the back-compat wrappers introduced to ensure bisectability between the patches (the functions are no longer needed since there are no callers left)

v4:
- Fixed kbuild errors
- Re-added lz4_compressbound as alias for LZ4_compressBound to ensure backwards compatibility
- Wrapped LZ4_hash5 with a check for LZ4_ARCH64 since it is only used there and triggers an unused function warning when false

v3:
- Adjusted the code to satisfy kernel coding style (checkpatch.pl)
- Made sure the changes to LZ4 in the kernel (overflow checks etc.) are included in the new module (they are)
- Removed the second LZ4_compressBound function with related name but different return type
- Corrected version number (was LZ4 1.7.3)
- Added missing LZ4 streaming functions

v2:
- Changed order of the patches since in the initial patchset the lz4.h was in the last patch but was referenced by the other ones
- Split lib/decompress_unlz4.c into its own patch
- Fixed errors reported by the buildbot
- Further refactorings
- Added more appropriate copyright note to include/linux/lz4.h
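The two sizing rules mentioned in the changelog (LZ4_MEMORY_USAGE and the LZ4_compressBound worst case) can be sketched in plain C. This is a minimal userspace sketch, not the kernel header: the bound formula mirrors the LZ4_COMPRESSBOUND macro of upstream LZ4, and it is an assumption that the kernel port keeps it unchanged. The names lz4_working_memory_bytes, lz4_worst_case_size and SKETCH_LZ4_MAX_INPUT_SIZE are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Upstream LZ4 rejects inputs larger than this for a single block. */
#define SKETCH_LZ4_MAX_INPUT_SIZE 0x7E000000u

/*
 * LZ4_MEMORY_USAGE = N stands for 2^N bytes of working memory.
 * N = 14 gives 16384 bytes, matching the old LZ4_MEM_COMPRESS value.
 */
size_t lz4_working_memory_bytes(unsigned int memory_usage)
{
	return (size_t)1 << memory_usage;
}

/*
 * Worst-case output size for incompressible input of isize bytes.
 * Returns 0 when the input is too large to be compressed as one block.
 */
size_t lz4_worst_case_size(size_t isize)
{
	if (isize > SKETCH_LZ4_MAX_INPUT_SIZE)
		return 0;
	return isize + isize / 255 + 16;
}
```

A caller would allocate lz4_worst_case_size(n) bytes for the destination buffer before compressing n bytes, and treat a result of 0 as "input too large".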
[PATCH v8 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
Update fs/pstore and fs/squashfs to use the updated functions from the new LZ4 module. Signed-off-by: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> --- fs/pstore/platform.c | 22 +- fs/squashfs/lz4_wrapper.c | 12 ++-- 2 files changed, 19 insertions(+), 15 deletions(-) diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 729677e..efab7b6 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -342,31 +342,35 @@ static int compress_lz4(const void *in, void *out, size_t inlen, size_t outlen) { int ret; - ret = lz4_compress(in, inlen, out, &outlen, workspace); - if (ret) { - pr_err("lz4_compress error, ret = %d!\n", ret); + ret = LZ4_compress_default(in, out, inlen, outlen, workspace); + if (!ret) { + pr_err("LZ4_compress_default error; compression failed!\n"); return -EIO; } - return outlen; + return ret; } static int decompress_lz4(void *in, void *out, size_t inlen, size_t outlen) { int ret; - ret = lz4_decompress_unknownoutputsize(in, inlen, out, &outlen); - if (ret) { - pr_err("lz4_decompress error, ret = %d!\n", ret); + ret = LZ4_decompress_safe(in, out, inlen, outlen); + if (ret < 0) { + /* +* LZ4_decompress_safe will return an error code +* (< 0) if decompression failed +*/ + pr_err("LZ4_decompress_safe error, ret = %d!\n", ret); return -EIO; } - return outlen; + return ret; } static void allocate_lz4(void) { - big_oops_buf_sz = lz4_compressbound(psinfo->bufsize); + big_oops_buf_sz = LZ4_compressBound(psinfo->bufsize); big_oops_buf = kmalloc(big_oops_buf_sz, GFP_KERNEL); if (big_oops_buf) { workspace = kmalloc(LZ4_MEM_COMPRESS, GFP_KERNEL); diff --git a/fs/squashfs/lz4_wrapper.c b/fs/squashfs/lz4_wrapper.c index ff4468b..95da653 100644 --- a/fs/squashfs/lz4_wrapper.c +++ b/fs/squashfs/lz4_wrapper.c @@ -97,7 +97,6 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm, struct squashfs_lz4 *stream = strm; void *buff = stream->input, *data; int avail, i, bytes = length, res; - size_t dest_len = output->length; for (i = 0; i < 
b; i++) { avail = min(bytes, msblk->devblksize - offset); @@ -108,12 +107,13 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm, put_bh(bh[i]); } - res = lz4_decompress_unknownoutputsize(stream->input, length, - stream->output, &dest_len); - if (res) + res = LZ4_decompress_safe(stream->input, stream->output, + length, output->length); + + if (res < 0) return -EIO; - bytes = dest_len; + bytes = res; data = squashfs_first_page(output); buff = stream->output; while (data) { @@ -128,7 +128,7 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm, } squashfs_finish_page(output); - return dest_len; + return res; } const struct squashfs_decompressor squashfs_lz4_comp_ops = { -- 2.1.4
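The call-site changes in this series follow from one difference in return conventions: the old wrappers returned 0 on success and reported the output length through a pointer, while LZ4_compress_default returns the number of bytes written (0 on failure) and LZ4_decompress_safe returns the decompressed size (negative on failure). Below is a minimal sketch of how a caller can map these results to an errno-style value. The helpers map_compress_result and map_decompress_result are hypothetical, and the choice of -EIO simply follows the pstore and squashfs hunks.

```c
#include <errno.h>

/*
 * Hypothetical helper: interpret the return value of a
 * LZ4_compress_default()-style call. 0 means compression failed;
 * a positive value is the compressed length. The series itself tests
 * "!ret"; "<= 0" is used here only to be defensive.
 */
int map_compress_result(int ret, unsigned int *out_len)
{
	if (ret <= 0)
		return -EIO;
	*out_len = (unsigned int)ret;
	return 0;
}

/*
 * Hypothetical helper: interpret the return value of a
 * LZ4_decompress_safe()-style call. A negative value is an error;
 * zero or more is the number of decompressed bytes.
 */
int map_decompress_result(int ret, unsigned int *out_len)
{
	if (ret < 0)
		return -EIO;
	*out_len = (unsigned int)ret;
	return 0;
}
```

Note the asymmetry: a return value of 0 is a failure for compression but a valid (empty) result for decompression, which is why the two checks differ.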
BUG: af_alg bind fails for 50 % request from userspace for hash algo
Hi Herbert/Stephen,

When I run 100 applications that each calculate a sha384 digest from userspace, nearly 50 of them fail in the bind() system call with error ENOENT. The crypto_alg_mod_lookup() call in api.c fails in kernel space.

The issue occurs on the first attempt only (it seems to be related to the execution of the crypto tests). If I run the same test again, the issue does not reproduce.

Regards
Harsh Jain
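For reference, the userspace sequence in which such a bind() failure would surface can be sketched as follows. This is not the reporter's test program, only a minimal sketch of an AF_ALG hash request; the function name af_alg_sha384 is hypothetical. Any bind() error, such as the ENOENT from the report, is passed back to the caller as a negative errno value.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_alg.h>

#ifndef AF_ALG
#define AF_ALG 38
#endif

#define SHA384_DIGEST_BYTES 48

/*
 * Compute a SHA-384 digest through the AF_ALG socket interface.
 * Returns 0 on success or a negative errno value on failure.
 */
int af_alg_sha384(const void *data, size_t len,
		  unsigned char digest[SHA384_DIGEST_BYTES])
{
	struct sockaddr_alg sa;
	ssize_t n;
	int tfmfd, opfd, ret = 0;

	memset(&sa, 0, sizeof(sa));
	sa.salg_family = AF_ALG;
	strcpy((char *)sa.salg_type, "hash");
	strcpy((char *)sa.salg_name, "sha384");

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (tfmfd < 0)
		return -errno;

	/* The algorithm lookup happens here; this is the call from the report. */
	if (bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		ret = -errno;
		goto out_tfm;
	}

	opfd = accept(tfmfd, NULL, NULL);
	if (opfd < 0) {
		ret = -errno;
		goto out_tfm;
	}

	/* Feed the message, then read back the digest. */
	n = send(opfd, data, len, 0);
	if (n < 0) {
		ret = -errno;
	} else if ((size_t)n != len) {
		ret = -EIO;
	} else {
		n = read(opfd, digest, SHA384_DIGEST_BYTES);
		if (n < 0)
			ret = -errno;
		else if (n != SHA384_DIGEST_BYTES)
			ret = -EIO;
	}

	close(opfd);
out_tfm:
	close(tfmfd);
	return ret;
}
```

Running many instances of a program built around this function in parallel would exercise the same path as the report; whether that reproduces the failure depends on the kernel in use.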
Re: [RFC PATCH v1 1/1] mm: zswap - Add crypto acomp/scomp framework support
I assume all of these crypto_acomp_[compress|decompress] calls are actually synchronous, not asynchronous as the name suggests. Otherwise, this would blow up quite spectacularly since all the resources we use in the call get derefed/unmapped below. Could an async algorithm be implement/used that would break this assumption? The callback is set to NULL using acomp_request_set_callback(). This implies synchronous mode of operation. So the underlying implementation must complete the operation synchronously. Prasad On Tuesday 14 February 2017 09:50 PM, Seth Jennings wrote: On Tue, Feb 14, 2017 at 9:40 AM, Mahipal Challa wrote: This adds the support for kernel's crypto new acomp/scomp framework to zswap. Signed-off-by: Mahipal Challa Signed-off-by: Vishnu Nair --- mm/zswap.c | 129 +++-- 1 file changed, 99 insertions(+), 30 deletions(-) diff --git a/mm/zswap.c b/mm/zswap.c index 067a0d6..d08631b 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -33,6 +33,8 @@ #include #include #include +#include +#include #include #include @@ -114,7 +116,8 @@ static int zswap_compressor_param_set(const char *, struct zswap_pool { struct zpool *zpool; - struct crypto_comp * __percpu *tfm; + struct crypto_acomp * __percpu *acomp; + struct acomp_req * __percpu *acomp_req; struct kref kref; struct list_head list; struct work_struct work; @@ -379,30 +382,49 @@ static int zswap_dstmem_dead(unsigned int cpu) static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node) { struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node); - struct crypto_comp *tfm; + struct crypto_acomp *acomp; + struct acomp_req *acomp_req; - if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu))) + if (WARN_ON(*per_cpu_ptr(pool->acomp, cpu))) return 0; + if (WARN_ON(*per_cpu_ptr(pool->acomp_req, cpu))) + return 0; + + acomp = crypto_alloc_acomp(pool->tfm_name, 0, 0); + if (IS_ERR_OR_NULL(acomp)) { + pr_err("could not alloc crypto acomp %s : %ld\n", + pool->tfm_name, PTR_ERR(acomp)); + return -ENOMEM; + } + 
*per_cpu_ptr(pool->acomp, cpu) = acomp; - tfm = crypto_alloc_comp(pool->tfm_name, 0, 0); - if (IS_ERR_OR_NULL(tfm)) { - pr_err("could not alloc crypto comp %s : %ld\n", - pool->tfm_name, PTR_ERR(tfm)); + acomp_req = acomp_request_alloc(acomp); + if (IS_ERR_OR_NULL(acomp_req)) { + pr_err("could not alloc crypto acomp %s : %ld\n", + pool->tfm_name, PTR_ERR(acomp)); return -ENOMEM; } - *per_cpu_ptr(pool->tfm, cpu) = tfm; + *per_cpu_ptr(pool->acomp_req, cpu) = acomp_req; + return 0; } static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) { struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node); - struct crypto_comp *tfm; + struct crypto_acomp *acomp; + struct acomp_req *acomp_req; + + acomp_req = *per_cpu_ptr(pool->acomp_req, cpu); + if (!IS_ERR_OR_NULL(acomp_req)) + acomp_request_free(acomp_req); + *per_cpu_ptr(pool->acomp_req, cpu) = NULL; + + acomp = *per_cpu_ptr(pool->acomp, cpu); + if (!IS_ERR_OR_NULL(acomp)) + crypto_free_acomp(acomp); + *per_cpu_ptr(pool->acomp, cpu) = NULL; - tfm = *per_cpu_ptr(pool->tfm, cpu); - if (!IS_ERR_OR_NULL(tfm)) - crypto_free_comp(tfm); - *per_cpu_ptr(pool->tfm, cpu) = NULL; return 0; } @@ -503,8 +525,14 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor) pr_debug("using %s zpool\n", zpool_get_type(pool->zpool)); strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name)); - pool->tfm = alloc_percpu(struct crypto_comp *); - if (!pool->tfm) { + pool->acomp = alloc_percpu(struct crypto_acomp *); + if (!pool->acomp) { + pr_err("percpu alloc failed\n"); + goto error; + } + + pool->acomp_req = alloc_percpu(struct acomp_req *); + if (!pool->acomp_req) { pr_err("percpu alloc failed\n"); goto error; } @@ -526,7 +554,8 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor) return pool; error: - free_percpu(pool->tfm); + free_percpu(pool->acomp_req); + free_percpu(pool->acomp); if (pool->zpool) zpool_destroy_pool(pool->zpool); kfree(pool); @@ -566,7 +595,8 @@ 
static void zswap_pool_destroy(struct zswap_pool *pool) zswap_pool_debug("destroying", pool); cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node); - free_percpu(pool->tfm); +
[PATCH] crypto: cavium/cpt: Fix couple of static checker errors
Fix the following smatch errors cptvf_reqmanager.c:333 do_post_process() warn: variable dereferenced before check 'cptvf' cptvf_main.c:825 cptvf_remove() error: we previously assumed 'cptvf' could be null Reported-by: Dan Carpenter Signed-off-by: George Cherian --- drivers/crypto/cavium/cpt/cptvf_main.c | 4 +++- drivers/crypto/cavium/cpt/cptvf_reqmanager.c | 4 ++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/cavium/cpt/cptvf_main.c b/drivers/crypto/cavium/cpt/cptvf_main.c index aac2966..e50872e 100644 --- a/drivers/crypto/cavium/cpt/cptvf_main.c +++ b/drivers/crypto/cavium/cpt/cptvf_main.c @@ -815,8 +815,10 @@ static void cptvf_remove(struct pci_dev *pdev) { struct cpt_vf *cptvf = pci_get_drvdata(pdev); - if (!cptvf) + if (!cptvf) { dev_err(&pdev->dev, "Invalid CPT-VF device\n"); + return; + } /* Convey DOWN to PF */ if (cptvf_send_vf_down(cptvf)) { diff --git a/drivers/crypto/cavium/cpt/cptvf_reqmanager.c b/drivers/crypto/cavium/cpt/cptvf_reqmanager.c index 7f57f30..169e662 100644 --- a/drivers/crypto/cavium/cpt/cptvf_reqmanager.c +++ b/drivers/crypto/cavium/cpt/cptvf_reqmanager.c @@ -330,8 +330,8 @@ void do_post_process(struct cpt_vf *cptvf, struct cpt_info_buffer *info) { struct pci_dev *pdev = cptvf->pdev; - if (!info || !cptvf) { - dev_err(&pdev->dev, "Input params are incorrect for post processing\n"); + if (!info) { + dev_err(&pdev->dev, "incorrect cpt_info_buffer for post processing\n"); return; } -- 2.1.4
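Both warnings come down to the order of a NULL check relative to the first dereference. The patch resolves the first one by dropping the late check on cptvf; the alternative ordering, where every pointer is validated before it is used, is sketched below with stand-in types. All demo_* names are hypothetical and do not exist in the driver.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver structures; only the shape matters. */
struct demo_pdev { const char *name; };
struct demo_vf { struct demo_pdev *pdev; };
struct demo_info { int status; };

/* Returns 0 when processing ran, -1 when an argument is invalid. */
int demo_post_process(struct demo_vf *vf, struct demo_info *info)
{
	struct demo_pdev *pdev;

	/* Validate every pointer before the first dereference. */
	if (!vf || !info) {
		fprintf(stderr, "demo: invalid arguments for post processing\n");
		return -1;
	}

	pdev = vf->pdev;	/* safe: vf was checked above */
	printf("%s: status %d\n", pdev->name, info->status);
	return 0;
}

/* Returns 0 when the device was removed, -1 when vf is NULL. */
int demo_remove(struct demo_vf *vf)
{
	if (!vf) {
		fprintf(stderr, "demo: invalid device\n");
		return -1;	/* return early, as the cptvf_remove() hunk now does */
	}

	printf("%s: removed\n", vf->pdev->name);
	return 0;
}
```

Logging without returning, as cptvf_remove() did before the fix, leaves the following code free to dereference the pointer that was just found to be NULL.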
Re: crypto/cavium MSI-X fixups
Hi Christoph,

On 02/15/2017 12:48 PM, Christoph Hellwig wrote:
> Hi George,
>
> your commit "crypto: cavium - Add Support for Octeon-tx CPT Engine"
> add a new caller to pci_enable_msix. This API has long been deprecated
> so this series switches it to use pci_alloc_irq_vectors instead.
>
> Can you please test it and make sure it goes in before the end of the
> merge window so that no more users of the old API hit mainline?

Yes, the changes work well.

Acked-by: George Cherian for the series.
Re: [PATCH v3 0/2] crypto: AF_ALG memory management fix
On Monday, 13 February 2017, 11:04:50 CET, Stephan Müller wrote:

Hi Herbert,

as I just saw that you marked my patch with "changes requested" in patchwork, may I ask which changes should be applied?

Ciao
Stephan
[bug report] crypto: cavium - Add the Virtual Function driver for CPT
Hello George Cherian,

This is a semi-automatic email about new static checker warnings.

The patch c694b233295b: "crypto: cavium - Add the Virtual Function driver for CPT" from Feb 7, 2017, leads to the following Smatch complaint:

drivers/crypto/cavium/cpt/cptvf_reqmanager.c:333 do_post_process()
	warn: variable dereferenced before check 'cptvf' (see line 331)

drivers/crypto/cavium/cpt/cptvf_reqmanager.c
   330	{
   331		struct pci_dev *pdev = cptvf->pdev;
		                       ^^^
Dereference.

   332
   333		if (!info || !cptvf) {
		             ^
Check is too late.

   334			dev_err(&pdev->dev, "Input params are incorrect for post processing\n");
   335			return;

regards,
dan carpenter
Re: [PATCH v4 3/4] dmaengine: Add Broadcom SBA RAID driver
On Wed, Feb 15, 2017 at 12:55 PM, Dan Williams wrote: > On Tue, Feb 14, 2017 at 11:03 PM, Anup Patel wrote: >> On Wed, Feb 15, 2017 at 12:13 PM, Dan Williams >> wrote: >>> On Tue, Feb 14, 2017 at 10:25 PM, Anup Patel >>> wrote: On Tue, Feb 14, 2017 at 10:04 PM, Dan Williams wrote: > On Mon, Feb 13, 2017 at 10:51 PM, Anup Patel > wrote: >> The Broadcom stream buffer accelerator (SBA) provides offloading >> capabilities for RAID operations. This SBA offload engine is >> accessible via Broadcom SoC specific ring manager. >> >> This patch adds Broadcom SBA RAID driver which provides one >> DMA device with RAID capabilities using one or more Broadcom >> SoC specific ring manager channels. The SBA RAID driver in its >> current shape implements memcpy, xor, and pq operations. >> >> Signed-off-by: Anup Patel >> Reviewed-by: Ray Jui >> --- >> drivers/dma/Kconfig| 13 + >> drivers/dma/Makefile |1 + >> drivers/dma/bcm-sba-raid.c | 1694 >> >> 3 files changed, 1708 insertions(+) >> create mode 100644 drivers/dma/bcm-sba-raid.c >> >> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig >> index 263495d..bf8fb84 100644 >> --- a/drivers/dma/Kconfig >> +++ b/drivers/dma/Kconfig >> @@ -99,6 +99,19 @@ config AXI_DMAC >> controller is often used in Analog Device's reference designs >> for FPGA >> platforms. >> >> +config BCM_SBA_RAID >> + tristate "Broadcom SBA RAID engine support" >> + depends on (ARM64 && MAILBOX && RAID6_PQ) || COMPILE_TEST >> + select DMA_ENGINE >> + select DMA_ENGINE_RAID >> + select ASYNC_TX_ENABLE_CHANNEL_SWITCH > > I thought you agreed to drop this. Its usage is broken. If ASYNC_TX_ENABLE_CHANNEL_SWITCH is not selected then async_dma_find_channel() will only try to find channel with DMA_ASYNC_TX capability. The DMA_ASYNC_TX capability is set by dma_async_device_register() when all Async Tx capabilities are supported by a DMA devices namely DMA_INTERRUPT, DMA_MEMCPY, DMA_XOR, DMA_XOR_VAL, DMA_PQ, and DMA_PQ_VAL. 
We only support DMA_MEMCPY, DMA_XOR, and DMA_PQ capabilities in BCM-SBA-RAID driver so DMA_ASYNC_TX capability is never set for the DMA device registered by BCM-SBA-RAID driver. Due to above, if ASYNC_TX_ENABLE_CHANNEL_SWITCH is not selected then Async Tx APIs fail to find DMA channel provided by BCM-SBA-RAID hence the option ASYNC_TX_ENABLE_CHANNEL_SWITCH is required for BCM-SBA-RAID. The DMA mappings are violated by channel switching only if we switch form DMA channel A to DMA channel B and both these DMA channels have different underlying "struct device". In most of the cases DMA mappings are not violated because DMA channels having Async Tx capabilities are provided using same underlying "struct device". >>> >>> No, fix the infrastructure. Do not put local hack in your driver for >>> this global problem [1]. >> >> There is no hack in the driver. We need >> ASYNC_TX_ENABLE_CHANNEL_SWITCH >> based on current state of dmaengine framework. >> >> The framework should be fixed as separate patchset. >> >> We have other RAID drivers such as xgene-dma and >> mv_xor_v2 who also require >> ASYNC_TX_ENABLE_CHANNEL_SWITCH due >> to same reason. >> >> Fixing the framework and improving framework is >> a ongoing process. I don't see why that should >> stop this patchset. >> > > Because this driver is turning on a dangerous compile time option and > is not using the functionality. If this silicon IP block appears in > another product in the future paired with another DMA engine then the > assumptions about a safe/single dma-device is violated. > > The realization of how async_tx was breaking DMA mapping api > assumptions came after some of these dma-drivers were added to the > kernel. We should stop making the problem worse. > > I should have submitted a patch like the below at the time we > discovered this problem, but unfortunately it languished when I > stopped maintaining the iop-adma and ioat drivers. 
>
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index 263495d0adbd..6b30eb9ad125 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -35,6 +35,7 @@ comment "DMA Devices"
>
>  #core
>  config ASYNC_TX_ENABLE_CHANNEL_SWITCH
> +       depends on BROKEN
>         bool
>
>  config ARCH_HAS_ASYNC_TX_FIND_CHANNEL

Instead of selecting ASYNC_TX_ENABLE_CHANNEL_SWITCH, we can select
the following in BCM_SBA_RAID config option:
1. ASYNC_TX_DISABLE_XOR_VAL
2. ASYNC_TX_DISABLE_PQ_VAL

This will satisfy the needs of dma_async_device_register() when
ASYNC_TX_ENABLE_CHANNEL_SWITCH is not selected.

Will this be acceptable ??

Regards,
Anup
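The capability argument in this thread can be modelled as a bit-mask check. The sketch below is a toy model built only from the capability list given earlier in the thread (DMA_INTERRUPT, DMA_MEMCPY, DMA_XOR, DMA_XOR_VAL, DMA_PQ, DMA_PQ_VAL); it is not the dmaengine implementation, it ignores any other kernel configuration dependencies, and all DEMO_* and demo_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bit flags standing in for the dmaengine capability types. */
#define DEMO_INTERRUPT (1u << 0)
#define DEMO_MEMCPY    (1u << 1)
#define DEMO_XOR       (1u << 2)
#define DEMO_XOR_VAL   (1u << 3)
#define DEMO_PQ        (1u << 4)
#define DEMO_PQ_VAL    (1u << 5)

/*
 * Capabilities a device must offer to count as a full async_tx provider
 * in this model. The two flags model the ASYNC_TX_DISABLE_XOR_VAL and
 * ASYNC_TX_DISABLE_PQ_VAL options proposed in the reply.
 */
unsigned int demo_required_caps(bool disable_xor_val, bool disable_pq_val)
{
	unsigned int req = DEMO_INTERRUPT | DEMO_MEMCPY | DEMO_XOR |
			   DEMO_XOR_VAL | DEMO_PQ | DEMO_PQ_VAL;

	if (disable_xor_val)
		req &= ~DEMO_XOR_VAL;
	if (disable_pq_val)
		req &= ~DEMO_PQ_VAL;
	return req;
}

/* Capabilities still missing for a device that offers 'have'. */
unsigned int demo_missing_caps(unsigned int have, bool disable_xor_val,
			       bool disable_pq_val)
{
	return demo_required_caps(disable_xor_val, disable_pq_val) & ~have;
}
```

For a device offering only memcpy, xor and pq, demo_missing_caps() shrinks from three entries to one once both validation types are disabled. In this model the remaining entry is the interrupt capability; whether the real registration path requires it depends on kernel configuration that the sketch does not cover.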