[PATCH v2] crypto: talitos: Extend max key length for SHA384/512-HMAC and AEAD

2017-05-02 Thread Martin Hicks

An updated patch that also handles the additional key length requirements
for the AEAD algorithms.

The max keysize is not 96.  For SHA384/512 it's 128, and for the AEAD
algorithms it's longer still.  Extend the max keysize to cover the largest
AEAD case, AES256 + HMAC(SHA512).

Cc: <sta...@vger.kernel.org> # 3.6+
Fixes: 357fb60502ede ("crypto: talitos - add sha224, sha384 and sha512 to existing AEAD algorithms")
Signed-off-by: Martin Hicks <m...@bork.org>
---
 drivers/crypto/talitos.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
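
For reference (not part of the patch), the new bound works out like this, using
the values from <crypto/aes.h> and <crypto/sha.h>:

	AES_MAX_KEY_SIZE     =  32	/* AES-256 cipher key */
	SHA512_BLOCK_SIZE    = 128	/* an HMAC(SHA384/512) key longer than this gets hashed down */
	TALITOS_MAX_KEY_SIZE = 160	/* covers authenc(hmac(sha512),cbc(aes)) split keys */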

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 0bba6a1..79791c6 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -816,7 +816,7 @@ static void talitos_unregister_rng(struct device *dev)
  * HMAC_SNOOP_NO_AFEA (HSNA) instead of type IPSEC_ESP
  */
 #define TALITOS_CRA_PRIORITY_AEAD_HSNA (TALITOS_CRA_PRIORITY - 1)
-#define TALITOS_MAX_KEY_SIZE   96
+#define TALITOS_MAX_KEY_SIZE   (AES_MAX_KEY_SIZE + SHA512_BLOCK_SIZE)
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, DES3_EDE_BLOCK_SIZE */
 
 struct talitos_ctx {
@@ -1495,6 +1495,11 @@ static int ablkcipher_setkey(struct crypto_ablkcipher 
*cipher,
 {
struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher);
 
+   if (keylen > TALITOS_MAX_KEY_SIZE) {
+   crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   return -EINVAL;
+   }
+
memcpy(&ctx->key, key, keylen);
ctx->keylen = keylen;
 
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


[PATCH] crypto: talitos: Extend max key length for SHA384/512-HMAC

2017-04-27 Thread Martin Hicks

The max keysize for both of these is 128, not 96.  Before, with keysizes
over 96, the memcpy in ahash_setkey() would overwrite memory beyond the
key field.

Signed-off-by: Martin Hicks <m...@bork.org>
---
 drivers/crypto/talitos.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
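
As an aside, a defensive length check in ahash_setkey() would turn an oversized
key into an explicit failure instead of a silent overwrite.  A rough sketch only
(the placement is hypothetical, this is not part of the patch, and the real
setkey also hashes keys longer than the block size down before storing them):

	static int ahash_setkey(struct crypto_ahash *tfm, const u8 *key,
				unsigned int keylen)
	{
		struct talitos_ctx *ctx = crypto_ahash_ctx(tfm);

		/* refuse anything that would overflow ctx->key */
		if (keylen > TALITOS_MAX_KEY_SIZE) {
			crypto_ahash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
			return -EINVAL;
		}

		memcpy(ctx->key, key, keylen);
		ctx->keylen = keylen;
		return 0;
	}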

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 0bba6a1..97dc85e 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -816,7 +816,7 @@ static void talitos_unregister_rng(struct device *dev)
  * HMAC_SNOOP_NO_AFEA (HSNA) instead of type IPSEC_ESP
  */
 #define TALITOS_CRA_PRIORITY_AEAD_HSNA (TALITOS_CRA_PRIORITY - 1)
-#define TALITOS_MAX_KEY_SIZE   96
+#define TALITOS_MAX_KEY_SIZE   SHA512_BLOCK_SIZE /* SHA512 has the largest keysize input */
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, DES3_EDE_BLOCK_SIZE */
 
 struct talitos_ctx {
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


Re: [PATCH 4/4] crypto: talitos - add software backlog queue handling

2015-03-16 Thread Martin Hicks
On Sat, Mar 14, 2015 at 1:16 PM, Horia Geantă
horia.gea...@freescale.com wrote:
 On 3/13/2015 8:37 PM, Tom Lendacky wrote:
 +
 +/* Try to backlog request (if allowed) */
 +return crypto_enqueue_request(&priv->chan[ch].queue, areq);

 I'd remembered something about how hardware drivers should use their
 own list element for queuing, searched back and found this:

 http://marc.info/?l=linux-crypto-vger&m=137609769605139&w=2

 The crypto_async_request list field is used only for queuing using the
 crypto API functions - crypto_{enqueue,dequeue}_request.

 I am not sure I understand what is the problem.
 Are crypto_{enqueue,dequeue}_request part of *internal* crypto API?

I don't believe that the enqueue/dequeue functions are internal API.
I based my code on the usage of other hardware offload drivers.  Their
use is somewhat different because most of the other hardware can only
handle a single request at a time.
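
For reference, the generic pattern those helpers support looks roughly like this
(sketch only; the queue depth and variable names are made up, this is not
talitos code):

	struct crypto_queue queue;

	crypto_init_queue(&queue, 16);		/* room for 16 queued requests */

	/* submission side: returns -EINPROGRESS when queued; when the queue is
	 * full, MAY_BACKLOG requests are still queued but get -EBUSY back */
	ret = crypto_enqueue_request(&queue, &req->base);

	/* completion/worker side: tell the oldest backlogged owner its request
	 * is now really in flight, then dispatch the oldest queued request */
	backlog = crypto_get_backlog(&queue);
	async_req = crypto_dequeue_request(&queue);
	if (backlog)
		backlog->complete(backlog, -EINPROGRESS);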

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-13 Thread Martin Hicks
Hi Horia,

On Wed, Mar 11, 2015 at 11:48 AM, Horia Geantă
horia.gea...@freescale.com wrote:

 While here: note that xts-talitos supports only two key lengths - 256
 and 512 bits. There are tcrypt speed tests that check also for 384-bit
 keys (which is out-of-spec, but still...), leading to a Key Size Error
 - see below (KSE bit in AESU Interrupt Status Register is set)

Ok.  I've limited the keysize to 32 or 64 bytes for AES-XTS in the
talitos driver.
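
Roughly, the limit is a setkey-time length check along these lines (sketch only,
not the exact hunk; the function name here is made up):

	static int ablkcipher_xts_setkey(struct crypto_ablkcipher *cipher,
					 const u8 *key, unsigned int keylen)
	{
		struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher);

		/* xts(aes) keys are two AES keys: only 2x16 or 2x32 bytes */
		if (keylen != 2 * AES_KEYSIZE_128 && keylen != 2 * AES_KEYSIZE_256) {
			crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
			return -EINVAL;
		}

		memcpy(&ctx->key, key, keylen);
		ctx->keylen = keylen;
		return 0;
	}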

This was my first experiment with the tcrypt module.  It also brought
up another issue related to the IV limitations of this hardware.  The
latest patch that I have returns an error when there is a non-zero
value in the second 8 bytes of the IV:

+   /*
+* AES-XTS uses the first two AES Context registers for:
+*
+* Register 1:   Sector Number (Little Endian)
+* Register 2:   Sector Size   (Big Endian)
+*
+* Whereas AES-CBC uses registers 1/2 as a 16-byte IV.
+*/
+   if ((ctx->desc_hdr_template &
+        (DESC_HDR_SEL0_MASK | DESC_HDR_MODE0_MASK)) ==
+       (DESC_HDR_SEL0_AESU | DESC_HDR_MODE0_AESU_XTS)) {
+           u64 *aesctx2 = (u64 *)areq->info + 1;
+
+           if (*aesctx2 != 0) {
+                   dev_err(ctx->dev,
+                           "IV length limited to the first 8 bytes.");
+                   return ERR_PTR(-EINVAL);
+           }
+
+           /* Fixed sized sector */
+           *aesctx2 = cpu_to_be64(1 << SECTOR_SHIFT);
+   }


This approach causes the tcrypt tests to fail because tcrypt sets all
16 bytes of the IV to 0xff.  I think returning an error is the right
approach for the talitos module, but it would be nice if tcrypt still
worked.  Should tcrypt just set the IV bytes to 0 instead of 0xff?
Isn't one IV just as good as another?  I think adding exceptions to
the tcrypt code would be ugly, but maybe one should be made for XTS
since the standard dictates that the IV should be plain or plain64?

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-09 Thread Martin Hicks
On Mon, Mar 9, 2015 at 6:16 AM, Horia Geantă horia.gea...@freescale.com wrote:
 On 3/3/2015 7:44 PM, Martin Hicks wrote:
 On Tue, Mar 3, 2015 at 10:44 AM, Horia Geantă
 horia.gea...@freescale.com wrote:

 For talitos, there are two cases:

 1. request data size is <= data unit / sector size
 talitos can handle any IV / tweak scheme

 2. request data size > sector size
 since talitos internally generates the IV for the next sector by
 incrementing the previous IV, only IV schemes that allocate consecutive
 IV to consecutive sectors will function correctly.


 it's not clear to me that #1 is right.  I guess it could be, but the
 IV length would be limited to 8 bytes.

 Yes, there's a limitation in talitos wrt. XTS IV / tweak size - it's up
 to 8 bytes.
 So I guess ESSIV won't work with talitos-xts, since the encrypted IV
 output is 16 bytes.
 But as previously said, ESSIV breaks the XTS standard requirement for
 having a consecutive IV for consecutive blocks. ESSIV should really be
 used only with disk-level encryption schemes that require an
 unpredictable IV.

Ok.  I'll verify that the second half of the IV is zeroed.

One last thing that I'm not sure of is what string to place in
cra_ablkcipher.geniv field.   eseqiv seems wrong if plain/plain64
are the IVs that XTS is designed for.

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


Re: [PATCH 2/2] crypto: talitos: Add AES-XTS Support

2015-03-06 Thread Martin Hicks
On Thu, Mar 5, 2015 at 7:16 PM, Kim Phillips kim.phill...@freescale.com wrote:
 On Fri, 20 Feb 2015 12:00:10 -0500
 Martin Hicks m...@bork.org wrote:

 The newer talitos hardware has support for AES in XTS mode.

 Assuming it's the same thing, AES-XCBC gets added with SEC v3.0
 h/w.  Assuming hw_supports() doesn't already support this algorithm

AES-XCBC isn't the same thing as AES-XTS.

 combination (technically via the mode bit), this needs to be
 reflected in the patch so the driver doesn't think SEC 2.x devices
 can do XTS, too.

Right.  I hadn't looked into how exactly hw_supports() works.  It only
indicates which execution units are present (in this case the AES
unit).  I actually think XTS gets introduced in SEC v3.3.2.  I also
have an MPC8379 (sec3.3) and it does not have XTS.

Can you look internally to find out in which hardware it was
introduced?  Is there a SEC 3.3.1 that also has XTS?

I guess I just need a ->features flag to conditionally register the
XTS algorithm for 3.3.x and newer.  Adding anything more complicated
doesn't seem warranted at this time, although that could change if
someone came along and needed to add a whole whack more of the AES
modes that were introduced at various times in the different SEC
revisions.
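
Roughly what I mean (sketch only; the feature bit, its value and the detection
variable are made up, not existing talitos code):

	#define TALITOS_FTR_AES_XTS	0x00000010	/* hypothetical flag */

	/* probe: set the flag only on parts whose AESU actually has XTS
	 * (SEC 3.3.2+; detection mechanism still to be determined) */
	if (sec_has_xts_capable_aesu)
		priv->features |= TALITOS_FTR_AES_XTS;

	/* algorithm registration loop: skip the xts(aes) template elsewhere */
	if ((alg->desc_hdr_template & DESC_HDR_MODE0_AESU_XTS) ==
	     DESC_HDR_MODE0_AESU_XTS &&
	    !(priv->features & TALITOS_FTR_AES_XTS))
		continue;	/* engine can't do XTS, don't advertise it */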


 + .desc_hdr_template = DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU |
 + DESC_HDR_SEL0_AESU |
 + DESC_HDR_MODE0_AESU_XTS,

 ...

  /* primary execution unit mode (MODE0) and derivatives */
  #define  DESC_HDR_MODE0_ENCRYPT  cpu_to_be32(0x00100000)
  #define  DESC_HDR_MODE0_AESU_CBC cpu_to_be32(0x00200000)
 +#define  DESC_HDR_MODE0_AESU_XTS cpu_to_be32(0x04200000)

 I'd prefer these defines be kept as single bit definitions, and keep
 their names from the manual.  An XTS-specific definition can be
 composed from them either after this, or manually in the
 desc_hdr_template assignment above.

It doesn't look like there are divisible single-bit definitions for
the MODE0 bits. The AES Cipher modes are composed of 5 bits called
CipherMode, Extended CipherMode and Aux2.  Individually they don't
seem to mean anything.  Unless you want something like:

#define AES_MODE(cm, ecm, aux2)   cpu_to_be32(((ecm << 6) | (aux2 << 5) | (cm << 1)) << 20)

#define DESC_HDR_MODE0_AESU_CBC  AES_MODE(1, 0, 0)
#define DESC_HDR_MODE0_AESU_XTS   AES_MODE(1, 1, 0)
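
Plugging numbers in, the composed form does reproduce the packed values from
talitos.h:

	AES_MODE(1, 0, 0) = ((0 << 6) | (0 << 5) | (1 << 1)) << 20 = 0x00200000  /* AESU CBC */
	AES_MODE(1, 1, 0) = ((1 << 6) | (0 << 5) | (1 << 1)) << 20 = 0x04200000  /* AESU XTS */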

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


Re: [PATCH v2 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-03-04 Thread Martin Hicks
Ok, I'm fine dropping this patch.  I'm sure it doesn't affect
performance in a measurable way.

mh

On Tue, Mar 3, 2015 at 7:35 PM, Kim Phillips kim.phill...@freescale.com wrote:
 On Tue,  3 Mar 2015 08:21:35 -0500
 Martin Hicks m...@bork.org wrote:

 The submission count was off by one.

 Signed-off-by: Martin Hicks m...@bork.org
 ---
 sadly, this directly contradicts:

 commit 4b24ea971a93f5d0bec34bf7bfd0939f70cfaae6
 Author: Vishnu Suresh vis...@freescale.com
 Date:   Mon Oct 20 21:06:18 2008 +0800

 crypto: talitos - Preempt overflow interrupts off-by-one fix

 My guess is your request submission pattern differs from that of
 Vishnu's (probably IPSec and/or tcrypt), or later h/w versions have
 gotten better about dealing with channel near-overflow conditions.
 Either way, I'd prefer we not do this: it might break others, and
 I'm guessing doesn't improve performance _that_ much?

 If it does, we could risk it and restrict it to SEC versions 3.3 and
 above maybe?  Not sure what to do here exactly, barring digging up
 and old 2.x SEC and testing.

 Kim

 p.s. I checked, Vishnu isn't with Freescale anymore, so I can't
 cc him.



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


[PATCH v2 2/5] crypto: talitos: Remove MD5_BLOCK_SIZE

2015-03-03 Thread Martin Hicks
This is properly defined in the md5 header file.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index c49d977..89cf4d5 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -637,8 +637,6 @@ static void talitos_unregister_rng(struct device *dev)
 #define TALITOS_MAX_KEY_SIZE   96
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, DES3_EDE_BLOCK_SIZE */
 
-#define MD5_BLOCK_SIZE64
-
 struct talitos_ctx {
struct device *dev;
int ch;
@@ -2195,7 +2193,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "md5",
.cra_driver_name = "md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
@@ -2285,7 +2283,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "hmac(md5)",
.cra_driver_name = "hmac-md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
-- 
1.7.10.4



[PATCH v2 0/5] crypto: talitos: Add crypto async queue handling

2015-03-03 Thread Martin Hicks
I was testing dm-crypt performance with a Freescale P1022 board with 
a recent kernel and was getting IO errors while doing testing with LUKS.
Investigation showed that all hardware FIFO slots were filling and
the driver was returning EAGAIN to the block layer, which is not an
expected response for an async crypto implementation.

The following patch series adds a few small fixes, and reworks the
submission path to use the crypto_queue mechanism to handle the
request backlog.
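
For background, the contract that callers setting CRYPTO_TFM_REQ_MAY_BACKLOG
(such as dm-crypt) expect from an async implementation is roughly the following;
sketch of the caller side only, with a made-up completion variable:

	ret = crypto_ablkcipher_encrypt(req);
	switch (ret) {
	case 0:			/* completed synchronously */
	case -EINPROGRESS:	/* accepted; the completion callback will run later */
		break;
	case -EBUSY:		/* backlogged; the callback first reports -EINPROGRESS
				 * once the request is really taken, then completes it */
		wait_for_completion(&done);
		break;
	default:		/* hard failure; -EAGAIN is not part of this contract,
				 * which is why the block layer saw IO errors */
		err = ret;
	}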

Changes since v1:

- Ran checkpatch.pl
- Split the path for submitting new requests vs. issuing backlogged
  requests.
- Avoid enqueuing a submitted request to the crypto queue unnecessarily.
- Fix return paths where CRYPTO_TFM_REQ_MAY_BACKLOG is not set.


Martin Hicks (5):
  crypto: talitos: Simplify per-channel initialization
  crypto: talitos: Remove MD5_BLOCK_SIZE
  crypto: talitos: Fix off-by-one and use all hardware slots
  crypto: talitos: Reorganize request submission data structures
  crypto: talitos: Add software backlog queue handling

 drivers/crypto/talitos.c |  240 +++---
 drivers/crypto/talitos.h |   44 +++--
 2 files changed, 177 insertions(+), 107 deletions(-)

-- 
1.7.10.4



[PATCH v2 1/5] crypto: talitos: Simplify per-channel initialization

2015-03-03 Thread Martin Hicks
There were multiple loops in a row, for each separate step of the
initialization of the channels.  Simplify to a single loop.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 067ec21..c49d977 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2706,20 +2706,16 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
+   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
+
	for (i = 0; i < priv->num_channels; i++) {
		priv->chan[i].reg = priv->reg + TALITOS_CH_STRIDE * (i + 1);
		if (!priv->irq[1] || !(i & 1))
			priv->chan[i].reg += TALITOS_CH_BASE_OFFSET;
-   }
 
-   for (i = 0; i < priv->num_channels; i++) {
		spin_lock_init(&priv->chan[i].head_lock);
		spin_lock_init(&priv->chan[i].tail_lock);
-   }
 
-   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
-
-   for (i = 0; i < priv->num_channels; i++) {
		priv->chan[i].fifo = kzalloc(sizeof(struct talitos_request) *
					     priv->fifo_len, GFP_KERNEL);
		if (!priv->chan[i].fifo) {
@@ -2727,11 +2723,10 @@ static int talitos_probe(struct platform_device *ofdev)
			err = -ENOMEM;
			goto err_out;
		}
-   }
 
-   for (i = 0; i < priv->num_channels; i++)
		atomic_set(&priv->chan[i].submit_count,
			   -(priv->chfifo_len - 1));
+   }
 
dma_set_mask(dev, DMA_BIT_MASK(36));
 
-- 
1.7.10.4



[PATCH v2 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-03-03 Thread Martin Hicks
The submission count was off by one.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
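
For context (not part of the patch): the channel accounting gates submissions
with atomic_inc_not_zero(), so the counter has to start at minus the number of
slots that should actually be usable.  A sketch of the counting:

	/* init:    atomic_set(&submit_count, -chfifo_len);
	 * submit:  atomic_inc_not_zero() fails only once the value reaches 0,
	 *          i.e. after chfifo_len requests are in flight
	 * done:    atomic_dec() in flush_channel() frees the slot again
	 *
	 * with the old -(chfifo_len - 1) the counter hit 0 one request early,
	 * leaving one hardware FIFO slot permanently unused.
	 */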

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 89cf4d5..7709805 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2722,8 +2722,7 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
-   atomic_set(&priv->chan[i].submit_count,
-              -(priv->chfifo_len - 1));
+   atomic_set(&priv->chan[i].submit_count, -priv->chfifo_len);
}
 
dma_set_mask(dev, DMA_BIT_MASK(36));
-- 
1.7.10.4



[PATCH v2 4/5] crypto: talitos: Reorganize request submission data structures

2015-03-03 Thread Martin Hicks
This is preparatory work for moving to using the crypto async queue
handling code.  A talitos_request structure is buried into each
talitos_edesc so that when talitos_submit() is called, everything required
to defer the submission to the hardware is contained within talitos_edesc.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   95 +++---
 drivers/crypto/talitos.h |   41 +---
 2 files changed, 66 insertions(+), 70 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 7709805..b0c85ce 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -186,22 +186,16 @@ static int init_device(struct device *dev)
  * talitos_submit - submits a descriptor to the device for processing
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @desc:  the descriptor to be processed by the device
- * @callback:  whom to call when processing is complete
- * @context:   a handle for use by caller (optional)
+ * @edesc: the descriptor to be processed by the device
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
  * for device processing status.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
-  void (*callback)(struct device *dev,
-   struct talitos_desc *desc,
-   void *context, int error),
-  void *context)
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request;
+   struct talitos_request *request = edesc-req;
unsigned long flags;
int head;
 
@@ -214,19 +208,15 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_desc *desc,
}
 
head = priv-chan[ch].head;
-   request = priv-chan[ch].fifo[head];
-
-   /* map descriptor and save caller data */
-   request-dma_desc = dma_map_single(dev, desc, sizeof(*desc),
+   request-dma_desc = dma_map_single(dev, request-desc,
+  sizeof(*request-desc),
   DMA_BIDIRECTIONAL);
-   request-callback = callback;
-   request-context = context;
 
/* increment fifo head */
priv-chan[ch].head = (priv-chan[ch].head + 1)  (priv-fifo_len - 1);
 
smp_wmb();
-   request-desc = desc;
+   priv-chan[ch].fifo[head] = request;
 
/* GO! */
wmb();
@@ -247,15 +237,16 @@ EXPORT_SYMBOL(talitos_submit);
 static void flush_channel(struct device *dev, int ch, int error, int reset_ch)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request, saved_req;
+   struct talitos_request *request;
unsigned long flags;
int tail, status;
 
spin_lock_irqsave(priv-chan[ch].tail_lock, flags);
 
tail = priv-chan[ch].tail;
-   while (priv-chan[ch].fifo[tail].desc) {
-   request = priv-chan[ch].fifo[tail];
+   while (priv-chan[ch].fifo[tail]) {
+   request = priv-chan[ch].fifo[tail];
+   status = 0;
 
/* descriptors with their done bits set don't get the error */
rmb();
@@ -271,14 +262,9 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 sizeof(struct talitos_desc),
 DMA_BIDIRECTIONAL);
 
-   /* copy entries so we can call callback outside lock */
-   saved_req.desc = request-desc;
-   saved_req.callback = request-callback;
-   saved_req.context = request-context;
-
/* release request entry in fifo */
smp_wmb();
-   request-desc = NULL;
+   priv-chan[ch].fifo[tail] = NULL;
 
/* increment fifo tail */
priv-chan[ch].tail = (tail + 1)  (priv-fifo_len - 1);
@@ -287,8 +273,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 
atomic_dec(priv-chan[ch].submit_count);
 
-   saved_req.callback(dev, saved_req.desc, saved_req.context,
-  status);
+   request-callback(dev, request-desc, request-context, status);
+
/* channel may resume processing in single desc error case */
if (error  !reset_ch  status == error)
return;
@@ -352,7 +338,8 @@ static u32 current_desc_hdr(struct device *dev, int ch)
tail = priv-chan[ch].tail;
 
iter = tail;
-   while (priv-chan[ch].fifo[iter].dma_desc != cur_desc) {
+   while (priv-chan[ch].fifo[iter] 
+  priv-chan[ch].fifo

[PATCH v2 5/5] crypto: talitos: Add software backlog queue handling

2015-03-03 Thread Martin Hicks
I was running into situations where the hardware FIFO was filling up, and
the code was returning EAGAIN to dm-crypt and just dropping the submitted
crypto request.

This adds support in talitos for a software backlog queue.  When requests
can't be queued to the hardware immediately EBUSY is returned.  The queued
requests are dispatched to the hardware in received order as hardware FIFO
slots become available.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |  135 --
 drivers/crypto/talitos.h |3 ++
 2 files changed, 110 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index b0c85ce..bb7fba0 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -182,55 +182,118 @@ static int init_device(struct device *dev)
return 0;
 }
 
-/**
- * talitos_submit - submits a descriptor to the device for processing
- * @dev:   the SEC device to be used
- * @ch:the SEC device channel to be used
- * @edesc: the descriptor to be processed by the device
- *
- * desc must contain valid dma-mapped (bus physical) address pointers.
- * callback must check err and feedback in descriptor header
- * for device processing status.
- */
-int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+/* Dispatch 'request' if provided, otherwise a backlogged request */
+static int __talitos_handle_queue(struct device *dev, int ch,
+ struct talitos_edesc *edesc,
+ unsigned long *irq_flags)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request = edesc-req;
-   unsigned long flags;
+   struct talitos_request *request;
+   struct crypto_async_request *areq;
int head;
+   int ret = -EINPROGRESS;
 
-   spin_lock_irqsave(priv-chan[ch].head_lock, flags);
-
-   if (!atomic_inc_not_zero(priv-chan[ch].submit_count)) {
+   if (!atomic_inc_not_zero(priv-chan[ch].submit_count))
/* h/w fifo is full */
-   spin_unlock_irqrestore(priv-chan[ch].head_lock, flags);
-   return -EAGAIN;
+   return -EBUSY;
+
+   if (!edesc) {
+   /* Dequeue the oldest request */
+   areq = crypto_dequeue_request(priv-chan[ch].queue);
+   request = container_of(areq, struct talitos_request, base);
+   } else {
+   request = edesc-req;
}
 
-   head = priv-chan[ch].head;
request-dma_desc = dma_map_single(dev, request-desc,
   sizeof(*request-desc),
   DMA_BIDIRECTIONAL);
 
/* increment fifo head */
+   head = priv-chan[ch].head;
priv-chan[ch].head = (priv-chan[ch].head + 1)  (priv-fifo_len - 1);
 
-   smp_wmb();
-   priv-chan[ch].fifo[head] = request;
+   spin_unlock_irqrestore(priv-chan[ch].head_lock, *irq_flags);
+
+   /*
+* Mark a backlogged request as in-progress.
+*/
+   if (!edesc) {
+   areq = request-context;
+   areq-complete(areq, -EINPROGRESS);
+   }
+
+   spin_lock_irqsave(priv-chan[ch].head_lock, *irq_flags);
 
/* GO! */
+   priv-chan[ch].fifo[head] = request;
wmb();
out_be32(priv-chan[ch].reg + TALITOS_FF,
 upper_32_bits(request-dma_desc));
out_be32(priv-chan[ch].reg + TALITOS_FF_LO,
 lower_32_bits(request-dma_desc));
 
+   return ret;
+}
+
+/**
+ * talitos_submit - performs submissions of a new descriptors
+ *
+ * @dev:   the SEC device to be used
+ * @ch:the SEC device channel to be used
+ * @edesc: the request to be processed by the device
+ *
+ * edesc-req must contain valid dma-mapped (bus physical) address pointers.
+ * callback must check err and feedback in descriptor header
+ * for device processing status upon completion.
+ */
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+{
+   struct talitos_private *priv = dev_get_drvdata(dev);
+   struct talitos_request *request = edesc-req;
+   unsigned long flags;
+   int ret = -EINPROGRESS;
+
+   spin_lock_irqsave(priv-chan[ch].head_lock, flags);
+
+   if (priv-chan[ch].queue.qlen) {
+   /*
+* There are backlogged requests.  Just queue this new request
+* and dispatch the oldest backlogged request to the hardware.
+*/
+   crypto_enqueue_request(priv-chan[ch].queue,
+  request-base);
+   __talitos_handle_queue(dev, ch, NULL, flags);
+   ret = -EBUSY;
+   } else {
+   ret = __talitos_handle_queue(dev, ch, edesc, flags);
+   if (ret == -EBUSY)
+   /* Hardware FIFO is full

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-03 Thread Martin Hicks
On Tue, Mar 3, 2015 at 10:44 AM, Horia Geantă
horia.gea...@freescale.com wrote:
 On 3/3/2015 12:09 AM, Martin Hicks wrote:

 On Mon, Mar 02, 2015 at 03:37:28PM +0100, Milan Broz wrote:

 If crypto API allows to encrypt more sectors in one run
 (handling IV internally) dmcrypt can be modified of course.

 But do not forget we can use another IV (not only sequential number)
 e.g. ESSIV with XTS as well (even if it doesn't make much sense, some people
 are using it).

 Interesting, I'd not considered using XTS with an IV other than plain/64.
 The talitos hardware would not support aes/xts in any mode other than
 plain/plain64 I don't think...Although perhaps you could push in an 8-byte
 IV and the hardware would interpret it as the sector #.


 For talitos, there are two cases:

 1. request data size is <= data unit / sector size
 talitos can handle any IV / tweak scheme

 2. request data size > sector size
 since talitos internally generates the IV for the next sector by
 incrementing the previous IV, only IV schemes that allocate consecutive
 IV to consecutive sectors will function correctly.


it's not clear to me that #1 is right.  I guess it could be, but the
IV length would be limited to 8 bytes.

This also points out that claiming that the XTS IV size is 16 bytes,
as my current patch does, could be problematic.  It's handy because
the first 8 bytes should contain a plain64 sector #, and the second
u64 can be used to encode the sector size but it would be a mistake
for someone to use the second 8 bytes for the rest of a 16byte IV.

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks

On Mon, Mar 02, 2015 at 03:37:28PM +0100, Milan Broz wrote:
 
 If crypto API allows to encrypt more sectors in one run
 (handling IV internally) dmcrypt can be modified of course.
 
 But do not forget we can use another IV (not only sequential number)
 e.g. ESSIV with XTS as well (even if it doesn't make much sense, some people
 are using it).

Interesting, I'd not considered using XTS with an IV other than plain/64.
The talitos hardware would not support aes/xts in any mode other than
plain/plain64 I don't think...Although perhaps you could push in an 8-byte
IV and the hardware would interpret it as the sector #.

 Maybe the following question would be if the dmcrypt sector IV algorithms
 should moved into crypto API as well.
 (But because I misused dmcrypt IVs hooks for some additional operations
 for loopAES and old Truecrypt CBC mode, it is not so simple...)

Speaking again with talitos in mind, there would be no advantage for this
hardware.  Although larger requests are possible, only a single IV can be
provided per request, so for algorithms like AES-CBC under dm-crypt, 512-byte
IOs are the only option (short of switching to a 4kB block size).

mh

-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks
On Mon, Mar 02, 2015 at 03:25:56PM +0200, Horia Geantă wrote:
 On 2/20/2015 7:00 PM, Martin Hicks wrote:
  This adds the AES-XTS mode, supported by the Freescale SEC 3.3.2.
  
  One of the nice things about this hardware is that it knows how to deal
  with encrypt/decrypt requests that are larger than sector size, but that 
  also requires that that the sector size be passed into the crypto engine
  as an XTS cipher context parameter.
  
  When a request is larger than the sector size the sector number is
  incremented by the talitos engine and the tweak key is re-calculated
  for the new sector.
  
  I've tested this with 256bit and 512bit keys (tweak and data keys of 128bit
  and 256bit) to ensure interoperability with the software AES-XTS
  implementation.  All testing was done using dm-crypt/LUKS with
  aes-xts-plain64.
  
  Is there a better solution than just hard coding the sector size to
  (1 << SECTOR_SHIFT)?  Maybe dm-crypt should be modified to pass the
  sector size along with the plain/plain64 IV to an XTS algorithm?
 
 AFAICT, SW implementation of xts mode in kernel (crypto/xts.c) is not
 aware of a sector size (data unit size in IEEE P1619 terminology):
 There's a hidden assumption that all the data send to xts in one request
 belongs to a single sector. Even more, it's supposed that the first
 16-byte block in the request is block 0 in the sector. These can be
 seen from the way the tweak (T) value is computed.
 (Side note: there's no support of ciphertext stealing in crypto/xts.c -
 i.e. sector sizes must be a multiple of underlying block cipher size -
 that is 16B.)
 
 If dm-crypt would be modified to pass sector size somehow, all in-kernel
 xts implementations would have to be made aware of the change.
 I have nothing against this, but let's see what crypto maintainers have
 to say...

Right.  Additionally, there may be some requirement for the encryption
implementation to broadcast the maximum size that can be handled in a single
request.  For example Talitos could handle XTS encrypt/decrypt requests of
up to 64kB (regardless of the block device's sector size).

 BTW, there were some discussions back in 2013 wrt. being able to
 configure / increase sector size, smth. crypto engines would benefit from:
 http://www.saout.de/pipermail/dm-crypt/2013-January/003125.html
 (experimental patch)
 http://www.saout.de/pipermail/dm-crypt/2013-March/003202.html
 
 The experimental patch sends sector size as the req->nbytes - hidden
 assumption: data size sent in an xts crypto request equals a sector.

I found this last week, and used it as a starting point for some testing.  I
modified it to keep the underlying sector size of the dm-crypt mapping as
512byte, but allowed the code to combine requests in IOs up to 4kB.  Doing
greater request sizes would require allocating additional pages...I plan to
implement that to see how much extra performance can be squeezed out.

patch below...

With regards to performance, with my low-powered Freescale P1022 board, I see
performance numbers like this on ext4, as measured by bonnie++.

                        Write (MB/s)    Read (MB/s)
Unencrypted             140             176
aes-xts-plain64 512b    113             115
aes-xts-plain64 4kB     71              56

The more detailed bonnie++ output is here:
http://www.bork.org/~mort/dm-crypt-enc-blksize.html

The larger IO sizes is a huge win for this board.

The patch I'm using to send IOs up to 4kB to talitos follows.

Thanks,
mh


diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 08981be..88e95b5 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -42,6 +42,7 @@ struct convert_context {
struct bvec_iter iter_out;
sector_t cc_sector;
atomic_t cc_pending;
+   unsigned int block_size;
struct ablkcipher_request *req;
 };
 
@@ -142,6 +143,8 @@ struct crypt_config {
sector_t iv_offset;
unsigned int iv_size;
 
+   unsigned int block_size;
+
/* ESSIV: struct crypto_cipher *essiv_tfm */
void *iv_private;
struct crypto_ablkcipher **tfms;
@@ -801,10 +804,17 @@ static void crypt_convert_init(struct crypt_config *cc,
 {
ctx-bio_in = bio_in;
ctx-bio_out = bio_out;
-   if (bio_in)
+   ctx-block_size = 0;
+   if (bio_in) {
ctx-iter_in = bio_in-bi_iter;
-   if (bio_out)
+   ctx-block_size = max(ctx-block_size, bio_cur_bytes(bio_in));
+   }
+   if (bio_out) {
ctx-iter_out = bio_out-bi_iter;
+   ctx-block_size = max(ctx-block_size, bio_cur_bytes(bio_out));
+   }
+   if (ctx-block_size  cc-block_size)
+   ctx-block_size = cc-block_size;
ctx-cc_sector = sector + cc-iv_offset;
init_completion(ctx-restart);
 }
@@ -844,15 +854,15 @@ static int crypt_convert_block(struct crypt_config *cc,
dmreq-iv_sector = ctx-cc_sector;
dmreq-ctx = ctx;
sg_init_table(dmreq-sg_in, 1

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks
On Mon, Mar 02, 2015 at 04:44:19PM -0500, Martin Hicks wrote:
 
                          Write (MB/s)    Read (MB/s)
  Unencrypted             140             176
  aes-xts-plain64 512b    113             115
  aes-xts-plain64 4kB     71              56

I got the two AES lines backwards.  Sorry about that.

mh

-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


Re: [PATCH 5/5] crypto: talitos: Add software backlog queue handling

2015-02-26 Thread Martin Hicks
Hi Horia,

On Tue, Feb 24, 2015 at 1:21 PM, Horia Geantă
horia.gea...@freescale.com wrote:
 On 2/20/2015 6:21 PM, Martin Hicks wrote:
 + int ret = -EINPROGRESS;

   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);

 + if (edesc) {
 + orig_request = &edesc->req;
 + crypto_enqueue_request(&priv->chan[ch].queue,
 &orig_request->base);
 + }

 The request goes through the SW queue even if there are empty slots in
 the HW queue, doing unnecessary crypto_queue_encrypt() and
 crypto_queue_decrypt(). Trying to use the HW queue first would be better.


Definitely a valid point.  In trying to reorganize this it really
complicates things.
Splitting the submit from issuing requests from the backlog seems to
make it much more straightforward.

 @@ -1170,6 +1211,8 @@ static struct talitos_edesc 
 *talitos_edesc_alloc(struct device *dev,
edesc-dma_len,
DMA_BIDIRECTIONAL);
   edesc->req.desc = &edesc->desc;
 + /* A copy of the crypto_async_request to use the crypto_queue backlog
 */
 + memcpy(&edesc->req.base, areq, sizeof(struct crypto_async_request));

 Why not have a
 struct crypto_async_request *base;
 instead of
 struct crypto_async_request base;
 in talitos_request structure?

 This way:
 1. memcopy above is avoided
 2. talitos_request structure gets smaller - remember that
 talitos_request is now embedded in talitos_edesc structure, which gets
 kmalloc-ed for every request

The trouble I ran into was that after a request is backlogged and put
on the crypto queue, I couldn't see how else I could go from the
crypto_async_request that is pulled from the queue back to the
talitos_request that is needed in order put the pointer into the
hardware FIFO.

This is the one issue from your review that I didn't address in v2 of the patch.
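
For completeness, the embedded base plus container_of() is how v2 gets back from
the dequeued crypto_async_request to the talitos_request (fields abbreviated;
a sketch of what the patch does, not a verbatim hunk):

	struct talitos_request {
		struct crypto_async_request base;	/* embedded, not a pointer */
		dma_addr_t dma_desc;
		struct talitos_desc *desc;
		void *context;
		/* ... */
	};

	areq = crypto_dequeue_request(&priv->chan[ch].queue);
	request = container_of(areq, struct talitos_request, base);

With only a crypto_async_request pointer stored in talitos_request there would
be no generic way back to the wrapper, which is the problem described above.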

Thanks for the review.  Not releasing / re-acquiring the spinlock
while trying to flush the backlog really improved performance.

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296


[PATCH 5/5] crypto: talitos: Add software backlog queue handling

2015-02-20 Thread Martin Hicks
I was running into situations where the hardware FIFO was filling up, and
the code was returning EAGAIN to dm-crypt and just dropping the submitted
crypto request.

This adds support in talitos for a software backlog queue.  When requests
can't be queued to the hardware immediately EBUSY is returned.  The queued
requests are dispatched to the hardware in received order as hardware FIFO
slots become available.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   92 +++---
 drivers/crypto/talitos.h |3 ++
 2 files changed, 74 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index d3472be..226654c 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -183,43 +183,72 @@ static int init_device(struct device *dev)
 }
 
 /**
- * talitos_submit - submits a descriptor to the device for processing
+ * talitos_handle_queue - performs submissions either of new descriptors
+ *or ones waiting in the queue backlog.
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @edesc: the descriptor to be processed by the device
- * @context:   a handle for use by caller (optional)
+ * @edesc: the descriptor to be processed by the device (optional)
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
- * for device processing status.
+ * for device processing status upon completion.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+int talitos_handle_queue(struct device *dev, int ch, struct talitos_edesc 
*edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request = edesc-req;
+   struct talitos_request *request, *orig_request = NULL;
+   struct crypto_async_request *async_req;
unsigned long flags;
int head;
+   int ret = -EINPROGRESS; 
 
spin_lock_irqsave(priv-chan[ch].head_lock, flags);
 
+   if (edesc) {
+   orig_request = edesc-req;
+   crypto_enqueue_request(priv-chan[ch].queue, 
orig_request-base);
+   }
+
+flush_another:
+   if (priv-chan[ch].queue.qlen == 0) {
+   spin_unlock_irqrestore(priv-chan[ch].head_lock, flags);
+   return 0;
+   }
+
if (!atomic_inc_not_zero(priv-chan[ch].submit_count)) {
/* h/w fifo is full */
spin_unlock_irqrestore(priv-chan[ch].head_lock, flags);
-   return -EAGAIN;
+   return -EBUSY;
}
 
-   head = priv-chan[ch].head;
+   /* Dequeue the oldest request */
+   async_req = crypto_dequeue_request(priv-chan[ch].queue);
+
+   request = container_of(async_req, struct talitos_request, base);
request-dma_desc = dma_map_single(dev, request-desc,
   sizeof(*request-desc),
   DMA_BIDIRECTIONAL);
 
/* increment fifo head */
+   head = priv-chan[ch].head;
priv-chan[ch].head = (priv-chan[ch].head + 1)  (priv-fifo_len - 1);
 
-   smp_wmb();
-   priv-chan[ch].fifo[head] = request;
+   spin_unlock_irqrestore(priv-chan[ch].head_lock, flags);
+
+   /*
+* Mark a backlogged request as in-progress, return EBUSY because
+* the original request that was submitted is backlogged.
+*/
+   if (request != orig_request) {
+   struct crypto_async_request *areq = request-context;
+   areq-complete(areq, -EINPROGRESS);
+   ret = -EBUSY;
+   }
+
+   spin_lock_irqsave(priv-chan[ch].head_lock, flags);
 
/* GO! */
+   priv-chan[ch].fifo[head] = request;
wmb();
out_be32(priv-chan[ch].reg + TALITOS_FF,
 upper_32_bits(request-dma_desc));
@@ -228,9 +257,18 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_edesc *edesc)
 
spin_unlock_irqrestore(priv-chan[ch].head_lock, flags);
 
-   return -EINPROGRESS;
+   /*
+* When handling the queue via the completion path, queue more
+* requests if the hardware has room.
+*/
+   if (!edesc) {
+   spin_lock_irqsave(priv-chan[ch].head_lock, flags);
+   goto flush_another;
+   }
+
+   return ret;
 }
-EXPORT_SYMBOL(talitos_submit);
+EXPORT_SYMBOL(talitos_handle_queue);
 
 /*
  * process what was done, notify callback of error if not
@@ -284,6 +322,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
}
 
spin_unlock_irqrestore(priv-chan[ch].tail_lock, flags);
+
+   talitos_handle_queue(dev, ch, NULL);
 }
 
 /*
@@ -1038,8 +1078,8 @@ static int ipsec_esp(struct talitos_edesc *edesc, struct 
aead_request *areq,
edesc-req.callback = callback;
edesc

[PATCH 0/5] crypto: talitos: Add crypto async queue handling

2015-02-20 Thread Martin Hicks
I was testing dm-crypt performance with a Freescale P1022 board with 
a recent kernel and was getting IO errors while doing testing with LUKS.
Investigation showed that all hardware FIFO slots were filling and
the driver was returning EAGAIN to the block layer, which is not an
expected response for an async crypto implementation.

The following patch series adds a few small fixes, and reworks the
submission path to use the crypto_queue mechanism to handle the
request backlog.


Martin Hicks (5):
  crypto: talitos: Simplify per-channel initialization
  crypto: talitos: Remove MD5_BLOCK_SIZE
  crypto: talitos: Fix off-by-one and use all hardware slots
  crypto: talitos: Reorganize request submission data structures
  crypto: talitos: Add software backlog queue handling

 drivers/crypto/talitos.c |  189 --
 drivers/crypto/talitos.h |   44 +--
 2 files changed, 137 insertions(+), 96 deletions(-)

-- 
1.7.10.4



[PATCH 1/5] crypto: talitos: Simplify per-channel initialization

2015-02-20 Thread Martin Hicks
There were multiple loops in a row, for each separate step of the
initialization of the channels.  Simplify to a single loop.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 067ec21..c49d977 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2706,20 +2706,16 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
+   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
+
	for (i = 0; i < priv->num_channels; i++) {
		priv->chan[i].reg = priv->reg + TALITOS_CH_STRIDE * (i + 1);
		if (!priv->irq[1] || !(i & 1))
			priv->chan[i].reg += TALITOS_CH_BASE_OFFSET;
-   }
 
-   for (i = 0; i < priv->num_channels; i++) {
		spin_lock_init(&priv->chan[i].head_lock);
		spin_lock_init(&priv->chan[i].tail_lock);
-   }
 
-   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
-
-   for (i = 0; i < priv->num_channels; i++) {
		priv->chan[i].fifo = kzalloc(sizeof(struct talitos_request) *
					     priv->fifo_len, GFP_KERNEL);
		if (!priv->chan[i].fifo) {
@@ -2727,11 +2723,10 @@ static int talitos_probe(struct platform_device *ofdev)
			err = -ENOMEM;
			goto err_out;
		}
-   }
 
-   for (i = 0; i < priv->num_channels; i++)
		atomic_set(&priv->chan[i].submit_count,
			   -(priv->chfifo_len - 1));
+   }
 
dma_set_mask(dev, DMA_BIT_MASK(36));
 
-- 
1.7.10.4



[PATCH 2/5] crypto: talitos: Remove MD5_BLOCK_SIZE

2015-02-20 Thread Martin Hicks
This is properly defined in the md5 header file.
---
 drivers/crypto/talitos.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index c49d977..89cf4d5 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -637,8 +637,6 @@ static void talitos_unregister_rng(struct device *dev)
 #define TALITOS_MAX_KEY_SIZE   96
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, DES3_EDE_BLOCK_SIZE */
 
-#define MD5_BLOCK_SIZE64
-
 struct talitos_ctx {
struct device *dev;
int ch;
@@ -2195,7 +2193,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "md5",
.cra_driver_name = "md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
@@ -2285,7 +2283,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "hmac(md5)",
.cra_driver_name = "hmac-md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
-- 
1.7.10.4



[PATCH 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-02-20 Thread Martin Hicks
The submission count was off by one.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 89cf4d5..7709805 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2722,8 +2722,7 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
-   atomic_set(&priv->chan[i].submit_count,
-              -(priv->chfifo_len - 1));
+   atomic_set(&priv->chan[i].submit_count, -priv->chfifo_len);
}
 
dma_set_mask(dev, DMA_BIT_MASK(36));
-- 
1.7.10.4



[PATCH 4/5] crypto: talitos: Reorganize request submission data structures

2015-02-20 Thread Martin Hicks
This is preparatory work for moving to using the crypto async queue
handling code.  A talitos_request structure is buried into each
talitos_edesc so that when talitos_submit() is called, everything required
to defer the submission to the hardware is contained within talitos_edesc.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   93 +++---
 drivers/crypto/talitos.h |   41 +---
 2 files changed, 65 insertions(+), 69 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 7709805..d3472be 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -186,22 +186,17 @@ static int init_device(struct device *dev)
  * talitos_submit - submits a descriptor to the device for processing
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @desc:  the descriptor to be processed by the device
- * @callback:  whom to call when processing is complete
+ * @edesc: the descriptor to be processed by the device
  * @context:   a handle for use by caller (optional)
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
  * for device processing status.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
-  void (*callback)(struct device *dev,
-   struct talitos_desc *desc,
-   void *context, int error),
-  void *context)
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request;
+   struct talitos_request *request = edesc-req;
unsigned long flags;
int head;
 
@@ -214,19 +209,15 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_desc *desc,
}
 
head = priv-chan[ch].head;
-   request = priv-chan[ch].fifo[head];
-
-   /* map descriptor and save caller data */
-   request-dma_desc = dma_map_single(dev, desc, sizeof(*desc),
+   request-dma_desc = dma_map_single(dev, request-desc,
+  sizeof(*request-desc),
   DMA_BIDIRECTIONAL);
-   request-callback = callback;
-   request-context = context;
 
/* increment fifo head */
priv-chan[ch].head = (priv-chan[ch].head + 1)  (priv-fifo_len - 1);
 
smp_wmb();
-   request-desc = desc;
+   priv-chan[ch].fifo[head] = request;
 
/* GO! */
wmb();
@@ -247,15 +238,16 @@ EXPORT_SYMBOL(talitos_submit);
 static void flush_channel(struct device *dev, int ch, int error, int reset_ch)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request, saved_req;
+   struct talitos_request *request;
unsigned long flags;
int tail, status;
 
spin_lock_irqsave(priv-chan[ch].tail_lock, flags);
 
tail = priv-chan[ch].tail;
-   while (priv-chan[ch].fifo[tail].desc) {
-   request = priv-chan[ch].fifo[tail];
+   while (priv-chan[ch].fifo[tail]) {
+   request = priv-chan[ch].fifo[tail];
+   status = 0;
 
/* descriptors with their done bits set don't get the error */
rmb();
@@ -271,14 +263,9 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 sizeof(struct talitos_desc),
 DMA_BIDIRECTIONAL);
 
-   /* copy entries so we can call callback outside lock */
-   saved_req.desc = request-desc;
-   saved_req.callback = request-callback;
-   saved_req.context = request-context;
-
/* release request entry in fifo */
smp_wmb();
-   request-desc = NULL;
+   priv-chan[ch].fifo[tail] = NULL;
 
/* increment fifo tail */
priv-chan[ch].tail = (tail + 1)  (priv-fifo_len - 1);
@@ -287,8 +274,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 
atomic_dec(priv-chan[ch].submit_count);
 
-   saved_req.callback(dev, saved_req.desc, saved_req.context,
-  status);
+   request-callback(dev, request-desc, request-context, status);
+
/* channel may resume processing in single desc error case */
if (error  !reset_ch  status == error)
return;
@@ -352,7 +339,8 @@ static u32 current_desc_hdr(struct device *dev, int ch)
tail = priv-chan[ch].tail;
 
iter = tail;
-   while (priv-chan[ch].fifo[iter].dma_desc != cur_desc) {
+   while (priv-chan[ch].fifo[iter] 
+   priv-chan

[PATCH 1/2] crypto: talitos: Clean ups and comment fixes for ablkcipher commands

2015-02-20 Thread Martin Hicks
This just cleans up some of the initializers, and improves the comments
should any other ablkcipher modes be added in the future.  The header
words 1 and 5 have more possibilities than just passing an IV.  These
are pointers to the Cipher Context in/out registers.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 226654c..6b2a19a 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -1377,11 +1377,9 @@ static int common_nonsnoop(struct talitos_edesc *edesc,
int sg_count, ret;
 
/* first DWORD empty */
-   desc->ptr[0].len = 0;
-   to_talitos_ptr(&desc->ptr[0], 0);
-   desc->ptr[0].j_extent = 0;
+   desc->ptr[0] = zero_entry;
 
-   /* cipher iv */
+   /* cipher context */
to_talitos_ptr(&desc->ptr[1], edesc->iv_dma);
desc->ptr[1].len = cpu_to_be16(ivsize);
desc->ptr[1].j_extent = 0;
@@ -1444,14 +1442,12 @@ static int common_nonsnoop(struct talitos_edesc *edesc,
   edesc->dma_len, DMA_BIDIRECTIONAL);
}
 
-   /* iv out */
+   /* cipher context out */
map_single_talitos_ptr(dev, &desc->ptr[5], ivsize, ctx->iv, 0,
   DMA_FROM_DEVICE);
 
/* last DWORD empty */
-   desc->ptr[6].len = 0;
-   to_talitos_ptr(&desc->ptr[6], 0);
-   desc->ptr[6].j_extent = 0;
+   desc->ptr[6] = zero_entry;
 
edesc->req.callback = callback;
edesc->req.context = areq;
-- 
1.7.10.4



[PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-02-20 Thread Martin Hicks
This adds the AES-XTS mode, supported by the Freescale SEC 3.3.2.

One of the nice things about this hardware is that it knows how to deal
with encrypt/decrypt requests that are larger than sector size, but that 
also requires that the sector size be passed into the crypto engine
as an XTS cipher context parameter.

When a request is larger than the sector size the sector number is
incremented by the talitos engine and the tweak key is re-calculated
for the new sector.

I've tested this with 256bit and 512bit keys (tweak and data keys of 128bit
and 256bit) to ensure interoperability with the software AES-XTS
implementation.  All testing was done using dm-crypt/LUKS with
aes-xts-plain64.

Is there a better solution than just hard coding the sector size to
(1 << SECTOR_SHIFT)?  Maybe dm-crypt should be modified to pass the
sector size along with the plain/plain64 IV to an XTS algorithm?

Martin Hicks (2):
  crypto: talitos: Clean ups and comment fixes for ablkcipher commands
  crypto: talitos: Add AES-XTS Support

 drivers/crypto/talitos.c |   45 +
 drivers/crypto/talitos.h |1 +
 2 files changed, 38 insertions(+), 8 deletions(-)

-- 
1.7.10.4



[PATCH 2/2] crypto: talitos: Add AES-XTS Support

2015-02-20 Thread Martin Hicks
The newer talitos hardware has support for AES in XTS mode.

Signed-off-by: Martin Hicks m...@bork.org
---
 drivers/crypto/talitos.c |   33 +
 drivers/crypto/talitos.h |1 +
 2 files changed, 34 insertions(+)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 6b2a19a..38cbde1 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -40,9 +40,11 @@
 #include <linux/spinlock.h>
 #include <linux/rtnetlink.h>
 #include <linux/slab.h>
+#include <linux/device-mapper.h>
 
 #include <crypto/algapi.h>
 #include <crypto/aes.h>
+#include <crypto/xts.h>
 #include <crypto/des.h>
 #include <crypto/sha.h>
 #include <crypto/md5.h>
@@ -1464,9 +1466,22 @@ static struct talitos_edesc 
*ablkcipher_edesc_alloc(struct ablkcipher_request *
areq, bool encrypt)
 {
struct crypto_ablkcipher *cipher = crypto_ablkcipher_reqtfm(areq);
+   struct crypto_tfm *tfm = (struct crypto_tfm *)cipher;
struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher);
unsigned int ivsize = crypto_ablkcipher_ivsize(cipher);
 
+   /*
+* AES-XTS uses the first two AES Context registers for:
+*
+* Register 1:   Sector Number (Little Endian)
+* Register 2:   Sector Size   (Big Endian)
+*
+* Whereas AES-CBC uses registers 1/2 as a 16-byte IV.
+*/
+   if (!strcmp(crypto_tfm_alg_name(tfm), "xts(aes)"))
+   /* Fixed sized sector */
+   *((u64 *)areq->info + 1) = cpu_to_be64((1 << SECTOR_SHIFT));
+
return talitos_edesc_alloc(ctx->dev, NULL, areq->src, areq->dst,
   areq->info, 0, areq->nbytes, 0, ivsize, 0,
   areq->base.flags, &areq->base, encrypt);
@@ -2192,6 +2207,24 @@ static struct talitos_alg_template driver_algs[] = {
 DESC_HDR_MODE0_DEU_CBC |
 DESC_HDR_MODE0_DEU_3DES,
},
+   {   .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
+   .alg.crypto = {
+   .cra_name = "xts(aes)",
+   .cra_driver_name = "xts-aes-talitos",
+   .cra_blocksize = XTS_BLOCK_SIZE,
+   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER |
+ CRYPTO_ALG_ASYNC,
+   .cra_ablkcipher = {
+   .min_keysize = AES_MIN_KEY_SIZE * 2,
+   .max_keysize = AES_MAX_KEY_SIZE * 2,
+   .ivsize = XTS_BLOCK_SIZE,
+   }
+   },
+   .desc_hdr_template = DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU |
+   DESC_HDR_SEL0_AESU |
+   DESC_HDR_MODE0_AESU_XTS,
+   },
+
/* AHASH algorithms. */
{   .type = CRYPTO_ALG_TYPE_AHASH,
.alg.hash = {
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index a6f73e2..735da82 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -316,6 +316,7 @@ extern int talitos_submit(struct device *dev, int ch, 
struct talitos_edesc *edes
 /* primary execution unit mode (MODE0) and derivatives */
 #define	DESC_HDR_MODE0_ENCRYPT		cpu_to_be32(0x00100000)
 #define	DESC_HDR_MODE0_AESU_CBC		cpu_to_be32(0x00200000)
+#define	DESC_HDR_MODE0_AESU_XTS		cpu_to_be32(0x04200000)
 #define	DESC_HDR_MODE0_DEU_CBC		cpu_to_be32(0x00400000)
 #define	DESC_HDR_MODE0_DEU_3DES		cpu_to_be32(0x00200000)
 #define	DESC_HDR_MODE0_MDEU_CONT	cpu_to_be32(0x08000000)
-- 
1.7.10.4



Re: [PATCH 0/5] crypto: talitos: Add crypto async queue handling

2015-02-20 Thread Martin Hicks
Resending to linux-crypto in plain text.
Sorry to everyone else for the duplication.
mh

On Fri, Feb 20, 2015 at 1:23 PM, Martin Hicks m...@bork.org wrote:

 I've just noticed that performance is pretty terrible with maxcpus=1, so
 I'll investigate that.
 mh

 On Fri, Feb 20, 2015 at 11:21 AM, Martin Hicks m...@bork.org wrote:

 I was testing dm-crypt performance with a Freescale P1022 board with
 a recent kernel and was getting IO errors while doing testing with LUKS.
 Investigation showed that all hardware FIFO slots were filling and
 the driver was returning EAGAIN to the block layer, which is not an
 expected response for an async crypto implementation.

 The following patch series adds a few small fixes, and reworks the
 submission path to use the crypto_queue mechanism to handle the
 request backlog.


 Martin Hicks (5):
   crypto: talitos: Simplify per-channel initialization
   crypto: talitos: Remove MD5_BLOCK_SIZE
   crypto: talitos: Fix off-by-one and use all hardware slots
   crypto: talitos: Reorganize request submission data structures
   crypto: talitos: Add software backlog queue handling

  drivers/crypto/talitos.c |  189
 --
  drivers/crypto/talitos.h |   44 +--
  2 files changed, 137 insertions(+), 96 deletions(-)

 --
 1.7.10.4




 --
 Martin Hicks P.Eng.  | m...@bork.org
 Bork Consulting Inc. |   +1 (613) 266-2296



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296