[PATCH v2] crypto: talitos: Extend max key length for SHA384/512-HMAC and AEAD

2017-05-02 Thread Martin Hicks

An updated patch that also handles the additional key length requirements
for the AEAD algorithms.

The max keysize is not 96.  For SHA384/512 it's 128, and for the AEAD
algorithms it's longer still.  Extend the max keysize for the
AEAD size for AES256 + HMAC(SHA512).

Cc:  # 3.6+
Fixes: 357fb60502ede ("crypto: talitos - add sha224, sha384 and sha512 to 
existing AEAD algorithms")
Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 0bba6a1..79791c6 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -816,7 +816,7 @@ static void talitos_unregister_rng(struct device *dev)
  * HMAC_SNOOP_NO_AFEA (HSNA) instead of type IPSEC_ESP
  */
 #define TALITOS_CRA_PRIORITY_AEAD_HSNA (TALITOS_CRA_PRIORITY - 1)
-#define TALITOS_MAX_KEY_SIZE   96
+#define TALITOS_MAX_KEY_SIZE   (AES_MAX_KEY_SIZE + SHA512_BLOCK_SIZE)
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, 
DES3_EDE_BLOCK_SIZE */
 
 struct talitos_ctx {
@@ -1495,6 +1495,11 @@ static int ablkcipher_setkey(struct crypto_ablkcipher 
*cipher,
 {
struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher);
 
+   if (keylen > TALITOS_MAX_KEY_SIZE) {
+   crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   return -EINVAL;
+   }
+
memcpy(&ctx->key, key, keylen);
ctx->keylen = keylen;
 
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


[PATCH] crypto: talitos: Extend max key length for SHA384/512-HMAC

2017-04-27 Thread Martin Hicks

The max keysize for both of these is 128, not 96.  Before, with keysizes
over 96, the memcpy in ahash_setkey() would overwrite memory beyond the
key field.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 0bba6a1..97dc85e 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -816,7 +816,7 @@ static void talitos_unregister_rng(struct device *dev)
  * HMAC_SNOOP_NO_AFEA (HSNA) instead of type IPSEC_ESP
  */
 #define TALITOS_CRA_PRIORITY_AEAD_HSNA (TALITOS_CRA_PRIORITY - 1)
-#define TALITOS_MAX_KEY_SIZE   96
+#define TALITOS_MAX_KEY_SIZE   SHA512_BLOCK_SIZE /* SHA512 has the 
largest keysize input */
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, 
DES3_EDE_BLOCK_SIZE */
 
 struct talitos_ctx {
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296


Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-13 Thread Martin Hicks
Hi Horia,

On Wed, Mar 11, 2015 at 11:48 AM, Horia Geantă
 wrote:
>
> While here: note that xts-talitos supports only two key lengths - 256
> and 512 bits. There are tcrypt speed tests that check also for 384-bit
> keys (which is out-of-spec, but still...), leading to a "Key Size Error"
> - see below (KSE bit in AESU Interrupt Status Register is set)

Ok.  I've limited the keysize to 32 or 64 bytes for AES-XTS in the
talitos driver.

This was my first experiments with the tcrypt module.  It also brought
up another issue related to the IV limitations of this hardware.  The
latest patch that I have returns an error when there is a non-zero
value in the second 8 bytes of the IV:

+   /*
+* AES-XTS uses the first two AES Context registers for:
+*
+* Register 1:   Sector Number (Little Endian)
+* Register 2:   Sector Size   (Big Endian)
+*
+* Whereas AES-CBC uses registers 1/2 as a 16-byte IV.
+*/
+   if ((ctx->desc_hdr_template &
+(DESC_HDR_SEL0_MASK | DESC_HDR_MODE0_MASK)) ==
+(DESC_HDR_SEL0_AESU | DESC_HDR_MODE0_AESU_XTS)) {
+   u64 *aesctx2 = (u64 *)areq->info + 1;
+
+   if (*aesctx2 != 0) {
+   dev_err(ctx->dev,
+   "IV length limited to the first 8 bytes.");
+   return ERR_PTR(-EINVAL);
+   }
+
+   /* Fixed sized sector */
+   *aesctx2 = cpu_to_be64(1 << SECTOR_SHIFT);
+   }


This approach causes the tcrypt tests to fail because tcrypt sets all
16 bytes of the IV to 0xff.  I think returning an error is the right
approach for the talitos module, but it would be nice if tcrypt still
worked.  Should tcrypt just set the IV bytes to 0 instead of 0xff?
Isn't one IV just as good as another?  I think adding exceptions to
the tcrypt code would be ugly, but maybe one should be made for XTS
since the standard dictates that the IV should be plain or plain64?

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-09 Thread Martin Hicks
On Mon, Mar 9, 2015 at 6:16 AM, Horia Geantă  wrote:
> On 3/3/2015 7:44 PM, Martin Hicks wrote:
>> On Tue, Mar 3, 2015 at 10:44 AM, Horia Geantă
>>  wrote:
>>>
>>> For talitos, there are two cases:
>>>
>>> 1. request data size is <= data unit / sector size
>>> talitos can handle any IV / tweak scheme
>>>
>>> 2. request data size > sector size
>>> since talitos internally generates the IV for the next sector by
>>> incrementing the previous IV, only IV schemes that allocate consecutive
>>> IV to consecutive sectors will function correctly.
>>>
>>
>> it's not clear to me that #1 is right.  I guess it could be, but the
>> IV length would be limited to 8 bytes.
>
> Yes, there's a limitation in talitos wrt. XTS IV / tweak size - it's up
> to 8 bytes.
> So I guess ESSIV won't work with talitos-xts, since the encrypted IV
> output is 16 bytes.
> But as previously said, ESSIV breaks the XTS standard requirement for
> having a consecutive IV for consecutive blocks. ESSIV should really be
> used only with disk-level encryption schemes that require an
> unpredictable IV.

Ok.  I'll verify that the second half of the IV is zeroed.

One last thing that I'm not sure of is what string to place in
cra_ablkcipher.geniv field.   "eseqiv" seems wrong if plain/plain64
are the IVs that XTS is designed for.

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/2] crypto: talitos: Add AES-XTS Support

2015-03-06 Thread Martin Hicks
Hi Kim,

On Fri, Mar 6, 2015 at 11:49 AM, Martin Hicks  wrote:
> On Thu, Mar 5, 2015 at 7:16 PM, Kim Phillips  
> wrote:
>> On Fri, 20 Feb 2015 12:00:10 -0500
>> Martin Hicks  wrote:
>>
>>> The newer talitos hardware has support for AES in XTS mode.
>>
>> Assuming it's the same thing, AES-XCBC gets added with SEC v3.0
>> h/w.  Assuming hw_supports() doesn't already support this algorithm
>
> AES-XCBC isn't the same thing as AES-XTS.
>
>> combination (technically via the mode bit), this needs to be
>> reflected in the patch so the driver doesn't think SEC 2.x devices
>> can do XTS, too.
>
> Right.  I hadn't looked into how exactly hw_supports() works.  It only
> indicates which execution units are present (in this case the AES
> unit).  I actually think XTS gets introduced in SEC v3.3.2.  I also
> have an MPC8379 (sec3.3) and it does not have XTS.
>

I'm wrong about the 8379.  It's SEC3.0.  So XTS was introduced
somewhere between 3.0 and 3.3.2

> Can you look internally to find out in which hardware it was
> introduced?  Is there a SEC 3.3.1 that also has XTS?

Can you still look into where XTS was added?  I'm not even sure how
many versions of hardware exist between 3.0 and 3.3.2

Thanks,
mh

>
> I guess I just need a ->features flag to conditionally register the
> XTS algorithm for 3.3.x and newer.  Adding anything more complicated
> doesn't seem warranted at this time, although that could change if
> someone came along and needed to add a whole whack more of the AES
> modes that were introduced at various times in the different SEC
> revisions.
>
>>
>>> + .desc_hdr_template = DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU |
>>> + DESC_HDR_SEL0_AESU |
>>> + DESC_HDR_MODE0_AESU_XTS,
>>
>> ...
>>
>>>  /* primary execution unit mode (MODE0) and derivatives */
>>>  #define  DESC_HDR_MODE0_ENCRYPT  cpu_to_be32(0x0010)
>>>  #define  DESC_HDR_MODE0_AESU_CBC cpu_to_be32(0x0020)
>>> +#define  DESC_HDR_MODE0_AESU_XTS cpu_to_be32(0x0420)
>>
>> I'd prefer these defines be kept as single bit definitions, and keep
>> their names from the manual.  An XTS-specific definition can be
>> composed from them either after this, or manually in the
>> desc_hdr_template assignment above.
>
> It doesn't look like there are divisible single-bit definitions for
> the MODE0 bits. The AES Cipher modes are composed of 5 bits called
> CipherMode, Extended CipherMode and Aux2.  Individually they don't
> seem to mean anything.  Unless you want something like:
>
> #define AES_MODE(cm, ecm, aux2)   cpu_to_be32(((ecm<<6) | (aux2<<5) |
> (cm<<1)) << 20)
>
> #define DESC_HDR_MODE0_AESU_CBC  AES_MODE(1, 0, 0)
> #define DESC_HDR_MODE0_AESU_XTS   AES_MODE(1, 1, 0)
>
> Thanks,
> mh
>
> --
> Martin Hicks P.Eng.  | m...@bork.org
> Bork Consulting Inc. |   +1 (613) 266-2296



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/2] crypto: talitos: Add AES-XTS Support

2015-03-06 Thread Martin Hicks
On Thu, Mar 5, 2015 at 7:16 PM, Kim Phillips  wrote:
> On Fri, 20 Feb 2015 12:00:10 -0500
> Martin Hicks  wrote:
>
>> The newer talitos hardware has support for AES in XTS mode.
>
> Assuming it's the same thing, AES-XCBC gets added with SEC v3.0
> h/w.  Assuming hw_supports() doesn't already support this algorithm

AES-XCBC isn't the same thing as AES-XTS.

> combination (technically via the mode bit), this needs to be
> reflected in the patch so the driver doesn't think SEC 2.x devices
> can do XTS, too.

Right.  I hadn't looked into how exactly hw_supports() works.  It only
indicates which execution units are present (in this case the AES
unit).  I actually think XTS gets introduced in SEC v3.3.2.  I also
have an MPC8379 (sec3.3) and it does not have XTS.

Can you look internally to find out in which hardware it was
introduced?  Is there a SEC 3.3.1 that also has XTS?

I guess I just need a ->features flag to conditionally register the
XTS algorithm for 3.3.x and newer.  Adding anything more complicated
doesn't seem warranted at this time, although that could change if
someone came along and needed to add a whole whack more of the AES
modes that were introduced at various times in the different SEC
revisions.

>
>> + .desc_hdr_template = DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU |
>> + DESC_HDR_SEL0_AESU |
>> + DESC_HDR_MODE0_AESU_XTS,
>
> ...
>
>>  /* primary execution unit mode (MODE0) and derivatives */
>>  #define  DESC_HDR_MODE0_ENCRYPT  cpu_to_be32(0x0010)
>>  #define  DESC_HDR_MODE0_AESU_CBC cpu_to_be32(0x0020)
>> +#define  DESC_HDR_MODE0_AESU_XTS cpu_to_be32(0x0420)
>
> I'd prefer these defines be kept as single bit definitions, and keep
> their names from the manual.  An XTS-specific definition can be
> composed from them either after this, or manually in the
> desc_hdr_template assignment above.

It doesn't look like there are divisible single-bit definitions for
the MODE0 bits. The AES Cipher modes are composed of 5 bits called
CipherMode, Extended CipherMode and Aux2.  Individually they don't
seem to mean anything.  Unless you want something like:

#define AES_MODE(cm, ecm, aux2)   cpu_to_be32(((ecm<<6) | (aux2<<5) |
(cm<<1)) << 20)

#define DESC_HDR_MODE0_AESU_CBC  AES_MODE(1, 0, 0)
#define DESC_HDR_MODE0_AESU_XTS   AES_MODE(1, 1, 0)

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-03-04 Thread Martin Hicks
Ok, I'm fine dropping this patch.  I'm sure it doesn't affect
performance in a measurable way.

mh

On Tue, Mar 3, 2015 at 7:35 PM, Kim Phillips  wrote:
> On Tue,  3 Mar 2015 08:21:35 -0500
> Martin Hicks  wrote:
>
>> The submission count was off by one.
>>
>> Signed-off-by: Martin Hicks 
>> ---
> sadly, this directly contradicts:
>
> commit 4b24ea971a93f5d0bec34bf7bfd0939f70cfaae6
> Author: Vishnu Suresh 
> Date:   Mon Oct 20 21:06:18 2008 +0800
>
> crypto: talitos - Preempt overflow interrupts off-by-one fix
>
> My guess is your request submission pattern differs from that of
> Vishnu's (probably IPSec and/or tcrypt), or later h/w versions have
> gotten better about dealing with channel near-overflow conditions.
> Either way, I'd prefer we not do this: it might break others, and
> I'm guessing doesn't improve performance _that_ much?
>
> If it does, we could risk it and restrict it to SEC versions 3.3 and
> above maybe?  Not sure what to do here exactly, barring digging up
> and old 2.x SEC and testing.
>
> Kim
>
> p.s. I checked, Vishnu isn't with Freescale anymore, so I can't
> cc him.



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-03 Thread Martin Hicks
On Tue, Mar 3, 2015 at 10:44 AM, Horia Geantă
 wrote:
> On 3/3/2015 12:09 AM, Martin Hicks wrote:
>>
>> On Mon, Mar 02, 2015 at 03:37:28PM +0100, Milan Broz wrote:
>>>
>>> If crypto API allows to encrypt more sectors in one run
>>> (handling IV internally) dmcrypt can be modified of course.
>>>
>>> But do not forget we can use another IV (not only sequential number)
>>> e.g. ESSIV with XTS as well (even if it doesn't make much sense, some people
>>> are using it).
>>
>> Interesting, I'd not considered using XTS with an IV other than plain/64.
>> The talitos hardware would not support aes/xts in any mode other than
>> plain/plain64 I don't think...Although perhaps you could push in an 8-byte
>> IV and the hardware would interpret it as the sector #.
>>
>
> For talitos, there are two cases:
>
> 1. request data size is <= data unit / sector size
> talitos can handle any IV / tweak scheme
>
> 2. request data size > sector size
> since talitos internally generates the IV for the next sector by
> incrementing the previous IV, only IV schemes that allocate consecutive
> IV to consecutive sectors will function correctly.
>

it's not clear to me that #1 is right.  I guess it could be, but the
IV length would be limited to 8 bytes.

This also points out that claiming that the XTS IV size is 16 bytes,
as my current patch does, could be problematic.  It's handy because
the first 8 bytes should contain a plain64 sector #, and the second
u64 can be used to encode the sector size but it would be a mistake
for someone to use the second 8 bytes for the rest of a 16byte IV.

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 5/5] crypto: talitos: Add software backlog queue handling

2015-03-03 Thread Martin Hicks
I was running into situations where the hardware FIFO was filling up, and
the code was returning EAGAIN to dm-crypt and just dropping the submitted
crypto request.

This adds support in talitos for a software backlog queue.  When requests
can't be queued to the hardware immediately EBUSY is returned.  The queued
requests are dispatched to the hardware in received order as hardware FIFO
slots become available.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |  135 --
 drivers/crypto/talitos.h |3 ++
 2 files changed, 110 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index b0c85ce..bb7fba0 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -182,55 +182,118 @@ static int init_device(struct device *dev)
return 0;
 }
 
-/**
- * talitos_submit - submits a descriptor to the device for processing
- * @dev:   the SEC device to be used
- * @ch:the SEC device channel to be used
- * @edesc: the descriptor to be processed by the device
- *
- * desc must contain valid dma-mapped (bus physical) address pointers.
- * callback must check err and feedback in descriptor header
- * for device processing status.
- */
-int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+/* Dispatch 'request' if provided, otherwise a backlogged request */
+static int __talitos_handle_queue(struct device *dev, int ch,
+ struct talitos_edesc *edesc,
+ unsigned long *irq_flags)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request = &edesc->req;
-   unsigned long flags;
+   struct talitos_request *request;
+   struct crypto_async_request *areq;
int head;
+   int ret = -EINPROGRESS;
 
-   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
-
-   if (!atomic_inc_not_zero(&priv->chan[ch].submit_count)) {
+   if (!atomic_inc_not_zero(&priv->chan[ch].submit_count))
/* h/w fifo is full */
-   spin_unlock_irqrestore(&priv->chan[ch].head_lock, flags);
-   return -EAGAIN;
+   return -EBUSY;
+
+   if (!edesc) {
+   /* Dequeue the oldest request */
+   areq = crypto_dequeue_request(&priv->chan[ch].queue);
+   request = container_of(areq, struct talitos_request, base);
+   } else {
+   request = &edesc->req;
}
 
-   head = priv->chan[ch].head;
request->dma_desc = dma_map_single(dev, request->desc,
   sizeof(*request->desc),
   DMA_BIDIRECTIONAL);
 
/* increment fifo head */
+   head = priv->chan[ch].head;
priv->chan[ch].head = (priv->chan[ch].head + 1) & (priv->fifo_len - 1);
 
-   smp_wmb();
-   priv->chan[ch].fifo[head] = request;
+   spin_unlock_irqrestore(&priv->chan[ch].head_lock, *irq_flags);
+
+   /*
+* Mark a backlogged request as in-progress.
+*/
+   if (!edesc) {
+   areq = request->context;
+   areq->complete(areq, -EINPROGRESS);
+   }
+
+   spin_lock_irqsave(&priv->chan[ch].head_lock, *irq_flags);
 
/* GO! */
+   priv->chan[ch].fifo[head] = request;
wmb();
out_be32(priv->chan[ch].reg + TALITOS_FF,
 upper_32_bits(request->dma_desc));
out_be32(priv->chan[ch].reg + TALITOS_FF_LO,
 lower_32_bits(request->dma_desc));
 
+   return ret;
+}
+
+/**
+ * talitos_submit - performs submissions of a new descriptors
+ *
+ * @dev:   the SEC device to be used
+ * @ch:the SEC device channel to be used
+ * @edesc: the request to be processed by the device
+ *
+ * edesc->req must contain valid dma-mapped (bus physical) address pointers.
+ * callback must check err and feedback in descriptor header
+ * for device processing status upon completion.
+ */
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+{
+   struct talitos_private *priv = dev_get_drvdata(dev);
+   struct talitos_request *request = &edesc->req;
+   unsigned long flags;
+   int ret = -EINPROGRESS;
+
+   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
+
+   if (priv->chan[ch].queue.qlen) {
+   /*
+* There are backlogged requests.  Just queue this new request
+* and dispatch the oldest backlogged request to the hardware.
+*/
+   crypto_enqueue_request(&priv->chan[ch].queue,
+  &request->base);
+   __talitos_handle_queue(dev, ch, N

[PATCH v2 4/5] crypto: talitos: Reorganize request submission data structures

2015-03-03 Thread Martin Hicks
This is preparatory work for moving to using the crypto async queue
handling code.  A talitos_request structure is buried into each
talitos_edesc so that when talitos_submit() is called, everything required
to defer the submission to the hardware is contained within talitos_edesc.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   95 +++---
 drivers/crypto/talitos.h |   41 +---
 2 files changed, 66 insertions(+), 70 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 7709805..b0c85ce 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -186,22 +186,16 @@ static int init_device(struct device *dev)
  * talitos_submit - submits a descriptor to the device for processing
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @desc:  the descriptor to be processed by the device
- * @callback:  whom to call when processing is complete
- * @context:   a handle for use by caller (optional)
+ * @edesc: the descriptor to be processed by the device
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
  * for device processing status.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
-  void (*callback)(struct device *dev,
-   struct talitos_desc *desc,
-   void *context, int error),
-  void *context)
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request;
+   struct talitos_request *request = &edesc->req;
unsigned long flags;
int head;
 
@@ -214,19 +208,15 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_desc *desc,
}
 
head = priv->chan[ch].head;
-   request = &priv->chan[ch].fifo[head];
-
-   /* map descriptor and save caller data */
-   request->dma_desc = dma_map_single(dev, desc, sizeof(*desc),
+   request->dma_desc = dma_map_single(dev, request->desc,
+  sizeof(*request->desc),
   DMA_BIDIRECTIONAL);
-   request->callback = callback;
-   request->context = context;
 
/* increment fifo head */
priv->chan[ch].head = (priv->chan[ch].head + 1) & (priv->fifo_len - 1);
 
smp_wmb();
-   request->desc = desc;
+   priv->chan[ch].fifo[head] = request;
 
/* GO! */
wmb();
@@ -247,15 +237,16 @@ EXPORT_SYMBOL(talitos_submit);
 static void flush_channel(struct device *dev, int ch, int error, int reset_ch)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request, saved_req;
+   struct talitos_request *request;
unsigned long flags;
int tail, status;
 
spin_lock_irqsave(&priv->chan[ch].tail_lock, flags);
 
tail = priv->chan[ch].tail;
-   while (priv->chan[ch].fifo[tail].desc) {
-   request = &priv->chan[ch].fifo[tail];
+   while (priv->chan[ch].fifo[tail]) {
+   request = priv->chan[ch].fifo[tail];
+   status = 0;
 
/* descriptors with their done bits set don't get the error */
rmb();
@@ -271,14 +262,9 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 sizeof(struct talitos_desc),
 DMA_BIDIRECTIONAL);
 
-   /* copy entries so we can call callback outside lock */
-   saved_req.desc = request->desc;
-   saved_req.callback = request->callback;
-   saved_req.context = request->context;
-
/* release request entry in fifo */
smp_wmb();
-   request->desc = NULL;
+   priv->chan[ch].fifo[tail] = NULL;
 
/* increment fifo tail */
priv->chan[ch].tail = (tail + 1) & (priv->fifo_len - 1);
@@ -287,8 +273,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 
atomic_dec(&priv->chan[ch].submit_count);
 
-   saved_req.callback(dev, saved_req.desc, saved_req.context,
-  status);
+   request->callback(dev, request->desc, request->context, status);
+
/* channel may resume processing in single desc error case */
if (error && !reset_ch && status == error)
return;
@@ -352,7 +338,8 @@ static u32 current_desc_hdr(struct device *dev, int ch)
tail = priv-&

[PATCH v2 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-03-03 Thread Martin Hicks
The submission count was off by one.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 89cf4d5..7709805 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2722,8 +2722,7 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
-   atomic_set(&priv->chan[i].submit_count,
-  -(priv->chfifo_len - 1));
+   atomic_set(&priv->chan[i].submit_count, -priv->chfifo_len);
}
 
dma_set_mask(dev, DMA_BIT_MASK(36));
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 2/5] crypto: talitos: Remove MD5_BLOCK_SIZE

2015-03-03 Thread Martin Hicks
This is properly defined in the md5 header file.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index c49d977..89cf4d5 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -637,8 +637,6 @@ static void talitos_unregister_rng(struct device *dev)
 #define TALITOS_MAX_KEY_SIZE   96
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, 
DES3_EDE_BLOCK_SIZE */
 
-#define MD5_BLOCK_SIZE64
-
 struct talitos_ctx {
struct device *dev;
int ch;
@@ -2195,7 +2193,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "md5",
.cra_driver_name = "md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
@@ -2285,7 +2283,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "hmac(md5)",
.cra_driver_name = "hmac-md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 0/5] crypto: talitos: Add crypto async queue handling

2015-03-03 Thread Martin Hicks
I was testing dm-crypt performance with a Freescale P1022 board with 
a recent kernel and was getting IO errors while doing testing with LUKS.
Investigation showed that all hardware FIFO slots were filling and
the driver was returning EAGAIN to the block layer, which is not an
expected response for an async crypto implementation.

The following patch series adds a few small fixes, and reworks the
submission path to use the crypto_queue mechanism to handle the
request backlog.

Changes since v1:

- Ran checkpatch.pl
- Split the path for submitting new requests vs. issuing backlogged
  requests.
- Avoid enqueuing a submitted request to the crypto queue unnecessarily.
- Fix return paths where CRYPTO_TFM_REQ_MAY_BACKLOG is not set.


Martin Hicks (5):
  crypto: talitos: Simplify per-channel initialization
  crypto: talitos: Remove MD5_BLOCK_SIZE
  crypto: talitos: Fix off-by-one and use all hardware slots
  crypto: talitos: Reorganize request submission data structures
  crypto: talitos: Add software backlog queue handling

 drivers/crypto/talitos.c |  240 +++---
 drivers/crypto/talitos.h |   44 +++--
 2 files changed, 177 insertions(+), 107 deletions(-)

-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 1/5] crypto: talitos: Simplify per-channel initialization

2015-03-03 Thread Martin Hicks
There were multiple loops in a row, for each separate step of the
initialization of the channels.  Simplify to a single loop.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 067ec21..c49d977 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2706,20 +2706,16 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
+   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
+
for (i = 0; i < priv->num_channels; i++) {
priv->chan[i].reg = priv->reg + TALITOS_CH_STRIDE * (i + 1);
if (!priv->irq[1] || !(i & 1))
priv->chan[i].reg += TALITOS_CH_BASE_OFFSET;
-   }
 
-   for (i = 0; i < priv->num_channels; i++) {
spin_lock_init(&priv->chan[i].head_lock);
spin_lock_init(&priv->chan[i].tail_lock);
-   }
 
-   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
-
-   for (i = 0; i < priv->num_channels; i++) {
priv->chan[i].fifo = kzalloc(sizeof(struct talitos_request) *
 priv->fifo_len, GFP_KERNEL);
if (!priv->chan[i].fifo) {
@@ -2727,11 +2723,10 @@ static int talitos_probe(struct platform_device *ofdev)
err = -ENOMEM;
goto err_out;
}
-   }
 
-   for (i = 0; i < priv->num_channels; i++)
atomic_set(&priv->chan[i].submit_count,
   -(priv->chfifo_len - 1));
+   }
 
dma_set_mask(dev, DMA_BIT_MASK(36));
 
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks

On Mon, Mar 02, 2015 at 03:37:28PM +0100, Milan Broz wrote:
> 
> If crypto API allows to encrypt more sectors in one run
> (handling IV internally) dmcrypt can be modified of course.
> 
> But do not forget we can use another IV (not only sequential number)
> e.g. ESSIV with XTS as well (even if it doesn't make much sense, some people
> are using it).

Interesting, I'd not considered using XTS with an IV other than plain/64.
The talitos hardware would not support aes/xts in any mode other than
plain/plain64 I don't think...Although perhaps you could push in an 8-byte
IV and the hardware would interpret it as the sector #.

> Maybe the following question would be if the dmcrypt sector IV algorithms
> should moved into crypto API as well.
> (But because I misused dmcrypt IVs hooks for some additional operations
> for loopAES and old Truecrypt CBC mode, it is not so simple...)

Speaking again with talitos in mind, there would be no advantage for this
hardware.  Although larger requests are possible only a single IV can be
provided per request, so for algorithms like AES-CBC and dm-crypt 512byte IOs
are the only option (short of switching to 4kB block size).

mh

-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks
On Mon, Mar 02, 2015 at 04:44:19PM -0500, Martin Hicks wrote:
> 
>   Write (MB/s)Read (MB/s)
> Unencrypted   140 176
> aes-xts-plain64 512b  113 115
> aes-xts-plain64 4kB   71  56

I got the two AES lines backwards.  Sorry about that.

mh

-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-03-02 Thread Martin Hicks
On Mon, Mar 02, 2015 at 03:25:56PM +0200, Horia Geantă wrote:
> On 2/20/2015 7:00 PM, Martin Hicks wrote:
> > This adds the AES-XTS mode, supported by the Freescale SEC 3.3.2.
> > 
> > One of the nice things about this hardware is that it knows how to deal
> > with encrypt/decrypt requests that are larger than sector size, but that 
> > also requires that that the sector size be passed into the crypto engine
> > as an XTS cipher context parameter.
> > 
> > When a request is larger than the sector size the sector number is
> > incremented by the talitos engine and the tweak key is re-calculated
> > for the new sector.
> > 
> > I've tested this with 256bit and 512bit keys (tweak and data keys of 128bit
> > and 256bit) to ensure interoperability with the software AES-XTS
> > implementation.  All testing was done using dm-crypt/LUKS with
> > aes-xts-plain64.
> > 
> > Is there a better solution that just hard coding the sector size to
> > (1< > sector size along with the plain/plain64 IV to an XTS algorithm?
> 
> AFAICT, SW implementation of xts mode in kernel (crypto/xts.c) is not
> aware of a sector size ("data unit size" in IEEE P1619 terminology):
> There's a hidden assumption that all the data send to xts in one request
> belongs to a single sector. Even more, it's supposed that the first
> 16-byte block in the request is "block 0" in the sector. These can be
> seen from the way the tweak ("T") value is computed.
> (Side note: there's no support of ciphertext stealing in crypto/xts.c -
> i.e. sector sizes must be a multiple of underlying block cipher size -
> that is 16B.)
> 
> If dm-crypt would be modified to pass sector size somehow, all in-kernel
> xts implementations would have to be made aware of the change.
> I have nothing against this, but let's see what crypto maintainers have
> to say...

Right.  Additionally, there may be some requirement for the encryption
implementation to broadcast the maximum size that can be handled in a single
request.  For example Talitos could handle XTS encrypt/decrypt requests of
up to 64kB (regardless of the block device's sector size).

> BTW, there were some discussions back in 2013 wrt. being able to
> configure / increase sector size, smth. crypto engines would benefit from:
> http://www.saout.de/pipermail/dm-crypt/2013-January/003125.html
> (experimental patch)
> http://www.saout.de/pipermail/dm-crypt/2013-March/003202.html
> 
> The experimental patch sends sector size as the req->nbytes - hidden
> assumption: data size sent in an xts crypto request equals a sector.

I found this last week, and used it as a starting point for some testing.  I
modified it to keep the underlying sector size of the dm-crypt mapping as
512byte, but allowed the code to combine requests in IOs up to 4kB.  Doing
greater request sizes would require allocating additional pages...I plan to
implement that to see how much extra performance can be squeezed out.

patch below...

With regards to performance, with my low-powered Freescale P1022 board, I see
performance numbers like this on ext4, as measured by bonnie++.

Write (MB/s)Read (MB/s)
Unencrypted 140 176
aes-xts-plain64 512b113 115
aes-xts-plain64 4kB 71  56

The more detailed bonnie++ output is here:
http://www.bork.org/~mort/dm-crypt-enc-blksize.html

The larger IO sizes is a huge win for this board.

The patch I'm using to send IOs up to 4kB to talitos follows.

Thanks,
mh


diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 08981be..88e95b5 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -42,6 +42,7 @@ struct convert_context {
struct bvec_iter iter_out;
sector_t cc_sector;
atomic_t cc_pending;
+   unsigned int block_size;
struct ablkcipher_request *req;
 };
 
@@ -142,6 +143,8 @@ struct crypt_config {
sector_t iv_offset;
unsigned int iv_size;
 
+   unsigned int block_size;
+
/* ESSIV: struct crypto_cipher *essiv_tfm */
void *iv_private;
struct crypto_ablkcipher **tfms;
@@ -801,10 +804,17 @@ static void crypt_convert_init(struct crypt_config *cc,
 {
ctx->bio_in = bio_in;
ctx->bio_out = bio_out;
-   if (bio_in)
+   ctx->block_size = 0;
+   if (bio_in) {
ctx->iter_in = bio_in->bi_iter;
-   if (bio_out)
+   ctx->block_size = max(ctx->block_size, bio_cur_bytes(bio_in));
+   }
+   if (bio_out) {
ctx->iter_out = bio_out->bi_iter;
+   ctx->block_size = max(ctx->block_size, bio_cur_bytes(bio_out));
+   }
+   if (ctx->block_size > cc->blo

Re: [PATCH 5/5] crypto: talitos: Add software backlog queue handling

2015-02-26 Thread Martin Hicks
Hi Horia,

On Tue, Feb 24, 2015 at 1:21 PM, Horia Geantă
 wrote:
> On 2/20/2015 6:21 PM, Martin Hicks wrote:
>> + int ret = -EINPROGRESS;
>>
>>   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
>>
>> + if (edesc) {
>> + orig_request = &edesc->req;
>> + crypto_enqueue_request(&priv->chan[ch].queue, 
>> &orig_request->base);
>> + }
>
> The request goes through the SW queue even if there are empty slots in
> the HW queue, doing unnecessary crypto_queue_encrypt() and
> crypto_queue_decrypt(). Trying to use the HW queue first would be better.
>

Definitely a valid point.  In trying to reorganize this it really
complicates things.
Splitting the submit from issuing requests from the backlog seems to
make it much more straightforward.

>> @@ -1170,6 +1211,8 @@ static struct talitos_edesc 
>> *talitos_edesc_alloc(struct device *dev,
>>edesc->dma_len,
>>DMA_BIDIRECTIONAL);
>>   edesc->req.desc = &edesc->desc;
>> + /* A copy of the crypto_async_request to use the crypto_queue backlog 
>> */
>> + memcpy(&edesc->req.base, areq, sizeof(struct crypto_async_request));
>
> Why not have a
> struct crypto_async_request *base;
> instead of
> struct crypto_async_request base;
> in talitos_request structure?
>
> This way:
> 1. memcopy above is avoided
> 2. talitos_request structure gets smaller - remember that
> talitos_request is now embedded in talitos_edesc structure, which gets
> kmalloc-ed for every request

The trouble I ran into was that after a request is backlogged and put
on the crypto queue, I couldn't see how else I could go from the
crypto_async_request that is pulled from the queue back to the
talitos_request that is needed in order put the pointer into the
hardware FIFO.

This is the one issue from you review that I didn't address in v2 of the patch.

Thanks for the review.  Not releasing / re-acquiring the spinlock
while trying to flush the backlog really improved performance.

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/5] crypto: talitos: Add crypto async queue handling

2015-02-20 Thread Martin Hicks
I've just noticed that performance is pretty terrible with maxcpus=1, so
I'll investigate that.
mh

On Fri, Feb 20, 2015 at 11:21 AM, Martin Hicks  wrote:

> I was testing dm-crypt performance with a Freescale P1022 board with
> a recent kernel and was getting IO errors while doing testing with LUKS.
> Investigation showed that all hardware FIFO slots were filling and
> the driver was returning EAGAIN to the block layer, which is not an
> expected response for an async crypto implementation.
>
> The following patch series adds a few small fixes, and reworks the
> submission path to use the crypto_queue mechanism to handle the
> request backlog.
>
>
> Martin Hicks (5):
>   crypto: talitos: Simplify per-channel initialization
>   crypto: talitos: Remove MD5_BLOCK_SIZE
>   crypto: talitos: Fix off-by-one and use all hardware slots
>   crypto: talitos: Reorganize request submission data structures
>   crypto: talitos: Add software backlog queue handling
>
>  drivers/crypto/talitos.c |  189
> --
>  drivers/crypto/talitos.h |   44 +--
>  2 files changed, 137 insertions(+), 96 deletions(-)
>
> --
> 1.7.10.4
>
>


-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/5] crypto: talitos: Add crypto async queue handling

2015-02-20 Thread Martin Hicks
Resending to linux-crypto in plain text.
Sorry to everyone else for the duplication.
mh

On Fri, Feb 20, 2015 at 1:23 PM, Martin Hicks  wrote:
>
> I've just noticed that performance is pretty terrible with maxcpus=1, so
> I'll investigate that.
> mh
>
> On Fri, Feb 20, 2015 at 11:21 AM, Martin Hicks  wrote:
>>
>> I was testing dm-crypt performance with a Freescale P1022 board with
>> a recent kernel and was getting IO errors while doing testing with LUKS.
>> Investigation showed that all hardware FIFO slots were filling and
>> the driver was returning EAGAIN to the block layer, which is not an
>> expected response for an async crypto implementation.
>>
>> The following patch series adds a few small fixes, and reworks the
>> submission path to use the crypto_queue mechanism to handle the
>> request backlog.
>>
>>
>> Martin Hicks (5):
>>   crypto: talitos: Simplify per-channel initialization
>>   crypto: talitos: Remove MD5_BLOCK_SIZE
>>   crypto: talitos: Fix off-by-one and use all hardware slots
>>   crypto: talitos: Reorganize request submission data structures
>>   crypto: talitos: Add software backlog queue handling
>>
>>  drivers/crypto/talitos.c |  189
>> ------
>>  drivers/crypto/talitos.h |   44 +--
>>  2 files changed, 137 insertions(+), 96 deletions(-)
>>
>> --
>> 1.7.10.4
>>
>
>
>
> --
> Martin Hicks P.Eng.  | m...@bork.org
> Bork Consulting Inc. |   +1 (613) 266-2296



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 2/2] crypto: talitos: Add AES-XTS Support

2015-02-20 Thread Martin Hicks
The newer talitos hardware has support for AES in XTS mode.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   33 +
 drivers/crypto/talitos.h |1 +
 2 files changed, 34 insertions(+)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 6b2a19a..38cbde1 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -40,9 +40,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1464,9 +1466,22 @@ static struct talitos_edesc 
*ablkcipher_edesc_alloc(struct ablkcipher_request *
areq, bool encrypt)
 {
struct crypto_ablkcipher *cipher = crypto_ablkcipher_reqtfm(areq);
+   struct crypto_tfm *tfm = (struct crypto_tfm *)cipher;
struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher);
unsigned int ivsize = crypto_ablkcipher_ivsize(cipher);
 
+   /*
+* AES-XTS uses the first two AES Context registers for:
+*
+* Register 1:   Sector Number (Little Endian)
+* Register 2:   Sector Size   (Big Endian)
+*
+* Whereas AES-CBC uses registers 1/2 as a 16-byte IV.
+*/
+   if (!strcmp(crypto_tfm_alg_name(tfm),"xts(aes)"))
+   /* Fixed sized sector */
+   *((u64 *)areq->info + 1) = cpu_to_be64((1<dev, NULL, areq->src, areq->dst,
   areq->info, 0, areq->nbytes, 0, ivsize, 0,
   areq->base.flags, &areq->base, encrypt);
@@ -2192,6 +2207,24 @@ static struct talitos_alg_template driver_algs[] = {
 DESC_HDR_MODE0_DEU_CBC |
 DESC_HDR_MODE0_DEU_3DES,
},
+   {   .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
+   .alg.crypto = {
+   .cra_name = "xts(aes)",
+   .cra_driver_name = "xts-aes-talitos",
+   .cra_blocksize = XTS_BLOCK_SIZE,
+   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER |
+ CRYPTO_ALG_ASYNC,
+   .cra_ablkcipher = {
+   .min_keysize = AES_MIN_KEY_SIZE * 2,
+   .max_keysize = AES_MAX_KEY_SIZE * 2,
+   .ivsize = XTS_BLOCK_SIZE,
+   }
+   },
+   .desc_hdr_template = DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU |
+   DESC_HDR_SEL0_AESU |
+   DESC_HDR_MODE0_AESU_XTS,
+   },
+
/* AHASH algorithms. */
{   .type = CRYPTO_ALG_TYPE_AHASH,
.alg.hash = {
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index a6f73e2..735da82 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -316,6 +316,7 @@ extern int talitos_submit(struct device *dev, int ch, 
struct talitos_edesc *edes
 /* primary execution unit mode (MODE0) and derivatives */
 #defineDESC_HDR_MODE0_ENCRYPT  cpu_to_be32(0x0010)
 #defineDESC_HDR_MODE0_AESU_CBC cpu_to_be32(0x0020)
+#defineDESC_HDR_MODE0_AESU_XTS cpu_to_be32(0x0420)
 #defineDESC_HDR_MODE0_DEU_CBC  cpu_to_be32(0x0040)
 #defineDESC_HDR_MODE0_DEU_3DES cpu_to_be32(0x0020)
 #defineDESC_HDR_MODE0_MDEU_CONTcpu_to_be32(0x0800)
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/2] crypto: talitos: Clean ups and comment fixes for ablkcipher commands

2015-02-20 Thread Martin Hicks
This just cleans up some of the initializers, and improves the comments
should any other ablkcipher modes be added in the future.  The header
words 1 and 5 have more possibilities than just passing an IV.  These
are pointers to the Cipher Context in/out registers.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 226654c..6b2a19a 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -1377,11 +1377,9 @@ static int common_nonsnoop(struct talitos_edesc *edesc,
int sg_count, ret;
 
/* first DWORD empty */
-   desc->ptr[0].len = 0;
-   to_talitos_ptr(&desc->ptr[0], 0);
-   desc->ptr[0].j_extent = 0;
+   desc->ptr[0] = zero_entry;
 
-   /* cipher iv */
+   /* cipher context */
to_talitos_ptr(&desc->ptr[1], edesc->iv_dma);
desc->ptr[1].len = cpu_to_be16(ivsize);
desc->ptr[1].j_extent = 0;
@@ -1444,14 +1442,12 @@ static int common_nonsnoop(struct talitos_edesc *edesc,
   edesc->dma_len, DMA_BIDIRECTIONAL);
}
 
-   /* iv out */
+   /* cipher context out */
map_single_talitos_ptr(dev, &desc->ptr[5], ivsize, ctx->iv, 0,
   DMA_FROM_DEVICE);
 
/* last DWORD empty */
-   desc->ptr[6].len = 0;
-   to_talitos_ptr(&desc->ptr[6], 0);
-   desc->ptr[6].j_extent = 0;
+   desc->ptr[6] = zero_entry;
 
edesc->req.callback = callback;
edesc->req.context = areq;
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 0/2] crypto: talitos: Add AES-XTS mode

2015-02-20 Thread Martin Hicks
This adds the AES-XTS mode, supported by the Freescale SEC 3.3.2.

One of the nice things about this hardware is that it knows how to deal
with encrypt/decrypt requests that are larger than sector size, but that 
also requires that that the sector size be passed into the crypto engine
as an XTS cipher context parameter.

When a request is larger than the sector size the sector number is
incremented by the talitos engine and the tweak key is re-calculated
for the new sector.

I've tested this with 256bit and 512bit keys (tweak and data keys of 128bit
and 256bit) to ensure interoperability with the software AES-XTS
implementation.  All testing was done using dm-crypt/LUKS with
aes-xts-plain64.

Is there a better solution that just hard coding the sector size to
(1

[PATCH 5/5] crypto: talitos: Add software backlog queue handling

2015-02-20 Thread Martin Hicks
I was running into situations where the hardware FIFO was filling up, and
the code was returning EAGAIN to dm-crypt and just dropping the submitted
crypto request.

This adds support in talitos for a software backlog queue.  When requests
can't be queued to the hardware immediately EBUSY is returned.  The queued
requests are dispatched to the hardware in received order as hardware FIFO
slots become available.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   92 +++---
 drivers/crypto/talitos.h |3 ++
 2 files changed, 74 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index d3472be..226654c 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -183,43 +183,72 @@ static int init_device(struct device *dev)
 }
 
 /**
- * talitos_submit - submits a descriptor to the device for processing
+ * talitos_handle_queue - performs submissions either of new descriptors
+ *or ones waiting in the queue backlog.
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @edesc: the descriptor to be processed by the device
- * @context:   a handle for use by caller (optional)
+ * @edesc: the descriptor to be processed by the device (optional)
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
- * for device processing status.
+ * for device processing status upon completion.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
+int talitos_handle_queue(struct device *dev, int ch, struct talitos_edesc 
*edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request = &edesc->req;
+   struct talitos_request *request, *orig_request = NULL;
+   struct crypto_async_request *async_req;
unsigned long flags;
int head;
+   int ret = -EINPROGRESS; 
 
spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
 
+   if (edesc) {
+   orig_request = &edesc->req;
+   crypto_enqueue_request(&priv->chan[ch].queue, 
&orig_request->base);
+   }
+
+flush_another:
+   if (priv->chan[ch].queue.qlen == 0) {
+   spin_unlock_irqrestore(&priv->chan[ch].head_lock, flags);
+   return 0;
+   }
+
if (!atomic_inc_not_zero(&priv->chan[ch].submit_count)) {
/* h/w fifo is full */
spin_unlock_irqrestore(&priv->chan[ch].head_lock, flags);
-   return -EAGAIN;
+   return -EBUSY;
}
 
-   head = priv->chan[ch].head;
+   /* Dequeue the oldest request */
+   async_req = crypto_dequeue_request(&priv->chan[ch].queue);
+
+   request = container_of(async_req, struct talitos_request, base);
request->dma_desc = dma_map_single(dev, request->desc,
   sizeof(*request->desc),
   DMA_BIDIRECTIONAL);
 
/* increment fifo head */
+   head = priv->chan[ch].head;
priv->chan[ch].head = (priv->chan[ch].head + 1) & (priv->fifo_len - 1);
 
-   smp_wmb();
-   priv->chan[ch].fifo[head] = request;
+   spin_unlock_irqrestore(&priv->chan[ch].head_lock, flags);
+
+   /*
+* Mark a backlogged request as in-progress, return EBUSY because
+* the original request that was submitted is backlogged.
+*/
+   if (request != orig_request) {
+   struct crypto_async_request *areq = request->context;
+   areq->complete(areq, -EINPROGRESS);
+   ret = -EBUSY;
+   }
+
+   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
 
/* GO! */
+   priv->chan[ch].fifo[head] = request;
wmb();
out_be32(priv->chan[ch].reg + TALITOS_FF,
 upper_32_bits(request->dma_desc));
@@ -228,9 +257,18 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_edesc *edesc)
 
spin_unlock_irqrestore(&priv->chan[ch].head_lock, flags);
 
-   return -EINPROGRESS;
+   /*
+* When handling the queue via the completion path, queue more
+* requests if the hardware has room.
+*/
+   if (!edesc) {
+   spin_lock_irqsave(&priv->chan[ch].head_lock, flags);
+   goto flush_another;
+   }
+
+   return ret;
 }
-EXPORT_SYMBOL(talitos_submit);
+EXPORT_SYMBOL(talitos_handle_queue);
 
 /*
  * process what was done, notify callback of error if not
@@ -284,6 +322,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
}
 
spin_unlock_irqrestore(&priv->chan[ch].tail_lock, flags);
+
+   talitos

[PATCH 4/5] crypto: talitos: Reorganize request submission data structures

2015-02-20 Thread Martin Hicks
This is preparatory work for moving to using the crypto async queue
handling code.  A talitos_request structure is buried into each
talitos_edesc so that when talitos_submit() is called, everything required
to defer the submission to the hardware is contained within talitos_edesc.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   93 +++---
 drivers/crypto/talitos.h |   41 +---
 2 files changed, 65 insertions(+), 69 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 7709805..d3472be 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -186,22 +186,17 @@ static int init_device(struct device *dev)
  * talitos_submit - submits a descriptor to the device for processing
  * @dev:   the SEC device to be used
  * @ch:the SEC device channel to be used
- * @desc:  the descriptor to be processed by the device
- * @callback:  whom to call when processing is complete
+ * @edesc: the descriptor to be processed by the device
  * @context:   a handle for use by caller (optional)
  *
  * desc must contain valid dma-mapped (bus physical) address pointers.
  * callback must check err and feedback in descriptor header
  * for device processing status.
  */
-int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
-  void (*callback)(struct device *dev,
-   struct talitos_desc *desc,
-   void *context, int error),
-  void *context)
+int talitos_submit(struct device *dev, int ch, struct talitos_edesc *edesc)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request;
+   struct talitos_request *request = &edesc->req;
unsigned long flags;
int head;
 
@@ -214,19 +209,15 @@ int talitos_submit(struct device *dev, int ch, struct 
talitos_desc *desc,
}
 
head = priv->chan[ch].head;
-   request = &priv->chan[ch].fifo[head];
-
-   /* map descriptor and save caller data */
-   request->dma_desc = dma_map_single(dev, desc, sizeof(*desc),
+   request->dma_desc = dma_map_single(dev, request->desc,
+  sizeof(*request->desc),
   DMA_BIDIRECTIONAL);
-   request->callback = callback;
-   request->context = context;
 
/* increment fifo head */
priv->chan[ch].head = (priv->chan[ch].head + 1) & (priv->fifo_len - 1);
 
smp_wmb();
-   request->desc = desc;
+   priv->chan[ch].fifo[head] = request;
 
/* GO! */
wmb();
@@ -247,15 +238,16 @@ EXPORT_SYMBOL(talitos_submit);
 static void flush_channel(struct device *dev, int ch, int error, int reset_ch)
 {
struct talitos_private *priv = dev_get_drvdata(dev);
-   struct talitos_request *request, saved_req;
+   struct talitos_request *request;
unsigned long flags;
int tail, status;
 
spin_lock_irqsave(&priv->chan[ch].tail_lock, flags);
 
tail = priv->chan[ch].tail;
-   while (priv->chan[ch].fifo[tail].desc) {
-   request = &priv->chan[ch].fifo[tail];
+   while (priv->chan[ch].fifo[tail]) {
+   request = priv->chan[ch].fifo[tail];
+   status = 0;
 
/* descriptors with their done bits set don't get the error */
rmb();
@@ -271,14 +263,9 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 sizeof(struct talitos_desc),
 DMA_BIDIRECTIONAL);
 
-   /* copy entries so we can call callback outside lock */
-   saved_req.desc = request->desc;
-   saved_req.callback = request->callback;
-   saved_req.context = request->context;
-
/* release request entry in fifo */
smp_wmb();
-   request->desc = NULL;
+   priv->chan[ch].fifo[tail] = NULL;
 
/* increment fifo tail */
priv->chan[ch].tail = (tail + 1) & (priv->fifo_len - 1);
@@ -287,8 +274,8 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
 
atomic_dec(&priv->chan[ch].submit_count);
 
-   saved_req.callback(dev, saved_req.desc, saved_req.context,
-  status);
+   request->callback(dev, request->desc, request->context, status);
+
/* channel may resume processing in single desc error case */
if (error && !reset_ch && status == error)
return;
@@ -352,7 +339,8 @@ static u32 current_desc_hdr(struct device *dev, int ch)
tail = priv-&

[PATCH 3/5] crypto: talitos: Fix off-by-one and use all hardware slots

2015-02-20 Thread Martin Hicks
The submission count was off by one.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 89cf4d5..7709805 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2722,8 +2722,7 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
-   atomic_set(&priv->chan[i].submit_count,
-  -(priv->chfifo_len - 1));
+   atomic_set(&priv->chan[i].submit_count, -priv->chfifo_len);
}
 
dma_set_mask(dev, DMA_BIT_MASK(36));
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 2/5] crypto: talitos: Remove MD5_BLOCK_SIZE

2015-02-20 Thread Martin Hicks
This is properly defined in the md5 header file.
---
 drivers/crypto/talitos.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index c49d977..89cf4d5 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -637,8 +637,6 @@ static void talitos_unregister_rng(struct device *dev)
 #define TALITOS_MAX_KEY_SIZE   96
 #define TALITOS_MAX_IV_LENGTH  16 /* max of AES_BLOCK_SIZE, 
DES3_EDE_BLOCK_SIZE */
 
-#define MD5_BLOCK_SIZE64
-
 struct talitos_ctx {
struct device *dev;
int ch;
@@ -2195,7 +2193,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "md5",
.cra_driver_name = "md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
@@ -2285,7 +2283,7 @@ static struct talitos_alg_template driver_algs[] = {
.halg.base = {
.cra_name = "hmac(md5)",
.cra_driver_name = "hmac-md5-talitos",
-   .cra_blocksize = MD5_BLOCK_SIZE,
+   .cra_blocksize = MD5_HMAC_BLOCK_SIZE,
.cra_flags = CRYPTO_ALG_TYPE_AHASH |
 CRYPTO_ALG_ASYNC,
}
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/5] crypto: talitos: Simplify per-channel initialization

2015-02-20 Thread Martin Hicks
There were multiple loops in a row, for each separate step of the
initialization of the channels.  Simplify to a single loop.

Signed-off-by: Martin Hicks 
---
 drivers/crypto/talitos.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 067ec21..c49d977 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2706,20 +2706,16 @@ static int talitos_probe(struct platform_device *ofdev)
goto err_out;
}
 
+   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
+
for (i = 0; i < priv->num_channels; i++) {
priv->chan[i].reg = priv->reg + TALITOS_CH_STRIDE * (i + 1);
if (!priv->irq[1] || !(i & 1))
priv->chan[i].reg += TALITOS_CH_BASE_OFFSET;
-   }
 
-   for (i = 0; i < priv->num_channels; i++) {
spin_lock_init(&priv->chan[i].head_lock);
spin_lock_init(&priv->chan[i].tail_lock);
-   }
 
-   priv->fifo_len = roundup_pow_of_two(priv->chfifo_len);
-
-   for (i = 0; i < priv->num_channels; i++) {
priv->chan[i].fifo = kzalloc(sizeof(struct talitos_request) *
 priv->fifo_len, GFP_KERNEL);
if (!priv->chan[i].fifo) {
@@ -2727,11 +2723,10 @@ static int talitos_probe(struct platform_device *ofdev)
err = -ENOMEM;
goto err_out;
}
-   }
 
-   for (i = 0; i < priv->num_channels; i++)
atomic_set(&priv->chan[i].submit_count,
   -(priv->chfifo_len - 1));
+   }
 
dma_set_mask(dev, DMA_BIT_MASK(36));
 
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 0/5] crypto: talitos: Add crypto async queue handling

2015-02-20 Thread Martin Hicks
I was testing dm-crypt performance with a Freescale P1022 board with 
a recent kernel and was getting IO errors while doing testing with LUKS.
Investigation showed that all hardware FIFO slots were filling and
the driver was returning EAGAIN to the block layer, which is not an
expected response for an async crypto implementation.

The following patch series adds a few small fixes, and reworks the
submission path to use the crypto_queue mechanism to handle the
request backlog.


Martin Hicks (5):
  crypto: talitos: Simplify per-channel initialization
  crypto: talitos: Remove MD5_BLOCK_SIZE
  crypto: talitos: Fix off-by-one and use all hardware slots
  crypto: talitos: Reorganize request submission data structures
  crypto: talitos: Add software backlog queue handling

 drivers/crypto/talitos.c |  189 --
 drivers/crypto/talitos.h |   44 +--
 2 files changed, 137 insertions(+), 96 deletions(-)

-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] sata-fsl: Apply link speed limits

2015-02-19 Thread Martin Hicks


The driver was ignoring limits requested by libata.force.  The output
would look like:

fsl-sata ffe18000.sata: Sata FSL Platform/CSB Driver init
ata1: FORCE: PHY spd limit set to 1.5Gbps
ata1: SATA max UDMA/133 irq 74
ata1: Signature Update detected @ 0 msecs
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 310)

Signed-off-by: Martin Hicks 
---
 drivers/ata/sata_fsl.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index f9054cd..a9b5508 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -868,6 +868,8 @@ try_offline_again:
 * PHY reset should remain asserted for atleast 1ms
 */
ata_msleep(ap, 1);
+   
+   sata_set_spd(link);
 
/*
 * Now, bring the host controller online again, this can take time
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] mmc: sdhci: Apply FSL ESDHC reset handling quirk to OF

2015-01-28 Thread Martin Hicks

The reset code was pushed into the esdhc-imx driver, but missed being
pushed into the FSL OF driver at the same time.  The commit that broke
the OF ESDHC driver was 0718e59ae259f7c48155b4e852d8b0632d59028e

Signed-off-by: Martin Hicks 
---
 drivers/mmc/host/sdhci-of-esdhc.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c 
b/drivers/mmc/host/sdhci-of-esdhc.c
index 8872c85..4a654d4 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -276,6 +276,14 @@ static void esdhc_pltfm_set_bus_width(struct sdhci_host 
*host, int width)
ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
 }
 
+static void esdhc_reset(struct sdhci_host *host, u8 mask)
+{
+   sdhci_reset(host, mask);
+
+   sdhci_writel(host, host->ier, SDHCI_INT_ENABLE);
+   sdhci_writel(host, host->ier, SDHCI_SIGNAL_ENABLE);
+}
+
 static const struct sdhci_ops sdhci_esdhc_ops = {
.read_l = esdhc_readl,
.read_w = esdhc_readw,
@@ -290,7 +298,7 @@ static const struct sdhci_ops sdhci_esdhc_ops = {
.platform_init = esdhc_of_platform_init,
.adma_workaround = esdhci_of_adma_workaround,
.set_bus_width = esdhc_pltfm_set_bus_width,
-   .reset = sdhci_reset,
+   .reset = esdhc_reset,
.set_uhs_signaling = sdhci_set_uhs_signaling,
 };
 
-- 
1.7.10.4


-- 
Martin Hicks P.Eng.|  m...@bork.org
Bork Consulting Inc.   |  +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/fsl: Add empty ranges to etsec2 dts files

2015-01-14 Thread Martin Hicks
Perfect.  I'm glad there's a patch.
mh

On Mon, Jan 12, 2015 at 4:31 PM, Scott Wood  wrote:

> On Mon, 2015-01-12 at 10:27 -0500, Martin Hicks wrote:
> >
> > With an earlier change (746c9e9f - Fix PowerPC address parsing hack),
> ethernet
> > has broken on Freescale boards such as the P1022.  All ranges used by the
> > ethernet controllers are also covered by sub-devices that properly
> > declared the used ranges.  The error shown is:
> >
> > fsl-gianfar: probe of soc@ffe0:ethernet@b failed with error -12
> >
> > Signed-off-by: Martin Hicks 
> > ---
> >  arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi |1 +
> >  arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi |1 +
> >  arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi |1 +
> >  3 files changed, 3 insertions(+)
>
> http://patchwork.ozlabs.org/patch/422446/
>
> > diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
> b/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
> > index 1382fec..d1a6c48 100644
> > --- a/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
> > +++ b/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
> > @@ -43,6 +43,7 @@ mdio@24000 {
> >  ethernet@b {
> >   #address-cells = <1>;
> >   #size-cells = <1>;
> > + ranges = <>;
>
> The " = <>" is usually omitted for empty properties.
>
> -Scott
>
>
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc/fsl: Add empty ranges to etsec2 dts files

2015-01-12 Thread Martin Hicks


With an earlier change (746c9e9f - Fix PowerPC address parsing hack), ethernet
has broken on Freescale boards such as the P1022.  All ranges used by the
ethernet controllers are also covered by sub-devices that properly
declared the used ranges.  The error shown is:

fsl-gianfar: probe of soc@ffe0:ethernet@b failed with error -12

Signed-off-by: Martin Hicks 
---
 arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi |1 +
 arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi |1 +
 arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi |1 +
 3 files changed, 3 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi 
b/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
index 1382fec..d1a6c48 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec2-0.dtsi
@@ -43,6 +43,7 @@ mdio@24000 {
 ethernet@b {
#address-cells = <1>;
#size-cells = <1>;
+   ranges = <>;
device_type = "network";
model = "eTSEC";
compatible = "fsl,etsec2";
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi 
b/arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi
index 221cd2e..0447d38 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec2-1.dtsi
@@ -43,6 +43,7 @@ mdio@25000 {
 ethernet@b1000 {
#address-cells = <1>;
#size-cells = <1>;
+   ranges = <>;
device_type = "network";
model = "eTSEC";
compatible = "fsl,etsec2";
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi 
b/arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi
index 61456c3..d2b7255 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec2-2.dtsi
@@ -42,6 +42,7 @@ mdio@26000 {
 ethernet@b2000 {
#address-cells = <1>;
#size-cells = <1>;
+   ranges = <>;
device_type = "network";
model = "eTSEC";
compatible = "fsl,etsec2";
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Perf not resolving all symbols, showing 0x7ffffxxx

2013-10-16 Thread Martin Hicks
On Wed, Oct 16, 2013 at 2:42 PM, Benjamin Herrenschmidt
 wrote:
> On Wed, 2013-10-16 at 11:05 -0400, Martin Hicks wrote:
>> Actually, I was wrong, the mpc8379 is an e300c4.
>>
>> So it seems clear to me that we compile in the book3s code because
>> this is an 83xx CPU part.  I also see that Kconfig knows that I have
>> an core-fsl-emb but we don't actually compile the PMU backend for it
>> because there's no support for anything but e500.
>>
>> mort@chinook:~/src/s4v2-glibc/linux-mpc$ grep PERF .config
>> CONFIG_FSL_EMB_PERFMON=y
>> CONFIG_PPC_PERF_CTRS=y
>> CONFIG_HAVE_PERF_EVENTS=y
>> CONFIG_PERF_EVENTS=y
>> # CONFIG_DEBUG_PERF_USE_VMALLOC is not set
>> mort@chinook:~/src/s4v2-glibc/linux-mpc$ grep BOOK3S .config
>> CONFIG_PPC_BOOK3S_32=y
>> CONFIG_PPC_BOOK3S=y
>>
>> more below...
>>
>> On Tue, Oct 15, 2013 at 4:39 PM, Benjamin Herrenschmidt
>>  wrote:
>> > On Tue, 2013-10-15 at 15:22 -0500, Scott Wood wrote:
>> >> On Tue, 2013-10-15 at 14:53 -0500, Benjamin Herrenschmidt wrote:
>> >> > On Tue, 2013-10-15 at 14:44 -0400, Martin Hicks wrote:
>> >> > > >
>> >> > > > This is an e300 core right ? (603...). Do that have an SIAR at all
>> >> > > > (Scott ?)
>> >> > >
>> >> > > Yes, e300c3.
>> >> >
>> >> > Ok so I have a hard time figuring out how that patch can make a
>> >> > difference since for all I can see, there is no perf backend upstream
>> >> > for e300 at all :-(
>> >> >
>> >> > I must certainly be missing something ... Scott, can you have a look ?
>> >>
>> >> e300c3 has a core-fsl-emb style performance monitor (though Linux
>> >> doesn't support it yet).  If a bug was bisected to a change in
>> >> core-book3s.c, then it's probably a coincidence due to moving code
>> >> around.
>>
>> CONFIG_PPC_PERF_CTRS seems to give the mpc8379 some kind of basic
>> performance measuring.  Is this through dummy_perf() in
>> arch/powerpc/kernel/pmc.c?
>>
>> >
>> > Mort, can you see if just that change is enough to cause the problem ?
>>
>> It is not.  The patch that does get IPs working again in my 3.11 tree
>> is this one:
>>
>> diff --git a/arch/powerpc/perf/core-book3s.c 
>> b/arch/powerpc/perf/core-book3s.c
>> index eeae308..9a3f572 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -122,10 +122,6 @@ void power_pmu_flush_branch_stack(void) {}
>>  static inline void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw) {}
>>  #endif /* CONFIG_PPC32 */
>>
>> -static bool regs_use_siar(struct pt_regs *regs)
>> -{
>> -   return !!regs->result;
>> -}
>
> Can you try instead just chaning regs_use_siar to return false always ?
> Do that help ?
>

That does fix the problem.  v3.11 with the following:

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index eeae308..e91cf67 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -124,7 +124,7 @@ static inline void power_pmu_bhrb_read(struct
cpu_hw_events *cpuhw) {}

 static bool regs_use_siar(struct pt_regs *regs)
 {
-   return !!regs->result;
+   return 0; //!!regs->result;
 }

 /*


mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Perf not resolving all symbols, showing 0x7ffffxxx

2013-10-16 Thread Martin Hicks
Actually, I was wrong, the mpc8379 is an e300c4.

So it seems clear to me that we compile in the book3s code because
this is an 83xx CPU part.  I also see that Kconfig knows that I have
an core-fsl-emb but we don't actually compile the PMU backend for it
because there's no support for anything but e500.

mort@chinook:~/src/s4v2-glibc/linux-mpc$ grep PERF .config
CONFIG_FSL_EMB_PERFMON=y
CONFIG_PPC_PERF_CTRS=y
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
mort@chinook:~/src/s4v2-glibc/linux-mpc$ grep BOOK3S .config
CONFIG_PPC_BOOK3S_32=y
CONFIG_PPC_BOOK3S=y

more below...

On Tue, Oct 15, 2013 at 4:39 PM, Benjamin Herrenschmidt
 wrote:
> On Tue, 2013-10-15 at 15:22 -0500, Scott Wood wrote:
>> On Tue, 2013-10-15 at 14:53 -0500, Benjamin Herrenschmidt wrote:
>> > On Tue, 2013-10-15 at 14:44 -0400, Martin Hicks wrote:
>> > > >
>> > > > This is an e300 core right ? (603...). Do that have an SIAR at all
>> > > > (Scott ?)
>> > >
>> > > Yes, e300c3.
>> >
>> > Ok so I have a hard time figuring out how that patch can make a
>> > difference since for all I can see, there is no perf backend upstream
>> > for e300 at all :-(
>> >
>> > I must certainly be missing something ... Scott, can you have a look ?
>>
>> e300c3 has a core-fsl-emb style performance monitor (though Linux
>> doesn't support it yet).  If a bug was bisected to a change in
>> core-book3s.c, then it's probably a coincidence due to moving code
>> around.

CONFIG_PPC_PERF_CTRS seems to give the mpc8379 some kind of basic
performance measuring.  Is this through dummy_perf() in
arch/powerpc/kernel/pmc.c?

>
> Mort, can you see if just that change is enough to cause the problem ?

It is not.  The patch that does get IPs working again in my 3.11 tree
is this one:

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index eeae308..9a3f572 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -122,10 +122,6 @@ void power_pmu_flush_branch_stack(void) {}
 static inline void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw) {}
 #endif /* CONFIG_PPC32 */

-static bool regs_use_siar(struct pt_regs *regs)
-{
-   return !!regs->result;
-}

 /*
  * Things that are specific to 64-bit implementations.
@@ -1802,14 +1798,13 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
  */
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
 {
-   bool use_siar = regs_use_siar(regs);
-
-   if (use_siar && siar_valid(regs))
-   return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
-   else if (use_siar)
-   return 0;   // no valid instruction pointer
-   else
+   unsigned long mmcra = regs->dsisr;
+   if (TRAP(regs) != 0xf00)
+   return regs->nip;
+   if ((ppmu->flags & PPMU_NO_CONT_SAMPLING) &&
+   !(mmcra & MMCRA_SAMPLE_ENABLE))
return regs->nip;
+   return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
 }

 static bool pmc_overflow_power7(unsigned long val)


mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Perf not resolving all symbols, showing 0x7ffffxxx

2013-10-15 Thread Martin Hicks
On Tue, Oct 15, 2013 at 11:30 AM, Benjamin Herrenschmidt
 wrote:
> On Tue, 2013-10-15 at 09:59 -0400, Martin Hicks wrote:
>> I've tracked the start of the strange instruction pointers in 'perf
>> report' to a commit by Anton:
>>
>> commit 75382aa72f06823db7312ad069c3bae2eb3f8548
>> Author: Anton Blanchard 
>> Date:   Tue Jun 26 01:01:36 2012 +
>>
>> powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regs
>>
>> I don't know enough about PPC to know what's going on, but reverting
>> the changes to perf_instruction_pointer() gets me reasonable 'perf
>> report' output with 3.11.
>
> This is an e300 core right ? (603...). Do that have an SIAR at all
> (Scott ?)

Yes, e300c3.

mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Perf not resolving all symbols, showing 0x7ffffxxx

2013-10-15 Thread Martin Hicks
I've tracked the start of the strange instruction pointers in 'perf
report' to a commit by Anton:

commit 75382aa72f06823db7312ad069c3bae2eb3f8548
Author: Anton Blanchard 
Date:   Tue Jun 26 01:01:36 2012 +

powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regs

I don't know enough about PPC to know what's going on, but reverting
the changes to perf_instruction_pointer() gets me reasonable 'perf
report' output with 3.11.

Thanks,
mh


On Thu, Oct 3, 2013 at 10:21 AM, Martin Hicks  wrote:
> Hi,
>
> I've been trying to track down a performance regression that started
> leading up to the v3.6 kernel, and while doing this I've been
> gathering perf data on v3.6-rc and comparing it to v3.11 perf reports
> of the same workload.
>
> With v3.6-rc kernels I get all symbols resolved like this:
>
> # Events: 39K cpu-clock-msecs
> #
> # Overhead  Command   Shared Object
> Symbol
> #   ...  ..
> 
> #
>  9.69% nfsd  [kernel.kallsyms]   [k] csum_partial
>  5.64% nfsd  [kernel.kallsyms]   [k] __do_softirq
>  3.12% nfsd  [sunrpc][k] svc_create
>  2.38% nfsd  [kernel.kallsyms]   [k] __queue_work
>  1.91% nfsd  [gianfar_driver][k] gfar_poll
>  1.73% nfsd  [kernel.kallsyms]   [k] memset
>  1.54% nfsd  [nfsd]  [k] nfsd_vfs_read.isra.16
>  1.40%  ksoftirqd/0  [kernel.kallsyms]   [k] 
> finish_task_switch.isra.54
>  1.30% nfsd  [kernel.kallsyms]   [k] get_page_from_freelist
>  1.21% nfsd  [gianfar_driver][k] gfar_start_xmit
>  1.20% nfsd  [sunrpc][k] svc_xprt_received
>
>
> But when I perf on v3.11 kernels I see a lot of unresolved symbols:
>
> # Events: 69K cpu-clock-msecs
> #
> # Overhead   Command  Shared Object
>Symbol
> #     .
> ...
> #
> 73.80%  nfsd  [unknown]  [k] 0x77fa
>  7.57%  nfsd  [kernel.kallsyms]  [k] csum_partial
>  4.59%   kworker/0:1  [unknown]  [k] 0x7832
>  3.76%   ksoftirqd/0  [unknown]  [k] 0x796e
>  0.94%   kswapd0  [unknown]  [k] 0x7cc2
>  0.92%   swapper  [unknown]  [k] 0x7a54
>  0.62%  nfsd  [kernel.kallsyms]  [k] __udp4_lib_lookup
>  0.49%  nfsd  [kernel.kallsyms]  [k] ip_append_page
>  0.48%  nfsd  [kernel.kallsyms]  [k] __do_softirq
>  0.36%  eventmon  [unknown]  [k] 0x79c4
>  0.32%  nfsd  [kernel.kallsyms]  [k] __getnstimeofday
>
>
> Any ideas?  Have I overlooked some necessary kernel config change?
> Does perf need some binary that I may not have installed on this
> embedded platform?
>
> Freescale mpc8379 (e300c4)
> Gcc-4.7.2 uClibC built with ct-ng 1.18.0
> binutils 2.22
>
> Thanks,
> mh
>
> --
> Martin Hicks P.Eng.  | m...@bork.org
> Bork Consulting Inc. |   +1 (613) 266-2296



-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Perf not resolving all symbols, showing 0x7ffffxxx

2013-10-04 Thread Martin Hicks
Hi,

I've been trying to track down a performance regression that started
leading up to the v3.6 kernel, and while doing this I've been
gathering perf data on v3.6-rc and comparing it to v3.11 perf reports
of the same workload.

With v3.6-rc kernels I get all symbols resolved like this:

# Events: 39K cpu-clock-msecs
#
# Overhead  Command   Shared Object
Symbol
#   ...  ..

#
 9.69% nfsd  [kernel.kallsyms]   [k] csum_partial
 5.64% nfsd  [kernel.kallsyms]   [k] __do_softirq
 3.12% nfsd  [sunrpc][k] svc_create
 2.38% nfsd  [kernel.kallsyms]   [k] __queue_work
 1.91% nfsd  [gianfar_driver][k] gfar_poll
 1.73% nfsd  [kernel.kallsyms]   [k] memset
 1.54% nfsd  [nfsd]  [k] nfsd_vfs_read.isra.16
 1.40%  ksoftirqd/0  [kernel.kallsyms]   [k] finish_task_switch.isra.54
 1.30% nfsd  [kernel.kallsyms]   [k] get_page_from_freelist
 1.21% nfsd  [gianfar_driver][k] gfar_start_xmit
 1.20% nfsd  [sunrpc][k] svc_xprt_received


But when I perf on v3.11 kernels I see a lot of unresolved symbols:

# Events: 69K cpu-clock-msecs
#
# Overhead   Command  Shared Object
   Symbol
#     .
...
#
73.80%  nfsd  [unknown]  [k] 0x77fa
 7.57%  nfsd  [kernel.kallsyms]  [k] csum_partial
 4.59%   kworker/0:1  [unknown]  [k] 0x7832
 3.76%   ksoftirqd/0  [unknown]  [k] 0x796e
 0.94%   kswapd0  [unknown]  [k] 0x7cc2
 0.92%   swapper  [unknown]  [k] 0x7a54
 0.62%  nfsd  [kernel.kallsyms]  [k] __udp4_lib_lookup
 0.49%  nfsd  [kernel.kallsyms]  [k] ip_append_page
 0.48%  nfsd  [kernel.kallsyms]  [k] __do_softirq
 0.36%  eventmon  [unknown]  [k] 0x79c4
 0.32%  nfsd  [kernel.kallsyms]  [k] __getnstimeofday


Any ideas?  Have I overlooked some necessary kernel config change?
Does perf need some binary that I may not have installed on this
embedded platform?

Freescale mpc8379 (e300c4)
Gcc-4.7.2 uClibC built with ct-ng 1.18.0
binutils 2.22

Thanks,
mh

-- 
Martin Hicks P.Eng.  | m...@bork.org
Bork Consulting Inc. |   +1 (613) 266-2296
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev