Re: [dm-devel] dm thin: superblock write may succeed before other metadata blocks because metadata is written in async mode.

2018-06-20 Thread monty


On Wed, Jun 20, 2018 at 10:51:17AM -0400, Mike Snitzer wrote:
> 
> On Wed, Jun 20 2018 at  1:03pm -0400,
> monty  wrote:
> 
> > 
> > On Tue, Jun 19, 2018 at 11:00:32AM -0400, Mike Snitzer wrote:
> > > 
> > > On Tue, Jun 19 2018 at 10:43am -0400,
> > > Joe Thornber  wrote:
> > > 
> > > > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote:
> > > > > On Mon, May 21 2018 at  8:53pm -0400,
> > > > > Monty Pavel  wrote:
> > > > > 
> > > > > > 
> > > > > > If dm_bufio_write_dirty_buffers func is called by
> > > > > > __commit_transaction func and power loss happens while it executes,
> > > > > > the superblock may coincidentally be written correctly while some
> > > > > > metadata blocks are not. The reason is that we write all metadata in
> > > > > > async mode. We can guarantee that we send the superblock after the
> > > > > > other blocks, but we cannot guarantee that the superblock write does
> > > > > > not complete earlier than the other blocks.
> > > > > > So, we need to commit the other metadata blocks before changing the
> > > > > > superblock.
> > > > > > 
> > > > > > Signed-off-by: Monty Pavel 
> > > > > > ---
> > > > > >  drivers/md/dm-thin-metadata.c |    8 ++++++++
> > > > > >  1 files changed, 8 insertions(+), 0 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/md/dm-thin-metadata.c 
> > > > > > b/drivers/md/dm-thin-metadata.c
> > > > > > index 36ef284..897d7d6 100644
> > > > > > --- a/drivers/md/dm-thin-metadata.c
> > > > > > +++ b/drivers/md/dm-thin-metadata.c
> > > > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct 
> > > > > > dm_pool_metadata *pmd)
> > > > > > if (r)
> > > > > > return r;
> > > > > >  
> > > > > > +   r = dm_tm_commit(pmd->tm, sblock);
> > > > > > +   if (r)
> > > > > > +   return r;
> > > > > > +
> > > > > > +   r = superblock_lock(pmd, &sblock);
> > > > > > +   if (r)
> > > > > > +   return r;
> > > > > > +
> > > > > > disk_super = dm_block_data(sblock);
> > > > > > disk_super->time = cpu_to_le32(pmd->time);
> > > > > > disk_super->data_mapping_root = cpu_to_le64(pmd->root);
> > > > 
> > > > I don't believe you've tested this; sblock is passed to dm_tm_commit()
> > > > uninitialised, and you didn't even bother to remove the later (and 
> > > > correct)
> > > > call to dm_tm_commit().
> > > 
> > > I pointed out to Joe that the patch, in isolation, is deceiving.  It
> > > _looks_ like sblock may be uninitialized, etc.  But once the patch is
> > > applied and you look at the entirety of __commit_transaction() it is
> > > clear that you're reusing the existing superblock_lock() to safely
> > > accomplish your additional call to dm_tm_commit().
> > > 
> > > > What is the issue that started you looking in this area?
> > > 
> > > Right, as my previous reply asked: please clarify if you _know_ your
> > > patch fixes an actual problem you've experienced.  The more details the
> > > better.
> > > 
> > > Thanks,
> > > Mike
> > > 
> > Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction
> > many times and didn't find any problem with the 2-phase commit. I use
> > md-raid1 (PCIe NVMe and md-raid5) in write-behind mode to store dm-thin
> > metadata.
> > Test case:
> > 1. Run a copy-diff test on the thin device and then reboot the machine.
> > 2. Rebuild our device stack and exec "vgchange -ay".
> > The thin-pool cannot be established (details_root becomes a bitmap node
> > and the metadata's bitmap_root becomes a btree_node).
> 
> But are you saying your double commit in __commit_transaction() serves
> as a workaround for the corruption you're seeing?
> 
> Is it just a case where raid5's writebehind mode is _not_ safe for your
> storage config?  By "reboot" do you mean a clean shutdown?  Or a forced
> powerfail scenario?
> 
> Mike
> 
By "reboot my machine" I mean executing the reboot command.
My patch seems unnecessary, because the 2-phase commit already ensures
that the superblock is committed only after the other metadata has been
written to the metadata device.
This problem is hard to recreate; it has not happened since applying the
patch above. I will check my device stack for any way the superblock
write could complete earlier than the other metadata blocks.

Monty

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 05/11] crypto alg: Introduce max blocksize and alignmask

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 05:04:08PM -0700, Kees Cook wrote:
> On Wed, Jun 20, 2018 at 4:40 PM, Eric Biggers  wrote:
> > On Wed, Jun 20, 2018 at 12:04:02PM -0700, Kees Cook wrote:
> >> In the quest to remove all stack VLA usage from the kernel[1], this
> >> exposes the existing upper bound on crypto block sizes for VLA removal,
> >> and introduces a new check for alignmask (current maximum in the kernel
> >> is 63 from manual inspection of all cra_alignmask settings).
> >>
> >> [1] 
> >> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> >>
> >> Signed-off-by: Kees Cook 
> >> ---
> >>  crypto/algapi.c        | 5 ++++-
> >>  include/linux/crypto.h | 4 ++++
> >>  2 files changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/crypto/algapi.c b/crypto/algapi.c
> >> index c0755cf4f53f..760a412b059c 100644
> >> --- a/crypto/algapi.c
> >> +++ b/crypto/algapi.c
> >> @@ -57,7 +57,10 @@ static int crypto_check_alg(struct crypto_alg *alg)
> >>   if (alg->cra_alignmask & (alg->cra_alignmask + 1))
> >>   return -EINVAL;
> >>
> >> - if (alg->cra_blocksize > PAGE_SIZE / 8)
> >> + if (alg->cra_blocksize > CRYPTO_ALG_MAX_BLOCKSIZE)
> >> + return -EINVAL;
> >> +
> >> + if (alg->cra_alignmask > CRYPTO_ALG_MAX_ALIGNMASK)
> >>   return -EINVAL;
> >>
> >>   if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
> >> diff --git a/include/linux/crypto.h b/include/linux/crypto.h
> >> index 6eb06101089f..e76ffcbd5aa6 100644
> >> --- a/include/linux/crypto.h
> >> +++ b/include/linux/crypto.h
> >> @@ -134,6 +134,10 @@
> >>   */
> >>  #define CRYPTO_MAX_ALG_NAME  128
> >>
> >> +/* Maximum values for registered algorithms. */
> >> +#define CRYPTO_ALG_MAX_BLOCKSIZE    (PAGE_SIZE / 8)
> >> +#define CRYPTO_ALG_MAX_ALIGNMASK    63
> >> +
> >
> > How do these differ from MAX_CIPHER_BLOCKSIZE and MAX_CIPHER_ALIGNMASK, and 
> > why
> > are they declared in different places?
> 
> This is what I get for staring at crypto code for so long. I entirely
> missed these checks... even though they're 8 lines away:
> 
> if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
>CRYPTO_ALG_TYPE_CIPHER) {
> if (alg->cra_alignmask > MAX_CIPHER_ALIGNMASK)
> return -EINVAL;
> 
> if (alg->cra_blocksize > MAX_CIPHER_BLOCKSIZE)
> return -EINVAL;
> }
> 
> However, this is only checking CRYPTO_ALG_TYPE_CIPHER, and
> cra_blocksize can be used for all kinds of things.
> 

It's overloaded for different purposes, depending on the type of algorithm.
It's poorly documented, but the uses I see are:

(1) Block size for "ciphers", i.e. what the rest of the world calls "block 
ciphers".
(2) Minimum input size for "skciphers" -- usually either 1 or the block size of
the underlying block cipher, in the case that the skcipher is something like
"cbc(aes)", where a block cipher is wrapped in a mode of operation.
(3) Block size for hash functions that use an internal compression function,
e.g. SHA-1 has a block size of 64 bytes.

I'm not sure it makes sense to have a single limit for all these uses.  All the
block ciphers supported by Linux have a block size of 16 bytes or less, while
hash functions usually have larger "block sizes".

> include/crypto/algapi.h:#define MAX_CIPHER_ALIGNMASK    15
> ...
> drivers/crypto/mxs-dcp.c:   .cra_flags      = CRYPTO_ALG_ASYNC,
> drivers/crypto/mxs-dcp.c:   .cra_alignmask  = 63,
> 
> Is this one broken? It has no CRYPTO_ALG_TYPE_... ?
> 
> For my CRYPTO_ALG_MAX_BLOCKSIZE, there is:
> 
> crypto/xcbc.c:  u8 key1[CRYPTO_ALG_MAX_BLOCKSIZE];
> drivers/crypto/qat/qat_common/qat_algs.c:   char ipad[CRYPTO_ALG_MAX_BLOCKSIZE];
> drivers/crypto/qat/qat_common/qat_algs.c:   char opad[CRYPTO_ALG_MAX_BLOCKSIZE];
> 
> It looks like both xcbc and qat are used with shash, so that needs a
> separate max blocksize.

Actually, xcbc is a 'shash' template (CRYPTO_ALG_TYPE_SHASH) that wraps a block
cipher (CRYPTO_ALG_TYPE_CIPHER) and sets its own cra_blocksize to the block
cipher's block size.  So the same block size can be gotten from either
'crypto_shash_blocksize(parent)' or 'crypto_cipher_blocksize(ctx->child)'.
It can only be 16 bytes, currently, since xcbc_create() only allows
instantiating the template if that's the block size.

But in the case of qat_alg_do_precomputes(), yes it appears to need the hash
block size.

> 
> For my CRYPTO_ALG_MAX_ALIGNMASK, there is:
> 
> crypto/shash.c: u8 ubuf[CRYPTO_ALG_MAX_ALIGNMASK]
> crypto/shash.c: __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
> crypto/shash.c: __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
> 
> which is also shash.
> 
> How should I rename these and best apply the registration-time sanity checks?

I'm not sure, but it may make sense to enforce a smaller limit for 

Re: [dm-devel] [PATCH 09/11] crypto: shash: Remove VLA usage in unaligned hashing

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 4:57 PM, Eric Biggers  wrote:
> On Wed, Jun 20, 2018 at 12:04:06PM -0700, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this uses
>> the newly defined max alignment to perform unaligned hashing to avoid
>> VLAs, and drops the helper function while adding sanity checks on the
>> resulting buffer sizes.
>>
>> [1] 
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Signed-off-by: Kees Cook 
>> ---
> >>  crypto/shash.c | 21 ++++++++++-----------
>>  1 file changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/crypto/shash.c b/crypto/shash.c
>> index ab6902c6dae7..1bb58209330a 100644
>> --- a/crypto/shash.c
>> +++ b/crypto/shash.c
>> @@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const 
>> u8 *key,
>>  }
>>  EXPORT_SYMBOL_GPL(crypto_shash_setkey);
>>
>> -static inline unsigned int shash_align_buffer_size(unsigned len,
>> -unsigned long mask)
>> -{
>> - typedef u8 __aligned_largest u8_aligned;
>> - return len + (mask & ~(__alignof__(u8_aligned) - 1));
>> -}
>> -
>>  static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
>> unsigned int len)
>>  {
>> @@ -88,11 +81,14 @@ static int shash_update_unaligned(struct shash_desc 
>> *desc, const u8 *data,
>>   unsigned long alignmask = crypto_shash_alignmask(tfm);
>>   unsigned int unaligned_len = alignmask + 1 -
>>((unsigned long)data & alignmask);
>> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
>> - __aligned_largest;
>> + u8 ubuf[CRYPTO_ALG_MAX_ALIGNMASK]
>> + __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
> >>   u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
>>   int err;
>
> Are you sure that __attribute__((aligned(64))) works correctly on stack
> variables on all architectures?
>
> And if it is expected to work, then why is the buffer still aligned by hand on
> the very next line?

I really don't know -- the existing code was doing both the __align
bit and the manual alignment, so I was trying to simplify it while
removing the VLA. I'm totally open to suggestions here.

BTW, these are also the only users of __aligned_largest() in the
kernel, and the only use of unsized __attribute__((aligned)).

-Kees

-- 
Kees Cook
Pixel Security



Re: [dm-devel] [PATCH 07/11] crypto: xcbc: Remove VLA usage

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 4:46 PM, Eric Biggers  wrote:
> On Wed, Jun 20, 2018 at 12:04:04PM -0700, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this uses
>> the maximum blocksize and adds a sanity check.
>>
>> [1] 
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Signed-off-by: Kees Cook 
>> ---
> >>  crypto/xcbc.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/crypto/xcbc.c b/crypto/xcbc.c
>> index 25c75af50d3f..016481b1f3ba 100644
>> --- a/crypto/xcbc.c
>> +++ b/crypto/xcbc.c
>> @@ -65,7 +65,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash 
>> *parent,
>>   int bs = crypto_shash_blocksize(parent);
> >>   u8 *consts = PTR_ALIGN(&ctx->ctx[0], alignmask + 1);
>>   int err = 0;
>> - u8 key1[bs];
>> + u8 key1[CRYPTO_ALG_MAX_BLOCKSIZE];
>> +
>> + if (WARN_ON(bs > sizeof(key1)))
>> + return -EINVAL;
>
> Similarly, why not MAX_CIPHER_BLOCKSIZE?
>
> Also, xcbc_create() only allows a 16-byte block size, and you made the API
> enforce the maximum for algorithms anyway.  So I think the extra check here
> isn't very worthwhile.

Is the "parent" argument in crypto_xcbc_digest_setkey() always going
to be the "alg" from xcbc_create()? I couldn't convince myself that
was true. If it is, then yes, this VLA can trivially made to be 16,
but it seemed like they were separate instances...

-Kees

-- 
Kees Cook
Pixel Security



Re: [dm-devel] [PATCH 05/11] crypto alg: Introduce max blocksize and alignmask

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 4:40 PM, Eric Biggers  wrote:
> On Wed, Jun 20, 2018 at 12:04:02PM -0700, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> exposes the existing upper bound on crypto block sizes for VLA removal,
>> and introduces a new check for alignmask (current maximum in the kernel
>> is 63 from manual inspection of all cra_alignmask settings).
>>
>> [1] 
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Signed-off-by: Kees Cook 
>> ---
> >>  crypto/algapi.c        | 5 ++++-
> >>  include/linux/crypto.h | 4 ++++
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/crypto/algapi.c b/crypto/algapi.c
>> index c0755cf4f53f..760a412b059c 100644
>> --- a/crypto/algapi.c
>> +++ b/crypto/algapi.c
>> @@ -57,7 +57,10 @@ static int crypto_check_alg(struct crypto_alg *alg)
>>   if (alg->cra_alignmask & (alg->cra_alignmask + 1))
>>   return -EINVAL;
>>
>> - if (alg->cra_blocksize > PAGE_SIZE / 8)
>> + if (alg->cra_blocksize > CRYPTO_ALG_MAX_BLOCKSIZE)
>> + return -EINVAL;
>> +
>> + if (alg->cra_alignmask > CRYPTO_ALG_MAX_ALIGNMASK)
>>   return -EINVAL;
>>
>>   if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
>> diff --git a/include/linux/crypto.h b/include/linux/crypto.h
>> index 6eb06101089f..e76ffcbd5aa6 100644
>> --- a/include/linux/crypto.h
>> +++ b/include/linux/crypto.h
>> @@ -134,6 +134,10 @@
>>   */
>>  #define CRYPTO_MAX_ALG_NAME  128
>>
>> +/* Maximum values for registered algorithms. */
> >> +#define CRYPTO_ALG_MAX_BLOCKSIZE    (PAGE_SIZE / 8)
> >> +#define CRYPTO_ALG_MAX_ALIGNMASK    63
>> +
>
> How do these differ from MAX_CIPHER_BLOCKSIZE and MAX_CIPHER_ALIGNMASK, and 
> why
> are they declared in different places?

This is what I get for staring at crypto code for so long. I entirely
missed these checks... even though they're 8 lines away:

if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
   CRYPTO_ALG_TYPE_CIPHER) {
if (alg->cra_alignmask > MAX_CIPHER_ALIGNMASK)
return -EINVAL;

if (alg->cra_blocksize > MAX_CIPHER_BLOCKSIZE)
return -EINVAL;
}

However, this is only checking CRYPTO_ALG_TYPE_CIPHER, and
cra_blocksize can be used for all kinds of things.

include/crypto/algapi.h:#define MAX_CIPHER_ALIGNMASK    15
...
drivers/crypto/mxs-dcp.c:   .cra_flags      = CRYPTO_ALG_ASYNC,
drivers/crypto/mxs-dcp.c:   .cra_alignmask  = 63,

Is this one broken? It has no CRYPTO_ALG_TYPE_... ?

For my CRYPTO_ALG_MAX_BLOCKSIZE, there is:

crypto/xcbc.c:  u8 key1[CRYPTO_ALG_MAX_BLOCKSIZE];
drivers/crypto/qat/qat_common/qat_algs.c:   char ipad[CRYPTO_ALG_MAX_BLOCKSIZE];
drivers/crypto/qat/qat_common/qat_algs.c:   char opad[CRYPTO_ALG_MAX_BLOCKSIZE];

It looks like both xcbc and qat are used with shash, so that needs a
separate max blocksize.

For my CRYPTO_ALG_MAX_ALIGNMASK, there is:

crypto/shash.c: u8 ubuf[CRYPTO_ALG_MAX_ALIGNMASK]
crypto/shash.c: __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
crypto/shash.c: __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);

which is also shash.

How should I rename these and best apply the registration-time sanity checks?

-Kees

-- 
Kees Cook
Pixel Security



Re: [dm-devel] [PATCH 09/11] crypto: shash: Remove VLA usage in unaligned hashing

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 12:04:06PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the newly defined max alignment to perform unaligned hashing to avoid
> VLAs, and drops the helper function while adding sanity checks on the
> resulting buffer sizes.
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 
> Signed-off-by: Kees Cook 
> ---
>  crypto/shash.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/crypto/shash.c b/crypto/shash.c
> index ab6902c6dae7..1bb58209330a 100644
> --- a/crypto/shash.c
> +++ b/crypto/shash.c
> @@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 
> *key,
>  }
>  EXPORT_SYMBOL_GPL(crypto_shash_setkey);
>  
> -static inline unsigned int shash_align_buffer_size(unsigned len,
> -unsigned long mask)
> -{
> - typedef u8 __aligned_largest u8_aligned;
> - return len + (mask & ~(__alignof__(u8_aligned) - 1));
> -}
> -
>  static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
> unsigned int len)
>  {
> @@ -88,11 +81,14 @@ static int shash_update_unaligned(struct shash_desc 
> *desc, const u8 *data,
>   unsigned long alignmask = crypto_shash_alignmask(tfm);
>   unsigned int unaligned_len = alignmask + 1 -
>((unsigned long)data & alignmask);
> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
> - __aligned_largest;
> + u8 ubuf[CRYPTO_ALG_MAX_ALIGNMASK]
> + __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
>   u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
>   int err;

Are you sure that __attribute__((aligned(64))) works correctly on stack
variables on all architectures?

And if it is expected to work, then why is the buffer still aligned by hand on
the very next line?

>  
> + if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
> + return -EINVAL;
> +
>   if (unaligned_len > len)
>   unaligned_len = len;
>  
> @@ -124,11 +120,14 @@ static int shash_final_unaligned(struct shash_desc 
> *desc, u8 *out)
>   unsigned long alignmask = crypto_shash_alignmask(tfm);
>   struct shash_alg *shash = crypto_shash_alg(tfm);
>   unsigned int ds = crypto_shash_digestsize(tfm);
> - u8 ubuf[shash_align_buffer_size(ds, alignmask)]
> - __aligned_largest;
> + u8 ubuf[SHASH_MAX_DIGESTSIZE]
> + __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
>   u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
>   int err;

Same questions here.

>  
> + if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
> + return -EINVAL;
> +
>   err = shash->final(desc, buf);
>   if (err)
>   goto out;
> -- 

- Eric



Re: [dm-devel] [PATCH 07/11] crypto: xcbc: Remove VLA usage

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 12:04:04PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the maximum blocksize and adds a sanity check.
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 
> Signed-off-by: Kees Cook 
> ---
>  crypto/xcbc.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/crypto/xcbc.c b/crypto/xcbc.c
> index 25c75af50d3f..016481b1f3ba 100644
> --- a/crypto/xcbc.c
> +++ b/crypto/xcbc.c
> @@ -65,7 +65,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash 
> *parent,
>   int bs = crypto_shash_blocksize(parent);
>   u8 *consts = PTR_ALIGN(&ctx->ctx[0], alignmask + 1);
>   int err = 0;
> - u8 key1[bs];
> + u8 key1[CRYPTO_ALG_MAX_BLOCKSIZE];
> +
> + if (WARN_ON(bs > sizeof(key1)))
> + return -EINVAL;

Similarly, why not MAX_CIPHER_BLOCKSIZE?

Also, xcbc_create() only allows a 16-byte block size, and you made the API
enforce the maximum for algorithms anyway.  So I think the extra check here
isn't very worthwhile.

- Eric



Re: [dm-devel] [PATCH 04/11] dm verity fec: Remove VLA usage

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 4:33 PM, Eric Biggers  wrote:
> On Wed, Jun 20, 2018 at 12:04:01PM -0700, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> uses the newly defined max digest size macro. Also adds a sanity-check
>> at use-time.
>>
>> [1] 
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Signed-off-by: Kees Cook 
>> ---
> >>  drivers/md/dm-verity-fec.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
>> index 684af08d0747..0dfcc52835bc 100644
>> --- a/drivers/md/dm-verity-fec.c
>> +++ b/drivers/md/dm-verity-fec.c
>> @@ -212,12 +212,15 @@ static int fec_read_bufs(struct dm_verity *v, struct 
>> dm_verity_io *io,
>>   struct dm_verity_fec_io *fio = fec_io(io);
>>   u64 block, ileaved;
>>   u8 *bbuf, *rs_block;
>> - u8 want_digest[v->digest_size];
>> + u8 want_digest[AHASH_MAX_DIGESTSIZE];
>>   unsigned n, k;
>>
>>   if (neras)
>>   *neras = 0;
>>
>> + if (WARN_ON(v->digest_size < sizeof(want_digest)))
>> + return -EINVAL;
>> +
>
> This is backwards; it should be 'v->digest_size > sizeof(want_digest)'.

Yikes. Thank you for catching that! I will fix in v2.

-Kees

-- 
Kees Cook
Pixel Security



Re: [dm-devel] [PATCH 06/11] crypto: cbc: Remove VLA usage

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 12:04:03PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> uses the upper bounds on blocksize.
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 
> Signed-off-by: Kees Cook 
> ---
>  include/crypto/cbc.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
> index f5b8bfc22e6d..260096bcf99b 100644
> --- a/include/crypto/cbc.h
> +++ b/include/crypto/cbc.h
> @@ -113,7 +113,7 @@ static inline int crypto_cbc_decrypt_inplace(
>   unsigned int bsize = crypto_skcipher_blocksize(tfm);
>   unsigned int nbytes = walk->nbytes;
>   u8 *src = walk->src.virt.addr;
> - u8 last_iv[bsize];
> + u8 last_iv[CRYPTO_ALG_MAX_BLOCKSIZE];
>  

Why not MAX_CIPHER_BLOCKSIZE?

- Eric



Re: [dm-devel] [PATCH 05/11] crypto alg: Introduce max blocksize and alignmask

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 12:04:02PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> exposes the existing upper bound on crypto block sizes for VLA removal,
> and introduces a new check for alignmask (current maximum in the kernel
> is 63 from manual inspection of all cra_alignmask settings).
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 
> Signed-off-by: Kees Cook 
> ---
>  crypto/algapi.c        | 5 ++++-
>  include/linux/crypto.h | 4 ++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/crypto/algapi.c b/crypto/algapi.c
> index c0755cf4f53f..760a412b059c 100644
> --- a/crypto/algapi.c
> +++ b/crypto/algapi.c
> @@ -57,7 +57,10 @@ static int crypto_check_alg(struct crypto_alg *alg)
>   if (alg->cra_alignmask & (alg->cra_alignmask + 1))
>   return -EINVAL;
>  
> - if (alg->cra_blocksize > PAGE_SIZE / 8)
> + if (alg->cra_blocksize > CRYPTO_ALG_MAX_BLOCKSIZE)
> + return -EINVAL;
> +
> + if (alg->cra_alignmask > CRYPTO_ALG_MAX_ALIGNMASK)
>   return -EINVAL;
>  
>   if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
> diff --git a/include/linux/crypto.h b/include/linux/crypto.h
> index 6eb06101089f..e76ffcbd5aa6 100644
> --- a/include/linux/crypto.h
> +++ b/include/linux/crypto.h
> @@ -134,6 +134,10 @@
>   */
>  #define CRYPTO_MAX_ALG_NAME  128
>  
> +/* Maximum values for registered algorithms. */
> +#define CRYPTO_ALG_MAX_BLOCKSIZE    (PAGE_SIZE / 8)
> +#define CRYPTO_ALG_MAX_ALIGNMASK    63
> +

How do these differ from MAX_CIPHER_BLOCKSIZE and MAX_CIPHER_ALIGNMASK, and why
are they declared in different places?

- Eric



Re: [dm-devel] [PATCH 04/11] dm verity fec: Remove VLA usage

2018-06-20 Thread Eric Biggers
On Wed, Jun 20, 2018 at 12:04:01PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> uses the newly defined max digest size macro. Also adds a sanity-check
> at use-time.
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 
> Signed-off-by: Kees Cook 
> ---
>  drivers/md/dm-verity-fec.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
> index 684af08d0747..0dfcc52835bc 100644
> --- a/drivers/md/dm-verity-fec.c
> +++ b/drivers/md/dm-verity-fec.c
> @@ -212,12 +212,15 @@ static int fec_read_bufs(struct dm_verity *v, struct 
> dm_verity_io *io,
>   struct dm_verity_fec_io *fio = fec_io(io);
>   u64 block, ileaved;
>   u8 *bbuf, *rs_block;
> - u8 want_digest[v->digest_size];
> + u8 want_digest[AHASH_MAX_DIGESTSIZE];
>   unsigned n, k;
>  
>   if (neras)
>   *neras = 0;
>  
> + if (WARN_ON(v->digest_size < sizeof(want_digest)))
> + return -EINVAL;
> +

This is backwards; it should be 'v->digest_size > sizeof(want_digest)'.

- Eric



Re: [dm-devel] [PATCH 01/11] crypto: shash: Remove VLA usage

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 1:39 PM, Christophe LEROY
 wrote:
>
>
> On 20/06/2018 at 22:36, Kees Cook wrote:
>>
>> On Wed, Jun 20, 2018 at 12:30 PM, Christophe Leroy
>>  wrote:
>>>
>>>
>>>
>>> On 06/20/2018 07:03 PM, Kees Cook wrote:


 In the quest to remove all stack VLA usage from the kernel[1], this
 removes the VLAs in SHASH_DESC_ON_STACK (via crypto_shash_descsize())
 by using the maximum allowable size (which is now more clearly captured
 in a macro). Similar limits are turned into macros as well.

 [1]

 https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

 Signed-off-by: Kees Cook 
>>>
>>>
>>>
>>> Got the following warnings:
>>>
>>> crypto/testmgr.c: In function ‘alg_test_crc32c.part.4’:
>>> crypto/testmgr.c:1896:1: warning: the frame size of 2088 bytes is larger
>>> than 1024 bytes [-Wframe-larger-than=]
>>> crypto/hmac.c: In function ‘hmac_setkey’:
>>> crypto/hmac.c:88:1: warning: the frame size of 2088 bytes is larger than
>>> 1024 bytes [-Wframe-larger-than=]
>>
>>
>> Ah yes. I didn't do 32-bit builds. So, here's the issue: this uncovers
>> the frame size problems that were hidden by it being a VLA before. It
>> was always possible for the frame to get this big, it's just that the
>> compiler couldn't see it.
>>
>> For qat, I raised the -Wframe-larger-than flag. It seems we'll need to
>> do this in some other places too.
>
>
> Maybe the issue is because I have selected 16k pages?

That would do it! And that's exactly the problem Arnd mentioned. For
v2, I will switch all the PAGE_SIZE-related limits to an explicit 512.
(And I'll do some 32-bit builds too to see if any other cases pop up
that I need to mask out.)

-Kees

-- 
Kees Cook
Pixel Security


Re: [dm-devel] [PATCH 11/11] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 12:44 PM, Arnd Bergmann  wrote:
> On Wed, Jun 20, 2018 at 9:04 PM, Kees Cook  wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> caps the skcipher request size similar to other limits and adds a sanity
>> check at registration.
>>
>>
>> +#define SKCIPHER_MAX_REQSIZE   (PAGE_SIZE / 8)
>> +
>>  #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
>> char __##name##_desc[sizeof(struct skcipher_request) + \
>> -   crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
>> +   SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
>> struct skcipher_request *name = (void *)__##name##_desc
>>
>
> This is probably a bad idea on kernels with large values of PAGE_SIZE.
> Some users on ppc64 and arm64 use 64KB here, but still limit
> the per-function stack size to 2KB.

We could make all of these PAGE_SIZE-related limits explicitly 512? I
think that was the intent originally.

-Kees

-- 
Kees Cook
Pixel Security



Re: [dm-devel] [PATCH 01/11] crypto: shash: Remove VLA usage

2018-06-20 Thread Kees Cook
On Wed, Jun 20, 2018 at 12:30 PM, Christophe Leroy
 wrote:
>
>
> On 06/20/2018 07:03 PM, Kees Cook wrote:
>>
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> removes the VLAs in SHASH_DESC_ON_STACK (via crypto_shash_descsize())
>> by using the maximum allowable size (which is now more clearly captured
>> in a macro). Similar limits are turned into macros as well.
>>
>> [1]
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Signed-off-by: Kees Cook 
>
>
> Got the following warnings:
>
> crypto/testmgr.c: In function ‘alg_test_crc32c.part.4’:
> crypto/testmgr.c:1896:1: warning: the frame size of 2088 bytes is larger
> than 1024 bytes [-Wframe-larger-than=]
> crypto/hmac.c: In function ‘hmac_setkey’:
> crypto/hmac.c:88:1: warning: the frame size of 2088 bytes is larger than
> 1024 bytes [-Wframe-larger-than=]

Ah yes. I didn't do 32-bit builds. So, here's the issue: this uncovers
the frame size problems that were hidden by it being a VLA before. It
was always possible for the frame to get this big, it's just that the
compiler couldn't see it.

For qat, I raised the -Wframe-larger-than flag. It seems we'll need to
do this in some other places too.

-Kees

-- 
Kees Cook
Pixel Security


Re: [dm-devel] [PATCH 11/11] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

2018-06-20 Thread Arnd Bergmann
On Wed, Jun 20, 2018 at 9:04 PM, Kees Cook  wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> caps the skcipher request size similar to other limits and adds a sanity
> check at registration.
>
>
> +#define SKCIPHER_MAX_REQSIZE   (PAGE_SIZE / 8)
> +
>  #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
> char __##name##_desc[sizeof(struct skcipher_request) + \
> -   crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
> +   SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
> struct skcipher_request *name = (void *)__##name##_desc
>

This is probably a bad idea on kernels with large values of PAGE_SIZE.
Some users on ppc64 and arm64 use 64KB here, but still limit
the per-function stack size to 2KB.

Arnd



[dm-devel] [PATCH 05/11] crypto alg: Introduce max blocksize and alignmask

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this
exposes the existing upper bound on crypto block sizes for VLA removal,
and introduces a new check for alignmask (current maximum in the kernel
is 63 from manual inspection of all cra_alignmask settings).

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/algapi.c| 5 -
 include/linux/crypto.h | 4 
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index c0755cf4f53f..760a412b059c 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -57,7 +57,10 @@ static int crypto_check_alg(struct crypto_alg *alg)
if (alg->cra_alignmask & (alg->cra_alignmask + 1))
return -EINVAL;
 
-   if (alg->cra_blocksize > PAGE_SIZE / 8)
+   if (alg->cra_blocksize > CRYPTO_ALG_MAX_BLOCKSIZE)
+   return -EINVAL;
+
+   if (alg->cra_alignmask > CRYPTO_ALG_MAX_ALIGNMASK)
return -EINVAL;
 
if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 6eb06101089f..e76ffcbd5aa6 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -134,6 +134,10 @@
  */
 #define CRYPTO_MAX_ALG_NAME128
 
+/* Maximum values for registered algorithms. */
+#define CRYPTO_ALG_MAX_BLOCKSIZE(PAGE_SIZE / 8)
+#define CRYPTO_ALG_MAX_ALIGNMASK63
+
 /*
  * The macro CRYPTO_MINALIGN_ATTR (along with the void * type in the actual
  * declaration) is used to ensure that the crypto_tfm context structure is
-- 
2.17.1



[dm-devel] [PATCH 03/11] crypto: ahash: Remove VLA usage

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this
introduces max size macros for ahash, as already done for shash, and
adjusts the crypto user to the max state size.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/ahash.c| 4 ++--
 crypto/algif_hash.c   | 2 +-
 include/crypto/hash.h | 3 +++
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/crypto/ahash.c b/crypto/ahash.c
index a64c143165b1..6435bdbe42fd 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -550,8 +550,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
 {
struct crypto_alg *base = >halg.base;
 
-   if (alg->halg.digestsize > PAGE_SIZE / 8 ||
-   alg->halg.statesize > PAGE_SIZE / 8 ||
+   if (alg->halg.digestsize > AHASH_MAX_DIGESTSIZE ||
+   alg->halg.statesize > AHASH_MAX_STATESIZE ||
alg->halg.statesize == 0)
return -EINVAL;
 
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index bfcf595fd8f9..8974ee8ebead 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -239,7 +239,7 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags,
struct alg_sock *ask = alg_sk(sk);
struct hash_ctx *ctx = ask->private;
struct ahash_request *req = >req;
-   char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1];
+   char state[AHASH_MAX_STATESIZE];
struct sock *sk2;
struct alg_sock *ask2;
struct hash_ctx *ctx2;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 308aad8bf523..b9df568dc98e 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -64,6 +64,9 @@ struct ahash_request {
void *__ctx[] CRYPTO_MINALIGN_ATTR;
 };
 
+#define AHASH_MAX_DIGESTSIZE   (PAGE_SIZE / 8)
+#define AHASH_MAX_STATESIZE(PAGE_SIZE / 8)
+
 #define AHASH_REQUEST_ON_STACK(name, ahash) \
char __##name##_desc[sizeof(struct ahash_request) + \
crypto_ahash_reqsize(ahash)] CRYPTO_MINALIGN_ATTR; \
-- 
2.17.1



[dm-devel] [PATCH 00/11] crypto: Remove VLA usage

2018-06-20 Thread Kees Cook
Hi,

This is nearly the last of the VLA removals[1], but it's one of the
largest because crypto gets used in lots of places. After looking
through code, usage, reading the threads Gustavo started, and comparing
the use-cases to the other VLA removals that have landed in the kernel,
I think this series is likely the best way forward to shut the door on
VLAs forever.

As background, the crypto stack usage is callers doing an immediate
bit of work without allocating new memory. This means that other VLA
removal techniques (like just using kmalloc) aren't workable, and the
next common technique is needed: examination of maximum stack usage and
the addition of sanity checks. This series does that, and in several
cases these maximums were already implicit in the code.
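[Editorial note: the technique described above — replace the VLA with the known
maximum plus a sanity check — looks like this in the abstract. The function
name and the 128-byte cap are illustrative; the series uses bounds such as
PAGE_SIZE / 8 and per-API maximums.]

```c
#include <string.h>

#define MAX_BLOCKSIZE 128	/* illustrative cap, not from the series */

/* Before: u8 key[bs];  -- a VLA sized at runtime, frame unbounded.
 * After: a fixed upper bound plus a check that rejects oversized input,
 * so the compiler can see (and warn about) the real frame size. */
int hash_block(const unsigned char *in, unsigned int bs)
{
	unsigned char key[MAX_BLOCKSIZE];

	if (bs > sizeof(key))
		return -22;	/* -EINVAL; cannot overflow the buffer */

	memcpy(key, in, bs);
	/* ... use key ... */
	return 0;
}
```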

This series is intended to land via the crypto tree, though it touches
dm as well, since there are dependent patches (new crypto #defines
being used).

Thanks!

-Kees

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Kees Cook (11):
  crypto: shash: Remove VLA usage
  dm integrity: Remove VLA usage
  crypto: ahash: Remove VLA usage
  dm verity fec: Remove VLA usage
  crypto alg: Introduce max blocksize and alignmask
  crypto: cbc: Remove VLA usage
  crypto: xcbc: Remove VLA usage
  crypto: qat: Remove VLA usage
  crypto: shash: Remove VLA usage in unaligned hashing
  crypto: ahash: Remove VLA usage for AHASH_REQUEST_ON_STACK
  crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

 crypto/ahash.c   |  4 ++--
 crypto/algapi.c  |  5 -
 crypto/algif_hash.c  |  2 +-
 crypto/shash.c   | 27 
 crypto/xcbc.c|  5 -
 drivers/crypto/qat/qat_common/Makefile   |  2 ++
 drivers/crypto/qat/qat_common/qat_algs.c |  8 +--
 drivers/md/dm-integrity.c| 23 ++--
 drivers/md/dm-verity-fec.c   |  5 -
 include/crypto/cbc.h |  2 +-
 include/crypto/hash.h| 12 +--
 include/crypto/internal/hash.h   |  1 +
 include/crypto/internal/skcipher.h   |  1 +
 include/crypto/skcipher.h|  4 +++-
 include/linux/crypto.h   |  4 
 15 files changed, 73 insertions(+), 32 deletions(-)

-- 
2.17.1



[dm-devel] [PATCH 10/11] crypto: ahash: Remove VLA usage for AHASH_REQUEST_ON_STACK

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this caps
the ahash request size similar to the other limits and adds a sanity
check at registration.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 include/crypto/hash.h  | 3 ++-
 include/crypto/internal/hash.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index b9df568dc98e..7d62309baf0b 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -66,10 +66,11 @@ struct ahash_request {
 
 #define AHASH_MAX_DIGESTSIZE   (PAGE_SIZE / 8)
 #define AHASH_MAX_STATESIZE(PAGE_SIZE / 8)
+#define AHASH_MAX_REQSIZE  (PAGE_SIZE / 8)
 
 #define AHASH_REQUEST_ON_STACK(name, ahash) \
char __##name##_desc[sizeof(struct ahash_request) + \
-   crypto_ahash_reqsize(ahash)] CRYPTO_MINALIGN_ATTR; \
+   AHASH_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
struct ahash_request *name = (void *)__##name##_desc
 
 /**
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index a0b0ad9d585e..d96ae5f52125 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -142,6 +142,7 @@ static inline struct ahash_alg *__crypto_ahash_alg(struct crypto_alg *alg)
 static inline void crypto_ahash_set_reqsize(struct crypto_ahash *tfm,
unsigned int reqsize)
 {
+   BUG_ON(reqsize > AHASH_MAX_REQSIZE);
tfm->reqsize = reqsize;
 }
 
-- 
2.17.1



[dm-devel] [PATCH 06/11] crypto: cbc: Remove VLA usage

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this
uses the upper bound on blocksize.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 include/crypto/cbc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
index f5b8bfc22e6d..260096bcf99b 100644
--- a/include/crypto/cbc.h
+++ b/include/crypto/cbc.h
@@ -113,7 +113,7 @@ static inline int crypto_cbc_decrypt_inplace(
unsigned int bsize = crypto_skcipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
-   u8 last_iv[bsize];
+   u8 last_iv[CRYPTO_ALG_MAX_BLOCKSIZE];
 
/* Start of the last block. */
src += nbytes - (nbytes & (bsize - 1)) - bsize;
-- 
2.17.1



[dm-devel] [PATCH 07/11] crypto: xcbc: Remove VLA usage

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/xcbc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index 25c75af50d3f..016481b1f3ba 100644
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -65,7 +65,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
int bs = crypto_shash_blocksize(parent);
u8 *consts = PTR_ALIGN(>ctx[0], alignmask + 1);
int err = 0;
-   u8 key1[bs];
+   u8 key1[CRYPTO_ALG_MAX_BLOCKSIZE];
+
+   if (WARN_ON(bs > sizeof(key1)))
+   return -EINVAL;
 
if ((err = crypto_cipher_setkey(ctx->child, inkey, keylen)))
return err;
-- 
2.17.1



[dm-devel] [PATCH 11/11] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this
caps the skcipher request size similar to other limits and adds a sanity
check at registration.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 include/crypto/internal/skcipher.h | 1 +
 include/crypto/skcipher.h  | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
index e42f7063f245..5035482cbe68 100644
--- a/include/crypto/internal/skcipher.h
+++ b/include/crypto/internal/skcipher.h
@@ -130,6 +130,7 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
 static inline void crypto_skcipher_set_reqsize(
struct crypto_skcipher *skcipher, unsigned int reqsize)
 {
+   BUG_ON(reqsize > SKCIPHER_MAX_REQSIZE);
skcipher->reqsize = reqsize;
 }
 
diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 2f327f090c3e..25294d0089b2 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -139,9 +139,11 @@ struct skcipher_alg {
struct crypto_alg base;
 };
 
+#define SKCIPHER_MAX_REQSIZE   (PAGE_SIZE / 8)
+
 #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
-   crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
+   SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
struct skcipher_request *name = (void *)__##name##_desc
 
 /**
-- 
2.17.1



[dm-devel] [PATCH 09/11] crypto: shash: Remove VLA usage in unaligned hashing

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/shash.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index ab6902c6dae7..1bb58209330a 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
 }
 EXPORT_SYMBOL_GPL(crypto_shash_setkey);
 
-static inline unsigned int shash_align_buffer_size(unsigned len,
-  unsigned long mask)
-{
-   typedef u8 __aligned_largest u8_aligned;
-   return len + (mask & ~(__alignof__(u8_aligned) - 1));
-}
-
 static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
  unsigned int len)
 {
@@ -88,11 +81,14 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
unsigned long alignmask = crypto_shash_alignmask(tfm);
unsigned int unaligned_len = alignmask + 1 -
 ((unsigned long)data & alignmask);
-   u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
-   __aligned_largest;
+   u8 ubuf[CRYPTO_ALG_MAX_ALIGNMASK]
+   __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
if (unaligned_len > len)
unaligned_len = len;
 
@@ -124,11 +120,14 @@ static int shash_final_unaligned(struct shash_desc *desc, u8 *out)
unsigned long alignmask = crypto_shash_alignmask(tfm);
struct shash_alg *shash = crypto_shash_alg(tfm);
unsigned int ds = crypto_shash_digestsize(tfm);
-   u8 ubuf[shash_align_buffer_size(ds, alignmask)]
-   __aligned_largest;
+   u8 ubuf[SHASH_MAX_DIGESTSIZE]
+   __aligned(CRYPTO_ALG_MAX_ALIGNMASK + 1);
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
err = shash->final(desc, buf);
if (err)
goto out;
-- 
2.17.1
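[Editorial note: the PTR_ALIGN() step the patch keeps can be sketched in plain
C — the buffer is over-aligned to the maximum mask, and the pointer is then
rounded up to the transform's actual (smaller or equal) alignmask. The macro
below is a userspace rendition, not the kernel's header.]

```c
#include <stdint.h>

/* Userspace rendition of the kernel's PTR_ALIGN(); 'a' must be a power of two. */
#define PTR_ALIGN(p, a) \
	((void *)(((uintptr_t)(p) + ((uintptr_t)(a) - 1)) & ~((uintptr_t)(a) - 1)))

/* Round buf up to an (alignmask + 1)-byte boundary, as
 * shash_update_unaligned() does with its ubuf. */
void *align_to_mask(void *buf, unsigned long alignmask)
{
	return PTR_ALIGN(buf, alignmask + 1);
}
```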



[dm-devel] [PATCH 08/11] crypto: qat: Remove VLA usage

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this uses
the upper bound for the stack buffer. Also adds a sanity check. This
additionally raises the stack size limit during the build, to avoid
a compiler warning while keeping it reasonably close to expected stack
size. The warning was just exposing the existing max stack size, so there
is nothing new here; now that it is not hidden in a VLA, the compiler
can see how large it might get.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 drivers/crypto/qat/qat_common/Makefile   | 2 ++
 drivers/crypto/qat/qat_common/qat_algs.c | 8 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/qat/qat_common/Makefile b/drivers/crypto/qat/qat_common/Makefile
index 47a8e3d8b81a..c2a042023dde 100644
--- a/drivers/crypto/qat/qat_common/Makefile
+++ b/drivers/crypto/qat/qat_common/Makefile
@@ -19,3 +19,5 @@ intel_qat-objs := adf_cfg.o \
 intel_qat-$(CONFIG_DEBUG_FS) += adf_transport_debug.o
 intel_qat-$(CONFIG_PCI_IOV) += adf_sriov.o adf_pf2vf_msg.o \
   adf_vf2pf_msg.o adf_vf_isr.o
+
+CFLAGS_qat_algs.o := $(call cc-option,-Wframe-larger-than=2300)
diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c
index 1138e41d6805..257269126601 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -153,8 +153,8 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
struct sha512_state sha512;
int block_size = crypto_shash_blocksize(ctx->hash_tfm);
int digest_size = crypto_shash_digestsize(ctx->hash_tfm);
-   char ipad[block_size];
-   char opad[block_size];
+   char ipad[CRYPTO_ALG_MAX_BLOCKSIZE];
+   char opad[CRYPTO_ALG_MAX_BLOCKSIZE];
__be32 *hash_state_out;
__be64 *hash512_state_out;
int i, offset;
@@ -164,6 +164,10 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
shash->tfm = ctx->hash_tfm;
shash->flags = 0x0;
 
+   if (WARN_ON(block_size > sizeof(ipad) ||
+   sizeof(ipad) != sizeof(opad)))
+   return -EINVAL;
+
if (auth_keylen > block_size) {
int ret = crypto_shash_digest(shash, auth_key,
  auth_keylen, ipad);
-- 
2.17.1



[dm-devel] [PATCH 02/11] dm integrity: Remove VLA usage

2018-06-20 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this uses
the new SHASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
bounds on stack usage.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 drivers/md/dm-integrity.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 86438b2f10dd..d057b41f8690 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
}
memset(result + size, 0, JOURNAL_MAC_SIZE - size);
} else {
-   __u8 digest[size];
+   __u8 digest[SHASH_MAX_DIGESTSIZE];
+
+   if (WARN_ON(size > sizeof(digest))) {
+   dm_integrity_io_error(ic, "digest_size", -EINVAL);
+   goto err;
+   }
r = crypto_shash_final(desc, digest);
if (unlikely(r)) {
dm_integrity_io_error(ic, "crypto_shash_final", r);
@@ -1244,7 +1249,7 @@ static void integrity_metadata(struct work_struct *w)
	struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
char *checksums;
	unsigned extra_space = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
-   char checksums_onstack[ic->tag_size + extra_space];
+   char checksums_onstack[SHASH_MAX_DIGESTSIZE];
unsigned sectors_to_process = dio->range.n_sectors;
sector_t sector = dio->range.logical_sector;
 
@@ -1253,8 +1258,14 @@ static void integrity_metadata(struct work_struct *w)
 
	checksums = kmalloc((PAGE_SIZE >> SECTOR_SHIFT >> ic->sb->log2_sectors_per_block) * ic->tag_size + extra_space,
GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
-   if (!checksums)
+   if (!checksums) {
checksums = checksums_onstack;
+   if (WARN_ON(extra_space &&
+   digest_size > sizeof(checksums_onstack))) {
+   r = -EINVAL;
+   goto error;
+   }
+   }
 
__bio_for_each_segment(bv, bio, iter, dio->orig_bi_iter) {
unsigned pos;
@@ -1466,7 +1477,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
			} while (++s < ic->sectors_per_block);
 #ifdef INTERNAL_VERIFY
			if (ic->internal_hash) {
-				char checksums_onstack[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+				char checksums_onstack[max(SHASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];

				integrity_sector_checksum(ic, logical_sector, mem + bv.bv_offset, checksums_onstack);
				if (unlikely(memcmp(checksums_onstack, journal_entry_tag(ic, je), ic->tag_size))) {
@@ -1516,7 +1527,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
			if (ic->internal_hash) {
				unsigned digest_size = crypto_shash_digestsize(ic->internal_hash);
				if (unlikely(digest_size > ic->tag_size)) {
-					char checksums_onstack[digest_size];
+					char checksums_onstack[SHASH_MAX_DIGESTSIZE];
					integrity_sector_checksum(ic, logical_sector, (char *)js, checksums_onstack);
					memcpy(journal_entry_tag(ic, je), checksums_onstack, ic->tag_size);
				} else
@@ -1937,7 +1948,7 @@ static void do_journal_write(struct dm_integrity_c *ic, unsigned write_start,
				unlikely(from_replay) &&
 #endif
				ic->internal_hash) {
-				char test_tag[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+				char test_tag[max(SHASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];

				integrity_sector_checksum(ic, sec + ((l - j) << ic->sb->log2_sectors_per_block),
							  (char *)access_journal_data(ic, i, l), test_tag);
-- 
2.17.1



Re: [dm-devel] [PATCH v2 4/7] dm: prevent DAX mounts if not supported

2018-06-20 Thread Mike Snitzer
On Mon, Jun 04 2018 at  7:15pm -0400,
Ross Zwisler  wrote:

> On Fri, Jun 01, 2018 at 05:55:13PM -0400, Mike Snitzer wrote:
> > On Tue, May 29 2018 at  3:51pm -0400,
> > Ross Zwisler  wrote:
> > 
> > > Currently the code in dm_dax_direct_access() only checks whether the 
> > > target
> > > type has a direct_access() operation defined, not whether the underlying
> > > block devices all support DAX.  This latter property can be seen by 
> > > looking
> > > at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> > > DM device.
> > 
> > Wait... I thought DAX support was all or nothing?
> 
> Right, it is, and that's what I'm trying to capture.  The point of this series
> is to make sure that we don't use DAX thru DM if one of the DM members doesn't
> support DAX.
> 
> This is a bit tricky, though, because as you've pointed out there are a lot of
> elements that go into a block device actually supporting DAX.  
> 
> First, the block device has to have a direct_access() operation defined in its
> struct dax_operations table.  This is a static definition in the drivers,
> though, so it's necessary but not sufficient.  For example, the PMEM driver
> always defines a direct_access() operation, but depending on the mode of the
> namespace (raw, fsdax or sector) it may or may not support DAX.
> 
> The next step is that a driver needs to say that the block queue supports
> QUEUE_FLAG_DAX.  This again is necessary but not sufficient.  The PMEM driver
> currently sets this for all namespace modes, but I agree that this should be
> restricted to modes that support DAX.  Even once we do that, though, for the
> block driver this isn't fully sufficient.  We'd really like users to call
> bdev_dax_supported() so it can run some additional tests to make sure that DAX
> will work.
> 
> So, the real test that filesystems rely on is bdev_dax_suppported().
> 
> The trick is that with DM we need to verify each block device via
> bdev_dax_supported() just like a filesystem would, and then have some way of
> communicating the result of all those checks to the filesystem which is
> eventually mounted on the DM device.  At DAX mount time the filesystem will
> call bdev_dax_supported() on the DM device, but it'll really only check the
> first device.  
> 
> So, the strategy is to have DM manually check each member device via
> bdev_dax_supported() then if they all pass set QUEUE_FLAG_DAX.  This then
> becomes our one source of truth on whether or not a DM device supports DAX.
> When the filesystem mounts with DAX support it'll also run
> bdev_dax_supported(), but if we have QUEUE_FLAG_DAX set on the DM device, we
> know that this check will pass.
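
[Editorial note: the strategy laid out above can be sketched abstractly — every
member device gets the full bdev_dax_supported()-style check, and only a
unanimous pass would let QUEUE_FLAG_DAX be set on the DM device. All types and
names below are illustrative stand-ins, not the real DM API.]

```c
#include <stdbool.h>
#include <stddef.h>

struct member { bool dax_ok; };	/* stand-in for a table's underlying bdev */

/* Stand-in for bdev_dax_supported(): in the kernel this runs real probes
 * against the device, not just a flag read. */
bool member_dax_supported(const struct member *m)
{
	return m->dax_ok;
}

/* One non-DAX member disables DAX for the whole device; only a clean
 * sweep would let the caller set QUEUE_FLAG_DAX on the DM queue. */
bool all_members_support_dax(const struct member *m, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!member_dax_supported(&m[i]))
			return false;
	return true;
}
```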
> 
> > > This is problematic if we have, for example, a dm-linear device made up of
> > > a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> > > QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> > > we have a working direct_access() entry point and the first member of the
> > > dm-linear set *does* support DAX.
> > 
> > If you don't have a uniformly capable device then it is very dangerous
> > to advertise that the entire device has a certain capability.  That
> > completely bit me in the past with discard (because for every IO I
> > wasn't then checking if the destination device supported discards).
> >
> > It is all well and good that you're adding that check here.  But what I
> > don't like is how you're saying QUEUE_FLAG_DAX implies direct_access()
> > operation exists.. yet for raw PMEM namespaces we just discussed how
> > that is a lie.
> 
> QUEUE_FLAG_DAX does imply that direct_access() exists.  However, as discussed
> above, for a given bdev we really do need to check bdev_dax_supported().
> 
> > SO this type of change showcases how the QUEUE_FLAG_DAX doesn't _really_
> > imply direct_access() exists.
> > 
> > > This allows the user to create a filesystem on the dm-linear device, and
> > > then mount it with DAX.  The filesystem's bdev_dax_supported() test will
> > > pass because it'll operate on the first member of the dm-linear device,
> > > which happens to be a fsdax PMEM namespace.
> > > 
> > > All DAX I/O will then fail to that dm-linear device because the lack of
> > > QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means 
> > > that
> > > the struct dax_device isn't ever set in the filesystem, so
> > > dax_direct_access() will always return -EOPNOTSUPP.
> > 
> > Now you've lost me... these past 2 paragraphs.  Why can a user mount it
> > in DAX mode?  Because bdev_dax_supported() only accesses the first
> > portion (which happens to have DAX capabilities?)
> 
> Right.  bdev_dax_supported() runs all of its checks, and because they are
> running against the first block device in the dm set, they all pass.  But the
> overall DM device does not actually support DAX.
> 
> > Isn't this exactly why you should be checking for QUEUE_FLAG_DAX in the
> > caller (bdev_dax_supported)?  Why not use bdev_get_queue() and 

Re: [dm-devel] dm thin: superblock write may succeed before other metadata blocks because of writing metadata in async mode

2018-06-20 Thread Joe Thornber
On Wed, Jun 20, 2018 at 01:03:57PM -0400, monty wrote:
> Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction
> many times and didn't find any problem with the 2-phase commit. I use
> md-raid1 (PCIe NVMe and md-raid5) in write-behind mode to store dm-thin
> metadata.
> Test case:
> 1. I do a copy-diff test on the thin device and then reboot my machine.
> 2. Rebuild our device stack and exec "vgchange -ay".
> The thin-pool cannot be established (details_root becomes a bitmap node
> and the metadata's bitmap_root becomes a btree node).

As you simplify your setup, does the problem go away?  E.g. turn off
write-behind, use just the NVMe dev, etc.

The only effect of your change is to call flush twice rather than once.

- Joe
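
[Editorial note: the ordering argument here can be sketched abstractly. The
existing commit path already issues all metadata writes, waits for them with a
flush, and only then writes the superblock, so the superblock cannot become
durable before the blocks it points to. This is an illustrative event model,
not the real dm-bufio/dm-thin code.]

```c
enum op { WRITE_META, FLUSH, WRITE_SB };

enum op trace[8];
int nops;

void record(enum op o) { trace[nops++] = o; }

/* Model of __commit_transaction()'s existing ordering: async metadata
 * writes, a flush barrier, then the superblock update with its own flush. */
void commit_transaction(void)
{
	record(WRITE_META);	/* dm_bufio_write_dirty_buffers(): async writes */
	record(FLUSH);		/* all metadata durable before going further */
	record(WRITE_SB);	/* superblock now safe to rewrite */
	record(FLUSH);		/* superblock durable; commit point reached */
}
```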



Re: [dm-devel] dm thin: superblock write may succeed before other metadata blocks because of writing metadata in async mode

2018-06-20 Thread Mike Snitzer
On Wed, Jun 20 2018 at  1:03pm -0400,
monty  wrote:

> 
> On Tue, Jun 19, 2018 at 11:00:32AM -0400, Mike Snitzer wrote:
> > 
> > On Tue, Jun 19 2018 at 10:43am -0400,
> > Joe Thornber  wrote:
> > 
> > > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote:
> > > > On Mon, May 21 2018 at  8:53pm -0400,
> > > > Monty Pavel  wrote:
> > > > 
> > > > > 
> > > > > If the dm_bufio_write_dirty_buffers func is called by __commit_transaction
> > > > > and power loss happens while executing it, the superblock may
> > > > > coincidentally have been written correctly while some metadata blocks
> > > > > were not. The reason is that we write all metadata in async mode. We can
> > > > > guarantee that we send the superblock after the other blocks, but we
> > > > > cannot guarantee that the superblock write does not complete before the
> > > > > other blocks.
> > > > > So, we need to commit the other metadata blocks before changing the superblock.
> > > > > 
> > > > > Signed-off-by: Monty Pavel 
> > > > > ---
> > > > >  drivers/md/dm-thin-metadata.c |8 
> > > > >  1 files changed, 8 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
> > > > > index 36ef284..897d7d6 100644
> > > > > --- a/drivers/md/dm-thin-metadata.c
> > > > > +++ b/drivers/md/dm-thin-metadata.c
> > > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd)
> > > > >   if (r)
> > > > >   return r;
> > > > >  
> > > > > + r = dm_tm_commit(pmd->tm, sblock);
> > > > > + if (r)
> > > > > + return r;
> > > > > +
> > > > > + r = superblock_lock(pmd, );
> > > > > + if (r)
> > > > > + return r;
> > > > > +
> > > > >   disk_super = dm_block_data(sblock);
> > > > >   disk_super->time = cpu_to_le32(pmd->time);
> > > > >   disk_super->data_mapping_root = cpu_to_le64(pmd->root);
> > > 
> > > I don't believe you've tested this; sblock is passed to dm_tm_commit()
> > > uninitialised, and you didn't even bother to remove the later (and 
> > > correct)
> > > call to dm_tm_commit().
> > 
> > I pointed out to Joe that the patch, in isolation, is deceiving.  It
> > _looks_ like sblock may be uninitialized, etc.  But once the patch is
> > applied and you look at the entirety of __commit_transaction() it is
> > clear that you're reusing the existing superblock_lock() to safely
> > accomplish your additional call to dm_tm_commit().
> > 
> > > What is the issue that started you looking in this area?
> > 
> > Right, as my previous reply asked: please clarify if you _know_ your
> > patch fixes an actual problem you've experienced.  The more details the
> > better.
> > 
> > Thanks,
> > Mike
> > 
> Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction
> many times and didn't find any problem with the 2-phase commit. I use
> md-raid1 (PCIe NVMe and md-raid5) in write-behind mode to store dm-thin
> metadata.
> Test case:
> 1. I do a copy-diff test on the thin device and then reboot my machine.
> 2. Rebuild our device stack and exec "vgchange -ay".
> The thin-pool cannot be established (details_root becomes a bitmap node
> and the metadata's bitmap_root becomes a btree node).

But are you saying your double commit in __commit_transaction() serves
as a workaround for the corruption you're seeing?

Is it just a case where raid5's writebehind mode is _not_ safe for your
storage config?  By "reboot" do you mean a clean shutdown?  Or a forced
powerfail scenario?

Mike



Re: [dm-devel] [PATCH v3 0/5] bitmap: Introduce alloc/free helpers

2018-06-20 Thread Andy Shevchenko
On Wed, 2018-06-20 at 10:33 +0300, Yury Norov wrote:
> On Mon, Jun 18, 2018 at 04:09:58PM +0300, Andy Shevchenko wrote:
> > External Email
> > 
> > A lot of code allocates bitmaps using the BITS_PER_LONG() macro and
> > the sizeof(unsigned long) operator. The readability suffers because
> > of this.
> > 
> > The series introduces three helpers, i.e. bitmap_alloc(),
> > bitmap_zalloc() and bitmap_free(), to make this cleaner.
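
[Editorial note: a userspace sketch of what the proposed helpers wrap. In the
kernel they sit on kmalloc_array()/kfree() and take a gfp_t argument;
calloc()/free() stand in here.]

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG	 (CHAR_BIT * sizeof(unsigned long))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* bitmap_zalloc(nbits, gfp) in the series: zeroed storage for nbits bits,
 * hiding the BITS_TO_LONGS()/sizeof(unsigned long) boilerplate. */
unsigned long *bitmap_zalloc(unsigned int nbits)
{
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

void bitmap_free(unsigned long *bitmap)
{
	free(bitmap);
}
```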
> 

> tools/include/linux/bitmap.h already has bitmap_alloc(),

TBH, I don't give a crap about tools. They invented something that might
or might not follow kernel APIs. In the end, it's user space.

> and it corresponds to bitmap_zalloc() in this patch. It may
> cause problems in the future if people copy functions that
> use bitmap_alloc() between kernel code and tools. So I think
> we have to propagate this API to tools and update existing
> users of bitmap_alloc() in tools.

Propose a patch then. Perhaps I need to inform tools people about new
API coming, but that's all what I can do. Existing something in tools/
does not and should not prevent extending / changing kernel internal
APIs.

> What about code that calls specific alloc functions, like
> memblock_virt_alloc() and pcpu_mem_zalloc() in mm/percpu.c,

What about it? It's not in the scope of this API, for sure. The above-
mentioned functions have a very limited area of usage.
Your example is also a bit complicated, since it's not allocating _just_ a
bitmap, but a structure _with_ an embedded bitmap.

> or devm_kcalloc() in drivers/dma/edma.c?

There is no such file in the tree. You perhaps referred to
drivers/dma/ti/edma.c.

Yes, this is a candidate to convert later on if anyone wants to do it
(with introducing devm_bitmap_alloc() / devm_bitmap_zalloc() helpers).

>  If we are going to
> unify bitmap allocations in kernel, we should think about
> unification of that cases too. Should it be additional
> flag or optional pointer to the exact allocator in
> bitmap_{,z}alloc()?

So, I prefer one step at a time. Especially taking into consideration
the problems I have to solve now with those simplest helpers I proposed.

> 
> Yury
>  
> > Patch 1 is a preparatory to avoid namespace collisions between
> > bitmap API and
> > MD bitmap. No functional changes intended.
> > 
> > Patch 2 is just orphaned from previous release cycle.
> > 
> > Patch 3 introduces new helpers.
> > 
> > Patches 4 and 5 are just examples of how to use the new helpers. Locally I
> > have about a
> > dozen of them against different subsystems and drivers.
> > 
> > Ideally it would go through the Input subsystem; thus it needs an Ack from
> > the MD maintainer(s).
> > 
> > Since v2:
> > - fix compilation issue in MD bitmap code
> > - elaborate changes in commit message of patch 5
> > 
> > Since v1:
> > - added namespace fix patch against MD bitmap API
> > - moved functions to lib/bitmap.c to avoid circular dependencies
> > - appended Dmitry's tags
> > 
> > Andy Shevchenko (5):
> >   md: Avoid namespace collision with bitmap API
> >   bitmap: Drop unnecessary 0 check for u32 array operations
> >   bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()
> >   Input: gpio-keys - Switch to bitmap_zalloc()
> >   Input: evdev - Switch to bitmap API
> > 
> >  drivers/input/evdev.c |  16 +-
> >  drivers/input/keyboard/gpio_keys.c|   8 +-
> >  drivers/md/dm-raid.c  |   6 +-
> >  drivers/md/md-bitmap.c| 301 +-
> >  drivers/md/md-bitmap.h|  46 +--
> >  drivers/md/md-cluster.c   |  16 +-
> >  drivers/md/md.c   |  44 +--
> >  .../md/persistent-data/dm-space-map-common.c  |  12 +-
> >  drivers/md/raid1.c|  20 +-
> >  drivers/md/raid10.c   |  26 +-
> >  drivers/md/raid5-cache.c  |   2 +-
> >  drivers/md/raid5.c|  24 +-
> >  include/linux/bitmap.h|   8 +
> >  lib/bitmap.c  |  28 +-
> >  14 files changed, 283 insertions(+), 274 deletions(-)
> > 
> > --
> > 2.17.1

-- 
Andy Shevchenko 
Intel Finland Oy

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [RFC] kill bio_clone_kmalloc and bio_clone_bioset

2018-06-20 Thread Ming Lei
On Tue, Jun 19, 2018 at 06:52:10AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series removes all but one users of the traditional deep bio clone,
> and then moves bio_clone_bioset to its only caller so that we get rid
> of the deep bio cloning in the block layer API.
> 
> Patch 1 is already in the device mapper queue for 4.18, so we should
> wait for that to go in before the series is applied.  The series is
> intended as preparation work for the multi-page biovecs so that they don't
> have to deal with the difference between single-page and larger bio vecs
> for cloning.

Looks like a good cleanup:

Reviewed-by: Ming Lei 

Thanks,
Ming



Re: [dm-devel] [PATCH v2 5/5] Input: evdev - Switch to bitmap_zalloc()

2018-06-20 Thread Dmitry Torokhov
On Sat, Jun 16, 2018 at 12:42:31AM +0300, Yury Norov wrote:
> Hi Andy,
> 
> On Fri, Jun 15, 2018 at 04:20:17PM +0300, Andy Shevchenko wrote:
> > Switch to bitmap_zalloc() to show clearly what we are allocating.
> > Besides that, it returns a pointer of bitmap type instead of an opaque void *.
> > 
> > Acked-by: Dmitry Torokhov 
> > Signed-off-by: Andy Shevchenko 
> > ---
> >  drivers/input/evdev.c | 16 +++-
> >  1 file changed, 7 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> > index c81c79d01d93..370206f987f9 100644
> > --- a/drivers/input/evdev.c
> > +++ b/drivers/input/evdev.c
> > @@ -481,7 +481,7 @@ static int evdev_release(struct inode *inode, struct 
> > file *file)
> > evdev_detach_client(evdev, client);
> > 
> > for (i = 0; i < EV_CNT; ++i)
> > -   kfree(client->evmasks[i]);
> > +   bitmap_free(client->evmasks[i]);
> > 
> > kvfree(client);
> > 
> > @@ -925,17 +925,15 @@ static int evdev_handle_get_val(struct evdev_client 
> > *client,
> >  {
> > int ret;
> > unsigned long *mem;
> > -   size_t len;
> > 
> > -   len = BITS_TO_LONGS(maxbit) * sizeof(unsigned long);
> > -   mem = kmalloc(len, GFP_KERNEL);
> > +   mem = bitmap_alloc(maxbit, GFP_KERNEL);
> > if (!mem)
> > return -ENOMEM;
> 
> But in the commit message you say you switch to bitmap_zalloc(). IIUC
> bitmap_alloc() is OK here, but could you please update the commit message
> to avoid confusion.
> 
> > 
> > spin_lock_irq(&dev->event_lock);
> > spin_lock(&client->buffer_lock);
> > 
> > -   memcpy(mem, bits, len);
> > +   bitmap_copy(mem, bits, maxbit);
> > 
> > spin_unlock(&dev->event_lock);
> > 
> > @@ -947,7 +945,7 @@ static int evdev_handle_get_val(struct evdev_client 
> > *client,
> > if (ret < 0)
> > evdev_queue_syn_dropped(client);
> > 
> > -   kfree(mem);
> > +   bitmap_free(mem);
> > 
> > return ret;
> >  }
> > @@ -1003,13 +1001,13 @@ static int evdev_set_mask(struct evdev_client 
> > *client,
> > if (!cnt)
> > return 0;
> > 
> > -   mask = kcalloc(sizeof(unsigned long), BITS_TO_LONGS(cnt), 
> > GFP_KERNEL);
> > +   mask = bitmap_zalloc(cnt, GFP_KERNEL);
> > if (!mask)
> > return -ENOMEM;
> > 
> > error = bits_from_user(mask, cnt - 1, codes_size, codes, compat);
> 
> If my understanding of bits_from_user() is correct, you can also use
> bitmap_alloc() here, true?

bits_from_user() copies only as much as the user supplied; we want to zero
out the tail to make sure there is no garbage, so we want to use
kcalloc/kzalloc/bitmap_zalloc here.

Thanks.

-- 
Dmitry



Re: [dm-devel] [PATCH v3 0/5] bitmap: Introduce alloc/free helpers

2018-06-20 Thread Yury Norov
On Mon, Jun 18, 2018 at 04:09:58PM +0300, Andy Shevchenko wrote:
> 
> A lot of code allocates bitmaps using the BITS_PER_LONG() macro and the
> sizeof(unsigned long) operator. Readability suffers because of this.
> 
> The series introduces three helpers, i.e. bitmap_alloc(), bitmap_zalloc() and
> bitmap_free(), to make it cleaner.

tools/include/linux/bitmap.h already has bitmap_alloc(),
and it corresponds to bitmap_zalloc() in this patch. It may
cause problems in the future if people copy functions that
use bitmap_alloc() between kernel code and tools. So I think
we have to propagate this API to tools and update the existing
users of bitmap_alloc() in tools.

What about code that calls specific alloc functions, like
memblock_virt_alloc() and pcpu_mem_zalloc() in mm/percpu.c,
or devm_kcalloc() in drivers/dma/edma.c? If we are going to
unify bitmap allocations in the kernel, we should think about
unifying those cases too. Should it be an additional
flag or an optional pointer to the exact allocator in
bitmap_{,z}alloc()?

Yury
 
> Patch 1 is preparatory, to avoid namespace collisions between the bitmap API and
> the MD bitmap. No functional changes intended.
> 
> Patch 2 is just orphaned from previous release cycle.
> 
> Patch 3 introduces new helpers.
> 
> Patches 4 and 5 are just examples of how to use the new helpers. Locally I have about a
> dozen of them against different subsystems and drivers.
> 
> Ideally it would go through the Input subsystem; thus it needs an Ack from the MD
> maintainer(s).
> 
> Since v2:
> - fix compilation issue in MD bitmap code
> - elaborate changes in commit message of patch 5
> 
> Since v1:
> - added namespace fix patch against MD bitmap API
> - moved functions to lib/bitmap.c to avoid circular dependencies
> - appended Dmitry's tags
> 
> Andy Shevchenko (5):
>   md: Avoid namespace collision with bitmap API
>   bitmap: Drop unnecessary 0 check for u32 array operations
>   bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()
>   Input: gpio-keys - Switch to bitmap_zalloc()
>   Input: evdev - Switch to bitmap API
> 
>  drivers/input/evdev.c |  16 +-
>  drivers/input/keyboard/gpio_keys.c|   8 +-
>  drivers/md/dm-raid.c  |   6 +-
>  drivers/md/md-bitmap.c| 301 +-
>  drivers/md/md-bitmap.h|  46 +--
>  drivers/md/md-cluster.c   |  16 +-
>  drivers/md/md.c   |  44 +--
>  .../md/persistent-data/dm-space-map-common.c  |  12 +-
>  drivers/md/raid1.c|  20 +-
>  drivers/md/raid10.c   |  26 +-
>  drivers/md/raid5-cache.c  |   2 +-
>  drivers/md/raid5.c|  24 +-
>  include/linux/bitmap.h|   8 +
>  lib/bitmap.c  |  28 +-
>  14 files changed, 283 insertions(+), 274 deletions(-)
> 
> --
> 2.17.1



Re: [dm-devel] [PATCH v2 5/5] Input: evdev - Switch to bitmap_zalloc()

2018-06-20 Thread Yury Norov
On Tue, Jun 19, 2018 at 11:33:16AM -0700, Dmitry Torokhov wrote:
> 
> On Sat, Jun 16, 2018 at 12:42:31AM +0300, Yury Norov wrote:
> > Hi Andy,
> >
> > On Fri, Jun 15, 2018 at 04:20:17PM +0300, Andy Shevchenko wrote:
> > > Switch to bitmap_zalloc() to show clearly what we are allocating.
> > > Besides that it returns pointer of bitmap type instead of opaque void *.
> > >
> > > Acked-by: Dmitry Torokhov 
> > > Signed-off-by: Andy Shevchenko 
> > > ---
> > >  drivers/input/evdev.c | 16 +++-
> > >  1 file changed, 7 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> > > index c81c79d01d93..370206f987f9 100644
> > > --- a/drivers/input/evdev.c
> > > +++ b/drivers/input/evdev.c
> > > @@ -481,7 +481,7 @@ static int evdev_release(struct inode *inode, struct 
> > > file *file)
> > > evdev_detach_client(evdev, client);
> > >
> > > for (i = 0; i < EV_CNT; ++i)
> > > -   kfree(client->evmasks[i]);
> > > +   bitmap_free(client->evmasks[i]);
> > >
> > > kvfree(client);
> > >
> > > @@ -925,17 +925,15 @@ static int evdev_handle_get_val(struct evdev_client 
> > > *client,
> > >  {
> > > int ret;
> > > unsigned long *mem;
> > > -   size_t len;
> > >
> > > -   len = BITS_TO_LONGS(maxbit) * sizeof(unsigned long);
> > > -   mem = kmalloc(len, GFP_KERNEL);
> > > +   mem = bitmap_alloc(maxbit, GFP_KERNEL);
> > > if (!mem)
> > > return -ENOMEM;
> >
> > But in the commit message you say you switch to bitmap_zalloc(). IIUC
> > bitmap_alloc() is OK here, but could you please update the commit message
> > to avoid confusion.
> >
> > >
> > > spin_lock_irq(&dev->event_lock);
> > > spin_lock(&client->buffer_lock);
> > >
> > > -   memcpy(mem, bits, len);
> > > +   bitmap_copy(mem, bits, maxbit);
> > >
> > > spin_unlock(&dev->event_lock);
> > >
> > > @@ -947,7 +945,7 @@ static int evdev_handle_get_val(struct evdev_client 
> > > *client,
> > > if (ret < 0)
> > > evdev_queue_syn_dropped(client);
> > >
> > > -   kfree(mem);
> > > +   bitmap_free(mem);
> > >
> > > return ret;
> > >  }
> > > @@ -1003,13 +1001,13 @@ static int evdev_set_mask(struct evdev_client 
> > > *client,
> > > if (!cnt)
> > > return 0;
> > >
> > > -   mask = kcalloc(sizeof(unsigned long), BITS_TO_LONGS(cnt), 
> > > GFP_KERNEL);
> > > +   mask = bitmap_zalloc(cnt, GFP_KERNEL);
> > > if (!mask)
> > > return -ENOMEM;
> > >
> > > error = bits_from_user(mask, cnt - 1, codes_size, codes, compat);
> >
> > If my understanding of bits_from_user() is correct, you can also use
> > bitmap_alloc() here, true?
> 
> bits_from_user() copies only as much as the user supplied; we want to zero
> out the tail to make sure there is no garbage, so we want to use
> kcalloc/kzalloc/bitmap_zalloc here.

I don't understand that. The tail bits of a bitmap (i.e. from after the last
used bit to the end of the last word) are always ignored by kernel code, and
it doesn't matter what is stored in those bits.

(With the exception of copying a bitmap from kernel to userspace. For this
case we have bitmap_copy_clear_tail() to avoid unintentionally exposing
kernel data to the user.)

If you know of any bitmap function that doesn't ignore tail bits, that is a
bug and should be fixed.

By the way, bits_from_user() is badly designed because it takes two size
arguments, maxbit and maxlen, and should be reworked. There's a
single user of this function, and I suspect it can be switched to an
existing core API like bitmap_from_arr32().

Yury
