Re: [PATCH v2 2/2] crypto: caam - fix DMA mapping of stack memory
Hi Horia, I didn't understand your patch thoroughly yet, but I tested it and it gets rid of my DMA-API warning, so: Tested-by: Roland Hieber Thanks! :) - Roland On Sat, Jan 26, 2019 at 08:02:15PM +0200, Horia Geantă wrote: > Roland reports the following issue and provides a root cause analysis: > > "On a v4.19 i.MX6 system with IMA and CONFIG_DMA_API_DEBUG enabled, a > warning is generated when accessing files on a filesystem for which IMA > measurement is enabled: > > [ cut here ] > WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:1181 > check_for_stack.part.9+0xd0/0x120 > caam_jr 2101000.jr0: DMA-API: device driver maps memory from stack > [addr=b668049e] > Modules linked in: > CPU: 0 PID: 1 Comm: switch_root Not tainted 4.19.0-20181214-1 #2 > Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > Backtrace: > [] (dump_backtrace) from [] (show_stack+0x20/0x24) > [] (show_stack) from [] (dump_stack+0xa0/0xcc) > [] (dump_stack) from [] (__warn+0xf0/0x108) > [] (__warn) from [] (warn_slowpath_fmt+0x58/0x74) > [] (warn_slowpath_fmt) from [] > (check_for_stack.part.9+0xd0/0x120) > [] (check_for_stack.part.9) from [] > (debug_dma_map_page+0x144/0x174) > [] (debug_dma_map_page) from [] > (ahash_final_ctx+0x5b4/0xcf0) > [] (ahash_final_ctx) from [] (ahash_final+0x1c/0x20) > [] (ahash_final) from [] (crypto_ahash_op+0x38/0x80) > [] (crypto_ahash_op) from [] > (crypto_ahash_final+0x20/0x24) > [] (crypto_ahash_final) from [] > (ima_calc_file_hash+0x29c/0xa40) > [] (ima_calc_file_hash) from [] > (ima_collect_measurement+0x1dc/0x240) > [] (ima_collect_measurement) from [] > (process_measurement+0x4c4/0x6b8) > [] (process_measurement) from [] > (ima_file_check+0x88/0xa4) > [] (ima_file_check) from [] (path_openat+0x5d8/0x1364) > [] (path_openat) from [] (do_filp_open+0x84/0xf0) > [] (do_filp_open) from [] (do_open_execat+0x84/0x1b0) > [] (do_open_execat) from [] > (__do_execve_file+0x43c/0x890) > [] (__do_execve_file) from [] (sys_execve+0x44/0x4c) > [] (sys_execve) from [] (ret_fast_syscall+0x0/0x28) > ---[ end trace 3455789a10e3aefd ]--- > > The cause is that the struct ahash_request *req is created as a > stack-local variable up in the stack (presumably somewhere in the IMA > implementation), then passed down into the CAAM driver, which tries to > dma_single_map the req->result (indirectly via map_seq_out_ptr_result) > in order to make that buffer available for the CAAM to store the result > of the following hash operation. > > The calling code doesn't know how req will be used by the CAAM driver, > and there could be other such occurrences where stack memory is passed > down to the CAAM driver. Therefore we should rather fix this issue in > the CAAM driver where the requirements are known." > > Fix this problem by: > -instructing the crypto engine to write the final hash in state->caam_ctx > -subsequently memcpy-ing the final hash into req->result > > Cc: # v4.19+ > Reported-by: Roland Hieber > Signed-off-by: Horia Geantă > --- > drivers/crypto/caam/caamhash.c | 85 > +++--- > 1 file changed, 21 insertions(+), 64 deletions(-) > > diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c > index b65e2e54c562..89ecda28f87b 100644 > --- a/drivers/crypto/caam/caamhash.c > +++ b/drivers/crypto/caam/caamhash.c > @@ -178,18 +178,6 @@ static inline int map_seq_out_ptr_ctx(u32 *desc, struct > device *jrdev, > return 0; > } > > -/* Map req->result, and append seq_out_ptr command that points to it */ > -static inline dma_addr_t map_seq_out_ptr_result(u32 *desc, struct device > *jrdev, > - u8 *result, int digestsize) > -{ > - dma_addr_t dst_dma; > - > - dst_dma = dma_map_single(jrdev, result, digestsize, DMA_FROM_DEVICE); > - append_seq_out_ptr(desc, dst_dma, digestsize, 0); > - > - return dst_dma; > -} > - > /* Map current buffer in state (if length > 0) and put it in link table */ > static inline int buf_map_to_sec4_sg(struct device *jrdev, >struct sec4_sg_entry *sec4_sg, > @@ -426,7 +414,6 @@ static int ahash_setkey(struct crypto_ahash *ahash, > > /* > * ahash_edesc - s/w-extended ahash descriptor > - * @dst_dma: physical mapped address of req->result > * @sec4_sg_dma: physical mapped address of h/w link table > * @src_nents: number of segments in input scatterlist > * @sec4_sg_bytes: length of dma mapped sec4_sg space >
[PATCH 2/2] crypto: caam - fix DMA mapping of stack memory
On a v4.19 i.MX6 system with IMA and CONFIG_DMA_API_DEBUG enabled, a warning is generated when accessing files on a filesystem for which IMA measurement is enabled: [ cut here ] WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:1181 check_for_stack.part.9+0xd0/0x120 caam_jr 2101000.jr0: DMA-API: device driver maps memory from stack [addr=b668049e] Modules linked in: CPU: 0 PID: 1 Comm: switch_root Not tainted 4.19.0-20181214-1 #2 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [] (dump_backtrace) from [] (show_stack+0x20/0x24) [] (show_stack) from [] (dump_stack+0xa0/0xcc) [] (dump_stack) from [] (__warn+0xf0/0x108) [] (__warn) from [] (warn_slowpath_fmt+0x58/0x74) [] (warn_slowpath_fmt) from [] (check_for_stack.part.9+0xd0/0x120) [] (check_for_stack.part.9) from [] (debug_dma_map_page+0x144/0x174) [] (debug_dma_map_page) from [] (ahash_final_ctx+0x5b4/0xcf0) [] (ahash_final_ctx) from [] (ahash_final+0x1c/0x20) [] (ahash_final) from [] (crypto_ahash_op+0x38/0x80) [] (crypto_ahash_op) from [] (crypto_ahash_final+0x20/0x24) [] (crypto_ahash_final) from [] (ima_calc_file_hash+0x29c/0xa40) [] (ima_calc_file_hash) from [] (ima_collect_measurement+0x1dc/0x240) [] (ima_collect_measurement) from [] (process_measurement+0x4c4/0x6b8) [] (process_measurement) from [] (ima_file_check+0x88/0xa4) [] (ima_file_check) from [] (path_openat+0x5d8/0x1364) [] (path_openat) from [] (do_filp_open+0x84/0xf0) [] (do_filp_open) from [] (do_open_execat+0x84/0x1b0) [] (do_open_execat) from [] (__do_execve_file+0x43c/0x890) [] (__do_execve_file) from [] (sys_execve+0x44/0x4c) [] (sys_execve) from [] (ret_fast_syscall+0x0/0x28) ---[ end trace 3455789a10e3aefd ]--- The cause is that the struct ahash_request *req is created as a stack-local variable up in the stack (presumably somewhere in the IMA implementation), then passed down into the CAAM driver, which tries to dma_single_map the req->result (indirectly via map_seq_out_ptr_result) in order to make that buffer available for the CAAM to store the result of the following hash operation. The calling code doesn't know how req will be used by the CAAM driver, and there could be other such occurrences where stack memory is passed down to the CAAM driver. Therefore we should rather fix this issue in the CAAM driver where the requirements are known. The problem is solved by introducing a temporary buffer in the auxiliary struct ahash_edesc, which is kmalloc'ed and can be DMA-mapped safely to receive the result from hardware. Then the result is copied back into req->result in the respective done callbacks that are called when the CAAM has finished the request. Other hardware crypto drivers which use DMA also solve it this way, see for example atmel_sha_copy_ready_hash() in drivers/crypto/atmel-sha.c. Supplementary for the above stack trace, fix the issue also in other code paths which show the same usage pattern of req->request. Cc: sta...@vger.kernel.org Signed-off-by: Roland Hieber --- drivers/crypto/caam/caamhash.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index b65e2e54c5625..3f425d3bf6972 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -426,7 +426,8 @@ static int ahash_setkey(struct crypto_ahash *ahash, /* * ahash_edesc - s/w-extended ahash descriptor - * @dst_dma: physical mapped address of req->result + * @result: temporary heap buffer to hold req->result + * @dst_dma: physical mapped address of result * @sec4_sg_dma: physical mapped address of h/w link table * @src_nents: number of segments in input scatterlist * @sec4_sg_bytes: length of dma mapped sec4_sg space @@ -434,6 +435,7 @@ static int ahash_setkey(struct crypto_ahash *ahash, * @sec4_sg: h/w link table */ struct ahash_edesc { + u8 result[CAAM_MAX_HASH_KEY_SIZE]; dma_addr_t dst_dma; dma_addr_t sec4_sg_dma; int src_nents; @@ -498,6 +500,7 @@ static void ahash_done(struct device *jrdev, u32 *desc, u32 err, caam_jr_strstatus(jrdev, err); ahash_unmap(jrdev, edesc, req, digestsize); + memcpy(req->result, edesc->result, digestsize); kfree(edesc); #ifdef DEBUG @@ -567,6 +570,7 @@ static void ahash_done_ctx_src(struct device *jrdev, u32 *desc, u32 err, caam_jr_strstatus(jrdev, err); ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_TO_DEVICE); + memcpy(req->result, edesc->result, digestsize); kfree(edesc); #ifdef DEBUG @@ -858,7 +862,7 @@ static int ahash_final_ctx(struct ahash_request *req) append_seq_in_ptr(desc, edesc->sec4_sg_dma, ctx->ctx_len + buflen, LDST_SGF); - edesc->dst_dma = map_seq_out_ptr_
[PATCH 1/2] crypto: caam - fix indentation of goto label
Signed-off-by: Roland Hieber --- drivers/crypto/caam/caamhash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index bb1a2cdf19512..b65e2e54c5625 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -801,7 +801,7 @@ static int ahash_update_ctx(struct ahash_request *req) #endif return ret; - unmap_ctx: +unmap_ctx: ahash_unmap_ctx(jrdev, edesc, req, ctx->ctx_len, DMA_BIDIRECTIONAL); kfree(edesc); return ret; -- 2.20.1
editing for your photos
I would like to speak with the person that managing photos for your company? We provide image editing like – photos cutting out and retouching. Enhancing your images is just a part of what we can do for your business. Whether you’re an ecommerce store or portrait photographer, real estate professional, or an e-Retailer, we are your personal team of photo editors that integrate seamlessly with your business. Our mainly services are: . Cut out, masking, clipping path, deep etching, transparent background . Colour correction, black and white, light and shadows etc. . Dust cleaning, spot cleaning . Beauty retouching, skin retouching, face retouching, body retouching . Fashion/Beauty Image Retouching . Product image Retouching . Real estate image Retouching . Wedding & Event Album Design. . Restoration and repair old images . Vector Conversion . Portrait image Retouching We can provide you editing test on your photos. Please reply if you are interested. Thanks, Roland
Re: [CRYPTO] is it really optimized ?
> > It seems trivial to keep the last key you were given and do a quick > > memcmp in your setkey method to see if it's different from the last > > key you pushed to hardware, and set a flag if it is. Then only do > > your set_key() if you have a new key to pass to hardware. > > > > I'm assuming the expense is in the aes_write() calls, and you could > > avoid them if you know you're not writing something new. > that's a wrong assumption. aes_write()/aes_read() are both used to > access to the controller and are slow (no cache involved). Sorry, I wasn't clear. I meant that the hardware access is what is slow, and that anything you do on the CPU is relatively cheap compared to that. So my suggestion is just to keep a cache (in CPU memory) of what you have already loaded into the HW, and before reloading the HW just check the cache and don't do the actual HW access if you're not going to change the HW contents. So you avoid any extra aes_write and aes_read calls in the cache hit case. This would have the advantage of making anything that does lots of bulk encryption fast without special casing ecryptfs. - R. - To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [CRYPTO] is it really optimized ?
> > I wonder if there's some way you can cache the last caller and reload > > the key lazily (only when it changes). > > yes something that allows crypto drivers to detect if the key has > changed would be good. It seems trivial to keep the last key you were given and do a quick memcmp in your setkey method to see if it's different from the last key you pushed to hardware, and set a flag if it is. Then only do your set_key() if you have a new key to pass to hardware. I'm assuming the expense is in the aes_write() calls, and you could avoid them if you know you're not writing something new. - R. - To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [CRYPTO] is it really optimized ?
> Again, my code is faster only because I skip the key loading in > "cia_encrypt" method. Actually the gain is bigger in decryption mode > than in encryption one because I also generate the decryption key for > each block. I wonder if there's some way you can cache the last caller and reload the key lazily (only when it changes). Of course without your code it's hard to say... - To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html