On Wed, Jun 11, 2025 at 11:25:21PM -0700, Eric Biggers wrote:
> On Thu, Jun 12, 2025 at 12:59:14AM +0000, Eric Biggers wrote:
> > On Thu, Jun 12, 2025 at 09:21:26AM +0900, Simon Richter wrote:
> > > Hi,
> > > 
> > > On 6/12/25 05:58, Eric Biggers wrote:
> > > 
> > > > But
> > > > otherwise this style of hardware offload is basically obsolete and has
> > > > been superseded by hardware-accelerated crypto instructions directly on
> > > > the CPU as well as inline storage encryption (UFS/eMMC).
> > > 
> > > For desktop, yes, but embedded still has quite a few of these, for example
> > > the STM32 crypto offload engine
> 
> By the way, I noticed you specifically mentioned STM32.  I'm not sure if you
> looked at the links I had in my commit message, but one of them
> (https://github.com/google/fscryptctl/issues/32) was actually for the STM32
> driver being broken and returning the wrong results, which broke filename
> encryption.  The user fixed the issue by disabling the STM32 driver, and they
> seemed okay with that.
> 
> That doesn't sound like something useful, IMO.  It sounds more like something
> actively harmful to users.
> 
> Here's another one I forgot to mention:
> https://github.com/google/fscryptctl/issues/9
> 
> I get blamed for these issues, because it's fscrypt that breaks.

Since two people were pushing the STM32 crypto engine in this thread:

I measured decryption throughput on 4 KiB messages on an STM32MP157F-DK2.  This
is an embedded evaluation board that includes an STM32 crypto engine and has an
800 MHz Cortex-A7 processor.  Cortex-A7 doesn't have AES instructions.  Results:

    AES-128-CBC-ESSIV:
        essiv(stm32-cbc-aes,sha256-arm):
            3.1 MB/s
        essiv(cbc-aes-neonbs,sha256-arm):
            15.5 MB/s

    AES-256-XTS:
        xts(stm32-ecb-aes):
            3.1 MB/s
        xts-aes-neonbs:
            11.0 MB/s
            
    Adiantum:
        adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon):
            53.1 MB/s

That was the synchronous throughput.  However, submitting multiple requests
asynchronously (which, again, fscrypt doesn't actually do) barely helps.
Apparently the STM32 crypto engine has only one hardware queue.

I already strongly suspected that these non-inline crypto engines aren't worth
using.  But I didn't realize they are quite this bad.  Even with AES on a
Cortex-A7 CPU that lacks AES instructions, the CPU is much faster!

But of course Adiantum is even faster, as it was specifically designed for CPUs
that don't have AES instructions.
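
To put the numbers above in perspective, here is a rough sketch that turns them
into speedup ratios and an approximate time per 4 KiB message.  The throughput
figures are copied from the measurements above; everything else is an
assumption made for illustration: that "MB" means 10^6 bytes, that the helper
names (speedup, usec_per_message) are hypothetical, and that dividing message
size by throughput is a fair estimate of per-message time.

```python
# Throughput figures in MB/s, copied from the measurements above.
STM32_ENGINE = {
    "AES-128-CBC-ESSIV": 3.1,   # essiv(stm32-cbc-aes,sha256-arm)
    "AES-256-XTS": 3.1,         # xts(stm32-ecb-aes)
}
CPU_NEON = {
    "AES-128-CBC-ESSIV": 15.5,  # essiv(cbc-aes-neonbs,sha256-arm)
    "AES-256-XTS": 11.0,        # xts-aes-neonbs
}
ADIANTUM = 53.1                 # adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)

MSG_BYTES = 4096                # 4 KiB messages, as in the measurement
BYTES_PER_MB = 1_000_000        # assumption: MB means 10^6 bytes


def speedup(fast_mb_s, slow_mb_s):
    """Return how many times faster the first throughput is than the second."""
    return fast_mb_s / slow_mb_s


def usec_per_message(mb_s, msg_bytes=MSG_BYTES):
    """Estimate microseconds per message as message size over throughput."""
    return msg_bytes / (mb_s * BYTES_PER_MB) * 1e6


if __name__ == "__main__":
    for mode, engine in STM32_ENGINE.items():
        cpu = CPU_NEON[mode]
        print(f"{mode}: CPU is {speedup(cpu, engine):.1f}x the crypto engine "
              f"({usec_per_message(cpu):.0f} us vs "
              f"{usec_per_message(engine):.0f} us per 4 KiB)")
    # Both engine results are 3.1 MB/s, so either serves as the baseline here.
    baseline = STM32_ENGINE["AES-256-XTS"]
    print(f"Adiantum: {speedup(ADIANTUM, baseline):.1f}x the crypto engine")
```

Under those assumptions, the CPU comes out at roughly 5x (CBC-ESSIV) and 3.5x
(XTS) the crypto engine, and Adiantum at roughly 17x.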

- Eric


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
