Hi Bo,

On 2025/5/22 14:16, Bo Liu wrote:
This patch introduces the use of the Intel QAT to decompress compressed
data in the EROFS filesystem, aiming to improve the decompression speed
of compressed data.

We created a 285MiB compressed file and then used the following command to
create EROFS images with different cluster sizes.
      # mkfs.erofs -zdeflate,level=9 -C16384

The fio command was used to test sequential read, random read, and small
random read (~5%) performance.
      # fio -filename=testfile  -bs=4k -rw=read -name=job1
      # fio -filename=testfile  -bs=4k -rw=randread -name=job1
      # fio -filename=testfile  -bs=4k -rw=randread --io_size=14m -name=job1

Here are some performance numbers for reference:

Processors: Intel(R) Xeon(R) 6766E(144 core)
Memory:     521 GiB

|-----------------------------------------------------------------------------|
|           | Cluster size | sequential read | randread  | small randread(5%) |
|-----------|--------------|-----------------|-----------|--------------------|
| Intel QAT |    4096      |    538  MiB/s   | 112 MiB/s |     20.76 MiB/s    |
| Intel QAT |    16384     |    699  MiB/s   | 158 MiB/s |     21.02 MiB/s    |
| Intel QAT |    65536     |    917  MiB/s   | 278 MiB/s |     20.90 MiB/s    |
| Intel QAT |    131072    |    1056 MiB/s   | 351 MiB/s |     23.36 MiB/s    |
| Intel QAT |    262144    |    1145 MiB/s   | 431 MiB/s |     26.66 MiB/s    |
| deflate   |    4096      |    499  MiB/s   | 108 MiB/s |     21.50 MiB/s    |
| deflate   |    16384     |    422  MiB/s   | 125 MiB/s |     18.94 MiB/s    |
| deflate   |    65536     |    452  MiB/s   | 159 MiB/s |     13.02 MiB/s    |
| deflate   |    131072    |    452  MiB/s   | 177 MiB/s |     11.44 MiB/s    |
| deflate   |    262144    |    466  MiB/s   | 194 MiB/s |     10.60 MiB/s    |

Signed-off-by: Bo Liu <[email protected]>
---
v1: https://lore.kernel.org/linux-erofs/[email protected]/
v2: https://lore.kernel.org/linux-erofs/[email protected]/T/#t
v3: https://lore.kernel.org/linux-erofs/[email protected]/
v4: https://lore.kernel.org/linux-erofs/[email protected]/
changes since v4:
  - add sysfs documentation.

  Documentation/ABI/testing/sysfs-fs-erofs |  12 ++
  fs/erofs/Kconfig                         |  14 ++
  fs/erofs/Makefile                        |   1 +
  fs/erofs/compress.h                      |  10 ++
  fs/erofs/decompressor_crypto.c           | 186 +++++++++++++++++++++++
  fs/erofs/decompressor_deflate.c          |  17 ++-
  fs/erofs/sysfs.c                         |  34 ++++-
  fs/erofs/zdata.c                         |   1 +
  8 files changed, 272 insertions(+), 3 deletions(-)
  create mode 100644 fs/erofs/decompressor_crypto.c

diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/ABI/testing/sysfs-fs-erofs
index b134146d735b..95201a62f704 100644
--- a/Documentation/ABI/testing/sysfs-fs-erofs
+++ b/Documentation/ABI/testing/sysfs-fs-erofs
@@ -27,3 +27,15 @@ Description: Writing to this will drop compression-related caches,
                - 1 : invalidate cached compressed folios
                - 2 : drop in-memory pclusters
                - 3 : drop in-memory pclusters and cached compressed folios
+
+What:          /sys/fs/erofs/accel
+Date:          May 2025
+Contact:       "Bo Liu" <[email protected]>
+Description:   The accel file is read-write and allows to set or show
+               hardware decompression accelerators, and it supports writing
+               multiple accelerators separated by '\n'.

                Used to set or show hardware accelerators in effect and multiple
                accelerators are separated by '\n'.

                Supported accelerator(s): qat_deflate

                Disable all accelerators with an empty string (echo > accel).
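
For illustration only, a rough shell sketch of how the interface would be used under the description suggested above. The helper names and the ACCEL_PATH variable are hypothetical, and it assumes the node accepts a '\n'-separated list and treats an empty write as "disable all", as described:

```shell
# Sketch only: ACCEL_PATH defaults to the sysfs node named in this patch.
# The erofs_accel_* helper names are made up for this example.
ACCEL_PATH="${ACCEL_PATH:-/sys/fs/erofs/accel}"

# Show the accelerators currently in effect, one per line.
erofs_accel_show() {
        cat "$ACCEL_PATH"
}

# Enable the given accelerators, written as a '\n'-separated list.
erofs_accel_set() {
        printf '%s\n' "$@" > "$ACCEL_PATH"
}

# Disable all accelerators with an empty string (echo > accel).
erofs_accel_clear() {
        echo > "$ACCEL_PATH"
}
```

With these helpers, `erofs_accel_set qat_deflate` followed by `erofs_accel_show` should list qat_deflate if the accelerator is available on the system, and `erofs_accel_clear` falls back to software decompression.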

+               Currently supported accelerators:

...

+
+static int __z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
+                               struct crypto_acomp *tfm)
+{
+       struct sg_table st_src, st_dst;
+       struct acomp_req *req;
+       struct crypto_wait wait;
+       u8 *headpage;
+       int ret;
+
+       headpage = kmap_local_page(*rq->in);
+       ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
+                               min_t(unsigned int, rq->inputsize,
+                                                       rq->sb->s_blocksize - rq->pageofs_in));

        ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
                                min_t(unsigned int, rq->inputsize,
                                      rq->sb->s_blocksize - rq->pageofs_in));

Otherwise it looks good to me.

Thanks,
Gao Xiang
