Hi Bo,
On 2025/5/22 14:16, Bo Liu wrote:
This patch introduces the use of the Intel QAT to decompress compressed
data in the EROFS filesystem, aiming to improve the decompression speed
of compressed data.
We created a 285MiB compressed file and then used the following command to
create EROFS images with different cluster sizes.
# mkfs.erofs -zdeflate,level=9 -C16384
The fio command was used to test sequential read, random read, and small
random read (~5%) performance.
# fio -filename=testfile -bs=4k -rw=read -name=job1
# fio -filename=testfile -bs=4k -rw=randread -name=job1
# fio -filename=testfile -bs=4k -rw=randread --io_size=14m -name=job1
Here are some performance numbers for reference:
Processors: Intel(R) Xeon(R) 6766E(144 core)
Memory: 521 GiB
|-----------------------------------------------------------------------------|
|           | Cluster size | sequential read | randread  | small randread(5%) |
|-----------|--------------|-----------------|-----------|--------------------|
| Intel QAT | 4096         | 538 MiB/s       | 112 MiB/s | 20.76 MiB/s        |
| Intel QAT | 16384        | 699 MiB/s       | 158 MiB/s | 21.02 MiB/s        |
| Intel QAT | 65536        | 917 MiB/s       | 278 MiB/s | 20.90 MiB/s        |
| Intel QAT | 131072       | 1056 MiB/s      | 351 MiB/s | 23.36 MiB/s        |
| Intel QAT | 262144       | 1145 MiB/s      | 431 MiB/s | 26.66 MiB/s        |
| deflate   | 4096         | 499 MiB/s       | 108 MiB/s | 21.50 MiB/s        |
| deflate   | 16384        | 422 MiB/s       | 125 MiB/s | 18.94 MiB/s        |
| deflate   | 65536        | 452 MiB/s       | 159 MiB/s | 13.02 MiB/s        |
| deflate   | 131072       | 452 MiB/s       | 177 MiB/s | 11.44 MiB/s        |
| deflate   | 262144       | 466 MiB/s       | 194 MiB/s | 10.60 MiB/s        |
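
For reference, the QAT-to-deflate ratios implied by the numbers above can be
worked out with a small userspace sketch like the one below. The struct and
function names are hypothetical and exist only for illustration; the values
are copied from the table (MiB/s), and index 0..4 corresponds to cluster
sizes 4096..262144.

```c
#include <assert.h>
#include <stddef.h>

/* One table row; throughput values are in MiB/s. */
struct bench_row {
	unsigned int cluster_size;
	double seq;	/* sequential read */
	double rnd;	/* randread */
	double small;	/* small randread (5%) */
};

#define NR_ROWS 5

/* Values copied from the table in the commit message. */
static const struct bench_row qat_rows[NR_ROWS] = {
	{   4096,  538, 112, 20.76 },
	{  16384,  699, 158, 21.02 },
	{  65536,  917, 278, 20.90 },
	{ 131072, 1056, 351, 23.36 },
	{ 262144, 1145, 431, 26.66 },
};

static const struct bench_row deflate_rows[NR_ROWS] = {
	{   4096,  499, 108, 21.50 },
	{  16384,  422, 125, 18.94 },
	{  65536,  452, 159, 13.02 },
	{ 131072,  452, 177, 11.44 },
	{ 262144,  466, 194, 10.60 },
};

/* Ratio of QAT to software deflate throughput for row i (> 1 means QAT wins). */
static double seq_speedup(size_t i)
{
	assert(i < NR_ROWS);
	return qat_rows[i].seq / deflate_rows[i].seq;
}

static double rnd_speedup(size_t i)
{
	assert(i < NR_ROWS);
	return qat_rows[i].rnd / deflate_rows[i].rnd;
}

static double small_speedup(size_t i)
{
	assert(i < NR_ROWS);
	return qat_rows[i].small / deflate_rows[i].small;
}
```

Read this way, the gap widens as the cluster size grows, while the 4096-byte
small randread case comes out marginally below 1.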
Signed-off-by: Bo Liu <[email protected]>
---
v1:
https://lore.kernel.org/linux-erofs/[email protected]/
v2:
https://lore.kernel.org/linux-erofs/[email protected]/T/#t
v3:
https://lore.kernel.org/linux-erofs/[email protected]/
v4:
https://lore.kernel.org/linux-erofs/[email protected]/
changes since v4:
- add sysfs documentation.
Documentation/ABI/testing/sysfs-fs-erofs | 12 ++
fs/erofs/Kconfig | 14 ++
fs/erofs/Makefile | 1 +
fs/erofs/compress.h | 10 ++
fs/erofs/decompressor_crypto.c | 186 +++++++++++++++++++++++
fs/erofs/decompressor_deflate.c | 17 ++-
fs/erofs/sysfs.c | 34 ++++-
fs/erofs/zdata.c | 1 +
8 files changed, 272 insertions(+), 3 deletions(-)
create mode 100644 fs/erofs/decompressor_crypto.c
diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/ABI/testing/sysfs-fs-erofs
index b134146d735b..95201a62f704 100644
--- a/Documentation/ABI/testing/sysfs-fs-erofs
+++ b/Documentation/ABI/testing/sysfs-fs-erofs
@@ -27,3 +27,15 @@ Description: Writing to this will drop compression-related caches,
- 1 : invalidate cached compressed folios
- 2 : drop in-memory pclusters
- 3 : drop in-memory pclusters and cached compressed folios
+
+What: /sys/fs/erofs/accel
+Date: May 2025
+Contact: "Bo Liu" <[email protected]>
+Description: The accel file is read-write and allows to set or show
+ hardware decompression accelerators, and it supports writing
+ multiple accelerators separated by '\n'.
Used to set or show hardware accelerators in effect and multiple
accelerators are separated by '\n'.
Supported accelerator(s): qat_deflate
Disable all accelerators with an empty string (echo > accel).
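
To make the intended format concrete, here is a minimal userspace sketch of
how a tool might build the string to write to the accel file under the
wording above. build_accel_payload() is a hypothetical helper, not part of
this patch or of any existing tool; it only joins names with '\n' and leaves
the actual write to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): join accelerator names with
 * '\n' into buf. Returns the payload length, or -1 if buf is too small.
 * With n == 0 the payload is the empty string, which per the description
 * above disables all accelerators.
 */
static int build_accel_payload(const char *const *names, size_t n,
			       char *buf, size_t size)
{
	size_t off = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(names[i]);
		size_t need = len + (i ? 1 : 0);	/* separator before all but the first */

		if (off + need + 1 > size)		/* +1 for the trailing NUL */
			return -1;
		if (i)
			buf[off++] = '\n';
		memcpy(buf + off, names[i], len);
		off += len;
		buf[off] = '\0';
	}
	return (int)off;
}
```

A caller would then write the returned number of bytes from buf to
/sys/fs/erofs/accel; with the single supported accelerator that is just
"qat_deflate".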
+ Currently supported accelerators:
...
+
+static int __z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
+ struct crypto_acomp *tfm)
+{
+ struct sg_table st_src, st_dst;
+ struct acomp_req *req;
+ struct crypto_wait wait;
+ u8 *headpage;
+ int ret;
+
+ headpage = kmap_local_page(*rq->in);
+ ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
+ min_t(unsigned int, rq->inputsize,
+ rq->sb->s_blocksize -
rq->pageofs_in));
ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
min_t(unsigned int, rq->inputsize,
rq->sb->s_blocksize - rq->pageofs_in));
Otherwise it looks good to me.
Thanks,
Gao Xiang