Re: [RFC PATCH v0 0/3] erofs-utils: support multiple block compression

2020-12-30 Thread Gao Xiang
On Wed, Dec 30, 2020 at 04:47:25PM +0800, Gao Xiang via Linux-erofs wrote:
> From: Gao Xiang 
> 
> Hi folks,
> 
> This is the first RFC patch of multiple block compression (including
> erofsfuse), after carefully thinking over the on-disk design to support
> multiblock in-place decompression.
> 
> Compression ratio results (POC, lz4hc, lz4-1.9.3, not final result):
>   10  enwik9
>   621211648   enwik9_4k.squashfs.img
>   557858816   enwik9_4k.erofs.img
>   556191744   enwik9_8k.squashfs.img
>   502661120   enwik9_16k.squashfs.img
>   500723712   enwik9_8k.erofs.img
>   458784768   enwik9_32k.squashfs.img
>   453971968   enwik9_16k.erofs.img
>   422318080   enwik9_64k.squashfs.img
>   416686080   enwik9_32k.erofs.img
>   398204928   enwik9_128k.squashfs.img
>   395276288   enwik9_64k.erofs.img

I can also think of several compression strategies that limit read
amplification while maintaining a given compression ratio (C/R), since EROFS
can compress variable-sized input data into an arbitrary number of compressed
blocks for each pcluster, FYI.
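
To illustrate the idea of packing variable-sized input into a bounded number
of compressed blocks, here is a rough sketch. It is not erofs-utils code: zlib
stands in for LZ4HC, the 4096-byte block size is an assumption, and the
function names are hypothetical. The binary search assumes that a longer
input prefix never compresses to fewer bytes, which holds only approximately.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def compressed_len(data: bytes) -> int:
    """Compressed size in bytes (zlib is a stand-in for the real compressor)."""
    return len(zlib.compress(data, 9))


def largest_prefix_fitting(data: bytes, max_blocks: int,
                           block_size: int = BLOCK_SIZE) -> int:
    """Length of the longest prefix found whose compressed form fits the budget.

    `lo` only ever moves to a length that was checked to fit, so the
    returned prefix always fits in max_blocks * block_size bytes.
    """
    budget = max_blocks * block_size
    lo, hi = 0, len(data)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if compressed_len(data[:mid]) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return lo


def split_into_pclusters(data: bytes, max_blocks: int,
                         block_size: int = BLOCK_SIZE):
    """Split data into (input_length, compressed_block_count) pairs."""
    parts = []
    pos = 0
    while pos < len(data):
        n = largest_prefix_fitting(data[pos:], max_blocks, block_size)
        if n == 0:
            raise ValueError("budget too small to hold any input")
        comp = compressed_len(data[pos:pos + n])
        blocks = -(-comp // block_size)  # ceiling division
        parts.append((n, blocks))
        pos += n
    return parts
```

A smaller max_blocks means fewer compressed blocks have to be read and
decompressed to reach any given byte, at the cost of a worse ratio; a larger
one trades the other way.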

Thanks,
Gao Xiang



[RFC PATCH v0 0/3] erofs-utils: support multiple block compression

2020-12-30 Thread Gao Xiang via Linux-erofs
From: Gao Xiang 

Hi folks,

This is the first RFC patch of multiple block compression (including
erofsfuse), after carefully thinking over the on-disk design to support
multiblock in-place decompression.

Compression ratio results (POC, lz4hc, lz4-1.9.3, not final result):
10  enwik9
621211648   enwik9_4k.squashfs.img
557858816   enwik9_4k.erofs.img
556191744   enwik9_8k.squashfs.img
502661120   enwik9_16k.squashfs.img
500723712   enwik9_8k.erofs.img
458784768   enwik9_32k.squashfs.img
453971968   enwik9_16k.erofs.img
422318080   enwik9_64k.squashfs.img
416686080   enwik9_32k.erofs.img
398204928   enwik9_128k.squashfs.img
395276288   enwik9_64k.erofs.img
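
For a quick comparison at equal cluster sizes, the image sizes above can be
turned into relative savings. This is only a small helper sketch; the
function names are made up for illustration, and the byte counts are copied
from the list above.

```python
# (squashfs image bytes, erofs image bytes) per cluster size, from the list above
SIZES = {
    "4k": (621211648, 557858816),
    "8k": (556191744, 500723712),
    "16k": (502661120, 453971968),
    "32k": (458784768, 416686080),
    "64k": (422318080, 395276288),
}


def saving_percent(squashfs_bytes: int, erofs_bytes: int) -> float:
    """How much smaller the EROFS image is, as a percentage of the squashfs size."""
    return 100.0 * (squashfs_bytes - erofs_bytes) / squashfs_bytes


def savings_table(sizes=SIZES):
    """Map each cluster size to the EROFS saving in percent."""
    return {k: saving_percent(s, e) for k, (s, e) in sizes.items()}


if __name__ == "__main__":
    for cluster, pct in savings_table().items():
        print(f"{cluster:>4}: EROFS image is {pct:.2f}% smaller than squashfs")
```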

TODO:
- support compact indexes for multiple block compression **;
- support multithread compression (keep compressed data in memory);
- carefully design kernel optimized paths to maximize runtime performance;
- wide testing.

If you think this would be useful for your products and you are also
interested in development, feel free to pick these up as well, since I don't
have much free time and progress might be somewhat slow (I aim to finish them
all before the next LTS).

Thanks,
Gao Xiang

Gao Xiang (3):
  erofs-utils: add -C# for the maximum size of pclusters
  erofs-utils: mkfs: support multiple block compression
  erofs-utils: fuse: support multiple block compression

 include/erofs/config.h   |  2 ++
 include/erofs/internal.h |  1 +
 include/erofs_fs.h       | 19 ---
 lib/compress.c           | 70 --
 lib/config.c             |  1 +
 lib/data.c               |  4 +--
 lib/zmap.c               | 72
 mkfs/main.c              | 14 +++-
 8 files changed, 146 insertions(+), 37 deletions(-)

-- 
2.24.0