Hi,
We noticed that on Android, if you build EROFS images containing ELFs
with a higher alignment, say 16K or 64K, the size of the uncompressed
EROFS image increases considerably. The size increase can be mitigated
with -Ededupe or --chunksize=4096, but that still results in a lot of
redundant disk I/O during file reads, because all the zero blocks are
mapped to a single block on disk. Treating data blocks filled with zeros
as holes saves disk space and also saves us a lot of disk I/O during
reads.
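
To illustrate the idea (this is only a minimal sketch, not the code from
the patch), a chunk can be tested for being all zeros and, if so, recorded
as a hole instead of being written out. The names block_is_zero(),
place_chunk() and HOLE_BLKADDR are hypothetical and exist only for this
example:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sentinel meaning "no backing block on disk"; chosen for this sketch only. */
#define HOLE_BLKADDR ((unsigned int)-1)

/*
 * Hypothetical helper: returns true when every byte in buf[0..len) is zero.
 * If the first byte is zero and each byte equals its successor, the whole
 * buffer is zero, so one memcmp() over the overlapping range is enough.
 */
static bool block_is_zero(const unsigned char *buf, size_t len)
{
	if (len == 0)
		return true;
	return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}

/*
 * Hypothetical placement step: a zero-filled chunk becomes a hole and
 * consumes no disk block; any other chunk gets the next free block address.
 */
static unsigned int place_chunk(const unsigned char *buf, size_t len,
				unsigned int next_blkaddr)
{
	return block_is_zero(buf, len) ? HOLE_BLKADDR : next_blkaddr;
}
```

With something along these lines, a reader of a hole never has to touch
the disk, since the mapping itself says the range is all zeros.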

EROFS tracepoints for the image built without the fix:

md5sum-7535    [001] ..... 620668.748558: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 364544 llen 45056 flags RAW
md5sum-7535    [001] ..... 620668.748559: erofs_map_blocks_exit: dev = (7,0), 
nid = 60, flags RAW la 364544 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748560: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 368640 llen 40960 flags RAW
md5sum-7535    [001] ..... 620668.748560: erofs_map_blocks_exit: dev = (7,0), 
nid = 60, flags RAW la 368640 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748561: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 372736 llen 36864 flags RAW
md5sum-7535    [001] ..... 620668.748561: erofs_map_blocks_exit: dev = (7,0), 
nid = 60, flags RAW la 372736 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748562: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 376832 llen 32768 flags RAW

As you can see, every read is redirected to the same physical address,
pa 40960, one 4096-byte block at a time. This also causes fragmentation.

EROFS tracepoints for the image built with detection of zero blocks:

md5sum-7496    [000] ..... 620150.387246: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 0 llen 65536 flags RAW
md5sum-7496    [000] ..... 620150.387249: erofs_map_blocks_exit: dev = (7,0), 
nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags  ret 0
md5sum-7496    [000] ..... 620150.387358: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 65536 llen 131072 flags RAW
md5sum-7496    [000] ..... 620150.387358: erofs_map_blocks_exit: dev = (7,0), 
nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags  ret 0
md5sum-7496    [000] ..... 620150.387460: erofs_map_blocks_enter: dev = (7,0), 
nid = 60, la 196608 llen 212992 flags RAW

I think this optimization is a win for both disk space and I/O cost, so it
is better to make it the default than to enable it conditionally with a
--sparse flag.

Thanks,
Sandeep.

PS: This patch is based on erofs-utils.git/experimental as it builds on the
previous fix of minextblks at
https://lore.kernel.org/all/20240403070700.1716252-1-dhav...@google.com/
which is not in erofs-utils.git/dev yet.


Sandeep Dhavale (1):
  erofs-utils: lib: treat data blocks filled with 0s as a hole

 lib/blobchunk.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

-- 
2.44.0.478.gd926399ef9-goog
