Hi,

We noticed that on Android, if you build erofs images containing ELFs with a higher alignment, say 16K or 64K, the size of the uncompressed erofs image increases considerably. The size increase can be mitigated with -Ededupe or --chunksize=4096, but that still results in a lot of redundant disk I/O during file reads, because all the zero blocks are mapped to a single block on disk. Treating data blocks filled with zeros as holes saves disk space and also saves a lot of disk I/O during reads.
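To illustrate the idea, the zero-block check could be sketched roughly as below. This is only a minimal illustration and not the code from the patch; the helper names block_is_all_zero() and count_zero_blocks() are hypothetical and do not exist in erofs-utils.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns true if every byte
 * in buf[0..len) is zero. If the first byte is zero and each byte equals
 * its successor, then all bytes are zero.
 */
static bool block_is_all_zero(const unsigned char *buf, size_t len)
{
	if (len == 0)
		return true;
	return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}

/*
 * Hypothetical helper: counts how many full blocks of blksz bytes in
 * data[0..size) are entirely zero, i.e. how many blocks could be
 * recorded as holes instead of being written out. A trailing partial
 * block is ignored in this sketch.
 */
static size_t count_zero_blocks(const unsigned char *data, size_t size,
				size_t blksz)
{
	size_t holes = 0;
	size_t off;

	if (blksz == 0)
		return 0;
	for (off = 0; off + blksz <= size; off += blksz) {
		if (block_is_all_zero(data + off, blksz))
			holes++;
	}
	return holes;
}
```

A block for which the check succeeds would be recorded as a hole rather than written out, so a reader never has to fetch it from disk.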
EROFS tracepoints for the image built without the fix:

  md5sum-7535 [001] ..... 620668.748558: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 364544 llen 45056 flags RAW
  md5sum-7535 [001] ..... 620668.748559: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 364544 pa 40960 llen 4096 plen 4096 mflags M ret 0
  md5sum-7535 [001] ..... 620668.748560: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 368640 llen 40960 flags RAW
  md5sum-7535 [001] ..... 620668.748560: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 368640 pa 40960 llen 4096 plen 4096 mflags M ret 0
  md5sum-7535 [001] ..... 620668.748561: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 372736 llen 36864 flags RAW
  md5sum-7535 [001] ..... 620668.748561: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 372736 pa 40960 llen 4096 plen 4096 mflags M ret 0
  md5sum-7535 [001] ..... 620668.748562: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 376832 llen 32768 flags RAW

As you can see, all the reads are redirected to the same pa 40960, one 4096-byte block at a time. This also causes fragmentation.

EROFS tracepoints for the image built with detection of zero blocks:

  md5sum-7496 [000] ..... 620150.387246: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 0 llen 65536 flags RAW
  md5sum-7496 [000] ..... 620150.387249: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags ret 0
  md5sum-7496 [000] ..... 620150.387358: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 65536 llen 131072 flags RAW
  md5sum-7496 [000] ..... 620150.387358: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags ret 0
  md5sum-7496 [000] ..... 620150.387460: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 196608 llen 212992 flags RAW

I think this optimization wins on both disk space and I/O cost, so it is better to make it the default than to enable it conditionally with a --sparse flag.

Thanks,
Sandeep.
PS: This patch is based on erofs-utils.git/experimental, as it builds on the previous minextblks fix at
https://lore.kernel.org/all/20240403070700.1716252-1-dhav...@google.com/
which is not in erofs-utils.git/dev yet.

Sandeep Dhavale (1):
  erofs-utils: lib: treat data blocks filled with 0s as a hole

 lib/blobchunk.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

-- 
2.44.0.478.gd926399ef9-goog