On Thu, Feb 14, 2019 at 03:17:20PM +0000, fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
> 
> In the past we had data corruption when reading compressed extents that
> are shared within the same file and they are consecutive, this got fixed
> by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and
> shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read
> corruption of compressed and shared extents"). However there was a case
> that was missing in those fixes, which is when the shared and compressed
> extents are referenced with a non-zero offset. The following shell script
> creates a reproducer for this issue:
> 
>   #!/bin/bash
> 
>   mkfs.btrfs -f /dev/sdc &> /dev/null
>   mount -o compress /dev/sdc /mnt/sdc
> 
>   # Create a file with 3 consecutive compressed extents, each with an
>   # uncompressed size of 128K and a compressed size of 4K.
>   for ((i = 1; i <= 3; i++)); do
>       head -c 4096 /dev/zero
>       for ((j = 1; j <= 31; j++)); do
>           head -c 4096 /dev/zero | tr '\0' "\377"
>       done
>   done > /mnt/sdc/foobar
>   sync
> 
>   echo "Digest after file creation:   $(md5sum /mnt/sdc/foobar)"
> 
>   # Clone the first extent into offsets 128K and 256K.
>   xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
>   xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
>   sync
> 
>   echo "Digest after cloning:         $(md5sum /mnt/sdc/foobar)"
> 
>   # Punch holes into the regions that are already full of zeroes.
>   xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
>   xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
>   xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
>   sync
> 
>   echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
> 
>   echo "Dropping page cache..."
>   sysctl -q vm.drop_caches=1
>   echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
> 
>   umount /dev/sdc
> 
> When running the script we get the following output:
> 
>   Digest after file creation:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   linked 131072/131072 bytes at offset 131072
>   128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
>   linked 131072/131072 bytes at offset 262144
>   128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
>   Digest after cloning:         5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   Digest after hole punching:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   Dropping page cache...
>   Digest after hole punching:   fba694ae8664ed0c2e9ff8937e7f1484  /mnt/sdc/foobar
> 
> This happens because, after reading all the pages of the extent in the
> range from 128K to 256K for example, we read the hole at offset 256K
> and then, when reading the page at offset 260K, we don't submit the
> existing bio, which is responsible for filling only the pages in the
> range from 128K to 256K. Instead we add the pages from the range 260K
> to 384K to that bio and submit it only after iterating over the entire
> range. Once the bio completes, the uncompressed data fills only the
> pages in the range from 128K to 256K, since there is no more data read
> from disk, leaving the pages in the range 260K to 384K unfilled. It is
> just a slightly different variant of what was solved by commit
> 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared
> extents").
> 
> Fix this by forcing a bio submit, during readpages(), whenever we find
> a compressed extent map for a page that is different from the extent
> map of the previous page, or that starts at a different offset in case
> it is the same compressed extent referenced at different file offsets.
> In other words, the submit decision is keyed on the extent map's start
> offset instead of its original start offset.
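> 
> In pseudo-code the fixed check then looks roughly like this (a sketch
> only; the names em, prev_em_start and force_bio_submit are carried
> over from the snippet above and are not necessarily those used in the
> final patch):
> 
>   /*
>    * Compare (and remember) em->start, the file offset of this extent
>    * map, instead of em->orig_start. Consecutive file ranges sharing
>    * the same compressed extent now always differ in the compared
>    * value, so the current bio is submitted before pages of the next
>    * range are added to a new one.
>    */
>   if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) &&
>       prev_em_start && *prev_em_start != (u64)-1 &&
>       *prev_em_start != em->start)
>           force_bio_submit = true;
> 
>   /* Remember this extent map's start for the next iteration. */
>   if (prev_em_start)
>           *prev_em_start = em->start;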
> 
> A test case for fstests follows soon.
> 
> Reported-by: Zygo Blaxell <ce3g8...@umail.furryterror.org>
> Tested-by: Zygo Blaxell <ce3g8...@umail.furryterror.org>
> Cc: sta...@vger.kernel.org # 4.3+
> Signed-off-by: Filipe Manana <fdman...@suse.com>

Added to misc-next and scheduled for 5.1, thanks. And thanks to Zygo for
the report.
