On 18/03/11 16:04, Pádraig Brady wrote:
> On 18/03/11 13:48, Chris Mason wrote:
>> Excerpts from Pádraig Brady's message of 2011-03-18 09:19:44 -0400:
>>> On 18/03/11 12:04, Chris Mason wrote:
>>>> Excerpts from Jim Meyering's message of 2011-03-18 07:52:43 -0400:
>>>>> Pádraig Brady wrote:
>>>>>>
>>>>>> So am I right in thinking FIEMAP_FLAG_SYNC is no longer needed with 
>>>>>> 2.6.38.
>>>>>> I'm quite worried about implicit syncing in cp actually, where it might
>>>>>> introduce bad performance regressions.
>>>>>
>>>>> Good point.
>>>>>
>>>>>> Maybe we could disable this flag if XFS || kernel >= 2.6.38?
>>>>>> Or maybe we might do as you suggest above and disable extent_copy()
>>>>>> completely, for kernels before 2.6.38.
>>>>>> Not an ideal situation at all :(
>>>>>
>>>>> If there really is risk of data loss with ext4 on kernels < 2.6.38,
>>>>> even if it's only on unusual files, then we'll have to consider
>>>>> using a kernel version check to disable extent_copy.
>>>>>
>>>>> Is there a stand-alone ext4 reproducer?
>>>>
>>>> dd if=/dev/zero of=/mnt/foo bs=1M seek=1 count=1
>>>>
>>>> Without a sync the buggy ext4/btrfs will not find the extent, and report
>>>> only holes.
>>>
>>> OK, FIEMAP_FLAG_SYNC is an effective workaround for that.
>>> So as mentioned above, we might consider not using this flag for
>>> XFS || kernel >= 2.6.38
>>
>> Btrfs before 2.6.38 may have real trouble though, even with the sync.
>> We were returning overlapping ranges to you, so the destination would
>> end up bigger than the original.  This could be fixed in cp by making
>> sure to never seek backwards based on the extents returned.
>>
>> Example bad results:
>>
>> logical: [ 0 ... 1MB ] -> physical [ foo .. foo ]
>> logical  [ 0 ... 2MB ] -> physical [ foo2 .. foo2 ]
>>
>> 100% a btrfs bug, but cp would do the 0 .. 1MB bit and then seek back to
>> zero and do the 0 .. 2MB bit.  If cp had not seeked backwards you would
>> have covered for me.  2.6.38 final fixed this problem.
> 
>> I think even XFS was fixed relatively recently, 2.6.36 and later.
>> Looking at the commit, the simple dd test above wouldn't have triggered
>> it.  Actually, looking at this commit even the sync wouldn't save xfs
>> before 2.6.36, Eric am I reading this right?
>>
>> So maybe just give in and look for 2.6.38 instead of trying to remember
>> all the places where individual filesystems didn't suck.  Give the user
>> a cp --sparse=fiemap-im-really-really-sure to override your assumptions
>> about our bugs.
>>
>> commit 9af25465081480a75824fd7a16a37a5cfebeede9
>> Author: Tao Ma <[email protected]>
>> Date:   Mon Aug 30 02:44:03 2010 +0000
>>
>>     xfs: Make fiemap work with sparse files
>>     
>>     In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want
>>     to return fi_extent_max extents, but actually it won't work for
>>     a sparse file. The reason is that in xfs_getbmap we will
>>     calculate holes and set it in 'out', while out is malloced by
>>     bmv_count(fi_extent_max+1) which didn't consider holes. So in the
>>     worst case, if 'out' vector looks like
>>     [hole, extent, hole, extent, hole, ... hole, extent, hole],
>>     we will only return half of fi_extent_max extents.
> 
> OK thanks for the info. So:
> 
> XFS may miss some extents for sparse files before 2.6.36
> BTRFS and EXT4 need a sync before fiemap() before 2.6.38
> BTRFS can return overlapping extents before 2.6.38
> 
> It seems like we should at least detect overlapping extents in our:
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/extent-scan.c;hb=HEAD
> and return false (don't use fiemap) in this case.
> We've already stopped doing the lseeks() but that assumes non overlapping 
> extents.

Attached is the proposed patch to address the overlapping extents case,
which we'll need if we don't bar kernels before 2.6.38,
but is good to have anyway in case it happens in future.

I think I'll do a follow up patch to get extent_scan_read()
to read all extents before returning, which will improve
the fix (and the previous merge extents change).

cheers,
Pádraig.
>From 602fa952feeeb0a2cd4a4bca92f4659529f01454 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Mon, 28 Mar 2011 19:22:21 +0100
Subject: [PATCH] copy: protect against overlapping extents

* src/extent-scan.c (extent_scan_read): Add a more stringent check
for OFF_T overflow, to ensure subsequent code is immune.
Detect overlapping extents and fail.  If this happens in the
initial fiemap scan, we'll revert to a standard copy, or otherwise
we'll fail with a diagnostic.
* src/copy.c (extent_copy): Add an assert to protect against
generating an invalid hole_size with overlapping extents.
We can remove this when extent_scan_read() is modified to read
all extents, thus encapsulating all extent checking and
merging within that function.
---
 src/copy.c        |    3 +++
 src/extent-scan.c |   11 ++++++++++-
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index a672fd3..7c1b3c6 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -350,6 +350,9 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
               ext_len = 0;
             }
 
+          /* TODO: remove this if extent_scan_read() reads all extents.  */
+          assert (ext_start > last_ext_start + last_ext_len);
+
           hole_size = ext_start - last_ext_start - last_ext_len;
 
           wrote_hole_at_eof = false;
diff --git a/src/extent-scan.c b/src/extent-scan.c
index b9520b7..b1919ec 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -90,7 +90,7 @@ extent_scan_read (struct extent_scan *scan)
 
   for (i = 0; i < scan->ei_count; i++)
     {
-      assert (fm_extents[i].fe_logical <= OFF_T_MAX);
+      assert (fm_extents[i].fe_logical <= OFF_T_MAX - fm_extents[i].fe_length);
 
       if (si && last_ei->ext_flags ==
           (fm_extents[i].fe_flags & ~FIEMAP_EXTENT_LAST)
@@ -102,6 +102,15 @@ extent_scan_read (struct extent_scan *scan)
           /* Copy flags in case different.  */
           last_ei->ext_flags = fm_extents[i].fe_flags;
         }
+      else if (last_ei->ext_logical + last_ei->ext_length
+               > fm_extents[i].fe_logical)
+        {
+          /* BTRFS before 2.6.38 could return
+             overlapping extents for sparse files.  */
+          if (scan->scan_start == 0)
+            scan->initial_scan_failed = true;
+          return false;
+        }
       else
         {
           last_ei = scan->ext_info + si;
-- 
1.7.4

Reply via email to