On 07/11/2016 11:16 AM, David Sterba wrote:
On Mon, Jul 11, 2016 at 11:00:55AM -0400, Chris Mason wrote:
So, the real bug is that we're letting some delalloc state hang around
after the truncate, probably related to IO in progress.  We do already
account for delalloc in what we return to stat, but there's a corner
case involving truncate where we screw it up.

So the original testcase:

    a) some "tool" creates sparse file
    b) that tool does not sync explicitly and exits ..
    c) tar is called immediately after that to archive the sparse file
    d) tar considers [2] the file completely sparse (because st_blocks is
       zero) and archives no data.  This is where data is lost.

will not happen. The application would basically have to mimic the
provided reproducer script and do the truncate/write loop and be lucky
enough to let tar hit the short race window.
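
To make step d) concrete, here is a minimal sketch of the kind of check
involved. It is not tar's actual code (that is what [2] points to); the
function name looks_fully_sparse is hypothetical and the stat results are
faked with SimpleNamespace so the example runs on any filesystem.

```python
from types import SimpleNamespace


def looks_fully_sparse(st):
    """Hypothetical stand-in for a tar-like heuristic.

    A file with a nonzero size but zero allocated blocks is assumed
    to contain no data at all, so nothing would be read from it.
    """
    return st.st_size > 0 and st.st_blocks == 0


# A genuinely sparse 1 MiB file: size set, no blocks allocated.
sparse = SimpleNamespace(st_size=1 << 20, st_blocks=0)

# A 4 KiB file whose data is fully accounted for (8 * 512 bytes).
written = SimpleNamespace(st_size=4096, st_blocks=8)

# The same 4 KiB file sampled while st_blocks is transiently zero:
# the heuristic cannot tell it apart from a real hole.
racing = SimpleNamespace(st_size=4096, st_blocks=0)

for name, st in (("sparse", sparse), ("written", written), ("racing", racing)):
    print(name, looks_fully_sparse(st))
```

The third case is the one that matters here: the check itself is fine for
a truly sparse file, but it trusts st_blocks at a single instant.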


Looking harder, there is a race window that can trigger this without the
truncate loop:

1) application calls write(), we make the pages delalloc (in-ram st_blocks goes up)
2) VM calls write_cache_pages, we go find a contiguous delalloc range
3) We call cow_file_range on the locked range of pages
4) cow_file_range clears the delalloc bits (in-ram st_blocks goes down)

< ----- race begins here ----->

5) The IO is started
6) The IO completes and extents are inserted into the metadata
7) The on-disk/in-ram st_blocks goes up

< ---- race ends here ---->

This makes a ton more sense than leaking delalloc bits.
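
The sequence above can be modelled as a toy simulation. This is only a
sketch of the accounting, not kernel code: the InodeAccounting class and
writeback_timeline function are made up for illustration, and the model
assumes stat reports on-disk bytes plus delalloc bytes, as described
earlier in the thread.

```python
from dataclasses import dataclass

BLOCK = 512  # st_blocks is counted in 512-byte units


@dataclass
class InodeAccounting:
    """Toy model of the two counters that feed st_blocks."""
    delalloc_bytes: int = 0  # in-RAM bytes still waiting for allocation
    on_disk_bytes: int = 0   # bytes covered by extents already in metadata

    def st_blocks(self):
        # Model assumption: stat sums on-disk and delalloc bytes.
        return (self.on_disk_bytes + self.delalloc_bytes) // BLOCK


def writeback_timeline(nbytes):
    """Return (step, st_blocks) pairs for one buffered write of nbytes."""
    inode = InodeAccounting()
    timeline = []

    inode.delalloc_bytes += nbytes  # step 1: write() makes pages delalloc
    timeline.append(("1 write", inode.st_blocks()))

    inode.delalloc_bytes -= nbytes  # step 4: delalloc bits are cleared
    timeline.append(("4 cow_file_range", inode.st_blocks()))

    # steps 5-6: IO in flight, neither counter covers the data
    timeline.append(("5-6 IO in flight", inode.st_blocks()))

    inode.on_disk_bytes += nbytes   # step 7: extents are inserted
    timeline.append(("7 extents inserted", inode.st_blocks()))
    return timeline


for step, blocks in writeback_timeline(4096):
    print(f"{step:20s} st_blocks={blocks}")
```

In this model a stat() landing between step 4 and step 7 sees zero blocks
for a file that has data, with no truncate involved.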

-chris