I second this.  I've seen the same behavior.

Clone seems to have evolved a little further than extent-same knows about.
For example, the extent-same ioctl has code that tries to avoid cloning
from one part of an inode to another part of the same inode; however,
the clone ioctl (which extent-same calls) has no such restriction.

As Matt mentioned, clone_range seems quite happy to accept a partial block
at EOF.  cp --reflink would be much harder to use if it did not.
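
For anyone following the numbers in Matt's script quoted below: the two
lengths it passes (999424 and 1000000) come from rounding the file size
down to a block boundary.  A minimal sketch of that arithmetic, assuming a
4096-byte block size (the thread does not state it, but 999424 = 244 * 4096
is consistent with it).  The helper names are mine for illustration, not
from any btrfs tool:

```shell
#!/bin/bash
# Assumption: 4096-byte filesystem block size; not stated in the thread.
block_size=4096
file_size=1000000   # size of the test file in the quoted reproducer

# Round a length down to the nearest multiple of the block size.
align_down() { echo $(( $1 / $2 * $2 )); }

# Round a length up to the nearest multiple of the block size.
align_up() { echo $(( ($1 + $2 - 1) / $2 * $2 )); }

full_len=$(align_down "$file_size" "$block_size")
tail_len=$(( file_size - full_len ))
next_boundary=$(align_up "$file_size" "$block_size")

echo "end of last full block:  $full_len"
echo "bytes in partial block:  $tail_len"
echo "next block boundary:     $next_boundary"
```

With these inputs the last full block ends at 999424, leaving a 576-byte
tail.  Asking for offset + len = 1000000 is unaligned, and asking for
1003520 runs past EOF, so extent-same rejects both.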

On Mon, Mar 02, 2015 at 08:59:11PM +0000, Matt Robinson wrote:
> Hi David,
> 
> Have you had a chance to look at this?  Am very happy to answer
> further questions, adjust my implementation, provide a different kind
> of test case, etc.
> 
> Many Thanks,
> 
> Matt
> 
> On 28 January 2015 at 19:46, Matt Robinson <g...@nerdoftheherd.com> wrote:
> > On 28 January 2015 at 12:55, David Sterba <dste...@suse.cz> wrote:
> >> On Mon, Jan 26, 2015 at 06:05:51PM +0000, Matt Robinson wrote:
> >>> It is not currently possible to deduplicate the last block of files
> >>> whose size is not a multiple of the block size, as the btrfs_extent_same
> >>> ioctl returns -EINVAL if offset + size is greater than the file size or
> >>> is not aligned to the fs block size.
> >>
> >> Do you have a reproducer for that?
> >
> > I've been using the (quick and dirty) bash script at the end of this
> > mail which uses btrfs-extent-same from
> > https://github.com/markfasheh/duperemove/ to call the ioctl.  To
> > summarize: it creates a new filesystem, creates a file with a size
> > which is not a multiple of the block size, copies it, and then calls
> > the ioctl to ask firstly for all of the complete blocks (for
> > comparison) and then the entire files to be deduplicated.
> >
> > Running the script under a kernel compiled from
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git gives
> > a status of -22 from the second btrfs-extent-same call and the final
> > btrfs filesystem df shows:
> > Data, single: total=8.00MiB, used=1.91MiB
> >
> > However, running under the same kernel plus my patch shows this final
> > data usage:
> > Data, single: total=8.00MiB, used=980.00KiB
> >
> >> The alignment is required to let btrfs_clone and the extent dropping
> >> functions work. [...]
> >
> > Which is why it is currently not possible to deduplicate a final
> > incomplete block of a file:
> > * Passing len + offset = the actual end of the file: Rejected as it is
> > not aligned
> > * Passing len + offset = the end of the block: Rejected as it exceeds
> > the actual end of the file.
> >
> > Please let me know if you need any further information, if my
> > implementation should be different, or if there is a better way I
> > could demonstrate the issue.
> >
> > Many Thanks,
> >
> > Matt
> >
> > ---
> >
> > #!/bin/bash -e
> >
> > if [[ $EUID -ne 0 ]]; then
> >    echo "This script must be run as root"
> >    exit 1
> > fi
> >
> > loopback=$(losetup -f)
> >
> > echo "## Create new btrfs filesystem on a loopback device"
> > dd if=/dev/zero of=testfs bs=1048576 count=1500
> > losetup $loopback testfs
> > mkfs.btrfs $loopback
> > mkdir testfsmnt
> > mount $loopback testfsmnt
> >
> > echo -e "\n## Create 1000000 byte random file"
> > dd if=/dev/urandom of=testfsmnt/test1 bs=1000000 count=1
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Copy file"
> > cp testfsmnt/test1 testfsmnt/test2
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Dedupe to end of last full block"
> > btrfs-extent-same 999424 testfsmnt/test1 0 testfsmnt/test2 0
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Dedupe to end of file"
> > btrfs-extent-same 1000000 testfsmnt/test1 0 testfsmnt/test2 0
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Clean up"
> > umount testfsmnt
> > rmdir testfsmnt
> > losetup -d $loopback
> > rm testfs
