I second this. I've seen the same behavior. Clone seems to have evolved a little further than extent-same knows about. For example, there is code in the extent-same ioctl that tries to avoid doing a clone from within one inode to elsewhere in the same inode; however, the clone ioctl (which extent-same calls) has no such restriction.
As Matt mentioned, clone_range seems quite happy to accept a partial block at EOF. cp --reflink would be much harder to use if it did not.

On Mon, Mar 02, 2015 at 08:59:11PM +0000, Matt Robinson wrote:
> Hi David,
>
> Have you had a chance to look at this? Am very happy to answer
> further questions, adjust my implementation, provide a different kind
> of test case, etc.
>
> Many Thanks,
>
> Matt
>
> On 28 January 2015 at 19:46, Matt Robinson <g...@nerdoftheherd.com> wrote:
> > On 28 January 2015 at 12:55, David Sterba <dste...@suse.cz> wrote:
> >> On Mon, Jan 26, 2015 at 06:05:51PM +0000, Matt Robinson wrote:
> >>> It is not currently possible to deduplicate the last block of files
> >>> whose size is not a multiple of the block size, as the btrfs_extent_same
> >>> ioctl returns -EINVAL if offset + size is greater than the file size or
> >>> is not aligned to the fs block size.
> >>
> >> Do you have a reproducer for that?
> >
> > I've been using the (quick and dirty) bash script at the end of this
> > mail which uses btrfs-extent-same from
> > https://github.com/markfasheh/duperemove/ to call the ioctl. To
> > summarize: it creates a new filesystem, creates a file with a size
> > which is not a multiple of the block size, copies it, and then calls
> > the ioctl to ask firstly for all of the complete blocks (for
> > comparison) and then the entire files to be deduplicated.
> >
> > Running the script under a kernel compiled from
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git gives
> > a status of -22 from the second btrfs-extent-same call and the final
> > btrfs filesystem df shows:
> > Data, single: total=8.00MiB, used=1.91MiB
> >
> > However, running under the same kernel plus my patch shows this final
> > data usage:
> > Data, single: total=8.00MiB, used=980.00KiB
> >
> >> The alignment is required to let btrfs_clone and the extent dropping
> >> functions to work. [...]
> >
> > Which is why it is currently not possible to deduplicate a final
> > incomplete block of a file:
> > * Passing len + offset = the actual end of the file: Rejected as it is
> > not aligned
> > * Passing len + offset = the end of the block: Rejected as it exceeds
> > the actual end of the file.
> >
> > Please let me know if you need any further information, if my
> > implementation should be different or there is a better way I could
> > demonstrate the issue?
> >
> > Many Thanks,
> >
> > Matt
> >
> > ---
> >
> > #!/bin/bash -e
> >
> > if [[ $EUID -ne 0 ]]; then
> >     echo "This script must be run as root"
> >     exit 1
> > fi
> >
> > loopback=$(losetup -f)
> >
> > echo "## Create new btrfs filesystem on a loopback device"
> > dd if=/dev/zero of=testfs bs=1048576 count=1500
> > losetup $loopback testfs
> > mkfs.btrfs $loopback
> > mkdir testfsmnt
> > mount $loopback testfsmnt
> >
> > echo -e "\n## Create 1000000 byte random file"
> > dd if=/dev/urandom of=testfsmnt/test1 bs=1000000 count=1
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Copy file"
> > cp testfsmnt/test1 testfsmnt/test2
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Dedupe to end of last full block"
> > btrfs-extent-same 999424 testfsmnt/test1 0 testfsmnt/test2 0
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\n## Dedupe to end of file"
> > btrfs-extent-same 1000000 testfsmnt/test1 0 testfsmnt/test2 0
> > echo
> > btrfs filesystem sync testfsmnt
> > btrfs filesystem df testfsmnt
> >
> > echo -e "\nClean up"
> > umount testfsmnt
> > rmdir testfsmnt
> > losetup -d $loopback
> > rm testfs
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
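
For anyone following along, here is a rough sketch of the arithmetic behind the two rejected cases, using the 1000000-byte file from Matt's script. It assumes a 4096-byte fs block size, which the thread does not state outright but which matches the 999424-byte length used for the "last full block" call. check_range is a hypothetical helper that only mimics the two checks as Matt describes them; it is not the kernel code and it calls no ioctl.

```shell
#!/bin/bash
# Illustrative arithmetic only; nothing here touches a filesystem.
# Assumption: 4096-byte fs block size (not stated in the thread).
blocksize=4096
filesize=1000000

# Round the file size down and up to a block boundary.
aligned_down=$(( filesize / blocksize * blocksize ))
aligned_up=$(( (filesize + blocksize - 1) / blocksize * blocksize ))
tail_bytes=$(( filesize - aligned_down ))

echo "end of last full block:   $aligned_down"
echo "bytes in partial block:   $tail_bytes"
echo "end of containing block:  $aligned_up"

# Hypothetical helper: applies the two checks described in the thread
# to a range that starts at offset 0 and ends at $1.
check_range() {
    local end=$1
    if (( end > filesize )); then
        echo "EINVAL: exceeds end of file"
    elif (( end % blocksize != 0 )); then
        echo "EINVAL: not block aligned"
    else
        echo "ok"
    fi
}

check_range "$aligned_down"   # full blocks only
check_range "$filesize"       # actual end of file
check_range "$aligned_up"     # end of the containing block
```

Under that assumption the last 576 bytes of the file can never pass both checks, whichever end offset is supplied.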