Hi David,

Have you had a chance to look at this? I am very happy to answer
further questions, adjust my implementation, provide a different kind
of test case, etc.

Many Thanks,

Matt

On 28 January 2015 at 19:46, Matt Robinson <g...@nerdoftheherd.com> wrote:
> On 28 January 2015 at 12:55, David Sterba <dste...@suse.cz> wrote:
>> On Mon, Jan 26, 2015 at 06:05:51PM +0000, Matt Robinson wrote:
>>> It is not currently possible to deduplicate the last block of files
>>> whose size is not a multiple of the block size, as the btrfs_extent_same
>>> ioctl returns -EINVAL if offset + size is greater than the file size or
>>> is not aligned to the fs block size.
>>
>> Do you have a reproducer for that?
>
> I've been using the (quick and dirty) bash script at the end of this
> mail which uses btrfs-extent-same from
> https://github.com/markfasheh/duperemove/ to call the ioctl.  To
> summarize: it creates a new filesystem, creates a file with a size
> which is not a multiple of the block size, copies it, and then calls
> the ioctl twice: first to deduplicate only the complete blocks (for
> comparison), and then to deduplicate the entire files.
>
> Running the script under a kernel compiled from
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git gives
> a status of -22 (-EINVAL) from the second btrfs-extent-same call, and
> the final btrfs filesystem df shows:
> Data, single: total=8.00MiB, used=1.91MiB
>
> However, running under the same kernel plus my patch shows this final
> data usage:
> Data, single: total=8.00MiB, used=980.00KiB
>
>> The alignment is required to let btrfs_clone and the extent dropping
>> functions work. [...]
>
> Which is why it is currently not possible to deduplicate a final
> incomplete block of a file:
> * Passing len + offset = the actual end of the file: Rejected as it is
> not aligned
> * Passing len + offset = the end of the block: Rejected as it exceeds
> the actual end of the file.
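
To spell out the arithmetic behind those two cases, here is a small
sketch. It assumes a 4096-byte fs block size, which is not stated
anywhere in this thread but matches the 999424 (244 * 4096) passed to
the first btrfs-extent-same call in the script below. The variable
names are only for illustration; nothing here calls the ioctl.

```shell
#!/bin/bash
# Assumption: 4096-byte fs block size (not confirmed in this thread).
blocksize=4096
# Size of the test file created by the reproducer script.
filesize=1000000

# End of the last complete block, i.e. the largest aligned length.
full_blocks_end=$(( filesize / blocksize * blocksize ))
# Bytes left over in the final, incomplete block.
tail_bytes=$(( filesize - full_blocks_end ))
# End of file rounded up to the next block boundary.
aligned_end=$(( (filesize + blocksize - 1) / blocksize * blocksize ))

echo "end of last full block:         $full_blocks_end"
echo "bytes in final partial block:   $tail_bytes"
echo "end of file rounded up:         $aligned_end"

# Case 1: offset + len = actual end of file, which is not block aligned.
if (( filesize % blocksize != 0 )); then
    echo "len=$filesize: not a multiple of $blocksize"
fi

# Case 2: offset + len = end of the block, which lies past the end of file.
if (( aligned_end > filesize )); then
    echo "len=$aligned_end: exceeds file size by $(( aligned_end - filesize ))"
fi
```

Under that assumption the final 576 bytes cannot be reached by either
length. As a side note, the rounded-up end of 1003520 bytes is exactly
980 KiB, which appears consistent with the used= figure reported with
my patch applied.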
>
> Please let me know if you need any further information, if my
> implementation should be different, or if there is a better way I
> could demonstrate the issue.
>
> Many Thanks,
>
> Matt
>
> ---
>
> #!/bin/bash -e
>
> if [[ $EUID -ne 0 ]]; then
>    echo "This script must be run as root"
>    exit 1
> fi
>
> # Pick the first unused loop device
> loopback=$(losetup -f)
>
> echo "## Create new btrfs filesystem on a loopback device"
> dd if=/dev/zero of=testfs bs=1048576 count=1500
> losetup "$loopback" testfs
> mkfs.btrfs "$loopback"
> mkdir testfsmnt
> mount "$loopback" testfsmnt
>
> echo -e "\n## Create 1000000 byte random file"
> dd if=/dev/urandom of=testfsmnt/test1 bs=1000000 count=1
> echo
> btrfs filesystem sync testfsmnt
> btrfs filesystem df testfsmnt
>
> echo -e "\n## Copy file"
> cp testfsmnt/test1 testfsmnt/test2
> echo
> btrfs filesystem sync testfsmnt
> btrfs filesystem df testfsmnt
>
> echo -e "\n## Dedupe to end of last full block"
> btrfs-extent-same 999424 testfsmnt/test1 0 testfsmnt/test2 0
> echo
> btrfs filesystem sync testfsmnt
> btrfs filesystem df testfsmnt
>
> echo -e "\n## Dedupe to end of file"
> btrfs-extent-same 1000000 testfsmnt/test1 0 testfsmnt/test2 0
> echo
> btrfs filesystem sync testfsmnt
> btrfs filesystem df testfsmnt
>
> echo -e "\n## Clean up"
> umount testfsmnt
> rmdir testfsmnt
> losetup -d "$loopback"
> rm testfs