Le 11/06/2013 22:31, Mark Fasheh a écrit : > Perhaps this isn't a limiation per-se but extent-same requires read/write > access to the files we want to dedupe. During my last series I had a > conversation with Gabriel de Perthuis about access checking where we tried > to maintain the ability for a user to run extent-same against a readonly > snapshot. In addition, I reasoned that since the underlying data won't > change (at least to the user) that we ought only require the files to be > open for read. > > What I found however is that neither of these is a great idea ;) > > - We want to require that the inode be open for writing so that an > unprivileged user can't do things like run dedupe on a performance > sensitive file that they might only have read access to. In addition I > could see it as kind of a surprise (non-standard behavior) to an > administrator that users could alter the layout of files they are only > allowed to read. > > - Readonly snapshots won't let you open for write anyway (unsuprisingly, > open() returns -EROFS). So that kind of kills the idea of them being able > to open those files for write which we want to dedupe. > > That said, I still think being able to run this against a set of readonly > snapshots makes sense especially if those snapshots are taken for backup > purposes. I'm just not sure how we can sanely enable it.
The check could be: if (fmode_write || cap_sys_admin). This isn't incompatible with mnt_want_write, that check is at the level of the superblocks and vfsmount and not the subvolume fsid. > Code review is very much appreciated. Thanks, > --Mark > > > ChangeLog > > - check that we have appropriate access to each file before deduping. For > the source, we only check that it is opened for read. Target files have to > be open for write. > > - don't dedupe on readonly submounts (this is to maintain > > - check that we don't dedupe files with different checksumming states > (compare BTRFS_INODE_NODATASUM flags) > > - get and maintain write access to the mount during the extent same > operation (mount_want_write()) > > - allocate our read buffers up front in btrfs_ioctl_file_extent_same() and > pass them through for re-use on every call to btrfs_extent_same(). (thanks > to David Sterba <dste...@suse.cz> for reporting this > > - As the read buffers could possibly be up to 1MB (depending on user > request), we now conditionally vmalloc them. > > - removed redundant check for same inode. btrfs_extent_same() catches it now > and bubbles the error up. > > - remove some unnecessary printks > > Changes from RFC to v1: > > - don't error on large length value in btrfs exent-same, instead we just > dedupe the maximum allowed. That way userspace doesn't have to worry > about an arbitrary length limit. > > - btrfs_extent_same will now loop over the dedupe range at 1MB increments (for > a total of 16MB per request) > > - cleaned up poorly coded while loop in __extent_read_full_page() (thanks to > David Sterba <dste...@suse.cz> for reporting this) > > - included two fixes from Gabriel de Perthuis <g2p.c...@gmail.com>: > - allow dedupe across subvolumes > - don't lock compressed pages twice when deduplicating > > - removed some unused / poorly designed fields in btrfs_ioctl_same_args. > This should also give us a bit more reserved bytes. > > - return -E2BIG instead of -ENOMEM when arg list is too large (thanks to > David Sterba <dste...@suse.cz> for reporting this) > > - Some more reserved bytes are now included as a result of some of my > cleanups. Quite possibly we could add a couple more. > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html