Re: [PATCH v5 0/9] fs: multigrain timestamp redux
On Thu, Jul 11, 2024 at 07:08:04AM -0400, Jeff Layton wrote:
> tl;dr for those who have been following along:
>
> There are several changes in this version. The conversion of ctime to
> be a ktime_t value has been dropped, and we now use an unused bit in
> the nsec field as the QUERIED flag (like the earlier patchset did).
>
> The floor value is now tracked as a monotonic clock value, and is
> converted to a realtime value on an as-needed basis. This eliminates the
> problem of trying to detect when the realtime clock jumps backward.
>
> Longer patch description for those just joining in:
>
> At LSF/MM this year, we had a discussion about the inode change
> attribute. At the time I mentioned that I thought I could salvage the
> multigrain timestamp work that had to be reverted last year [1].
>
> That version had to be reverted because it was possible for a file to
> get a coarse grained timestamp that appeared to be earlier than another
> file that had recently gotten a fine-grained stamp.
>
> This version corrects the problem by establishing a per-time_namespace
> ctime_floor value that should prevent this from occurring. In the above
> situation, the two files might end up with the same timestamp value, but
> they won't appear to have been modified in the wrong order.
>
> That problem was discovered by the test-stat-time gnulib test. Note that
> that test still fails on multigrain timestamps, but that's because its
> method of determining the minimum delay that will show a timestamp
> change will no longer work with multigrain timestamps. I have a patch to
> change the testcase to use a different method that is in the process of
> being merged.
>
> The testing I've done seems to show performance parity with multigrain
> timestamps enabled vs. disabled, but it's hard to rule this out
> regressing some workload.
>
> This set is based on top of Christian's vfs.misc branch (which has the
> earlier change to track inode timestamps as discrete integers). If there
> are no major objections, I'd like to have this considered for v6.12,
> after a nice long full-cycle soak in linux-next.
>
> PS: I took a stab at a conversion for bcachefs too, but it's not
> trivial. bcachefs handles timestamps backward from the way most
> block-based filesystems do. Instead of updating them in struct inode and
> eventually copying them to a disk-based representation, it does the
> reverse and updates the timestamps in its in-core image of the on-disk
> inode, and then copies that into struct inode. Either that will need to
> be changed, or we'll need to come up with a different way to do this for
> bcachefs.
>
> [1]:
> https://lore.kernel.org/linux-fsdevel/20230807-mgctime-v7-0-d1dec143a...@kernel.org/
>
> Signed-off-by: Jeff Layton

Reviewed-by: Josef Bacik

Thanks,

Josef
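Editorial sketch: the cover letter above describes two mechanisms, a QUERIED flag kept in an unused bit of the nsec field and a floor value that keeps coarse stamps from appearing older than an earlier fine-grained stamp. The following is a simplified userspace model of that idea, not the kernel implementation. The names mg_inode, mg_clock, mg_getattr, mg_update_ctime and the choice of bit 31 for MG_QUERIED are assumptions made for illustration, and the clock readings are supplied by the caller rather than read from a real clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u
/* Hypothetical flag bit: nanosecond values stay below 2^30, so bit 31 is free. */
#define MG_QUERIED   (1u << 31)

struct mg_inode {
	int64_t  ctime_sec;
	uint32_t ctime_nsec;	/* nanoseconds, plus MG_QUERIED once observed */
};

/* Hypothetical clock model: the caller fills in both readings. */
struct mg_clock {
	uint64_t coarse_ns;	/* cheap, tick-granularity time */
	uint64_t fine_ns;	/* expensive, high-resolution time */
	uint64_t floor_ns;	/* latest fine-grained value handed out so far */
};

static uint64_t mg_ctime_ns(const struct mg_inode *inode)
{
	return (uint64_t)inode->ctime_sec * NSEC_PER_SEC +
	       (inode->ctime_nsec & ~MG_QUERIED);
}

/* getattr path: report the ctime and remember that someone looked at it. */
static uint64_t mg_getattr(struct mg_inode *inode)
{
	inode->ctime_nsec |= MG_QUERIED;
	return mg_ctime_ns(inode);
}

/* Update path: use the coarse time unless that would hide a change. */
static uint64_t mg_update_ctime(struct mg_inode *inode, struct mg_clock *clk)
{
	bool queried = (inode->ctime_nsec & MG_QUERIED) != 0;
	uint64_t old = mg_ctime_ns(inode);
	uint64_t now;

	assert(clk->fine_ns >= clk->coarse_ns);

	/* Never hand out a coarse value that is older than the floor. */
	now = clk->coarse_ns > clk->floor_ns ? clk->coarse_ns : clk->floor_ns;

	if (queried && now <= old) {
		/* The stamp was observed and would not visibly change:
		 * take a fine-grained value and raise the floor to it. */
		now = clk->fine_ns;
		clk->floor_ns = now;
	}

	inode->ctime_sec = (int64_t)(now / NSEC_PER_SEC);
	inode->ctime_nsec = (uint32_t)(now % NSEC_PER_SEC); /* clears MG_QUERIED */
	return now;
}
```

In this model, a file whose ctime was queried and then changed within the same tick receives the fine-grained value, and a second file stamped afterwards receives at least the floor. The two files may end up with equal stamps, but the second never appears older than the first, which is the ordering property the cover letter describes.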
Re: [PATCH v2 09/11] btrfs: convert to multigrain timestamps
On Mon, Jul 01, 2024 at 09:57:43AM -0400, Jeff Layton wrote:
> On Mon, 2024-07-01 at 09:49 -0400, Josef Bacik wrote:
> > On Mon, Jul 01, 2024 at 06:26:45AM -0400, Jeff Layton wrote:
> > > Enable multigrain timestamps, which should ensure that there is an
> > > apparent change to the timestamp whenever it has been written after
> > > being actively observed via getattr.
> > >
> > > Beyond enabling the FS_MGTIME flag, this patch eliminates
> > > update_time_for_write, which goes to great pains to avoid in-memory
> > > stores. Just have it overwrite the timestamps unconditionally.
> > >
> > > Signed-off-by: Jeff Layton
> > > ---
> > >  fs/btrfs/file.c  | 25 -
> > >  fs/btrfs/super.c |  3 ++-
> > >  2 files changed, 6 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > > index d90138683a0a..409628c0c3cc 100644
> > > --- a/fs/btrfs/file.c
> > > +++ b/fs/btrfs/file.c
> > > @@ -1120,26 +1120,6 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
> > >  	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
> > >  }
> > >
> > > -static void update_time_for_write(struct inode *inode)
> > > -{
> > > -	struct timespec64 now, ts;
> > > -
> > > -	if (IS_NOCMTIME(inode))
> > > -		return;
> > > -
> > > -	now = current_time(inode);
> > > -	ts = inode_get_mtime(inode);
> > > -	if (!timespec64_equal(&ts, &now))
> > > -		inode_set_mtime_to_ts(inode, now);
> > > -
> > > -	ts = inode_get_ctime(inode);
> > > -	if (!timespec64_equal(&ts, &now))
> > > -		inode_set_ctime_to_ts(inode, now);
> > > -
> > > -	if (IS_I_VERSION(inode))
> > > -		inode_inc_iversion(inode);
> > > -}
> > > -
> > >  static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
> > >  			     size_t count)
> > >  {
> > > @@ -1171,7 +1151,10 @@ static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
> > >  	 * need to start yet another transaction to update the inode as we will
> > >  	 * update the inode when we finish writing whatever data we write.
> > >  	 */
> > > -	update_time_for_write(inode);
> > > +	if (!IS_NOCMTIME(inode)) {
> > > +		inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode));
> > > +		inode_inc_iversion(inode);
> >
> > You've dropped the
> >
> > if (IS_I_VERSION(inode))
> >
> > check here, and it doesn't appear to be in inode_inc_iversion. Is there a
> > reason for this? Thanks,
> >
>
> AFAICT, btrfs always sets SB_I_VERSION. Are there any cases where it
> isn't? If so, then I can put this check back. I'll make a note about it
> in the changelog if not.

Ah ok I'm dumb, ignore me, thanks,

Josef
Re: [PATCH v2 00/11] fs: multigrain timestamp redux
On Mon, Jul 01, 2024 at 06:26:36AM -0400, Jeff Layton wrote:
> This set is essentially unchanged from the last one, aside from the
> new file in Documentation/. I had a review comment from Andi Kleen
> suggesting that the ctime_floor should be per time_namespace, but I
> think that's incorrect as the realtime clock is not namespaced.
>
> At LSF/MM this year, we had a discussion about the inode change
> attribute. At the time I mentioned that I thought I could salvage the
> multigrain timestamp work that had to be reverted last year [1]. That
> version had to be reverted because it was possible for a file to get a
> coarse grained timestamp that appeared to be earlier than another file
> that had recently gotten a fine-grained stamp.
>
> This version corrects the problem by establishing a per-time_namespace
> ctime_floor value that should prevent this from occurring. In the above
> situation that was problematic before, the two files might end up with
> the same timestamp value, but they won't appear to have been modified in
> the wrong order.
>
> That problem was discovered by the test-stat-time gnulib test. Note that
> that test still fails on multigrain timestamps, but that's because its
> method of determining the minimum delay that will show a timestamp
> change will no longer work with multigrain timestamps. I have a patch to
> change the testcase to use a different method that I've posted to the
> bug-gnulib mailing list.
>
> The big question with this set is whether the performance will be
> suitable. The testing I've done seems to show performance parity with
> multigrain timestamps enabled, but it's hard to rule this out regressing
> some workload.
>
> This set is based on top of Christian's vfs.misc branch (which has the
> earlier change to track inode timestamps as discrete integers). If there
> are no major objections, I'd like to let this soak in linux-next for a
> bit to see if any problems shake out.
>
> [1]:
> https://lore.kernel.org/linux-fsdevel/20230807-mgctime-v7-0-d1dec143a...@kernel.org/
>
> Signed-off-by: Jeff Layton

I have a few nits that need to be addressed, but you can add

Reviewed-by: Josef Bacik

to the series once they're addressed. Thanks,

Josef
Re: [PATCH v2 11/11] Documentation: add a new file documenting multigrain timestamps
On Mon, Jul 01, 2024 at 06:26:47AM -0400, Jeff Layton wrote:
> Add a high-level document that describes how multigrain timestamps work,
> rationale for them, and some info about implementation and tradeoffs.
>
> Signed-off-by: Jeff Layton
> ---
>  Documentation/filesystems/multigrain-ts.rst | 126
>  1 file changed, 126 insertions(+)
>
> diff --git a/Documentation/filesystems/multigrain-ts.rst b/Documentation/filesystems/multigrain-ts.rst
> new file mode 100644
> index ..beef7f79108c
> --- /dev/null
> +++ b/Documentation/filesystems/multigrain-ts.rst
> @@ -0,0 +1,126 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=====================
> +Multigrain Timestamps
> +=====================
> +
> +Introduction
> +============
> +Historically, the kernel has always used coarse time values to stamp
> +inodes. This value is updated on every jiffy, so any change that happens
> +within that jiffy will end up with the same timestamp.
> +
> +When the kernel goes to stamp an inode (due to a read or write), it first gets
> +the current time and then compares it to the existing timestamp(s) to see
> +whether anything will change. If nothing changed, then it can avoid updating
> +the inode's metadata.
> +
> +Coarse timestamps are therefore good from a performance standpoint, since they
> +reduce the need for metadata updates, but bad from the standpoint of
> +determining whether anything has changed, since a lot of things can happen in a
> +jiffy.
> +
> +They are particularly troublesome with NFSv3, where unchanging timestamps can
> +make it difficult to tell whether to invalidate caches. NFSv4 provides a
> +dedicated change attribute that should always show a visible change, but not
> +all filesystems implement this properly, and many just populate this with
> +the ctime.
> +
> +Multigrain timestamps aim to remedy this by selectively using fine-grained
> +timestamps when a file has had its timestamps queried recently, and the current
> +coarse-grained time does not cause a change.
> +
> +Inode Timestamps
> +================
> +There are currently 3 timestamps in the inode that are updated to the current
> +wallclock time on different activity:
> +
> +ctime:
> +  The inode change time. This is stamped with the current time whenever
> +  the inode's metadata is changed. Note that this value is not settable
> +  from userland.
> +
> +mtime:
> +  The inode modification time. This is stamped with the current time
> +  any time a file's contents change.
> +
> +atime:
> +  The inode access time. This is stamped whenever an inode's contents are
> +  read. Widely considered to be a terrible mistake. Usually avoided with
> +  options like noatime or relatime.
> +
> +Updating the mtime always implies a change to the ctime, but updating the
> +atime due to a read request does not.
> +
> +Multigrain timestamps are only tracked for the ctime and the mtime. atimes are
> +not affected and always use the coarse-grained value (subject to the floor).
> +
> +Inode Timestamp Ordering
> +========================
> +
> +In addition to just providing info about changes to individual files, file
> +timestamps also serve an important purpose in applications like "make". These
> +programs measure timestamps in order to determine whether source files might be
> +newer than cached objects.
> +
> +Userland applications like make can only determine ordering based on
> +operational boundaries. For a syscall those are the syscall entry and exit
> +points. For io_uring or nfsd operations, that's the request submission and
> +response. In the case of concurrent operations, userland can make no
> +determination about the order in which things will occur.
> +
> +For instance, if a single thread modifies one file, and then another file in
> +sequence, the second file must show an equal or later mtime than the first. The
> +same is true if two threads are issuing similar operations that do not overlap
> +in time.
> +
> +If however, two threads have racing syscalls that overlap in time, then there
> +is no such guarantee, and the second file may appear to have been modified
> +before, after or at the same time as the first, regardless of which one was
> +submitted first.
> +
> +Multigrain Timestamps
> +=====================
> +Multigrain timestamps are aimed at ensuring that changes to a single file are
> +always recognizable, without violating the ordering guarantees when multiple
> +different files are modified. This affects the mtime and the ctime, but the
> +atime will always use coarse-grained timestamps.
> +
> +It uses the lowest-order bit in the timestamp as a flag that indicates whether
> +the mtime or ctime have been queried. If either or both have, then the kernel
> +takes special care to ensure the next timestamp update will display a visible
> +change. This ensures tight cache coherency for use-cases like NFS, without
> +sacrificing the benefits of reduced metadata update
Re: [PATCH v2 09/11] btrfs: convert to multigrain timestamps
On Mon, Jul 01, 2024 at 06:26:45AM -0400, Jeff Layton wrote:
> Enable multigrain timestamps, which should ensure that there is an
> apparent change to the timestamp whenever it has been written after
> being actively observed via getattr.
>
> Beyond enabling the FS_MGTIME flag, this patch eliminates
> update_time_for_write, which goes to great pains to avoid in-memory
> stores. Just have it overwrite the timestamps unconditionally.
>
> Signed-off-by: Jeff Layton
> ---
>  fs/btrfs/file.c  | 25 -
>  fs/btrfs/super.c |  3 ++-
>  2 files changed, 6 insertions(+), 22 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index d90138683a0a..409628c0c3cc 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1120,26 +1120,6 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
>  	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
>  }
>
> -static void update_time_for_write(struct inode *inode)
> -{
> -	struct timespec64 now, ts;
> -
> -	if (IS_NOCMTIME(inode))
> -		return;
> -
> -	now = current_time(inode);
> -	ts = inode_get_mtime(inode);
> -	if (!timespec64_equal(&ts, &now))
> -		inode_set_mtime_to_ts(inode, now);
> -
> -	ts = inode_get_ctime(inode);
> -	if (!timespec64_equal(&ts, &now))
> -		inode_set_ctime_to_ts(inode, now);
> -
> -	if (IS_I_VERSION(inode))
> -		inode_inc_iversion(inode);
> -}
> -
>  static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
>  			     size_t count)
>  {
> @@ -1171,7 +1151,10 @@ static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
>  	 * need to start yet another transaction to update the inode as we will
>  	 * update the inode when we finish writing whatever data we write.
>  	 */
> -	update_time_for_write(inode);
> +	if (!IS_NOCMTIME(inode)) {
> +		inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode));
> +		inode_inc_iversion(inode);

You've dropped the

if (IS_I_VERSION(inode))

check here, and it doesn't appear to be in inode_inc_iversion. Is there a
reason for this? Thanks,

Josef
Re: [PATCH v2 07/11] xfs: switch to multigrain timestamps
On Mon, Jul 01, 2024 at 06:26:43AM -0400, Jeff Layton wrote:
> Enable multigrain timestamps, which should ensure that there is an
> apparent change to the timestamp whenever it has been written after
> being actively observed via getattr.
>
> Also, anytime the mtime changes, the ctime must also change, and those
> are now the only two options for xfs_trans_ichgtime. Have that function
> unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is
> always set.
>
> Signed-off-by: Jeff Layton
> ---
>  fs/xfs/libxfs/xfs_trans_inode.c | 6 +++---
>  fs/xfs/xfs_iops.c               | 6 --
>  fs/xfs/xfs_super.c              | 2 +-
>  3 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c
> index 69fc5b981352..1f3639bbf5f0 100644
> --- a/fs/xfs/libxfs/xfs_trans_inode.c
> +++ b/fs/xfs/libxfs/xfs_trans_inode.c
> @@ -62,12 +62,12 @@ xfs_trans_ichgtime(
>  	ASSERT(tp);
>  	xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
>
> -	tv = current_time(inode);
> +	/* If the mtime changes, then ctime must also change */
> +	ASSERT(flags & XFS_ICHGTIME_CHG);
>
> +	tv = inode_set_ctime_current(inode);
>  	if (flags & XFS_ICHGTIME_MOD)
>  		inode_set_mtime_to_ts(inode, tv);
> -	if (flags & XFS_ICHGTIME_CHG)
> -		inode_set_ctime_to_ts(inode, tv);
>  	if (flags & XFS_ICHGTIME_CREATE)
>  		ip->i_crtime = tv;
>  }
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index ff222827e550..ed6e6d9507df 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -590,10 +590,12 @@ xfs_vn_getattr(
>  	stat->gid = vfsgid_into_kgid(vfsgid);
>  	stat->ino = ip->i_ino;
>  	stat->atime = inode_get_atime(inode);
> -	stat->mtime = inode_get_mtime(inode);
> -	stat->ctime = inode_get_ctime(inode);
> +
> +	fill_mg_cmtime(stat, request_mask, inode);
> +
>  	stat->blocks = XFS_FSB_TO_BB(mp, ip->i_nblocks + ip->i_delayed_blks);
>
> +

Stray newline. Thanks,

Josef
Re: [PATCH v4 00/46] btrfs: add fscrypt support
On Tue, Apr 09, 2024 at 07:42:22PM -0400, Eric Biggers wrote:
> Hi Josef and Sweet Tea,
>
> On Fri, Dec 01, 2023 at 05:10:57PM -0500, Josef Bacik wrote:
> > Hello,
> >
> > v3 can be found here
> >
> > https://lore.kernel.org/linux-btrfs/cover.1697480198.git.jo...@toxicpanda.com/
> >
> > There's been a longer delay between versions than I'd like, this was mostly due
> > to Plumbers, Holidays, and then uncovering a bunch of new issues with '-o
> > test_dummy_encryption'. I'm still working through some of the btrfs specific
> > failures, but the fscrypt side appears to be stable. I had to add a few changes
> > to fscrypt since the last time, but nothing earth shattering, just moving the
> > keyring destruction and adding a helper we need for btrfs send to work
> > properly.
> >
> > This is passing a good chunk of the fstests, at this point the majority appear
> > to be cases where I need to exclude the test when using test_dummy_encryption
> > because of various limitations of our tools or other infrastructure related
> > things.
> >
> > I likely will have a follow-up series with more fixes, but the bulk of this is
> > unchanged since the last posting. There were some bug fixes and such but the
> > overall design remains the same. Thanks,
> >
>
> Is there a plan for someone to keep working on this? I think it was finally
> getting somewhere, but the work on it seems to have stopped.
>

I fixed up all your review comments, but yes we don't care about this
internally anymore so it's been de-prioritized. I have to rebase onto the new
stuff, re-run tests, and fix any bugs that may have crept in, but the current
code addressed all of your comments. Once I get time to get back to this
you'll have a new version in your inbox, but that may be some time. Thanks,

Josef
Re: [f2fs-dev] [PATCH 1/3] btrfs: call btrfs_close_devices from ->kill_sb
On Sat, Dec 16, 2023 at 05:12:21AM +0100, Christoph Hellwig wrote:
> On Fri, Dec 15, 2023 at 04:45:50PM -0500, Josef Bacik wrote:
> > I ran it through, you broke a test that isn't upstream yet to test the old mount
> > api double mount thing that I have a test for
> >
> > https://github.com/btrfs/fstests/commit/2796723e77adb0f9da1059acf13fc402467f7ac4
> >
> > In this case we end up leaking a reference on the fs_devices. If you add this
> > fixup to "btrfs: call btrfs_close_devices from ->kill_sb" it fixes that failure.
> > I'm re-running with that fixup applied, but I assume the rest is fine.
> > Thanks,
>
> Is "this fixup" referring to a patch that was supposed to be attached
> but isn't? :)

Sorry, vacation brain, here you go.

Josef

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f93fe2e5e378..2dfa2274b193 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1950,10 +1950,20 @@ static int btrfs_get_tree_super(struct fs_context *fc)
  */
 static struct vfsmount *btrfs_reconfigure_for_mount(struct fs_context *fc)
 {
+	struct btrfs_fs_info *fs_info = fc->s_fs_info;
 	struct vfsmount *mnt;
 	int ret;
 	const bool ro2rw = !(fc->sb_flags & SB_RDONLY);
 
+	/*
+	 * We got a reference to our fs_devices, so we need to close it here to
+	 * make sure we don't leak our reference on the fs_devices.
+	 */
+	if (fs_info->fs_devices) {
+		btrfs_close_devices(fs_info->fs_devices);
+		fs_info->fs_devices = NULL;
+	}
+
 	/*
 	 * We got an EBUSY because our SB_RDONLY flag didn't match the existing
 	 * super block, so invert our setting here and retry the mount so we
Re: [f2fs-dev] [PATCH 1/3] btrfs: call btrfs_close_devices from ->kill_sb
On Wed, Dec 13, 2023 at 09:41:23AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 12, 2023 at 08:00:16PM -0800, Eric Biggers wrote:
> > From: Christoph Hellwig
> >
> > blkdev_put must not be called under sb->s_umount to avoid a lock order
> > reversal with disk->open_mutex once call backs from block devices to
> > the file system using the holder ops are supported. Move the call
> > to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> > from ->kill_sb (which is also called from the mount failure handling
> > path unlike ->put_super) as well as when an fs_info is freed because
> > an existing superblock already exists.
>
> Thanks, this looks roughly the same to what I have locally.
>
> I did in fact forward port everything missing from the get_super
> series yesterday, but on my test setup btrfs/142 hangs even in the
> baseline setup. I went back to Linux before giving up for now.
>
> Josef, any chance you could throw this branch:
>
> git://git.infradead.org/users/hch/misc.git btrfs-holder
>
> into your CI setup and see if it sticks? Except for the trivial last
> three patches this is basically what you reviewed already, although
> there was some heavy rebasing due to the mount API conversion.

I ran it through, you broke a test that isn't upstream yet to test the old mount
api double mount thing that I have a test for

https://github.com/btrfs/fstests/commit/2796723e77adb0f9da1059acf13fc402467f7ac4

In this case we end up leaking a reference on the fs_devices. If you add this
fixup to "btrfs: call btrfs_close_devices from ->kill_sb" it fixes that failure.
I'm re-running with that fixup applied, but I assume the rest is fine. Thanks,

Josef
Re: [PATCH 1/3] btrfs: call btrfs_close_devices from ->kill_sb
On Wed, Dec 13, 2023 at 09:41:23AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 12, 2023 at 08:00:16PM -0800, Eric Biggers wrote:
> > From: Christoph Hellwig
> >
> > blkdev_put must not be called under sb->s_umount to avoid a lock order
> > reversal with disk->open_mutex once call backs from block devices to
> > the file system using the holder ops are supported. Move the call
> > to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> > from ->kill_sb (which is also called from the mount failure handling
> > path unlike ->put_super) as well as when an fs_info is freed because
> > an existing superblock already exists.
>
> Thanks, this looks roughly the same to what I have locally.
>
> I did in fact forward port everything missing from the get_super
> series yesterday, but on my test setup btrfs/142 hangs even in the
> baseline setup. I went back to Linux before giving up for now.
>
> Josef, any chance you could throw this branch:
>
> git://git.infradead.org/users/hch/misc.git btrfs-holder
>
> into your CI setup and see if it sticks? Except for the trivial last
> three patches this is basically what you reviewed already, although
> there was some heavy rebasing due to the mount API conversion.
>

Yup, sorry Christoph I missed this email when you sent it, I'll throw it in
there now. Thanks,

Josef
Re: [PATCH] fscrypt: move the call to fscrypt_destroy_keyring() into ->put_super()
On Tue, Dec 05, 2023 at 04:13:24PM -0800, Eric Biggers wrote:
> From: Eric Biggers
>
> btrfs, which is planning to add support for fscrypt, has a variety of
> asynchronous things it does with inodes that can potentially last until
> ->put_super, when it shuts everything down and cleans up all async work.
> Consequently, btrfs needs the call to fscrypt_destroy_keyring() to
> happen either after or within ->put_super.
>
> Meanwhile, f2fs needs the call to fscrypt_destroy_keyring() to happen
> either *before* or within ->put_super, due to the dependency of
> f2fs_get_devices() on ->s_fs_info still existing.
>
> To meet both of these constraints, this patch moves the keyring
> destruction into ->put_super. This gives filesystems some flexibility
> into when it is done. This does mean that the VFS no longer handles it
> automatically for filesystems, which is unfortunate, though this is in
> line with most of the other fscrypt functions.
>
> (The fscrypt keyring destruction has now been changed an embarrassingly
> large number of times. Hopefully this will be The Last Change That
> Finally Gets It Right!)
>
> Signed-off-by: Eric Biggers

Reviewed-by: Josef Bacik

Thanks,

Josef
Re: [PATCH v4 00/46] btrfs: add fscrypt support
On Fri, Dec 01, 2023 at 05:10:57PM -0500, Josef Bacik wrote:
> Hello,
>
> v3 can be found here
>
> https://lore.kernel.org/linux-btrfs/cover.1697480198.git.jo...@toxicpanda.com/

Sorry Eric, it's been a long week and I forgot how to use email, didn't cc you
or linux-fscrypt on this series. It's on fsdevel and the btrfs list. Thanks,

Josef
Re: [PATCH 07/12] btrfs: test snapshotting encrypted subvol
On Mon, Nov 27, 2023 at 10:16:28PM +0800, Anand Jain wrote:
>
>
> On 31/10/2023 23:39, Filipe Manana wrote:
> > On Tue, Oct 10, 2023 at 9:26 PM Josef Bacik wrote:
> > >
> > > From: Sweet Tea Dorminy
> > >
> > > Make sure that snapshots of encrypted data are readable and writeable.
> > >
> > > Test deliberately high-numbered to not conflict.
> > >
> > > Signed-off-by: Sweet Tea Dorminy
> > > ---
> > >  tests/btrfs/614     |  76 ++
> > >  tests/btrfs/614.out | 111
> > >  2 files changed, 187 insertions(+)
> > >  create mode 100755 tests/btrfs/614
> > >  create mode 100644 tests/btrfs/614.out
> > >
> > > diff --git a/tests/btrfs/614 b/tests/btrfs/614
> > > new file mode 100755
> > > index ..87dd27f9
> > > --- /dev/null
> > > +++ b/tests/btrfs/614
> > > @@ -0,0 +1,76 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2023 Meta Platforms, Inc. All Rights Reserved.
> > > +#
> > > +# FS QA Test 614
> > > +#
> > > +# Try taking a snapshot of an encrypted subvolume. Make sure the snapshot is
> > > +# still readable. Rewrite part of the subvol with the same data; make sure it's
> > > +# still readable.
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto encrypt
> >
> > Should be in the 'snapshot' and 'subvol' groups too, as it creates a
> > snapshot and a subvolume.
> > Also maybe in the 'quick' group too, see the comments further below.
> >
> > > +
> > > +# Import common functions.
> > > +. ./common/encrypt
> > > +. ./common/filter
> > > +
> > > +# real QA test starts here
> > > +_supported_fs btrfs
> > > +
> > > +_require_test
> >
> > The test device is not used, so this can go away.
> >
> > > +_require_scratch
> > > +_require_scratch_encryption -v 2
> > > +_require_command "$KEYCTL_PROG" keyctl
> > > +
> > > +_scratch_mkfs_encrypted &>> $seqres.full
> > > +_scratch_mount
> > > +
> > > +udir=$SCRATCH_MNT/reference
> > > +dir=$SCRATCH_MNT/subvol
> > > +dir2=$SCRATCH_MNT/subvol2
> > > +$BTRFS_UTIL_PROG subvolume create $dir >> $seqres.full
> > > +mkdir $udir
> > > +
> > > +_set_encpolicy $dir $TEST_KEY_IDENTIFIER
> > > +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY"
> > > +
> > > +# get files with lots of extents by using backwards writes.
> > > +for j in `seq 0 50`; do
> > > +	for i in `seq 20 -1 1`; do
> > > +		$XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \
> > > +			$dir/foo-$j >> $seqres.full | _filter_xfs_io
> > > +		$XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \
> > > +			$udir/foo-$j >> $seqres.full | _filter_xfs_io
> > > +	done
> > > +done
> > > +
> > > +$BTRFS_UTIL_PROG subvolume snapshot $dir $dir2 | _filter_scratch
> > > +
> > > +_scratch_remount
> > > +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY"
> > > +sleep 30
> >
> > What's the sleep for?
> > Is the 30 seconds to wait for a transaction commit?
> > If it is then I'd rather mount the fs with -o commit=3 (or some other
> > low value) and then "sleep 3" to make the test run much faster.
> > A comment explaining why the sleep is there, what is its purpose,
> > should also be in place.
> >
> > > +echo "Diffing $dir and $dir2"
> > > +diff $dir $dir2
> > > +
> > > +echo "Rewriting $dir2 partly"
> > > +# rewrite half of each file in the snapshot
> > > +for j in `seq 0 50`; do
> > > +	for i in `seq 10 -1 1`; do
> > > +		$XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \
> > > +			$dir2/foo-$j >> $seqres.full | _filter_xfs_io
> > > +	done
> > > +done
> > > +
> > > +echo "Diffing $dir and $dir2"
> > > +diff $dir $dir2
> > > +
> > > +echo "Dropping key and diffing"
> > > +_rm_enckey $SCRATCH_MNT $TEST_KEY_IDENTIFIER
> > > +diff $dir $dir2 |& _filter_scratch | _filter_nokey_filenames
> > > +
> > > +$BTRFS_UTIL_PROG subvolume delete $dir > /dev/null 2>&1
> >
> > What's the purpose of this subvolume delete?
> > It's ignoring stdout and stderr, so it doesn't care whether it
> > succeeds or fails, and we
> > don't do any tests/checks after it.
> >
> > Thanks.
>
>
> Josef, I'm planning to get this patchset ready for the PR. Are you planning
> to address the review comments as mentioned above? These
> aren't bugs, but they definitely add more clarity and adds to the
> missing groups.
>

Can you hold off Anand? I haven't responded because I've been working on this
series and making appropriate changes to my local branch, I'll send a refreshed
version of the patches when I send the next set of the fscrypt enablement
patches. I've got all the comments addressed locally, it'll save you some
work. Thanks,

Josef
Re: [PATCH v2 00/36] btrfs: add fscrypt support
On Tue, Nov 21, 2023 at 03:02:32PM -0800, Eric Biggers wrote:
> On Tue, Oct 10, 2023 at 04:40:15PM -0400, Josef Bacik wrote:
> > Hello,
> >
> > This is the next version of the fscrypt support. It is based on a combination
> > of Sterba's for-next branch and the fscrypt for-next branch. The fscrypt stuff
> > should apply cleanly to the fscrypt for-next, but it won't apply cleanly to our
> > btrfs for-next branch. I did this in case Eric wants to go ahead and merge the
> > fscrypt side, then we can figure out what to do on the btrfs side.
> >
> > v1 was posted here
> >
> > https://lore.kernel.org/linux-btrfs/cover.1695750478.git.jo...@toxicpanda.com/
>
> Hi Josef! Are you planning to send out an updated version of this soon?
>

Hey Eric,

Yup I meant to have another one out the door a couple of weeks ago but I was
going through your fstests comments and learned about -o test_dummy_encryption
so I implemented that and a few problems fell out, and then I was at Plumbers
and Maintainers Summit.

I'm working through my mount api changes now and the encryption thing is next,
I hope to get it out today as I fixed most of the problems, I just have to fix
one of our ioctls that exposes file names, which wasn't decrypting the names,
and then hopefully it'll be good.

FWIW all the things I had to fix didn't require changes to the fscrypt side, so
it's mostly untouched since last time. Thanks,

Josef
Re: [PATCH 09/12] fstests: split generic/580 into two tests
On Thu, Nov 02, 2023 at 07:42:50PM +0800, Anand Jain wrote:
> On 10/11/23 04:26, Josef Bacik wrote:
> > generic/580 tests both v1 and v2 encryption policies, however btrfs only
> > supports v2 policies. Split this into two tests so that we can get the
> > v2 coverage for btrfs.
>
> Instead of duplicating the test cases for v1 and v2 encryption policies,
> can we check the supported version and run them accordingly within a
> single test case?
>
> The same applies to patches 10 and 11/12 as well.

This will be awkward for file systems that support both, hence the split. I
don't love suddenly generating a bunch of new tests, but this seems like the
better option since btrfs is the only file system that only supports v2, and
everybody else supports everything. Thanks,

Josef
Re: [PATCH] fscrypt: track master key presence separately from secret
On Sat, Oct 14, 2023 at 11:10:55PM -0700, Eric Biggers wrote:
> From: Eric Biggers
>
> Master keys can be in one of three states: present, incompletely
> removed, and absent (as per FSCRYPT_KEY_STATUS_* used in the UAPI).
> Currently, the way that "present" is distinguished from "incompletely
> removed" internally is by whether ->mk_secret exists or not.
>
> With extent-based encryption, it will be necessary to allow per-extent
> keys to be derived while the master key is incompletely removed, so that
> I/O on open files will reliably continue working after removal of the
> key has been initiated. (We could allow I/O to sometimes fail in that
> case, but that seems problematic for reasons such as writes getting
> silently thrown away and diverging from the existing fscrypt semantics.)
> Therefore, when the filesystem is using extent-based encryption,
> ->mk_secret can't be wiped when the key becomes incompletely removed.
>
> As a prerequisite for doing that, this patch makes the "present" state
> be tracked using a new field, ->mk_present. No behavior is changed yet.
>
> The basic idea here is borrowed from Josef Bacik's patch
> "fscrypt: use a flag to indicate that the master key is being evicted"
> (https://lore.kernel.org/r/e86c16dddc049ff065f877d793ad773e4c6bfad9.1696970227.git.jo...@toxicpanda.com).
> I reimplemented it using a "present" bool instead of an "evicted" flag,
> fixed a couple bugs, and tried to update everything to be consistent.
>
> Note: I considered adding a ->mk_status field instead, holding one of
> FSCRYPT_KEY_STATUS_*. At first that seemed nice, but it ended up being
> more complex (despite simplifying FS_IOC_GET_ENCRYPTION_KEY_STATUS),
> since it would have introduced redundancy and had weird locking rules.
>
> Signed-off-by: Eric Biggers

I based my fscrypt patches on top of this one and ran tests with both btrfs
and ext4 with it applied, in addition to my normal review stuff. You can add

Reviewed-by: Josef Bacik

Thanks,

Josef
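Editorial sketch: the commit message above separates "is the key present" from "does the secret still exist", so that extent-based encryption can keep deriving per-extent keys while a key is incompletely removed. The following is a small userspace model of that state split, not the fscrypt code. The struct mk_model, its open_files counter, and every function name are hypothetical. Only the idea of a separate "present" flag and the three states come from the message, and the quoted patch itself changes no behavior yet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they only mirror the three FSCRYPT_KEY_STATUS_* states. */
enum mk_status { MK_ABSENT, MK_PRESENT, MK_INCOMPLETELY_REMOVED };

struct mk_model {
	bool present;			/* models ->mk_present */
	bool secret_valid;		/* models whether ->mk_secret still holds key material */
	unsigned int open_files;	/* assumption: files still using the key */
};

/* Status depends on the present flag, not on whether the secret exists. */
static enum mk_status mk_get_status(const struct mk_model *mk)
{
	if (mk->present)
		return MK_PRESENT;
	return mk->open_files > 0 ? MK_INCOMPLETELY_REMOVED : MK_ABSENT;
}

/* Key removal: with extent-based encryption the secret must outlive the
 * "present" state for as long as files are still open. */
static void mk_remove(struct mk_model *mk, bool extent_based)
{
	mk->present = false;
	if (!extent_based || mk->open_files == 0)
		mk->secret_valid = false;
}

/* Last user of a removed key going away finally wipes the secret. */
static void mk_close_file(struct mk_model *mk)
{
	assert(mk->open_files > 0);
	if (--mk->open_files == 0 && !mk->present)
		mk->secret_valid = false;
}

static bool mk_can_derive_extent_key(const struct mk_model *mk)
{
	return mk->secret_valid;
}
```

If "present" were inferred from the secret, as the message says is currently done, the incompletely removed state with a live secret could not be represented at all, which is why the flag has to be tracked separately first.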
[PATCH v2 32/36] btrfs: populate ordered_extent with the orig offset
For extent encryption we have to use a logical block nr as input for the IV. For btrfs we're using the offset into the extent we're operating on. For most ordered extents this is the same as the file_offset, however for prealloc and NOCOW we have to use the original offset. Add this as an argument and plumb it through everywhere, this will be used when setting up the bio. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c| 15 ++- fs/btrfs/ordered-data.c | 22 -- fs/btrfs/ordered-data.h | 12 +--- 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index b0109b313217..1b844a27a0d1 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1165,6 +1165,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, /* file_offset */ + start, /* orig_start */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ ins.objectid,/* disk_bytenr */ @@ -1428,8 +1429,8 @@ static noinline int cow_file_range(struct btrfs_inode *inode, } ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, - start, ram_size, ram_size, ins.objectid, - cur_alloc_size, 0, + start, start, ram_size, ram_size, + ins.objectid, cur_alloc_size, 0, 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); free_extent_map(em); @@ -2178,7 +2179,9 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, } ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, - cur_offset, nocow_args.num_bytes, + cur_offset, + found_key.offset - nocow_args.extent_offset, + nocow_args.num_bytes, nocow_args.num_bytes, nocow_args.disk_bytenr, nocow_args.num_bytes, 0, is_prealloc @@ -7087,8 +7090,9 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode, fscrypt_info = orig_em->fscrypt_info; } - ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, start, len, -len, block_start, block_len, 0, + ordered = btrfs_alloc_ordered_extent(inode, 
fscrypt_info, start, +orig_start, len, len, block_start, +block_len, 0, (1 << type) | (1 << BTRFS_ORDERED_DIRECT), BTRFS_COMPRESS_NONE); @@ -10612,6 +10616,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, } ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, + start - encoded->unencoded_offset, num_bytes, ram_bytes, ins.objectid, ins.offset, encoded->unencoded_offset, (1 << BTRFS_ORDERED_ENCODED) | diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index ee3138a6d11e..75eb42b5c95b 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -148,9 +148,9 @@ static inline struct rb_node *ordered_tree_search(struct btrfs_inode *inode, static struct btrfs_ordered_extent *alloc_ordered_extent( struct btrfs_inode *inode, struct fscrypt_extent_info *fscrypt_info, - u64 file_offset, u64 num_bytes, u64 ram_bytes, - u64 disk_bytenr, u64 disk_num_bytes, u64 offset, - unsigned long flags, int compress_type) + u64 file_offset, u64 orig_offset, u64 num_bytes, + u64 ram_bytes, u64 disk_bytenr, u64 disk_num_bytes, + u64 offset, unsigned long flags, int compress_type) { struct btrfs_ordered_extent *entry; int ret; @@ -175,6 +175,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent( return ERR_PTR(-ENOMEM); entry->file_offset = file_offset; + entry->orig_offset = orig_offset; entry->num_byt
[PATCH v2 27/36] btrfs: explicitly track file extent length for replace and drop
From: Sweet Tea Dorminy With the advent of storing fscrypt contexts with each encrypted extent, extents will have a variable length depending on encryption status. Make sure the replace and drop file extent item helpers encode this information so that everything gets updated properly. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/ctree.h| 2 ++ fs/btrfs/file.c | 4 ++-- fs/btrfs/inode.c| 7 +-- fs/btrfs/reflink.c | 1 + fs/btrfs/tree-log.c | 5 +++-- 5 files changed, 13 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index c8f1d2d7c46c..e5879bd7f2f7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -372,6 +372,8 @@ struct btrfs_replace_extent_info { u64 file_offset; /* Pointer to a file extent item of type regular or prealloc. */ char *extent_buf; + /* The length of @extent_buf */ + u32 extent_buf_size; /* * Set to true when attempting to replace a file range with a new extent * described by this structure, set to false when attempting to clone an diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 26905b77c7e8..a19ac854e07f 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2261,14 +2261,14 @@ static int btrfs_insert_replace_extent(struct btrfs_trans_handle *trans, key.type = BTRFS_EXTENT_DATA_KEY; key.offset = extent_info->file_offset; ret = btrfs_insert_empty_item(trans, root, path, &key, - sizeof(struct btrfs_file_extent_item)); + extent_info->extent_buf_size); if (ret) return ret; leaf = path->nodes[0]; slot = path->slots[0]; write_extent_buffer(leaf, extent_info->extent_buf, btrfs_item_ptr_offset(leaf, slot), - sizeof(struct btrfs_file_extent_item)); + extent_info->extent_buf_size); extent = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); ASSERT(btrfs_file_extent_type(leaf, extent) != BTRFS_FILE_EXTENT_INLINE); btrfs_set_file_extent_offset(leaf, extent, extent_info->data_offset); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d20ccfc5038f..03bc9f41bd33 100644 --- 
a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2898,6 +2898,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, u64 num_bytes = btrfs_stack_file_extent_num_bytes(stack_fi); u64 ram_bytes = btrfs_stack_file_extent_ram_bytes(stack_fi); struct btrfs_drop_extents_args drop_args = { 0 }; + size_t fscrypt_context_size = 0; int ret; path = btrfs_alloc_path(); @@ -2917,7 +2918,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, drop_args.start = file_pos; drop_args.end = file_pos + num_bytes; drop_args.replace_extent = true; - drop_args.extent_item_size = sizeof(*stack_fi); + drop_args.extent_item_size = sizeof(*stack_fi) + fscrypt_context_size; ret = btrfs_drop_extents(trans, root, inode, &drop_args); if (ret) goto out; @@ -2928,7 +2929,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, ins.type = BTRFS_EXTENT_DATA_KEY; ret = btrfs_insert_empty_item(trans, root, path, &ins, - sizeof(*stack_fi)); + sizeof(*stack_fi) + fscrypt_context_size); if (ret) goto out; } @@ -9671,6 +9672,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( u64 len = ins->offset; int qgroup_released; int ret; + size_t fscrypt_context_size = 0; memset(&stack_fi, 0, sizeof(stack_fi)); @@ -9703,6 +9705,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( extent_info.data_len = len; extent_info.file_offset = file_offset; extent_info.extent_buf = (char *)&stack_fi; + extent_info.extent_buf_size = sizeof(stack_fi) + fscrypt_context_size; extent_info.is_new_extent = true; extent_info.update_times = true; extent_info.qgroup_reserved = qgroup_released; diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index 3c66630d87ee..f5440ae447a4 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -500,6 +500,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode, clone_info.data_len = datal; clone_info.file_offset = new_key.offset; clone_info.extent_buf = buf; + clone_info.extent_buf_size = 
size; clone_info.is_new_extent = false; clone_info.update_times = !no_time_update;
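The effect of carrying extent_buf_size alongside extent_buf can be shown with a small userspace sketch. Everything named model_* here is hypothetical, and the flat leaf buffer is a stand-in for the real extent buffer helpers; the only idea taken from the patch is that the copy length comes from the caller rather than from sizeof(struct btrfs_file_extent_item).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed part of a file extent item. */
struct model_file_extent_item {
	uint8_t type;
	uint8_t encryption;
	uint64_t num_bytes;
};

/* Hypothetical counterpart of struct btrfs_replace_extent_info. */
struct model_replace_extent_info {
	const char *extent_buf;   /* fixed item plus optional trailing context */
	uint32_t extent_buf_size; /* length of extent_buf in bytes */
};

/*
 * Copy the whole item, including any trailing context, into the leaf.
 * Returns 0 on success or -1 if the leaf has too little room.
 */
static int model_insert_replace_extent(char *leaf, size_t leaf_space,
				       const struct model_replace_extent_info *info)
{
	assert(info->extent_buf_size >= sizeof(struct model_file_extent_item));
	if (info->extent_buf_size > leaf_space)
		return -1;
	memcpy(leaf, info->extent_buf, info->extent_buf_size);
	return 0;
}
```

With a fixed sizeof() copy the trailing bytes would be silently dropped, which is the case the patch is guarding against.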
[PATCH v2 26/36] btrfs: add an optional encryption context to the end of file extents
The fscrypt encryption context can be extended to include different things in the future. To facilitate future expansion add an optional btrfs_encryption_info to the end of the file extent. This will hold the size of the context and then will have the binary context tacked onto the end of the extent item. Add the appropriate accessors to make it easy to read this information if we have encryption set, and then update the tree-checker to validate that if this is indeed set properly that the size matches properly. Signed-off-by: Josef Bacik --- fs/btrfs/accessors.h| 48 +++ fs/btrfs/tree-checker.c | 58 - include/uapi/linux/btrfs_tree.h | 17 +- 3 files changed, 113 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h index 5aaf204fa55f..a54a4671bd15 100644 --- a/fs/btrfs/accessors.h +++ b/fs/btrfs/accessors.h @@ -932,6 +932,10 @@ BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block, BTRFS_SETGET_STACK_FUNCS(super_nr_global_roots, struct btrfs_super_block, nr_global_roots, 64); +/* struct btrfs_file_extent_encryption_info */ +BTRFS_SETGET_FUNCS(encryption_info_size, struct btrfs_encryption_info, size, + 32); + /* struct btrfs_file_extent_item */ BTRFS_SETGET_STACK_FUNCS(stack_file_extent_type, struct btrfs_file_extent_item, type, 8); @@ -973,6 +977,50 @@ BTRFS_SETGET_FUNCS(file_extent_encryption, struct btrfs_file_extent_item, BTRFS_SETGET_FUNCS(file_extent_other_encoding, struct btrfs_file_extent_item, other_encoding, 16); +static inline struct btrfs_encryption_info *btrfs_file_extent_encryption_info( + const struct btrfs_file_extent_item *ei) +{ + unsigned long offset = (unsigned long)ei; + + offset += offsetof(struct btrfs_file_extent_item, encryption_info); + return (struct btrfs_encryption_info *)offset; +} + +static inline unsigned long btrfs_file_extent_encryption_ctx_offset( + const struct btrfs_file_extent_item *ei) +{ + unsigned long offset = (unsigned long)ei; + + offset += offsetof(struct 
btrfs_file_extent_item, encryption_info); + return offset + offsetof(struct btrfs_encryption_info, context); +} + +static inline u32 btrfs_file_extent_encryption_ctx_size( + const struct extent_buffer *eb, + const struct btrfs_file_extent_item *ei) +{ + return btrfs_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei)); +} + +static inline void btrfs_set_file_extent_encryption_ctx_size( + const struct extent_buffer *eb, + struct btrfs_file_extent_item *ei, + u32 val) +{ + btrfs_set_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei), + val); +} + +static inline u32 btrfs_file_extent_encryption_info_size( + const struct extent_buffer *eb, + const struct btrfs_file_extent_item *ei) +{ + return btrfs_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei)); +} + /* btrfs_qgroup_status_item */ BTRFS_SETGET_FUNCS(qgroup_status_generation, struct btrfs_qgroup_status_item, generation, 64); diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index 825b235927c6..0b671c1e96f1 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -211,6 +211,7 @@ static int check_extent_data_item(struct extent_buffer *leaf, u32 item_size = btrfs_item_size(leaf, slot); u64 extent_end; u8 policy; + u8 fe_type; if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) { file_extent_err(leaf, slot, @@ -241,12 +242,12 @@ static int check_extent_data_item(struct extent_buffer *leaf, SZ_4K); return -EUCLEAN; } - if (unlikely(btrfs_file_extent_type(leaf, fi) >= -BTRFS_NR_FILE_EXTENT_TYPES)) { + + fe_type = btrfs_file_extent_type(leaf, fi); + if (unlikely(fe_type >= BTRFS_NR_FILE_EXTENT_TYPES)) { file_extent_err(leaf, slot, "invalid type for file extent, have %u expect range [0, %u]", - btrfs_file_extent_type(leaf, fi), - BTRFS_NR_FILE_EXTENT_TYPES - 1); + fe_type, BTRFS_NR_FILE_EXTENT_TYPES - 1); return -EUCLEAN; } @@ -295,12 +296,51 @@ static int check_extent_data_item(s
[PATCH v2 16/36] btrfs: implement fscrypt ioctls
From: Omar Sandoval

These ioctls allow encryption to actually be used.

The set_encryption_policy ioctl is the thing which actually turns on
encryption, and therefore sets the ENCRYPT flag in the superblock. This
prevents the filesystem from being loaded on older kernels.

fscrypt provides CONFIG_FS_ENCRYPTION-disabled versions of all these
functions which just return -EOPNOTSUPP, so the ioctls don't need to be
compiled out if CONFIG_FS_ENCRYPTION isn't enabled.

We could instead gate this ioctl on the superblock having the flag set,
if we wanted to require mkfs with the encrypt flag in order to have a
filesystem with any encryption.

Signed-off-by: Omar Sandoval
Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c | 28
 1 file changed, 28 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 1f1506280619..5938adb64409 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4575,6 +4575,34 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_fslabel(fs_info, argp);
 	case FS_IOC_SETFSLABEL:
 		return btrfs_ioctl_set_fslabel(file, argp);
+	case FS_IOC_SET_ENCRYPTION_POLICY: {
+		if (!IS_ENABLED(CONFIG_FS_ENCRYPTION))
+			return -EOPNOTSUPP;
+		if (sb_rdonly(fs_info->sb))
+			return -EROFS;
+		/*
+		 * If we crash before we commit, nothing encrypted could have
+		 * been written so it doesn't matter whether the encrypted
+		 * state persists.
+		 */
+		btrfs_set_fs_incompat(fs_info, ENCRYPT);
+		return fscrypt_ioctl_set_policy(file, (const void __user *)arg);
+	}
+	case FS_IOC_GET_ENCRYPTION_POLICY:
+		return fscrypt_ioctl_get_policy(file, (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_POLICY_EX:
+		return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg);
+	case FS_IOC_ADD_ENCRYPTION_KEY:
+		return fscrypt_ioctl_add_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY:
+		return fscrypt_ioctl_remove_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS:
+		return fscrypt_ioctl_remove_key_all_users(file,
+							  (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_KEY_STATUS:
+		return fscrypt_ioctl_get_key_status(file, (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_NONCE:
+		return fscrypt_ioctl_get_nonce(file, (void __user *)arg);
 	case FITRIM:
 		return btrfs_ioctl_fitrim(fs_info, argp);
 	case BTRFS_IOC_SNAP_CREATE:
-- 
2.41.0
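From userspace, one of the read-only ioctls wired up above can be exercised roughly as follows. This is a minimal sketch, assuming a Linux system whose UAPI headers provide <linux/fscrypt.h> (5.4 or later); query_policy_version() is a made-up helper name, and the exact error returned for a given file depends on the filesystem and kernel configuration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fscrypt.h>

/*
 * Return the fscrypt policy version byte for @path, or a negative
 * errno value if the file cannot be opened or the ioctl fails.
 */
static int query_policy_version(const char *path)
{
	struct fscrypt_get_policy_ex_arg arg;
	int fd, ret;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* policy_size is in/out: tell the kernel how much room we have. */
	arg.policy_size = sizeof(arg.policy);
	ret = ioctl(fd, FS_IOC_GET_ENCRYPTION_POLICY_EX, &arg);
	if (ret < 0)
		ret = -errno; /* saved before close() can clobber errno */
	else
		ret = arg.policy.version;
	close(fd);
	return ret;
}
```

A caller would typically treat any negative return as "no usable policy" and only look at the version on success; per the commit message, a kernel built without CONFIG_FS_ENCRYPTION is expected to answer with -EOPNOTSUPP.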
[PATCH v2 22/36] btrfs: add fscrypt_info and encryption_type to ordered_extent
We're going to need these to update the file extent items once the
writes are complete. Add them and add the pieces necessary to assign
them and free everything.

Signed-off-by: Josef Bacik
---
 fs/btrfs/ordered-data.c | 2 ++
 fs/btrfs/ordered-data.h | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 574e8a55e24a..27350dd50828 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -181,6 +181,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 	entry->bytes_left = num_bytes;
 	entry->inode = igrab(&inode->vfs_inode);
 	entry->compress_type = compress_type;
+	entry->encryption_type = BTRFS_ENCRYPTION_NONE;
 	entry->truncated_len = (u64)-1;
 	entry->qgroup_rsv = ret;
 	entry->flags = flags;
@@ -564,6 +565,7 @@ void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)
 			list_del(&sum->list);
 			kvfree(sum);
 		}
+		fscrypt_put_extent_info(entry->fscrypt_info);
 		kmem_cache_free(btrfs_ordered_extent_cache, entry);
 	}
 }
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 567a6d3d4712..cc422bdb5363 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -115,6 +115,9 @@ struct btrfs_ordered_extent {
 	/* compression algorithm */
 	int compress_type;
 
+	/* encryption mode */
+	int encryption_type;
+
 	/* Qgroup reserved space */
 	int qgroup_rsv;
 
@@ -124,6 +127,9 @@ struct btrfs_ordered_extent {
 	/* the inode we belong to */
 	struct inode *inode;
 
+	/* the fscrypt_info for this extent, if necessary */
+	struct fscrypt_extent_info *fscrypt_info;
+
 	/* list of checksums for insertion when the extent io is done */
 	struct list_head list;
 
-- 
2.41.0
[PATCH v2 23/36] btrfs: plumb through setting the fscrypt_info for ordered extents
We're going to be getting fscrypt_info from the extent maps, update the helpers to take an fscrypt_info argument and use that to set the encryption type on the ordered extent. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c| 20 +++- fs/btrfs/ordered-data.c | 32 fs/btrfs/ordered-data.h | 9 + 3 files changed, 36 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 19087fd68cfe..a1fa5b6f3790 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1162,7 +1162,8 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, /* file_offset */ + ordered = btrfs_alloc_ordered_extent(inode, NULL, + start, /* file_offset */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ ins.objectid,/* disk_bytenr */ @@ -1425,9 +1426,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, ram_size, - ram_size, ins.objectid, cur_alloc_size, - 0, 1 << BTRFS_ORDERED_REGULAR, + ordered = btrfs_alloc_ordered_extent(inode, NULL, + start, ram_size, ram_size, ins.objectid, + cur_alloc_size, 0, + 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); if (IS_ERR(ordered)) { ret = PTR_ERR(ordered); @@ -2158,7 +2160,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, free_extent_map(em); } - ordered = btrfs_alloc_ordered_extent(inode, cur_offset, + ordered = btrfs_alloc_ordered_extent(inode, NULL, cur_offset, nocow_args.num_bytes, nocow_args.num_bytes, nocow_args.disk_bytenr, nocow_args.num_bytes, 0, is_prealloc @@ -7040,7 +7042,7 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode, if (IS_ERR(em)) goto out; } - ordered = btrfs_alloc_ordered_extent(inode, start, len, len, + ordered = btrfs_alloc_ordered_extent(inode, NULL, start, len, len, block_start, block_len, 0, (1 << type) | (1 << BTRFS_ORDERED_DIRECT), @@ -10512,9 
+10514,9 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, num_bytes, ram_bytes, - ins.objectid, ins.offset, - encoded->unencoded_offset, + ordered = btrfs_alloc_ordered_extent(inode, NULL, start, + num_bytes, ram_bytes, ins.objectid, + ins.offset, encoded->unencoded_offset, (1 << BTRFS_ORDERED_ENCODED) | (1 << BTRFS_ORDERED_COMPRESSED), compression); diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 27350dd50828..ee3138a6d11e 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -146,9 +146,11 @@ static inline struct rb_node *ordered_tree_search(struct btrfs_inode *inode, } static struct btrfs_ordered_extent *alloc_ordered_extent( - struct btrfs_inode *inode, u64 file_offset, u64 num_bytes, - u64 ram_bytes, u64 disk_bytenr, u64 disk_num_bytes, - u64 offset, unsigned long flags, int compress_type) + struct btrfs_inode *inode, + struct fscrypt_extent_info *fscrypt_info, + u64 file_offset, u64 num_bytes, u64 ram_bytes, + u64 disk_bytenr, u64 disk_num_bytes, u64 offset, + unsigned long flags, int compress_type) { struct btrfs_ordered_extent *entry; int ret; @@ -181,10 +183,12 @@ static struct btrfs_ordered_extent *alloc_ordered_extent( entry->bytes_left = num_bytes; entry->inode = igrab(&inode->vfs_inode); entry->compress_type = compress_type; - entry->encr
[PATCH v2 17/36] btrfs: add encryption to CONFIG_BTRFS_DEBUG
From: Sweet Tea Dorminy

Since encryption is currently under BTRFS_DEBUG, this adds its
dependencies: inline encryption from fscrypt, and the inline encryption
fallback path from the block layer.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 5938adb64409..c56986031870 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4575,6 +4575,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_fslabel(fs_info, argp);
 	case FS_IOC_SETFSLABEL:
 		return btrfs_ioctl_set_fslabel(file, argp);
+#ifdef CONFIG_BTRFS_DEBUG
 	case FS_IOC_SET_ENCRYPTION_POLICY: {
 		if (!IS_ENABLED(CONFIG_FS_ENCRYPTION))
 			return -EOPNOTSUPP;
@@ -4603,6 +4604,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return fscrypt_ioctl_get_key_status(file, (void __user *)arg);
 	case FS_IOC_GET_ENCRYPTION_NONCE:
 		return fscrypt_ioctl_get_nonce(file, (void __user *)arg);
+#endif /* CONFIG_BTRFS_DEBUG */
 	case FITRIM:
 		return btrfs_ioctl_fitrim(fs_info, argp);
 	case BTRFS_IOC_SNAP_CREATE:
-- 
2.41.0
[PATCH v2 34/36] btrfs: add a bio argument to btrfs_csum_one_bio
We only ever needed the bbio in btrfs_csum_one_bio, since that has the
bio embedded in it. However with encryption we'll have a different bio
with the encrypted data in it, and the original bbio. Update
btrfs_csum_one_bio to take the bio we're going to csum as an argument,
which will allow us to csum the encrypted bio and stuff the csums into
the corresponding bbio to be used later when the IO completes.

Signed-off-by: Josef Bacik
---
 fs/btrfs/bio.c       | 2 +-
 fs/btrfs/file-item.c | 3 +--
 fs/btrfs/file-item.h | 2 +-
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 4f3b693a16b1..90e4d4709fa3 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -533,7 +533,7 @@ static blk_status_t btrfs_bio_csum(struct btrfs_bio *bbio)
 {
 	if (bbio->bio.bi_opf & REQ_META)
 		return btree_csum_one_bio(bbio);
-	return btrfs_csum_one_bio(bbio);
+	return btrfs_csum_one_bio(bbio, &bbio->bio);
 }
 
 /*
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 35036fab58c4..d925d6d98bf4 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -730,13 +730,12 @@ int btrfs_lookup_csums_bitmap(struct btrfs_root *root, struct btrfs_path *path,
 /*
  * Calculate checksums of the data contained inside a bio.
  */
-blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio)
+blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio)
 {
 	struct btrfs_ordered_extent *ordered = bbio->ordered;
 	struct btrfs_inode *inode = bbio->inode;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
-	struct bio *bio = &bbio->bio;
 	struct btrfs_ordered_sum *sums;
 	char *data;
 	struct bvec_iter iter;
diff --git a/fs/btrfs/file-item.h b/fs/btrfs/file-item.h
index bb79014024bd..e52d5d71d533 100644
--- a/fs/btrfs/file-item.h
+++ b/fs/btrfs/file-item.h
@@ -51,7 +51,7 @@ int btrfs_lookup_file_extent(struct btrfs_trans_handle *trans,
 int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
 			   struct btrfs_root *root,
 			   struct btrfs_ordered_sum *sums);
-blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio);
+blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio);
 blk_status_t btrfs_alloc_dummy_sum(struct btrfs_bio *bbio);
 int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
 			     struct list_head *list, int search_commit,
-- 
2.41.0
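The shape of the change, checksumming one buffer while storing the results in a different owner, can be sketched in userspace C. All model_* names are hypothetical, the 4-byte "sector" is chosen only to keep the example small, and the byte-sum checksum is a toy stand-in for the real csum algorithms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SECTOR_SIZE 4
#define MODEL_MAX_SUMS    8

/* Hypothetical owner of the checksums; stands in for struct btrfs_bio. */
struct model_bbio {
	uint32_t sums[MODEL_MAX_SUMS];
	size_t nr_sums;
};

/* Toy per-sector checksum (a plain byte sum), not the real btrfs csum. */
static uint32_t model_csum_sector(const uint8_t *data)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < MODEL_SECTOR_SIZE; i++)
		sum += data[i];
	return sum;
}

/*
 * Checksum @len bytes at @data and store the results in @bbio.
 * @data need not belong to @bbio: it may be the encrypted copy.
 * Returns 0 on success or -1 if there are too many sectors.
 */
static int model_csum_one_bio(struct model_bbio *bbio,
			      const uint8_t *data, size_t len)
{
	size_t nr = len / MODEL_SECTOR_SIZE;

	assert(len % MODEL_SECTOR_SIZE == 0);
	if (nr > MODEL_MAX_SUMS)
		return -1;
	for (size_t i = 0; i < nr; i++)
		bbio->sums[i] = model_csum_sector(data + i * MODEL_SECTOR_SIZE);
	bbio->nr_sums = nr;
	return 0;
}
```

Passing the data explicitly is what lets the same owner end up holding checksums of the ciphertext rather than of the plaintext it was originally built around.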
[PATCH v2 20/36] btrfs: set file extent encryption explicitly
From: Sweet Tea Dorminy This puts the long-preserved 1-byte encryption field to work, storing whether the extent is encrypted. Update the tree-checker to allow for the encryption bit to be set to our valid types. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/accessors.h| 2 ++ fs/btrfs/inode.c| 8 ++-- fs/btrfs/tree-checker.c | 8 +--- fs/btrfs/tree-log.c | 2 ++ include/uapi/linux/btrfs_tree.h | 8 +++- 5 files changed, 22 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h index aa0844535644..5aaf204fa55f 100644 --- a/fs/btrfs/accessors.h +++ b/fs/btrfs/accessors.h @@ -949,6 +949,8 @@ BTRFS_SETGET_STACK_FUNCS(stack_file_extent_disk_num_bytes, struct btrfs_file_extent_item, disk_num_bytes, 64); BTRFS_SETGET_STACK_FUNCS(stack_file_extent_compression, struct btrfs_file_extent_item, compression, 8); +BTRFS_SETGET_STACK_FUNCS(stack_file_extent_encryption, +struct btrfs_file_extent_item, encryption, 8); BTRFS_SETGET_FUNCS(file_extent_type, struct btrfs_file_extent_item, type, 8); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e5b52edcb042..9cb8b82ff8be 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2992,7 +2992,9 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans, btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes); btrfs_set_stack_file_extent_ram_bytes(&stack_fi, ram_bytes); btrfs_set_stack_file_extent_compression(&stack_fi, oe->compress_type); - /* Encryption and other encoding is reserved and all 0 */ + btrfs_set_stack_file_extent_encryption(&stack_fi, + BTRFS_ENCRYPTION_NONE); + /* Other encoding is reserved and always 0 */ /* * For delalloc, when completing an ordered extent we update the inode's @@ -9640,7 +9642,9 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( btrfs_set_stack_file_extent_num_bytes(&stack_fi, len); btrfs_set_stack_file_extent_ram_bytes(&stack_fi, len); btrfs_set_stack_file_extent_compression(&stack_fi, 
BTRFS_COMPRESS_NONE); - /* Encryption and other encoding is reserved and all 0 */ + btrfs_set_stack_file_extent_encryption(&stack_fi, + BTRFS_ENCRYPTION_NONE); + /* Other encoding is reserved and always 0 */ qgroup_released = btrfs_qgroup_release_data(inode, file_offset, len); if (qgroup_released < 0) diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index a416cbea75d1..825b235927c6 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -210,6 +210,7 @@ static int check_extent_data_item(struct extent_buffer *leaf, u32 sectorsize = fs_info->sectorsize; u32 item_size = btrfs_item_size(leaf, slot); u64 extent_end; + u8 policy; if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) { file_extent_err(leaf, slot, @@ -261,10 +262,11 @@ static int check_extent_data_item(struct extent_buffer *leaf, BTRFS_NR_COMPRESS_TYPES - 1); return -EUCLEAN; } - if (unlikely(btrfs_file_extent_encryption(leaf, fi))) { + policy = btrfs_file_extent_encryption(leaf, fi); + if (unlikely(policy >= BTRFS_NR_ENCRYPTION_TYPES)) { file_extent_err(leaf, slot, - "invalid encryption for file extent, have %u expect 0", - btrfs_file_extent_encryption(leaf, fi)); + "invalid encryption for file extent, have %u expect range [0, %u]", + policy, BTRFS_NR_ENCRYPTION_TYPES - 1); return -EUCLEAN; } if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index c1fd4ef2dd8b..404577383513 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -4628,6 +4628,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, u64 extent_offset = em->start - em->orig_start; u64 block_len; int ret; + u8 encryption = BTRFS_ENCRYPTION_NONE; btrfs_set_stack_file_extent_generation(&fi, trans->transid); if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) @@ -4649,6 +4650,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, btrfs_set_stack_file_extent_num_bytes(&fi, em->len); btrfs_set_stack_file_extent_ram_bytes(&fi, 
em->ram_bytes); btrfs_set_stack_file_extent_compression(&fi, em->compress_type); + btrfs_set_stack_file_extent_encryption(&fi, encryption); ret = log_extent_csums(trans, inode,
[PATCH v2 28/36] btrfs: pass through fscrypt_extent_info to the file extent helpers
Now that we have the fscrypt_extent_info in all of the supporting
structures, pass this through and set the file extent encryption bit
accordingly from the supporting structures. In subsequent patches code
will be added to populate these appropriately.

Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c    | 18 +++---
 fs/btrfs/tree-log.c | 2 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 03bc9f41bd33..87b38be47d0b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2882,7 +2882,9 @@ int btrfs_writepage_cow_fixup(struct page *page)
 }
 
 static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
-				       struct btrfs_inode *inode, u64 file_pos,
+				       struct btrfs_inode *inode,
+				       struct fscrypt_extent_info *fscrypt_info,
+				       u64 file_pos,
 				       struct btrfs_file_extent_item *stack_fi,
 				       const bool update_inode_bytes,
 				       u64 qgroup_reserved)
@@ -3014,8 +3016,7 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans,
 	btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes);
 	btrfs_set_stack_file_extent_ram_bytes(&stack_fi, ram_bytes);
 	btrfs_set_stack_file_extent_compression(&stack_fi, oe->compress_type);
-	btrfs_set_stack_file_extent_encryption(&stack_fi,
-					       BTRFS_ENCRYPTION_NONE);
+	btrfs_set_stack_file_extent_encryption(&stack_fi, oe->encryption_type);
 	/* Other encoding is reserved and always 0 */
 
 	/*
@@ -3029,8 +3030,9 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans,
 			     test_bit(BTRFS_ORDERED_TRUNCATED, &oe->flags);
 
 	return insert_reserved_file_extent(trans, BTRFS_I(oe->inode),
-					   oe->file_offset, &stack_fi,
-					   update_inode_bytes, oe->qgroup_rsv);
+					   oe->fscrypt_info, oe->file_offset,
+					   &stack_fi, update_inode_bytes,
+					   oe->qgroup_rsv);
 }
 
 /*
@@ -9662,6 +9664,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 				       struct btrfs_trans_handle *trans_in,
 				       struct btrfs_inode *inode,
 				       struct btrfs_key *ins,
+				       struct fscrypt_extent_info *fscrypt_info,
 				       u64 file_offset)
 {
 	struct btrfs_file_extent_item stack_fi;
@@ -9683,6 +9686,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	btrfs_set_stack_file_extent_ram_bytes(&stack_fi, len);
 	btrfs_set_stack_file_extent_compression(&stack_fi, BTRFS_COMPRESS_NONE);
 	btrfs_set_stack_file_extent_encryption(&stack_fi,
+					       fscrypt_info ? BTRFS_ENCRYPTION_FSCRYPT :
 					       BTRFS_ENCRYPTION_NONE);
 	/* Other encoding is reserved and always 0 */
 
@@ -9691,7 +9695,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 		return ERR_PTR(qgroup_released);
 
 	if (trans) {
-		ret = insert_reserved_file_extent(trans, inode,
+		ret = insert_reserved_file_extent(trans, inode, fscrypt_info,
 						  file_offset, &stack_fi,
 						  true, qgroup_released);
 		if (ret)
@@ -9785,7 +9789,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 		last_alloc = ins.offset;
 		trans = insert_prealloc_file_extent(trans, BTRFS_I(inode),
-						    &ins, cur_offset);
+						    &ins, NULL, cur_offset);
 		/*
 		 * Now that we inserted the prealloc extent we can finally
 		 * decrement the number of reservations in the block group.
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 6cdb924944d1..85267cf1f372 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4629,7 +4629,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 	u64 block_len;
 	int ret;
 	size_t fscrypt_context_size = 0;
-	u8 encryption = BTRFS_ENCRYPTION_NONE;
+	u8 encryption = em->encryption_type;
 
 	btrfs_set_stack_file_extent_generation(&fi, trans->transid);
 	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
-- 
2.41.0
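The rule the prealloc path applies, "an extent info pointer means fscrypt, NULL means none", together with the kind of range check a tree-checker would pair with it, looks roughly like this in isolation. The model_* names and the numeric enum values are assumptions made for this sketch; the real constants live in the btrfs UAPI header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the BTRFS_ENCRYPTION_* values; numbers assumed. */
enum model_encryption_type {
	MODEL_ENCRYPTION_NONE,
	MODEL_ENCRYPTION_FSCRYPT,
	MODEL_NR_ENCRYPTION_TYPES,
};

/* Hypothetical stand-in for struct fscrypt_extent_info. */
struct model_fscrypt_extent_info {
	int unused;
};

/* Pick the on-disk encryption value from the presence of extent info. */
static enum model_encryption_type
model_extent_encryption(const struct model_fscrypt_extent_info *info)
{
	return info ? MODEL_ENCRYPTION_FSCRYPT : MODEL_ENCRYPTION_NONE;
}

/* Store the value in the 1-byte encryption field of an item. */
static void model_set_encryption(uint8_t *field,
				 const struct model_fscrypt_extent_info *info)
{
	assert(field != NULL);
	*field = (uint8_t)model_extent_encryption(info);
}

/* Range check in the spirit of a tree-checker: reject unknown values. */
static bool model_encryption_valid(unsigned int value)
{
	return value < (unsigned int)MODEL_NR_ENCRYPTION_TYPES;
}
```

Keeping the selection in one helper means every caller that passes NULL for now, as __btrfs_prealloc_file_range does in this patch, gets the "none" value without special casing.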
[PATCH v2 35/36] btrfs: add orig_logical to btrfs_bio
When checksumming the encrypted bio on writes we need to know which
logical address this checksum is for. At the point where we get the
encrypted bio, bi_sector is the physical location on the target disk, so
we need to save the original logical offset in the btrfs_bio. Then we
can use this when csum'ing the bio instead of bio->bi_iter.bi_sector.

Signed-off-by: Josef Bacik
---
 fs/btrfs/bio.c       | 9 +++++++++
 fs/btrfs/bio.h       | 3 +++
 fs/btrfs/file-item.c | 2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 90e4d4709fa3..7d6931e53beb 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -96,6 +96,7 @@ static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info,
 	if (bbio_has_ordered_extent(bbio)) {
 		refcount_inc(&orig_bbio->ordered->refs);
 		bbio->ordered = orig_bbio->ordered;
+		orig_bbio->orig_logical += map_length;
 	}
 	atomic_inc(&orig_bbio->pending_ios);
 	return bbio;
@@ -674,6 +675,14 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 		goto fail;
 	}
 
+	/*
+	 * For fscrypt writes we will get the encrypted bio after we've remapped
+	 * our bio to the physical disk location, so we need to save the
+	 * original bytenr so we know what we're checksumming.
+	 */
+	if (bio_op(bio) == REQ_OP_WRITE && is_data_bbio(bbio))
+		bbio->orig_logical = logical;
+
 	map_length = min(map_length, length);
 	if (use_append)
 		map_length = min(map_length, fs_info->max_zone_append_size);
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index ca79decee060..5d3f53dcd6d5 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -54,11 +54,14 @@ struct btrfs_bio {
 		 * - pointer to the checksums for this bio
 		 * - original physical address from the allocator
 		 *   (for zone append only)
+		 * - original logical address, used for checksumming fscrypt
+		 *   bios.
 		 */
 		struct {
 			struct btrfs_ordered_extent *ordered;
 			struct btrfs_ordered_sum *sums;
 			u64 orig_physical;
+			u64 orig_logical;
 		};
 
 		/* For metadata reads: parentness verification. */
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index d925d6d98bf4..26e3bc602655 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -756,7 +756,7 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio)
 	sums->len = bio->bi_iter.bi_size;
 	INIT_LIST_HEAD(&sums->list);
 
-	sums->logical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+	sums->logical = bbio->orig_logical;
 	index = 0;
 
 	shash->tfm = fs_info->csum_shash;
-- 
2.41.0
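Reader's note on the patch above: the following is a small userspace
model, not kernel code, of why the logical address has to be saved
before the bio is remapped. Every name in it (model_bbio,
model_save_logical, model_split, model_remap, model_csum_logical) and
the fixed phys_delta offset are invented for illustration; only the idea
of an orig_logical field comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9

/* Hypothetical stand-in for the two btrfs_bio fields that matter here. */
struct model_bbio {
	uint64_t bi_sector;	/* 512-byte sectors; physical after remap */
	uint64_t orig_logical;	/* logical byte address saved at submit */
};

/* Before any remapping, bi_sector still encodes the logical address. */
static void model_save_logical(struct model_bbio *bbio)
{
	bbio->orig_logical = bbio->bi_sector << SECTOR_SHIFT;
}

/*
 * Model of a split: the first map_length bytes go to a new bbio, and the
 * remainder's saved logical address moves forward by the same amount.
 */
static struct model_bbio model_split(struct model_bbio *orig,
				     uint64_t map_length)
{
	struct model_bbio front = *orig;

	assert((map_length & ((1ULL << SECTOR_SHIFT) - 1)) == 0);
	orig->bi_sector += map_length >> SECTOR_SHIFT;
	orig->orig_logical += map_length;
	return front;
}

/* Invented remap: shift by a fixed logical-to-physical delta in bytes. */
static void model_remap(struct model_bbio *bbio, uint64_t phys_delta)
{
	bbio->bi_sector += phys_delta >> SECTOR_SHIFT;
}

/* What the checksum code should use once bi_sector is physical. */
static uint64_t model_csum_logical(const struct model_bbio *bbio)
{
	return bbio->orig_logical;
}
```

After model_remap(), shifting bi_sector back into bytes no longer yields
the logical address, which is the situation the patch describes for the
encrypted bio.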
[PATCH v2 31/36] btrfs: setup fscrypt_extent_info for new extents
New extents for encrypted inodes must have a fscrypt_extent_info, which has the necessary keys and does all the registration at the block layer for them. This is passed through all of the infrastructure we've previously added to make sure the context gets saved properly with the file extents. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c | 39 +-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4f23c3af60be..b0109b313217 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7396,7 +7396,20 @@ static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start, set_bit(EXTENT_FLAG_COMPRESSED, &em->flags); em->compress_type = compress_type; } - em->encryption_type = BTRFS_ENCRYPTION_NONE; + + if (IS_ENCRYPTED(&inode->vfs_inode)) { + struct fscrypt_extent_info *fscrypt_info; + + em->encryption_type = BTRFS_ENCRYPTION_FSCRYPT; + fscrypt_info = fscrypt_prepare_new_extent(&inode->vfs_inode); + if (IS_ERR(fscrypt_info)) { + free_extent_map(em); + return ERR_CAST(fscrypt_info); + } + em->fscrypt_info = fscrypt_info; + } else { + em->encryption_type = BTRFS_ENCRYPTION_NONE; + } ret = btrfs_replace_extent_map_range(inode, em, true); if (ret) { @@ -9785,6 +9798,9 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, if (trans) own_trans = false; while (num_bytes > 0) { + struct fscrypt_extent_info *fscrypt_info = NULL; + int encryption_type = BTRFS_ENCRYPTION_NONE; + cur_bytes = min_t(u64, num_bytes, SZ_256M); cur_bytes = max(cur_bytes, min_size); /* @@ -9799,6 +9815,20 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, if (ret) break; + if (IS_ENCRYPTED(inode)) { + fscrypt_info = fscrypt_prepare_new_extent(inode); + if (IS_ERR(fscrypt_info)) { + btrfs_dec_block_group_reservations(fs_info, + ins.objectid); + btrfs_free_reserved_extent(fs_info, + ins.objectid, + ins.offset, 0); + ret = PTR_ERR(fscrypt_info); + break; + } + encryption_type = BTRFS_ENCRYPTION_FSCRYPT; + } 
+ /* * We've reserved this space, and thus converted it from * ->bytes_may_use to ->bytes_reserved. Any error that happens @@ -9810,7 +9840,8 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, last_alloc = ins.offset; trans = insert_prealloc_file_extent(trans, BTRFS_I(inode), - &ins, NULL, cur_offset); + &ins, fscrypt_info, + cur_offset); /* * Now that we inserted the prealloc extent we can finally * decrement the number of reservations in the block group. @@ -9820,6 +9851,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, btrfs_dec_block_group_reservations(fs_info, ins.objectid); if (IS_ERR(trans)) { ret = PTR_ERR(trans); + fscrypt_put_extent_info(fscrypt_info); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0); break; @@ -9827,6 +9859,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, em = alloc_extent_map(); if (!em) { + fscrypt_put_extent_info(fscrypt_info); btrfs_drop_extent_map_range(BTRFS_I(inode), cur_offset, cur_offset + ins.offset - 1, false); btrfs_set_inode_full_sync(BTRFS_I(inode)); @@ -9842,6 +9875,8 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, em->ram_bytes = ins.offset; set_bit(EXTENT_FLAG_PREALLOC, &em->flags); em->generation = trans->transid; + em->fscrypt_info = fscrypt_info; + em->encryption_type = encryption_
[PATCH v2 33/36] btrfs: set the bio fscrypt context when applicable
Now that we have the fscrypt_info plumbed through everywhere, add the code to setup the bio encryption context from the extent context. We use the per-extent fscrypt_extent_info for encryption/decryption. We use the offset into the extent as the lblk for fscrypt. So the start of the extent has the lblk of 0, 4k into the extent has the lblk of 4k, etc. This is done to allow things like relocation to continue to work properly. Signed-off-by: Josef Bacik --- fs/btrfs/compression.c | 6 fs/btrfs/extent_io.c | 63 +- fs/btrfs/fscrypt.c | 36 fs/btrfs/fscrypt.h | 22 +++ fs/btrfs/inode.c | 10 +++ 5 files changed, 136 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 19b22b4653c8..3f586ee40b94 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -36,6 +36,7 @@ #include "zoned.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" static struct bio_set btrfs_compressed_bioset; @@ -301,6 +302,9 @@ void btrfs_submit_compressed_write(struct btrfs_ordered_extent *ordered, cb->bbio.ordered = ordered; btrfs_add_compressed_bio_pages(cb); + btrfs_set_bio_crypt_ctx_from_extent(&cb->bbio.bio, inode, + ordered->fscrypt_info, 0); + btrfs_submit_bio(&cb->bbio, 0); } @@ -504,6 +508,8 @@ void btrfs_submit_compressed_read(struct btrfs_bio *bbio) cb->compress_type = em->compress_type; cb->orig_bbio = bbio; + btrfs_set_bio_crypt_ctx_from_extent(&cb->bbio.bio, inode, + em->fscrypt_info, 0); free_extent_map(em); cb->nr_pages = DIV_ROUND_UP(compressed_len, PAGE_SIZE); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index c4265826278d..2251417106ea 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -37,6 +37,7 @@ #include "dev-replace.h" #include "super.h" #include "transaction.h" +#include "fscrypt.h" static struct kmem_cache *extent_buffer_cache; @@ -103,6 +104,10 @@ struct btrfs_bio_ctrl { blk_opf_t opf; btrfs_bio_end_io_t end_io_func; struct writeback_control *wbc; + + /* This is set for reads and we 
have encryption. */ + struct fscrypt_extent_info *fscrypt_info; + u64 orig_start; }; static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl) @@ -707,10 +712,31 @@ static bool btrfs_bio_is_contig(struct btrfs_bio_ctrl *bio_ctrl, struct page *page, u64 disk_bytenr, unsigned int pg_offset) { - struct bio *bio = &bio_ctrl->bbio->bio; + struct inode *inode = page->mapping->host; + struct btrfs_bio *bbio = bio_ctrl->bbio; + struct bio *bio = &bbio->bio; struct bio_vec *bvec = bio_last_bvec_all(bio); const sector_t sector = disk_bytenr >> SECTOR_SHIFT; + if (IS_ENCRYPTED(inode)) { + u64 file_offset = page_offset(page) + pg_offset; + u64 offset = 0; + struct fscrypt_extent_info *fscrypt_info = NULL; + + /* bio_ctrl->fscrypt_info is only set in the READ case. */ + if (bio_ctrl->fscrypt_info) { + offset = file_offset - bio_ctrl->orig_start; + fscrypt_info = bio_ctrl->fscrypt_info; + } else if (bbio->ordered) { + fscrypt_info = bbio->ordered->fscrypt_info; + offset = file_offset - bbio->ordered->orig_offset; + } + + if (!btrfs_mergeable_encrypted_bio(bio, inode, fscrypt_info, + offset)) + return false; + } + if (bio_ctrl->compress_type != BTRFS_COMPRESS_NONE) { /* * For compression, all IO should have its logical bytenr set @@ -741,6 +767,8 @@ static void alloc_new_bio(struct btrfs_inode *inode, { struct btrfs_fs_info *fs_info = inode->root->fs_info; struct btrfs_bio *bbio; + struct fscrypt_extent_info *fscrypt_info = NULL; + u64 offset = 0; bbio = btrfs_bio_alloc(BIO_MAX_VECS, bio_ctrl->opf, fs_info, bio_ctrl->end_io_func, NULL); @@ -760,6 +788,8 @@ static void alloc_new_bio(struct btrfs_inode *inode, ordered->file_offset + ordered->disk_num_bytes - file_offset); bbio->ordered = ordered; + fscrypt_info = ordered->fscrypt_info; + offset = file_offset - ordered->orig_offset;
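Reader's note on the lblk choice in the commit message above: a minimal
userspace sketch, assuming the lblk is simply the byte offset from the
start of the extent as the message says. model_extent and
model_extent_lblk are hypothetical names, not btrfs or fscrypt symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of where an extent sits in a file and on disk. */
struct model_extent {
	uint64_t orig_start;	/* file offset where the extent begins */
	uint64_t disk_bytenr;	/* where the extent currently lives on disk */
};

/* Offset of file_offset from the start of its extent, in bytes. */
static uint64_t model_extent_lblk(const struct model_extent *ext,
				  uint64_t file_offset)
{
	assert(file_offset >= ext->orig_start);
	return file_offset - ext->orig_start;
}
```

Because disk_bytenr does not enter the calculation, moving the extent to
a new disk location leaves the value unchanged. That is the property the
message relies on for relocation.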
[PATCH v2 36/36] btrfs: implement process_bio cb for fscrypt
We are going to be checksumming the encrypted data, so we have to implement the ->process_bio fscrypt callback. This will provide us with the original bio and the encrypted bio to do work on. For WRITE's this will happen after the encrypted bio has been encrypted. For READ's this will happen after the read has completed and before the decryption step is done. For write's this is straightforward, we can just pass in the encrypted bio to btrfs_csum_one_bio and then the csums will be added to the bbio as normal. For read's this is relatively straightforward, but requires some care. We assume (because that's how it works currently) that the encrypted bio match the original bio, this is important because we save the iter of the bio before we submit. If this changes in the future we'll need a hook to give us the bi_iter of the decryption bio before it's submitted. We check the csums before decryption. If it doesn't match we simply error out and we let the normal path handle the repair work. Signed-off-by: Josef Bacik --- fs/btrfs/bio.c | 34 +- fs/btrfs/bio.h | 3 +++ fs/btrfs/fscrypt.c | 19 +++ 3 files changed, 55 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index 7d6931e53beb..27ebf6373c8f 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -280,6 +280,34 @@ static struct btrfs_failed_bio *repair_one_sector(struct btrfs_bio *failed_bbio, return fbio; } +blk_status_t btrfs_check_encrypted_read_bio(struct btrfs_bio *bbio, + struct bio *enc_bio) +{ + struct btrfs_inode *inode = bbio->inode; + struct btrfs_fs_info *fs_info = inode->root->fs_info; + u32 sectorsize = fs_info->sectorsize; + struct bvec_iter iter = bbio->saved_iter; + struct btrfs_device *dev = bbio->bio.bi_private; + u32 offset = 0; + + /* +* We have to use a copy of iter in case there's an error, +* btrfs_check_read_bio will handle submitting the repair bios. 
+*/ + while (iter.bi_size) { + struct bio_vec bv = bio_iter_iovec(enc_bio, iter); + + bv.bv_len = min(bv.bv_len, sectorsize); + if (!btrfs_data_csum_ok(bbio, dev, offset, &bv)) + return BLK_STS_IOERR; + bio_advance_iter_single(enc_bio, &iter, sectorsize); + offset += sectorsize; + } + + bbio->csum_done = true; + return BLK_STS_OK; +} + static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *dev) { struct btrfs_inode *inode = bbio->inode; @@ -305,6 +333,10 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de /* Clear the I/O error. A failed repair will reset it. */ bbio->bio.bi_status = BLK_STS_OK; + /* This was an encrypted bio and we've already done the csum check. */ + if (status == BLK_STS_OK && bbio->csum_done) + goto out; + while (iter->bi_size) { struct bio_vec bv = bio_iter_iovec(&bbio->bio, *iter); @@ -315,7 +347,7 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de bio_advance_iter_single(&bbio->bio, iter, sectorsize); offset += sectorsize; } - +out: if (bbio->csum != bbio->csum_inline) kfree(bbio->csum); diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h index 5d3f53dcd6d5..393ef32f5321 100644 --- a/fs/btrfs/bio.h +++ b/fs/btrfs/bio.h @@ -45,6 +45,7 @@ struct btrfs_bio { struct { u8 *csum; u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE]; + bool csum_done; struct bvec_iter saved_iter; }; @@ -110,5 +111,7 @@ void btrfs_submit_repair_write(struct btrfs_bio *bbio, int mirror_num, bool dev_ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start, u64 length, u64 logical, struct page *page, unsigned int pg_offset, int mirror_num); +blk_status_t btrfs_check_encrypted_read_bio(struct btrfs_bio *bbio, + struct bio *enc_bio); #endif diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c index 726cb6121934..b7e92ee5e60b 100644 --- a/fs/btrfs/fscrypt.c +++ b/fs/btrfs/fscrypt.c @@ -15,6 +15,7 @@ #include "transaction.h" #include "volumes.h" #include "xattr.h" +#include 
"file-item.h" /* * From a given location in a leaf, read a name into a qstr (usually a @@ -214,6 +215,23 @@ static struct block_device **btrfs_fscrypt_get_devices(struct super_block *sb, return devs; } +static blk_status_t btrfs_process_encrypted_bio(str
[PATCH v2 24/36] btrfs: populate the ordered_extent with the fscrypt context
The fscrypt_extent_info will be tied to the extent_map lifetime, so it will be created when we create the IO em, or it'll already exist in the NOCOW case. Use this fscrypt_info when creating the ordered extent to make sure everything is passed through properly. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c | 62 +--- 1 file changed, 43 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a1fa5b6f3790..7d859e327485 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1160,9 +1160,8 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, ret = PTR_ERR(em); goto out_free_reserve; } - free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, NULL, + ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, /* file_offset */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ @@ -1171,6 +1170,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, 0, /* offset */ 1 << BTRFS_ORDERED_COMPRESSED, async_extent->compress_type); + free_extent_map(em); if (IS_ERR(ordered)) { btrfs_drop_extent_map_range(inode, start, end, false); ret = PTR_ERR(ordered); @@ -1424,13 +1424,13 @@ static noinline int cow_file_range(struct btrfs_inode *inode, ret = PTR_ERR(em); goto out_reserve; } - free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, NULL, + ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, ram_size, ram_size, ins.objectid, cur_alloc_size, 0, 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); + free_extent_map(em); if (IS_ERR(ordered)) { ret = PTR_ERR(ordered); goto out_drop_extent_cache; @@ -2003,6 +2003,8 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, struct btrfs_key found_key; struct btrfs_file_extent_item *fi; struct extent_buffer *leaf; + struct extent_map *em = NULL; + struct fscrypt_extent_info *fscrypt_info = NULL; u64 extent_end; u64 ram_bytes; u64 nocow_end; @@ -2143,7 +2145,6 @@ static noinline int 
run_delalloc_nocow(struct btrfs_inode *inode, is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; if (is_prealloc) { u64 orig_start = found_key.offset - nocow_args.extent_offset; - struct extent_map *em; em = create_io_em(inode, cur_offset, nocow_args.num_bytes, orig_start, @@ -2157,16 +2158,32 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, ret = PTR_ERR(em); goto error; } - free_extent_map(em); + fscrypt_info = em->fscrypt_info; + } else if (IS_ENCRYPTED(&inode->vfs_inode)) { + /* +* We only want to do this lookup if we're encrypted, +* otherwise fsrypt_info will be null and we can avoid +* this lookup. +*/ + em = btrfs_get_extent(inode, NULL, 0, cur_offset, + nocow_args.num_bytes); + if (IS_ERR(em)) { + btrfs_dec_nocow_writers(nocow_bg); + ret = PTR_ERR(em); + goto error; + } + fscrypt_info = em->fscrypt_info; } - ordered = btrfs_alloc_ordered_extent(inode, NULL, cur_offset, - nocow_args.num_bytes, nocow_args.num_bytes, - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, + ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, + cur_offset, nocow_args.num_bytes, + nocow_args.num_bytes, nocow_args.disk_bytenr, + nocow_args.num_byt
[PATCH v2 30/36] btrfs: implement the fscrypt extent encryption hooks
This patch implements the necessary hooks from fscrypt to support per-extent encryption. There's two main entry points btrfs_fscrypt_load_extent_info btrfs_fscrypt_save_extent_info btrfs_fscrypt_load_extent_info gets called when we create the extent maps from the file extent item at btrfs_get_extent() time. We read the extent context, and pass it into fscrypt to create the appropriate fscrypt_extent_info structure. This is then used on the bio's to make sure the encryption is done properly. btrfs_fscrypt_save_extent_info is used to generate the fscrypt context from fscrypt and save it into the file extent item when we create a new file extent item. Signed-off-by: Josef Bacik --- fs/btrfs/defrag.c| 10 - fs/btrfs/file-item.c | 11 +- fs/btrfs/file-item.h | 5 - fs/btrfs/file.c | 9 fs/btrfs/fscrypt.c | 49 fs/btrfs/fscrypt.h | 31 fs/btrfs/inode.c | 22 +++- fs/btrfs/tree-log.c | 10 + 8 files changed, 143 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c index 5244561e2016..f3b7438ddbc7 100644 --- a/fs/btrfs/defrag.c +++ b/fs/btrfs/defrag.c @@ -16,6 +16,7 @@ #include "defrag.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" static struct kmem_cache *btrfs_inode_defrag_cachep; @@ -631,9 +632,12 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, struct btrfs_path path = { 0 }; struct extent_map *em; struct btrfs_key key; + struct btrfs_fscrypt_ctx ctx; u64 ino = btrfs_ino(inode); int ret; + ctx.size = 0; + em = alloc_extent_map(); if (!em) { ret = -ENOMEM; @@ -728,7 +732,7 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, goto next; /* Now this extent covers @start, convert it to em */ - btrfs_extent_item_to_extent_map(inode, &path, fi, em); + btrfs_extent_item_to_extent_map(inode, &path, fi, em, &ctx); break; next: ret = btrfs_next_item(root, &path); @@ -738,6 +742,10 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, goto not_found; } 
btrfs_release_path(&path); + + ret = btrfs_fscrypt_load_extent_info(inode, em, &ctx); + if (ret) + goto err; return em; not_found: diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index 26f35c1baedc..35036fab58c4 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -21,6 +21,7 @@ #include "accessors.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" #define __MAX_CSUM_ITEMS(r, size) ((unsigned long)(((BTRFS_LEAF_DATA_SIZE(r) - \ sizeof(struct btrfs_item) * 2) / \ @@ -1264,7 +1265,8 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans, void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, const struct btrfs_path *path, struct btrfs_file_extent_item *fi, -struct extent_map *em) +struct extent_map *em, +struct btrfs_fscrypt_ctx *ctx) { struct btrfs_fs_info *fs_info = inode->root->fs_info; struct btrfs_root *root = inode->root; @@ -1306,6 +1308,13 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, set_bit(EXTENT_FLAG_PREALLOC, &em->flags); } em->encryption_type = btrfs_file_extent_encryption(leaf, fi); + if (em->encryption_type != BTRFS_ENCRYPTION_NONE) { + ctx->size = + btrfs_file_extent_encryption_ctx_size(leaf, fi); + read_extent_buffer(leaf, ctx->ctx, + btrfs_file_extent_encryption_ctx_offset(fi), + ctx->size); + } } else if (type == BTRFS_FILE_EXTENT_INLINE) { em->block_start = EXTENT_MAP_INLINE; em->start = extent_start; diff --git a/fs/btrfs/file-item.h b/fs/btrfs/file-item.h index 04bd2d34efb1..bb79014024bd 100644 --- a/fs/btrfs/file-item.h +++ b/fs/btrfs/file-item.h @@ -5,6 +5,8 @@ #include "accessors.h" +struct btrfs_fscrypt_ctx; + #define BTRFS_FILE_EXTENT_INLINE_DATA_START\ (offsetof(struct btrfs_file_extent_item, disk_bytenr)) @@ -63,7 +65,8 @@ int btrfs_lookup_csums_bitmap(struct btrfs_root *root, struct btrfs_path *path, void btrfs_extent_item_to_extent_map(
[PATCH v2 14/36] btrfs: adapt readdir for encrypted and nokey names
From: Omar Sandoval

Deleting an encrypted file must always be permitted, even if the user
does not have the appropriate key. Therefore, for listing an encrypted
directory, so-called 'nokey' names are provided, and these nokey names
must be sufficient to look up and delete the appropriate encrypted
files. See 'struct fscrypt_nokey_name' for more information on the
format of these names.

The first part of supporting nokey names is allowing lookups by nokey
name. Only a few entry points need to support these: deleting a
directory, file, or subvolume. Each of these calls
fscrypt_setup_filename() with a '1' argument, indicating that the key is
not required and therefore a nokey name may be provided. If a nokey name
is provided, the fscrypt_name returned by fscrypt_setup_filename() will
not have its disk_name field populated, but will have various other
fields set.

This change alters the relevant codepaths to pass a complete
fscrypt_name anywhere that it might contain a nokey name. When it does
contain a nokey name, the first successful match against a stored name
populates the disk_name field of the fscrypt_name, allowing the caller
to use the normal disk name codepaths afterward. Otherwise, the matching
closely follows fscrypt_match_name().

Functions where most callers provide a fscrypt_str are duplicated and
adapted for a fscrypt_name, and functions where most callers provide a
fscrypt_name are changed to require one at all call sites.
Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/btrfs_inode.h | 2 +- fs/btrfs/delayed-inode.c | 29 ++- fs/btrfs/delayed-inode.h | 6 +- fs/btrfs/dir-item.c | 77 --- fs/btrfs/dir-item.h | 11 ++- fs/btrfs/extent_io.c | 18 + fs/btrfs/extent_io.h | 3 + fs/btrfs/fscrypt.c | 34 + fs/btrfs/fscrypt.h | 19 + fs/btrfs/inode.c | 158 ++- fs/btrfs/root-tree.c | 8 +- fs/btrfs/root-tree.h | 2 +- fs/btrfs/tree-log.c | 3 +- 13 files changed, 297 insertions(+), 73 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 052072373078..f6ffebeb2c8d 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -428,7 +428,7 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry); int btrfs_set_inode_index(struct btrfs_inode *dir, u64 *index); int btrfs_unlink_inode(struct btrfs_trans_handle *trans, struct btrfs_inode *dir, struct btrfs_inode *inode, - const struct fscrypt_str *name); + struct fscrypt_name *name); int btrfs_add_link(struct btrfs_trans_handle *trans, struct btrfs_inode *parent_inode, struct btrfs_inode *inode, const struct fscrypt_str *name, int add_backref, u64 index); diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 35d7616615c1..43b5fb3fce27 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1762,7 +1762,9 @@ int btrfs_should_delete_dir_index(struct list_head *del_list, /* * Read dir info stored in the delayed tree. */ -int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, +int btrfs_readdir_delayed_dir_index(struct inode *inode, + struct fscrypt_str *fstr, + struct dir_context *ctx, struct list_head *ins_list) { struct btrfs_dir_item *di; @@ -1772,6 +1774,7 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, int name_len; int over = 0; unsigned char d_type; + size_t fstr_len = fstr->len; /* * Changing the data of the delayed item is impossible. 
So @@ -1796,8 +1799,28 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, d_type = fs_ftype_to_dtype(btrfs_dir_flags_to_ftype(di->type)); btrfs_disk_key_to_cpu(&location, &di->location); - over = !dir_emit(ctx, name, name_len, - location.objectid, d_type); + if (di->type & BTRFS_FT_ENCRYPTED) { + int ret; + struct fscrypt_str iname = FSTR_INIT(name, name_len); + + fstr->len = fstr_len; + /* +* The hash is only used when the encryption key is not +* available. But if we have delayed insertions, then we +* must have the encryption key available or we wouldn't +* have been able to create entries in the directory. +* So, we don't calculate the hash. +*/ +
[PATCH v2 29/36] btrfs: pass the fscrypt_info through the replace extent infrastructure
Prealloc uses the btrfs_replace_file_extents() infrastructure to insert
its new extents. We need to set the fscrypt context on these extents, so
pass this through the btrfs_replace_extent_info so it can be used in a
later patch when we hook in this infrastructure.

Signed-off-by: Josef Bacik
---
 fs/btrfs/ctree.h | 2 ++
 fs/btrfs/inode.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e5879bd7f2f7..f5367091c0cd 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -374,6 +374,8 @@ struct btrfs_replace_extent_info {
 	char *extent_buf;
 	/* The length of @extent_buf */
 	u32 extent_buf_size;
+	/* The fscrypt_extent_info for a new extent. */
+	struct fscrypt_extent_info *fscrypt_info;
 	/*
 	 * Set to true when attempting to replace a file range with a new extent
 	 * described by this structure, set to false when attempting to clone an
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 87b38be47d0b..99fb5a613fb8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9714,6 +9714,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	extent_info.update_times = true;
 	extent_info.qgroup_reserved = qgroup_released;
 	extent_info.insertions = 0;
+	extent_info.fscrypt_info = fscrypt_info;
 
 	path = btrfs_alloc_path();
 	if (!path) {
-- 
2.41.0
[PATCH v2 21/36] btrfs: add fscrypt_info and encryption_type to extent_map
From: Sweet Tea Dorminy Each extent_map will end up with a pointer to its associated fscrypt_info if any, which should have the same lifetime as the extent_map. We are also going to need to track the encryption_type for the file extent items. Add the fscrypt_info to the extent_map, and the subsequent code for transferring it in the split and merge cases, as well as the code necessary to free them. A future patch will add the code to load them as appropriate. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/extent_map.c | 32 +--- fs/btrfs/extent_map.h | 2 ++ fs/btrfs/file-item.c | 1 + fs/btrfs/inode.c | 1 + 4 files changed, 33 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index af5ff6b10865..8c8023388758 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -61,6 +61,7 @@ struct extent_map *alloc_extent_map(void) static void __free_extent_map(struct extent_map *em) { + fscrypt_put_extent_info(em->fscrypt_info); if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) kfree(em->map_lookup); kmem_cache_free(extent_map_cache, em); @@ -103,12 +104,24 @@ void free_extent_map_safe(struct extent_map_tree *tree, if (!em) return; - if (refcount_dec_and_test(&em->refs)) { - WARN_ON(extent_map_in_tree(em)); - WARN_ON(!list_empty(&em->list)); + if (!refcount_dec_and_test(&em->refs)) + return; + + WARN_ON(extent_map_in_tree(em)); + WARN_ON(!list_empty(&em->list)); + + /* +* We could take a lock freeing the fscrypt_info, so add this to the +* list of freed_extents to be freed later. +*/ + if (em->fscrypt_info) { list_add_tail(&em->free_list, &tree->freed_extents); set_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags); + return; } + + /* Nothing scary here, just free the object. 
*/ + __free_extent_map(em); } /* @@ -274,6 +287,12 @@ static int mergable_maps(struct extent_map *prev, struct extent_map *next) if (!list_empty(&prev->list) || !list_empty(&next->list)) return 0; + /* +* Don't merge adjacent encrypted maps. +*/ + if (prev->fscrypt_info || next->fscrypt_info) + return 0; + ASSERT(next->block_start != EXTENT_MAP_DELALLOC && prev->block_start != EXTENT_MAP_DELALLOC); @@ -884,6 +903,8 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->generation = gen; split->flags = flags; split->compress_type = em->compress_type; + split->fscrypt_info = + fscrypt_get_extent_info(em->fscrypt_info); replace_extent_mapping(em_tree, em, split, modified); free_extent_map(split); split = split2; @@ -925,6 +946,8 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->orig_block_len = 0; } + split->fscrypt_info = + fscrypt_get_extent_info(em->fscrypt_info); if (extent_map_in_tree(em)) { replace_extent_mapping(em_tree, em, split, modified); @@ -1087,6 +1110,7 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_pre->flags = flags; split_pre->compress_type = em->compress_type; split_pre->generation = em->generation; + split_pre->fscrypt_info = fscrypt_get_extent_info(em->fscrypt_info); replace_extent_mapping(em_tree, em, split_pre, 1); @@ -1106,6 +1130,8 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_mid->flags = flags; split_mid->compress_type = em->compress_type; split_mid->generation = em->generation; + split_mid->fscrypt_info = fscrypt_get_extent_info(em->fscrypt_info); + add_extent_mapping(em_tree, split_mid, 1); /* Once for us */ diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index 2093720271ea..2d618e61ceb5 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -50,10 +50,12 @@ struct extent_map { */ u64 generation; unsigned long flags; + struct fscrypt_extent_info *fscrypt_info; /* 
Used for chunk mappings, flag EXTENT_FLAG_FS_MAPPING must be set */ struct map_lookup *map_lookup; refcount_t refs
[PATCH v2 18/36] btrfs: add get_devices hook for fscrypt
From: Sweet Tea Dorminy

Since extent encryption requires inline encryption, even though we
expect to use the inlinecrypt software fallback most of the time, we
need to enumerate all the devices in use by btrfs.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/fscrypt.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c
index 9103da28af7e..2d037b105b5f 100644
--- a/fs/btrfs/fscrypt.c
+++ b/fs/btrfs/fscrypt.c
@@ -11,7 +11,9 @@
 #include "ioctl.h"
 #include "messages.h"
 #include "root-tree.h"
+#include "super.h"
 #include "transaction.h"
+#include "volumes.h"
 #include "xattr.h"
 
 /*
@@ -178,8 +180,43 @@ static bool btrfs_fscrypt_empty_dir(struct inode *inode)
 	return inode->i_size == BTRFS_EMPTY_DIR_SIZE;
 }
 
+static struct block_device **btrfs_fscrypt_get_devices(struct super_block *sb,
+						       unsigned int *num_devs)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	int nr_devices = fs_devices->open_devices;
+	struct block_device **devs;
+	struct btrfs_device *device;
+	int i = 0;
+
+	devs = kmalloc_array(nr_devices, sizeof(*devs), GFP_NOFS | GFP_NOWAIT);
+	if (!devs)
+		return ERR_PTR(-ENOMEM);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+			      &device->dev_state) ||
+		    !device->bdev ||
+		    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
+			continue;
+
+		devs[i++] = device->bdev;
+
+		if (i >= nr_devices)
+			break;
+
+	}
+	rcu_read_unlock();
+
+	*num_devs = i;
+	return devs;
+}
+
 const struct fscrypt_operations btrfs_fscrypt_ops = {
 	.get_context = btrfs_fscrypt_get_context,
 	.set_context = btrfs_fscrypt_set_context,
 	.empty_dir = btrfs_fscrypt_empty_dir,
+	.get_devices = btrfs_fscrypt_get_devices,
 };
-- 
2.41.0
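Reader's note on the device filter above: a userspace sketch, under
stated assumptions, of the same selection rule (in the filesystem
metadata, has a block device, not a replace target, capped at the number
of open devices). model_device and model_get_devices are invented names;
a plain array and calloc() stand in for the kernel's RCU list and
kmalloc_array().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the btrfs_device state the filter checks. */
struct model_device {
	bool in_fs_metadata;
	bool replace_target;
	const char *bdev;	/* NULL models a missing block device */
};

/*
 * Collect the usable devices into a freshly allocated array of at most
 * open_devices entries. Returns NULL on allocation failure or when
 * open_devices is 0; the caller frees the result.
 */
static const char **model_get_devices(const struct model_device *all,
				      size_t n_all, size_t open_devices,
				      size_t *num_devs)
{
	const char **devs;
	size_t i = 0;

	*num_devs = 0;
	if (open_devices == 0)
		return NULL;

	devs = calloc(open_devices, sizeof(*devs));
	if (!devs)
		return NULL;

	/* Stop early once the array is full, like the break in the patch. */
	for (size_t k = 0; k < n_all && i < open_devices; k++) {
		if (!all[k].in_fs_metadata || !all[k].bdev ||
		    all[k].replace_target)
			continue;
		devs[i++] = all[k].bdev;
	}

	*num_devs = i;
	return devs;
}
```

The count written to num_devs can be smaller than the allocation, so a
caller in this model should rely on it rather than on open_devices.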
[PATCH v2 25/36] btrfs: keep track of fscrypt info and orig_start for dio reads
We keep track of this information in the ordered extent for writes, but
we need it for reads as well. Add fscrypt_extent_info and orig_start to
the dio_data so we can populate this on reads. This will be used later
when we attach the fscrypt context to the bios.

Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7d859e327485..d20ccfc5038f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -83,6 +83,8 @@ struct btrfs_dio_data {
 	ssize_t submitted;
 	struct extent_changeset *data_reserved;
 	struct btrfs_ordered_extent *ordered;
+	struct fscrypt_extent_info *fscrypt_info;
+	u64 orig_start;
 	bool data_space_reserved;
 	bool nocow_done;
 };
@@ -7727,6 +7729,10 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
 					     release_len);
 		}
 	} else {
+		dio_data->fscrypt_info =
+			fscrypt_get_extent_info(em->fscrypt_info);
+		dio_data->orig_start = em->orig_start;
+
 		/*
 		 * We need to unlock only the end area that we aren't using.
 		 * The rest is going to be unlocked by the endio routine.
@@ -7808,6 +7814,11 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 		dio_data->ordered = NULL;
 	}
 
+	if (dio_data->fscrypt_info) {
+		fscrypt_put_extent_info(dio_data->fscrypt_info);
+		dio_data->fscrypt_info = NULL;
+	}
+
 	if (write)
 		extent_changeset_free(dio_data->data_reserved);
 	return ret;
-- 
2.41.0
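Reader's note on the get/put pairing above: a userspace sketch of the
reference handling, assuming the get and put helpers tolerate a NULL
pointer (the patch calls them on em->fscrypt_info without a check, but
that behaviour is not shown in this excerpt). All model_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for fscrypt_extent_info. */
struct model_info {
	int refs;
};

struct model_dio_data {
	struct model_info *fscrypt_info;
};

/* Take a reference; NULL passes through untouched (an assumption). */
static struct model_info *model_get_info(struct model_info *info)
{
	if (info)
		info->refs++;
	return info;
}

/* Drop a reference; NULL is a no-op (an assumption). */
static void model_put_info(struct model_info *info)
{
	if (!info)
		return;
	assert(info->refs > 0);
	info->refs--;
}

/* Read path: pin the extent's info for the lifetime of the dio. */
static void model_iomap_begin(struct model_dio_data *dio,
			      struct model_info *em_info)
{
	dio->fscrypt_info = model_get_info(em_info);
}

/* Drop the pin and clear the pointer so a second call is harmless. */
static void model_iomap_end(struct model_dio_data *dio)
{
	if (dio->fscrypt_info) {
		model_put_info(dio->fscrypt_info);
		dio->fscrypt_info = NULL;
	}
}
```

Clearing the pointer after the put is what keeps a repeated end call
from dropping a reference it never took.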
[PATCH v2 19/36] btrfs: turn on inlinecrypt mount option for encrypt
From: Sweet Tea Dorminy

fscrypt's extent encryption requires the use of inline encryption or the
software fallback that the block layer provides; it is rather complicated
to allow software encryption with extent encryption due to the timing of
memory allocations. Thus, if btrfs has ever had an encrypted file, or
when encryption is enabled on a directory, update the mount flags to
include inlinecrypt.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c | 3 +++
 fs/btrfs/super.c | 10 ++
 2 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c56986031870..69ab0d7e393f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4587,6 +4587,9 @@ long btrfs_ioctl(struct file *file, unsigned int
 		 * state persists.
 		 */
 		btrfs_set_fs_incompat(fs_info, ENCRYPT);
+		if (!(inode->i_sb->s_flags & SB_INLINECRYPT)) {
+			inode->i_sb->s_flags |= SB_INLINECRYPT;
+		}
 		return fscrypt_ioctl_set_policy(file, (const void __user *)arg);
 	}
 	case FS_IOC_GET_ENCRYPTION_POLICY:
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index bacf5c4f2a5c..224760cc72b6 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1124,6 +1124,16 @@ static int btrfs_fill_super(struct super_block *sb,
 		return err;
 	}
 
+	if (btrfs_fs_incompat(fs_info, ENCRYPT)) {
+		if (IS_ENABLED(CONFIG_FS_ENCRYPTION_INLINE_CRYPT)) {
+			sb->s_flags |= SB_INLINECRYPT;
+		} else {
+			btrfs_err(fs_info, "encryption not supported");
+			err = -EINVAL;
+			goto fail_close;
+		}
+	}
+
 	inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, fs_info->fs_root);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
-- 
2.41.0
[PATCH v2 15/36] btrfs: handle nokey names.
From: Sweet Tea Dorminy For encrypted or unencrypted names, we calculate the offset for the dir item by hashing the name for the dir item. However, this doesn't work for a long nokey name, where we do not have the complete ciphertext. Instead, fscrypt stores the filesystem-provided hash in the nokey name, and we can extract it from the fscrypt_name structure in such a case. Additionally, for nokey names, if we find the nokey name on disk we can update the fscrypt_name with the disk name, so add that to searching for diritems. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/dir-item.c | 37 +++-- fs/btrfs/fscrypt.c | 27 +++ fs/btrfs/fscrypt.h | 11 +++ 3 files changed, 73 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c index a64cfddff7f0..897fb5477369 100644 --- a/fs/btrfs/dir-item.c +++ b/fs/btrfs/dir-item.c @@ -231,6 +231,28 @@ struct btrfs_dir_item *btrfs_lookup_dir_item(struct btrfs_trans_handle *trans, return di; } +/* + * If appropriate, populate the disk name for a fscrypt_name looked up without + * a key. + * + * @path: The path to the extent buffer in which the name was found. + * @di:The dir item corresponding. + * @fname: The fscrypt_name to perhaps populate. + * + * Returns: 0 if the name is already populated or the dir item doesn't exist + * or the name was successfully populated, else an error code. + */ +static int ensure_disk_name_from_dir_item(struct btrfs_path *path, + struct btrfs_dir_item *di, + struct fscrypt_name *name) +{ + if (name->disk_name.name || !di) + return 0; + + return btrfs_fscrypt_get_disk_name(path->nodes[0], di, + &name->disk_name); +} + /* * Lookup for a directory item by fscrypt_name. 
* @@ -257,8 +279,12 @@ struct btrfs_dir_item *btrfs_lookup_dir_item_fname(struct btrfs_trans_handle *tr key.objectid = dir; key.type = BTRFS_DIR_ITEM_KEY; - key.offset = btrfs_name_hash(name->disk_name.name, name->disk_name.len); - /* XXX get the right hash for no-key names */ + + if (!name->disk_name.name) + key.offset = name->hash | ((u64)name->minor_hash << 32); + else + key.offset = btrfs_name_hash(name->disk_name.name, +name->disk_name.len); ret = btrfs_search_slot(trans, root, &key, path, mod, -mod); if (ret == 0) @@ -266,6 +292,8 @@ struct btrfs_dir_item *btrfs_lookup_dir_item_fname(struct btrfs_trans_handle *tr if (ret == -ENOENT || (di && IS_ERR(di) && PTR_ERR(di) == -ENOENT)) return NULL; + if (ret == 0) + ret = ensure_disk_name_from_dir_item(path, di, name); if (ret < 0) di = ERR_PTR(ret); @@ -382,7 +410,12 @@ btrfs_search_dir_index_item(struct btrfs_root *root, struct btrfs_path *path, btrfs_for_each_slot(root, &key, &key, path, ret) { if (key.objectid != dirid || key.type != BTRFS_DIR_INDEX_KEY) break; + di = btrfs_match_dir_item_fname(root->fs_info, path, name); + if (di) + ret = ensure_disk_name_from_dir_item(path, di, name); + if (ret) + break; if (di) return di; } diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c index 6a4e4f63a660..9103da28af7e 100644 --- a/fs/btrfs/fscrypt.c +++ b/fs/btrfs/fscrypt.c @@ -14,6 +14,33 @@ #include "transaction.h" #include "xattr.h" +/* + * From a given location in a leaf, read a name into a qstr (usually a + * fscrypt_name's disk_name), allocating the required buffer. Used for + * nokey names. + */ +int btrfs_fscrypt_get_disk_name(struct extent_buffer *leaf, + struct btrfs_dir_item *dir_item, + struct fscrypt_str *name) +{ + unsigned long de_name_len = btrfs_dir_name_len(leaf, dir_item); + unsigned long de_name = (unsigned long)(dir_item + 1); + /* +* For no-key names, we use this opportunity to find the disk +* name, so future searches don't need to deal with nokey names +* and we know what the encrypted size is. 
+*/ + name->name = kmalloc(de_name_len, GFP_NOFS); + + if (!name->name) + return -ENOMEM; + + read_extent_buffer(leaf, name->name, de_name, de_name_len); + + name->len = de_name_len; + return 0; +} + /* * This function is extremely similar to fscrypt_match_name() but uses an * extent_buffer. diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/f
[PATCH v2 11/36] btrfs: start using fscrypt hooks
From: Omar Sandoval In order to appropriately encrypt, create, open, rename, and various symlink operations must call fscrypt hooks. These determine whether the inode should be encrypted and do other preparatory actions. The superblock must have fscrypt operations registered, so implement the minimal set also, and introduce the new fscrypt.[ch] files to hold the fscrypt-specific functionality. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/Makefile | 1 + fs/btrfs/btrfs_inode.h | 1 + fs/btrfs/file.c| 3 ++ fs/btrfs/fscrypt.c | 7 +++ fs/btrfs/fscrypt.h | 10 fs/btrfs/inode.c | 110 ++--- fs/btrfs/super.c | 2 + 7 files changed, 116 insertions(+), 18 deletions(-) create mode 100644 fs/btrfs/fscrypt.c create mode 100644 fs/btrfs/fscrypt.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 525af975f61c..6e51d054c17a 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -39,6 +39,7 @@ btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o btrfs-$(CONFIG_BLK_DEV_ZONED) += zoned.o btrfs-$(CONFIG_FS_VERITY) += verity.o +btrfs-$(CONFIG_FS_ENCRYPTION) += fscrypt.o btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \ tests/extent-buffer-tests.o tests/btrfs-tests.o \ diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index bebb5921b922..052072373078 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -455,6 +455,7 @@ struct btrfs_new_inode_args { struct posix_acl *default_acl; struct posix_acl *acl; struct fscrypt_name fname; + bool encrypt; }; int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 92419cb8508a..26905b77c7e8 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3709,6 +3709,9 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC | FMODE_CAN_ODIRECT; + ret = 
fscrypt_file_open(inode, filp); + if (ret) + return ret; ret = fsverity_file_open(inode, filp); if (ret) diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c new file mode 100644 index ..48ab99dfe48d --- /dev/null +++ b/fs/btrfs/fscrypt.c @@ -0,0 +1,7 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "ctree.h" +#include "fscrypt.h" + +const struct fscrypt_operations btrfs_fscrypt_ops = { +}; diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/fscrypt.h new file mode 100644 index ..7f4e6888bd43 --- /dev/null +++ b/fs/btrfs/fscrypt.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef BTRFS_FSCRYPT_H +#define BTRFS_FSCRYPT_H + +#include + +extern const struct fscrypt_operations btrfs_fscrypt_ops; + +#endif /* BTRFS_FSCRYPT_H */ diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4806ff34224a..b92da4a4ed21 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5053,6 +5053,10 @@ static int btrfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, if (err) return err; + err = fscrypt_prepare_setattr(dentry, attr); + if (err) + return err; + if (S_ISREG(inode->i_mode) && (attr->ia_valid & ATTR_SIZE)) { err = btrfs_setsize(inode, attr); if (err) @@ -5207,11 +5211,8 @@ void btrfs_evict_inode(struct inode *inode) trace_btrfs_inode_evict(inode); - if (!root) { - fsverity_cleanup_inode(inode); - clear_inode(inode); - return; - } + if (!root) + goto cleanup; evict_inode_truncate_pages(inode); @@ -5311,6 +5312,9 @@ void btrfs_evict_inode(struct inode *inode) * to retry these periodically in the future. 
*/ btrfs_remove_delayed_node(BTRFS_I(inode)); + +cleanup: + fscrypt_put_encryption_info(inode); fsverity_cleanup_inode(inode); clear_inode(inode); } @@ -6096,6 +6100,12 @@ int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args, return ret; } + ret = fscrypt_prepare_new_inode(dir, inode, &args->encrypt); + if (ret) { + fscrypt_free_filename(&args->fname); + return ret; + } + /* 1 to add inode item */ *trans_num_items = 1; /* 1 to add compression property */ @@ -6569,9 +6579,13 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, if (inode->i_nlink >= BTRFS_LINK_MAX) return -EMLINK; + err = fscrypt_prepare_link(old_dentry, dir, dentry); + if (err) + return err; + err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &fname)
[PATCH v2 13/36] btrfs: add new FEATURE_INCOMPAT_ENCRYPT flag
From: Omar Sandoval As encrypted files will be incompatible with older filesystem versions, new filesystems should be created with an incompat flag for fscrypt, which will gate access to the encryption ioctls. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/fs.h | 3 ++- fs/btrfs/super.c | 5 + fs/btrfs/sysfs.c | 6 ++ include/uapi/linux/btrfs.h | 1 + 4 files changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 318df6f9d9cb..4a3b1bb61849 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -231,7 +231,8 @@ enum { #define BTRFS_FEATURE_INCOMPAT_SUPP\ (BTRFS_FEATURE_INCOMPAT_SUPP_STABLE | \ BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE | \ -BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) +BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 | \ +BTRFS_FEATURE_INCOMPAT_ENCRYPT) #else diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 0d88f871ba09..bacf5c4f2a5c 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2383,6 +2383,11 @@ static int __init btrfs_print_mod_info(void) ", fsverity=yes" #else ", fsverity=no" +#endif +#ifdef CONFIG_FS_ENCRYPTION + ", fscrypt=yes" +#else + ", fscrypt=no" #endif ; pr_info("Btrfs loaded%s\n", options); diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index e6b51fb3ddc1..4ece703d9d5f 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -304,6 +304,9 @@ BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE); #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_COMPAT_RO(verity, VERITY); #endif +#ifdef CONFIG_FS_ENCRYPTION +BTRFS_FEAT_ATTR_INCOMPAT(encryption, ENCRYPT); +#endif /* CONFIG_FS_ENCRYPTION */ /* * Features which depend on feature bits and may differ between each fs. 
@@ -336,6 +339,9 @@ static struct attribute *btrfs_supported_feature_attrs[] = { #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_PTR(verity), #endif +#ifdef CONFIG_FS_ENCRYPTION + BTRFS_FEAT_ATTR_PTR(encryption), +#endif /* CONFIG_FS_ENCRYPTION */ NULL }; diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 7c29d82db9ee..6a0f4c0e4096 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -334,6 +334,7 @@ struct btrfs_ioctl_fs_info_args { #define BTRFS_FEATURE_INCOMPAT_ZONED (1ULL << 12) #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 (1ULL << 13) #define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE(1ULL << 14) +#define BTRFS_FEATURE_INCOMPAT_ENCRYPT (1ULL << 15) #define BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA(1ULL << 16) struct btrfs_ioctl_feature_flags { -- 2.41.0
[PATCH v2 12/36] btrfs: add inode encryption contexts
From: Omar Sandoval In order to store encryption information for directories, symlinks, etc., fscrypt stores a context item with each encrypted non-regular inode. fscrypt provides an arbitrary blob for the filesystem to store, and it does not clearly fit into an existing structure, so this goes in a new item type. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/fscrypt.c | 117 fs/btrfs/fscrypt.h | 2 + fs/btrfs/inode.c| 19 ++ fs/btrfs/ioctl.c| 8 ++- include/uapi/linux/btrfs_tree.h | 10 +++ 5 files changed, 154 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c index 48ab99dfe48d..0e4011d6b1cd 100644 --- a/fs/btrfs/fscrypt.c +++ b/fs/btrfs/fscrypt.c @@ -1,7 +1,124 @@ // SPDX-License-Identifier: GPL-2.0 +#include #include "ctree.h" +#include "accessors.h" +#include "btrfs_inode.h" +#include "disk-io.h" +#include "fs.h" #include "fscrypt.h" +#include "ioctl.h" +#include "messages.h" +#include "transaction.h" +#include "xattr.h" + +static int btrfs_fscrypt_get_context(struct inode *inode, void *ctx, size_t len) +{ + struct btrfs_key key = { + .objectid = btrfs_ino(BTRFS_I(inode)), + .type = BTRFS_FSCRYPT_CTX_ITEM_KEY, + .offset = 0, + }; + struct btrfs_path *path; + struct extent_buffer *leaf; + unsigned long ptr; + int ret; + + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0); + if (ret) { + len = -ENOENT; + goto out; + } + + leaf = path->nodes[0]; + ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); + /* fscrypt provides max context length, but it could be less */ + len = min_t(size_t, len, btrfs_item_size(leaf, path->slots[0])); + read_extent_buffer(leaf, ctx, ptr, len); + +out: + btrfs_free_path(path); + return len; +} + +static int btrfs_fscrypt_set_context(struct inode *inode, const void *ctx, +size_t len, void *fs_data) +{ + struct btrfs_trans_handle *trans = fs_data; + struct btrfs_key key = { 
+ .objectid = btrfs_ino(BTRFS_I(inode)), + .type = BTRFS_FSCRYPT_CTX_ITEM_KEY, + .offset = 0, + }; + struct btrfs_path *path = NULL; + struct extent_buffer *leaf; + unsigned long ptr; + int ret; + + if (!trans) + trans = btrfs_start_transaction(BTRFS_I(inode)->root, 2); + if (IS_ERR(trans)) + return PTR_ERR(trans); + + path = btrfs_alloc_path(); + if (!path) { + ret = -ENOMEM; + goto out_err; + } + + ret = btrfs_search_slot(trans, BTRFS_I(inode)->root, &key, path, 0, 1); + if (ret < 0) + goto out_err; + + if (ret > 0) { + btrfs_release_path(path); + ret = btrfs_insert_empty_item(trans, BTRFS_I(inode)->root, path, &key, len); + if (ret) + goto out_err; + } + + leaf = path->nodes[0]; + ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); + + len = min_t(size_t, len, btrfs_item_size(leaf, path->slots[0])); + write_extent_buffer(leaf, ctx, ptr, len); + btrfs_mark_buffer_dirty(trans, leaf); + btrfs_release_path(path); + + if (fs_data) + return ret; + + BTRFS_I(inode)->flags |= BTRFS_INODE_ENCRYPT; + btrfs_sync_inode_flags_to_i_flags(inode); + inode_inc_iversion(inode); + inode_set_ctime_current(inode); + ret = btrfs_update_inode(trans, BTRFS_I(inode)); + if (ret) + goto out_abort; + btrfs_free_path(path); + btrfs_end_transaction(trans); + return 0; +out_abort: + btrfs_abort_transaction(trans, ret); +out_err: + if (!fs_data) + btrfs_end_transaction(trans); + btrfs_free_path(path); + return ret; +} + +static bool btrfs_fscrypt_empty_dir(struct inode *inode) +{ + return inode->i_size == BTRFS_EMPTY_DIR_SIZE; +} const struct fscrypt_operations btrfs_fscrypt_ops = { + .get_context = btrfs_fscrypt_get_context, + .set_context = btrfs_fscrypt_set_context, + .empty_dir = btrfs_fscrypt_empty_dir, }; diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/fscrypt.h index 7f4e6888bd43..80adb7e56826 100644 --- a/fs/btrfs/fscrypt.h +++ b/fs/btrfs/fscrypt.h @@ -5,6 +5,8 @@ #include +#include "fs.h" + extern const struct fscrypt_operations btrfs_fscrypt_ops; #endif /* BTRFS_FSCRYPT_H */ diff --git 
a/fs/btrfs/in
[PATCH v2 07/36] fscrypt: add documentation about extent encryption
Add a couple of sections to the fscrypt documentation about per-extent
encryption.

Signed-off-by: Josef Bacik
---
 Documentation/filesystems/fscrypt.rst | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 28700fb41a00..6235b1caec2d 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -256,6 +256,21 @@ alternative master keys or to support rotating master keys. Instead,
 the master keys may be wrapped in userspace, e.g. as is done by the
 `fscrypt <https://github.com/google/fscrypt>`_ tool.
 
+Per-extent encryption keys
+--------------------------
+
+For certain file systems, such as btrfs, it's desired to derive a
+per-extent encryption key. This is to enable features such as snapshots
+and reflink, where you could have different inodes pointing at the same
+extent. When a new extent is created fscrypt randomly generates a
+16-byte nonce and the file system stores it alongside the extent.
+Then, it uses a KDF (as described in `Key derivation function`_) to
+derive the extent's key from the master key and nonce.
+
+Currently the inode's master key and encryption policy must match the
+extent, so you cannot share extents between inodes that were encrypted
+differently.
+
 DIRECT_KEY policies
 -------------------
 
@@ -1394,6 +1409,27 @@ by the kernel and is used as KDF input or as a tweak to cause
 different files to be encrypted differently; see `Per-file encryption
 keys`_ and `DIRECT_KEY policies`_.
 
+Extent encryption context
+-------------------------
+
+The extent encryption context mirrors the important parts of the above
+`Encryption context`_, with a few omissions. The struct is defined as
+follows::
+
+    struct fscrypt_extent_context {
+            u8 version;
+            u8 encryption_mode;
+            u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
+            u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
+    };
+
+Currently all fields must match the containing inode's encryption
+context, with the exception of the nonce.
+
+Additionally extent encryption is only supported with
+FSCRYPT_EXTENT_CONTEXT_V2 using the standard policy; all other policies
+are disallowed.
+
 Data path changes
 -----------------
 
-- 
2.41.0
+ +Additionally extent encryption is only supported with +FSCRYPT_EXTENT_CONTEXT_V2 using the standard policy, all other policies +are disallowed. + Data path changes - -- 2.41.0
[PATCH v2 06/36] fscrypt: expose fscrypt_nokey_name
From: Omar Sandoval btrfs stores its data structures, including filenames in directories, in its own buffer implementation, struct extent_buffer, composed of several non-contiguous pages. We could copy filenames into a temporary buffer and use fscrypt_match_name() against that buffer, but such extensive memcpying would be expensive. Instead, exposing fscrypt_nokey_name as in this change allows btrfs to recapitulate fscrypt_match_name() using methods on struct extent_buffer instead of dealing with a raw byte array. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/crypto/fname.c | 39 +-- include/linux/fscrypt.h | 37 + 2 files changed, 38 insertions(+), 38 deletions(-) diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c index 7b3fc189593a..5607ee52703e 100644 --- a/fs/crypto/fname.c +++ b/fs/crypto/fname.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include "fscrypt_private.h" @@ -26,43 +25,7 @@ #define FSCRYPT_FNAME_MIN_MSG_LEN 16 /* - * struct fscrypt_nokey_name - identifier for directory entry when key is absent - * - * When userspace lists an encrypted directory without access to the key, the - * filesystem must present a unique "no-key name" for each filename that allows - * it to find the directory entry again if requested. Naively, that would just - * mean using the ciphertext filenames. However, since the ciphertext filenames - * can contain illegal characters ('\0' and '/'), they must be encoded in some - * way. We use base64url. But that can cause names to exceed NAME_MAX (255 - * bytes), so we also need to use a strong hash to abbreviate long names. - * - * The filesystem may also need another kind of hash, the "dirhash", to quickly - * find the directory entry. Since filesystems normally compute the dirhash - * over the on-disk filename (i.e. the ciphertext), it's not computable from - * no-key names that abbreviate the ciphertext using the strong hash to fit in - * NAME_MAX.
It's also not computable if it's a keyed hash taken over the - * plaintext (but it may still be available in the on-disk directory entry); - * casefolded directories use this type of dirhash. At least in these cases, - * each no-key name must include the name's dirhash too. - * - * To meet all these requirements, we base64url-encode the following - * variable-length structure. It contains the dirhash, or 0's if the filesystem - * didn't provide one; up to 149 bytes of the ciphertext name; and for - * ciphertexts longer than 149 bytes, also the SHA-256 of the remaining bytes. - * - * This ensures that each no-key name contains everything needed to find the - * directory entry again, contains only legal characters, doesn't exceed - * NAME_MAX, is unambiguous unless there's a SHA-256 collision, and that we only - * take the performance hit of SHA-256 on very long filenames (which are rare). - */ -struct fscrypt_nokey_name { - u32 dirhash[2]; - u8 bytes[149]; - u8 sha256[SHA256_DIGEST_SIZE]; -}; /* 189 bytes => 252 bytes base64url-encoded, which is <= NAME_MAX (255) */ - -/* - * Decoded size of max-size no-key name, i.e. a name that was abbreviated using + * Decoded size of max-size nokey name, i.e. a name that was abbreviated using * the strong hash and thus includes the 'sha256' field. This isn't simply * sizeof(struct fscrypt_nokey_name), as the padding at the end isn't included. 
*/ diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index a3576da6a9fa..9a7cd1e2146e 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -17,6 +17,7 @@ #include #include #include +#include #include /* @@ -56,6 +57,42 @@ struct fscrypt_name { #define fname_name(p) ((p)->disk_name.name) #define fname_len(p) ((p)->disk_name.len) +/* + * struct fscrypt_nokey_name - identifier for directory entry when key is absent + * + * When userspace lists an encrypted directory without access to the key, the + * filesystem must present a unique "no-key name" for each filename that allows + * it to find the directory entry again if requested. Naively, that would just + * mean using the ciphertext filenames. However, since the ciphertext filenames + * can contain illegal characters ('\0' and '/'), they must be encoded in some + * way. We use base64url. But that can cause names to exceed NAME_MAX (255 + * bytes), so we also need to use a strong hash to abbreviate long names. + * + * The filesystem may also need another kind of hash, the "dirhash", to quickly + * find the directory entry. Since filesystems normally compute the dirhash + * over the on-disk filename (i.e. the ciphertext), it's not computable from + * no-key names that a
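The size arithmetic in the struct comment ("189 bytes => 252 bytes base64url-encoded") and the remark that the decoded maximum is not simply sizeof() can be checked with a short standalone sketch. The struct layout below is copied from the patch, but with userspace types and a demo_ prefix; the macro and function names are hypothetical, and the encoded length assumes unpadded base64url, i.e. ceil(4n/3) characters for n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SHA256_DIGEST_SIZE	32
#define DEMO_NAME_MAX		255

/* Same field layout as struct fscrypt_nokey_name, with userspace types. */
struct demo_nokey_name {
	uint32_t dirhash[2];
	uint8_t bytes[149];
	uint8_t sha256[DEMO_SHA256_DIGEST_SIZE];
};

/* Decoded size of a maximum-length no-key name, excluding tail padding. */
#define DEMO_NOKEY_NAME_MAX \
	(offsetof(struct demo_nokey_name, sha256) + DEMO_SHA256_DIGEST_SIZE)

/* Characters needed to base64url-encode nbytes without '=' padding. */
static size_t demo_base64url_chars(size_t nbytes)
{
	return (nbytes * 4 + 2) / 3;
}

/* Encoded length of the longest possible no-key name. */
static size_t demo_max_encoded_len(void)
{
	size_t n = demo_base64url_chars(DEMO_NOKEY_NAME_MAX);

	assert(n <= DEMO_NAME_MAX);	/* must still be a legal filename */
	return n;
}
```

The 8 + 149 + 32 = 189 bytes encode to exactly 252 characters, while sizeof() of the struct is typically rounded up for the u32 alignment, which is why the comment warns against using it.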
[PATCH v2 05/36] blk-crypto: add a process bio callback
Btrfs does checksumming, and the checksums need to match the bytes on disk. In order to facilitate this add a process bio callback for the blk-crypto layer. This allows the file system to specify a callback and then can process the encrypted bio as necessary. For btrfs, writes will have the checksums calculated and saved into our relevant data structures for storage once the write completes. For reads we will validate the checksums match what is on disk and error out if there is a mismatch. This is incompatible with native encryption obviously, so make sure we don't use native encryption if this callback is set. Signed-off-by: Josef Bacik --- block/blk-crypto-fallback.c| 28 block/blk-crypto-profile.c | 2 ++ block/blk-crypto.c | 6 +- fs/crypto/inline_crypt.c | 3 ++- include/linux/blk-crypto-profile.h | 7 +++ include/linux/blk-crypto.h | 9 +++-- include/linux/fscrypt.h| 14 ++ 7 files changed, 65 insertions(+), 4 deletions(-) diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c index e6468eab2681..8b4a83534127 100644 --- a/block/blk-crypto-fallback.c +++ b/block/blk-crypto-fallback.c @@ -346,6 +346,15 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr) } } + /* Process the encrypted bio before we submit it. */ + if (bc->bc_key->crypto_cfg.process_bio) { + blk_st = bc->bc_key->crypto_cfg.process_bio(src_bio, enc_bio); + if (blk_st != BLK_STS_OK) { + src_bio->bi_status = blk_st; + goto out_free_bounce_pages; + } + } + enc_bio->bi_private = src_bio; enc_bio->bi_end_io = blk_crypto_fallback_encrypt_endio; *bio_ptr = enc_bio; @@ -391,6 +400,24 @@ static void blk_crypto_fallback_decrypt_bio(struct work_struct *work) unsigned int i; blk_status_t blk_st; + /* +* Process the bio first before trying to decrypt. +* +* NOTE: btrfs expects that this bio is the same that was submitted. If +* at any point this changes we will need to update process_bio to take +* f_ctx->crypt_iter in order to make sure we can iterate the pages for +* checksumming. 
We're currently saving this in our btrfs_bio, so this +* works, but if at any point in the future we start allocating a bounce +* bio or something we need to update this callback. +*/ + if (bc->bc_key->crypto_cfg.process_bio) { + blk_st = bc->bc_key->crypto_cfg.process_bio(bio, bio); + if (blk_st != BLK_STS_OK) { + bio->bi_status = blk_st; + goto out_no_keyslot; + } + } + /* * Get a blk-crypto-fallback keyslot that contains a crypto_skcipher for * this bio's algorithm and key. @@ -560,6 +587,7 @@ static int blk_crypto_fallback_init(void) blk_crypto_fallback_profile->ll_ops = blk_crypto_fallback_ll_ops; blk_crypto_fallback_profile->max_dun_bytes_supported = BLK_CRYPTO_MAX_IV_SIZE; + blk_crypto_fallback_profile->process_bio_supported = true; /* All blk-crypto modes have a crypto API fallback. */ for (i = 0; i < BLK_ENCRYPTION_MODE_MAX; i++) diff --git a/block/blk-crypto-profile.c b/block/blk-crypto-profile.c index 7fabc883e39f..640cf2ea3fcc 100644 --- a/block/blk-crypto-profile.c +++ b/block/blk-crypto-profile.c @@ -352,6 +352,8 @@ bool __blk_crypto_cfg_supported(struct blk_crypto_profile *profile, return false; if (profile->max_dun_bytes_supported < cfg->dun_bytes) return false; + if (cfg->process_bio && !profile->process_bio_supported) + return false; return true; } diff --git a/block/blk-crypto.c b/block/blk-crypto.c index 4d760b092deb..50556952df19 100644 --- a/block/blk-crypto.c +++ b/block/blk-crypto.c @@ -321,6 +321,8 @@ int __blk_crypto_rq_bio_prep(struct request *rq, struct bio *bio, * @dun_bytes: number of bytes that will be used to specify the DUN when this *key is used * @data_unit_size: the data unit size to use for en/decryption + * @process_bio: the call back if the upper layer needs to process the encrypted + * bio * * Return: 0 on success, -errno on failure. The caller is responsible for *zeroizing both blk_key and raw_key when done with them. 
@@ -328,7 +330,8 @@ int __blk_crypto_rq_bio_prep(struct request *rq, struct bio *bio, int blk_crypto_init_key(struct blk_crypto_key *blk_key, const u8 *raw_key, enum blk_crypto_mode_num crypto_mode, unsigned int dun_bytes, - unsigned int data_unit_size) +
[PATCH v2 08/36] btrfs: add infrastructure for safe em freeing
When we add fscrypt support we're going to have fscrypt objects hanging off of extent_maps. This includes a block key, which if we're the last one freeing the key we may have to unregister it from the block layer. This requires taking a semaphore in the block layer, which means we can't free em's under the extent map tree lock. Thankfully we only do this in two places, one where we're dropping a range of extent maps, and when we're freeing logged extents. Add a free_extent_map_safe() which will add the em to a list in the em_tree if we free'd the object. Currently this is unconditional but will be changed to conditional on the fscrypt object we will add in a later patch. To process these delayed objects add a free_pending_extent_maps() that is called after the lock has been dropped on the em_tree. This will process the extent maps on the freed list and do the appropriate freeing work in a safe manner. Signed-off-by: Josef Bacik --- fs/btrfs/extent_map.c | 80 --- fs/btrfs/extent_map.h | 10 ++ fs/btrfs/tree-log.c | 6 ++-- 3 files changed, 89 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index a6d8368ed0ed..af5ff6b10865 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -35,7 +35,9 @@ void __cold extent_map_exit(void) void extent_map_tree_init(struct extent_map_tree *tree) { tree->map = RB_ROOT_CACHED; + tree->flags = 0; INIT_LIST_HEAD(&tree->modified_extents); + INIT_LIST_HEAD(&tree->freed_extents); rwlock_init(&tree->lock); } @@ -53,9 +55,17 @@ struct extent_map *alloc_extent_map(void) em->compress_type = BTRFS_COMPRESS_NONE; refcount_set(&em->refs, 1); INIT_LIST_HEAD(&em->list); + INIT_LIST_HEAD(&em->free_list); return em; } +static void __free_extent_map(struct extent_map *em) +{ + if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) + kfree(em->map_lookup); + kmem_cache_free(extent_map_cache, em); +} + /* * Drop the reference out on @em by one and free the structure if the reference * count hits zero. 
@@ -67,12 +77,69 @@ void free_extent_map(struct extent_map *em) if (refcount_dec_and_test(&em->refs)) { WARN_ON(extent_map_in_tree(em)); WARN_ON(!list_empty(&em->list)); - if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) - kfree(em->map_lookup); - kmem_cache_free(extent_map_cache, em); + __free_extent_map(em); } } +/* + * Drop a ref for the extent map in the given tree. + * + * @tree: tree that the em is a part of. + * @em:the em to drop the reference to. + * + * Drop the reference count on @em by one, if the reference count hits 0 and + * there is an object on the em that can't be safely freed in the current + * context (if we are holding the extent_map_tree->lock for example), then add + * it to the freed_extents list on the extent_map_tree for later processing. + * + * This must be followed by a free_pending_extent_maps() to clear the pending + * frees. + */ +void free_extent_map_safe(struct extent_map_tree *tree, + struct extent_map *em) +{ + lockdep_assert_held_write(&tree->lock); + + if (!em) + return; + + if (refcount_dec_and_test(&em->refs)) { + WARN_ON(extent_map_in_tree(em)); + WARN_ON(!list_empty(&em->list)); + list_add_tail(&em->free_list, &tree->freed_extents); + set_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags); + } +} + +/* + * Free the em objects that exist on the em tree + * + * @tree: the tree to free the objects from. + * + * If there are any objects on the em->freed_extents list go ahead and free them + * here in a safe way. This is to be coupled with any uses of + * free_extent_map_safe(). + */ +void free_pending_extent_maps(struct extent_map_tree *tree) +{ + struct extent_map *em; + + /* Avoid taking the write lock if we don't have any pending frees. 
*/ + if (!test_and_clear_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags)) + return; + + write_lock(&tree->lock); + while ((em = list_first_entry_or_null(&tree->freed_extents, + struct extent_map, free_list))) { + list_del_init(&em->free_list); + write_unlock(&tree->lock); + __free_extent_map(em); + cond_resched(); + write_lock(&tree->lock); + } + write_unlock(&tree->lock); +} + /* Do the math around the end of an extent, handling wrapping. */ static u64 range_
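The deferred-free pattern described in this patch can be modelled in userspace roughly like this. It is a sketch under stated assumptions rather than the kernel code: the demo_ names are invented, a plain bool stands in for the em_tree rwlock so the locking rules can be asserted, and a singly linked list replaces list_head (entries are pushed at the head, so the free order differs from the patch's list_add_tail(), which does not matter here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_em {
	int refs;
	struct demo_em *next_free;
};

struct demo_tree {
	bool locked;		/* stands in for holding tree->lock */
	bool pending_frees;	/* stands in for EXTENT_MAP_TREE_PENDING_FREES */
	struct demo_em *freed;	/* objects waiting to be freed */
};

/* Drop a reference while "holding the lock"; defer the actual free. */
static void demo_put_safe(struct demo_tree *tree, struct demo_em *em)
{
	assert(tree->locked);
	if (!em)
		return;
	if (--em->refs == 0) {
		em->next_free = tree->freed;
		tree->freed = em;
		tree->pending_frees = true;
	}
}

/* Called after the lock is dropped; returns how many objects were freed. */
static int demo_free_pending(struct demo_tree *tree)
{
	int n = 0;

	assert(!tree->locked);
	if (!tree->pending_frees)
		return 0;
	tree->pending_frees = false;

	while (tree->freed) {
		struct demo_em *em = tree->freed;

		tree->freed = em->next_free;
		free(em);	/* safe here: may sleep or take other locks */
		n++;
	}
	return n;
}
```

The point of the split is that the expensive or sleeping part of the teardown only ever runs in demo_free_pending(), outside the critical section, which is the property the fscrypt block key needs.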
[PATCH v2 09/36] btrfs: disable various operations on encrypted inodes
From: Omar Sandoval

Initially, only normal data extents will be encrypted. This change
restricts various other operations:
- allow reflinking only if both inodes have the same encryption status
- disable inline data on encrypted inodes

Signed-off-by: Omar Sandoval
Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c   | 3 ++-
 fs/btrfs/reflink.c | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c9317c047587..4806ff34224a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -630,7 +630,8 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size,
 	 * compressed) data fits in a leaf and the configured maximum inline
 	 * size.
 	 */
-	if (size < i_size_read(&inode->vfs_inode) ||
+	if (IS_ENCRYPTED(&inode->vfs_inode) ||
+	    size < i_size_read(&inode->vfs_inode) ||
 	    size > fs_info->sectorsize ||
 	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info) ||
 	    data_len > fs_info->max_inline)
diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index fabd856e5079..3c66630d87ee 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include
+#include
 #include
 #include "ctree.h"
 #include "fs.h"
@@ -809,6 +810,12 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 		ASSERT(inode_in->i_sb == inode_out->i_sb);
 	}
 
+	/*
+	 * Can only reflink encrypted files if both files are encrypted.
+	 */
+	if (IS_ENCRYPTED(inode_in) != IS_ENCRYPTED(inode_out))
+		return -EINVAL;
+
 	/* Don't make the dst file partly checksummed */
 	if ((BTRFS_I(inode_in)->flags & BTRFS_INODE_NODATASUM) !=
 	    (BTRFS_I(inode_out)->flags & BTRFS_INODE_NODATASUM)) {
-- 
2.41.0
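The two checks in this patch boil down to a pair of predicates. A rough userspace sketch, assuming a made-up struct toy_inode and folding the leaf-size limit into max_inline for brevity (none of these names exist in btrfs):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bits of an inode these checks look at. */
struct toy_inode {
	bool encrypted;		/* models IS_ENCRYPTED(inode) */
	uint64_t i_size;
};

/* Reflink is refused unless both sides agree on their encryption status. */
static int toy_reflink_prep(const struct toy_inode *in,
			    const struct toy_inode *out)
{
	if (in->encrypted != out->encrypted)
		return -EINVAL;
	return 0;
}

/* Inline extents are never used for encrypted inodes. */
static bool toy_can_inline(const struct toy_inode *inode, uint64_t size,
			   uint64_t data_len, uint64_t sectorsize,
			   uint64_t max_inline)
{
	if (inode->encrypted ||
	    size < inode->i_size ||
	    size > sectorsize ||
	    data_len > max_inline)
		return false;
	return true;
}
```

Note that the reflink check is symmetric: encrypted to plain and plain to encrypted are both rejected, while encrypted to encrypted passes this particular test.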
[PATCH v2 10/36] btrfs: disable verity on encrypted inodes
From: Sweet Tea Dorminy

Right now there isn't a way to encrypt things that aren't either
filenames in directories or data on blocks on disk with extent
encryption, so for now, disable verity usage with encryption on btrfs.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/verity.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index 66e2270b0dae..92536913df04 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -588,6 +588,9 @@ static int btrfs_begin_enable_verity(struct file *filp)
 
 	ASSERT(inode_is_locked(file_inode(filp)));
 
+	if (IS_ENCRYPTED(&inode->vfs_inode))
+		return -EINVAL;
+
 	if (test_bit(BTRFS_INODE_VERITY_IN_PROGRESS, &inode->runtime_flags))
 		return -EBUSY;
 
-- 
2.41.0
[PATCH v2 03/36] fscrypt: add per-extent encryption support
This adds the code necessary for per-extent encryption. We will store a
nonce for every extent we create, and then use the inode's policy and
the extent's nonce to derive a per-extent key.

This is meant to be flexible: if we choose to expand the on-disk extent
information in the future, we have a version number we can use to change
what exists on disk.

The file system indicates it wants to use per-extent encryption by
setting s_cop->set_extent_context. This also requires the use of inline
block encryption.

The support is relatively straightforward; the only "extra" bit is that
we're deriving a per-extent key to use for the encryption. The inode
still controls the policy and access to the master key.

Signed-off-by: Josef Bacik
---
 fs/crypto/crypto.c          |  10 ++-
 fs/crypto/fscrypt_private.h |  44 ++
 fs/crypto/inline_crypt.c    |  84 +++
 fs/crypto/keysetup.c        | 155
 fs/crypto/policy.c          |  47 +++
 include/linux/fscrypt.h     |  71 +
 6 files changed, 410 insertions(+), 1 deletion(-)

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 328470d40dec..18bd96b9db4e 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -40,6 +40,7 @@ static struct workqueue_struct *fscrypt_read_workqueue;
 static DEFINE_MUTEX(fscrypt_init_mutex);
 
 struct kmem_cache *fscrypt_inode_info_cachep;
+struct kmem_cache *fscrypt_extent_info_cachep;
 
 void fscrypt_enqueue_decrypt_work(struct work_struct *work)
 {
@@ -414,12 +415,19 @@ static int __init fscrypt_init(void)
 	if (!fscrypt_inode_info_cachep)
 		goto fail_free_queue;
 
+	fscrypt_extent_info_cachep = KMEM_CACHE(fscrypt_extent_info,
+						SLAB_RECLAIM_ACCOUNT);
+	if (!fscrypt_extent_info_cachep)
+		goto fail_free_inode_info;
+
 	err = fscrypt_init_keyring();
 	if (err)
-		goto fail_free_inode_info;
+		goto fail_free_extent_info;
 
 	return 0;
 
+fail_free_extent_info:
+	kmem_cache_destroy(fscrypt_extent_info_cachep);
 fail_free_inode_info:
 	kmem_cache_destroy(fscrypt_inode_info_cachep);
 fail_free_queue:
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index f44342f17269..c672c3e537f3 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -30,6 +30,8 @@
 #define FSCRYPT_CONTEXT_V1	1
 #define FSCRYPT_CONTEXT_V2	2
 
+#define FSCRYPT_EXTENT_CONTEXT_V1	1
+
 /* Keep this in sync with include/uapi/linux/fscrypt.h */
 #define FSCRYPT_MODE_MAX	FSCRYPT_MODE_AES_256_HCTR2
 
@@ -53,6 +55,28 @@ struct fscrypt_context_v2 {
 	u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
 };
 
+/*
+ * fscrypt_extent_context - the encryption context of an extent
+ *
+ * This is the on-disk information stored for an extent.  The policy and
+ * relevant information is stored in the inode, the per-extent information is
+ * simply the nonce that's used as KDF input in conjunction with the inode
+ * context to derive a per-extent key for encryption.
+ *
+ * At this point the master_key_identifier exists only for possible future
+ * expansion.  This will allow for an inode to have extents with multiple master
+ * keys, although sharing the same encryption mode.  This would be for re-keying
+ * or for reflinking between two differently encrypted inodes.  For now the
+ * master_key_identifier must match the inode's, and we'll be using the inode's
+ * for all key derivation.
+ */
+struct fscrypt_extent_context {
+	u8 version; /* FSCRYPT_EXTENT_CONTEXT_V1 */
+	u8 encryption_mode;
+	u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
+	u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
+};
+
 /*
  * fscrypt_context - the encryption context of an inode
  *
@@ -288,6 +312,25 @@ struct fscrypt_inode_info {
 	u32 ci_hashed_ino;
 };
 
+/*
+ * fscrypt_extent_info - the "encryption key" for an extent.
+ *
+ * This contains the derived key for the given extent and the nonce for the
+ * extent.
+ */
+struct fscrypt_extent_info {
+	refcount_t refs;
+
+	/* The derived key for this extent. */
+	struct fscrypt_prepared_key prep_key;
+
+	/* The super block that this extent belongs to. */
+	struct super_block *sb;
+
+	/* This is the extent's nonce, loaded from the fscrypt_extent_context */
+	u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
+};
+
 typedef enum {
 	FS_DECRYPT = 0,
 	FS_ENCRYPT,
@@ -295,6 +338,7 @@ typedef enum {
 
 /* crypto.c */
 extern struct kmem_cache *fscrypt_inode_info_cachep;
+extern struct kmem_cache *fscrypt_extent_info_cachep;
 int fscrypt_initialize(struct super_block *sb);
 int fscrypt_crypt_data_unit(const struct fscrypt_inode_info *ci,
 			    fscrypt_direction_t rw, u64 index,
diff --git a/fs/crypto/inline_crypt.c b/fs/crypto/inline_crypt.c
in
[PATCH v2 04/36] fscrypt: disable all but standard v2 policies for extent encryption
The different encryption related options for fscrypt are too numerous to
support for extent based encryption. Support for a few of these options
could possibly be added, but since they're niche options simply reject
them for file systems using extent based encryption.

Signed-off-by: Josef Bacik
---
 fs/crypto/policy.c | 12
 1 file changed, 12 insertions(+)

diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
index 4729f21e21d8..75a69f02f11d 100644
--- a/fs/crypto/policy.c
+++ b/fs/crypto/policy.c
@@ -209,6 +209,12 @@ static bool fscrypt_supported_v1_policy(const struct fscrypt_policy_v1 *policy,
 		return false;
 	}
 
+	if (inode->i_sb->s_cop->has_per_extent_encryption) {
+		fscrypt_warn(inode,
+			     "v1 policies can't be used on file systems that use extent encryption");
+		return false;
+	}
+
 	return true;
 }
 
@@ -269,6 +275,12 @@ static bool fscrypt_supported_v2_policy(const struct fscrypt_policy_v2 *policy,
 		}
 	}
 
+	if ((inode->i_sb->s_cop->has_per_extent_encryption) && count) {
+		fscrypt_warn(inode,
+			     "Encryption flags aren't supported on file systems that use extent encryption");
+		return false;
+	}
+
 	if ((policy->flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) &&
 	    !supported_direct_key_modes(inode, policy->contents_encryption_mode,
 					policy->filenames_encryption_mode))
-- 
2.41.0
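To make the resulting rule concrete, here is a small userspace sketch of the policy gate. It rests on an assumption the quoted hunks don't show: that `count` tallies how many of the DIRECT_KEY / IV_INO_LBLK_64 / IV_INO_LBLK_32 flags are set. The toy_* names and flag values are illustrative only and are not taken from the uapi header.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; not copied from include/uapi/linux/fscrypt.h. */
#define TOY_FLAG_DIRECT_KEY		0x1u
#define TOY_FLAG_IV_INO_LBLK_64		0x2u
#define TOY_FLAG_IV_INO_LBLK_32		0x4u

/* Hypothetical, stripped-down policy: just a version and its flags. */
struct toy_policy {
	int version;		/* 1 or 2 */
	unsigned int flags;
};

static bool toy_policy_supported(const struct toy_policy *policy,
				 bool per_extent)
{
	unsigned int count = 0;

	/* v1 policies are rejected outright on per-extent file systems. */
	if (policy->version == 1)
		return !per_extent;
	if (policy->version != 2)
		return false;

	/* Assumed meaning of 'count': number of special flags set. */
	count += !!(policy->flags & TOY_FLAG_DIRECT_KEY);
	count += !!(policy->flags & TOY_FLAG_IV_INO_LBLK_64);
	count += !!(policy->flags & TOY_FLAG_IV_INO_LBLK_32);

	if (count > 1)
		return false;	/* these flags are mutually exclusive */
	if (per_extent && count)
		return false;	/* only the plain v2 policy is allowed */
	return true;
}
```

Under that reading, a file system with per-extent encryption accepts exactly one shape of policy: v2 with none of the three flags set.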
[PATCH v2 02/36] fscrypt: don't wipe mk secret until the last active user is gone
Previously we were wiping the master key secret when we do
FS_IOC_REMOVE_ENCRYPTION_KEY, and then using the fact that it was
cleared as the mechanism to keep new users from being set up. This
works with inode based encryption, as the per-inode key is derived at
setup time, so the secret disappearing doesn't stop any currently open
files from continuing to work.

However, for extent based encryption we do our key derivation at page
writeout and readpage time, which means we need the master key secret to
be available while we still have our file open.

Since the master key lifetime is controlled by a flag, move the clearing
of the secret to the mk_active_refs cleanup stage. This counter
represents the actively open files that still exist on the file system,
and thus should still be able to operate normally. Once the last user
is closed we can clear the secret. Until then no new users are allowed,
and this allows currently open files to continue to operate until
they're closed.

Signed-off-by: Josef Bacik
---
 fs/crypto/keyring.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index e0e311ed6b88..31ea81d97075 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -116,6 +116,7 @@ void fscrypt_put_master_key_activeref(struct super_block *sb,
 	memzero_explicit(&mk->mk_ino_hash_key,
 			 sizeof(mk->mk_ino_hash_key));
 	mk->mk_ino_hash_key_initialized = false;
+	wipe_master_key_secret(&mk->mk_secret);
 
 	/* Drop the structural ref associated with the active refs. */
 	fscrypt_put_master_key(mk);
@@ -245,7 +246,6 @@ void fscrypt_destroy_keyring(struct super_block *sb)
 			WARN_ON_ONCE(refcount_read(&mk->mk_active_refs) != 1);
 			WARN_ON_ONCE(refcount_read(&mk->mk_struct_refs) != 1);
 			WARN_ON_ONCE(!is_master_key_secret_present(mk));
-			wipe_master_key_secret(&mk->mk_secret);
 			set_bit(FSCRYPT_MK_FLAG_EVICTED, &mk->mk_flags);
 			fscrypt_put_master_key_activeref(sb, mk);
 		}
@@ -1064,7 +1064,6 @@ static int do_remove_key(struct file *filp, void __user *_uarg, bool all_users)
 	/* No user claims remaining.  Go ahead and wipe the secret. */
 	err = -ENOKEY;
 	if (!test_and_set_bit(FSCRYPT_MK_FLAG_EVICTED, &mk->mk_flags)) {
-		wipe_master_key_secret(&mk->mk_secret);
 		fscrypt_put_master_key_activeref(sb, mk);
 		err = 0;
 	}
-- 
2.41.0
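The lifetime rule this patch describes (mark the key evicted at removal time, but only wipe the secret when the last active reference goes away) can be modelled in a few lines of userspace C. This is a sketch, not the fscrypt implementation: the toy_* names are invented, locking is omitted, and memset() stands in for memzero_explicit(), which a real implementation would need so the wipe can't be optimised away.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, single-threaded model of a master key. */
struct toy_master_key {
	unsigned char secret[32];
	size_t secret_size;
	int active_refs;	/* one for "key is added", one per open file */
	bool evicted;		/* models FSCRYPT_MK_FLAG_EVICTED */
};

static void toy_mk_init(struct toy_master_key *mk)
{
	memset(mk->secret, 0xab, sizeof(mk->secret));
	mk->secret_size = sizeof(mk->secret);
	mk->active_refs = 1;
	mk->evicted = false;
}

/* The flag, not the secret's presence, decides whether the key is usable. */
static bool toy_mk_usable(const struct toy_master_key *mk)
{
	return !mk->evicted;
}

/* Dropping the last active ref is the only place the secret is wiped. */
static void toy_mk_put_activeref(struct toy_master_key *mk)
{
	if (--mk->active_refs == 0) {
		memset(mk->secret, 0, sizeof(mk->secret));
		mk->secret_size = 0;
	}
}

static int toy_mk_open_file(struct toy_master_key *mk)
{
	if (!toy_mk_usable(mk))
		return -ENOKEY;		/* no new users once evicted */
	mk->active_refs++;
	return 0;
}

static void toy_mk_close_file(struct toy_master_key *mk)
{
	toy_mk_put_activeref(mk);
}

/* Key removal: flip the flag and drop the "key is added" reference. */
static int toy_mk_remove(struct toy_master_key *mk)
{
	if (mk->evicted)
		return -ENOKEY;
	mk->evicted = true;
	toy_mk_put_activeref(mk);
	return 0;
}
```

In this model a file opened before toy_mk_remove() keeps the secret alive until toy_mk_close_file(), while any open attempted after removal fails with -ENOKEY.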
[PATCH v2 01/36] fscrypt: use a flag to indicate that the master key is being evicted
Currently we wipe the mk->mk_secret when we remove the master key, and
we use this status to tell everybody whether or not the master key is
available for use. With extent based encryption we're going to need to
keep the secret around until the inode is evicted, so we need a
different mechanism to tell everybody that the key is currently
unusable.

Accomplish this with a mk_flags member in the master key, and update the
is_master_key_secret_present() helper to return the status of this bit.
Update the removal and adding helpers to manipulate this bit and use it
as the source of truth about whether or not the key is available for
use.

Signed-off-by: Josef Bacik
---
 fs/crypto/fscrypt_private.h | 17 -
 fs/crypto/hooks.c           |  2 +-
 fs/crypto/keyring.c         | 20 ++--
 fs/crypto/keysetup.c        |  4 ++--
 4 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 2fb4ba435d27..f44342f17269 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -471,6 +471,10 @@ struct fscrypt_master_key_secret {
 
 } __randomize_layout;
 
+enum fscrypt_mk_flags {
+	FSCRYPT_MK_FLAG_EVICTED = BIT(0),
+};
+
 /*
  * fscrypt_master_key - an in-use master key
  *
@@ -565,19 +569,14 @@ struct fscrypt_master_key {
 	siphash_key_t		mk_ino_hash_key;
 	bool			mk_ino_hash_key_initialized;
 
+	/* Flags for the master key. */
+	unsigned long		mk_flags;
 } __randomize_layout;
 
 static inline bool
-is_master_key_secret_present(const struct fscrypt_master_key_secret *secret)
+is_master_key_secret_present(const struct fscrypt_master_key *mk)
 {
-	/*
-	 * The READ_ONCE() is only necessary for fscrypt_drop_inode().
-	 * fscrypt_drop_inode() runs in atomic context, so it can't take the key
-	 * semaphore and thus 'secret' can change concurrently which would be a
-	 * data race.  But fscrypt_drop_inode() only need to know whether the
-	 * secret *was* present at the time of check, so READ_ONCE() suffices.
-	 */
-	return READ_ONCE(secret->size) != 0;
+	return !test_bit(FSCRYPT_MK_FLAG_EVICTED, &mk->mk_flags);
 }
 
 static inline const char *master_key_spec_type(
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
index 85d2975b69b7..f7cf724cf256 100644
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -187,7 +187,7 @@ int fscrypt_prepare_setflags(struct inode *inode,
 			return -EINVAL;
 		mk = ci->ci_master_key;
 		down_read(&mk->mk_sem);
-		if (is_master_key_secret_present(&mk->mk_secret))
+		if (is_master_key_secret_present(mk))
 			err = fscrypt_derive_dirhash_key(ci, mk);
 		else
 			err = -ENOKEY;
diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index a51fa6a33de1..e0e311ed6b88 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -102,7 +102,7 @@ void fscrypt_put_master_key_activeref(struct super_block *sb,
 	 * ->mk_active_refs == 0 implies that ->mk_secret is not present and
 	 * that ->mk_decrypted_inodes is empty.
 	 */
-	WARN_ON_ONCE(is_master_key_secret_present(&mk->mk_secret));
+	WARN_ON_ONCE(is_master_key_secret_present(mk));
 	WARN_ON_ONCE(!list_empty(&mk->mk_decrypted_inodes));
 
 	for (i = 0; i <= FSCRYPT_MODE_MAX; i++) {
@@ -236,11 +236,17 @@ void fscrypt_destroy_keyring(struct super_block *sb)
 			 * the keyring due to the single active ref associated
 			 * with ->mk_secret.  There should be no structural refs
 			 * beyond the one associated with the active ref.
+			 *
+			 * We set the EVICTED flag in order to avoid the
+			 * WARN_ON_ONCE(is_master_key_secret_present(mk)) in
+			 * fscrypt_put_master_key_activeref(), as we want to
+			 * maintain that warning for improper cleanup elsewhere.
 			 */
 			WARN_ON_ONCE(refcount_read(&mk->mk_active_refs) != 1);
 			WARN_ON_ONCE(refcount_read(&mk->mk_struct_refs) != 1);
-			WARN_ON_ONCE(!is_master_key_secret_present(&mk->mk_secret));
+			WARN_ON_ONCE(!is_master_key_secret_present(mk));
 			wipe_master_key_secret(&mk->mk_secret);
+			set_bit(FSCRYPT_MK_FLAG_EVICTED, &mk->mk_flags);
 			fscrypt_put_master_key_activeref(sb, mk);
 		}
 	}
@@ -479,9 +485,11 @@ static int add_existing_master_key(struct fscrypt_master_key *mk,
 	}
 
 	/* Re-add the secret if needed. */
-	if (!is_master
[PATCH v2 00/36] btrfs: add fscrypt support
Hello,

This is the next version of the fscrypt support. It is based on a
combination of Sterba's for-next branch and the fscrypt for-next branch.
The fscrypt stuff should apply cleanly to the fscrypt for-next, but it
won't apply cleanly to our btrfs for-next branch. I did this in case
Eric wants to go ahead and merge the fscrypt side, then we can figure
out what to do on the btrfs side.

v1 was posted here

https://lore.kernel.org/linux-btrfs/cover.1695750478.git.jo...@toxicpanda.com/

v1->v2:
- Dropped the rename patch as it's in the fscrypt tree.
- Implemented the soft delete master key idea in a different way that's
  hopefully more straightforward and easier to understand.
- A small fixup related to master keys being removed.

This has been tested with the updated fstests, everything appears to be
working well. Thanks,

Josef

Josef Bacik (21):
  fscrypt: use a flag to indicate that the master key is being evicted
  fscrypt: don't wipe mk secret until the last active user is gone
  fscrypt: add per-extent encryption support
  fscrypt: disable all but standard v2 policies for extent encryption
  blk-crypto: add a process bio callback
  fscrypt: add documentation about extent encryption
  btrfs: add infrastructure for safe em freeing
  btrfs: add fscrypt_info and encryption_type to ordered_extent
  btrfs: plumb through setting the fscrypt_info for ordered extents
  btrfs: populate the ordered_extent with the fscrypt context
  btrfs: keep track of fscrypt info and orig_start for dio reads
  btrfs: add an optional encryption context to the end of file extents
  btrfs: pass through fscrypt_extent_info to the file extent helpers
  btrfs: pass the fscrypt_info through the replace extent infrastructure
  btrfs: implement the fscrypt extent encryption hooks
  btrfs: setup fscrypt_extent_info for new extents
  btrfs: populate ordered_extent with the orig offset
  btrfs: set the bio fscrypt context when applicable
  btrfs: add a bio argument to btrfs_csum_one_bio
  btrfs: add orig_logical to btrfs_bio
  btrfs: implement process_bio cb for fscrypt

Omar Sandoval (7):
  fscrypt: expose fscrypt_nokey_name
  btrfs: disable various operations on encrypted inodes
  btrfs: start using fscrypt hooks
  btrfs: add inode encryption contexts
  btrfs: add new FEATURE_INCOMPAT_ENCRYPT flag
  btrfs: adapt readdir for encrypted and nokey names
  btrfs: implement fscrypt ioctls

Sweet Tea Dorminy (8):
  btrfs: disable verity on encrypted inodes
  btrfs: handle nokey names.
  btrfs: add encryption to CONFIG_BTRFS_DEBUG
  btrfs: add get_devices hook for fscrypt
  btrfs: turn on inlinecrypt mount option for encrypt
  btrfs: set file extent encryption excplicitly
  btrfs: add fscrypt_info and encryption_type to extent_map
  btrfs: explicitly track file extent length for replace and drop

 Documentation/filesystems/fscrypt.rst |  36 ++
 block/blk-crypto-fallback.c           |  28 ++
 block/blk-crypto-profile.c            |   2 +
 block/blk-crypto.c                    |   6 +-
 fs/btrfs/Makefile                     |   1 +
 fs/btrfs/accessors.h                  |  50 +++
 fs/btrfs/bio.c                        |  45 ++-
 fs/btrfs/bio.h                        |   6 +
 fs/btrfs/btrfs_inode.h                |   3 +-
 fs/btrfs/compression.c                |   6 +
 fs/btrfs/ctree.h                      |   4 +
 fs/btrfs/defrag.c                     |  10 +-
 fs/btrfs/delayed-inode.c              |  29 +-
 fs/btrfs/delayed-inode.h              |   6 +-
 fs/btrfs/dir-item.c                   | 108 +-
 fs/btrfs/dir-item.h                   |  11 +-
 fs/btrfs/extent_io.c                  |  81 -
 fs/btrfs/extent_io.h                  |   3 +
 fs/btrfs/extent_map.c                 | 106 +-
 fs/btrfs/extent_map.h                 |  12 +
 fs/btrfs/file-item.c                  |  17 +-
 fs/btrfs/file-item.h                  |   7 +-
 fs/btrfs/file.c                       |  16 +-
 fs/btrfs/fs.h                         |   3 +-
 fs/btrfs/fscrypt.c                    | 326 ++
 fs/btrfs/fscrypt.h                    |  95 +
 fs/btrfs/inode.c                      | 476 --
 fs/btrfs/ioctl.c                      |  41 ++-
 fs/btrfs/ordered-data.c               |  26 +-
 fs/btrfs/ordered-data.h               |  21 +-
 fs/btrfs/reflink.c                    |   8 +
 fs/btrfs/root-tree.c                  |   8 +-
 fs/btrfs/root-tree.h                  |   2 +-
 fs/btrfs/super.c                      |  17 +
 fs/btrfs/sysfs.c                      |   6 +
 fs/btrfs/tree-checker.c               |  66 +++-
 fs/btrfs/tree-log.c                   |  26 +-
 fs/btrfs/verity.c                     |   3 +
 fs/crypto/crypto.c                    |  10 +-
 fs/crypto/fname.c                     |  39 +--
 fs/crypto/fscrypt_private.h           |  61 +++-
 fs/crypto/hooks.c                     |   2 +-
 fs/crypto/inline_crypt.c              |  87 -
 fs/crypto/keyring.c
[PATCH 05/12] common/verity: explicitly don't allow btrfs encryption
From: Sweet Tea Dorminy

Currently btrfs encryption doesn't support verity, although support is
planned. To be explicit about the lack of support, add a custom error
message for the combination.

Signed-off-by: Sweet Tea Dorminy
---
 common/verity | 4
 1 file changed, 4 insertions(+)

diff --git a/common/verity b/common/verity
index 03d175ce..4e601a81 100644
--- a/common/verity
+++ b/common/verity
@@ -224,6 +224,10 @@ _scratch_mkfs_encrypted_verity()
 		# features with -O.  Instead -O must be supplied multiple times.
 		_scratch_mkfs -O encrypt -O verity
 		;;
+	btrfs)
+		# currently verity + encryption is not supported
+		_notrun "btrfs doesn't currently support verity + encryption"
+		;;
 	*)
 		_notrun "$FSTYP not supported in _scratch_mkfs_encrypted_verity"
 		;;
-- 
2.41.0
[PATCH 12/12] fstest: add a fsstress+fscrypt test
I noticed we don't run fsstress with fscrypt in any of our tests, and
this was helpful in uncovering a couple of symlink related corner cases
for the btrfs support work. Add a basic test that creates an encrypted
directory and runs fsstress in that directory.

Signed-off-by: Josef Bacik
---
 tests/generic/736     | 38 ++
 tests/generic/736.out |  3 +++
 2 files changed, 41 insertions(+)
 create mode 100644 tests/generic/736
 create mode 100644 tests/generic/736.out

diff --git a/tests/generic/736 b/tests/generic/736
new file mode 100644
index ..0ef37d7e
--- /dev/null
+++ b/tests/generic/736
@@ -0,0 +1,38 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright 2023 Meta
+#
+# FS QA Test No. generic/736
+#
+# Run fsstress on an encrypted directory
+#
+
+. ./common/preamble
+_begin_fstest auto quick encrypt
+echo
+
+# Import common functions.
+. ./common/filter
+. ./common/encrypt
+
+# real QA test starts here
+_supported_fs generic
+_require_scratch_encryption -v 2
+
+_scratch_mkfs_encrypted &>> $seqres.full
+_scratch_mount
+
+dir=$SCRATCH_MNT/dir
+mkdir $dir
+
+_set_encpolicy $dir $TEST_KEY_IDENTIFIER
+_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY"
+
+args=$(_scale_fsstress_args -p 4 -n 1 -p 2 $FSSTRESS_AVOID -d $dir)
+echo "Run fsstress $args" >>$seqres.full
+
+$FSSTRESS_PROG $args >> $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/736.out b/tests/generic/736.out
new file mode 100644
index ..022754df
--- /dev/null
+++ b/tests/generic/736.out
@@ -0,0 +1,3 @@
+QA output created by 736
+
+Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751
-- 
2.41.0
[PATCH 11/12] fstests: split generic/613 into two tests
generic/613 tests v1 and v2 policies, but btrfs can only support v2
policies. Split this into two different tests: 613, which will only
test v1 policies, and 735, which will test v2 policies. The 735 test
will also add checks for the per-extent nonces to validate they're all
sufficiently random.

Signed-off-by: Josef Bacik
---
 tests/generic/613     |  20 ++--
 tests/generic/613.out |   5 +-
 tests/generic/735     | 117 ++
 tests/generic/735.out |  14 +
 4 files changed, 138 insertions(+), 18 deletions(-)
 create mode 100644 tests/generic/735
 create mode 100644 tests/generic/735.out

diff --git a/tests/generic/613 b/tests/generic/613
index 47c60e9c..96b81a96 100755
--- a/tests/generic/613
+++ b/tests/generic/613
@@ -22,22 +22,21 @@ _begin_fstest auto quick encrypt
 
 # real QA test starts here
 _supported_fs generic
-_require_scratch_encryption -v 2
+_require_scratch_encryption
 _require_get_encryption_nonce_support
 _require_command "$XZ_PROG" xz
 
 _scratch_mkfs_encrypted &>> $seqres.full
 _scratch_mount
 
-echo -e "\n# Adding encryption keys"
-_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY"
+echo -e "\n# Adding encryption key"
 _add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" -d $TEST_KEY_DESCRIPTOR
 
 # Create a bunch of encrypted files and directories -- enough for the uniqueness
 # and randomness tests to be meaningful, but not so many that this test takes a
-# long time.  Test using both v1 and v2 encryption policies, and for each of
-# those test the case of an encryption policy that is assigned to an empty
-# directory as well as the case of a file created in an encrypted directory.
+# long time.  Test using the v1 encryption policy, test the case of an
+# encryption policy that is assigned to an empty directory as well as the case
+# of a file created in an encrypted directory.
 echo -e "\n# Creating encrypted files and directories"
 inodes=()
 for i in {1..50}; do
@@ -45,20 +44,11 @@ for i in {1..50}; do
 	mkdir $dir
 	inodes+=("$(stat -c %i $dir)")
 	_set_encpolicy $dir $TEST_KEY_DESCRIPTOR
-
-	dir=$SCRATCH_MNT/v2_policy_dir_$i
-	mkdir $dir
-	inodes+=("$(stat -c %i $dir)")
-	_set_encpolicy $dir $TEST_KEY_IDENTIFIER
 done
 for i in {1..50}; do
 	file=$SCRATCH_MNT/v1_policy_dir_1/$i
 	touch $file
 	inodes+=("$(stat -c %i $file)")
-
-	file=$SCRATCH_MNT/v2_policy_dir_1/$i
-	touch $file
-	inodes+=("$(stat -c %i $file)")
 done
 _scratch_unmount
 
diff --git a/tests/generic/613.out b/tests/generic/613.out
index 203a64f2..4a218d03 100644
--- a/tests/generic/613.out
+++ b/tests/generic/613.out
@@ -1,7 +1,6 @@
 QA output created by 613
 
-# Adding encryption keys
-Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751
+# Adding encryption key
 Added encryption key with descriptor
 
 # Creating encrypted files and directories
@@ -12,5 +11,5 @@ Added encryption key with descriptor
 Listing non-unique nonces:
 
 # Verifying randomness of nonces
-Uncompressed size is 3200 bytes
+Uncompressed size is 1600 bytes
 Nonces are incompressible, as expected
diff --git a/tests/generic/735 b/tests/generic/735
new file mode 100644
index ..c901be1f
--- /dev/null
+++ b/tests/generic/735
@@ -0,0 +1,117 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright 2023 Meta
+#
+# FS QA Test No. 735
+#
+# A variation of generic/613 that only tests v2, and checks data nonces for any
+# file system that supports per-extent encryption.
+#
+# Test that encryption nonces are unique and random, where randomness is
+# approximated as "incompressible by the xz program".
+#
+# An encryption nonce is the 16-byte value that the filesystem generates for
+# each encrypted file.  These nonces must be unique in order to cause different
+# files to be encrypted differently, which is an important security property.
+# In practice, they need to be random to achieve that; and it's easy enough to
+# test for both uniqueness and randomness, so we test for both.
+#
+. ./common/preamble
+_begin_fstest auto quick encrypt
+
+# Import common functions.
+. ./common/filter
+. ./common/encrypt
+
+# real QA test starts here
+_supported_fs generic
+_require_scratch_encryption -v 2
+_require_get_encryption_nonce_support
+_require_command "$XZ_PROG" xz
+
+_check_nonce()
+{
+	local nonce=$1
+
+	if (( ${#nonce} != 32 )) || [ -n "$(echo "$nonce" | tr -d 0-9a-fA-F)" ]
+	then
+		_fail "Expected nonce for inode $inode to be 16 bytes (32 hex characters), but got \"$nonce\""
+	fi
+}
+
+_scratch_mkfs_encrypted &>> $seqres.full
+_scratch_mount
+
+echo -e "\n# Adding encryption key"
+_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY"
+
+# Create a bunch o
[PATCH 07/12] btrfs: test snapshotting encrypted subvol
From: Sweet Tea Dorminy Make sure that snapshots of encrypted data are readable and writeable. Test deliberately high-numbered to not conflict. Signed-off-by: Sweet Tea Dorminy --- tests/btrfs/614 | 76 ++ tests/btrfs/614.out | 111 2 files changed, 187 insertions(+) create mode 100755 tests/btrfs/614 create mode 100644 tests/btrfs/614.out diff --git a/tests/btrfs/614 b/tests/btrfs/614 new file mode 100755 index ..87dd27f9 --- /dev/null +++ b/tests/btrfs/614 @@ -0,0 +1,76 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2023 Meta Platforms, Inc. All Rights Reserved. +# +# FS QA Test 614 +# +# Try taking a snapshot of an encrypted subvolume. Make sure the snapshot is +# still readable. Rewrite part of the subvol with the same data; make sure it's +# still readable. +# +. ./common/preamble +_begin_fstest auto encrypt + +# Import common functions. +. ./common/encrypt +. ./common/filter + +# real QA test starts here +_supported_fs btrfs + +_require_test +_require_scratch +_require_scratch_encryption -v 2 +_require_command "$KEYCTL_PROG" keyctl + +_scratch_mkfs_encrypted &>> $seqres.full +_scratch_mount + +udir=$SCRATCH_MNT/reference +dir=$SCRATCH_MNT/subvol +dir2=$SCRATCH_MNT/subvol2 +$BTRFS_UTIL_PROG subvolume create $dir >> $seqres.full +mkdir $udir + +_set_encpolicy $dir $TEST_KEY_IDENTIFIER +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" + +# get files with lots of extents by using backwards writes. 
+for j in `seq 0 50`; do + for i in `seq 20 -1 1`; do + $XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \ + $dir/foo-$j >> $seqres.full | _filter_xfs_io + $XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \ + $udir/foo-$j >> $seqres.full | _filter_xfs_io + done +done + +$BTRFS_UTIL_PROG subvolume snapshot $dir $dir2 | _filter_scratch + +_scratch_remount +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" +sleep 30 +echo "Diffing $dir and $dir2" +diff $dir $dir2 + +echo "Rewriting $dir2 partly" +# rewrite half of each file in the snapshot +for j in `seq 0 50`; do + for i in `seq 10 -1 1`; do + $XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \ + $dir2/foo-$j >> $seqres.full | _filter_xfs_io + done +done + +echo "Diffing $dir and $dir2" +diff $dir $dir2 + +echo "Dropping key and diffing" +_rm_enckey $SCRATCH_MNT $TEST_KEY_IDENTIFIER +diff $dir $dir2 |& _filter_scratch | _filter_nokey_filenames + +$BTRFS_UTIL_PROG subvolume delete $dir > /dev/null 2>&1 + +# success, all done +status=0 +exit diff --git a/tests/btrfs/614.out b/tests/btrfs/614.out new file mode 100644 index ..390807e8 --- /dev/null +++ b/tests/btrfs/614.out @@ -0,0 +1,111 @@ +QA output created by 614 +Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +Create a snapshot of 'SCRATCH_MNT/subvol' in 'SCRATCH_MNT/subvol2' +Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +Diffing /mnt/scratch/subvol and /mnt/scratch/subvol2 +Rewriting /mnt/scratch/subvol2 partly +Diffing /mnt/scratch/subvol and /mnt/scratch/subvol2 +Dropping key and diffing +Removed encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME 
+NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOKEY_NAME NOKEY_NAME NOKEY_NAME +NOKEY_NAME: NOKEY_NAME/NOKEY_NAME/NOKEY_NAME: NOKEY_NAME NOK
[PATCH 10/12] fstests: split generic/581 into two tests
generic/581 is mostly a v2 policy test, but it does do some quick checks of v1 policies as a normal user. Split the v1 and v2 related parts into two different tests so that the v2 part can get properly tested for btrfs file systems, which only support v2 policies. Signed-off-by: Josef Bacik --- tests/generic/581 | 89 +--- tests/generic/581.out | 50 tests/generic/734 | 135 ++ tests/generic/734.out | 51 4 files changed, 188 insertions(+), 137 deletions(-) create mode 100644 tests/generic/734 create mode 100644 tests/generic/734.out diff --git a/tests/generic/581 b/tests/generic/581 index cabc7e1c..ab930ac6 100755 --- a/tests/generic/581 +++ b/tests/generic/581 @@ -4,8 +4,7 @@ # # FS QA Test No. generic/581 # -# Test non-root use of the fscrypt filesystem-level encryption keyring -# and v2 encryption policies. +# Test non-root use of the fscrypt filesystem-level encryption keyring policy. # . ./common/preamble @@ -31,7 +30,7 @@ _cleanup() # real QA test starts here _supported_fs generic _require_user -_require_scratch_encryption -v 2 +_require_scratch_encryption _scratch_mkfs_encrypted &>> $seqres.full _scratch_mount @@ -58,90 +57,6 @@ echo "# Adding v1 policy key as regular user (should fail with EACCES)" _user_do_add_enckey $SCRATCH_MNT "$raw_key" -d $keydesc rm -rf $dir -echo -_user_do "mkdir $dir" - -echo "# Setting v2 policy as regular user without key already added (should fail with ENOKEY)" -_user_do_set_encpolicy $dir $keyid |& _filter_scratch - -echo "# Adding v2 policy key as regular user (should succeed)" -_user_do_add_enckey $SCRATCH_MNT "$raw_key" - -echo "# Setting v2 policy as regular user with key added (should succeed)" -_user_do_set_encpolicy $dir $keyid - -echo "# Getting v2 policy as regular user (should succeed)" -_user_do_get_encpolicy $dir | _filter_scratch - -echo "# Creating encrypted file as regular user (should succeed)" -_user_do "echo contents > $dir/file" - -echo "# Removing v2 policy key as regular user (should succeed)" 
-_user_do_rm_enckey $SCRATCH_MNT $keyid - -_scratch_cycle_mount # Clear all keys - -# Wait for any invalidated keys to be garbage-collected. -i=0 -while grep -E -q '^[0-9a-f]+ [^ ]*i[^ ]*' /proc/keys; do - if ((++i >= 20)); then - echo "Timed out waiting for invalidated keys to be GC'ed" >> $seqres.full - break - fi - sleep 0.5 -done - -# Set the user key quota to the fsgqa user's current number of keys plus 5. -orig_keys=$(_user_do "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1") -: ${orig_keys:=0} -echo "orig_keys=$orig_keys" >> $seqres.full -orig_maxkeys=$( /proc/sys/kernel/keys/maxkeys - -echo -echo "# Testing user key quota" -for i in `seq $((keys_to_add + 1))`; do - rand_raw_key=$(_generate_raw_encryption_key) - _user_do_add_enckey $SCRATCH_MNT "$rand_raw_key" \ - | sed 's/ with identifier .*$//' -done - -# Restore the original key quota. -echo "$orig_maxkeys" > /proc/sys/kernel/keys/maxkeys - -rm -rf $dir -echo -_user_do "mkdir $dir" -_scratch_cycle_mount # Clear all keys - -# Test multiple users adding the same key. 
-echo "# Adding key as root" -_add_enckey $SCRATCH_MNT "$raw_key" -echo "# Getting key status as regular user" -_user_do_enckey_status $SCRATCH_MNT $keyid -echo "# Removing key only added by another user (should fail with ENOKEY)" -_user_do_rm_enckey $SCRATCH_MNT $keyid -echo "# Setting v2 encryption policy with key only added by another user (should fail with ENOKEY)" -_user_do_set_encpolicy $dir $keyid |& _filter_scratch -echo "# Adding second user of key" -_user_do_add_enckey $SCRATCH_MNT "$raw_key" -echo "# Getting key status as regular user" -_user_do_enckey_status $SCRATCH_MNT $keyid -echo "# Setting v2 encryption policy as regular user" -_user_do_set_encpolicy $dir $keyid -echo "# Removing this user's claim to the key" -_user_do_rm_enckey $SCRATCH_MNT $keyid -echo "# Getting key status as regular user" -_user_do_enckey_status $SCRATCH_MNT $keyid -echo "# Adding back second user of key" -_user_do_add_enckey $SCRATCH_MNT "$raw_key" -echo "# Remove key for \"all users\", as regular user (should fail with EACCES)" -_user_do_rm_enckey $SCRATCH_MNT $keyid -a |& _filter_scratch -_enckey_status $SCRATCH_MNT $keyid -echo "# Remove key for \"all users\", as root" -_rm_enckey $SCRATCH_MNT $keyid -a -_enckey_status $SCRATCH_MNT $keyid # success, all done status=0 diff -
[PATCH 08/12] fstests: properly test for v1 encryption policies in encrypt tests
With btrfs adding fscrypt support we're limiting the usage to plain v2 policies only. This means we need to update the _require's for generic/593 that tests both v1 and v2 policies. The other sort of tests will be split into two tests in later patches.

Signed-off-by: Josef Bacik
---
 common/encrypt    | 2 ++
 tests/generic/593 | 1 +
 2 files changed, 3 insertions(+)

diff --git a/common/encrypt b/common/encrypt
index 1372af66..120ca612 100644
--- a/common/encrypt
+++ b/common/encrypt
@@ -59,6 +59,8 @@ _require_scratch_encryption()
 	# policy required by the test.
 	if [ $# -ne 0 ]; then
 		_require_encryption_policy_support $SCRATCH_MNT "$@"
+	else
+		_require_encryption_policy_support $SCRATCH_MNT -v 1
 	fi
 
 	_scratch_unmount
diff --git a/tests/generic/593 b/tests/generic/593
index 2dda5d76..7907236c 100755
--- a/tests/generic/593
+++ b/tests/generic/593
@@ -17,6 +17,7 @@ _begin_fstest auto quick encrypt
 
 # real QA test starts here
 _supported_fs generic
+_require_scratch_encryption -v 1
 _require_scratch_encryption -v 2
 _require_command "$KEYCTL_PROG" keyctl
 
--
2.41.0
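The fallback added to _require_scratch_encryption() can be tried in isolation. This is only a minimal sketch: the stub below stands in for the real _require_encryption_policy_support helper, and both the wrapper name and the /mnt/scratch path are hypothetical. Only the `$#` branch mirrors the patch.

```shell
#!/bin/bash
# Stub standing in for the real fstests helper; it only reports its arguments.
_require_encryption_policy_support()
{
	echo "policy check: $*"
}

# Hypothetical mount point, for illustration only.
SCRATCH_MNT=/mnt/scratch

# Same branch structure as the patched _require_scratch_encryption():
# explicit options are passed through, no options now means "require v1".
require_scratch_encryption_sketch()
{
	if [ $# -ne 0 ]; then
		_require_encryption_policy_support $SCRATCH_MNT "$@"
	else
		_require_encryption_policy_support $SCRATCH_MNT -v 1
	fi
}

default_check=$(require_scratch_encryption_sketch)
v2_check=$(require_scratch_encryption_sketch -v 2)
echo "$default_check"
echo "$v2_check"
```

With this shape, a test that passes no options is skipped on a filesystem that lacks v1 policy support, while a test that asks for `-v 2` is unaffected.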
[PATCH 09/12] fstests: split generic/580 into two tests
generic/580 tests both v1 and v2 encryption policies, however btrfs only supports v2 policies. Split this into two tests so that we can get the v2 coverage for btrfs. Signed-off-by: Josef Bacik --- tests/generic/580 | 118 ++ tests/generic/580.out | 40 -- tests/generic/733 | 79 tests/generic/733.out | 44 4 files changed, 173 insertions(+), 108 deletions(-) create mode 100644 tests/generic/733 create mode 100644 tests/generic/733.out diff --git a/tests/generic/580 b/tests/generic/580 index 73f32ff9..63ab9712 100755 --- a/tests/generic/580 +++ b/tests/generic/580 @@ -5,7 +5,7 @@ # FS QA Test generic/580 # # Basic test of the fscrypt filesystem-level encryption keyring -# and v2 encryption policies. +# policy. # . ./common/preamble @@ -18,80 +18,62 @@ echo # real QA test starts here _supported_fs generic -_require_scratch_encryption -v 2 +_require_scratch_encryption _scratch_mkfs_encrypted &>> $seqres.full _scratch_mount -test_with_policy_version() -{ - local vers=$1 - - if (( vers == 1 )); then - local keyspec=$TEST_KEY_DESCRIPTOR - local add_enckey_args="-d $keyspec" - else - local keyspec=$TEST_KEY_IDENTIFIER - local add_enckey_args="" - fi - - mkdir $dir - echo "# Setting v$vers encryption policy" - _set_encpolicy $dir $keyspec - echo "# Getting v$vers encryption policy" - _get_encpolicy $dir | _filter_scratch - if (( vers == 1 )); then - echo "# Getting v1 encryption policy using old ioctl" - _get_encpolicy $dir -1 | _filter_scratch - fi - echo "# Trying to create file without key added yet" - $XFS_IO_PROG -f $dir/file |& _filter_scratch - echo "# Getting encryption key status" - _enckey_status $SCRATCH_MNT $keyspec - echo "# Adding encryption key" - _add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" $add_enckey_args - echo "# Creating encrypted file" - echo contents > $dir/file - echo "# Getting encryption key status" - _enckey_status $SCRATCH_MNT $keyspec - echo "# Removing encryption key" - _rm_enckey $SCRATCH_MNT $keyspec - echo "# Getting encryption key status" - 
_enckey_status $SCRATCH_MNT $keyspec - echo "# Verifying that the encrypted directory was \"locked\"" - cat $dir/file |& _filter_scratch - cat "$(find $dir -type f)" |& _filter_scratch | cut -d ' ' -f3- - - # Test removing key with a file open. - echo "# Re-adding encryption key" - _add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" $add_enckey_args - echo "# Creating another encrypted file" - echo foo > $dir/file2 - echo "# Removing key while an encrypted file is open" - exec 3< $dir/file - _rm_enckey $SCRATCH_MNT $keyspec - echo "# Non-open file should have been evicted" - cat $dir/file2 |& _filter_scratch - echo "# Open file shouldn't have been evicted" - cat $dir/file - echo "# Key should be in \"incompletely removed\" state" - _enckey_status $SCRATCH_MNT $keyspec - echo "# Closing file and removing key for real now" - exec 3<&- - _rm_enckey $SCRATCH_MNT $keyspec - cat $dir/file |& _filter_scratch - - echo "# Cleaning up" - rm -rf $dir - _scratch_cycle_mount# Clear all keys - echo -} - dir=$SCRATCH_MNT/dir +keyspec=$TEST_KEY_DESCRIPTOR -test_with_policy_version 1 +mkdir $dir +echo "# Setting v1 encryption policy" +_set_encpolicy $dir $keyspec +echo "# Getting v1 encryption policy" +_get_encpolicy $dir | _filter_scratch +echo "# Getting v1 encryption policy using old ioctl" +_get_encpolicy $dir -1 | _filter_scratch +echo "# Trying to create file without key added yet" +$XFS_IO_PROG -f $dir/file |& _filter_scratch +echo "# Getting encryption key status" +_enckey_status $SCRATCH_MNT $keyspec +echo "# Adding encryption key" +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" -d $keyspec +echo "# Creating encrypted file" +echo contents > $dir/file +echo "# Getting encryption key status" +_enckey_status $SCRATCH_MNT $keyspec +echo "# Removing encryption key" +_rm_enckey $SCRATCH_MNT $keyspec +echo "# Getting encryption key status" +_enckey_status $SCRATCH_MNT $keyspec +echo "# Verifying that the encrypted directory was \"locked\"" +cat $dir/file |& _filter_scratch +cat "$(find $dir -type 
f)"
[PATCH 04/12] common/encrypt: enable making a encrypted btrfs filesystem
From: Sweet Tea Dorminy

Signed-off-by: Sweet Tea Dorminy
---
 common/encrypt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/common/encrypt b/common/encrypt
index 2c1925da..1372af66 100644
--- a/common/encrypt
+++ b/common/encrypt
@@ -153,6 +153,9 @@ _scratch_mkfs_encrypted()
 		# erase the UBI volume; reformated automatically on next mount
 		$UBIUPDATEVOL_PROG ${SCRATCH_DEV} -t
 		;;
+	btrfs)
+		_scratch_mkfs
+		;;
 	ceph)
 		_scratch_cleanup_files
 		;;
@@ -168,6 +171,9 @@ _scratch_mkfs_sized_encrypted()
 	ext4|f2fs)
 		MKFS_OPTIONS="$MKFS_OPTIONS -O encrypt" _scratch_mkfs_sized $*
 		;;
+	btrfs)
+		_scratch_mkfs_sized $*
+		;;
 	*)
 		_notrun "Filesystem $FSTYP not supported in _scratch_mkfs_sized_encrypted"
 		;;
--
2.41.0
[PATCH 06/12] btrfs: add simple test of reflink of encrypted data
From: Sweet Tea Dorminy Make sure that we succeed at reflinking encrypted data. Test deliberately numbered with a high number so it won't conflict with tests between now and merge. --- tests/btrfs/613 | 59 + tests/btrfs/613.out | 13 ++ 2 files changed, 72 insertions(+) create mode 100755 tests/btrfs/613 create mode 100644 tests/btrfs/613.out diff --git a/tests/btrfs/613 b/tests/btrfs/613 new file mode 100755 index ..0288016e --- /dev/null +++ b/tests/btrfs/613 @@ -0,0 +1,59 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2023 Meta Platforms, Inc. All Rights Reserved. +# +# FS QA Test 613 +# +# Check if reflinking one encrypted file on btrfs succeeds. +# +. ./common/preamble +_begin_fstest auto encrypt + +# Import common functions. +. ./common/encrypt +. ./common/filter +. ./common/reflink + +# real QA test starts here + +# Modify as appropriate. +_supported_fs btrfs + +_require_test +_require_scratch +_require_cp_reflink +_require_scratch_encryption -v 2 +_require_command "$KEYCTL_PROG" keyctl + +_scratch_mkfs_encrypted &>> $seqres.full +_scratch_mount + +dir=$SCRATCH_MNT/dir +mkdir $dir +_set_encpolicy $dir $TEST_KEY_IDENTIFIER +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" +echo "Creating and reflinking a file" +$XFS_IO_PROG -t -f -c "pwrite 0 33k" $dir/test > /dev/null +cp --reflink=always $dir/test $dir/test2 + +echo "Can't reflink encrypted and unencrypted" +cp --reflink=always $dir/test $SCRATCH_MNT/fail |& _filter_scratch + +echo "Diffing the file and its copy" +diff $dir/test $dir/test2 + +echo "Verifying the files are reflinked" +_verify_reflink $dir/test $dir/test2 + +echo "Diffing the files after remount" +_scratch_cycle_mount +_add_enckey $SCRATCH_MNT "$TEST_RAW_KEY" +diff $dir/test $dir/test2 + +echo "Diffing the files after key remove" +_rm_enckey $SCRATCH_MNT $TEST_KEY_IDENTIFIER +diff $dir/test $dir/test2 |& _filter_scratch + +# success, all done +status=0 +exit diff --git a/tests/btrfs/613.out b/tests/btrfs/613.out new file mode 
100644 index ..4895d6dd --- /dev/null +++ b/tests/btrfs/613.out @@ -0,0 +1,13 @@ +QA output created by 613 +Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +Creating and reflinking a file +Can't reflink encrypted and unencrypted +cp: failed to clone 'SCRATCH_MNT/fail' from 'SCRATCH_MNT/dir/test': Invalid argument +Diffing the file and its copy +Verifying the files are reflinked +Diffing the files after remount +Added encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +Diffing the files after key remove +Removed encryption key with identifier 69b2f6edeee720cce0577937eb8a6751 +diff: SCRATCH_MNT/dir/test: No such file or directory +diff: SCRATCH_MNT/dir/test2: No such file or directory -- 2.41.0
[PATCH 02/12] common/encrypt: add btrfs to get_encryption_*nonce
From: Sweet Tea Dorminy

Add the modes of getting the encryption nonces, either inode or extent, to the various get_encryption_nonce functions. For now, no encrypt test makes a file with more than one extent, so we can just grab the first extent's nonce for the data nonce; when we write a bigger file test, we'll need to change that.

Signed-off-by: Sweet Tea Dorminy
---
 common/encrypt | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/common/encrypt b/common/encrypt
index 04b6e5ac..fc1c8cc7 100644
--- a/common/encrypt
+++ b/common/encrypt
@@ -531,6 +531,17 @@ _get_encryption_file_nonce()
 				found = 0;
 			}'
 		;;
+	btrfs)
+		# Retrieve the fscrypt context for an inode as a hex string.
+		# btrfs prints these like:
+		#	item 14 key ($inode FSCRYPT_CTXT_ITEM 0) itemoff 15491 itemsize 40
+		#	value: 020104008fabf3dd745d41856e812458cd765bf0140f41d62853f4c0351837daff4dcc8f
+
+		$BTRFS_UTIL_PROG inspect-internal dump-tree $device | \
+			grep -A 1 "key ($inode FSCRYPT_CTXT_ITEM 0)" | \
+			grep --only-matching 'value: [[:xdigit:]]\+' | \
+			tr -d ' \n' | tail -c 32
+		;;
 	*)
 		_fail "_get_encryption_file_nonce() isn't implemented on $FSTYP"
 		;;
@@ -550,6 +561,23 @@ _get_encryption_data_nonce()
 	ext4|f2fs)
 		_get_encryption_file_nonce $device $inode
 		;;
+	btrfs)
+		# Retrieve the encryption IV of the first file extent in an inode as a hex
+		# string. btrfs prints the file extents (for simple unshared
+		# inodes) like:
+		# item 21 key ($inode EXTENT_DATA 0) itemoff 2534 itemsize 69
+		#	generation 7 type 1 (regular)
+		#	extent data disk byte 5304320 nr 1048576
+		#	extent data offset 0 nr 1048576 ram 1048576
+		#	extent compression 0 (none)
+		#	extent encryption 161 ((1, 40: context 02010402116a77667261d7422a4b1ed8c427e685edb7a0d370d0c9d4003033303330))
+
+
+		$BTRFS_UTIL_PROG inspect-internal dump-tree $device | \
+			grep -A 5 "key ($inode EXTENT_DATA 0)" | \
+			grep --only-matching 'context [[:xdigit:]]\+' | \
+			tr -d ' \n' | tail -c 32
+		;;
 	*)
 		_fail "_get_encryption_data_nonce() isn't implemented on $FSTYP"
 		;;
@@ -572,6 +600,9 @@ _require_get_encryption_nonce_support()
 		# Otherwise the xattr is incorrectly parsed as v1. But just let
 		# the test fail in that case, as it was an f2fs-tools bug...
 		;;
+	btrfs)
+		_require_command "$BTRFS_UTIL_PROG" btrfs
+		;;
 	*)
 		_notrun "_get_encryption_*nonce() isn't implemented on $FSTYP"
 		;;
--
2.41.0
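The inode-nonce pipeline can be exercised without a btrfs device by feeding it the sample text from the patch comment. This is a sketch under assumptions: the helper name is hypothetical, it reads the dump from stdin instead of running `btrfs inspect-internal dump-tree`, and the inode number 257 is substituted for `$inode` purely for illustration. It relies on the nonce being the last 16 bytes of a v2 fscrypt context, which is why the pipeline keeps the last 32 hex characters.

```shell
#!/bin/bash
# Sample text adapted from the comment in the patch; not real tool output.
# The inode number 257 is an assumption made for this example.
sample_dump='item 14 key (257 FSCRYPT_CTXT_ITEM 0) itemoff 15491 itemsize 40
	value: 020104008fabf3dd745d41856e812458cd765bf0140f41d62853f4c0351837daff4dcc8f'

# Hypothetical helper: same pipeline as the patch, but reading the dump
# from stdin rather than from the dump-tree command.
extract_inode_nonce()
{
	local inode=$1

	grep -A 1 "key ($inode FSCRYPT_CTXT_ITEM 0)" | \
		grep --only-matching 'value: [[:xdigit:]]\+' | \
		tr -d ' \n' | tail -c 32
}

nonce=$(printf '%s\n' "$sample_dump" | extract_inode_nonce 257)
echo "$nonce"
```

Because `tr` strips the newline before `tail -c 32` runs, the 32 bytes kept are exactly the trailing hex digits of the value.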
[PATCH 03/12] common/encrypt: add btrfs to get_ciphertext_filename
From: Sweet Tea Dorminy

Add the relevant call to get an encrypted filename from btrfs.

Signed-off-by: Sweet Tea Dorminy
---
 common/encrypt | 16
 1 file changed, 16 insertions(+)

diff --git a/common/encrypt b/common/encrypt
index fc1c8cc7..2c1925da 100644
--- a/common/encrypt
+++ b/common/encrypt
@@ -618,6 +618,19 @@ _get_ciphertext_filename()
 	local dir_inode=$3
 
 	case $FSTYP in
+	btrfs)
+		# Extract the filename from the inode_ref object, similar to:
+		# item 24 key (259 INODE_REF 257) itemoff 14826 itemsize 26
+		# 	index 3 namelen 16 name: J\xf7\x15tD\x8eL\xae/\x98\x9f\x09\xc1\xb6\x09>
+		#
+		$BTRFS_UTIL_PROG inspect-internal dump-tree $device | \
+			grep -A 1 "key ($inode INODE_REF " | tail -n 1 | \
+			perl -ne '
+				s/.*?name: //;
+				chomp;
+				s/\\x([[:xdigit:]]{2})/chr hex $1/eg;
+				print;'
+		;;
 	ext4)
 		# Extract the filename from the debugfs output line like:
 		#
@@ -715,6 +728,9 @@ _require_get_ciphertext_filename_support()
 			_notrun "dump.f2fs (f2fs-tools) is too old; doesn't support showing unambiguous on-disk filenames"
 		fi
 		;;
+	btrfs)
+		_require_command "$BTRFS_UTIL_PROG" btrfs
+		;;
 	*)
 		_notrun "_get_ciphertext_filename() isn't implemented on $FSTYP"
 		;;
--
2.41.0
[PATCH 00/12] fstests: fscrypt test updates
Hello,

Btrfs is adding fscrypt support, and thus requires a variety of changes to the current fscrypt tests and infrastructure, as well as adding a few extra tests.

The bulk of the changes to the existing tests is simply breaking the v1 and v2 policy tests into two different tests, as btrfs will only support v2 policies.

The infrastructure related work is around pulling the nonces out of the file system in order to support the different nonce/decryption related checks.

And finally there are 3 new tests, two around reflinks and snapshots and then a generic fsstress test.

I've tested these with ext4 and btrfs (with our patches) to make sure everything works properly. Thanks,

Josef

Josef Bacik (5):
  fstests: properly test for v1 encryption policies in encrypt tests
  fstests: split generic/580 into two tests
  fstests: split generic/581 into two tests
  fstests: split generic/613 into two tests
  fstest: add a fsstress+fscrypt test

Sweet Tea Dorminy (7):
  common/encrypt: separate data and inode nonces
  common/encrypt: add btrfs to get_encryption_*nonce
  common/encrypt: add btrfs to get_ciphertext_filename
  common/encrypt: enable making a encrypted btrfs filesystem
  common/verity: explicitly don't allow btrfs encryption
  btrfs: add simple test of reflink of encrypted data
  btrfs: test snapshotting encrypted subvol

 common/encrypt        |  88 ---
 common/verity         |   4 ++
 tests/btrfs/613       |  59 ++
 tests/btrfs/613.out   |  13
 tests/btrfs/614       |  76
 tests/btrfs/614.out   | 111 ++
 tests/f2fs/002        |   2 +-
 tests/generic/580     | 118
 tests/generic/580.out |  40 -
 tests/generic/581     |  89 +---
 tests/generic/581.out |  50
 tests/generic/593     |   1 +
 tests/generic/613     |  24 +++-
 tests/generic/613.out |   5 +-
 tests/generic/733     |  79
 tests/generic/733.out |  44 ++
 tests/generic/734     | 135 ++
 tests/generic/734.out |  51
 tests/generic/735     | 117
 tests/generic/735.out |  14 +
 tests/generic/736     |  38
 tests/generic/736.out |   3 +
 22 files changed, 888 insertions(+), 273 deletions(-)
 create mode 100755 tests/btrfs/613
 create mode 100644 tests/btrfs/613.out
 create mode 100755 tests/btrfs/614
 create mode 100644 tests/btrfs/614.out
 create mode 100644 tests/generic/733
 create mode 100644 tests/generic/733.out
 create mode 100644 tests/generic/734
 create mode 100644 tests/generic/734.out
 create mode 100644 tests/generic/735
 create mode 100644 tests/generic/735.out
 create mode 100644 tests/generic/736
 create mode 100644 tests/generic/736.out

--
2.41.0
[PATCH 01/12] common/encrypt: separate data and inode nonces
From: Sweet Tea Dorminy btrfs will have different inode and data nonces, so we need to be specific about which nonce each use needs. For now, there is no difference in the two functions. Signed-off-by: Sweet Tea Dorminy --- common/encrypt| 33 ++--- tests/f2fs/002| 2 +- tests/generic/613 | 4 ++-- 3 files changed, 29 insertions(+), 10 deletions(-) diff --git a/common/encrypt b/common/encrypt index 1a77e23b..04b6e5ac 100644 --- a/common/encrypt +++ b/common/encrypt @@ -488,7 +488,7 @@ _add_fscrypt_provisioning_key() # Retrieve the encryption nonce of the given inode as a hex string. The nonce # was randomly generated by the filesystem and isn't exposed directly to # userspace. But it can be read using the filesystem's debugging tools. -_get_encryption_nonce() +_get_encryption_file_nonce() { local device=$1 local inode=$2 @@ -532,15 +532,34 @@ _get_encryption_nonce() }' ;; *) - _fail "_get_encryption_nonce() isn't implemented on $FSTYP" + _fail "_get_encryption_file_nonce() isn't implemented on $FSTYP" ;; esac } -# Require support for _get_encryption_nonce() +# Retrieve the encryption nonce used to encrypt the data of the given inode as +# a hex string. The nonce was randomly generated by the filesystem and isn't +# exposed directly to userspace. But it can be read using the filesystem's +# debugging tools. +_get_encryption_data_nonce() +{ + local device=$1 + local inode=$2 + + case $FSTYP in + ext4|f2fs) + _get_encryption_file_nonce $device $inode + ;; + *) + _fail "_get_encryption_data_nonce() isn't implemented on $FSTYP" + ;; + esac +} + +# Require support for _get_encryption_*nonce() _require_get_encryption_nonce_support() { - echo "Checking for _get_encryption_nonce() support for $FSTYP" >> $seqres.full + echo "Checking for _get_encryption_*nonce() support for $FSTYP" >> $seqres.full case $FSTYP in ext4) _require_command "$DEBUGFS_PROG" debugfs @@ -554,7 +573,7 @@ _require_get_encryption_nonce_support() # the test fail in that case, as it was an f2fs-tools bug... 
;; *) - _notrun "_get_encryption_nonce() isn't implemented on $FSTYP" + _notrun "_get_encryption_*nonce() isn't implemented on $FSTYP" ;; esac } @@ -760,7 +779,7 @@ _do_verify_ciphertext_for_encryption_policy() echo "Verifying encrypted file contents" >> $seqres.full for f in "${test_contents_files[@]}"; do read -r src inode blocklist <<< "$f" - nonce=$(_get_encryption_nonce $SCRATCH_DEV $inode) + nonce=$(_get_encryption_data_nonce $SCRATCH_DEV $inode) _dump_ciphertext_blocks $SCRATCH_DEV $blocklist > $tmp.actual_contents $crypt_contents_cmd $contents_encryption_mode $raw_key_hex \ --file-nonce=$nonce --block-size=$blocksize \ @@ -780,7 +799,7 @@ _do_verify_ciphertext_for_encryption_policy() echo "Verifying encrypted file names" >> $seqres.full for f in "${test_filenames_files[@]}"; do read -r name inode dir_inode padding <<< "$f" - nonce=$(_get_encryption_nonce $SCRATCH_DEV $dir_inode) + nonce=$(_get_encryption_file_nonce $SCRATCH_DEV $dir_inode) _get_ciphertext_filename $SCRATCH_DEV $inode $dir_inode \ > $tmp.actual_name echo -n "$name" | \ diff --git a/tests/f2fs/002 b/tests/f2fs/002 index 8235d88a..a51ddf22 100755 --- a/tests/f2fs/002 +++ b/tests/f2fs/002 @@ -129,7 +129,7 @@ blocklist=$(_get_ciphertext_block_list $file) _scratch_unmount echo -e "\n# Getting file's encryption nonce" -nonce=$(_get_encryption_nonce $SCRATCH_DEV $inode) +nonce=$(_get_encryption_data_nonce $SCRATCH_DEV $inode) echo -e "\n# Dumping the file's raw data" _dump_ciphertext_blocks $SCRATCH_DEV $blocklist > $tmp.raw diff --git a/tests/generic/613 b/tests/generic/613 index 4cf5ccc6..47c60e9c 100755 --- a/tests/generic/613 +++ b/tests/generic/613 @@ -68,10 +68,10 @@ echo -e "\n# Getting encryption nonces from inodes" echo -n > $tmp.nonces_hex echo -n > $tmp.nonces_bin for inode in "${inodes[@]}"; do - nonce=$(_get_encryption_nonce $SCRATCH_DEV $inode) + nonce=$(_get_encryption_data_nonce $SCRATCH_DEV $inode) if (( ${#nonce} != 32 )) || [ -n "$(echo "$nonce" | tr -d 0-9a-fA-F)" ] then - _fail 
"Expected nonce to be 16 bytes (32 hex characters), but got \"$nonce\"" + _fail "Expected nonce for inode $inode to be 16 bytes (32 hex characters), but got \"$nonce\"" fi echo $nonce >> $tmp.nonces_hex echo -ne "$(echo $nonce | sed 's/[0-9a-fA-F]\{2\}/\\x\0/g')" \ -
Master key removal semantics
Hello,

While getting the fstests stuff nailed down to deal with btrfs I ran into failures with generic/595, specifically the multi-threaded part.

In one thread we have a loop adding and removing the master key. In the other thread we try to echo a character into a file in the encrypted side, and if it succeeds we echo a character into a temporary file, and then after the runtime has elapsed we compare these two files to make sure they match.

The problem with this is that btrfs derives the per-extent key from the master key at writeout time. Everybody else has their content key derived at file open time, so they don't need the master key to be around once the file is opened, so any writes that occur while that file is held open are allowed to happen.

Sweet Tea had some changes around soft unloading the master key to handle this case. Basically we allow the master key to stick around for anybody who currently has a file open and may still need it, and then any new users get denied. Once all the outstanding open files are closed the master key is unloaded. This keeps the semantics of what happens for everybody else.

What is currently happening with my version of the patchset, which didn't bring in those patches, is that you get an ENOKEY at writeout time if you remove the key. The fstest fails because even though we let you write to the file sometimes, it doesn't necessarily mean it'll make it to disk.

If we want to keep the semantics of "when userspace tells us to throw away the master key, we absolutely throw the master key away" then I can just make adjustments to the fstests test and call what I have good enough.

If we want to have the semantics of "when userspace tells us to throw away the master key we'll make it unavailable for any new users, but existing open files operate as normal" then I can pull in Sweet Tea's soft removal patches and call it good enough.

There's a third option that is a bit of a middle ground with varying degrees of raciness.
We could sync the file system before we do the removal just to narrow the window where we could successfully write to a file but get an ENOKEY at writeout time. We could freeze the filesystem to make sure it's sync'ed and allow any current writers to complete; this would be a stronger version of the first option, but again it just narrows the window. Neither of these cases helps if the file is being held open.

If we wanted to fully deal with the file being held open case we could set a flag, sync, then remove the key. Then we add a new fscrypt_prep_write() hook that filesystems could optionally use, obviously just btrfs for now, that we'd stick in the write path that would check for this flag or if the master key had been removed so we can deny dirtying when the key is removed.

At this point I don't have strong opinions, it's easier for me to just leave it like it is and change fstests. Anything else is a change in the semantics of how the master key is handled, and that's not really a decision I feel comfortable making for everybody.

Once we nail this detail down I can send the updated version of all the patches and we can start talking about inclusion. Thanks,

Josef
[PATCH 35/35] btrfs: implement process_bio cb for fscrypt
We are going to be checksumming the encrypted data, so we have to implement the ->process_bio fscrypt callback. This will provide us with the original bio and the encrypted bio to do work on. For writes this will happen after the encrypted bio has been encrypted. For reads this will happen after the read has completed and before the decryption step is done.

For writes this is straightforward, we can just pass in the encrypted bio to btrfs_csum_one_bio and then the csums will be added to the bbio as normal.

For reads this is relatively straightforward, but requires some care. We assume (because that's how it works currently) that the encrypted bio matches the original bio. This is important because we save the iter of the bio before we submit. If this changes in the future we'll need a hook to give us the bi_iter of the decryption bio before it's submitted. We check the csums before decryption. If they don't match we simply error out and we let the normal path handle the repair work.

Signed-off-by: Josef Bacik
---
 fs/btrfs/bio.c     | 34 +-
 fs/btrfs/bio.h     |  3 +++
 fs/btrfs/fscrypt.c | 19 +++
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 7d6931e53beb..27ebf6373c8f 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -280,6 +280,34 @@ static struct btrfs_failed_bio *repair_one_sector(struct btrfs_bio *failed_bbio,
 	return fbio;
 }
 
+blk_status_t btrfs_check_encrypted_read_bio(struct btrfs_bio *bbio,
+					    struct bio *enc_bio)
+{
+	struct btrfs_inode *inode = bbio->inode;
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
+	u32 sectorsize = fs_info->sectorsize;
+	struct bvec_iter iter = bbio->saved_iter;
+	struct btrfs_device *dev = bbio->bio.bi_private;
+	u32 offset = 0;
+
+	/*
+	 * We have to use a copy of iter in case there's an error,
+	 * btrfs_check_read_bio will handle submitting the repair bios.
+	 */
+	while (iter.bi_size) {
+		struct bio_vec bv = bio_iter_iovec(enc_bio, iter);
+
+		bv.bv_len = min(bv.bv_len, sectorsize);
+		if (!btrfs_data_csum_ok(bbio, dev, offset, &bv))
+			return BLK_STS_IOERR;
+		bio_advance_iter_single(enc_bio, &iter, sectorsize);
+		offset += sectorsize;
+	}
+
+	bbio->csum_done = true;
+	return BLK_STS_OK;
+}
+
 static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *dev)
 {
 	struct btrfs_inode *inode = bbio->inode;
@@ -305,6 +333,10 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de
 	/* Clear the I/O error. A failed repair will reset it. */
 	bbio->bio.bi_status = BLK_STS_OK;
 
+	/* This was an encrypted bio and we've already done the csum check. */
+	if (status == BLK_STS_OK && bbio->csum_done)
+		goto out;
+
 	while (iter->bi_size) {
 		struct bio_vec bv = bio_iter_iovec(&bbio->bio, *iter);
 
@@ -315,7 +347,7 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de
 		bio_advance_iter_single(&bbio->bio, iter, sectorsize);
 		offset += sectorsize;
 	}
-
+out:
 	if (bbio->csum != bbio->csum_inline)
 		kfree(bbio->csum);
 
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index 5d3f53dcd6d5..393ef32f5321 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -45,6 +45,7 @@ struct btrfs_bio {
 		struct {
 			u8 *csum;
 			u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE];
+			bool csum_done;
 			struct bvec_iter saved_iter;
 		};
 
@@ -110,5 +111,7 @@ void btrfs_submit_repair_write(struct btrfs_bio *bbio, int mirror_num, bool dev_
 int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
 			    u64 length, u64 logical, struct page *page,
 			    unsigned int pg_offset, int mirror_num);
+blk_status_t btrfs_check_encrypted_read_bio(struct btrfs_bio *bbio,
+					    struct bio *enc_bio);
 
 #endif
diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c
index 1c690fcd0693..b2dfc26221e7 100644
--- a/fs/btrfs/fscrypt.c
+++ b/fs/btrfs/fscrypt.c
@@ -15,6 +15,7 @@
 #include "transaction.h"
 #include "volumes.h"
 #include "xattr.h"
+#include "file-item.h"
 
 /*
  * From a given location in a leaf, read a name into a qstr (usually a
@@ -214,6 +215,23 @@ static struct block_device **btrfs_fscrypt_get_devices(struct super_block *sb,
 	return devs;
 }
 
+static blk_status_t btrfs_process_encrypted_bio(str
[PATCH 34/35] btrfs: add orig_logical to btrfs_bio
When checksumming the encrypted bio on writes we need to know which logical address this checksum is for. At the point where we get the encrypted bio the bi_sector is the physical location on the target disk, so we need to save the original logical offset in the btrfs_bio. Then we can use this when csum'ing the bio instead of the bio->bi_iter.bi_sector.

Signed-off-by: Josef Bacik
---
 fs/btrfs/bio.c       | 9 +
 fs/btrfs/bio.h       | 3 +++
 fs/btrfs/file-item.c | 2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 90e4d4709fa3..7d6931e53beb 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -96,6 +96,7 @@ static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info,
 	if (bbio_has_ordered_extent(bbio)) {
 		refcount_inc(&orig_bbio->ordered->refs);
 		bbio->ordered = orig_bbio->ordered;
+		orig_bbio->orig_logical += map_length;
 	}
 	atomic_inc(&orig_bbio->pending_ios);
 	return bbio;
@@ -674,6 +675,14 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 		goto fail;
 	}
 
+	/*
+	 * For fscrypt writes we will get the encrypted bio after we've remapped
+	 * our bio to the physical disk location, so we need to save the
+	 * original bytenr so we know what we're checksumming.
+	 */
+	if (bio_op(bio) == REQ_OP_WRITE && is_data_bbio(bbio))
+		bbio->orig_logical = logical;
+
 	map_length = min(map_length, length);
 	if (use_append)
 		map_length = min(map_length, fs_info->max_zone_append_size);
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index ca79decee060..5d3f53dcd6d5 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -54,11 +54,14 @@ struct btrfs_bio {
 		 * - pointer to the checksums for this bio
 		 * - original physical address from the allocator
 		 *   (for zone append only)
+		 * - original logical address, used for checksumming fscrypt
+		 *   bios.
 		 */
 		struct {
 			struct btrfs_ordered_extent *ordered;
 			struct btrfs_ordered_sum *sums;
 			u64 orig_physical;
+			u64 orig_logical;
 		};
 
 		/* For metadata reads: parentness verification. */
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index d925d6d98bf4..26e3bc602655 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -756,7 +756,7 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio)
 	sums->len = bio->bi_iter.bi_size;
 	INIT_LIST_HEAD(&sums->list);
 
-	sums->logical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+	sums->logical = bbio->orig_logical;
 	index = 0;
 
 	shash->tfm = fs_info->csum_shash;
--
2.41.0
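The reason for saving the logical address can be shown with plain arithmetic. This is only an illustrative sketch in shell, not kernel code: every address below is made up, and the variable names merely echo the fields touched by the patch.

```shell
#!/bin/bash
# All numbers below are made up for illustration.
SECTOR_SHIFT=9                      # 512-byte sectors

logical=$((1024 * 1024 * 1024))     # logical byte address of the write
physical=$((13 * 1024 * 1024))      # where the mapping placed it on disk

# After remapping, the bio's bi_sector describes the physical location.
bi_sector=$((physical >> SECTOR_SHIFT))

# Old computation: derives the address from the (now physical) bi_sector.
from_bi_sector=$((bi_sector << SECTOR_SHIFT))

# New computation: uses the logical address saved before remapping.
orig_logical=$logical

echo "from bi_sector: $from_bi_sector"
echo "orig_logical:   $orig_logical"
```

Once the bio has been remapped, shifting bi_sector only recovers the physical byte offset, so a checksum keyed on it would be filed under the wrong address. The saved value keeps it tied to the logical one.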
[PATCH 33/35] btrfs: add a bio argument to btrfs_csum_one_bio
We only ever needed the bbio in btrfs_csum_one_bio, since that has the bio embedded in it. However with encryption we'll have a different bio with the encrypted data in it, and the original bbio. Update btrfs_csum_one_bio to take the bio we're going to csum as an argument, which will allow us to csum the encrypted bio and stuff the csums into the corresponding bbio to be used later when the IO completes.

Signed-off-by: Josef Bacik
---
 fs/btrfs/bio.c       | 2 +-
 fs/btrfs/file-item.c | 3 +--
 fs/btrfs/file-item.h | 2 +-
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 4f3b693a16b1..90e4d4709fa3 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -533,7 +533,7 @@ static blk_status_t btrfs_bio_csum(struct btrfs_bio *bbio)
 {
 	if (bbio->bio.bi_opf & REQ_META)
 		return btree_csum_one_bio(bbio);
-	return btrfs_csum_one_bio(bbio);
+	return btrfs_csum_one_bio(bbio, &bbio->bio);
 }
 
 /*
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 35036fab58c4..d925d6d98bf4 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -730,13 +730,12 @@ int btrfs_lookup_csums_bitmap(struct btrfs_root *root, struct btrfs_path *path,
 /*
  * Calculate checksums of the data contained inside a bio.
  */
-blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio)
+blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio)
 {
 	struct btrfs_ordered_extent *ordered = bbio->ordered;
 	struct btrfs_inode *inode = bbio->inode;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
-	struct bio *bio = &bbio->bio;
 	struct btrfs_ordered_sum *sums;
 	char *data;
 	struct bvec_iter iter;
diff --git a/fs/btrfs/file-item.h b/fs/btrfs/file-item.h
index bb79014024bd..e52d5d71d533 100644
--- a/fs/btrfs/file-item.h
+++ b/fs/btrfs/file-item.h
@@ -51,7 +51,7 @@ int btrfs_lookup_file_extent(struct btrfs_trans_handle *trans,
 int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
 			   struct btrfs_root *root,
 			   struct btrfs_ordered_sum *sums);
-blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio);
+blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio, struct bio *bio);
 blk_status_t btrfs_alloc_dummy_sum(struct btrfs_bio *bbio);
 int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
 			     struct list_head *list, int search_commit,
--
2.41.0
[PATCH 32/35] btrfs: set the bio fscrypt context when applicable
Now that we have the fscrypt_info plumbed through everywhere, add the code to setup the bio encryption context from the extent context. We use the per-extent fscrypt_extent_info for encryption/decryption. We use the offset into the extent as the lblk for fscrypt. So the start of the extent has the lblk of 0, 4k into the extent has the lblk of 4k, etc. This is done to allow things like relocation to continue to work properly. Signed-off-by: Josef Bacik --- fs/btrfs/compression.c | 6 fs/btrfs/extent_io.c | 63 +- fs/btrfs/fscrypt.c | 36 fs/btrfs/fscrypt.h | 22 +++ fs/btrfs/inode.c | 10 +++ 5 files changed, 136 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 19b22b4653c8..3f586ee40b94 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -36,6 +36,7 @@ #include "zoned.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" static struct bio_set btrfs_compressed_bioset; @@ -301,6 +302,9 @@ void btrfs_submit_compressed_write(struct btrfs_ordered_extent *ordered, cb->bbio.ordered = ordered; btrfs_add_compressed_bio_pages(cb); + btrfs_set_bio_crypt_ctx_from_extent(&cb->bbio.bio, inode, + ordered->fscrypt_info, 0); + btrfs_submit_bio(&cb->bbio, 0); } @@ -504,6 +508,8 @@ void btrfs_submit_compressed_read(struct btrfs_bio *bbio) cb->compress_type = em->compress_type; cb->orig_bbio = bbio; + btrfs_set_bio_crypt_ctx_from_extent(&cb->bbio.bio, inode, + em->fscrypt_info, 0); free_extent_map(em); cb->nr_pages = DIV_ROUND_UP(compressed_len, PAGE_SIZE); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 121746b7ce95..49f6bbe3b75b 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -37,6 +37,7 @@ #include "dev-replace.h" #include "super.h" #include "transaction.h" +#include "fscrypt.h" static struct kmem_cache *extent_buffer_cache; @@ -103,6 +104,10 @@ struct btrfs_bio_ctrl { blk_opf_t opf; btrfs_bio_end_io_t end_io_func; struct writeback_control *wbc; + + /* This is set for reads and we 
have encryption. */ + struct fscrypt_extent_info *fscrypt_info; + u64 orig_start; }; static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl) @@ -707,10 +712,31 @@ static bool btrfs_bio_is_contig(struct btrfs_bio_ctrl *bio_ctrl, struct page *page, u64 disk_bytenr, unsigned int pg_offset) { - struct bio *bio = &bio_ctrl->bbio->bio; + struct inode *inode = page->mapping->host; + struct btrfs_bio *bbio = bio_ctrl->bbio; + struct bio *bio = &bbio->bio; struct bio_vec *bvec = bio_last_bvec_all(bio); const sector_t sector = disk_bytenr >> SECTOR_SHIFT; + if (IS_ENCRYPTED(inode)) { + u64 file_offset = page_offset(page) + pg_offset; + u64 offset = 0; + struct fscrypt_extent_info *fscrypt_info = NULL; + + /* bio_ctrl->fscrypt_info is only set in the READ case. */ + if (bio_ctrl->fscrypt_info) { + offset = file_offset - bio_ctrl->orig_start; + fscrypt_info = bio_ctrl->fscrypt_info; + } else if (bbio->ordered) { + fscrypt_info = bbio->ordered->fscrypt_info; + offset = file_offset - bbio->ordered->orig_offset; + } + + if (!btrfs_mergeable_encrypted_bio(bio, inode, fscrypt_info, + offset)) + return false; + } + if (bio_ctrl->compress_type != BTRFS_COMPRESS_NONE) { /* * For compression, all IO should have its logical bytenr set @@ -741,6 +767,8 @@ static void alloc_new_bio(struct btrfs_inode *inode, { struct btrfs_fs_info *fs_info = inode->root->fs_info; struct btrfs_bio *bbio; + struct fscrypt_extent_info *fscrypt_info = NULL; + u64 offset = 0; bbio = btrfs_bio_alloc(BIO_MAX_VECS, bio_ctrl->opf, fs_info, bio_ctrl->end_io_func, NULL); @@ -760,6 +788,8 @@ static void alloc_new_bio(struct btrfs_inode *inode, ordered->file_offset + ordered->disk_num_bytes - file_offset); bbio->ordered = ordered; + fscrypt_info = ordered->fscrypt_info; + offset = file_offset - ordered->orig_offset;
[PATCH 30/35] btrfs: setup fscrypt_extent_info for new extents
New extents for encrypted inodes must have a fscrypt_extent_info, which has the necessary keys and does all the registration at the block layer for them. This is passed through all of the infrastructure we've previously added to make sure the context gets saved properly with the file extents. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c | 39 +-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 9414991d6b6b..aa536b838ce3 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7398,7 +7398,20 @@ static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start, set_bit(EXTENT_FLAG_COMPRESSED, &em->flags); em->compress_type = compress_type; } - em->encryption_type = BTRFS_ENCRYPTION_NONE; + + if (IS_ENCRYPTED(&inode->vfs_inode)) { + struct fscrypt_extent_info *fscrypt_info; + + em->encryption_type = BTRFS_ENCRYPTION_FSCRYPT; + fscrypt_info = fscrypt_prepare_new_extent(&inode->vfs_inode); + if (IS_ERR(fscrypt_info)) { + free_extent_map(em); + return ERR_CAST(fscrypt_info); + } + em->fscrypt_info = fscrypt_info; + } else { + em->encryption_type = BTRFS_ENCRYPTION_NONE; + } ret = btrfs_replace_extent_map_range(inode, em, true); if (ret) { @@ -9785,6 +9798,9 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, if (trans) own_trans = false; while (num_bytes > 0) { + struct fscrypt_extent_info *fscrypt_info = NULL; + int encryption_type = BTRFS_ENCRYPTION_NONE; + cur_bytes = min_t(u64, num_bytes, SZ_256M); cur_bytes = max(cur_bytes, min_size); /* @@ -9799,6 +9815,20 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, if (ret) break; + if (IS_ENCRYPTED(inode)) { + fscrypt_info = fscrypt_prepare_new_extent(inode); + if (IS_ERR(fscrypt_info)) { + btrfs_dec_block_group_reservations(fs_info, + ins.objectid); + btrfs_free_reserved_extent(fs_info, + ins.objectid, + ins.offset, 0); + ret = PTR_ERR(fscrypt_info); + break; + } + encryption_type = BTRFS_ENCRYPTION_FSCRYPT; + } 
+ /* * We've reserved this space, and thus converted it from * ->bytes_may_use to ->bytes_reserved. Any error that happens @@ -9810,7 +9840,8 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, last_alloc = ins.offset; trans = insert_prealloc_file_extent(trans, BTRFS_I(inode), - &ins, NULL, cur_offset); + &ins, fscrypt_info, + cur_offset); /* * Now that we inserted the prealloc extent we can finally * decrement the number of reservations in the block group. @@ -9820,6 +9851,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, btrfs_dec_block_group_reservations(fs_info, ins.objectid); if (IS_ERR(trans)) { ret = PTR_ERR(trans); + fscrypt_put_extent_info(fscrypt_info); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0); break; @@ -9827,6 +9859,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, em = alloc_extent_map(); if (!em) { + fscrypt_put_extent_info(fscrypt_info); btrfs_drop_extent_map_range(BTRFS_I(inode), cur_offset, cur_offset + ins.offset - 1, false); btrfs_set_inode_full_sync(BTRFS_I(inode)); @@ -9842,6 +9875,8 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, em->ram_bytes = ins.offset; set_bit(EXTENT_FLAG_PREALLOC, &em->flags); em->generation = trans->transid; + em->fscrypt_info = fscrypt_info; + em->encryption_type = encryption_
[PATCH 31/35] btrfs: populate ordered_extent with the orig offset
For extent encryption we have to use a logical block nr as input for the IV. For btrfs we're using the offset into the extent we're operating on. For most ordered extents this is the same as the file_offset, however for prealloc and NOCOW we have to use the original offset. Add this as an argument and plumb it through everywhere, this will be used when setting up the bio. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c| 15 ++- fs/btrfs/ordered-data.c | 22 -- fs/btrfs/ordered-data.h | 12 +--- 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index aa536b838ce3..14420683651a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1165,6 +1165,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, /* file_offset */ + start, /* orig_start */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ ins.objectid,/* disk_bytenr */ @@ -1428,8 +1429,8 @@ static noinline int cow_file_range(struct btrfs_inode *inode, } ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, - start, ram_size, ram_size, ins.objectid, - cur_alloc_size, 0, + start, start, ram_size, ram_size, + ins.objectid, cur_alloc_size, 0, 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); free_extent_map(em); @@ -2178,7 +2179,9 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, } ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, - cur_offset, nocow_args.num_bytes, + cur_offset, + found_key.offset - nocow_args.extent_offset, + nocow_args.num_bytes, nocow_args.num_bytes, nocow_args.disk_bytenr, nocow_args.num_bytes, 0, is_prealloc @@ -7088,8 +7091,9 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode, fscrypt_info = orig_em->fscrypt_info; } - ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, start, len, -len, block_start, block_len, 0, + ordered = btrfs_alloc_ordered_extent(inode, 
fscrypt_info, start, +orig_start, len, len, block_start, +block_len, 0, (1 << type) | (1 << BTRFS_ORDERED_DIRECT), BTRFS_COMPRESS_NONE); @@ -10612,6 +10616,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, } ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, + start - encoded->unencoded_offset, num_bytes, ram_bytes, ins.objectid, ins.offset, encoded->unencoded_offset, (1 << BTRFS_ORDERED_ENCODED) | diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 81b0fe575011..172a6ca38987 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -149,9 +149,9 @@ static inline struct rb_node *tree_search(struct btrfs_ordered_inode_tree *tree, static struct btrfs_ordered_extent *alloc_ordered_extent( struct btrfs_inode *inode, struct fscrypt_extent_info *fscrypt_info, - u64 file_offset, u64 num_bytes, u64 ram_bytes, - u64 disk_bytenr, u64 disk_num_bytes, u64 offset, - unsigned long flags, int compress_type) + u64 file_offset, u64 orig_offset, u64 num_bytes, + u64 ram_bytes, u64 disk_bytenr, u64 disk_num_bytes, + u64 offset, unsigned long flags, int compress_type) { struct btrfs_ordered_extent *entry; int ret; @@ -176,6 +176,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent( return ERR_PTR(-ENOMEM); entry->file_offset = file_offset; + entry->orig_offset = orig_offset; en
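As a reading aid, the offset arithmetic described in this patch and the previous one can be sketched in a few lines of userspace C. The helper names (toy_orig_offset, toy_crypt_offset) are made up for illustration; the only things taken from the patches are the two subtractions, found_key.offset - nocow_args.extent_offset for the original offset and file_offset - orig_offset for the value handed to fscrypt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * File offset at which the on-disk extent begins: the key offset of the file
 * extent item minus how far into the extent that item starts. For a plain COW
 * write the ordered extent's file_offset is passed directly instead.
 */
static uint64_t toy_orig_offset(uint64_t key_offset, uint64_t extent_offset)
{
	assert(key_offset >= extent_offset);
	return key_offset - extent_offset;
}

/* Offset into the extent, used as the per-extent position for the IV. */
static uint64_t toy_crypt_offset(uint64_t file_offset, uint64_t orig_offset)
{
	assert(file_offset >= orig_offset);
	return file_offset - orig_offset;
}
```

With this, a write at the very start of a freshly allocated extent gets offset 0, while a NOCOW write into the middle of an existing extent gets the same offset the data at that position had when the extent was first written, which is what lets the extent be moved (e.g. by relocation) without re-encrypting.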
[PATCH 29/35] btrfs: implement the fscrypt extent encryption hooks
This patch implements the necessary hooks from fscrypt to support per-extent encryption. There's two main entry points btrfs_fscrypt_load_extent_info btrfs_fscrypt_save_extent_info btrfs_fscrypt_load_extent_info gets called when we create the extent maps from the file extent item at btrfs_get_extent() time. We read the extent context, and pass it into fscrypt to create the appropriate fscrypt_extent_info structure. This is then used on the bio's to make sure the encryption is done properly. btrfs_fscrypt_save_extent_info is used to generate the fscrypt context from fscrypt and save it into the file extent item when we create a new file extent item. Signed-off-by: Josef Bacik --- fs/btrfs/defrag.c| 10 - fs/btrfs/file-item.c | 11 +- fs/btrfs/file-item.h | 5 - fs/btrfs/file.c | 9 + fs/btrfs/fscrypt.c | 48 fs/btrfs/fscrypt.h | 31 fs/btrfs/inode.c | 22 +++- fs/btrfs/tree-log.c | 10 + 8 files changed, 142 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c index dde70f358d6f..ec438961eedb 100644 --- a/fs/btrfs/defrag.c +++ b/fs/btrfs/defrag.c @@ -16,6 +16,7 @@ #include "defrag.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" static struct kmem_cache *btrfs_inode_defrag_cachep; @@ -526,9 +527,12 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, struct btrfs_path path = { 0 }; struct extent_map *em; struct btrfs_key key; + struct btrfs_fscrypt_ctx ctx; u64 ino = btrfs_ino(inode); int ret; + ctx.size = 0; + em = alloc_extent_map(); if (!em) { ret = -ENOMEM; @@ -623,7 +627,7 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, goto next; /* Now this extent covers @start, convert it to em */ - btrfs_extent_item_to_extent_map(inode, &path, fi, em); + btrfs_extent_item_to_extent_map(inode, &path, fi, em, &ctx); break; next: ret = btrfs_next_item(root, &path); @@ -633,6 +637,10 @@ static struct extent_map *defrag_get_extent(struct btrfs_inode *inode, goto not_found; } 
btrfs_release_path(&path); + + ret = btrfs_fscrypt_load_extent_info(inode, em, &ctx); + if (ret) + goto err; return em; not_found: diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index 26f35c1baedc..35036fab58c4 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -21,6 +21,7 @@ #include "accessors.h" #include "file-item.h" #include "super.h" +#include "fscrypt.h" #define __MAX_CSUM_ITEMS(r, size) ((unsigned long)(((BTRFS_LEAF_DATA_SIZE(r) - \ sizeof(struct btrfs_item) * 2) / \ @@ -1264,7 +1265,8 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans, void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, const struct btrfs_path *path, struct btrfs_file_extent_item *fi, -struct extent_map *em) +struct extent_map *em, +struct btrfs_fscrypt_ctx *ctx) { struct btrfs_fs_info *fs_info = inode->root->fs_info; struct btrfs_root *root = inode->root; @@ -1306,6 +1308,13 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, set_bit(EXTENT_FLAG_PREALLOC, &em->flags); } em->encryption_type = btrfs_file_extent_encryption(leaf, fi); + if (em->encryption_type != BTRFS_ENCRYPTION_NONE) { + ctx->size = + btrfs_file_extent_encryption_ctx_size(leaf, fi); + read_extent_buffer(leaf, ctx->ctx, + btrfs_file_extent_encryption_ctx_offset(fi), + ctx->size); + } } else if (type == BTRFS_FILE_EXTENT_INLINE) { em->block_start = EXTENT_MAP_INLINE; em->start = extent_start; diff --git a/fs/btrfs/file-item.h b/fs/btrfs/file-item.h index 04bd2d34efb1..bb79014024bd 100644 --- a/fs/btrfs/file-item.h +++ b/fs/btrfs/file-item.h @@ -5,6 +5,8 @@ #include "accessors.h" +struct btrfs_fscrypt_ctx; + #define BTRFS_FILE_EXTENT_INLINE_DATA_START\ (offsetof(struct btrfs_file_extent_item, disk_bytenr)) @@ -63,7 +65,8 @@ int btrfs_lookup_csums_bitmap(struct btrfs_root *root, struct btrfs_path *path, void btrfs_ext
[PATCH 28/35] btrfs: pass the fscrypt_info through the replace extent infrastructure
Prealloc uses the btrfs_replace_file_extents() infrastructure to insert
its new extents.  We need to set the fscrypt context on these extents,
so pass this through the btrfs_replace_extent_info so it can be used in
a later patch when we hook in this infrastructure.

Signed-off-by: Josef Bacik
---
 fs/btrfs/ctree.h | 2 ++
 fs/btrfs/inode.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5b0cccdab92a..b4437c1a9f22 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -341,6 +341,8 @@ struct btrfs_replace_extent_info {
 	char *extent_buf;
 	/* The length of @extent_buf */
 	u32 extent_buf_size;
+	/* The fscrypt_extent_info for a new extent. */
+	struct fscrypt_extent_info *fscrypt_info;
 	/*
 	 * Set to true when attempting to replace a file range with a new extent
 	 * described by this structure, set to false when attempting to clone an
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index fdb7c9e1c210..ee1ac2718ce3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9714,6 +9714,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	extent_info.update_times = true;
 	extent_info.qgroup_reserved = qgroup_released;
 	extent_info.insertions = 0;
+	extent_info.fscrypt_info = fscrypt_info;
 
 	path = btrfs_alloc_path();
 	if (!path) {
-- 
2.41.0
[PATCH 26/35] btrfs: explicitly track file extent length for replace and drop
From: Sweet Tea Dorminy With the advent of storing fscrypt contexts with each encrypted extent, extents will have a variable length depending on encryption status. Make sure the replace and drop file extent item helpers encode this information so that everything gets updated properly. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/ctree.h| 2 ++ fs/btrfs/file.c | 4 ++-- fs/btrfs/inode.c| 7 +-- fs/btrfs/reflink.c | 1 + fs/btrfs/tree-log.c | 5 +++-- 5 files changed, 13 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2758fbae7e39..5b0cccdab92a 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -339,6 +339,8 @@ struct btrfs_replace_extent_info { u64 file_offset; /* Pointer to a file extent item of type regular or prealloc. */ char *extent_buf; + /* The length of @extent_buf */ + u32 extent_buf_size; /* * Set to true when attempting to replace a file range with a new extent * described by this structure, set to false when attempting to clone an diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 8dcc5ae9c9e1..70a801b90d13 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2262,14 +2262,14 @@ static int btrfs_insert_replace_extent(struct btrfs_trans_handle *trans, key.type = BTRFS_EXTENT_DATA_KEY; key.offset = extent_info->file_offset; ret = btrfs_insert_empty_item(trans, root, path, &key, - sizeof(struct btrfs_file_extent_item)); + extent_info->extent_buf_size); if (ret) return ret; leaf = path->nodes[0]; slot = path->slots[0]; write_extent_buffer(leaf, extent_info->extent_buf, btrfs_item_ptr_offset(leaf, slot), - sizeof(struct btrfs_file_extent_item)); + extent_info->extent_buf_size); extent = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); ASSERT(btrfs_file_extent_type(leaf, extent) != BTRFS_FILE_EXTENT_INLINE); btrfs_set_file_extent_offset(leaf, extent, extent_info->data_offset); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 89cb09a40f58..6a835967684d 100644 --- 
a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2899,6 +2899,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, u64 num_bytes = btrfs_stack_file_extent_num_bytes(stack_fi); u64 ram_bytes = btrfs_stack_file_extent_ram_bytes(stack_fi); struct btrfs_drop_extents_args drop_args = { 0 }; + size_t fscrypt_context_size = 0; int ret; path = btrfs_alloc_path(); @@ -2918,7 +2919,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, drop_args.start = file_pos; drop_args.end = file_pos + num_bytes; drop_args.replace_extent = true; - drop_args.extent_item_size = sizeof(*stack_fi); + drop_args.extent_item_size = sizeof(*stack_fi) + fscrypt_context_size; ret = btrfs_drop_extents(trans, root, inode, &drop_args); if (ret) goto out; @@ -2929,7 +2930,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, ins.type = BTRFS_EXTENT_DATA_KEY; ret = btrfs_insert_empty_item(trans, root, path, &ins, - sizeof(*stack_fi)); + sizeof(*stack_fi) + fscrypt_context_size); if (ret) goto out; } @@ -9671,6 +9672,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( u64 len = ins->offset; int qgroup_released; int ret; + size_t fscrypt_context_size = 0; memset(&stack_fi, 0, sizeof(stack_fi)); @@ -9703,6 +9705,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( extent_info.data_len = len; extent_info.file_offset = file_offset; extent_info.extent_buf = (char *)&stack_fi; + extent_info.extent_buf_size = sizeof(stack_fi) + fscrypt_context_size; extent_info.is_new_extent = true; extent_info.update_times = true; extent_info.qgroup_reserved = qgroup_released; diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index 3c66630d87ee..f5440ae447a4 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -500,6 +500,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode, clone_info.data_len = datal; clone_info.file_offset = new_key.offset; clone_info.extent_buf = buf; + clone_info.extent_buf_size = 
size; clone_info.is_new_extent = false; clone_info.update_times = !no_time_update;
[PATCH 25/35] btrfs: add an optional encryption context to the end of file extents
The fscrypt encryption context can be extended to include different things in the future. To facilitate future expansion add an optional btrfs_encryption_info to the end of the file extent. This will hold the size of the context and then will have the binary context tacked onto the end of the extent item. Add the appropriate accessors to make it easy to read this information if we have encryption set, and then update the tree-checker to validate that if this is indeed set properly that the size matches properly. Signed-off-by: Josef Bacik --- fs/btrfs/accessors.h| 48 +++ fs/btrfs/tree-checker.c | 58 - include/uapi/linux/btrfs_tree.h | 17 +- 3 files changed, 113 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h index 5627f13a3d3e..3e6c81449ce7 100644 --- a/fs/btrfs/accessors.h +++ b/fs/btrfs/accessors.h @@ -934,6 +934,10 @@ BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block, BTRFS_SETGET_STACK_FUNCS(super_nr_global_roots, struct btrfs_super_block, nr_global_roots, 64); +/* struct btrfs_file_extent_encryption_info */ +BTRFS_SETGET_FUNCS(encryption_info_size, struct btrfs_encryption_info, size, + 32); + /* struct btrfs_file_extent_item */ BTRFS_SETGET_STACK_FUNCS(stack_file_extent_type, struct btrfs_file_extent_item, type, 8); @@ -975,6 +979,50 @@ BTRFS_SETGET_FUNCS(file_extent_encryption, struct btrfs_file_extent_item, BTRFS_SETGET_FUNCS(file_extent_other_encoding, struct btrfs_file_extent_item, other_encoding, 16); +static inline struct btrfs_encryption_info *btrfs_file_extent_encryption_info( + const struct btrfs_file_extent_item *ei) +{ + unsigned long offset = (unsigned long)ei; + + offset += offsetof(struct btrfs_file_extent_item, encryption_info); + return (struct btrfs_encryption_info *)offset; +} + +static inline unsigned long btrfs_file_extent_encryption_ctx_offset( + const struct btrfs_file_extent_item *ei) +{ + unsigned long offset = (unsigned long)ei; + + offset += offsetof(struct 
btrfs_file_extent_item, encryption_info); + return offset + offsetof(struct btrfs_encryption_info, context); +} + +static inline u32 btrfs_file_extent_encryption_ctx_size( + const struct extent_buffer *eb, + const struct btrfs_file_extent_item *ei) +{ + return btrfs_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei)); +} + +static inline void btrfs_set_file_extent_encryption_ctx_size( + const struct extent_buffer *eb, + struct btrfs_file_extent_item *ei, + u32 val) +{ + btrfs_set_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei), + val); +} + +static inline u32 btrfs_file_extent_encryption_info_size( + const struct extent_buffer *eb, + const struct btrfs_file_extent_item *ei) +{ + return btrfs_encryption_info_size(eb, + btrfs_file_extent_encryption_info(ei)); +} + /* btrfs_qgroup_status_item */ BTRFS_SETGET_FUNCS(qgroup_status_generation, struct btrfs_qgroup_status_item, generation, 64); diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index c2afdf65c2bf..7fe6210c243a 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -211,6 +211,7 @@ static int check_extent_data_item(struct extent_buffer *leaf, u32 item_size = btrfs_item_size(leaf, slot); u64 extent_end; u8 policy; + u8 fe_type; if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) { file_extent_err(leaf, slot, @@ -241,12 +242,12 @@ static int check_extent_data_item(struct extent_buffer *leaf, SZ_4K); return -EUCLEAN; } - if (unlikely(btrfs_file_extent_type(leaf, fi) >= -BTRFS_NR_FILE_EXTENT_TYPES)) { + + fe_type = btrfs_file_extent_type(leaf, fi); + if (unlikely(fe_type >= BTRFS_NR_FILE_EXTENT_TYPES)) { file_extent_err(leaf, slot, "invalid type for file extent, have %u expect range [0, %u]", - btrfs_file_extent_type(leaf, fi), - BTRFS_NR_FILE_EXTENT_TYPES - 1); + fe_type, BTRFS_NR_FILE_EXTENT_TYPES - 1); return -EUCLEAN; } @@ -295,12 +296,51 @@ static int check_extent_data_item(s
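To make the layout in the commit message concrete, here is a small userspace model of "fixed file extent item, then a 32-bit size, then the binary context", plus the kind of size check the tree-checker change is described as doing. The constants and helpers (TOY_FIXED_SIZE, toy_append_ctx, toy_ctx_size, toy_check_item) are assumptions made up for this sketch and are not taken from btrfs_tree.h or tree-checker.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size of the fixed part of the item; illustrative only. */
#define TOY_FIXED_SIZE 53u
/* Size of the little-endian length field that precedes the context. */
#define TOY_SIZE_FIELD 4u

/* Append "size + context" after the fixed part; returns the total item size. */
static size_t toy_append_ctx(uint8_t *item, const uint8_t *ctx, uint32_t ctx_size)
{
	uint8_t *p = item + TOY_FIXED_SIZE;

	p[0] = ctx_size & 0xff;
	p[1] = (ctx_size >> 8) & 0xff;
	p[2] = (ctx_size >> 16) & 0xff;
	p[3] = (ctx_size >> 24) & 0xff;
	memcpy(p + TOY_SIZE_FIELD, ctx, ctx_size);
	return TOY_FIXED_SIZE + TOY_SIZE_FIELD + ctx_size;
}

/* Read the stored context size back out of the item. */
static uint32_t toy_ctx_size(const uint8_t *item)
{
	const uint8_t *p = item + TOY_FIXED_SIZE;

	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the item size is consistent with its encryption state. */
static int toy_check_item(const uint8_t *item, size_t item_size, int encrypted)
{
	if (!encrypted)
		return item_size == TOY_FIXED_SIZE;
	if (item_size < TOY_FIXED_SIZE + TOY_SIZE_FIELD)
		return 0;
	return item_size == TOY_FIXED_SIZE + TOY_SIZE_FIELD + toy_ctx_size(item);
}
```

This is also why the replace/drop helpers in the next patch have to carry an explicit item length: once the context is optional, the item size is no longer a compile-time constant.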
[PATCH 27/35] btrfs: pass through fscrypt_extent_info to the file extent helpers
Now that we have the fscrypt_extent_info in all of the supporting
structures, pass this through and set the file extent encryption bit
accordingly from the supporting structures.  In subsequent patches code
will be added to populate these appropriately.

Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c    | 18 +++---
 fs/btrfs/tree-log.c | 2 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6a835967684d..fdb7c9e1c210 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2883,7 +2883,9 @@ int btrfs_writepage_cow_fixup(struct page *page)
 }
 
 static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
-				       struct btrfs_inode *inode, u64 file_pos,
+				       struct btrfs_inode *inode,
+				       struct fscrypt_extent_info *fscrypt_info,
+				       u64 file_pos,
 				       struct btrfs_file_extent_item *stack_fi,
 				       const bool update_inode_bytes,
 				       u64 qgroup_reserved)
@@ -3015,8 +3017,7 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans,
 	btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes);
 	btrfs_set_stack_file_extent_ram_bytes(&stack_fi, ram_bytes);
 	btrfs_set_stack_file_extent_compression(&stack_fi, oe->compress_type);
-	btrfs_set_stack_file_extent_encryption(&stack_fi,
-					       BTRFS_ENCRYPTION_NONE);
+	btrfs_set_stack_file_extent_encryption(&stack_fi, oe->encryption_type);
 	/* Other encoding is reserved and always 0 */
 
 	/*
@@ -3030,8 +3031,9 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans,
 			     test_bit(BTRFS_ORDERED_TRUNCATED, &oe->flags);
 
 	return insert_reserved_file_extent(trans, BTRFS_I(oe->inode),
-					   oe->file_offset, &stack_fi,
-					   update_inode_bytes, oe->qgroup_rsv);
+					   oe->fscrypt_info, oe->file_offset,
+					   &stack_fi, update_inode_bytes,
+					   oe->qgroup_rsv);
 }
 
 /*
@@ -9662,6 +9664,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 				       struct btrfs_trans_handle *trans_in,
 				       struct btrfs_inode *inode,
 				       struct btrfs_key *ins,
+				       struct fscrypt_extent_info *fscrypt_info,
 				       u64 file_offset)
 {
 	struct btrfs_file_extent_item stack_fi;
@@ -9683,6 +9686,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	btrfs_set_stack_file_extent_ram_bytes(&stack_fi, len);
 	btrfs_set_stack_file_extent_compression(&stack_fi, BTRFS_COMPRESS_NONE);
 	btrfs_set_stack_file_extent_encryption(&stack_fi,
+					       fscrypt_info ? BTRFS_ENCRYPTION_FSCRYPT :
 					       BTRFS_ENCRYPTION_NONE);
 	/* Other encoding is reserved and always 0 */
 
@@ -9691,7 +9695,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 		return ERR_PTR(qgroup_released);
 
 	if (trans) {
-		ret = insert_reserved_file_extent(trans, inode,
+		ret = insert_reserved_file_extent(trans, inode, fscrypt_info,
 						  file_offset, &stack_fi,
 						  true, qgroup_released);
 		if (ret)
@@ -9785,7 +9789,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 
 		last_alloc = ins.offset;
 		trans = insert_prealloc_file_extent(trans, BTRFS_I(inode),
-						    &ins, cur_offset);
+						    &ins, NULL, cur_offset);
 		/*
 		 * Now that we inserted the prealloc extent we can finally
 		 * decrement the number of reservations in the block group.
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index d659547c9900..40dd5c652f0e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4631,7 +4631,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 	u64 block_len;
 	int ret;
 	size_t fscrypt_context_size = 0;
-	u8 encryption = BTRFS_ENCRYPTION_NONE;
+	u8 encryption = em->encryption_type;
 
 	btrfs_set_stack_file_extent_generation(&fi, trans->transid);
 	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
-- 
2.41.0
[PATCH 20/35] btrfs: add fscrypt_info and encryption_type to extent_map
From: Sweet Tea Dorminy Each extent_map will end up with a pointer to its associated fscrypt_info if any, which should have the same lifetime as the extent_map. We are also going to need to track the encryption_type for the file extent items. Add the fscrypt_info to the extent_map, and the subsequent code for transferring it in the split and merge cases, as well as the code necessary to free them. A future patch will add the code to load them as appropriate. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/extent_map.c | 32 +--- fs/btrfs/extent_map.h | 2 ++ fs/btrfs/file-item.c | 1 + fs/btrfs/inode.c | 1 + 4 files changed, 33 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index af5ff6b10865..8c8023388758 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -61,6 +61,7 @@ struct extent_map *alloc_extent_map(void) static void __free_extent_map(struct extent_map *em) { + fscrypt_put_extent_info(em->fscrypt_info); if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) kfree(em->map_lookup); kmem_cache_free(extent_map_cache, em); @@ -103,12 +104,24 @@ void free_extent_map_safe(struct extent_map_tree *tree, if (!em) return; - if (refcount_dec_and_test(&em->refs)) { - WARN_ON(extent_map_in_tree(em)); - WARN_ON(!list_empty(&em->list)); + if (!refcount_dec_and_test(&em->refs)) + return; + + WARN_ON(extent_map_in_tree(em)); + WARN_ON(!list_empty(&em->list)); + + /* +* We could take a lock freeing the fscrypt_info, so add this to the +* list of freed_extents to be freed later. +*/ + if (em->fscrypt_info) { list_add_tail(&em->free_list, &tree->freed_extents); set_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags); + return; } + + /* Nothing scary here, just free the object. 
*/ + __free_extent_map(em); } /* @@ -274,6 +287,12 @@ static int mergable_maps(struct extent_map *prev, struct extent_map *next) if (!list_empty(&prev->list) || !list_empty(&next->list)) return 0; + /* +* Don't merge adjacent encrypted maps. +*/ + if (prev->fscrypt_info || next->fscrypt_info) + return 0; + ASSERT(next->block_start != EXTENT_MAP_DELALLOC && prev->block_start != EXTENT_MAP_DELALLOC); @@ -884,6 +903,8 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->generation = gen; split->flags = flags; split->compress_type = em->compress_type; + split->fscrypt_info = + fscrypt_get_extent_info(em->fscrypt_info); replace_extent_mapping(em_tree, em, split, modified); free_extent_map(split); split = split2; @@ -925,6 +946,8 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->orig_block_len = 0; } + split->fscrypt_info = + fscrypt_get_extent_info(em->fscrypt_info); if (extent_map_in_tree(em)) { replace_extent_mapping(em_tree, em, split, modified); @@ -1087,6 +1110,7 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_pre->flags = flags; split_pre->compress_type = em->compress_type; split_pre->generation = em->generation; + split_pre->fscrypt_info = fscrypt_get_extent_info(em->fscrypt_info); replace_extent_mapping(em_tree, em, split_pre, 1); @@ -1106,6 +1130,8 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_mid->flags = flags; split_mid->compress_type = em->compress_type; split_mid->generation = em->generation; + split_mid->fscrypt_info = fscrypt_get_extent_info(em->fscrypt_info); + add_extent_mapping(em_tree, split_mid, 1); /* Once for us */ diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index 2093720271ea..2d618e61ceb5 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -50,10 +50,12 @@ struct extent_map { */ u64 generation; unsigned long flags; + struct fscrypt_extent_info *fscrypt_info; /* 
Used for chunk mappings, flag EXTENT_FLAG_FS_MAPPING must be set */ struct map_lookup *map_lookup; refcount_t refs
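The lifetime rules in this patch can be modelled in userspace. The following is a minimal sketch, not the kernel code: `info_model`, `em_model`, `em_put` and `em_init_split` are hypothetical stand-ins for `fscrypt_extent_info`, `extent_map`, `free_extent_map_safe()` and the split paths. It assumes `fscrypt_get_extent_info()` is NULL-safe, which the unconditional calls in the hunks above suggest.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct fscrypt_extent_info: only a refcount. */
struct info_model {
	int refs;
};

/* Hypothetical stand-in for struct extent_map. */
struct em_model {
	int refs;
	struct info_model *info;
};

enum put_result {
	EM_STILL_REFERENCED,	/* other holders remain, nothing to free */
	EM_FREE_NOW,		/* no fscrypt info: free immediately */
	EM_FREE_LATER,		/* has fscrypt info: queue on freed_extents */
};

/* Models fscrypt_get_extent_info(): assumed NULL-safe, takes a reference. */
static struct info_model *info_get(struct info_model *info)
{
	if (info)
		info->refs++;
	return info;
}

/*
 * Models the decision free_extent_map_safe() makes after this patch.
 * em must be non-NULL.
 */
static enum put_result em_put(struct em_model *em)
{
	if (--em->refs > 0)
		return EM_STILL_REFERENCED;
	return em->info ? EM_FREE_LATER : EM_FREE_NOW;
}

/* Models a split: the new map shares the parent's info via a new reference. */
static void em_init_split(struct em_model *split, const struct em_model *parent)
{
	split->refs = 1;
	split->info = info_get(parent->info);
}
```

Because every split takes its own reference, dropping the parent map does not invalidate the info that the split still points at.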
[PATCH 23/35] btrfs: populate the ordered_extent with the fscrypt context
The fscrypt_extent_info will be tied to the extent_map lifetime, so it will be created when we create the IO em, or it'll already exist in the NOCOW case. Use this fscrypt_info when creating the ordered extent to make sure everything is passed through properly. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c | 62 +--- 1 file changed, 43 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 903ec2d460f5..19831291fb54 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1160,9 +1160,8 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, ret = PTR_ERR(em); goto out_free_reserve; } - free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, NULL, + ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, /* file_offset */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ @@ -1171,6 +1170,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, 0, /* offset */ 1 << BTRFS_ORDERED_COMPRESSED, async_extent->compress_type); + free_extent_map(em); if (IS_ERR(ordered)) { btrfs_drop_extent_map_range(inode, start, end, false); ret = PTR_ERR(ordered); @@ -1424,13 +1424,13 @@ static noinline int cow_file_range(struct btrfs_inode *inode, ret = PTR_ERR(em); goto out_reserve; } - free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, NULL, + ordered = btrfs_alloc_ordered_extent(inode, em->fscrypt_info, start, ram_size, ram_size, ins.objectid, cur_alloc_size, 0, 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); + free_extent_map(em); if (IS_ERR(ordered)) { ret = PTR_ERR(ordered); goto out_drop_extent_cache; @@ -2003,6 +2003,8 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, struct btrfs_key found_key; struct btrfs_file_extent_item *fi; struct extent_buffer *leaf; + struct extent_map *em = NULL; + struct fscrypt_extent_info *fscrypt_info = NULL; u64 extent_end; u64 ram_bytes; u64 nocow_end; @@ -2143,7 +2145,6 @@ static noinline int 
run_delalloc_nocow(struct btrfs_inode *inode, is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; if (is_prealloc) { u64 orig_start = found_key.offset - nocow_args.extent_offset; - struct extent_map *em; em = create_io_em(inode, cur_offset, nocow_args.num_bytes, orig_start, @@ -2157,16 +2158,32 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, ret = PTR_ERR(em); goto error; } - free_extent_map(em); + fscrypt_info = em->fscrypt_info; + } else if (IS_ENCRYPTED(&inode->vfs_inode)) { + /* +* We only want to do this lookup if we're encrypted, +* otherwise fsrypt_info will be null and we can avoid +* this lookup. +*/ + em = btrfs_get_extent(inode, NULL, 0, cur_offset, + nocow_args.num_bytes); + if (IS_ERR(em)) { + btrfs_dec_nocow_writers(nocow_bg); + ret = PTR_ERR(em); + goto error; + } + fscrypt_info = em->fscrypt_info; } - ordered = btrfs_alloc_ordered_extent(inode, NULL, cur_offset, - nocow_args.num_bytes, nocow_args.num_bytes, - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, + ordered = btrfs_alloc_ordered_extent(inode, fscrypt_info, + cur_offset, nocow_args.num_bytes, + nocow_args.num_bytes, nocow_args.disk_bytenr, + nocow_args.num_byt
[PATCH 21/35] btrfs: add fscrypt_info and encryption_type to ordered_extent
We're going to need these to update the file extent items once the
writes are complete. Add them and add the pieces necessary to assign
them and free everything.

Signed-off-by: Josef Bacik
---
 fs/btrfs/ordered-data.c | 2 ++
 fs/btrfs/ordered-data.h | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index b133ea0bc459..d33a780d9893 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -182,6 +182,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 	entry->bytes_left = num_bytes;
 	entry->inode = igrab(&inode->vfs_inode);
 	entry->compress_type = compress_type;
+	entry->encryption_type = BTRFS_ENCRYPTION_NONE;
 	entry->truncated_len = (u64)-1;
 	entry->qgroup_rsv = ret;
 	entry->flags = flags;
@@ -568,6 +569,7 @@ void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)
 			list_del(&sum->list);
 			kvfree(sum);
 		}
+		fscrypt_put_extent_info(entry->fscrypt_info);
 		kmem_cache_free(btrfs_ordered_extent_cache, entry);
 	}
 }
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 1c51ac57e5df..607814876f1f 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -122,6 +122,9 @@ struct btrfs_ordered_extent {
 	/* compression algorithm */
 	int compress_type;
 
+	/* encryption mode */
+	int encryption_type;
+
 	/* Qgroup reserved space */
 	int qgroup_rsv;
 
@@ -131,6 +134,9 @@ struct btrfs_ordered_extent {
 	/* the inode we belong to */
 	struct inode *inode;
 
+	/* the fscrypt_info for this extent, if necessary */
+	struct fscrypt_extent_info *fscrypt_info;
+
 	/* list of checksums for insertion when the extent io is done */
 	struct list_head list;
 
-- 
2.41.0
[PATCH 22/35] btrfs: plumb through setting the fscrypt_info for ordered extents
We're going to be getting fscrypt_info from the extent maps, update the helpers to take an fscrypt_info argument and use that to set the encryption type on the ordered extent. Signed-off-by: Josef Bacik --- fs/btrfs/inode.c| 20 +++- fs/btrfs/ordered-data.c | 32 fs/btrfs/ordered-data.h | 9 + 3 files changed, 36 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d26062b67211..903ec2d460f5 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1162,7 +1162,8 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, /* file_offset */ + ordered = btrfs_alloc_ordered_extent(inode, NULL, + start, /* file_offset */ async_extent->ram_size, /* num_bytes */ async_extent->ram_size, /* ram_bytes */ ins.objectid,/* disk_bytenr */ @@ -1425,9 +1426,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, ram_size, - ram_size, ins.objectid, cur_alloc_size, - 0, 1 << BTRFS_ORDERED_REGULAR, + ordered = btrfs_alloc_ordered_extent(inode, NULL, + start, ram_size, ram_size, ins.objectid, + cur_alloc_size, 0, + 1 << BTRFS_ORDERED_REGULAR, BTRFS_COMPRESS_NONE); if (IS_ERR(ordered)) { ret = PTR_ERR(ordered); @@ -2158,7 +2160,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, free_extent_map(em); } - ordered = btrfs_alloc_ordered_extent(inode, cur_offset, + ordered = btrfs_alloc_ordered_extent(inode, NULL, cur_offset, nocow_args.num_bytes, nocow_args.num_bytes, nocow_args.disk_bytenr, nocow_args.num_bytes, 0, is_prealloc @@ -7041,7 +7043,7 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode, if (IS_ERR(em)) goto out; } - ordered = btrfs_alloc_ordered_extent(inode, start, len, len, + ordered = btrfs_alloc_ordered_extent(inode, NULL, start, len, len, block_start, block_len, 0, (1 << type) | (1 << BTRFS_ORDERED_DIRECT), @@ -10512,9 
+10514,9 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, } free_extent_map(em); - ordered = btrfs_alloc_ordered_extent(inode, start, num_bytes, ram_bytes, - ins.objectid, ins.offset, - encoded->unencoded_offset, + ordered = btrfs_alloc_ordered_extent(inode, NULL, start, + num_bytes, ram_bytes, ins.objectid, + ins.offset, encoded->unencoded_offset, (1 << BTRFS_ORDERED_ENCODED) | (1 << BTRFS_ORDERED_COMPRESSED), compression); diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index d33a780d9893..81b0fe575011 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -147,9 +147,11 @@ static inline struct rb_node *tree_search(struct btrfs_ordered_inode_tree *tree, } static struct btrfs_ordered_extent *alloc_ordered_extent( - struct btrfs_inode *inode, u64 file_offset, u64 num_bytes, - u64 ram_bytes, u64 disk_bytenr, u64 disk_num_bytes, - u64 offset, unsigned long flags, int compress_type) + struct btrfs_inode *inode, + struct fscrypt_extent_info *fscrypt_info, + u64 file_offset, u64 num_bytes, u64 ram_bytes, + u64 disk_bytenr, u64 disk_num_bytes, u64 offset, + unsigned long flags, int compress_type) { struct btrfs_ordered_extent *entry; int ret; @@ -182,10 +184,12 @@ static struct btrfs_ordered_extent *alloc_ordered_extent( entry->bytes_left = num_bytes; entry->inode = igrab(&inode->vfs_inode); entry->compress_type = compress_type; - entry-&
[PATCH 24/35] btrfs: keep track of fscrypt info and orig_start for dio reads
We keep track of this information in the ordered extent for writes, but
we need it for reads as well. Add fscrypt_extent_info and orig_start to
the dio_data so we can populate this on reads. This will be used later
when we attach the fscrypt context to the bios.

Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 19831291fb54..89cb09a40f58 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -83,6 +83,8 @@ struct btrfs_dio_data {
 	ssize_t submitted;
 	struct extent_changeset *data_reserved;
 	struct btrfs_ordered_extent *ordered;
+	struct fscrypt_extent_info *fscrypt_info;
+	u64 orig_start;
 	bool data_space_reserved;
 	bool nocow_done;
 };
@@ -7729,6 +7731,10 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
 							 release_len);
 		}
 	} else {
+		dio_data->fscrypt_info =
+			fscrypt_get_extent_info(em->fscrypt_info);
+		dio_data->orig_start = em->orig_start;
+
 		/*
 		 * We need to unlock only the end area that we aren't using.
 		 * The rest is going to be unlocked by the endio routine.
@@ -7810,6 +7816,11 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 		dio_data->ordered = NULL;
 	}
 
+	if (dio_data->fscrypt_info) {
+		fscrypt_put_extent_info(dio_data->fscrypt_info);
+		dio_data->fscrypt_info = NULL;
+	}
+
 	if (write)
 		extent_changeset_free(dio_data->data_reserved);
 	return ret;
-- 
2.41.0
[PATCH 15/35] btrfs: implement fscrypt ioctls
From: Omar Sandoval

These ioctls allow encryption to actually be used.

The set_encryption_policy ioctl is the thing which actually turns on
encryption, and therefore sets the ENCRYPT flag in the superblock. This
prevents the filesystem from being loaded on older kernels.

fscrypt provides CONFIG_FS_ENCRYPTION-disabled versions of all these
functions which just return -EOPNOTSUPP, so the ioctls don't need to be
compiled out if CONFIG_FS_ENCRYPTION isn't enabled.

We could instead gate this ioctl on the superblock having the flag set,
if we wanted to require mkfs with the encrypt flag in order to have a
filesystem with any encryption.

Signed-off-by: Omar Sandoval
Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ae674c823d14..ddc2d2c7fc7f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4587,6 +4587,34 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_fslabel(fs_info, argp);
 	case FS_IOC_SETFSLABEL:
 		return btrfs_ioctl_set_fslabel(file, argp);
+	case FS_IOC_SET_ENCRYPTION_POLICY: {
+		if (!IS_ENABLED(CONFIG_FS_ENCRYPTION))
+			return -EOPNOTSUPP;
+		if (sb_rdonly(fs_info->sb))
+			return -EROFS;
+		/*
+		 * If we crash before we commit, nothing encrypted could have
+		 * been written so it doesn't matter whether the encrypted
+		 * state persists.
+		 */
+		btrfs_set_fs_incompat(fs_info, ENCRYPT);
+		return fscrypt_ioctl_set_policy(file, (const void __user *)arg);
+	}
+	case FS_IOC_GET_ENCRYPTION_POLICY:
+		return fscrypt_ioctl_get_policy(file, (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_POLICY_EX:
+		return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg);
+	case FS_IOC_ADD_ENCRYPTION_KEY:
+		return fscrypt_ioctl_add_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY:
+		return fscrypt_ioctl_remove_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS:
+		return fscrypt_ioctl_remove_key_all_users(file,
+							  (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_KEY_STATUS:
+		return fscrypt_ioctl_get_key_status(file, (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_NONCE:
+		return fscrypt_ioctl_get_nonce(file, (void __user *)arg);
 	case FITRIM:
 		return btrfs_ioctl_fitrim(fs_info, argp);
 	case BTRFS_IOC_SNAP_CREATE:
-- 
2.41.0
[PATCH 19/35] btrfs: set file extent encryption explicitly
From: Sweet Tea Dorminy This puts the long-preserved 1-byte encryption field to work, storing whether the extent is encrypted. Update the tree-checker to allow for the encryption bit to be set to our valid types. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/accessors.h| 2 ++ fs/btrfs/inode.c| 8 ++-- fs/btrfs/tree-checker.c | 8 +--- fs/btrfs/tree-log.c | 2 ++ include/uapi/linux/btrfs_tree.h | 8 +++- 5 files changed, 22 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h index b780d9087490..5627f13a3d3e 100644 --- a/fs/btrfs/accessors.h +++ b/fs/btrfs/accessors.h @@ -951,6 +951,8 @@ BTRFS_SETGET_STACK_FUNCS(stack_file_extent_disk_num_bytes, struct btrfs_file_extent_item, disk_num_bytes, 64); BTRFS_SETGET_STACK_FUNCS(stack_file_extent_compression, struct btrfs_file_extent_item, compression, 8); +BTRFS_SETGET_STACK_FUNCS(stack_file_extent_encryption, +struct btrfs_file_extent_item, encryption, 8); BTRFS_SETGET_FUNCS(file_extent_type, struct btrfs_file_extent_item, type, 8); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ccbaeea68e2e..db053b392d26 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2993,7 +2993,9 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans, btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes); btrfs_set_stack_file_extent_ram_bytes(&stack_fi, ram_bytes); btrfs_set_stack_file_extent_compression(&stack_fi, oe->compress_type); - /* Encryption and other encoding is reserved and all 0 */ + btrfs_set_stack_file_extent_encryption(&stack_fi, + BTRFS_ENCRYPTION_NONE); + /* Other encoding is reserved and always 0 */ /* * For delalloc, when completing an ordered extent we update the inode's @@ -9640,7 +9642,9 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( btrfs_set_stack_file_extent_num_bytes(&stack_fi, len); btrfs_set_stack_file_extent_ram_bytes(&stack_fi, len); btrfs_set_stack_file_extent_compression(&stack_fi, 
BTRFS_COMPRESS_NONE); - /* Encryption and other encoding is reserved and all 0 */ + btrfs_set_stack_file_extent_encryption(&stack_fi, + BTRFS_ENCRYPTION_NONE); + /* Other encoding is reserved and always 0 */ qgroup_released = btrfs_qgroup_release_data(inode, file_offset, len); if (qgroup_released < 0) diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index 1f2c389b0bfa..c2afdf65c2bf 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -210,6 +210,7 @@ static int check_extent_data_item(struct extent_buffer *leaf, u32 sectorsize = fs_info->sectorsize; u32 item_size = btrfs_item_size(leaf, slot); u64 extent_end; + u8 policy; if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) { file_extent_err(leaf, slot, @@ -261,10 +262,11 @@ static int check_extent_data_item(struct extent_buffer *leaf, BTRFS_NR_COMPRESS_TYPES - 1); return -EUCLEAN; } - if (unlikely(btrfs_file_extent_encryption(leaf, fi))) { + policy = btrfs_file_extent_encryption(leaf, fi); + if (unlikely(policy >= BTRFS_NR_ENCRYPTION_TYPES)) { file_extent_err(leaf, slot, - "invalid encryption for file extent, have %u expect 0", - btrfs_file_extent_encryption(leaf, fi)); + "invalid encryption for file extent, have %u expect range [0, %u]", + policy, BTRFS_NR_ENCRYPTION_TYPES - 1); return -EUCLEAN; } if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index cc262305f4c5..f6f45a0f1b1e 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -4630,6 +4630,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, u64 extent_offset = em->start - em->orig_start; u64 block_len; int ret; + u8 encryption = BTRFS_ENCRYPTION_NONE; btrfs_set_stack_file_extent_generation(&fi, trans->transid); if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) @@ -4651,6 +4652,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, btrfs_set_stack_file_extent_num_bytes(&fi, em->len); btrfs_set_stack_file_extent_ram_bytes(&fi, 
em->ram_bytes); btrfs_set_stack_file_extent_compression(&fi, em->compress_type); + btrfs_set_stack_file_extent_encryption(&fi, encryption); ret = log_extent_csums(trans, inode,
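The tree-checker change replaces a "must be zero" test with a range test. A minimal userspace sketch of that check follows. The excerpt names only `BTRFS_ENCRYPTION_NONE` and `BTRFS_NR_ENCRYPTION_TYPES`; the `MODEL_ENCRYPTION_FSCRYPT` value and the count of two types are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; only NONE and the NR_ sentinel appear in the excerpt. */
enum {
	MODEL_ENCRYPTION_NONE = 0,
	MODEL_ENCRYPTION_FSCRYPT = 1,	/* assumed second type */
	MODEL_NR_ENCRYPTION_TYPES = 2,
};

/*
 * Mirrors the new tree-checker condition: the on-disk byte is valid when it
 * falls in [0, NR_ENCRYPTION_TYPES - 1], otherwise the item is rejected.
 */
static bool file_extent_encryption_valid(uint8_t policy)
{
	return policy < MODEL_NR_ENCRYPTION_TYPES;
}
```

Keeping the upper bound tied to the `NR_` sentinel means a newly added type is accepted without touching the checker again.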
[PATCH 17/35] btrfs: add get_devices hook for fscrypt
From: Sweet Tea Dorminy

Since extent encryption requires inline encryption, even though we
expect to use the inlinecrypt software fallback most of the time, we
need to enumerate all the devices in use by btrfs.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/fscrypt.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c
index 254e48005aec..4fe0a8804ac5 100644
--- a/fs/btrfs/fscrypt.c
+++ b/fs/btrfs/fscrypt.c
@@ -11,7 +11,9 @@
 #include "ioctl.h"
 #include "messages.h"
 #include "root-tree.h"
+#include "super.h"
 #include "transaction.h"
+#include "volumes.h"
 #include "xattr.h"
 
 /*
@@ -178,9 +180,44 @@ static bool btrfs_fscrypt_empty_dir(struct inode *inode)
 	return inode->i_size == BTRFS_EMPTY_DIR_SIZE;
 }
 
+static struct block_device **btrfs_fscrypt_get_devices(struct super_block *sb,
+						       unsigned int *num_devs)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	int nr_devices = fs_devices->open_devices;
+	struct block_device **devs;
+	struct btrfs_device *device;
+	int i = 0;
+
+	devs = kmalloc_array(nr_devices, sizeof(*devs), GFP_NOFS | GFP_NOWAIT);
+	if (!devs)
+		return ERR_PTR(-ENOMEM);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+			      &device->dev_state) ||
+		    !device->bdev ||
+		    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
+			continue;
+
+		devs[i++] = device->bdev;
+
+		if (i >= nr_devices)
+			break;
+
+	}
+	rcu_read_unlock();
+
+	*num_devs = i;
+	return devs;
+}
+
 const struct fscrypt_operations btrfs_fscrypt_ops = {
 	.get_context = btrfs_fscrypt_get_context,
 	.set_context = btrfs_fscrypt_set_context,
 	.empty_dir = btrfs_fscrypt_empty_dir,
+	.get_devices = btrfs_fscrypt_get_devices,
 	.key_prefix = "btrfs:"
 };
-- 
2.41.0
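The device enumeration above is a bounded filter: skip devices that are not usable, copy the rest into a fixed-size array, and stop once the array is full. A minimal userspace sketch of that loop, with `dev_model` and `collect_devices` as hypothetical stand-ins for `btrfs_device` and the hook (integer ids replace `struct block_device` pointers):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the btrfs_device fields the hook inspects. */
struct dev_model {
	bool in_fs_metadata;	/* models BTRFS_DEV_STATE_IN_FS_METADATA */
	bool replace_target;	/* models BTRFS_DEV_STATE_REPLACE_TGT */
	bool has_bdev;		/* models device->bdev != NULL */
	int id;			/* stands in for the block_device pointer */
};

/*
 * Copy the ids of usable devices into out[], never writing more than cap
 * entries. Returns the number of entries written.
 */
static size_t collect_devices(const struct dev_model *devs, size_t n,
			      int *out, size_t cap)
{
	size_t count = 0;

	for (size_t i = 0; i < n && count < cap; i++) {
		if (!devs[i].in_fs_metadata || !devs[i].has_bdev ||
		    devs[i].replace_target)
			continue;
		out[count++] = devs[i].id;
	}
	return count;
}
```

The capacity check matters because the array is sized from a count read before the walk, so the list may hold more eligible devices than there are slots.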
[PATCH 18/35] btrfs: turn on inlinecrypt mount option for encrypt
From: Sweet Tea Dorminy

fscrypt's extent encryption requires the use of inline encryption or the
software fallback that the block layer provides; it is rather
complicated to allow software encryption with extent encryption due to
the timing of memory allocations. Thus, if btrfs has ever had an
encrypted file, or when encryption is enabled on a directory, update the
mount flags to include inlinecrypt.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c |  3 +++
 fs/btrfs/super.c | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 8e5f9dbb547a..33bf4d9416a9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4599,6 +4599,9 @@ long btrfs_ioctl(struct file *file, unsigned int
 		 * state persists.
 		 */
 		btrfs_set_fs_incompat(fs_info, ENCRYPT);
+		if (!(inode->i_sb->s_flags & SB_INLINECRYPT)) {
+			inode->i_sb->s_flags |= SB_INLINECRYPT;
+		}
 		return fscrypt_ioctl_set_policy(file, (const void __user *)arg);
 	}
 	case FS_IOC_GET_ENCRYPTION_POLICY:
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 2b5d60cb7fed..a8a609f4a9f4 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1120,6 +1120,16 @@ static int btrfs_fill_super(struct super_block *sb,
 		return err;
 	}
 
+	if (btrfs_fs_incompat(fs_info, ENCRYPT)) {
+		if (IS_ENABLED(CONFIG_FS_ENCRYPTION_INLINE_CRYPT)) {
+			sb->s_flags |= SB_INLINECRYPT;
+		} else {
+			btrfs_err(fs_info, "encryption not supported");
+			err = -EINVAL;
+			goto fail_close;
+		}
+	}
+
 	inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, fs_info->fs_root);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
-- 
2.41.0
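The mount-time decision in btrfs_fill_super() reduces to a three-way outcome. A minimal sketch under stated assumptions: `encrypt_mount_check` is a hypothetical helper, and `MODEL_SB_INLINECRYPT` is an illustrative bit rather than the kernel's `SB_INLINECRYPT` value.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_SB_INLINECRYPT (1UL << 0)	/* illustrative bit, not the kernel value */

/*
 * Returns 0 when the mount may proceed and -EINVAL when the filesystem has
 * the ENCRYPT incompat flag but inline crypto support is not built in.
 * On success with the flag set, inlinecrypt is forced on in *s_flags.
 */
static int encrypt_mount_check(bool fs_has_encrypt_flag,
			       bool inline_crypt_built,
			       unsigned long *s_flags)
{
	if (!fs_has_encrypt_flag)
		return 0;
	if (!inline_crypt_built)
		return -EINVAL;
	*s_flags |= MODEL_SB_INLINECRYPT;
	return 0;
}
```

An unencrypted filesystem is left alone, so the mount option only becomes mandatory once the incompat flag has been set.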
[PATCH 14/35] btrfs: handle nokey names.
From: Sweet Tea Dorminy For encrypted or unencrypted names, we calculate the offset for the dir item by hashing the name for the dir item. However, this doesn't work for a long nokey name, where we do not have the complete ciphertext. Instead, fscrypt stores the filesystem-provided hash in the nokey name, and we can extract it from the fscrypt_name structure in such a case. Additionally, for nokey names, if we find the nokey name on disk we can update the fscrypt_name with the disk name, so add that to searching for diritems. Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/dir-item.c | 37 +++-- fs/btrfs/fscrypt.c | 27 +++ fs/btrfs/fscrypt.h | 11 +++ 3 files changed, 73 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c index a64cfddff7f0..897fb5477369 100644 --- a/fs/btrfs/dir-item.c +++ b/fs/btrfs/dir-item.c @@ -231,6 +231,28 @@ struct btrfs_dir_item *btrfs_lookup_dir_item(struct btrfs_trans_handle *trans, return di; } +/* + * If appropriate, populate the disk name for a fscrypt_name looked up without + * a key. + * + * @path: The path to the extent buffer in which the name was found. + * @di:The dir item corresponding. + * @fname: The fscrypt_name to perhaps populate. + * + * Returns: 0 if the name is already populated or the dir item doesn't exist + * or the name was successfully populated, else an error code. + */ +static int ensure_disk_name_from_dir_item(struct btrfs_path *path, + struct btrfs_dir_item *di, + struct fscrypt_name *name) +{ + if (name->disk_name.name || !di) + return 0; + + return btrfs_fscrypt_get_disk_name(path->nodes[0], di, + &name->disk_name); +} + /* * Lookup for a directory item by fscrypt_name. 
* @@ -257,8 +279,12 @@ struct btrfs_dir_item *btrfs_lookup_dir_item_fname(struct btrfs_trans_handle *tr key.objectid = dir; key.type = BTRFS_DIR_ITEM_KEY; - key.offset = btrfs_name_hash(name->disk_name.name, name->disk_name.len); - /* XXX get the right hash for no-key names */ + + if (!name->disk_name.name) + key.offset = name->hash | ((u64)name->minor_hash << 32); + else + key.offset = btrfs_name_hash(name->disk_name.name, +name->disk_name.len); ret = btrfs_search_slot(trans, root, &key, path, mod, -mod); if (ret == 0) @@ -266,6 +292,8 @@ struct btrfs_dir_item *btrfs_lookup_dir_item_fname(struct btrfs_trans_handle *tr if (ret == -ENOENT || (di && IS_ERR(di) && PTR_ERR(di) == -ENOENT)) return NULL; + if (ret == 0) + ret = ensure_disk_name_from_dir_item(path, di, name); if (ret < 0) di = ERR_PTR(ret); @@ -382,7 +410,12 @@ btrfs_search_dir_index_item(struct btrfs_root *root, struct btrfs_path *path, btrfs_for_each_slot(root, &key, &key, path, ret) { if (key.objectid != dirid || key.type != BTRFS_DIR_INDEX_KEY) break; + di = btrfs_match_dir_item_fname(root->fs_info, path, name); + if (di) + ret = ensure_disk_name_from_dir_item(path, di, name); + if (ret) + break; if (di) return di; } diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c index 588966f0414f..254e48005aec 100644 --- a/fs/btrfs/fscrypt.c +++ b/fs/btrfs/fscrypt.c @@ -14,6 +14,33 @@ #include "transaction.h" #include "xattr.h" +/* + * From a given location in a leaf, read a name into a qstr (usually a + * fscrypt_name's disk_name), allocating the required buffer. Used for + * nokey names. + */ +int btrfs_fscrypt_get_disk_name(struct extent_buffer *leaf, + struct btrfs_dir_item *dir_item, + struct fscrypt_str *name) +{ + unsigned long de_name_len = btrfs_dir_name_len(leaf, dir_item); + unsigned long de_name = (unsigned long)(dir_item + 1); + /* +* For no-key names, we use this opportunity to find the disk +* name, so future searches don't need to deal with nokey names +* and we know what the encrypted size is. 
+*/ + name->name = kmalloc(de_name_len, GFP_NOFS); + + if (!name->name) + return -ENOMEM; + + read_extent_buffer(leaf, name->name, de_name, de_name_len); + + name->len = de_name_len; + return 0; +} + /* * This function is extremely similar to fscrypt_match_name() but uses an * extent_buffer. diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/f
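For a nokey name the dir item key offset is not recomputed from the ciphertext but assembled from the two 32-bit hash halves that fscrypt carries in the fscrypt_name. A minimal sketch of that arithmetic, with `nokey_dir_key_offset` as a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Combine the two 32-bit hash halves into the 64-bit key offset, matching
 * the expression name->hash | ((u64)name->minor_hash << 32) in the patch.
 * The widening cast must happen before the shift: shifting a 32-bit value
 * by 32 bits is undefined behaviour in C.
 */
static uint64_t nokey_dir_key_offset(uint32_t hash, uint32_t minor_hash)
{
	return (uint64_t)hash | ((uint64_t)minor_hash << 32);
}
```

For example, a hash of 0x11223344 and a minor hash of 0xaabbccdd give the offset 0xaabbccdd11223344.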
[PATCH 16/35] btrfs: add encryption to CONFIG_BTRFS_DEBUG
From: Sweet Tea Dorminy

Since encryption is currently under BTRFS_DEBUG, this adds its
dependencies: inline encryption from fscrypt, and the inline encryption
fallback path from the block layer.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ddc2d2c7fc7f..8e5f9dbb547a 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4587,6 +4587,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_fslabel(fs_info, argp);
 	case FS_IOC_SETFSLABEL:
 		return btrfs_ioctl_set_fslabel(file, argp);
+#ifdef CONFIG_BTRFS_DEBUG
 	case FS_IOC_SET_ENCRYPTION_POLICY: {
 		if (!IS_ENABLED(CONFIG_FS_ENCRYPTION))
 			return -EOPNOTSUPP;
@@ -4615,6 +4616,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return fscrypt_ioctl_get_key_status(file, (void __user *)arg);
 	case FS_IOC_GET_ENCRYPTION_NONCE:
 		return fscrypt_ioctl_get_nonce(file, (void __user *)arg);
+#endif /* CONFIG_BTRFS_DEBUG */
 	case FITRIM:
 		return btrfs_ioctl_fitrim(fs_info, argp);
 	case BTRFS_IOC_SNAP_CREATE:
-- 
2.41.0
[PATCH 13/35] btrfs: adapt readdir for encrypted and nokey names
From: Omar Sandoval Deleting an encrypted file must always be permitted, even if the user does not have the appropriate key. Therefore, for listing an encrypted directory, so-called 'nokey' names are provided, and these nokey names must be sufficient to look up and delete the appropriate encrypted files. See 'struct fscrypt_nokey_name' for more information on the format of these names. The first part of supporting nokey names is allowing lookups by nokey name. Only a few entry points need to support these: deleting a directory, file, or subvolume -- each of these call fscrypt_setup_filename() with a '1' argument, indicating that the key is not required and therefore a nokey name may be provided. If a nokey name is provided, the fscrypt_name returned by fscrypt_setup_filename() will not have its disk_name field populated, but will have various other fields set. This change alters the relevant codepaths to pass a complete fscrypt_name anywhere that it might contain a nokey name. When it does contain a nokey name, the first time the name is successfully matched to a stored name populates the disk name field of the fscrypt_name, allowing the caller to use the normal disk name codepaths afterward. Otherwise, the matching functionality is in close analogue to the function fscrypt_match_name(). Functions where most callers are providing a fscrypt_str are duplicated and adapted for a fscrypt_name, and functions where most callers are providing a fscrypt_name are changed to so require at all callsites. 
Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/btrfs_inode.h | 2 +- fs/btrfs/delayed-inode.c | 29 ++- fs/btrfs/delayed-inode.h | 6 +- fs/btrfs/dir-item.c | 77 --- fs/btrfs/dir-item.h | 11 ++- fs/btrfs/extent_io.c | 18 + fs/btrfs/extent_io.h | 3 + fs/btrfs/fscrypt.c | 34 + fs/btrfs/fscrypt.h | 19 + fs/btrfs/inode.c | 158 ++- fs/btrfs/root-tree.c | 8 +- fs/btrfs/root-tree.h | 2 +- fs/btrfs/tree-log.c | 3 +- 13 files changed, 297 insertions(+), 73 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 68ebb6096822..93390f12a4ef 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -422,7 +422,7 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry); int btrfs_set_inode_index(struct btrfs_inode *dir, u64 *index); int btrfs_unlink_inode(struct btrfs_trans_handle *trans, struct btrfs_inode *dir, struct btrfs_inode *inode, - const struct fscrypt_str *name); + struct fscrypt_name *name); int btrfs_add_link(struct btrfs_trans_handle *trans, struct btrfs_inode *parent_inode, struct btrfs_inode *inode, const struct fscrypt_str *name, int add_backref, u64 index); diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 35d7616615c1..43b5fb3fce27 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1762,7 +1762,9 @@ int btrfs_should_delete_dir_index(struct list_head *del_list, /* * Read dir info stored in the delayed tree. */ -int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, +int btrfs_readdir_delayed_dir_index(struct inode *inode, + struct fscrypt_str *fstr, + struct dir_context *ctx, struct list_head *ins_list) { struct btrfs_dir_item *di; @@ -1772,6 +1774,7 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, int name_len; int over = 0; unsigned char d_type; + size_t fstr_len = fstr->len; /* * Changing the data of the delayed item is impossible. 
So @@ -1796,8 +1799,28 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx, d_type = fs_ftype_to_dtype(btrfs_dir_flags_to_ftype(di->type)); btrfs_disk_key_to_cpu(&location, &di->location); - over = !dir_emit(ctx, name, name_len, - location.objectid, d_type); + if (di->type & BTRFS_FT_ENCRYPTED) { + int ret; + struct fscrypt_str iname = FSTR_INIT(name, name_len); + + fstr->len = fstr_len; + /* +* The hash is only used when the encryption key is not +* available. But if we have delayed insertions, then we +* must have the encryption key available or we wouldn't +* have been able to create entries in the directory. +* So, we don't calculate the hash. +*/ +
[PATCH 11/35] btrfs: add inode encryption contexts
From: Omar Sandoval In order to store encryption information for directories, symlinks, etc., fscrypt stores a context item with each encrypted non-regular inode. fscrypt provides an arbitrary blob for the filesystem to store, and it does not clearly fit into an existing structure, so this goes in a new item type. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/fscrypt.c | 117 fs/btrfs/fscrypt.h | 2 + fs/btrfs/inode.c| 19 ++ fs/btrfs/ioctl.c| 8 ++- include/uapi/linux/btrfs_tree.h | 10 +++ 5 files changed, 154 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c index 3a53dc59c1e4..bb3a83d01032 100644 --- a/fs/btrfs/fscrypt.c +++ b/fs/btrfs/fscrypt.c @@ -1,8 +1,125 @@ // SPDX-License-Identifier: GPL-2.0 +#include #include "ctree.h" +#include "accessors.h" +#include "btrfs_inode.h" +#include "disk-io.h" +#include "fs.h" #include "fscrypt.h" +#include "ioctl.h" +#include "messages.h" +#include "transaction.h" +#include "xattr.h" + +static int btrfs_fscrypt_get_context(struct inode *inode, void *ctx, size_t len) +{ + struct btrfs_key key = { + .objectid = btrfs_ino(BTRFS_I(inode)), + .type = BTRFS_FSCRYPT_CTX_ITEM_KEY, + .offset = 0, + }; + struct btrfs_path *path; + struct extent_buffer *leaf; + unsigned long ptr; + int ret; + + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0); + if (ret) { + len = -ENOENT; + goto out; + } + + leaf = path->nodes[0]; + ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); + /* fscrypt provides max context length, but it could be less */ + len = min_t(size_t, len, btrfs_item_size(leaf, path->slots[0])); + read_extent_buffer(leaf, ctx, ptr, len); + +out: + btrfs_free_path(path); + return len; +} + +static int btrfs_fscrypt_set_context(struct inode *inode, const void *ctx, +size_t len, void *fs_data) +{ + struct btrfs_trans_handle *trans = fs_data; + struct btrfs_key key = { 
+ .objectid = btrfs_ino(BTRFS_I(inode)), + .type = BTRFS_FSCRYPT_CTX_ITEM_KEY, + .offset = 0, + }; + struct btrfs_path *path = NULL; + struct extent_buffer *leaf; + unsigned long ptr; + int ret; + + if (!trans) + trans = btrfs_start_transaction(BTRFS_I(inode)->root, 2); + if (IS_ERR(trans)) + return PTR_ERR(trans); + + path = btrfs_alloc_path(); + if (!path) { + ret = -ENOMEM; + goto out_err; + } + + ret = btrfs_search_slot(trans, BTRFS_I(inode)->root, &key, path, 0, 1); + if (ret < 0) + goto out_err; + + if (ret > 0) { + btrfs_release_path(path); + ret = btrfs_insert_empty_item(trans, BTRFS_I(inode)->root, path, &key, len); + if (ret) + goto out_err; + } + + leaf = path->nodes[0]; + ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); + + len = min_t(size_t, len, btrfs_item_size(leaf, path->slots[0])); + write_extent_buffer(leaf, ctx, ptr, len); + btrfs_mark_buffer_dirty(trans, leaf); + btrfs_release_path(path); + + if (fs_data) + return ret; + + BTRFS_I(inode)->flags |= BTRFS_INODE_ENCRYPT; + btrfs_sync_inode_flags_to_i_flags(inode); + inode_inc_iversion(inode); + inode_set_ctime_current(inode); + ret = btrfs_update_inode(trans, BTRFS_I(inode)); + if (ret) + goto out_abort; + btrfs_free_path(path); + btrfs_end_transaction(trans); + return 0; +out_abort: + btrfs_abort_transaction(trans, ret); +out_err: + if (!fs_data) + btrfs_end_transaction(trans); + btrfs_free_path(path); + return ret; +} + +static bool btrfs_fscrypt_empty_dir(struct inode *inode) +{ + return inode->i_size == BTRFS_EMPTY_DIR_SIZE; +} const struct fscrypt_operations btrfs_fscrypt_ops = { + .get_context = btrfs_fscrypt_get_context, + .set_context = btrfs_fscrypt_set_context, + .empty_dir = btrfs_fscrypt_empty_dir, .key_prefix = "btrfs:" }; diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/fscrypt.h index 7f4e6888bd43..80adb7e56826 100644 --- a/fs/btrfs/fscrypt.h +++ b/fs/btrfs/fscrypt.h @@ -5,6 +5,8 @@ #include +#include "fs.h" + extern const struct fscrypt_operations btrfs_fscrypt_ops; #end
[PATCH 03/35] fscrypt: disable all but standard v2 policies for extent encryption
The set of fscrypt encryption options is too large to support in full
with extent-based encryption. A few of these options could be supported
later, but since they are niche, simply reject them on file systems that
use extent-based encryption.

Signed-off-by: Josef Bacik
---
 fs/crypto/policy.c | 12
 1 file changed, 12 insertions(+)

diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
index 8b8da04068b8..38807d0ee742 100644
--- a/fs/crypto/policy.c
+++ b/fs/crypto/policy.c
@@ -198,6 +198,12 @@ static bool fscrypt_supported_v1_policy(const struct fscrypt_policy_v1 *policy,
 		return false;
 	}
 
+	if (inode->i_sb->s_cop->flags & FS_CFLG_EXTENT_ENCRYPTION) {
+		fscrypt_warn(inode,
+			     "v1 policies can't be used on file systems that use extent encryption");
+		return false;
+	}
+
 	return true;
 }
 
@@ -233,6 +239,12 @@ static bool fscrypt_supported_v2_policy(const struct fscrypt_policy_v2 *policy,
 		return false;
 	}
 
+	if ((inode->i_sb->s_cop->flags & FS_CFLG_EXTENT_ENCRYPTION) && count) {
+		fscrypt_warn(inode,
+			     "Encryption flags aren't supported on file systems that use extent encryption");
+		return false;
+	}
+
 	if ((policy->flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) &&
 	    !supported_direct_key_modes(inode, policy->contents_encryption_mode,
 					policy->filenames_encryption_mode))
-- 
2.41.0
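For readers following along, the gating rule in this patch can be modelled in userspace roughly as below. This is only a sketch: every `sketch_`/`SKETCH_` name is hypothetical, the constant values are illustrative, and it assumes that `count` in fscrypt_supported_v2_policy() is the number of special IV/key flags (DIRECT_KEY, IV_INO_LBLK_64, IV_INO_LBLK_32) set on the policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define SKETCH_POLICY_V1		0
#define SKETCH_POLICY_V2		2
#define SKETCH_FLAG_DIRECT_KEY		0x04
#define SKETCH_FLAG_IV_INO_LBLK_64	0x08
#define SKETCH_FLAG_IV_INO_LBLK_32	0x10

struct sketch_policy {
	uint8_t version;
	uint8_t flags;
};

/* Assumed meaning of "count": how many special IV/key flags are set. */
static int sketch_count_special_flags(uint8_t flags)
{
	int count = 0;

	count += !!(flags & SKETCH_FLAG_DIRECT_KEY);
	count += !!(flags & SKETCH_FLAG_IV_INO_LBLK_64);
	count += !!(flags & SKETCH_FLAG_IV_INO_LBLK_32);
	return count;
}

/*
 * Models only the checks added by this patch; all other policy
 * validation is omitted.
 */
static bool sketch_policy_supported(const struct sketch_policy *policy,
				    bool fs_uses_extent_encryption)
{
	assert(policy);

	if (!fs_uses_extent_encryption)
		return true;

	/* v1 policies are rejected outright with extent encryption. */
	if (policy->version != SKETCH_POLICY_V2)
		return false;

	/* v2 policies are accepted only without any special flags. */
	return sketch_count_special_flags(policy->flags) == 0;
}
```

With this model, a plain v2 policy passes on an extent-encrypting file system, while a v1 policy or a v2 policy with DIRECT_KEY set is refused.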
[PATCH 09/35] btrfs: disable verity on encrypted inodes
From: Sweet Tea Dorminy

Right now there isn't a way to encrypt things that aren't either
filenames in directories or data on blocks on disk with extent
encryption, so for now, disable verity usage with encryption on btrfs.

Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/verity.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index 66e2270b0dae..92536913df04 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -588,6 +588,9 @@ static int btrfs_begin_enable_verity(struct file *filp)
 
 	ASSERT(inode_is_locked(file_inode(filp)));
 
+	if (IS_ENCRYPTED(&inode->vfs_inode))
+		return -EINVAL;
+
 	if (test_bit(BTRFS_INODE_VERITY_IN_PROGRESS, &inode->runtime_flags))
 		return -EBUSY;
 
-- 
2.41.0
[PATCH 05/35] fscrypt: expose fscrypt_nokey_name
From: Omar Sandoval btrfs stores its data structures, including filenames in directories, in its own buffer implementation, struct extent_buffer, composed of several non-contiguous pages. We could copy filenames into a temporary buffer and use fscrypt_match_name() against that buffer, but such extensive copying would be expensive. Instead, exposing fscrypt_nokey_name as in this change allows btrfs to reimplement fscrypt_match_name() using methods on struct extent_buffer instead of dealing with a raw byte array. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/crypto/fname.c | 39 +-- include/linux/fscrypt.h | 37 + 2 files changed, 38 insertions(+), 38 deletions(-) diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c index 7b3fc189593a..5607ee52703e 100644 --- a/fs/crypto/fname.c +++ b/fs/crypto/fname.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include "fscrypt_private.h" @@ -26,43 +25,7 @@ #define FSCRYPT_FNAME_MIN_MSG_LEN 16 /* - * struct fscrypt_nokey_name - identifier for directory entry when key is absent - * - * When userspace lists an encrypted directory without access to the key, the - * filesystem must present a unique "no-key name" for each filename that allows - * it to find the directory entry again if requested. Naively, that would just - * mean using the ciphertext filenames. However, since the ciphertext filenames - * can contain illegal characters ('\0' and '/'), they must be encoded in some - * way. We use base64url. But that can cause names to exceed NAME_MAX (255 - * bytes), so we also need to use a strong hash to abbreviate long names. - * - * The filesystem may also need another kind of hash, the "dirhash", to quickly - * find the directory entry. Since filesystems normally compute the dirhash - * over the on-disk filename (i.e. the ciphertext), it's not computable from - * no-key names that abbreviate the ciphertext using the strong hash to fit in - * NAME_MAX.
It's also not computable if it's a keyed hash taken over the - * plaintext (but it may still be available in the on-disk directory entry); - * casefolded directories use this type of dirhash. At least in these cases, - * each no-key name must include the name's dirhash too. - * - * To meet all these requirements, we base64url-encode the following - * variable-length structure. It contains the dirhash, or 0's if the filesystem - * didn't provide one; up to 149 bytes of the ciphertext name; and for - * ciphertexts longer than 149 bytes, also the SHA-256 of the remaining bytes. - * - * This ensures that each no-key name contains everything needed to find the - * directory entry again, contains only legal characters, doesn't exceed - * NAME_MAX, is unambiguous unless there's a SHA-256 collision, and that we only - * take the performance hit of SHA-256 on very long filenames (which are rare). - */ -struct fscrypt_nokey_name { - u32 dirhash[2]; - u8 bytes[149]; - u8 sha256[SHA256_DIGEST_SIZE]; -}; /* 189 bytes => 252 bytes base64url-encoded, which is <= NAME_MAX (255) */ - -/* - * Decoded size of max-size no-key name, i.e. a name that was abbreviated using + * Decoded size of max-size nokey name, i.e. a name that was abbreviated using * the strong hash and thus includes the 'sha256' field. This isn't simply * sizeof(struct fscrypt_nokey_name), as the padding at the end isn't included. 
*/ diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index 07493ad2588b..44dc10837499 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -17,6 +17,7 @@ #include #include #include +#include #include /* @@ -56,6 +57,42 @@ struct fscrypt_name { #define fname_name(p) ((p)->disk_name.name) #define fname_len(p) ((p)->disk_name.len) +/* + * struct fscrypt_nokey_name - identifier for directory entry when key is absent + * + * When userspace lists an encrypted directory without access to the key, the + * filesystem must present a unique "no-key name" for each filename that allows + * it to find the directory entry again if requested. Naively, that would just + * mean using the ciphertext filenames. However, since the ciphertext filenames + * can contain illegal characters ('\0' and '/'), they must be encoded in some + * way. We use base64url. But that can cause names to exceed NAME_MAX (255 + * bytes), so we also need to use a strong hash to abbreviate long names. + * + * The filesystem may also need another kind of hash, the "dirhash", to quickly + * find the directory entry. Since filesystems normally compute the dirhash + * over the on-disk filename (i.e. the ciphertext), it's not computable from + * no-key names that a
[PATCH 12/35] btrfs: add new FEATURE_INCOMPAT_ENCRYPT flag
From: Omar Sandoval

As encrypted files will be incompatible with older filesystem versions,
new filesystems should be created with an incompat flag for fscrypt,
which will gate access to the encryption ioctls.

Signed-off-by: Omar Sandoval
Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/fs.h              | 3 ++-
 fs/btrfs/super.c           | 5 +
 fs/btrfs/sysfs.c           | 6 ++
 include/uapi/linux/btrfs.h | 1 +
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 25a43ca4e0dd..cb2b0d442de8 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -232,7 +232,8 @@ enum {
 #define BTRFS_FEATURE_INCOMPAT_SUPP		\
 	(BTRFS_FEATURE_INCOMPAT_SUPP_STABLE |	\
 	 BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE | \
-	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2)
+	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 | \
+	 BTRFS_FEATURE_INCOMPAT_ENCRYPT)
 
 #else
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 21c5358e9202..2b5d60cb7fed 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2378,6 +2378,11 @@ static int __init btrfs_print_mod_info(void)
 			", fsverity=yes"
 #else
 			", fsverity=no"
+#endif
+#ifdef CONFIG_FS_ENCRYPTION
+			", fscrypt=yes"
+#else
+			", fscrypt=no"
 #endif
 			;
 	pr_info("Btrfs loaded%s\n", options);
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 143f0553714b..409244b569a5 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -305,6 +305,9 @@ BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE);
 #ifdef CONFIG_FS_VERITY
 BTRFS_FEAT_ATTR_COMPAT_RO(verity, VERITY);
 #endif
+#ifdef CONFIG_FS_ENCRYPTION
+BTRFS_FEAT_ATTR_INCOMPAT(encryption, ENCRYPT);
+#endif /* CONFIG_FS_ENCRYPTION */
 
 /*
  * Features which depend on feature bits and may differ between each fs.
@@ -338,6 +341,9 @@ static struct attribute *btrfs_supported_feature_attrs[] = {
 #ifdef CONFIG_FS_VERITY
 	BTRFS_FEAT_ATTR_PTR(verity),
 #endif
+#ifdef CONFIG_FS_ENCRYPTION
+	BTRFS_FEAT_ATTR_PTR(encryption),
+#endif /* CONFIG_FS_ENCRYPTION */
 	NULL
 };
 
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index e2c106bb0586..3ff21c95e1bb 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -341,6 +341,7 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_INCOMPAT_ZONED		(1ULL << 12)
 #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2	(1ULL << 13)
 #define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE	(1ULL << 14)
+#define BTRFS_FEATURE_INCOMPAT_ENCRYPT		(1ULL << 15)
 #define BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA	(1ULL << 16)
 
 struct btrfs_ioctl_feature_flags {
-- 
2.41.0
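As a rough illustration of why an incompat bit gates the feature, the userspace sketch below models the usual "refuse to mount if any unknown incompat bit is set" rule. Only the bit positions (13, 14, 15) come from the patch; the `sketch_` helpers are hypothetical and are not the btrfs mount path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a feature mask from a bit number (hypothetical helper). */
static uint64_t sketch_feature_bit(unsigned int n)
{
	assert(n < 64);
	return 1ULL << n;
}

/* Bit positions as they appear in the patch above. */
#define SKETCH_INCOMPAT_EXTENT_TREE_V2		sketch_feature_bit(13)
#define SKETCH_INCOMPAT_RAID_STRIPE_TREE	sketch_feature_bit(14)
#define SKETCH_INCOMPAT_ENCRYPT			sketch_feature_bit(15)

/* A file system is mountable only if every incompat bit is understood. */
static bool sketch_can_mount(uint64_t sb_incompat, uint64_t supported)
{
	return (sb_incompat & ~supported) == 0;
}

/* The encryption ioctls would be gated on the flag being present. */
static bool sketch_encryption_allowed(uint64_t sb_incompat)
{
	return (sb_incompat & SKETCH_INCOMPAT_ENCRYPT) != 0;
}
```

Under this model, a kernel whose supported mask lacks bit 15 refuses a file system created with the ENCRYPT flag, which is the behaviour the commit message relies on.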
[PATCH 10/35] btrfs: start using fscrypt hooks
From: Omar Sandoval In order to appropriately encrypt, create, open, rename, and various symlink operations must call fscrypt hooks. These determine whether the inode should be encrypted and do other preparatory actions. The superblock must have fscrypt operations registered, so implement the minimal set also, and introduce the new fscrypt.[ch] files to hold the fscrypt-specific functionality. Also add the key prefix for fscrypt v1 keys. Signed-off-by: Omar Sandoval Signed-off-by: Sweet Tea Dorminy Signed-off-by: Josef Bacik --- fs/btrfs/Makefile | 1 + fs/btrfs/btrfs_inode.h | 1 + fs/btrfs/file.c| 3 ++ fs/btrfs/fscrypt.c | 8 +++ fs/btrfs/fscrypt.h | 10 fs/btrfs/inode.c | 110 ++--- fs/btrfs/super.c | 2 + 7 files changed, 117 insertions(+), 18 deletions(-) create mode 100644 fs/btrfs/fscrypt.c create mode 100644 fs/btrfs/fscrypt.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 525af975f61c..6e51d054c17a 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -39,6 +39,7 @@ btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o btrfs-$(CONFIG_BLK_DEV_ZONED) += zoned.o btrfs-$(CONFIG_FS_VERITY) += verity.o +btrfs-$(CONFIG_FS_ENCRYPTION) += fscrypt.o btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \ tests/extent-buffer-tests.o tests/btrfs-tests.o \ diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index f2c928345d53..68ebb6096822 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -449,6 +449,7 @@ struct btrfs_new_inode_args { struct posix_acl *default_acl; struct posix_acl *acl; struct fscrypt_name fname; + bool encrypt; }; int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 004c53482f05..8dcc5ae9c9e1 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3710,6 +3710,9 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC | 
FMODE_CAN_ODIRECT; + ret = fscrypt_file_open(inode, filp); + if (ret) + return ret; ret = fsverity_file_open(inode, filp); if (ret) diff --git a/fs/btrfs/fscrypt.c b/fs/btrfs/fscrypt.c new file mode 100644 index ..3a53dc59c1e4 --- /dev/null +++ b/fs/btrfs/fscrypt.c @@ -0,0 +1,8 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "ctree.h" +#include "fscrypt.h" + +const struct fscrypt_operations btrfs_fscrypt_ops = { + .key_prefix = "btrfs:" +}; diff --git a/fs/btrfs/fscrypt.h b/fs/btrfs/fscrypt.h new file mode 100644 index ..7f4e6888bd43 --- /dev/null +++ b/fs/btrfs/fscrypt.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef BTRFS_FSCRYPT_H +#define BTRFS_FSCRYPT_H + +#include + +extern const struct fscrypt_operations btrfs_fscrypt_ops; + +#endif /* BTRFS_FSCRYPT_H */ diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 6cba648d5656..94c2e13934aa 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5054,6 +5054,10 @@ static int btrfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, if (err) return err; + err = fscrypt_prepare_setattr(dentry, attr); + if (err) + return err; + if (S_ISREG(inode->i_mode) && (attr->ia_valid & ATTR_SIZE)) { err = btrfs_setsize(inode, attr); if (err) @@ -5208,11 +5212,8 @@ void btrfs_evict_inode(struct inode *inode) trace_btrfs_inode_evict(inode); - if (!root) { - fsverity_cleanup_inode(inode); - clear_inode(inode); - return; - } + if (!root) + goto cleanup; evict_inode_truncate_pages(inode); @@ -5312,6 +5313,9 @@ void btrfs_evict_inode(struct inode *inode) * to retry these periodically in the future. 
*/ btrfs_remove_delayed_node(BTRFS_I(inode)); + +cleanup: + fscrypt_put_encryption_info(inode); fsverity_cleanup_inode(inode); clear_inode(inode); } @@ -6097,6 +6101,12 @@ int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args, return ret; } + ret = fscrypt_prepare_new_inode(dir, inode, &args->encrypt); + if (ret) { + fscrypt_free_filename(&args->fname); + return ret; + } + /* 1 to add inode item */ *trans_num_items = 1; /* 1 to add compression property */ @@ -6570,9 +6580,13 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, if (inode->i_nlink >= BTRFS_LINK_MAX) return -EMLINK; + err = fscrypt_prepare_link(old_dentry, dir, dentry); + if (err) + ret
[PATCH 06/35] fscrypt: add documentation about extent encryption
Add a couple of sections to the fscrypt documentation about per-extent
encryption.

Signed-off-by: Josef Bacik
---
 Documentation/filesystems/fscrypt.rst | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index a624e92f2687..9981eaf61f32 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -256,6 +256,21 @@ alternative master keys or to support rotating master keys.
 Instead, the master keys may be wrapped in userspace, e.g. as is done by the
 `fscrypt <https://github.com/google/fscrypt>`_ tool.
 
+Per-extent encryption keys
+--------------------------
+
+For certain file systems, such as btrfs, it's desired to derive a
+per-extent encryption key. This is to enable features such as snapshots
+and reflink, where you could have different inodes pointing at the same
+extent. When a new extent is created fscrypt randomly generates a
+16-byte nonce and the file system stores it alongside the extent.
+Then, it uses a KDF (as described in `Key derivation function`_) to
+derive the extent's key from the master key and nonce.
+
+Currently the inode's master key and encryption policy must match the
+extent, so you cannot share extents between inodes that were encrypted
+differently.
+
 DIRECT_KEY policies
 -------------------
 
@@ -1339,6 +1354,27 @@ by the kernel and is used as KDF input or as a tweak to cause different files
 to be encrypted differently; see `Per-file encryption keys`_ and
 `DIRECT_KEY policies`_.
 
+Extent encryption context
+-------------------------
+
+The extent encryption context mirrors the important parts of the above
+`Encryption context`_, with a few omissions. The struct is defined as
+follows::
+
+    struct fscrypt_extent_context {
+        u8 version;
+        u8 encryption_mode;
+        u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
+        u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
+    };
+
+Currently all fields must match the containing inode's encryption
+context, with the exception of the nonce.
+
+Additionally extent encryption is only supported with
+FSCRYPT_EXTENT_CONTEXT_V2 using the standard policy; all other policies
+are disallowed.
+
 Data path changes
 -----------------
 
-- 
2.41.0
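The "all fields must match except the nonce" rule from the documentation above could be checked roughly as follows. This is a userspace sketch, not the fscrypt implementation: the `sketch_` names are hypothetical, and the two size constants are assumed to be 16 bytes each to mirror FSCRYPT_KEY_IDENTIFIER_SIZE and FSCRYPT_FILE_NONCE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the uapi constants; both are taken to be 16 here. */
#define SKETCH_KEY_IDENTIFIER_SIZE	16
#define SKETCH_FILE_NONCE_SIZE		16

/* Same field order as the struct in the documentation hunk. */
struct sketch_extent_context {
	uint8_t version;
	uint8_t encryption_mode;
	uint8_t master_key_identifier[SKETCH_KEY_IDENTIFIER_SIZE];
	uint8_t nonce[SKETCH_FILE_NONCE_SIZE];
};

/*
 * Compare an extent's context with the values derived from its inode.
 * Every field must match except the nonce, which is per-extent.
 */
static bool sketch_extent_matches_inode(const struct sketch_extent_context *extent,
					const struct sketch_extent_context *from_inode)
{
	assert(extent && from_inode);

	return extent->version == from_inode->version &&
	       extent->encryption_mode == from_inode->encryption_mode &&
	       memcmp(extent->master_key_identifier,
		      from_inode->master_key_identifier,
		      SKETCH_KEY_IDENTIFIER_SIZE) == 0;
}
```

Because every member is a byte or byte array, the struct has no padding and occupies 34 bytes under these assumed sizes, so two contexts that differ only in their nonce still compare as compatible.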
[PATCH 07/35] btrfs: add infrastructure for safe em freeing
When we add fscrypt support we're going to have fscrypt objects hanging off of extent_maps. This includes a block key, which if we're the last one freeing the key we may have to unregister it from the block layer. This requires taking a semaphore in the block layer, which means we can't free em's under the extent map tree lock. Thankfully we only do this in two places, one where we're dropping a range of extent maps, and when we're freeing logged extents. Add a free_extent_map_safe() which will add the em to a list in the em_tree if we free'd the object. Currently this is unconditional but will be changed to conditional on the fscrypt object we will add in a later patch. To process these delayed objects add a free_pending_extent_maps() that is called after the lock has been dropped on the em_tree. This will process the extent maps on the freed list and do the appropriate freeing work in a safe manner. Signed-off-by: Josef Bacik --- fs/btrfs/extent_map.c | 80 --- fs/btrfs/extent_map.h | 10 ++ fs/btrfs/tree-log.c | 6 ++-- 3 files changed, 89 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index a6d8368ed0ed..af5ff6b10865 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -35,7 +35,9 @@ void __cold extent_map_exit(void) void extent_map_tree_init(struct extent_map_tree *tree) { tree->map = RB_ROOT_CACHED; + tree->flags = 0; INIT_LIST_HEAD(&tree->modified_extents); + INIT_LIST_HEAD(&tree->freed_extents); rwlock_init(&tree->lock); } @@ -53,9 +55,17 @@ struct extent_map *alloc_extent_map(void) em->compress_type = BTRFS_COMPRESS_NONE; refcount_set(&em->refs, 1); INIT_LIST_HEAD(&em->list); + INIT_LIST_HEAD(&em->free_list); return em; } +static void __free_extent_map(struct extent_map *em) +{ + if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) + kfree(em->map_lookup); + kmem_cache_free(extent_map_cache, em); +} + /* * Drop the reference out on @em by one and free the structure if the reference * count hits zero. 
@@ -67,12 +77,69 @@ void free_extent_map(struct extent_map *em) if (refcount_dec_and_test(&em->refs)) { WARN_ON(extent_map_in_tree(em)); WARN_ON(!list_empty(&em->list)); - if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) - kfree(em->map_lookup); - kmem_cache_free(extent_map_cache, em); + __free_extent_map(em); } } +/* + * Drop a ref for the extent map in the given tree. + * + * @tree: tree that the em is a part of. + * @em:the em to drop the reference to. + * + * Drop the reference count on @em by one, if the reference count hits 0 and + * there is an object on the em that can't be safely freed in the current + * context (if we are holding the extent_map_tree->lock for example), then add + * it to the freed_extents list on the extent_map_tree for later processing. + * + * This must be followed by a free_pending_extent_maps() to clear the pending + * frees. + */ +void free_extent_map_safe(struct extent_map_tree *tree, + struct extent_map *em) +{ + lockdep_assert_held_write(&tree->lock); + + if (!em) + return; + + if (refcount_dec_and_test(&em->refs)) { + WARN_ON(extent_map_in_tree(em)); + WARN_ON(!list_empty(&em->list)); + list_add_tail(&em->free_list, &tree->freed_extents); + set_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags); + } +} + +/* + * Free the em objects that exist on the em tree + * + * @tree: the tree to free the objects from. + * + * If there are any objects on the em->freed_extents list go ahead and free them + * here in a safe way. This is to be coupled with any uses of + * free_extent_map_safe(). + */ +void free_pending_extent_maps(struct extent_map_tree *tree) +{ + struct extent_map *em; + + /* Avoid taking the write lock if we don't have any pending frees. 
*/ + if (!test_and_clear_bit(EXTENT_MAP_TREE_PENDING_FREES, &tree->flags)) + return; + + write_lock(&tree->lock); + while ((em = list_first_entry_or_null(&tree->freed_extents, + struct extent_map, free_list))) { + list_del_init(&em->free_list); + write_unlock(&tree->lock); + __free_extent_map(em); + cond_resched(); + write_lock(&tree->lock); + } + write_unlock(&tree->lock); +} + /* Do the math around the end of an extent, handling wrapping. */ static u64 range_
[PATCH 08/35] btrfs: disable various operations on encrypted inodes
From: Omar Sandoval

Initially, only normal data extents will be encrypted. This change
restricts other operations accordingly:
- allow reflinking only if both inodes have the same encryption status
- disable inline data on encrypted inodes

Signed-off-by: Omar Sandoval
Signed-off-by: Sweet Tea Dorminy
Signed-off-by: Josef Bacik
---
 fs/btrfs/inode.c   | 3 ++-
 fs/btrfs/reflink.c | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 52576deda654..6cba648d5656 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -630,7 +630,8 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size,
 	 * compressed) data fits in a leaf and the configured maximum inline
 	 * size.
 	 */
-	if (size < i_size_read(&inode->vfs_inode) ||
+	if (IS_ENCRYPTED(&inode->vfs_inode) ||
+	    size < i_size_read(&inode->vfs_inode) ||
 	    size > fs_info->sectorsize ||
 	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info) ||
 	    data_len > fs_info->max_inline)
diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index fabd856e5079..3c66630d87ee 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include
+#include
 #include
 #include "ctree.h"
 #include "fs.h"
@@ -809,6 +810,12 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 		ASSERT(inode_in->i_sb == inode_out->i_sb);
 	}
 
+	/*
+	 * Can only reflink encrypted files if both files are encrypted.
+	 */
+	if (IS_ENCRYPTED(inode_in) != IS_ENCRYPTED(inode_out))
+		return -EINVAL;
+
 	/* Don't make the dst file partly checksummed */
 	if ((BTRFS_I(inode_in)->flags & BTRFS_INODE_NODATASUM) !=
 	    (BTRFS_I(inode_out)->flags & BTRFS_INODE_NODATASUM)) {
-- 
2.41.0
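The two restrictions in this patch can be modelled in userspace as below. This is a sketch under stated assumptions: the `sketch_` structs and functions are hypothetical simplifications of the btrfs inode and fs_info, and only the conditions visible in the diff are reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the btrfs structures. */
struct sketch_inode {
	bool encrypted;
	uint64_t i_size;
};

struct sketch_fs {
	uint64_t sectorsize;
	uint64_t max_inline;
	uint64_t max_inline_data_size;
};

/* Reflink is allowed only when both inodes share the same encryption status. */
static int sketch_reflink_prep(const struct sketch_inode *in,
			       const struct sketch_inode *out)
{
	assert(in && out);

	if (in->encrypted != out->encrypted)
		return -EINVAL;
	return 0;
}

/* Inline extents are never created for encrypted inodes. */
static bool sketch_can_inline(const struct sketch_inode *inode,
			      const struct sketch_fs *fs,
			      uint64_t size, uint64_t data_len)
{
	assert(inode && fs);

	if (inode->encrypted ||
	    size < inode->i_size ||
	    size > fs->sectorsize ||
	    data_len > fs->max_inline_data_size ||
	    data_len > fs->max_inline)
		return false;
	return true;
}
```

Note that the equality test also lets two unencrypted inodes reflink as before; only the mixed case is refused.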
[PATCH 04/35] blk-crypto: add a process bio callback
Btrfs does checksumming, and the checksums need to match the bytes on disk. In order to facilitate this add a process bio callback for the blk-crypto layer. This allows the file system to specify a callback and then can process the encrypted bio as necessary. For btrfs, writes will have the checksums calculated and saved into our relevant data structures for storage once the write completes. For reads we will validate the checksums match what is on disk and error out if there is a mismatch. This is incompatible with native encryption obviously, so make sure we don't use native encryption if this callback is set. Signed-off-by: Josef Bacik --- block/blk-crypto-fallback.c| 28 block/blk-crypto-profile.c | 2 ++ block/blk-crypto.c | 6 +- fs/crypto/inline_crypt.c | 3 ++- include/linux/blk-crypto-profile.h | 7 +++ include/linux/blk-crypto.h | 9 +++-- include/linux/fscrypt.h| 14 ++ 7 files changed, 65 insertions(+), 4 deletions(-) diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c index e6468eab2681..8b4a83534127 100644 --- a/block/blk-crypto-fallback.c +++ b/block/blk-crypto-fallback.c @@ -346,6 +346,15 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr) } } + /* Process the encrypted bio before we submit it. */ + if (bc->bc_key->crypto_cfg.process_bio) { + blk_st = bc->bc_key->crypto_cfg.process_bio(src_bio, enc_bio); + if (blk_st != BLK_STS_OK) { + src_bio->bi_status = blk_st; + goto out_free_bounce_pages; + } + } + enc_bio->bi_private = src_bio; enc_bio->bi_end_io = blk_crypto_fallback_encrypt_endio; *bio_ptr = enc_bio; @@ -391,6 +400,24 @@ static void blk_crypto_fallback_decrypt_bio(struct work_struct *work) unsigned int i; blk_status_t blk_st; + /* +* Process the bio first before trying to decrypt. +* +* NOTE: btrfs expects that this bio is the same that was submitted. If +* at any point this changes we will need to update process_bio to take +* f_ctx->crypt_iter in order to make sure we can iterate the pages for +* checksumming. 
We're currently saving this in our btrfs_bio, so this +* works, but if at any point in the future we start allocating a bounce +* bio or something we need to update this callback. +*/ + if (bc->bc_key->crypto_cfg.process_bio) { + blk_st = bc->bc_key->crypto_cfg.process_bio(bio, bio); + if (blk_st != BLK_STS_OK) { + bio->bi_status = blk_st; + goto out_no_keyslot; + } + } + /* * Get a blk-crypto-fallback keyslot that contains a crypto_skcipher for * this bio's algorithm and key. @@ -560,6 +587,7 @@ static int blk_crypto_fallback_init(void) blk_crypto_fallback_profile->ll_ops = blk_crypto_fallback_ll_ops; blk_crypto_fallback_profile->max_dun_bytes_supported = BLK_CRYPTO_MAX_IV_SIZE; + blk_crypto_fallback_profile->process_bio_supported = true; /* All blk-crypto modes have a crypto API fallback. */ for (i = 0; i < BLK_ENCRYPTION_MODE_MAX; i++) diff --git a/block/blk-crypto-profile.c b/block/blk-crypto-profile.c index 7fabc883e39f..640cf2ea3fcc 100644 --- a/block/blk-crypto-profile.c +++ b/block/blk-crypto-profile.c @@ -352,6 +352,8 @@ bool __blk_crypto_cfg_supported(struct blk_crypto_profile *profile, return false; if (profile->max_dun_bytes_supported < cfg->dun_bytes) return false; + if (cfg->process_bio && !profile->process_bio_supported) + return false; return true; } diff --git a/block/blk-crypto.c b/block/blk-crypto.c index 4d760b092deb..50556952df19 100644 --- a/block/blk-crypto.c +++ b/block/blk-crypto.c @@ -321,6 +321,8 @@ int __blk_crypto_rq_bio_prep(struct request *rq, struct bio *bio, * @dun_bytes: number of bytes that will be used to specify the DUN when this *key is used * @data_unit_size: the data unit size to use for en/decryption + * @process_bio: the call back if the upper layer needs to process the encrypted + * bio * * Return: 0 on success, -errno on failure. The caller is responsible for *zeroizing both blk_key and raw_key when done with them. 
@@ -328,7 +330,8 @@ int __blk_crypto_rq_bio_prep(struct request *rq, struct bio *bio, int blk_crypto_init_key(struct blk_crypto_key *blk_key, const u8 *raw_key, enum blk_crypto_mode_num crypto_mode, unsigned int dun_bytes, - unsigned int data_unit_size) +