> On Feb 5, 2021, at 1:39 AM, Eric Biggers <[email protected]> wrote:
>
> On Thu, Feb 04, 2021 at 03:21:38PM -0800, Boris Burkov wrote:
>> +/*
>> + * drop all the items for this inode with this key_type. Before
>> + * doing a verity enable we cleanup any existing verity items.
>> + *
>> + * This is also used to clean up if a verity enable failed half way
>> + * through
>> + */
>> +static int drop_verity_items(struct btrfs_inode *inode, u8 key_type)
>
> I assume you checked whether there's already code in btrfs that does this?
Nikolay correctly called out btrfs_truncate_inode_items(), but I wanted to keep
my fingers out of that in v0 of the patches.
> This
> seems like a fairly generic thing that might be needed elsewhere in btrfs.
> Similarly for write_key_bytes() and read_key_bytes().
>
It might make sense to make read/write_key_bytes() our generic functions, but
unless I missed something I didn’t see great fits.
>> +/*
>> + * fsverity does a two pass setup for reading the descriptor, in the first
>> pass
>> + * it calls with buf_size = 0 to query the size of the descriptor,
>> + * and then in the second pass it actually reads the descriptor off
>> + * disk.
>> + */
>> +static int btrfs_get_verity_descriptor(struct inode *inode, void *buf,
>> + size_t buf_size)
>> +{
>> + ssize_t ret = 0;
>> +
>> + if (!buf_size) {
>> + return read_key_bytes(BTRFS_I(inode),
>> + BTRFS_VERITY_DESC_ITEM_KEY,
>> + 0, NULL, (u64)-1, NULL);
>> + }
>> +
>> + ret = read_key_bytes(BTRFS_I(inode),
>> + BTRFS_VERITY_DESC_ITEM_KEY, 0,
>> + buf, buf_size, NULL);
>> + if (ret < 0)
>> + return ret;
>> +
>> + if (ret != buf_size)
>> + return -EIO;
>> +
>> + return buf_size;
>> +}
>
> This doesn't return the right value when buf_size != 0 && buf_size !=
> desc_size.
> It's supposed to return the actual size or -ERANGE, like getxattr() does; see
> the comment above fsverity_operations::get_verity_descriptor.
>
> It doesn't matter much because that case doesn't happen currently, but it
> would
> be nice to keep things consistent.
>
Forgot all about this, but I was going to suggest optimizing this part a bit,
since btrfs is doing a tree search for each call. I’d love to have a way to do
it in one search-allocate-copy round.
>> +/*
>> + * reads and caches a merkle tree page. These are stored in the btree,
>> + * but we cache them in the inode's address space after EOF.
>> + */
>> +static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
>> + pgoff_t index,
>> + unsigned long num_ra_pages)
>> +{
>> + struct page *p;
>> + u64 start = index << PAGE_SHIFT;
>> + unsigned long mapping_index;
>> + ssize_t ret;
>> + int err;
>> +
>> + /*
>> + * the file is readonly, so i_size can't change here. We jump
>> + * some pages past the last page to cache our merkles. The goal
>> + * is just to jump past any hugepages that might be mapped in.
>> + */
>> + mapping_index = (i_size_read(inode) >> PAGE_SHIFT) + 2048 + index;
>
> btrfs allows files of up to the page cache limit of MAX_LFS_FILESIZE already.
> So if the Merkle tree pages are cached past EOF like this, it would be
> necessary
> to limit the size of files that verity can be enabled on, like what ext4 and
> f2fs do. See the -EFBIG case in pagecache_write() in fs/ext4/verity.c and
> fs/f2fs/verity.c.
>
> Note that this extra limit isn't likely to be encountered in practice, as it
> would only decrease a very large limit by about 1%, and fs-verity isn't likely
> to be used on terabyte-sized files.
>
> However maybe there's a way to avoid this weirdness entirely, e.g. by
> allocating
> a temporary in-memory inode and using its address_space?
This is a good point, I think maybe just do the EFBIG dance for now? It’s not
a factor for the disk format, and we can add the separate address space any
time. For the common case today, I would prefer avoiding the overhead of
allocating the temp inode/address_space.
-chris