"decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-11 Thread james harvey
100% reproducible, booting from disk, or even Arch installation ISO. Kernel 4.16.7. btrfs-progs v4.16. Reading one of two journalctl files causes a kernel oops. Initially ran into it from "journalctl --list-boots", but cat'ing the file does it too. I believe this shows there's compressed data

Re: [PATCH 2/2] vfs: dedupe should return EPERM if permission is not granted

2018-05-11 Thread Duncan
Darrick J. Wong posted on Fri, 11 May 2018 17:06:34 -0700 as excerpted: > On Fri, May 11, 2018 at 12:26:51PM -0700, Mark Fasheh wrote: >> Right now we return EINVAL if a process does not have permission to dedupe a >> file. This was an oversight on my part. EPERM gives a true description of >>

Re: [PATCH 2/2] vfs: dedupe should return EPERM if permission is not granted

2018-05-11 Thread Amir Goldstein
On Sat, May 12, 2018 at 3:06 AM, Darrick J. Wong wrote: > On Fri, May 11, 2018 at 12:26:51PM -0700, Mark Fasheh wrote: >> Right now we return EINVAL if a process does not have permission to dedupe a >> file. This was an oversight on my part. EPERM gives a true description

Re: [PATCH 1/2] vfs: allow dedupe of user owned read-only files

2018-05-11 Thread Adam Borowski
On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote: > The permission check in vfs_dedupe_file_range() is too coarse - We > only allow dedupe of the destination file if the user is root, or > they have the file open for write. > > This effectively limits a non-root user from deduping

Re: [PATCH 1/2 V2] hoist BTRFS_IOC_[SG]ET_FSLABEL to vfs

2018-05-11 Thread Darrick J. Wong
On Fri, May 11, 2018 at 04:41:45PM +0200, David Sterba wrote: > On Fri, May 11, 2018 at 09:36:09AM -0500, Eric Sandeen wrote: > > On 5/11/18 9:32 AM, Chris Mason wrote: > > > On 11 May 2018, at 10:10, David Sterba wrote: > > > > > >> On Thu, May 10, 2018 at 08:16:09PM +0100, Al Viro wrote: > >

Re: [PATCH] btrfs: qgroup: Search commit root for rescan to avoid missing extent

2018-05-11 Thread Qu Wenruo
On 2018-05-12 01:08, Jeff Mahoney wrote: > On 5/3/18 3:20 AM, Qu Wenruo wrote: >> When doing qgroup rescan using the following script (modified from >> btrfs/017 test case), we can sometimes hit qgroup corruption. >> >> -- >> umount $dev &> /dev/null >> umount $mnt &> /dev/null >> >>

Re: [PATCH 2/2] vfs: dedupe should return EPERM if permission is not granted

2018-05-11 Thread Darrick J. Wong
On Fri, May 11, 2018 at 12:26:51PM -0700, Mark Fasheh wrote: > Right now we return EINVAL if a process does not have permission to dedupe a > file. This was an oversight on my part. EPERM gives a true description of > the nature of our error, and EINVAL is already used for the case that the >

Re: [PATCH 1/2] vfs: allow dedupe of user owned read-only files

2018-05-11 Thread Darrick J. Wong
On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote: > The permission check in vfs_dedupe_file_range() is too coarse - We > only allow dedupe of the destination file if the user is root, or > they have the file open for write. > > This effectively limits a non-root user from deduping

Re: [RFC v2 3/4] ext4: add verifier check for symlink with append/immutable flags

2018-05-11 Thread Jan Kara
On Thu 10-05-18 16:13:58, Luis R. Rodriguez wrote: > The Linux VFS does not allow a way to set append/immuttable > attributes to symlinks, this is just not possible. If this is > detected inform the user as the filesystem must be corrupted. > > Signed-off-by: Luis R. Rodriguez

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 06:16:15PM +0100, Filipe Manana wrote: > On Fri, May 11, 2018 at 5:49 PM, David Sterba wrote: > > On Fri, May 11, 2018 at 05:25:50PM +0100, Filipe Manana wrote: > >> On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: > >> > The dedupe

Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 09:32:28PM +0100, Al Viro wrote: > On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote: > > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote: > > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote: > > > > do_blockdev_direct_IO(struct kiocb

Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()

2018-05-11 Thread Al Viro
On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote: > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote: > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote: > > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > > > struct

Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote: > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote: > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > > struct block_device *bdev, struct iov_iter *iter, > > get_block_t

[PATCH v4 06/12] Btrfs: delete dead code in btrfs_orphan_commit_root()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_orphan_commit_root() tries to delete an orphan item for a subvolume in the tree root, but we don't actually insert that item in the first place. See commit 0a0d4415e338 ("Btrfs: delete dead code in btrfs_orphan_add()"). We can get rid of it.

[PATCH v4 09/12] Btrfs: fix ENOSPC caused by orphan items reservations

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Currently, we keep space reserved for all inode orphan items until the inode is evicted (i.e., all references to it are dropped). We hit an issue where an application would keep a bunch of deleted files open (by design) and thus keep a large amount of space

[PATCH v4 12/12] Btrfs: reserve space for O_TMPFILE orphan item deletion

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_link() calls btrfs_orphan_del() if it's linking an O_TMPFILE but it doesn't reserve space to do so. Even before the removal of the orphan_block_rsv it wasn't using it. Fixes: ef3b9af50bfa ("Btrfs: implement inode_operations callback tmpfile")

[PATCH v4 10/12] Btrfs: get rid of unused orphan infrastructure

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Now that we don't keep long-standing reservations for orphan items, root->orphan_block_rsv isn't used. We can get rid of it, along with: - root->orphan_lock, which was used to protect root->orphan_block_rsv - root->orphan_inodes, which was used as a refcount

[PATCH v4 07/12] Btrfs: don't return ino to ino cache if inode item removal fails

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval In btrfs_evict_inode(), if btrfs_truncate_inode_items() fails, the inode item will still be in the tree but we still return the ino to the ino cache. That will blow up later when someone tries to allocate that ino, so don't return it to the cache. Fixes:

[PATCH v4 05/12] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Now that we don't add orphan items for truncate, there can't be races on adding or deleting an orphan item, so this bit is unnecessary. Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/btrfs/btrfs_inode.h

[PATCH v4 00/12] Btrfs: orphan and truncate fixes

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Hi, This is the fourth (and hopefully final) version of the orphan item early ENOSPC and related fixes. Changes since v3: - Changed another stale comment in patch 1 - Moved BTRFS_INODE_ORPHAN_META_RESERVED flag removal to patch 10 instead of patch 9 -

[PATCH v4 01/12] Btrfs: update stale comments referencing vmtruncate()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Commit a41ad394a03b ("Btrfs: convert to the new truncate sequence") changed btrfs_setsize() to call truncate_setsize() instead of vmtruncate() but didn't update the comment above it. truncate_setsize() never fails (the IS_SWAPFILE() check happens elsewhere),

[PATCH v4 03/12] Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_free_extent() can fail because of ENOMEM. There's no reason to panic here, we can just abort the transaction. Fixes: f4b9aa8d3b87 ("btrfs_truncate") Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval ---

[PATCH v4 08/12] Btrfs: refactor btrfs_evict_inode() reserve refill dance

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval The truncate loop in btrfs_evict_inode() does two things at once: - It refills the temporary block reserve, potentially stealing from the global reserve or committing - It calls btrfs_truncate_inode_items() The tangle of continues hides the fact that these

[PATCH v4 02/12] Btrfs: fix error handling in btrfs_truncate_inode_items()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_truncate_inode_items() uses two variables for error handling, ret and err. These are not handled consistently, leading to a couple of bugs. - Errors from btrfs_del_items() are handled but not propagated to the caller - If btrfs_run_delayed_refs()

[PATCH v4 04/12] Btrfs: stop creating orphan items for truncate

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Currently, we insert an orphan item during a truncate so that if there's a crash, we don't leak extents past the on-disk i_size. However, since commit 7f4f6e0a3f6d ("Btrfs: only update disk_i_size as we remove extents"), we keep disk_i_size in sync with the

[PATCH v4 11/12] Btrfs: renumber BTRFS_INODE_ runtime flags

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval We got rid of BTRFS_INODE_HAS_ORPHAN_ITEM and BTRFS_INODE_ORPHAN_META_RESERVED, so we can renumber the flags to make them consecutive again. Signed-off-by: Omar Sandoval --- fs/btrfs/btrfs_inode.h | 16 1 file changed, 8

Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()

2018-05-11 Thread Al Viro
On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote: > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > struct block_device *bdev, struct iov_iter *iter, > get_block_t get_block, dio_iodone_t end_io, > -

[PATCH 2/2] vfs: dedupe should return EPERM if permission is not granted

2018-05-11 Thread Mark Fasheh
Right now we return EINVAL if a process does not have permission to dedupe a file. This was an oversight on my part. EPERM gives a true description of the nature of our error, and EINVAL is already used for the case that the filesystem does not support dedupe. Signed-off-by: Mark Fasheh

[PATCH 1/2] vfs: allow dedupe of user owned read-only files

2018-05-11 Thread Mark Fasheh
The permission check in vfs_dedupe_file_range() is too coarse - We only allow dedupe of the destination file if the user is root, or they have the file open for write. This effectively limits a non-root user from deduping their own read-only files. As file data during a dedupe does not change,
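
The decision described in this two-patch series can be modelled in a few lines. The following is only an illustrative sketch in Python, not the kernel code: the function name `dedupe_check`, its boolean parameters, and the order of the checks are assumptions made for the example. The three permission conditions (admin, open for write, owner of the file) and the EINVAL/EPERM split come from the patch descriptions in this listing.

```python
import errno


def dedupe_check(fs_supports_dedupe, is_admin, open_for_write, owns_file):
    """Hypothetical model of the dedupe destination check.

    Returns 0 when dedupe would be allowed, or a negative errno value.
    """
    # Per patch 2/2, EINVAL is reserved for "filesystem does not support dedupe".
    if not fs_supports_dedupe:
        return -errno.EINVAL
    # Existing rule: root, or the destination is open for write.
    # Patch 1/2 adds: the user owns the (possibly read-only) file.
    if is_admin or open_for_write or owns_file:
        return 0
    # Per patch 2/2, a permission failure is reported as EPERM, not EINVAL.
    return -errno.EPERM


# A non-root owner with a read-only descriptor is allowed under patch 1/2.
print(dedupe_check(True, False, False, True))
```

With this split a caller can distinguish "not permitted" from "not supported" by the error code alone.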

[PATCH 0/2] vfs: better dedupe permission check

2018-05-11 Thread Mark Fasheh
Hi, The following patches fix a couple of issues with the permission check we do in vfs_dedupe_file_range(). The first patch expands our check to allow dedupe of a readonly file if the user owns it. Existing behavior is that we'll allow dedupe only if: - the user is an admin (root) - the user

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread Timofey Titovets
Fri, 11 May 2018 at 20:32, Omar Sandoval : > On Fri, May 11, 2018 at 06:49:16PM +0200, David Sterba wrote: > > On Fri, May 11, 2018 at 05:25:50PM +0100, Filipe Manana wrote: > > > On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: > > > > The dedupe

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 06:49:16PM +0200, David Sterba wrote: > On Fri, May 11, 2018 at 05:25:50PM +0100, Filipe Manana wrote: > > On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: > > > The dedupe range is 16 MiB, with 4KiB pages and 8 byte pointers, the > > > arrays can be

Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 12:53:36PM +0300, Nikolay Borisov wrote: > > > On 11.05.2018 09:30, Omar Sandoval wrote: > > From: Omar Sandoval > > > > Hi, everyone, > > > > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in > > order to pass around some state to

Re: [PATCH v3 01/11] Btrfs: remove stale comment referencing vmtruncate()

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 12:19:43PM +0200, David Sterba wrote: > On Fri, May 11, 2018 at 12:56:06AM -0700, Omar Sandoval wrote: > > From: Omar Sandoval > > > > Commit a41ad394a03b ("Btrfs: convert to the new truncate sequence") > > changed vmtruncate() to truncate_setsize() but

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread Filipe Manana
On Fri, May 11, 2018 at 5:49 PM, David Sterba wrote: > On Fri, May 11, 2018 at 05:25:50PM +0100, Filipe Manana wrote: >> On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: >> > The dedupe range is 16 MiB, with 4KiB pages and 8 byte pointers, the >> > arrays

Re: [PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread Omar Sandoval
On Fri, May 11, 2018 at 06:51:30PM +0200, David Sterba wrote: > On Fri, May 11, 2018 at 12:10:38PM -0400, Josef Bacik wrote: > > I told him to do this, these flags aren't exposed anywhere are they? > > They are in-kernel specific stuff, please tell me we aren't exposing > > these via sysfs? > >

Re: [PATCH] btrfs: qgroup: Search commit root for rescan to avoid missing extent

2018-05-11 Thread Jeff Mahoney
On 5/3/18 3:20 AM, Qu Wenruo wrote: > When doing qgroup rescan using the following script (modified from > btrfs/017 test case), we can sometimes hit qgroup corruption. > > -- > umount $dev &> /dev/null > umount $mnt &> /dev/null > > mkfs.btrfs -f -n 64k $dev > mount $dev $mnt > >

Re: [PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 12:10:38PM -0400, Josef Bacik wrote: > I told him to do this, these flags aren't exposed anywhere are they? > They are in-kernel specific stuff, please tell me we aren't exposing > these via sysfs? No worries, they're completely internal, just that shifting the number

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 05:25:50PM +0100, Filipe Manana wrote: > On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: > > The dedupe range is 16 MiB, with 4KiB pages and 8 byte pointers, the > > arrays can be 32KiB large. To avoid allocation failures due to > > fragmented

Re: [PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread Filipe Manana
On Fri, May 11, 2018 at 4:57 PM, David Sterba wrote: > The dedupe range is 16 MiB, with 4KiB pages and 8 byte pointers, the > arrays can be 32KiB large. To avoid allocation failures due to > fragmented memory, use the allocation with fallback to vmalloc. > > Signed-off-by: David

Re: [PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread Josef Bacik
I told him to do this, these flags aren't exposed anywhere are they? They are in-kernel specific stuff, please tell me we aren't exposing these via sysfs? Josef On Fri, May 11, 2018 at 6:06 AM, David Sterba wrote: > On Fri, May 11, 2018 at 12:56:10AM -0700, Omar Sandoval wrote:

[PATCH] btrfs: use kvzalloc for EXTENT_SAME temporary data

2018-05-11 Thread David Sterba
The dedupe range is 16 MiB, with 4KiB pages and 8 byte pointers, the arrays can be 32KiB large. To avoid allocation failures due to fragmented memory, use the allocation with fallback to vmalloc. Signed-off-by: David Sterba --- This depends on the patches that remove the 16MiB
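
The 32KiB figure in this message follows directly from the stated sizes. A minimal sketch of the arithmetic, assuming one pointer per page in the dedupe range; the helper name `page_pointer_array_bytes` is hypothetical and does not appear in the patch:

```python
def page_pointer_array_bytes(range_bytes, page_size, pointer_size):
    """Bytes needed for an array holding one pointer per page in the range."""
    pages = range_bytes // page_size
    return pages * pointer_size


# 16 MiB range, 4 KiB pages, 8-byte pointers: 4096 pages * 8 bytes.
size = page_pointer_array_bytes(16 * 1024 * 1024, 4 * 1024, 8)
print(size // 1024, "KiB")
```

An array of that size spans several contiguous 4 KiB pages, which is the allocation the message says can fail when memory is fragmented.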

Re: Strange behavior (possible bugs) in btrfs

2018-05-11 Thread Filipe Manana
On Mon, Apr 30, 2018 at 5:04 PM, Vijay Chidambaram wrote: > Hi, > > We found two more cases where the btrfs behavior is a little strange. > In one case, an fsync-ed file goes missing after a crash. In the > other, a renamed file shows up in both directories after a crash. > >

[PATCH] fstests: generic test for fsync of file with xattrs

2018-05-11 Thread fdmanana
From: Filipe Manana Test that xattrs are not lost after calling fsync multiple times with a filesystem commit in between the fsync calls. This test is motivated by a bug found in btrfs which is fixed by a patch for the linux kernel titled: Btrfs: fix xattr loss after power

[PATCH] Btrfs: fix xattr loss after power failure

2018-05-11 Thread fdmanana
From: Filipe Manana If a file has xattrs, we fsync it, to ensure we clear the flags BTRFS_INODE_NEEDS_FULL_SYNC and BTRFS_INODE_COPY_EVERYTHING from its inode, the current transaction commits and then we fsync it (without either of those bits being set in its inode), we end up

Any chance to get snapshot-aware defragmentation?

2018-05-11 Thread Niccolò Belli
Hi, I'm waiting for this feature since years and initially it seemed like something which would have been worked on, sooner or later. A long time had passed without any progress on this, so I would like to know if there is any technical limitation preventing this or if it's something which

Re: [PATCH V3 0/3] Btrfs: btrfs_dedupe_file_range() ioctl, remove 16MiB restriction

2018-05-11 Thread David Sterba
On Wed, May 02, 2018 at 08:15:35AM +0300, Timofey Titovets wrote: > At now btrfs_dedupe_file_range() restricted to 16MiB range for > limit locking time and memory requirement for dedup ioctl() > > For too big input range code silently set range to 16MiB > > Let's remove that restriction by do

Re: [PATCH] btrfs: qgroup: Finish rescan when hit the last leaf of extent tree

2018-05-11 Thread Jeff Mahoney
On 5/4/18 1:56 AM, Qu Wenruo wrote: > Under the following case, qgroup rescan can double account cowed tree > blocks: > > In this case, extent tree only has one tree block. > > - > | transid=5 last committed=4 > | btrfs_qgroup_rescan_worker() > | |- btrfs_start_transaction() > | | transid = 5 >

Re: [PATCH 1/2 V2] hoist BTRFS_IOC_[SG]ET_FSLABEL to vfs

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 09:36:09AM -0500, Eric Sandeen wrote: > On 5/11/18 9:32 AM, Chris Mason wrote: > > On 11 May 2018, at 10:10, David Sterba wrote: > > > >> On Thu, May 10, 2018 at 08:16:09PM +0100, Al Viro wrote: > >>> On Thu, May 10, 2018 at 01:13:57PM -0500, Eric Sandeen wrote: >

Re: [PATCH 1/2 V2] hoist BTRFS_IOC_[SG]ET_FSLABEL to vfs

2018-05-11 Thread Eric Sandeen
On 5/11/18 9:32 AM, Chris Mason wrote: > On 11 May 2018, at 10:10, David Sterba wrote: > >> On Thu, May 10, 2018 at 08:16:09PM +0100, Al Viro wrote: >>> On Thu, May 10, 2018 at 01:13:57PM -0500, Eric Sandeen wrote: Move the btrfs label ioctls up to the vfs for general use. This

Re: [PATCH 1/2 V2] hoist BTRFS_IOC_[SG]ET_FSLABEL to vfs

2018-05-11 Thread Chris Mason
On 11 May 2018, at 10:10, David Sterba wrote: On Thu, May 10, 2018 at 08:16:09PM +0100, Al Viro wrote: On Thu, May 10, 2018 at 01:13:57PM -0500, Eric Sandeen wrote: Move the btrfs label ioctls up to the vfs for general use. This retains 256 chars as the maximum size through the interface,

Re: [PATCH v3 07/11] Btrfs: don't return ino to ino cache if inode item removal fails

2018-05-11 Thread Josef Bacik
On Fri, May 11, 2018 at 12:56:12AM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > In btrfs_evict_inode(), if btrfs_truncate_inode_items() fails, the inode > item will still be in the tree but we still return the ino to the ino > cache. That will blow up later when someone

Re: [PATCH v3 06/11] Btrfs: delete dead code in btrfs_orphan_commit_root()

2018-05-11 Thread Josef Bacik
On Fri, May 11, 2018 at 12:56:11AM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > btrfs_orphan_commit_root() tries to delete an orphan item for a > subvolume in the tree root, but we don't actually insert that item in > the first place. See commit 0a0d4415e338 ("Btrfs:

Re: [PATCH v3 04/11] Btrfs: stop creating orphan items for truncate

2018-05-11 Thread Josef Bacik
On Fri, May 11, 2018 at 12:56:09AM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > Currently, we insert an orphan item during a truncate so that if there's > a crash, we don't leak extents past the on-disk i_size. However, since > commit 7f4f6e0a3f6d ("Btrfs: only update

Re: [PATCH 1/2 V2] fs: hoist BTRFS_IOC_[SG]ET_FSLABEL to vfs

2018-05-11 Thread David Sterba
On Thu, May 10, 2018 at 08:16:09PM +0100, Al Viro wrote: > On Thu, May 10, 2018 at 01:13:57PM -0500, Eric Sandeen wrote: > > Move the btrfs label ioctls up to the vfs for general use. > > > > This retains 256 chars as the maximum size through the interface, which > > is the btrfs limit and AFAIK

Re: [RFC v2 4/4] btrfs: verify symlinks with append/immutable flags

2018-05-11 Thread David Sterba
On Thu, May 10, 2018 at 04:13:59PM -0700, Luis R. Rodriguez wrote: > The Linux VFS does not allow a way to set append/immuttable ^^ Typo, in all 3 patches. > attributes to symlinks, this is just not possible. If this is > detected inform

Re: [PATCH 00/17] Freespace tree big fs_info cleanup

2018-05-11 Thread David Sterba
On Thu, May 10, 2018 at 03:44:39PM +0300, Nikolay Borisov wrote: > Here is a series which cleans _all_ freespace tree functions from a redundant > fs_info argument since they already take either a transaction or a > block_group_cache structure. Both of those structures contain a reference to >

Re: [PATCH v3 3/3] btrfs: Do super block verification before writing it to disk

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 01:35:27PM +0800, Qu Wenruo wrote: > +/* > + * Check the validation of super block at write time. > + * Some checks like bytenr check will be skipped as their values will be > + * overwritten soon. > + * Extra checks like csum type and incompact flags will be executed here.

Re: [PATCH v3 0/3] btrfs: Add write time super block validation

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 01:35:24PM +0800, Qu Wenruo wrote: > This patchset can be fetched from github: > https://github.com/adam900710/linux/tree/write_time_sb_check > > We have 2 reports about corrupted btrfs super block, which has some garbage > in its super block, but otherwise it's completely

Re: [PATCH v3 1/3] btrfs: Move btrfs_check_super_valid() to avoid forward declaration

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 11:36:54AM +0200, David Sterba wrote: > On Fri, May 11, 2018 at 01:35:25PM +0800, Qu Wenruo wrote: > > Just move btrfs_check_super_valid() before its single caller to avoid > > forward declaration. > > Please don't move functions just to get rid of the forward

Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 12:53:36PM +0300, Nikolay Borisov wrote: > > > On 11.05.2018 09:30, Omar Sandoval wrote: > > From: Omar Sandoval > > > > Hi, everyone, > > > > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in > > order to pass around some state to

Re: [PATCH v3 01/11] Btrfs: remove stale comment referencing vmtruncate()

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 12:56:06AM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > Commit a41ad394a03b ("Btrfs: convert to the new truncate sequence") > changed vmtruncate() to truncate_setsize() but didn't update the comment > above it. truncate_setsize() never fails (the

Re: [PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 12:56:10AM -0700, Omar Sandoval wrote: > --- a/fs/btrfs/btrfs_inode.h > +++ b/fs/btrfs/btrfs_inode.h > @@ -23,13 +23,12 @@ > #define BTRFS_INODE_ORPHAN_META_RESERVED 1 > #define BTRFS_INODE_DUMMY2 > #define BTRFS_INODE_IN_DEFRAG

Re: [PATCH v2] btrfs: incremental send, fix BUG when invalid memory access

2018-05-11 Thread Filipe Manana
On Fri, May 11, 2018 at 7:34 AM, robbieko wrote: > From: Robbie Ko > > [BUG] > btrfs incremental send BUG happens when creating a snapshot of snapshot > that is being used by send. > > [REASON] > The problem can happen if while we are doing a send

Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O

2018-05-11 Thread Nikolay Borisov
On 11.05.2018 09:30, Omar Sandoval wrote: > From: Omar Sandoval > > Hi, everyone, > > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in > order to pass around some state to get_block() and submit_io(). This > hack is ugly and unnecessary because the data we

Re: [PATCH v3 09/11] Btrfs: fix ENOSPC caused by orphan items reservations

2018-05-11 Thread Nikolay Borisov
On 11.05.2018 10:56, Omar Sandoval wrote: > From: Omar Sandoval > > Currently, we keep space reserved for all inode orphan items until the > inode is evicted (i.e., all references to it are dropped). We hit an > issue where an application would keep a bunch of deleted files

Re: [PATCH v3 2/3] btrfs: Refactor btrfs_check_super_valid()

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 01:35:26PM +0800, Qu Wenruo wrote: > Refactor btrfs_check_super_valid() by the ways: > > 1) Rename it to btrfs_validate_mount_super() >Now it's more obvious when the function should be called. > > 2) Extract core check routine into __validate_super() >So later

Re: [PATCH v3 3/3] btrfs: Do super block verification before writing it to disk

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 01:35:27PM +0800, Qu Wenruo wrote: > There are already 2 reports about strangely corrupted super blocks, > where csum still matches but extra garbage gets slipped into super block. > > The corruption would looks like: > -- > superblock: bytenr=65536, device=/dev/sdc1 >

Re: [PATCH v3 1/3] btrfs: Move btrfs_check_super_valid() to avoid forward declaration

2018-05-11 Thread David Sterba
On Fri, May 11, 2018 at 01:35:25PM +0800, Qu Wenruo wrote: > Just move btrfs_check_super_valid() before its single caller to avoid > forward declaration. Please don't move functions just to get rid of the forward declarations. Moving functions to make them static or if they're in a wrong .c is

Re: [PATCH v3 10/11] Btrfs: get rid of unused orphan infrastructure

2018-05-11 Thread Nikolay Borisov
On 11.05.2018 10:56, Omar Sandoval wrote: > From: Omar Sandoval > > Now that we don't keep long-standing reservations for orphan items, > root->orphan_block_rsv isn't used. We can get rid of it, along with > root->orphan_lock, which was used to protect it, root->orphan_inodes,

Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O

2018-05-11 Thread David Sterba
On Thu, May 10, 2018 at 11:30:09PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > Hi, everyone, > > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in > order to pass around some state to get_block() and submit_io(). This > hack is ugly and unnecessary

Re: [PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread Nikolay Borisov
On 11.05.2018 10:56, Omar Sandoval wrote: > From: Omar Sandoval > > Now that we don't add orphan items for truncate, there can't be races on > adding or deleting an orphan item, so this bit is unnecessary. > > Signed-off-by: Omar Sandoval > --- >

Re: [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions

2018-05-11 Thread Nikolay Borisov
On 11.05.2018 08:44, Anand Jain wrote: > > > On 04/27/2018 05:21 PM, Nikolay Borisov wrote: >> This is in preparation of fixing delalloc inodes leakage on transaction >> abort. Also export the new function. >> >> Signed-off-by: Nikolay Borisov > >  nit: I think we are

[PATCH v3 09/11] Btrfs: fix ENOSPC caused by orphan items reservations

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Currently, we keep space reserved for all inode orphan items until the inode is evicted (i.e., all references to it are dropped). We hit an issue where an application would keep a bunch of deleted files open (by design) and thus keep a large amount of space

[PATCH v3 04/11] Btrfs: stop creating orphan items for truncate

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Currently, we insert an orphan item during a truncate so that if there's a crash, we don't leak extents past the on-disk i_size. However, since commit 7f4f6e0a3f6d ("Btrfs: only update disk_i_size as we remove extents"), we keep disk_i_size in sync with the

[PATCH v3 06/11] Btrfs: delete dead code in btrfs_orphan_commit_root()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_orphan_commit_root() tries to delete an orphan item for a subvolume in the tree root, but we don't actually insert that item in the first place. See commit 0a0d4415e338 ("Btrfs: delete dead code in btrfs_orphan_add()"). We can get rid of it.

[PATCH v3 10/11] Btrfs: get rid of unused orphan infrastructure

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Now that we don't keep long-standing reservations for orphan items, root->orphan_block_rsv isn't used. We can get rid of it, along with root->orphan_lock, which was used to protect it, root->orphan_inodes, which was used as a refcount for it, and

[PATCH v3 07/11] Btrfs: don't return ino to ino cache if inode item removal fails

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval In btrfs_evict_inode(), if btrfs_truncate_inode_items() fails, the inode item will still be in the tree but we still return the ino to the ino cache. That will blow up later when someone tries to allocate that ino, so don't return it to the cache. Fixes:

[PATCH v3 08/11] Btrfs: refactor btrfs_evict_inode() reserve refill dance

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval The truncate loop in btrfs_evict_inode() does two things at once: - It refills the temporary block reserve, potentially stealing from the global reserve or committing - It calls btrfs_truncate_inode_items() The tangle of continues hides the fact that these

[PATCH v3 11/11] Btrfs: reserve space for O_TMPFILE orphan item deletion

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_link() calls btrfs_orphan_del() if it's linking an O_TMPFILE but it doesn't reserve space to do so. Even before the removal of the orphan_block_rsv it wasn't using it. Fixes: ef3b9af50bfa ("Btrfs: implement inode_operations callback tmpfile")

[PATCH v3 05/11] Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Now that we don't add orphan items for truncate, there can't be races on adding or deleting an orphan item, so this bit is unnecessary. Signed-off-by: Omar Sandoval --- fs/btrfs/btrfs_inode.h | 13 fs/btrfs/inode.c | 76

[PATCH v3 03/11] Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_free_extent() can fail because of ENOMEM. There's no reason to panic here, we can just abort the transaction. Fixes: f4b9aa8d3b87 ("btrfs_truncate") Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval ---

[PATCH v3 02/11] Btrfs: fix error handling in btrfs_truncate_inode_items()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval btrfs_truncate_inode_items() uses two variables for error handling, ret and err. These are not handled consistently, leading to a couple of bugs. - Errors from btrfs_del_items() are handled but not propagated to the caller - If btrfs_run_delayed_refs()

[PATCH v3 01/11] Btrfs: remove stale comment referencing vmtruncate()

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Commit a41ad394a03b ("Btrfs: convert to the new truncate sequence") changed vmtruncate() to truncate_setsize() but didn't update the comment above it. truncate_setsize() never fails (the IS_SWAPFILE() check happens elsewhere), so remove the comment.

[PATCH v3 00/11] Btrfs: orphan and truncate fixes

2018-05-11 Thread Omar Sandoval
From: Omar Sandoval Hi, This is v3 of the fixes for the orphan item early ENOSPC issue we hit at Facebook. The big change is that I now also got rid of BTRFS_INODE_HAS_ORPHAN_ITEM (thanks, Nikolay) and shuffled the patches around so there is less churn. Changes since v2: - Add

[PATCH 00/11] btrfs-progs: Rework of "subvolume list/show" and relax the root privileges of them

2018-05-11 Thread Tomohiro Misono
Hello, This series is an updated version of [RFC PATCH v3 0/7] btrfs-progs: Allow normal user to call "subvolume list/show" [1] and requires new ioctls which can be found in ML as [PATCH v4 0/3] btrfs: Add three new unprivileged ioctls to allow normal users to call "sub list/show" etc. Or,

[PATCH 11/11] btrfs-progs: test: Add cli-test/009 to check subvolume list for both root and normal user

2018-05-11 Thread Tomohiro Misono
Signed-off-by: Tomohiro Misono --- tests/cli-tests/009-subvolume-list/test.sh | 136 + 1 file changed, 136 insertions(+) create mode 100755 tests/cli-tests/009-subvolume-list/test.sh diff --git

[PATCH 03/11] btrfs-progs: libbtrfsutil: Relax the privileges of util_subvolume_info()

2018-05-11 Thread Tomohiro Misono
By using new ioctl (BTRFS_IOC_GET_SUBVOL_INFO), this commit allows non-privileged user to call util_subvolume_info() as long as @id is zero (user can only get the information of the subvolume which he can open). Signed-off-by: Tomohiro Misono ---

[PATCH 04/11] btrfs-progs: libbtrfsutil: Factor out btrfs_util_subvolume_iterator_next()

2018-05-11 Thread Tomohiro Misono
Factor out the main logic of btrfs_util_subvolume_iterator_next(). This is preparation work to update the behavior of this function and relax the required root privilege. No functional change happens. Signed-off-by: Tomohiro Misono ---

[PATCH 05/11] btrfs-progs: libbtrfsutil: Update the behavior of subvolume iterator and relax the privileges

2018-05-11 Thread Tomohiro Misono
By using new ioctls (BTRFS_IOC_GET_ROOTREF_INFO/BTRFS_IOC_INO_LOOKUP_USER), this commit updates the subvolume iterator when it is created by btrfs_util_create_subvolume_iterator() with @top zero (i.e. if the iterator is created from a given path/fd). In that case, - an iterator can be created from

[PATCH 06/11] btrfs-progs: sub list: Use libbtrfsutil for subvolume list

2018-05-11 Thread Tomohiro Misono
This is a copy of the following non-merged patch originally written by Omar Sandoval: btrfs-progs: use libbtrfsutil for subvolume list except this commit keeps the libbtrfs implementation which the above commit tries to remove (therefore this adds suffix _v2 for struct/function). Original Author: Omar

[PATCH 01/11] btrfs-progs: ioctl/libbtrfsutil: Add 3 definitions of new unprivileged ioctl

2018-05-11 Thread Tomohiro Misono
Add 3 definitions of new unprivileged ioctl (BTRFS_IOC_GET_SUBVOL_INFO, BTRFS_IOC_GET_SUBVOL_ROOTREF and BTRFS_IOC_INO_LOOKUP_USER). They will be used to implement the user version of "btrfs subvolume list" etc. Signed-off-by: Tomohiro Misono --- ioctl.h

[PATCH 08/11] btrfs-progs: utils: Fallback to open without O_NOATIME flag in find_mount_root():

2018-05-11 Thread Tomohiro Misono
The O_NOATIME flag requires that the effective UID of the process match the file's owner or that the process have the CAP_FOWNER capability. Fall back to open without the O_NOATIME flag so that a non-privileged user can also call find_mount_root(). This is preparation work to allow a non-privileged user to call "subvolume show".
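The fallback described in this patch can be sketched in userspace C roughly as follows. This is a minimal illustration, not the btrfs-progs code: the helper name open_noatime_fallback() is hypothetical, and it assumes Linux, where open(2) fails with EPERM when O_NOATIME is requested by a process that neither owns the file nor has CAP_FOWNER.

```c
#define _GNU_SOURCE	/* O_NOATIME is a Linux extension */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of btrfs-progs): try to open @path with
 * O_NOATIME first, and retry without it if the kernel refuses with EPERM.
 * Returns a file descriptor, or -1 with errno set as by open(2).
 */
static int open_noatime_fallback(const char *path, int flags)
{
	int fd;

	assert(path != NULL);

	fd = open(path, flags | O_NOATIME);
	if (fd < 0 && errno == EPERM) {
		/* Not the owner and no CAP_FOWNER: retry without O_NOATIME. */
		fd = open(path, flags & ~O_NOATIME);
	}
	return fd;
}
```

Only EPERM triggers the retry, so other failures such as ENOENT are still reported to the caller unchanged.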

[PATCH 07/11] btrfs-progs: sub list: Change the default behavior of "subvolume list" and allow non-privileged user to call it

2018-05-11 Thread Tomohiro Misono
Change the default behavior of "subvolume list" and allow non-privileged user to call it as well. From this commit, by default it only lists subvolumes under the specified path (incl. the path itself except top-level subvolume). Also, if kernel supports new ioctls

[PATCH 09/11] btrfs-progs: sub show: Allow non-privileged user to call "subvolume show"

2018-05-11 Thread Tomohiro Misono
Allow non-privileged user to call subvolume show (-r or -u cannot be used) if new ioctls (BTRFS_IOC_GET_SUBVOL_INFO etc.) are available. The behavior for root user is the same as before. There are some output differences between root and user: root ... subvolume path is from top-level subvolume

[PATCH v4 2/3] btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF

2018-05-11 Thread Tomohiro Misono
Add unprivileged ioctl BTRFS_IOC_GET_SUBVOL_ROOTREF which returns ROOT_REF information of the subvolume containing this inode except the subvolume name (this is to prevent a potential name leak). The subvolume name will be obtained by the user version of ino_lookup ioctl

[PATCH v4 1/3] btrfs: Add unprivileged ioctl which returns subvolume information

2018-05-11 Thread Tomohiro Misono
Add new unprivileged ioctl BTRFS_IOC_GET_SUBVOL_INFO which returns the information of subvolume containing this inode. (i.e. returns the information in ROOT_ITEM and ROOT_BACKREF.) Signed-off-by: Tomohiro Misono --- fs/btrfs/ioctl.c | 129

[PATCH v4 3/3] btrfs: Add unprivileged version of ino_lookup ioctl

2018-05-11 Thread Tomohiro Misono
Add unprivileged version of ino_lookup ioctl BTRFS_IOC_INO_LOOKUP_USER to allow normal users to call "btrfs subvolume list/show" etc. in combination with BTRFS_IOC_GET_SUBVOL_INFO/BTRFS_IOC_GET_SUBVOL_ROOTREF. This can be used like BTRFS_IOC_INO_LOOKUP but the argument is different. This is

[PATCH v4 0/3] btrfs: Add three new unprivileged ioctls to allow normal users to call "sub list/show" etc.

2018-05-11 Thread Tomohiro Misono
changelog: v3 -> v4 - call btrfs_next_leaf() after btrfs_search_slot() when the slot position exceeds the number of items - rebased to current misc-next v2 -> v3 - fix kbuild test bot warning v1 -> v2 - completely reimplement 1st/2nd ioctl to have user friendly api - various cleanup,

Re: [PATCH v2 09/12] Btrfs: get rid of root->orphan_block_rsv and root->orphan_lock

2018-05-11 Thread Omar Sandoval
On Thu, May 10, 2018 at 11:48:28PM -0700, Omar Sandoval wrote: > On Fri, May 11, 2018 at 09:44:36AM +0300, Nikolay Borisov wrote: > > > > > > On 11.05.2018 03:11, Omar Sandoval wrote: > > > From: Omar Sandoval > > > > > > Now that we don't keep long-standing reservations for
