On Fri, Oct 12, 2018 at 4:38 PM wrote:
>
> From: Filipe Manana
>
> At inode.c:compress_file_range(), under the "free_pages_out" label, we can
> end up dereferencing the "pages" pointer when it has a NULL value. This
> case happens when "start" has a value of 0 and we fail to allocate memory
for the "pages" pointer.
From: Darrick J. Wong
Prior to remapping blocks, it is necessary to remove pages from the
destination file's page cache. Unfortunately, the truncation is not
aggressive enough -- if page size > block size, we'll end up zeroing
subpage blocks instead of removing them. So, round the start offset
down and the end offset up to page boundaries.
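As a sketch of that rounding, assuming the kernel's round_down()/round_up()
helpers and illustrative pos_out/len names (the patch body is truncated in
this digest):

    /* Drop whole pages instead of zeroing sub-page blocks: widen the
     * truncation range outward to page boundaries before remapping.
     * truncate_inode_pages_range() takes an inclusive end offset. */
    truncate_inode_pages_range(&inode_out->i_data,
                               round_down(pos_out, PAGE_SIZE),
                               round_up(pos_out + len, PAGE_SIZE) - 1);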
From: Darrick J. Wong
Change the ocfs2 remap code to allow for returning partial results.
Signed-off-by: Darrick J. Wong
---
fs/ocfs2/file.c | 7 +
fs/ocfs2/refcounttree.c | 73 ++-
fs/ocfs2/refcounttree.h | 12
3 files ch
From: Darrick J. Wong
Back when the XFS reflink code only supported clone_file_range, we were
only able to return zero or negative error codes to userspace. However,
now that copy_file_range (which returns bytes copied) can use XFS'
clone_file_range, we have the opportunity to return partial res
From: Darrick J. Wong
Now that we've moved the partial EOF block checks to the VFS helpers, we
can remove the redundant functionality from XFS.
Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
---
fs/xfs/xfs_reflink.c | 20
1 file changed, 20 deletions(-)
dif
From: Darrick J. Wong
Prior to remapping blocks, it is necessary to remove pages from the
destination file's page cache. Unfortunately, the truncation is not
aggressive enough -- if page size > block size, we'll end up zeroing
subpage blocks instead of removing them. So, round the start offset
down and the end offset up to page boundaries.
From: Darrick J. Wong
Create a RFR_TO_SRC_EOF flag to explicitly declare that the caller wants
the remap implementation to remap to the end of the source file, once
the files are locked.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/ioctl.c | 3 ++-
fs/nfsd/vfs.
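For illustration, resolving such a flag once both inodes are locked could
take this shape (RFR_TO_SRC_EOF is the flag this patch introduces; the
surrounding names are illustrative, not the committed code):

    /* "Remap to source EOF" can only be resolved safely once the
     * inodes are locked and i_size can no longer change under us. */
    if (remap_flags & RFR_TO_SRC_EOF)
        len = i_size_read(inode_in) - pos_in;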
From: Darrick J. Wong
For a given dedupe request, the bytes_deduped field in the control
structure tells userspace if we managed to deduplicate some, but not all
of, the requested regions starting from the file offsets supplied.
However, due to sloppy coding, the current dedupe code returns
FILE_
From: Darrick J. Wong
There are no callers of vfs_dedupe_file_range_compare, so we might as
well make it a static helper and remove the export.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/read_write.c | 191 ++--
includ
From: Darrick J. Wong
When cloning blocks into another file, truncate the page cache before we
start remapping blocks so that concurrent reads wait for us to finish.
Signed-off-by: Darrick J. Wong
---
fs/ocfs2/refcounttree.c | 10 --
1 file changed, 4 insertions(+), 6 deletions(-)
From: Darrick J. Wong
Plumb the remap flags through the filesystem from the vfs function
dispatcher all the way to the prep function to prepare for behavior
changes in subsequent patches.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/ocfs2/file.c | 2 +-
fs/ocfs
From: Darrick J. Wong
Plumb in a remap flag that enables the filesystem remap handler to
shorten remapping requests for callers that can handle it. Now
copy_file_range can report partial success (in case we run up against
alignment problems, resource limits, etc.).
We also enable CAN_SHORTEN fo
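For illustration, the shortening check could take this shape (the
CAN_SHORTEN flag name follows this series; the rest is an illustrative
sketch rather than the committed code):

    u64 blkmask = i_blocksize(inode_in) - 1;

    if (len & blkmask) {
        /* Request is not block aligned at the tail. */
        if (!(remap_flags & REMAP_FILE_CAN_SHORTEN))
            return -EINVAL;     /* caller can't cope with less */
        len &= ~blkmask;        /* trim to block alignment */
    }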
From: Darrick J. Wong
Pass the same remap flags to generic_remap_checks for consistency.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/read_write.c | 2 +-
include/linux/fs.h | 2 +-
mm/filemap.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
From: Darrick J. Wong
Change the remap_file_range functions to take a number of bytes to
operate upon and return the number of bytes they operated on. This is a
requirement for allowing fs implementations to return short clone/dedupe
results to the user, which will enable us to obey resource lim
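For reference, this is the shape the unified file operation ends up with
in struct file_operations: it takes a byte count and returns the number of
bytes remapped or a negative errno:

    loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
                               struct file *file_out, loff_t pos_out,
                               loff_t len, unsigned int remap_flags);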
From: Darrick J. Wong
Plumb a remap_flags argument through the vfs_dedupe_file_range_one
functions so that dedupe can take advantage of it.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/overlayfs/file.c | 3 ++-
fs/read_write.c | 9 ++---
include/linux/fs.h
From: Darrick J. Wong
Plumb a remap_flags argument through the {do,vfs}_clone_file_range
functions so that clone can take advantage of it.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/ioctl.c | 2 +-
fs/nfsd/vfs.c | 2 +-
fs/overlayfs/copy_up.c
From: Darrick J. Wong
Since we use clone_verify_area for both clone and dedupe range checks,
rename the function to make it clear that it's for both.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/read_write.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(
From: Darrick J. Wong
vfs_clone_file_prep is a generic function to be called by filesystem
implementations only. Rename the prefix to generic_ and make it more
clear that it applies to remap operations, not just clones.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
---
fs/oc
From: Darrick J. Wong
Don't bother calling the filesystem for a zero-length dedupe request;
we can return zero and exit.
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Amir Goldstein
---
fs/read_write.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/
From: Darrick J. Wong
Move the file range checks from vfs_clone_file_prep into a separate
generic_remap_checks function so that all the checks are collected in a
central location. This forms the basis for adding more checks from
generic_write_checks that will make cloning's input checking more
c
From: Darrick J. Wong
Create a new VFS helper to handle inode metadata updates when remapping
into a file. If the operation can possibly alter the file contents, we
must update the ctime and mtime and remove security privileges, just
like we do for regular file writes. Wire up ocfs2 to ensure c
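A minimal sketch of the helper's core, assuming the usual VFS write-path
calls (file_remove_privs() and file_update_time() are existing kernel
functions; the wrapper shape here is illustrative):

    /* Remapping can change file_out's contents, so treat it like a
     * regular write: drop setuid/setgid and caps, then bump the
     * timestamps. */
    ret = file_remove_privs(file_out);
    if (ret)
        return ret;
    return file_update_time(file_out);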
From: Darrick J. Wong
A deduplication data corruption is exposed in XFS and btrfs. It is
caused by extending the block match range to include the partial EOF
block, but then allowing unknown data beyond EOF to be considered a
"match" to data in the destination file because the comparison is only
From: Darrick J. Wong
Combine the clone_file_range and dedupe_file_range operations into a
single remap_file_range file operation dispatch since they're
fundamentally the same operation. The differences between the two can
be made in the prep functions.
Signed-off-by: Darrick J. Wong
Reviewed-
From: Darrick J. Wong
File range remapping, if allowed to run past the destination file's EOF,
is an optimization on a regular file write. Regular file writes that
extend the file length are subject to various constraints which are not
checked by range cloning.
This is a correctness problem bec
From: Darrick J. Wong
vfs_clone_file_prep_inodes cannot return 0 if it is asked to remap from
a zero byte file because that's what btrfs does.
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
fs/read_write.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/read_wr
Hi all,
Dave, Eric, and I have been chasing a stale data exposure bug in the XFS
reflink implementation, and tracked it down to reflink forgetting to do
some of the file-extending activities that must happen for regular
writes.
We then started auditing the clone, dedupe, and copyfile code and
rea
From: Darrick J. Wong
Add a "xfs_tprintk" macro so that developers can use trace_printk to
print out arbitrary debugging information with the XFS device name
attached to the trace output.
Signed-off-by: Darrick J. Wong
---
fs/xfs/xfs_error.h | 6 ++
1 file changed, 6 insertions(+)
dif
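The patch body is truncated in this digest; a plausible shape for such a
macro, using the existing trace_printk() facility (treat this as a sketch,
not the committed definition):

    #define xfs_tprintk(mp, fmt, args...) \
        trace_printk("dev %d:%d " fmt, \
                     MAJOR((mp)->m_super->s_dev), \
                     MINOR((mp)->m_super->s_dev), ##args)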
From: Filipe Manana
At inode.c:compress_file_range(), under the "free_pages_out" label, we can
end up dereferencing the "pages" pointer when it has a NULL value. This
case happens when "start" has a value of 0 and we fail to allocate memory
for the "pages" pointer. When that happens we jump to th
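The shape of the fix is to guard the cleanup path so a NULL "pages" array
is never dereferenced (a sketch; "nr_pages" and the loop body are
illustrative):

    free_pages_out:
        if (pages) {    /* may be NULL if the allocation failed */
            for (i = 0; i < nr_pages; i++) {
                WARN_ON(pages[i]->mapping);
                put_page(pages[i]);
            }
            kfree(pages);
        }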
On Tue, 11 Sep 2018 15:34:45 -0700 Omar Sandoval wrote:
> From: Omar Sandoval
>
> Btrfs will need this for swap file support.
>
Acked-by: Andrew Morton
On Tue, 11 Sep 2018 15:34:44 -0700 Omar Sandoval wrote:
> From: Omar Sandoval
>
> The SWP_FILE flag serves two purposes: to make swap_{read,write}page()
> go through the filesystem, and to make swapoff() call
> ->swap_deactivate(). For Btrfs, we want the latter but not the former,
> so split th
On Fri, Oct 12, 2018 at 8:35 PM Josef Bacik wrote:
>
> This could result in a really bad case where we do something like
>
> evict
>   evict_refill_and_join
>     btrfs_commit_transaction
>       btrfs_run_delayed_iputs
>         evict
>           evict_refill_and_join
>             btrfs_commit_t
From: Filipe Manana
At inode.c:compress_file_range(), under the "free_pages_out" label, we can
end up dereferencing the "pages" pointer when it has a NULL value. This
case happens when "start" has a value of 0 and we fail to allocate memory
for the "pages" pointer. When that happens we jump to th
On Thu, Oct 11, 2018 at 5:13 AM Darrick J. Wong wrote:
>
> From: Darrick J. Wong
>
> A deduplication data corruption is exposed by fstests generic/505 on
> XFS.
(and btrfs)
Btw, the generic test I wrote was indeed numbered 505; however, it was
never committed and there's now a generic/505 which
We were not handling the reserved byte accounting properly for data
references. Metadata was fine: if it errored out, the error paths would
free the bytes_reserved count and pin the extent, but even that missed one
of the error cases. So instead move this handling up into
run_one_delayed_ref so we a
We may abort the transaction during a commit and not have a chance to
run the pending bgs stuff, which will leave block groups on our list and
cause us accounting issues and leaked memory. Fix this by running the
pending bgs when we cleanup a transaction.
Reviewed-by: Omar Sandoval
Signed-off-by
We could generate a lot of delayed refs in evict but never have any left
over space from our block rsv to make up for that fact. So reserve some
extra space and give it to the transaction so it can be used to refill
the delayed refs rsv every loop through the truncate path.
Signed-off-by: Josef B
Instead of open coding this stuff use the helper instead.
Reviewed-by: Nikolay Borisov
Signed-off-by: Josef Bacik
---
fs/btrfs/disk-io.c | 7 +--
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 121ab180a78a..fe1f229320ef 100644
---
We don't need it, rsv->size is set once and never changes throughout
its lifetime, so just use that for the reserve size.
Reviewed-by: David Sterba
Signed-off-by: Josef Bacik
---
fs/btrfs/inode.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/inod
The throttle path doesn't take cleaner_delayed_iput_mutex, which means
we could think we're done flushing iputs in the data space reservation
path when we could have a throttler doing an iput. There's no real
reason to serialize the delayed iput flushing, so instead of taking the
cleaner_delayed_i
If we flip read-only before we initiate writeback on all dirty pages for
ordered extents we've created then we'll have ordered extents left over
on umount, which results in all sorts of bad things happening. Fix this
by making sure we wait on ordered extents if we have to do the aborted
transactio
This could result in a really bad case where we do something like
evict
  evict_refill_and_join
    btrfs_commit_transaction
      btrfs_run_delayed_iputs
        evict
          evict_refill_and_join
            btrfs_commit_t
... forever
We have plenty of other places where we run del
We have this open coded in btrfs_destroy_delayed_refs, use the helper
instead.
Reviewed-by: Nikolay Borisov
Signed-off-by: Josef Bacik
---
fs/btrfs/disk-io.c | 11 ++-
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 39bd158466c
I noticed in a giant dbench run that we spent a lot of time on lock
contention while running transaction commit. This is because dbench
results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
they all run the delayed refs first thing, so they all contend with
each other. This lead
For FLUSH_LIMIT flushers we really can only allocate chunks and flush
delayed inode items, everything else is problematic. I added a bunch of
new states and it led to weirdness in the FLUSH_LIMIT case because I
forgot about how it worked. So instead explicitly declare the states
that are ok for
My work email is completely useless, switch it to my personal address so
I get emails on an account I actually pay attention to.
Signed-off-by: Josef Bacik
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 32fbc6f732d4..7723dc958e9
While testing my backport I noticed there was a panic if I ran
generic/416 generic/417 generic/418 all in a row. This just happened to
uncover a race where we had outstanding IO after we destroy all of our
workqueues, and then we'd go to queue the endio work on those free'd
workqueues. This is be
The cleaner thread usually takes care of delayed iputs, with the
exception of the btrfs_end_transaction_throttle path. The cleaner
thread only gets woken up every 30 seconds, so instead wake it up to do
its work so that we can free up that space as quickly as possible.
Reviewed-by: Filipe Manana
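A sketch of the wakeup (cleaner_kthread is the existing btrfs_fs_info
field; the call site here is illustrative):

    /* Don't wait up to 30 seconds for the cleaner's periodic wakeup;
     * poke it now so the delayed iputs are processed promptly. */
    wake_up_process(fs_info->cleaner_kthread);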
We still need to do all of the accounting cleanup for pending block
groups if we abort. So set ret to trans->aborted so that, if we aborted,
the cleanup happens and everybody is happy.
Reviewed-by: Omar Sandoval
Signed-off-by: Josef Bacik
---
fs/btrfs/extent-tree.c | 8 +++-
1 file changed, 7
From: Josef Bacik
We can't use entry->bytes if our entry is a bitmap entry, we need to use
entry->max_extent_size in that case. Fix up all the logic to make this
consistent.
Signed-off-by: Josef Bacik
---
fs/btrfs/free-space-cache.c | 30 --
1 file changed, 20 inse
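A sketch of the rule being enforced (field names follow
free-space-cache.c; the helper itself is illustrative):

    static u64 entry_extent_size(const struct btrfs_free_space *entry)
    {
        /* For a bitmap entry, ->bytes is the total free bytes in the
         * bitmap, not a contiguous run; use the tracked largest-extent
         * value instead. */
        if (entry->bitmap)
            return entry->max_extent_size;
        return entry->bytes;
    }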
We don't need the trans except to get the delayed_refs_root, so just
pass the delayed_refs_root into btrfs_delayed_ref_lock and call it a
day.
Reviewed-by: Nikolay Borisov
Signed-off-by: Josef Bacik
---
fs/btrfs/delayed-ref.c | 5 +
fs/btrfs/delayed-ref.h | 2 +-
fs/btrfs/extent-tree.c | 2
With severe fragmentation we can end up with our inode rsv size being
huge during writeout, which would cause us to need to make very large
metadata reservations. However we may not actually need that much once
writeout is complete. So instead try to make our reservation, and if we
couldn't make
When we insert the file extent once the ordered extent completes we free
the reserved extent reservation as it'll have been migrated to the
bytes_used counter. However if we error out after this step we'll still
clear the reserved extent reservation, resulting in a negative
accounting of the reser
From: Josef Bacik
We need to clear the max_extent_size when we clear bits from a bitmap
since it could have been from the range that contains the
max_extent_size.
Reviewed-by: Liu Bo
Signed-off-by: Josef Bacik
---
fs/btrfs/free-space-cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git
We weren't doing any of the accounting cleanup when we aborted
transactions. Fix this by making cleanup_ref_head_accounting global and
calling it from the abort code; this fixes the issue where our
accounting was all wrong after the fs aborts.
Signed-off-by: Josef Bacik
---
fs/btrfs/ctree.h
The first thing we do is loop through the list, so this

    if (!list_empty())
        btrfs_create_pending_block_groups();

thing is just wasted space.
Reviewed-by: Nikolay Borisov
Signed-off-by: Josef Bacik
---
fs/btrfs/extent-tree.c | 3 +--
fs/btrfs/transaction.c | 6 ++
2 files changed, 3 in
I ran into an issue where there was some reference being held on an
inode that I couldn't track. This assert wasn't triggered, but it at
least rules out that we're doing something stupid.
Reviewed-by: Omar Sandoval
Reviewed-by: David Sterba
Signed-off-by: Josef Bacik
---
fs/btrfs/disk-io.c | 1 +
Allocating new chunks modifies both the extent and chunk tree, which can
trigger new chunk allocations. So instead of doing list_for_each_safe,
just do while (!list_empty()) so we make sure we don't exit with other
pending bg's still on our list.
Reviewed-by: Omar Sandoval
Reviewed-by: Liu Bo
R
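A sketch of the loop shape described above (names follow the btrfs code of
that era; illustrative, not the committed hunk):

    /* Creating one block group can append more entries to new_bgs, so
     * re-test the list head each iteration rather than caching a next
     * pointer the way list_for_each_entry_safe() does. */
    while (!list_empty(&trans->new_bgs)) {
        block_group = list_first_entry(&trans->new_bgs,
                        struct btrfs_block_group_cache, bg_list);
        /* ... insert block group item, update the chunk tree ... */
        list_del_init(&block_group->bg_list);
    }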
From: Josef Bacik
We do this dance in cleanup_ref_head and check_ref_cleanup, unify it
into a helper and cleanup the calling functions.
Signed-off-by: Josef Bacik
Reviewed-by: Omar Sandoval
---
fs/btrfs/delayed-ref.c | 14 ++
fs/btrfs/delayed-ref.h | 3 ++-
fs/btrfs/extent-tree.c
v3->v4:
- added stacktraces to all the changelogs
- added the various reviewed-by's.
- fixed the loop in inode_rsv_refill to not use goto again;
v2->v3:
- reworked the truncate/evict throttling, we were still occasionally hitting
enospc aborts in production in these paths because we were too agg
We pick the number of ref's to run based on the number of ref heads, and
only make the decision to stop once we've processed entire ref heads, so
only count the ref heads we've run and bail once we've hit the number of
ref heads we wanted to process.
Signed-off-by: Josef Bacik
---
fs/btrfs/exten
If we use up our block group before allocating a new one we'll easily
get a max_extent_size that's set really really low, which will result in
a lot of fragmentation. We need to make sure we're resetting the
max_extent_size when we add a new chunk or add new space.
Reviewed-by: Filipe Manana
Sig
We have a bunch of magic to make sure we're throttling delayed refs when
truncating a file. Now that we have a delayed refs rsv and a mechanism
for refilling that reserve simply use that instead of all of this magic.
Signed-off-by: Josef Bacik
---
fs/btrfs/inode.c | 79 -
From: Josef Bacik
Traditionally we've had voodoo in btrfs to account for the space that
delayed refs may take up by having a global_block_rsv. This works most
of the time, except when it doesn't. We've had issues reported and seen
in production where sometimes the global reserve is exhausted du
From: Josef Bacik
The cleanup_extent_op function actually would run the extent_op if it
needed running, which made the name sort of a misnomer. Change it to
run_and_cleanup_extent_op, and move the actual cleanup work to
cleanup_extent_op so it can be used by check_ref_cleanup() in order to
unify
With my change to no longer take into account the global reserve for
metadata allocation chunks we have this side-effect for mixed block
group fs'es where we are no longer allocating enough chunks for the
data/metadata requirements. To deal with this add an ALLOC_CHUNK_FORCE
step to the flushing st
For enospc_debug, having the block rsvs is super helpful to see if we've
done something wrong.
Signed-off-by: Josef Bacik
Reviewed-by: Omar Sandoval
Reviewed-by: David Sterba
---
fs/btrfs/extent-tree.c | 15 +++
1 file changed, 15 insertions(+)
diff --git a/fs/btrfs/extent-tree.c b
From: Josef Bacik
We were missing some quota cleanups in check_ref_cleanup, so break the
ref head accounting cleanup into a helper and call that from both
check_ref_cleanup and cleanup_ref_head. This will hopefully ensure that
we don't screw up accounting in the future for other things that we a
may_commit_transaction will skip committing the transaction if we don't
have enough pinned space or if we're trying to find space for a SYSTEM
chunk. However if we have pending free block groups in this transaction
we still want to commit as we may be able to allocate a chunk to make
our reservati
If we're allocating a new space cache inode it's likely going to be
under a transaction handle, so we need to use memalloc_nofs_save() in
order to avoid deadlocks, and more importantly lockdep messages that
make xfstests fail.
Reviewed-by: Omar Sandoval
Signed-off-by: Josef Bacik
Reviewed-by: Da
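A sketch of the pattern (memalloc_nofs_save()/memalloc_nofs_restore() are
the existing scoped-allocation API from <linux/sched/mm.h>; the allocation
in the middle is illustrative):

    unsigned int nofs_flag;

    /* Everything allocated in this scope is implicitly GFP_NOFS, so
     * reclaim can't recurse into the filesystem while we hold an open
     * transaction. */
    nofs_flag = memalloc_nofs_save();
    inode = new_inode(sb);
    memalloc_nofs_restore(nofs_flag);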
From: Josef Bacik
We use this number to figure out how many delayed refs to run, but
__btrfs_run_delayed_refs really only checks every time we need a new
delayed ref head, so we always run at least one ref head completely, no
matter how many items are on it. Fix the accounting to only be
ad
We're getting a lockdep splat because we take the dio_sem under the
log_mutex. What we really need is to protect fsync() from logging an
extent map for an extent we never waited on higher up, so just guard the
whole thing with dio_sem.
==
WARNIN
We want to release the unused reservation we have since it refills the
delayed refs reserve, which will make everything go smoother when
running the delayed refs if we're short on our reservation.
Reviewed-by: Omar Sandoval
Reviewed-by: Liu Bo
Reviewed-by: Nikolay Borisov
Signed-off-by: Josef B
We've done this forever because of the voodoo around knowing how much
space we have. However we have better ways of doing this now, and on
normal file systems we'll easily have a global reserve of 512MiB, and
since metadata chunks are usually 1GiB that means we'll allocate
metadata chunks more rea
From: Josef Bacik
max_extent_size is supposed to be the largest contiguous range for the
space info, and ctl->free_space is the total free space in the block
group. We need to keep track of these separately and _only_ use the
max_free_space if we don't have a max_extent_size, as that means our
o
Delayed iputs means we can have final iputs of deleted inodes in the
queue, which could potentially generate a lot of pinned space that could
be free'd. So before we decide to commit the transaction for ENOSPC
reasons, run the delayed iputs so that any potential space is free'd up.
If there is and
With the introduction of the per-inode block_rsv it became possible to
have really really large reservation requests made because of data
fragmentation. Since the ticket stuff assumed that we'd always have
relatively small reservation requests it just killed all tickets if we
were unable to satisf
On Thu, Oct 11, 2018 at 03:54:22PM -0400, Josef Bacik wrote:
> We were not handling the reserved byte accounting properly for data
> references. Metadata was fine: if it errored out, the error paths would
> free the bytes_reserved count and pin the extent, but even that missed one
> of the error case
On Thu, Oct 11, 2018 at 03:54:08PM -0400, Josef Bacik wrote:
> From: Josef Bacik
>
> We can't use entry->bytes if our entry is a bitmap entry, we need to use
> entry->max_extent_size in that case. Fix up all the logic to make this
> consistent.
>
> Signed-off-by: Josef Bacik
> ---
> fs/btrfs/
On Thu, Oct 11, 2018 at 02:33:55PM -0400, Josef Bacik wrote:
> On Thu, Oct 04, 2018 at 01:24:24PM +0200, David Sterba wrote:
> > On Fri, Sep 28, 2018 at 07:17:46AM -0400, Josef Bacik wrote:
> > > may_commit_transaction will skip committing the transaction if we don't
> > > have enough pinned space
On Fri, Sep 21, 2018 at 03:20:29PM +0800, Qu Wenruo wrote:
> And add a one-line comment explaining what we're doing for each loop.
>
> Signed-off-by: Qu Wenruo
> ---
> changelog:
> v2:
> Use rbtree_postorder_for_each_entry_safe() to replace for() loop.
1-2 reviewed and added to 4.20 queue, thank
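For reference, the usage shape of that helper from <linux/rbtree.h> (the
entry type here is illustrative):

    struct rb_root root = RB_ROOT;
    struct foo {
        struct rb_node node;
    };
    struct foo *entry, *next;

    /* Postorder visits children before their parent, so every entry
     * can be freed safely as it is visited. */
    rbtree_postorder_for_each_entry_safe(entry, next, &root, node)
        kfree(entry);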
On Fri, Oct 12, 2018 at 11:16:16AM +1100, Dave Chinner wrote:
> On Wed, Oct 10, 2018 at 09:12:54PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong
> >
> > A deduplication data corruption is exposed by fstests generic/505 on
> > XFS. It is caused by extending the block match range to inclu
On Fri, Oct 12, 2018 at 12:22:26PM +1100, Dave Chinner wrote:
> On Wed, Oct 10, 2018 at 09:15:19PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong
> >
> > Back when the XFS reflink code only supported clone_file_range, we were
> > only able to return zero or negative error codes to usersp
On 2018/10/12 9:52 PM, Josef Bacik wrote:
> On Fri, Oct 12, 2018 at 02:18:19PM +0800, Qu Wenruo wrote:
>> We have a complex loop design for find_free_extent(), which has different
>> behavior for each loop; some loops even include new chunk allocation.
>>
>> Instead of putting such long code into find
On Fri, Oct 12, 2018 at 02:18:18PM +0800, Qu Wenruo wrote:
> This patch will extract unclustered extent allocation code into
> find_free_extent_unclustered().
>
> And this helper function will use its return value to indicate what to do
> next.
>
> This should make find_free_extent() a little easier
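A sketch of the dispatch convention being described (the return values and
names are illustrative of the idea, not the committed code):

    ret = find_free_extent_unclustered(block_group, last_ptr, &ffe_ctl);
    if (ret > 0)
        goto loop;      /* nothing here, move to the next block group */
    if (ret < 0)
        goto out;       /* hard error, abort the search */
    /* ret == 0: found an offset, fall through to the reservation */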
On Fri, Oct 12, 2018 at 02:18:17PM +0800, Qu Wenruo wrote:
> We have two main methods to find free extents inside a block group:
> 1) clustered allocation
> 2) unclustered allocation
>
> This patch will extract the clustered allocation into
> find_free_extent_clustered() to make it a little easier
On Fri, Oct 12, 2018 at 02:18:16PM +0800, Qu Wenruo wrote:
> Instead of tons of different local variables in find_free_extent(),
> extract them into find_free_extent_ctl structure, and add better
> explanation for them.
>
> Some modifications may look redundant, but will later greatly simplify
> f
On Fri, Oct 12, 2018 at 02:18:19PM +0800, Qu Wenruo wrote:
> We have a complex loop design for find_free_extent(), which has different
> behavior for each loop; some loops even include new chunk allocation.
>
> Instead of putting such long code into find_free_extent() and making it
> harder to read, ju
On Fri, Oct 12, 2018 at 10:03:55AM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana
>
> When writing out a block group free space cache we can end up deadlocking
> with ourselves on an extent buffer lock, resulting in a warning like the
> following:
>
> [245043.379979] WARNING: CPU: 4 PID: 2
On 2018/10/12 8:02 PM, fdman...@kernel.org wrote:
> From: Filipe Manana
>
> At inode.c:evict_inode_truncate_pages(), when we iterate over the inode's
> extent states, we access an extent state record's "state" field after we
> unlocked the inode's io tree lock. This can lead to a use-after-free
From: Filipe Manana
At inode.c:evict_inode_truncate_pages(), when we iterate over the inode's
extent states, we access an extent state record's "state" field after we
unlocked the inode's io tree lock. This can lead to a use-after-free issue
because after we unlock the io tree that extent state r
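The shape of the fix is to copy whatever is needed out of the record while
the io tree lock is still held (a sketch; names follow extent_io.c):

    u64 start, end;
    unsigned state_flags;

    spin_lock(&io_tree->lock);
    /* ... look up the extent_state record "state" ... */
    start = state->start;
    end = state->end;
    state_flags = state->state;     /* copy before unlocking */
    spin_unlock(&io_tree->lock);
    /* use the copies below; "state" may be freed once unlocked */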
On 2018/10/12 6:35 PM, Jürgen Herrmann wrote:
> On 12.10.2018 10:19, Qu Wenruo wrote:
>
> [snip]
>
>> Please run the following command:
>>
>> # btrfs ins dump-tree --follow -b 166456229888
>>
>> It could be caused by the fact that btrfs-progs --repair doesn't handle
>> the log tree well.
>>
>> If
On 12.10.2018 10:19, Qu Wenruo wrote:
[snip]
Please run the following command:
# btrfs ins dump-tree --follow -b 166456229888
It could be caused by the fact that btrfs-progs --repair doesn't handle
the log tree well.
If that's the case, "btrfs rescue zero-log" should help.
But anyway, feel fr
[snip]
>
> Hi there!
>
> I ran btrfs check --repair on the filesystem. I don't have this log
> anymore, as it was sitting on the repaired fs, which is now dead.
> After repairing it I could still mount the fs.
>
> As my btrfs send problem still persists (another thread), I decided to
>
On 12.10.2018 01:56, Qu Wenruo wrote:
On 2018/10/12 4:30 AM, Jürgen Herrmann wrote:
Hi!
I just did a btrfs check on my laptop's btrfs filesystem while I was
on the USB stick rescue system. The following errors were reported:
root@mint:/home/mint# btrfs check /dev/mapper/sda3crypt
Checking fi
On 2018/10/12 5:13 PM, Nikolay Borisov wrote:
>
>
> On 12.10.2018 11:46, Qu Wenruo wrote:
>>
>>
>> On 2018/10/12 2:53 PM, Nikolay Borisov wrote:
>>>
>>>
>>> On 12.10.2018 09:42, Qu Wenruo wrote:
The only user of it is "btrfs inspect dump-super".
Signed-off-by: Qu Wenruo
---
On 12.10.2018 11:46, Qu Wenruo wrote:
>
>
> On 2018/10/12 2:53 PM, Nikolay Borisov wrote:
>>
>>
>> On 12.10.2018 09:42, Qu Wenruo wrote:
>>> The only user of it is "btrfs inspect dump-super".
>>>
>>> Signed-off-by: Qu Wenruo
>>> ---
>>> cmds-inspect-dump-super.c | 4 ++--
>>> ctree.h
From: Filipe Manana
When writing out a block group free space cache we can end up deadlocking
with ourselves on an extent buffer lock, resulting in a warning like the
following:
[245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251
btrfs_tree_lock+0x1be/0x1d0 [btrfs]
[245043.392792
On 12.10.2018 07:06, Anand Jain wrote:
> This patch adds the CLI
> btrfs device forget [dev]
> to remove the given device structure in the kernel if the device
> is unmounted. If no argument is given, it shall remove all stale
> devices (devices which are not mounted) from the kernel.
>
> Signed-off-by: An
On Thu, Oct 11, 2018 at 8:57 PM Josef Bacik wrote:
>
> If we use up our block group before allocating a new one we'll easily
> get a max_extent_size that's set really really low, which will result in
> a lot of fragmentation. We need to make sure we're resetting the
> max_extent_size when we add
On Thu, Oct 11, 2018 at 8:57 PM Josef Bacik wrote:
>
> From: Josef Bacik
>
> max_extent_size is supposed to be the largest contiguous range for the
> space info, and ctl->free_space is the total free space in the block
> group. We need to keep track of these separately and _only_ use the
> max_f