Due to the pagecache limit of 32bit systems, btrfs can't access metadata
at or beyond (ULONG_MAX + 1) << PAGE_SHIFT.
This is 16T for 4K page size while 256T for 64K page size.
And unlike other fses, btrfs uses internally mapped u64 address space for
all of its metadata, this is more tricky than ot
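The 16T and 256T figures quoted above follow directly from the expression (ULONG_MAX + 1) << PAGE_SHIFT. A minimal sketch of that arithmetic, assuming a 32-bit unsigned long modelled with a fixed-width type; the names ULONG_MAX_32 and pagecache_limit_32bit are hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: model a 32-bit unsigned long explicitly so the
 * arithmetic gives the same result on any build host.
 */
#define ULONG_MAX_32 UINT32_MAX

/* First byte offset that a 32-bit page index can no longer address. */
static uint64_t pagecache_limit_32bit(unsigned int page_shift)
{
	/* Keeps the 64-bit shift below from overflowing. */
	assert(page_shift < 32);
	return ((uint64_t)ULONG_MAX_32 + 1) << page_shift;
}
```

With a page shift of 12 (4K pages) this evaluates to 2^44 bytes, i.e. 16T, and with a page shift of 16 (64K pages) to 2^48 bytes, i.e. 256T.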
Hi,
> btrfs_wipe_existing_sb() misses calling blkid_do_fullprobe() to do
> the real probe. After calling blkid_new_probe() &
> blkid_probe_set_device() to setup blkid_probe context, it directly
> calls blkid_probe_lookup_value(). This results in
> blkid_probe_lookup_value returning -1, because pr-
Hi,
> On Fri, Apr 09, 2021 at 08:08:00AM +0800, Wang Yugui wrote:
> > Hi,
> >
> > > > kernel: at least 5.10.26/5.10.27/5.10.28
> > > >
> > > > This problem is triggered by our application, NOT xfstests.
> > > > But our application has some heavy write load just like
> > > > xfstest/generic/476
> -----Original Message-----
> From: Darrick J. Wong
> Sent: Friday, April 9, 2021 5:53 AM
> Subject: Re: [PATCH v4 1/7] fsdax: Introduce dax_iomap_cow_copy()
>
> On Thu, Apr 08, 2021 at 08:04:26PM +0800, Shiyang Ruan wrote:
> > In the case where the iomap is a write operation and iomap is not
>
On Fri, Apr 09, 2021 at 08:08:00AM +0800, Wang Yugui wrote:
> Hi,
>
> > > kernel: at least 5.10.26/5.10.27/5.10.28
> > >
> > > This problem is triggered by our application, NOT xfstests.
> > > But our application has some heavy write load just like
> > > xfstest/generic/476.
> > > Our applicati
> -----Original Message-----
> From: Darrick J. Wong
> Sent: Friday, April 9, 2021 5:11 AM
> Subject: Re: [PATCH v2 2/3] fsdax: Factor helper: dax_fault_actor()
>
> On Wed, Apr 07, 2021 at 09:38:22PM +0800, Shiyang Ruan wrote:
> > The core logic in the two dax page fault functions is similar. S
> -----Original Message-----
> From: Su Yue
> Subject: Re: [PATCH v4 7/7] fs/xfs: Add dedupe support for fsdax
>
>
> On Thu 08 Apr 2021 at 20:04, Shiyang Ruan wrote:
>
> > Add xfs_break_two_dax_layouts() to break layout for two dax files.
> > Then call compare range function only when files
On Thu, Apr 08, 2021 at 04:16:28PM -0700, Eric Biggers wrote:
> On Thu, Apr 08, 2021 at 11:49:37AM -0700, Boris Burkov wrote:
> > On Thu, Apr 08, 2021 at 11:41:42AM -0700, Eric Biggers wrote:
> > > On Thu, Apr 08, 2021 at 11:30:12AM -0700, Boris Burkov wrote:
> > > >
> > > > Note that there is a b
Hi,
> > kernel: at least 5.10.26/5.10.27/5.10.28
> >
> > This problem is triggered by our application, NOT xfstests.
> > But our application has some heavy write load just like
> > xfstest/generic/476.
> > Our application use at most 75% of memory, if still not enough,
> > it will write out al
On 09/04/2021 02:33, Boris Burkov wrote:
The tree checker currently rejects unrecognized flags when it reads
btrfs_inode_item. Practically, this means that adding a new flag makes
the change backwards incompatible if the flag is ever set on a file.
Take up one of the 4 reserved u64 fields in
Hi Boris,
I love your patch! Yet something to improve:
[auto build test ERROR on kdave/for-next]
[also build test ERROR on next-20210408]
[cannot apply to v5.12-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--bas
On Thu, Apr 08, 2021 at 11:49:37AM -0700, Boris Burkov wrote:
> On Thu, Apr 08, 2021 at 11:41:42AM -0700, Eric Biggers wrote:
> > On Thu, Apr 08, 2021 at 11:30:12AM -0700, Boris Burkov wrote:
> > >
> > > Note that there is a bit of a kludge here: since btrfs_corrupt_block
> > > doesn't handle stre
On Thu, Apr 08, 2021 at 11:57:49AM -0700, Boris Burkov wrote:
> +get_ino() {
> + file=$1
> + ls -i $file | awk '{print $1}'
> +}
Please use 'local' when declaring variables in shell functions.
> +# corrupt the data portion of an inline extent
> +corrupt_inline() {
> + f=$SCRATCH_MNT/i
On Thu, Apr 08, 2021 at 08:04:32PM +0800, Shiyang Ruan wrote:
> Add xfs_break_two_dax_layouts() to break layout for two dax files. Then
> call compare range function only when files are both DAX or not.
>
> Signed-off-by: Shiyang Ruan
> ---
> fs/xfs/xfs_file.c| 20
> fs
Hi Boris,
I love your patch! Perhaps something to improve:
[auto build test WARNING on kdave/for-next]
[also build test WARNING on next-20210408]
[cannot apply to v5.12-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use
On Thu, Apr 08, 2021 at 11:33:53AM -0700, Boris Burkov wrote:
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f7a4ad86adee..e5282a8f566a 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1339,6 +1339,7 @@ static int btrfs_fill_super(struct super_block *sb,
> sb->s_op =
On Thu, Apr 08, 2021 at 08:04:31PM +0800, Shiyang Ruan wrote:
> In fsdax mode, WRITE and ZERO on a shared extent need CoW performed. After
> CoW, newly allocated extents need to be remapped to the file. So, add an
> iomap_end for dax write ops to do the remapping work.
>
> Signed-off-by: Shiyang R
Hi Boris,
I love your patch! Yet something to improve:
[auto build test ERROR on kdave/for-next]
[also build test ERROR on next-20210408]
[cannot apply to v5.12-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--bas
On Thu, Apr 08, 2021 at 08:04:30PM +0800, Shiyang Ruan wrote:
> With dax we cannot deal with readpage() etc. So, we create a dax
> comparison function which is similar to
> vfs_dedupe_file_range_compare().
> And introduce dax_remap_file_range_prep() for filesystem use.
>
> Signed-off-by: Goldwyn
On Thu, Apr 08, 2021 at 08:04:29PM +0800, Shiyang Ruan wrote:
> Some operations, such as comparing a range of data in two files under
> fsdax mode, require nested iomap_open()/iomap_end() on two files. Thus,
> we introduce iomap_apply2() to accept arguments from two files and
> iomap_actor2_t for
On Thu, Apr 08, 2021 at 08:04:27PM +0800, Shiyang Ruan wrote:
> We replace the existing entry with the newly allocated one in case of CoW.
> Also, we mark the entry as PAGECACHE_TAG_TOWRITE so writeback marks this
> entry as writeprotected. This helps us with snapshots so new write
> pagefaults after sna
On Thu, Apr 08, 2021 at 08:04:26PM +0800, Shiyang Ruan wrote:
> In the case where the iomap is a write operation and iomap is not equal
> to srcmap after iomap_begin, we consider it is a CoW operation.
>
> The destination extent that iomap indicates is a newly allocated extent.
> So, it is needed to cop
On Wed, Apr 07, 2021 at 09:38:22PM +0800, Shiyang Ruan wrote:
> The core logic in the two dax page fault functions is similar. So, move
> the logic into a common helper function. Also, to facilitate the
> addition of new features, such as CoW, switch-case is no longer used to
> handle different iom
generic/574 has tests for corrupting the merkle tree data stored by the
filesystem. Since btrfs uses a different scheme for storing this data,
the existing logic for corrupting it doesn't work out of the box. Adapt
it to properly corrupt btrfs merkle items.
Note that there is a bit of a kludge her
There are some btrfs specific fsverity scenarios that don't map
neatly onto the tests in generic/574 like holes, inline extents,
and preallocated extents. Cover those in a btrfs specific test.
This test relies on the btrfs implementation of fsverity in:
and it relies on btrfs-corrupt-block for co
The behavior of orphans is most interesting across mounts, interrupted
at arbitrary points during fsverity enable. To cover as many such cases
as possible, use dmlogwrites and dmsnapshot as in
log-writes/replay-individual.sh. At each log entry, we want to assert a
somewhat complicated invariant:
I
This patchset provides tests for fsverity support in btrfs.
It includes modifications for generic tests to pass with btrfs as well
as new btrfs specific tests.
--
v3: rebase onto xfstests master branch
v2: pass generic tests, add logwrites test
Boris Burkov (3):
btrfs: test btrfs specific fsve
On Thu, Apr 08, 2021 at 11:41:42AM -0700, Eric Biggers wrote:
> On Thu, Apr 08, 2021 at 11:30:12AM -0700, Boris Burkov wrote:
> >
> > Note that there is a bit of a kludge here: since btrfs_corrupt_block
> > doesn't handle streaming corruption bytes from stdin (I could change
> > that, but it feels
On Thu, Apr 08, 2021 at 11:36:30AM -0700, Eric Biggers wrote:
> On Thu, Apr 08, 2021 at 11:30:10AM -0700, Boris Burkov wrote:
> > This patchset provides tests for fsverity support in btrfs.
> >
> > It includes modifications for generic tests to pass with btrfs as well
> > as new btrfs specific tes
On Thu, Apr 08, 2021 at 11:30:12AM -0700, Boris Burkov wrote:
>
> Note that there is a bit of a kludge here: since btrfs_corrupt_block
> doesn't handle streaming corruption bytes from stdin (I could change
> that, but it feels like overkill for this purpose), I just read the
> first corruption byt
On Thu, Apr 08, 2021 at 11:30:10AM -0700, Boris Burkov wrote:
> This patchset provides tests for fsverity support in btrfs.
>
> It includes modifications for generic tests to pass with btrfs as well
> as new btrfs specific tests.
Which commit does this apply to? It doesn't apply to the latest xf
From: Chris Mason
Add support for fsverity in btrfs. To support the generic interface in
fs/verity, we add two new item types in the fs tree for inodes with
verity enabled. One stores the per-file verity descriptor and the other
stores the Merkle tree data itself.
Verity checking is done at the
The majority of reads receive a verity check after the bio is complete
as the page is marked uptodate. However, there is a class of reads which
are handled with btrfs logic in readpage, rather than by submitting a
bio. Specifically, these are inline extents, preallocated extents, and
holes. Tweak r
Reading the contents with direct IO would circumvent verity checks, so
fall back to buffered reads. For what it's worth, this is how ext4
handles it as well.
Signed-off-by: Boris Burkov
---
fs/btrfs/file.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
in
If we don't finish creating fsverity metadata for a file, or fail to
clean up already created metadata after a failure, we could leak the
verity items.
To address this issue, we use the orphan mechanism. When we start
enabling verity on a file, we also add an orphan item for that inode.
When we ar
This patchset provides support for fsverity in btrfs.
At a high level, we store the verity descriptor and Merkle tree data
in the file system btree with the file's inode as the objectid, and
direct reads/writes to those items to implement the generic fsverity
interface required by fs/verity/.
The
The tree checker currently rejects unrecognized flags when it reads
btrfs_inode_item. Practically, this means that adding a new flag makes
the change backwards incompatible if the flag is ever set on a file.
Take up one of the 4 reserved u64 fields in the btrfs_inode_item as a
new "compat_flags".
The behavior of orphans is most interesting across mounts, interrupted
at arbitrary points during fsverity enable. To cover as many such cases
as possible, use dmlogwrites and dmsnapshot as in
log-writes/replay-individual.sh. At each log entry, we want to assert a
somewhat complicated invariant:
I
There are some btrfs specific fsverity scenarios that don't map
neatly onto the tests in generic/574 like holes, inline extents,
and preallocated extents. Cover those in a btrfs specific test.
This test relies on the btrfs implementation of fsverity in:
btrfs: add compat_flags to btrfs_inode_item
generic/574 has tests for corrupting the merkle tree data stored by the
filesystem. Since btrfs uses a different scheme for storing this data,
the existing logic for corrupting it doesn't work out of the box. Adapt
it to properly corrupt btrfs merkle items.
Note that there is a bit of a kludge her
This patchset provides tests for fsverity support in btrfs.
It includes modifications for generic tests to pass with btrfs as well
as new btrfs specific tests.
Boris Burkov (3):
btrfs: test btrfs specific fsverity corruption
generic/574: corrupt btrfs merkle tree data
btrfs: test verity orp
On 4/5/21 7:32 AM, fdman...@kernel.org wrote:
From: Filipe Manana
There is a race between a task aborting a transaction during a commit,
a task doing an fsync and the transaction kthread, which leads to a
use-after-free of the log root tree. When this happens, it results in a
stack trace like
On 4/6/21 9:55 AM, Nikolay Borisov wrote:
In case the right buffer is emptied it's first set to null and
subsequently it's dereferenced to get its size to pass to root_sub_used.
This naturally leads to a null pointer dereference. The correct thing
to do is to pass the stashed right->len in "block
On 4/6/21 6:31 PM, Boris Burkov wrote:
`xfs_io -c 'fiemap ' `
can give surprising results on btrfs that differ from xfs.
btrfs spits out extents trimmed to fit the user input. If the user's
fiemap request has an offset, then rather than returning each whole
extent which intersects that range, w
On 2/24/21 8:18 PM, Qu Wenruo wrote:
Due to the pagecache limit of 32bit systems, btrfs can't access metadata
at or beyond (ULONG_MAX + 1) << PAGE_SHIFT.
This is 16T for 4K page size while 256T for 64K page size.
And unlike other fses, btrfs uses internally mapped u64 address space for
all of it
On 4/8/21 8:40 AM, Goldwyn Rodrigues wrote:
try_lock_extent() returns 1 on success or 0 for failure and not an error
code. If try_lock_extent() fails, read_extent_buffer_subpage() returns
zero indicating subpage extent read success.
Return EAGAIN/EWOULDBLOCK if try_lock_extent() fails in locking
On 4/8/21 12:10 PM, Martin Raiber wrote:
On 11.03.2021 18:58 Martin Raiber wrote:
On 01.02.2021 23:08 Martin Raiber wrote:
On 27.01.2021 22:03 Chris Murphy wrote:
On Wed, Jan 27, 2021 at 10:27 AM Martin Raiber wrote:
Hi,
seems 5.10.8 still has the ENOSPC issue when compression is used
(com
On Sat, Apr 03, 2021 at 08:25:38PM +, Luis Chamberlain wrote:
> So creating say 1000 random files in /lib/firmware on a freshly created
> btrfs partition helps reproduce:
>
> mkfs.btrfs /dev/whatever
> mount /dev/whatever /lib/firmware
> # Put it on /etc/fstab too
>
> Generate 1000 random fil
On 11.03.2021 18:58 Martin Raiber wrote:
On 01.02.2021 23:08 Martin Raiber wrote:
On 27.01.2021 22:03 Chris Murphy wrote:
On Wed, Jan 27, 2021 at 10:27 AM Martin Raiber wrote:
Hi,
seems 5.10.8 still has the ENOSPC issue when compression is used
(compress-force=zstd,space_cache=v2):
Jan 27
On Thu, Apr 08, 2021 at 03:28:20PM +0100, Filipe Manana wrote:
> On Thu, Apr 8, 2021 at 2:50 PM Dennis Zhou wrote:
> >
> > On Thu, Apr 08, 2021 at 05:20:00PM +0800, Wang Yugui wrote:
> > > Hi,
> > >
> > > > On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > > > > Hi,
> > > > >
> > > >
On 4/8/21 4:25 AM, Naohiro Aota wrote:
This commit moves the location of the superblock logging zones. The new
locations of the logging zones are now determined based on fixed block
addresses instead of on fixed zone numbers.
The old placement method based on fixed zone numbers causes problems w
On Thu, Apr 8, 2021 at 2:50 PM Dennis Zhou wrote:
>
> On Thu, Apr 08, 2021 at 05:20:00PM +0800, Wang Yugui wrote:
> > Hi,
> >
> > > On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > > > Hi,
> > > >
> > > > > > > > upper caller:
> > > > > > > > nofs_flag = memalloc_nofs_save();
>
On Thu, Apr 08, 2021 at 05:20:00PM +0800, Wang Yugui wrote:
> Hi,
>
> > On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > > Hi,
> > >
> > > > > > > upper caller:
> > > > > > > nofs_flag = memalloc_nofs_save();
> > > > > > > ret = btrfs_drew_lock_init(&root->snapshot_lock);
>
On Thu 08 Apr 2021 at 20:04, Shiyang Ruan wrote:
Add xfs_break_two_dax_layouts() to break layout for two dax files. Then
call compare range function only when files are both DAX or not.
Signed-off-by: Shiyang Ruan
Not familiar with xfs code, but reading code makes my sleep better :)
See b
On 2021/4/8 下午8:40, Goldwyn Rodrigues wrote:
try_lock_extent() returns 1 on success or 0 for failure and not an error
code. If try_lock_extent() fails, read_extent_buffer_subpage() returns
zero indicating subpage extent read success.
Return EAGAIN/EWOULDBLOCK if try_lock_extent() fails in loc
try_lock_extent() returns 1 on success or 0 for failure and not an error
code. If try_lock_extent() fails, read_extent_buffer_subpage() returns
zero indicating subpage extent read success.
Return EAGAIN/EWOULDBLOCK if try_lock_extent() fails in locking the
extent.
Signed-off-by: Goldwyn Rodrigues
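The return-value mismatch described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: try_lock_stub(), read_nowait_buggy() and read_nowait_fixed() are hypothetical stand-ins, not the actual btrfs functions, and the exact shape of the original code is not shown in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a try-lock helper: 1 on success, 0 on failure. */
static int try_lock_stub(bool lock_available)
{
	return lock_available ? 1 : 0;
}

/*
 * Assumed buggy pattern: the 0 that means "lock not taken" is passed
 * straight up, where the caller reads 0 as "read succeeded".
 */
static int read_nowait_buggy(bool lock_available)
{
	int ret = try_lock_stub(lock_available);

	if (ret <= 0)
		return ret;
	return 0;
}

/* Fixed pattern: translate the lock failure into a real error code. */
static int read_nowait_fixed(bool lock_available)
{
	int locked = try_lock_stub(lock_available);

	if (!locked)
		return -EAGAIN;
	/* The stub only ever returns 0 or 1, so the lock is held here. */
	assert(locked == 1);
	return 0;
}
```

In the buggy variant a failed lock attempt and a successful read are indistinguishable to the caller, since both yield 0. The fixed variant returns -EAGAIN for the former.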
In fsdax mode, WRITE and ZERO on a shared extent need CoW performed. After
CoW, newly allocated extents need to be remapped to the file. So, add an
iomap_end for dax write ops to do the remapping work.
Signed-off-by: Shiyang Ruan
---
fs/xfs/xfs_bmap_util.c | 3 +--
fs/xfs/xfs_file.c | 9 +
Add xfs_break_two_dax_layouts() to break layout for two dax files. Then
call compare range function only when files are both DAX or not.
Signed-off-by: Shiyang Ruan
---
fs/xfs/xfs_file.c| 20
fs/xfs/xfs_inode.c | 8 +++-
fs/xfs/xfs_inode.h | 1 +
fs/xfs/xfs_re
With dax we cannot deal with readpage() etc. So, we create a dax
comparison function which is similar to
vfs_dedupe_file_range_compare().
And introduce dax_remap_file_range_prep() for filesystem use.
Signed-off-by: Goldwyn Rodrigues
Signed-off-by: Shiyang Ruan
---
fs/dax.c | 56 ++
Some operations, such as comparing a range of data in two files under
fsdax mode, require nested iomap_open()/iomap_end() on two files. Thus,
we introduce iomap_apply2() to accept arguments from two files and
iomap_actor2_t for actions on two files.
Signed-off-by: Shiyang Ruan
---
fs/iomap/appl
Punch hole on a reflinked file needs dax_copy_edge() too. Otherwise,
data in the unaligned area will not be correct. So, add the srcmap to
dax_iomap_zero() and replace memset() with dax_copy_edge().
Signed-off-by: Shiyang Ruan
Reviewed-by: Ritesh Harjani
---
fs/dax.c | 25 ++
We replace the existing entry with the newly allocated one in case of CoW.
Also, we mark the entry as PAGECACHE_TAG_TOWRITE so writeback marks this
entry as writeprotected. This helps us with snapshots so new write
pagefaults after snapshots trigger a CoW.
Signed-off-by: Goldwyn Rodrigues
Signed-off-by
In the case where the iomap is a write operation and iomap is not equal
to srcmap after iomap_begin, we consider it is a CoW operation.
The destination extent that iomap indicates is a newly allocated extent.
So, the data needs to be copied from srcmap to the newly allocated extent.
In theory, it is better
This patchset is an attempt to add CoW support for fsdax, taking XFS,
which has both reflink and fsdax features, as an example.
Changes from V3:
- Take out the first 3 patches as a cleanup patchset[1], which was
sent yesterday.
- Fix usage of code in dax_iomap_cow_copy()
- Add comments f
On 2021/4/8 下午7:15, riteshh wrote:
Please excuse my silly queries here.
On 21/04/08 04:38PM, Qu Wenruo wrote:
On 2021/4/8 下午4:16, Joe Hermaszewski wrote:
It took a while but I managed to get hold of another one of these
arm32 boards. Very disappointingly this exact "bitflip" is still
pre
Please excuse my silly queries here.
On 21/04/08 04:38PM, Qu Wenruo wrote:
>
>
> On 2021/4/8 下午4:16, Joe Hermaszewski wrote:
> > It took a while but I managed to get hold of another one of these
> > arm32 boards. Very disappointingly this exact "bitflip" is still
> > present (log enclosed).
>
>
On 2021/4/8 下午6:11, Joe Hermaszewski wrote:
Thanks for explaining so patiently, I have a couple more questions if
you have the time:
With the patch I assume that this FS will just refuse to mount on
arm32,
Yes.
If the fs has a chunk beyond 16T, it will definitely be rejected.
For your c
Hi,
> On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > Hi,
> >
> > > > > > upper caller:
> > > > > > nofs_flag = memalloc_nofs_save();
> > > > > > ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > memalloc_nofs_restore(nofs_flag);
> > >
> > > The issue is h
On 2021/4/8 下午4:16, Joe Hermaszewski wrote:
It took a while but I managed to get hold of another one of these
arm32 boards. Very disappointingly this exact "bitflip" is still
present (log enclosed).
Yeah, we got to the conclusion it's not a bitflip, but purely a 32bit
limit on armv7.
For AR
Gentle ping?
Any update? I didn't see it merged into misc-next.
Thanks,
Qu
On 2021/2/25 上午9:18, Qu Wenruo wrote:
Due to the pagecache limit of 32bit systems, btrfs can't access metadata
at or beyond (ULONG_MAX + 1) << PAGE_SHIFT.
This is 16T for 4K page size while 256T for 64K page size.
And
This commit moves the location of the superblock logging zones. The new
locations of the logging zones are now determined based on fixed block
addresses instead of on fixed zone numbers.
The old placement method based on fixed zone numbers causes problems when
one needs to inspect a file system im
It took a while but I managed to get hold of another one of these
arm32 boards. Very disappointingly this exact "bitflip" is still
present (log enclosed).
To summarise, as it's been a while:
- When running scrub, a "page_start" and "eb_start" mismatch is
detected (off by a single bit).
- `btrfs c