On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> So why is this any better than the current code? Sure, I am not a great
> fan of GFP_ZONE_TABLE because of how incomprehensible it is, but this
> doesn't look much better, yet we are losing a check for incompatible
> gfp flags. The
On 2018-05-24 10:09, Misono Tomohiro wrote:
> On 2018/05/23 17:23, Qu Wenruo wrote:
>> James Harvey reported that some corrupted compressed extent data can
>> lead to various kinds of kernel memory corruption.
>>
>> Such corrupted extent data belongs to an inode with the NODATASUM flag, thus
>> data csum
On 2018/05/23 17:23, Qu Wenruo wrote:
> James Harvey reported that some corrupted compressed extent data can
> lead to various kinds of kernel memory corruption.
>
> Such corrupted extent data belongs to an inode with the NODATASUM flag, thus
> data csum won't help us detect such a bug.
>
> If lucky enough,
On Fri, May 18, 2018 at 01:45:49PM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2018 at 08:53:30AM -0700, Christoph Hellwig wrote:
> > On Fri, May 18, 2018 at 06:13:06AM -0700, Matthew Wilcox wrote:
> > > > Historically, the only problematic case has been direct IO, and people
> > > > have been
On Tue, May 22, 2018 at 03:02:12PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> In btrfs_clone_files(), we must check the NODATASUM flag while the
> inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
> will change the flags after we check and we can
On Wed, May 23, 2018 at 10:14:14AM -0700, Omar Sandoval wrote:
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> The clone fix and device remove fix are in that branch, too. Let me know
> if you'd prefer it as patches.
The diff like that is fine, there likely will be more conflicts with
On 23 May 2018, at 2:37, Christoph Hellwig wrote:
> On Tue, May 22, 2018 at 02:31:36PM -0400, Chris Mason wrote:
>>> And what protects two writes from interleaving their results now?
>>
>> page locks...ish, we at least won't have results interleaved in a single
>> page. For btrfs it'll actually be
From: Huaisheng Ye
Use __GFP_ZONE_MOVABLE to replace (__GFP_HIGHMEM | __GFP_MOVABLE).
___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom three bits of the GFP mask are reserved for storing the
encoded zone number.
__GFP_ZONE_MOVABLE contains
From: Huaisheng Ye
GFP_HIGHUSER_MOVABLE doesn't equal GFP_HIGHUSER | __GFP_MOVABLE;
modify it to match the patch that gets rid of GFP_ZONE_TABLE/BAD.
Signed-off-by: Huaisheng Ye
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc:
On Wed, May 23, 2018 at 12:17:14PM +0200, David Sterba wrote:
> On Tue, May 22, 2018 at 03:02:11PM -0700, Omar Sandoval wrote:
> > Based on kdave/for-next. Note that there's a Fixes: tag in there
> > referencing a commit in the for-next branch, so that would have to be
> > updated if the commit
On Wed, May 23, 2018 at 08:22:39AM -0700, Christoph Hellwig wrote:
> On Sun, May 20, 2018 at 06:45:24PM -0400, Kent Overstreet wrote:
> > >
> > > Honestly I think this probably should be in the core. But IFF we move
> > > it to the core the existing users of per-fs locks need to be moved
> > >
From: Huaisheng Ye
GFP_HIGHUSER_MOVABLE doesn't equal GFP_HIGHUSER | __GFP_MOVABLE;
modify it to match the patch that gets rid of GFP_ZONE_TABLE/BAD.
Signed-off-by: Huaisheng Ye
Cc: Kate Stewart
Cc: Greg Kroah-Hartman
From: Huaisheng Ye
Use __GFP_ZONE_MASK to replace (__GFP_DMA32 | __GFP_HIGHMEM).
In function alloc_extent_state, it is obvious that __GFP_DMA is not
the expected zone type.
___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom three bits
From: Huaisheng Ye
Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM).
In function xen_swiotlb_alloc_coherent, it is obvious that __GFP_DMA32
is not the expected zone type.
___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom
From: Huaisheng Ye
Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).
___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom three bits of the GFP mask are reserved for storing the
encoded zone number.
__GFP_DMA, __GFP_HIGHMEM
From: Huaisheng Ye
Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from the GFP bitmasks;
the bottom three bits of the GFP mask are reserved for storing the encoded
zone number.
The encoding method is XOR. Get zone
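As a rough illustration of the XOR idea mentioned above (not the patch's actual code): the sketch below assumes the zone index is XORed with the index of a default zone, so that the default zone encodes to zero in the bottom three bits. The enum values, ZONE_BITS_MASK, encode_zone() and decode_zone() are all hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Hypothetical zone indices; the real enum zone_type depends on kernel config. */
enum zone { Z_DMA = 0, Z_DMA32 = 1, Z_NORMAL = 2, Z_HIGHMEM = 3, Z_MOVABLE = 4 };

#define ZONE_BITS_MASK 0x7u	/* bottom three bits hold the encoded zone */

/* Assumption: store (zone ^ Z_NORMAL), so "no zone bits set" means Z_NORMAL. */
static inline uint32_t encode_zone(uint32_t flags, enum zone z)
{
	return (flags & ~ZONE_BITS_MASK) |
	       (((uint32_t)z ^ (uint32_t)Z_NORMAL) & ZONE_BITS_MASK);
}

/* XOR is its own inverse, so decoding applies the same operation again. */
static inline enum zone decode_zone(uint32_t flags)
{
	return (enum zone)((flags & ZONE_BITS_MASK) ^ (uint32_t)Z_NORMAL);
}
```

With this layout a mask that carries no zone bits at all decodes to the default zone, which is presumably why an XOR encoding is attractive: most callers never have to mention a zone.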
From: Huaisheng Ye
Changes since v2: [2]
* Following Christoph's suggestion, rebase the patches from v4.16 onto
current mainline.
* Following Matthew's advice, create macros like GFP_NORMAL and
GFP_NORMAL_UNMOVABLE to clear the bottom 3 and 4 bits of the GFP bitmask.
* Delete some
From: Michal Hocko [mailto:mho...@kernel.org]
Sent: Wednesday, May 23, 2018 2:37 AM
>
> On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > From: Huaisheng Ye
> >
> > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> >
> > Delete ___GFP_DMA, ___GFP_HIGHMEM and
From: Josef Bacik
This is no longer used anywhere, remove all of it.
Signed-off-by: Josef Bacik
---
fs/btrfs/ordered-data.c | 123 ---
fs/btrfs/ordered-data.h | 20 ++-
fs/btrfs/tree-log.c | 16
From: Josef Bacik
Since we are waiting on all ordered extents at the start of the fsync()
path we don't need to wait on any logged ordered extents, and we don't
need to look up the checksums on the ordered extents as they will
already be on disk prior to getting here. Rework this
From: Josef Bacik
We no longer use this list we've passed around so remove it everywhere.
Also remove the extra checks for ordered/filemap errors as this is
handled higher up now that we're waiting on ordered_extents before
getting to the tree log code.
Signed-off-by: Josef Bacik
From: Josef Bacik
There's a priority inversion that exists currently with btrfs fsync. In
some cases we will collect outstanding ordered extents onto a list and
only wait on them at the very last second. However this "very last
second" falls inside of a transaction handle, so if
On 23 May 2018, at 3:26, robbieko wrote:
Chris Mason wrote on 2018-05-23 02:31:
On 22 May 2018, at 14:08, Christoph Hellwig wrote:
On Wed, May 16, 2018 at 11:52:37AM +0800, robbieko wrote:
From: Robbie Ko
This idea is from direct io. By this patch, we can make the
On Tue, May 22, 2018 at 6:47 PM, Josef Bacik wrote:
> From: Josef Bacik
>
> There's a priority inversion that exists currently with btrfs fsync. In
> some cases we will collect outstanding ordered extents onto a list and
> only wait on them at the very last
On 23.05.2018 18:38, Josef Bacik wrote:
> It's just removing all of the code that is no longer needed with the
> unconditional wait_ordered_extents, it's not that complicated.
Just because something is painfully obvious to you doesn't mean it's the
same for others. Especially given the current
It's just removing all of the code that is no longer needed with the
unconditional wait_ordered_extents, it's not that complicated.
Thanks,
Josef
On Wed, May 23, 2018 at 8:24 AM, David Sterba wrote:
> On Tue, May 22, 2018 at 01:47:23PM -0400, Josef Bacik wrote:
>> From: Josef
On Sun, May 20, 2018 at 06:45:24PM -0400, Kent Overstreet wrote:
> >
> > Honestly I think this probably should be in the core. But IFF we move
> > it to the core the existing users of per-fs locks need to be moved
> > over first. E.g. XFS as the very first one, and at least ext4 and f2fs
> >
On Wed, May 23, 2018 at 02:34:40PM +0200, David Sterba wrote:
> On Wed, May 23, 2018 at 10:16:55AM +0800, Su Yue wrote:
> > >>> [ 47.692084] kernel BUG at fs/btrfs/locking.c:286!
> > >>
> > >> I saw the crash too but did not investigate the root cause. So I'll
> > >> remove the branch from
On Wed, May 23, 2018 at 10:16:55AM +0800, Su Yue wrote:
> >>> [ 47.692084] kernel BUG at fs/btrfs/locking.c:286!
> >>
> >> I saw the crash too but did not investigate the root cause. So I'll
> >> remove the branch from for-next until it's fixed. Thanks for the report.
> >
> > I think the
On Tue, May 22, 2018 at 01:47:23PM -0400, Josef Bacik wrote:
> From: Josef Bacik
>
> There's a priority inversion that exists currently with btrfs fsync. In
> some cases we will collect outstanding ordered extents onto a list and
> only wait on them at the very last second.
On 23.05.2018 11:03, ein wrote:
> On 05/23/2018 08:32 AM, Nikolay Borisov wrote:
>
> Nikolay, thank you for the answer.
>
>>> [...]
>>> root@node0:~# dmesg | grep BTRFS | grep warn
>>> 185:980:[2927472.393557] BTRFS warning (device dm-0): csum failed root
>>> -9 ino 312 off 608284672 csum
On 2018-05-23 06:09, ein wrote:
On 05/23/2018 11:09 AM, Duncan wrote:
ein posted on Wed, 23 May 2018 10:03:52 +0200 as excerpted:
IMHO the best course of action would be to disable checksumming for your
vm files.
Do you mean '-o nodatasum' mount flag? Is it possible to disable
checksumming
On Mon, Apr 30, 2018 at 5:04 PM, Vijay Chidambaram wrote:
> Hi,
>
> We found two more cases where the btrfs behavior is a little strange.
> In one case, an fsync-ed file goes missing after a crash. In the
> other, a renamed file shows up in both directories after a crash.
>
>
On Tue, May 22, 2018 at 03:02:11PM -0700, Omar Sandoval wrote:
> Based on kdave/for-next. Note that there's a Fixes: tag in there
> referencing a commit in the for-next branch, so that would have to be
> updated if the commit gets rebased. These patches are also available at
>
On 05/23/2018 11:09 AM, Duncan wrote:
> ein posted on Wed, 23 May 2018 10:03:52 +0200 as excerpted:
>
>>> IMHO the best course of action would be to disable checksumming for your
>>> vm files.
>>
>> Do you mean the '-o nodatasum' mount flag? Is it possible to disable
>> checksumming for a single file by
David Sterba wrote on 2018-05-23 00:28:
On Fri, Apr 27, 2018 at 03:05:24PM +0800, Ethan Lien wrote:
We should balance dirty metadata pages at the end of
btrfs_finish_ordered_io, since a small, unmergeable random write can
potentially produce dirty metadata which is multiple times larger than
the
ein posted on Wed, 23 May 2018 10:03:52 +0200 as excerpted:
>> IMHO the best course of action would be to disable checksumming for your
>> vm files.
>>
>>
> Do you mean the '-o nodatasum' mount flag? Is it possible to disable
> checksumming for a single file by setting some magical chattr? Google
>
For an inlined extent, we only have one segment, thus fewer things to check.
Furthermore, an inlined extent always has a csum in its leaf header, so
it's less likely to have corrupted data.
Anyway, still check the header and segment header.
Signed-off-by: Qu Wenruo
---
fs/btrfs/lzo.c | 11
This patchset can be fetched from github:
https://github.com/adam900710/linux/tree/lzo_corruption
which is based on v4.17-rc5.
James Harvey reported pretty strange kernel misbehavior where, after
reading certain btrfs compressed data, the kernel crashes with an unrelated
calltrace.
On 23.05.2018 11:06, Su Yue wrote:
> Commit 5a5003df98d5 ("btrfs: delayed-ref: double free in
> btrfs_add_delayed_tree_ref()") fixed a double free problem by creating
> an unnecessary label to jump to.
> The elegant way is just to change "ref" to "head_ref" and keep
> btrfs_add_delayed_tree_ref() and
compression.h uses SZ_* macros, so any user that includes only
compression.h without linux/sizes.h will hit a compile error.
One example is lzo.c: if it uses BTRFS_MAX_COMPRESSED, the build
fails.
Fix it by including linux/sizes.h in compression.h.
Signed-off-by: Qu
James Harvey reported that some corrupted compressed extent data can
lead to various kinds of kernel memory corruption.
Such corrupted extent data belongs to an inode with the NODATASUM flag, thus
data csum won't help us detect such a bug.
If lucky enough, kasan could catch it like:
Although it's not that complex, such a comment could still save
newer readers/reviewers several minutes.
Signed-off-by: Qu Wenruo
---
fs/btrfs/lzo.c | 35 +++
1 file changed, 35 insertions(+)
diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index
On 05/23/2018 08:32 AM, Nikolay Borisov wrote:
Nikolay, thank you for the answer.
>> [...]
>> root@node0:~# dmesg | grep BTRFS | grep warn
>> 185:980:[2927472.393557] BTRFS warning (device dm-0): csum failed root
>> -9 ino 312 off 608284672 csum 0x7d03a376 expected csum 0x3163a9b7 mirror 1
>>
Commit 5a5003df98d5 ("btrfs: delayed-ref: double free in
btrfs_add_delayed_tree_ref()") fixed a double free problem by creating
an unnecessary label to jump to.
The elegant way is just to change "ref" to "head_ref" and keep
btrfs_add_delayed_tree_ref() and btrfs_add_delayed_data_ref() in
similar
On 23.05.2018 09:37, Christoph Hellwig wrote:
> On Tue, May 22, 2018 at 02:31:36PM -0400, Chris Mason wrote:
>>> And what protects two writes from interleaving their results now?
>>
>> page locks...ish, we at least won't have results interleaved in a single
>> page. For btrfs it'll actually be
On 2018-05-22 22:00, David Sterba wrote:
> On Mon, May 21, 2018 at 01:19:25PM +0800, Qu Wenruo wrote:
>> Although it's not that complex, such a comment could still save
>> newer readers/reviewers several minutes.
>>
>> Signed-off-by: Qu Wenruo
>> ---
>> fs/btrfs/lzo.c | 23
On 23.05.2018 10:29, Qu Wenruo wrote:
> When doing qgroup rescan using the following script (modified from
> btrfs/017 test case), we can sometimes hit qgroup corruption.
>
> --
> umount $dev &> /dev/null
> umount $mnt &> /dev/null
>
> mkfs.btrfs -f -n 64k $dev
> mount $dev $mnt
>
>
When doing qgroup rescan using the following script (modified from
btrfs/017 test case), we can sometimes hit qgroup corruption.
--
umount $dev &> /dev/null
umount $mnt &> /dev/null
mkfs.btrfs -f -n 64k $dev
mount $dev $mnt
extent_size=8192
xfs_io -f -d -c "pwrite 0 $extent_size" $mnt/foo
Chris Mason wrote on 2018-05-23 02:31:
On 22 May 2018, at 14:08, Christoph Hellwig wrote:
On Wed, May 16, 2018 at 11:52:37AM +0800, robbieko wrote:
From: Robbie Ko
This idea is from direct io. By this patch, we can make the buffered
write parallel, and improve the
Omar Sandoval wrote on 2018-05-23 01:28:
On Wed, May 16, 2018 at 11:52:37AM +0800, robbieko wrote:
From: Robbie Ko
This idea is from direct io. By this patch, we can make the buffered
write parallel, and improve the performance and latency. But because we
can not update
On Tue, May 22, 2018 at 02:31:36PM -0400, Chris Mason wrote:
> > And what protects two writes from interleaving their results now?
>
> page locks...ish, we at least won't have results interleaved in a single
> page. For btrfs it'll actually be multiple pages since we try to do more
> than one at
Add a kernel log message when the balance ends, whether it was cancelled,
completed, or paused.
Signed-off-by: Anand Jain
---
v3->v4: nothing.
v2->v3: nothing.
v1->v2: Moved from 2/3 to 3/3
fs/btrfs/volumes.c | 7 +++
1 file changed, 7 insertions(+)
diff --git
Improve on describe_relocation(): add a common helper function to describe
the block groups.
Signed-off-by: Anand Jain
---
v3->v4: Just pass the full flag name in the define DESCRIBE_FLAG(flag,..),
so that it can be used in a couple more places.
Rename
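To make the helper idea above more concrete, here is a minimal userspace sketch of a DESCRIBE_FLAG-style macro that takes the full flag name and appends a label to a buffer. It is only an illustration under assumptions: the flag bits (BG_DATA, BG_SYSTEM, BG_METADATA), the function name describe_block_group_flags() and the output format are hypothetical and are not taken from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits, for illustration only. */
#define BG_DATA     (1u << 0)
#define BG_SYSTEM   (1u << 1)
#define BG_METADATA (1u << 2)

/* If the flag is set, append "desc|" and clear the flag from the remainder. */
#define DESCRIBE_FLAG(flag, desc)						\
	do {									\
		if (rest & (flag)) {						\
			if (len < size) {					\
				int n = snprintf(buf + len, size - len,		\
						 "%s|", (desc));		\
				if (n > 0)					\
					len += (size_t)n;			\
			}							\
			rest &= ~(flag);					\
		}								\
	} while (0)

static void describe_block_group_flags(unsigned int flags, char *buf, size_t size)
{
	unsigned int rest = flags;
	size_t len = 0;

	if (size == 0)
		return;
	buf[0] = '\0';

	DESCRIBE_FLAG(BG_DATA, "data");
	DESCRIBE_FLAG(BG_SYSTEM, "system");
	DESCRIBE_FLAG(BG_METADATA, "metadata");

	if (rest && len < size)
		snprintf(buf + len, size - len, "0x%x", rest);	/* unknown bits */
	else if (len > 0 && len < size)
		buf[len - 1] = '\0';	/* drop the trailing '|' */
}
```

Passing the full flag name to the macro, rather than pasting a prefix onto a short name, is what lets the same macro be reused with flags from different namespaces.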
Balance argument info is important information to review during a
system audit, so this patch adds it to the kernel log.
Example:
->btrfs bal start --full-balance -f -mprofiles=raid1,convert=single,soft
-dlimit=10..20,usage=50 /btrfs
kernel: BTRFS info (device sdc): balance: start -f -d
On 22.05.2018 23:05, ein wrote:
> Hello devs,
>
> I tested BTRFS in production for about a month:
>
> 21:08:17 up 34 days, 2:21, 3 users, load average: 0.06, 0.02, 0.00
>
> Without power blackout, hardware failure, SSD's SMART is flawless etc.
> The tests ended with:
>
> root@node0:~#
v3->v4:
Please refer to the individual patches.
Based on misc-next.
v2->v3:
Inspired by describe_relocation(), this improves it, makes it a helper
function, and uses it to log the balance operations.
Kernel logs are very important for forensic investigation of
issues; these patches make balance
On 05/22/2018 08:26 PM, David Sterba wrote:
On Mon, May 21, 2018 at 02:37:43PM +0800, Anand Jain wrote:
Improve on describe_relocation(): add a common helper function to describe
the block groups.
Signed-off-by: Anand Jain
---
v3: Born.
fs/btrfs/relocation.c | 30
On 23.05.2018 06:37, Gu Jinxiang wrote:
> set_extent_bits may fail, return the result in add_excluded_extent.
>
> Signed-off-by: Gu Jinxiang
Reviewed-by: Nikolay Borisov
> Changelog:
> v2-v1:
> 1.remove goto to make the function run linearly.
>
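The shape of the change described above (propagate the failure and return early instead of jumping to a label) can be sketched as follows. This is an assumption-laden illustration, not the patch: set_bits_stub() is a hypothetical stand-in for set_extent_bits(), and the number of calls and their parameters are invented for the example.

```c
#include <errno.h>

/* Hypothetical stand-in for set_extent_bits(): fails on request. */
static int set_bits_stub(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Each step runs in order; the first failure is returned to the caller as-is. */
static int add_excluded_extent_sketch(int fail_first, int fail_second)
{
	int ret;

	ret = set_bits_stub(fail_first);
	if (ret)
		return ret;

	return set_bits_stub(fail_second);
}
```

With no cleanup needed between the steps, early returns keep the function linear and make it obvious which error the caller sees.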
On 05/22/2018 08:35 PM, David Sterba wrote:
On Mon, May 21, 2018 at 02:37:44PM +0800, Anand Jain wrote:
Balance argument info is important information to review during a
system audit, so this patch adds it to the kernel log.
Example:
-> btrfs bal start
On 23.05.2018 01:02, Omar Sandoval wrote:
> From: Omar Sandoval
>
> In btrfs_extent_same(), we must check the NODATASUM flag while the
> inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
> will change the flags after we check. This was correct until a
On 23.05.2018 01:02, Omar Sandoval wrote:
> From: Omar Sandoval
>
> In btrfs_clone_files(), we must check the NODATASUM flag while the
> inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
> will change the flags after we check and we can end up with a party
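The race described in the two patches above is a check-then-act problem: a flag that is tested before the locks are taken can change before the operation runs. A minimal userspace sketch of doing the check under the locks follows. Everything in it is hypothetical (struct fake_inode, FLAG_NODATASUM, clone_checked(), the pthread mutexes standing in for inode locks), and the specific test shown, that both inodes must agree on the flag, is an assumption about what the check does.

```c
#include <pthread.h>
#include <stdint.h>

#define FLAG_NODATASUM 0x1u	/* hypothetical stand-in for the inode flag */

struct fake_inode {
	pthread_mutex_t lock;
	unsigned int flags;	/* only modified with lock held */
};

/*
 * Returns 0 if the clone may proceed, -1 if the two inodes disagree on
 * the flag. The comparison happens only after both locks are held, so a
 * concurrent flag change (which must also take the lock) cannot slip in
 * between the check and the operation. src != dst is assumed.
 */
static int clone_checked(struct fake_inode *src, struct fake_inode *dst)
{
	/* Lock in address order so two concurrent callers cannot deadlock. */
	struct fake_inode *first = (uintptr_t)src < (uintptr_t)dst ? src : dst;
	struct fake_inode *second = first == src ? dst : src;
	int ret = 0;

	pthread_mutex_lock(&first->lock);
	pthread_mutex_lock(&second->lock);

	if ((src->flags & FLAG_NODATASUM) != (dst->flags & FLAG_NODATASUM))
		ret = -1;
	/* else: the clone itself would run here, still under both locks */

	pthread_mutex_unlock(&second->lock);
	pthread_mutex_unlock(&first->lock);
	return ret;
}
```

Had the flag comparison been placed before the two pthread_mutex_lock() calls, a concurrent setter could change the flags between the check and the clone, which is the window the patches close.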