On Tue, Mar 26, 2019 at 10:42:31AM +0200, Nikolay Borisov wrote:
> 
> 
> On 26.03.19 at 6:30, Zygo Blaxell wrote:
> > On Mon, Mar 25, 2019 at 10:50:28PM -0400, Zygo Blaxell wrote:
> >> Running balance, rsync, and dedupe, I get kernel warnings every few
> >> minutes on 5.0.4.  No warnings on 5.0.3 under similar conditions.
> >>
> >> Mount options are:  flushoncommit,space_cache=v2,compress=zstd.
> >>
> >> There are two different stacks on the warnings.  This one comes from
> >> btrfs balance:
> > 
> > [snip]
> > 
> > Possibly unrelated, but I'm also repeatably getting this in 5.0.4 and
> > not 5.0.3, after about 5 hours of uptime.  Different processes, same
> > kernel stack:
> > 
> >     [Mon Mar 25 23:35:17 2019] kworker/u8:4: page allocation failure: order:0, mode:0x404000(GFP_NOWAIT|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
> >     [Mon Mar 25 23:35:17 2019] CPU: 2 PID: 29518 Comm: kworker/u8:4 Tainted: G        W         5.0.4-zb64-303ce93b05c9+ #1
> 
> What commits does this kernel include? It doesn't seem to be a
> pristine upstream 5.0.4. Also, what you are seeing below is definitely
> a bug in MM. The question is whether it's due to faulty backports in
> your kernel or to something that got automatically backported to
> 5.0.4.

That was the first thing I thought of, so I reverted to vanilla 5.0.4,
repeated the test, and obtained the same result.

You may have a point about non-btrfs patches in 5.0.4, though.
I previously tested 5.0.3 with most of the 5.0.4 fs/btrfs commits
already included by cherry-pick:

        1098803b8cb7 Btrfs: fix deadlock between clone/dedupe and rename
        3486142a68e3 Btrfs: fix corruption reading shared and compressed extents after hole punching
        fb9c36acfab1 btrfs: scrub: fix circular locking dependency warning
        9d7b327affb8 Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl
        80dcd07c27df Btrfs: setup a nofs context for memory allocation at btrfs_create_tree()

The commits that are in 5.0.4 but not in my last 5.0.3 test run are:

        ebbb48419e8a btrfs: init csum_list before possible free
        88e610ae4c3a btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
        9c58f2ada4fa btrfs: drop the lock on error in btrfs_dev_replace_cancel

and I don't see how those commits could lead to the observed changes
in behavior.  I didn't include them for 5.0.3 because my test scenario
doesn't execute the code they touch.  So the problem might be outside
of btrfs completely.
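
The comparison above amounts to a set difference, which can be sketched
in Python. This is only an illustration under two assumptions: commits
are matched by subject line (cherry-picks may carry different hashes
than the stable-tree commits), and the two lists in this mail together
cover every fs/btrfs commit in 5.0.4. The names `btrfs_in_504`,
`cherry_picked_on_503`, and `untested` are made up for the example.

```python
# Subject lines copied from the lists in this mail.
# Assumption: matching by subject, not hash, since cherry-picks are rehashed.
cherry_picked_on_503 = {
    "Btrfs: fix deadlock between clone/dedupe and rename",
    "Btrfs: fix corruption reading shared and compressed extents after hole punching",
    "btrfs: scrub: fix circular locking dependency warning",
    "Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl",
    "Btrfs: setup a nofs context for memory allocation at btrfs_create_tree()",
}

# Assumption: these eight subjects are all of the fs/btrfs commits in 5.0.4.
btrfs_in_504 = {
    "Btrfs: fix deadlock between clone/dedupe and rename",
    "Btrfs: fix corruption reading shared and compressed extents after hole punching",
    "btrfs: scrub: fix circular locking dependency warning",
    "Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl",
    "Btrfs: setup a nofs context for memory allocation at btrfs_create_tree()",
    "btrfs: init csum_list before possible free",
    "btrfs: ensure that a DUP or RAID1 block group has exactly two stripes",
    "btrfs: drop the lock on error in btrfs_dev_replace_cancel",
}


def untested(full, tested):
    """Return the subjects in `full` that are absent from `tested`, sorted."""
    return sorted(full - tested)


missing = untested(btrfs_in_504, cherry_picked_on_503)
for subject in missing:
    print(subject)
```

Only three btrfs commits remain after the subtraction, so everything
else that changed between the two test runs lies outside fs/btrfs.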
