On Thu, Apr 08, 2021 at 05:20:00PM +0800, Wang Yugui wrote:
> Hi,
> 
> > On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > > Hi,
> > > 
> > > > > > > upper caller:
> > > > > > >     nofs_flag = memalloc_nofs_save();
> > > > > > >     ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > >     memalloc_nofs_restore(nofs_flag);
> > > > 
> > > > The issue is here. nofs is set which means percpu attempts an atomic
> > > > allocation. If it cannot find anything already allocated it isn't happy.
> > > > This was done before memalloc_nofs_{save/restore}() were pervasive.
> > > > 
> > > > Percpu should probably try to allocate some pages if possible even if
> > > > nofs is set.
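
To make the point above concrete, here is a rough userspace model of how I
read the check in pcpu_alloc(): inside a memalloc_nofs_save() scope the FS
bit is masked off, the mask no longer equals full GFP_KERNEL, and the
allocation is treated as atomic. This is only a sketch. The MODEL_* bit
values and the model_* function names are made up for illustration and are
not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_RECLAIM 0x4u
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical stand-in for the nofs scope: the FS bit is cleared. */
unsigned int model_gfp_context(unsigned int gfp, bool nofs_scope)
{
    return nofs_scope ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/* Models the percpu decision: anything short of full GFP_KERNEL is atomic. */
bool model_percpu_is_atomic(unsigned int gfp, bool nofs_scope)
{
    gfp = model_gfp_context(gfp, nofs_scope);
    return (gfp & MODEL_GFP_KERNEL) != MODEL_GFP_KERNEL;
}
```

In this model, model_percpu_is_atomic(MODEL_GFP_KERNEL, true) is true, which
corresponds to the btrfs_drew_lock_init() call in the nofs scope quoted above,
while the same mask outside the scope is not atomic.
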
> > > 
> > > Thanks.
> > > 
> > > I will wait for the patch, and then test it.
> > > 
> > 
> > I'm currently a bit busy with some other things. I don't think adding
> > support will be much work, just a little bit tricky.
> > 
> > I recommend carrying what you have minus the change to reserved percpu
> > memory for now. If I'm the one to write it, I'll cc you.
> > 
> > Thanks,
> > Dennis
> 
> 
> In recent tests, another problem is also triggered with my extended
> percpu buffer size patch. Maybe this info is helpful.
> 
> problem:
> The OS/VGA console freezes, and no call stack is output.
> Only some info is logged to IPMI/Dell iDRAC:
>    2 | 04/03/2021 | 11:35:01 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    3 | Linux kernel panic: Fatal excep
>    4 | Linux kernel panic: tion
>    5 | 04/05/2021 | 19:09:14 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    6 | Linux kernel panic: Fatal excep
>    7 | Linux kernel panic: tion
>    8 | 04/06/2021 | 13:08:42 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    9 | Linux kernel panic: Fatal excep
>    a | Linux kernel panic: tion
>    b | 04/08/2021 | 02:12:46 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    c | Linux kernel panic: Fatal excep
>    d | Linux kernel panic: tion

Unfortunately, none of the above is useful to me.

> kernel: at least 5.10.26/5.10.27/5.10.28
> 
> This problem is triggered by our application, NOT xfstests.
> But our application has a heavy write load, just like xfstest/generic/476.
> Our application uses at most 75% of memory; if that is still not enough,
> it writes out all buffer info to the filesystem.

Do you use cgroups at all? If yes, can you describe the workload pattern
a bit?

> This problem happens on Linux kernel 5.10.x, but not on Linux
> kernel 5.4.x. It also reproduces with high frequency.

Ah. Can you try the following patch?
https://lore.kernel.org/lkml/20210408035736.883861-4-g...@fb.com/

Thanks,
Dennis
