Thanks for the patch - I applied it against 5.9-rc2, and it seems to help.
The test I am using is to copy the entire rootfs tree to a zstd-compressed
f2fs partition.  Previously, even a vm.min_free_kbytes of 32768 wasn't enough
to avoid the allocation failures during the copy; with this patch I can
complete the entire copy without an error at vm.min_free_kbytes=32768.
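
For a sense of scale, here is a rough sketch of the window-size arithmetic
the patch changes.  It assumes a 4 KiB page size (SKETCH_PAGE_SIZE is my own
placeholder; the kernel takes PAGE_SIZE from the architecture), and
window_size() is only an illustrative stand-in for the patched
MAX_COMPRESS_WINDOW_SIZE(log_size) macro, not kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE       4096u

/* Limits as defined in fs/f2fs/f2fs.h in the quoted patch context. */
#define MIN_COMPRESS_LOG_SIZE  2
#define MAX_COMPRESS_LOG_SIZE  8

/*
 * Hypothetical userspace stand-in for MAX_COMPRESS_WINDOW_SIZE(log_size):
 * the zstd window in bytes for a given log cluster size.
 */
size_t window_size(unsigned int log_size)
{
    assert(log_size >= MIN_COMPRESS_LOG_SIZE);
    assert(log_size <= MAX_COMPRESS_LOG_SIZE);
    return (size_t)SKETCH_PAGE_SIZE << log_size;
}
```

With the default log size of 2 this gives window_size(2) = 16384 bytes,
versus window_size(8) = 1048576 bytes for the old fixed maximum, i.e. a 64x
smaller window.  Note this is only the value fed into the zstd workspace
bound calculation, not the workspace size itself.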

However, at vm.min_free_kbytes=16384 (for example) it still runs out of
memory and logs many allocation failures.  Isn't it still rather excessive
to require so much reserved memory?
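
To put those values in proportion, a small sketch of how much of the board's
RAM each setting corresponds to, taking the present:524288kB figure from the
Mem-Info output as the total.  reserve_share() is a hypothetical helper of my
own, and treating vm.min_free_kbytes as a simple share of RAM is a
simplification of how the kernel derives its watermarks:

```c
#include <assert.h>

/*
 * Share of total RAM that a vm.min_free_kbytes value represents, in
 * hundredths of a percent (truncated), e.g. 625 means 6.25%.
 * Hypothetical helper for illustration only.
 */
unsigned long long reserve_share(unsigned long long min_free_kb,
                                 unsigned long long total_kb)
{
    assert(total_kb > 0);
    return min_free_kb * 10000ULL / total_kb;
}
```

On this 512MB board, reserve_share(32768, 524288) is 625 (6.25% of RAM) and
reserve_share(16384, 524288) is 312 (about 3.1%).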

Example traps at the system default vm.min_free_kbytes of ~2800 (following 
board boot):

[  141.863780] kworker/u8:4: page allocation failure: order:6, 
mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[  141.863810] CPU: 3 PID: 1444 Comm: kworker/u8:4 Tainted: G         C        
5.9.0-rc2-sunxi #trunk
[  141.863812] Hardware name: Allwinner sun8i Family
[  141.863833] Workqueue: writeback wb_workfn (flush-179:0)
[  141.863859] [<c010d415>] (unwind_backtrace) from [<c01097a5>] 
(show_stack+0x11/0x14)
[  141.863872] [<c01097a5>] (show_stack) from [<c0573da1>] 
(dump_stack+0x75/0x84)
[  141.863888] [<c0573da1>] (dump_stack) from [<c0246163>] 
(warn_alloc+0xa3/0x104)
[  141.863899] [<c0246163>] (warn_alloc) from [<c0246d71>] 
(__alloc_pages_nodemask+0xbad/0xc58)
[  141.863911] [<c0246d71>] (__alloc_pages_nodemask) from [<c022a09f>] 
(kmalloc_order+0x23/0x50)
[  141.863920] [<c022a09f>] (kmalloc_order) from [<c022a0e5>] 
(kmalloc_order_trace+0x19/0x90)
[  141.863933] [<c022a0e5>] (kmalloc_order_trace) from [<c0481519>] 
(zstd_init_compress_ctx+0x51/0xfc)
[  141.863946] [<c0481519>] (zstd_init_compress_ctx) from [<c048304b>] 
(f2fs_write_multi_pages+0x27b/0x6a0)
[  141.863961] [<c048304b>] (f2fs_write_multi_pages) from [<c04699e3>] 
(f2fs_write_cache_pages+0x3bf/0x538)
[  141.863971] [<c04699e3>] (f2fs_write_cache_pages) from [<c0469d8f>] 
(f2fs_write_data_pages+0x233/0x264)
[  141.863985] [<c0469d8f>] (f2fs_write_data_pages) from [<c02139b9>] 
(do_writepages+0x35/0x98)
[  141.863995] [<c02139b9>] (do_writepages) from [<c02947ef>] 
(__writeback_single_inode+0x2f/0x358)
[  141.864004] [<c02947ef>] (__writeback_single_inode) from [<c0294c9d>] 
(writeback_sb_inodes+0x185/0x378)
[  141.864012] [<c0294c9d>] (writeback_sb_inodes) from [<c0294ec1>] 
(__writeback_inodes_wb+0x31/0x88)
[  141.864019] [<c0294ec1>] (__writeback_inodes_wb) from [<c029510b>] 
(wb_writeback+0x1f3/0x264)
[  141.864026] [<c029510b>] (wb_writeback) from [<c0296053>] 
(wb_workfn+0x2a3/0x3a4)
[  141.864035] [<c0296053>] (wb_workfn) from [<c0130313>] 
(process_one_work+0x15f/0x3b0)
[  141.864043] [<c0130313>] (process_one_work) from [<c013065f>] 
(worker_thread+0xfb/0x3e0)
[  141.864053] [<c013065f>] (worker_thread) from [<c0135407>] 
(kthread+0xeb/0x10c)
[  141.864063] [<c0135407>] (kthread) from [<c0100159>] 
(ret_from_fork+0x11/0x38)
[  141.864067] Exception stack(0xcf153fb0 to 0xcf153ff8)
[  141.864073] 3fa0:                                     00000000 00000000 
00000000 00000000
[  141.864079] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000
[  141.864084] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  141.864089] Mem-Info:
[  141.864103] active_anon:105 inactive_anon:9374 isolated_anon:0
                active_file:12581 inactive_file:77234 isolated_file:32
                unevictable:4 dirty:11187 writeback:174
                slab_reclaimable:3566 slab_unreclaimable:6038
                mapped:5698 shmem:414 pagetables:348 bounce:0
                free:10114 free_pcp:223 free_cma:8329
[  141.864114] Node 0 active_anon:420kB inactive_anon:37496kB 
active_file:50324kB inactive_file:308936kB unevictable:16kB isolated(anon):0kB 
isolated(file):128kB mapped:22792kB dirty:44748kB writeback:696kB shmem:1656kB 
writeback_tmp:0kB kernel_stack:1216kB all_unreclaimable? no
[  141.864127] Normal free:40456kB min:6904kB low:7604kB high:8304kB 
reserved_highatomic:0KB active_anon:420kB inactive_anon:37496kB 
active_file:50248kB inactive_file:308768kB unevictable:16kB 
writepending:45608kB present:524288kB managed:503884kB mlocked:16kB 
pagetables:1392kB bounce:0kB free_pcp:892kB local_pcp:176kB free_cma:33316kB
[  141.864129] lowmem_reserve[]: 0 0 0
[  141.864135] Normal: 88*4kB (UMEC) 107*8kB (UMEC) 51*16kB (UMEC) 29*32kB 
(UMEC) 13*64kB (UMEC) 2*128kB (UE) 3*256kB (UC) 2*512kB (U) 2*1024kB (U) 
0*2048kB 8*4096kB (C) = 40648kB
[  141.864162] 90296 total pagecache pages
[  141.864168] 0 pages in swap cache
[  141.864171] Swap cache stats: add 0, delete 0, find 0/0
[  141.864173] Free swap  = 251940kB
[  141.864175] Total swap = 251940kB
[  141.864177] 131072 pages RAM
[  141.864179] 0 pages HighMem/MovableOnly
[  141.864181] 5101 pages reserved
[  141.864184] 32768 pages cma reserved
[  155.171118] warn_alloc: 23 callbacks suppressed
[  155.171143] kworker/u8:4: page allocation failure: order:6, 
mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[  155.171168] CPU: 1 PID: 1444 Comm: kworker/u8:4 Tainted: G         C        
5.9.0-rc2-sunxi #trunk
[  155.171172] Hardware name: Allwinner sun8i Family
[  155.171195] Workqueue: writeback wb_workfn (flush-179:0)
[  155.171229] [<c010d415>] (unwind_backtrace) from [<c01097a5>] 
(show_stack+0x11/0x14)
[  155.171243] [<c01097a5>] (show_stack) from [<c0573da1>] 
(dump_stack+0x75/0x84)
[  155.171266] [<c0573da1>] (dump_stack) from [<c0246163>] 
(warn_alloc+0xa3/0x104)
[  155.171281] [<c0246163>] (warn_alloc) from [<c0246d71>] 
(__alloc_pages_nodemask+0xbad/0xc58)
[  155.171294] [<c0246d71>] (__alloc_pages_nodemask) from [<c022a09f>] 
(kmalloc_order+0x23/0x50)
[  155.171304] [<c022a09f>] (kmalloc_order) from [<c022a0e5>] 
(kmalloc_order_trace+0x19/0x90)
[  155.171320] [<c022a0e5>] (kmalloc_order_trace) from [<c0481519>] 
(zstd_init_compress_ctx+0x51/0xfc)
[  155.171334] [<c0481519>] (zstd_init_compress_ctx) from [<c048304b>] 
(f2fs_write_multi_pages+0x27b/0x6a0)
[  155.171349] [<c048304b>] (f2fs_write_multi_pages) from [<c04699e3>] 
(f2fs_write_cache_pages+0x3bf/0x538)
[  155.171359] [<c04699e3>] (f2fs_write_cache_pages) from [<c0469d8f>] 
(f2fs_write_data_pages+0x233/0x264)
[  155.171374] [<c0469d8f>] (f2fs_write_data_pages) from [<c02139b9>] 
(do_writepages+0x35/0x98)
[  155.171385] [<c02139b9>] (do_writepages) from [<c02947ef>] 
(__writeback_single_inode+0x2f/0x358)
[  155.171394] [<c02947ef>] (__writeback_single_inode) from [<c0294c9d>] 
(writeback_sb_inodes+0x185/0x378)
[  155.171402] [<c0294c9d>] (writeback_sb_inodes) from [<c0294ec1>] 
(__writeback_inodes_wb+0x31/0x88)
[  155.171409] [<c0294ec1>] (__writeback_inodes_wb) from [<c029510b>] 
(wb_writeback+0x1f3/0x264)
[  155.171417] [<c029510b>] (wb_writeback) from [<c0295ffd>] 
(wb_workfn+0x24d/0x3a4)
[  155.171428] [<c0295ffd>] (wb_workfn) from [<c0130313>] 
(process_one_work+0x15f/0x3b0)
[  155.171437] [<c0130313>] (process_one_work) from [<c013065f>] 
(worker_thread+0xfb/0x3e0)
[  155.171447] [<c013065f>] (worker_thread) from [<c0135407>] 
(kthread+0xeb/0x10c)
[  155.171457] [<c0135407>] (kthread) from [<c0100159>] 
(ret_from_fork+0x11/0x38)
[  155.171462] Exception stack(0xcf153fb0 to 0xcf153ff8)
[  155.171468] 3fa0:                                     00000000 00000000 
00000000 00000000
[  155.171474] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000
[  155.171480] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  155.171488] Mem-Info:
[  155.171504] active_anon:105 inactive_anon:9403 isolated_anon:0
                active_file:17189 inactive_file:52888 isolated_file:0
                unevictable:4 dirty:11785 writeback:50
                slab_reclaimable:4217 slab_unreclaimable:6052
                mapped:5706 shmem:414 pagetables:349 bounce:0
                free:29132 free_pcp:340 free_cma:27347
[  155.171516] Node 0 active_anon:420kB inactive_anon:37612kB 
active_file:68756kB inactive_file:211552kB unevictable:16kB isolated(anon):0kB 
isolated(file):0kB mapped:22824kB dirty:47140kB writeback:200kB shmem:1656kB 
writeback_tmp:0kB kernel_stack:1216kB all_unreclaimable? no
[  155.171531] Normal free:116528kB min:6904kB low:7604kB high:8304kB 
reserved_highatomic:0KB active_anon:420kB inactive_anon:37612kB 
active_file:68680kB inactive_file:211696kB unevictable:16kB 
writepending:47352kB present:524288kB managed:503884kB mlocked:16kB 
pagetables:1396kB bounce:0kB free_pcp:1356kB local_pcp:8kB free_cma:109388kB
[  155.171534] lowmem_reserve[]: 0 0 0
[  155.171540] Normal: 365*4kB (UMEC) 188*8kB (UMEC) 153*16kB (UMC) 111*32kB 
(UMC) 73*64kB (UMC) 44*128kB (UC) 33*256kB (UC) 18*512kB (UC) 18*1024kB (UC) 
6*2048kB (C) 12*4096kB (C) = 116804kB
[  155.171568] 70535 total pagecache pages
[  155.171576] 0 pages in swap cache
[  155.171579] Swap cache stats: add 0, delete 0, find 0/0
[  155.171581] Free swap  = 251940kB
[  155.171583] Total swap = 251940kB
[  155.171585] 131072 pages RAM
[  155.171587] 0 pages HighMem/MovableOnly
[  155.171590] 5101 pages reserved
[  155.171592] 32768 pages cma reserved
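
For anyone reading the traps above, a minimal sketch of what the "order:N"
field means in terms of contiguous memory.  It assumes 4 KiB pages, which
matches this board (131072 pages RAM for 524288kB present).  Both function
names are my own; order_for_size() only mirrors the idea of the kernel's
get_order() and is not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one physically contiguous block of the given order. */
size_t block_bytes(unsigned int order, size_t page_size)
{
    assert(page_size > 0);
    return page_size << order;
}

/*
 * Smallest order whose block is large enough to hold `size` bytes.
 * Hypothetical userspace helper, not the kernel's get_order().
 */
unsigned int order_for_size(size_t size, size_t page_size)
{
    unsigned int order = 0;

    assert(page_size > 0);
    while (block_bytes(order, page_size) < size)
        order++;
    return order;
}
```

So each order:6 failure above is a request for one 256 KiB physically
contiguous block (block_bytes(6, 4096) = 262144), and the order:9 failure in
the earlier decompress trap is a request for 2 MiB (block_bytes(9, 4096) =
2097152).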


On Mon, Aug 31, 2020, at 6:39 PM, Chao Yu wrote:
> Hi,
> 
> We should align the max compress window size of zstd to the cluster size
> of the current inode. By default, the cluster size is 16KB (log size is 2),
> so this can significantly reduce the size of the allocated memory.
> 
> So, could you please try the patch below first?
> 
>  From c4bf178e5133525027d817a2ac542db6f5621c4f Mon Sep 17 00:00:00 2001
> From: Chao Yu <yuch...@huawei.com>
> Date: Tue, 1 Sep 2020 09:29:08 +0800
> Subject: [PATCH] fix memory allocation failure on zstd decompression
> 
> Signed-off-by: Chao Yu <yuch...@huawei.com>
> ---
>   fs/f2fs/compress.c | 7 ++++---
>   fs/f2fs/f2fs.h     | 2 +-
>   2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> index df097c4a71e1..357303d8514b 100644
> --- a/fs/f2fs/compress.c
> +++ b/fs/f2fs/compress.c
> @@ -382,16 +382,17 @@ static int zstd_init_decompress_ctx(struct decompress_io_ctx *dic)
>       ZSTD_DStream *stream;
>       void *workspace;
>       unsigned int workspace_size;
> +     unsigned int max_window_size =
> +                     MAX_COMPRESS_WINDOW_SIZE(dic->log_cluster_size);
> 
> -     workspace_size = ZSTD_DStreamWorkspaceBound(MAX_COMPRESS_WINDOW_SIZE);
> +     workspace_size = ZSTD_DStreamWorkspaceBound(max_window_size);
> 
>       workspace = f2fs_kvmalloc(F2FS_I_SB(dic->inode),
>                                       workspace_size, GFP_NOFS);
>       if (!workspace)
>               return -ENOMEM;
> 
> -     stream = ZSTD_initDStream(MAX_COMPRESS_WINDOW_SIZE,
> -                                     workspace, workspace_size);
> +     stream = ZSTD_initDStream(max_window_size, workspace, workspace_size);
>       if (!stream) {
>               printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initDStream failed\n",
>                               KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id,
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 21f86001bb3a..d210809292f9 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1419,7 +1419,7 @@ struct decompress_io_ctx {
>   #define NULL_CLUSTER                        ((unsigned int)(~0))
>   #define MIN_COMPRESS_LOG_SIZE               2
>   #define MAX_COMPRESS_LOG_SIZE               8
> -#define MAX_COMPRESS_WINDOW_SIZE     ((PAGE_SIZE) << MAX_COMPRESS_LOG_SIZE)
> +#define MAX_COMPRESS_WINDOW_SIZE(log_size)   ((PAGE_SIZE) << (log_size))
> 
>   struct f2fs_sb_info {
>       struct super_block *sb;                 /* pointer to VFS super block */
> -- 
> 2.26.2
> 
> 
> 
> On 2020/9/1 2:14, 5kft wrote:
> > Sounds good :-)  Perhaps it's simply that zstd needs a lot of memory to 
> > operate; however, it's unfortunate that it doesn't work on smaller 
> > platforms "out of the box" like lz4 does.  Should there be a note or 
> > guidance of some sort regarding this for smaller embedded platforms?
> > 
> > On Mon, Aug 31, 2020, at 11:04 AM, Jaegeuk Kim wrote:
> >> Let me add more f2fs folks. :)
> >>
> >> On 08/27, 5kft wrote:
> >>> (Note that for testing this I backported f2fs from 5.9-rc2 into 5.8.5, as 
> >>> I don't have 5.9 working on these boards yet.)
> >>>
> >>> On Thu, Aug 27, 2020, at 7:39 AM, 5kft wrote:
> >>>> Quick update - I encounter the problem with f2fs zstd compression in the 
> >>>> mainline 5.9-rc2 kernel as well - e.g.,
> >>>>
> >>>> [   67.668529] F2FS-fs (mmcblk0p1): Found nat_bits in checkpoint
> >>>> [   68.339021] F2FS-fs (mmcblk0p1): Mounted with checkpoint version = 
> >>>> 76732978
> >>>> [   93.862327] kworker/u8:2: page allocation failure: order:6, 
> >>>> mode:0x40c40(GFP_NOFS|__GFP_COMP), 
> >>>> nodemask=(null),cpuset=/,mems_allowed=0
> >>>> [   93.862360] CPU: 0 PID: 187 Comm: kworker/u8:2 Tainted: G         C   
> >>>>      5.8.5-sunxi #trunk
> >>>> [   93.862364] Hardware name: Allwinner sun8i Family
> >>>> [   93.862388] Workqueue: writeback wb_workfn (flush-179:0)
> >>>> [   93.862424] [<c010d6d5>] (unwind_backtrace) from [<c0109a55>] 
> >>>> (show_stack+0x11/0x14)
> >>>> [   93.862439] [<c0109a55>] (show_stack) from [<c056eae9>] 
> >>>> (dump_stack+0x75/0x84)
> >>>> [   93.862456] [<c056eae9>] (dump_stack) from [<c0243b8f>] 
> >>>> (warn_alloc+0xa3/0x104)
> >>>> [   93.862469] [<c0243b8f>] (warn_alloc) from [<c0244777>] 
> >>>> (__alloc_pages_nodemask+0xb87/0xc40)
> >>>> [   93.862483] [<c0244777>] (__alloc_pages_nodemask) from [<c02267fd>] 
> >>>> (kmalloc_order+0x19/0x38)
> >>>> [   93.862492] [<c02267fd>] (kmalloc_order) from [<c0226835>] 
> >>>> (kmalloc_order_trace+0x19/0x90)
> >>>> [   93.862506] [<c0226835>] (kmalloc_order_trace) from [<c047ddf5>] 
> >>>> (zstd_init_compress_ctx+0x51/0xfc)
> >>>> [   93.862518] [<c047ddf5>] (zstd_init_compress_ctx) from [<c047f90b>] 
> >>>> (f2fs_write_multi_pages+0x27b/0x6a0)
> >>>> [   93.862532] [<c047f90b>] (f2fs_write_multi_pages) from [<c046630d>] 
> >>>> (f2fs_write_cache_pages+0x415/0x538)
> >>>> [   93.862542] [<c046630d>] (f2fs_write_cache_pages) from [<c0466663>] 
> >>>> (f2fs_write_data_pages+0x233/0x264)
> >>>> [   93.862555] [<c0466663>] (f2fs_write_data_pages) from [<c0210ded>] 
> >>>> (do_writepages+0x35/0x98)
> >>>> [   93.862571] [<c0210ded>] (do_writepages) from [<c0290c4f>] 
> >>>> (__writeback_single_inode+0x2f/0x358)
> >>>> [   93.862584] [<c0290c4f>] (__writeback_single_inode) from [<c02910fd>] 
> >>>> (writeback_sb_inodes+0x185/0x378)
> >>>> [   93.862594] [<c02910fd>] (writeback_sb_inodes) from [<c0291321>] 
> >>>> (__writeback_inodes_wb+0x31/0x88)
> >>>> [   93.862603] [<c0291321>] (__writeback_inodes_wb) from [<c029156b>] 
> >>>> (wb_writeback+0x1f3/0x264)
> >>>> [   93.862612] [<c029156b>] (wb_writeback) from [<c0292461>] 
> >>>> (wb_workfn+0x24d/0x3a4)
> >>>> [   93.862624] [<c0292461>] (wb_workfn) from [<c0130b2f>] 
> >>>> (process_one_work+0x15f/0x3b0)
> >>>> [   93.862634] [<c0130b2f>] (process_one_work) from [<c0130e7b>] 
> >>>> (worker_thread+0xfb/0x3e0)
> >>>> [   93.862646] [<c0130e7b>] (worker_thread) from [<c0135c3b>] 
> >>>> (kthread+0xeb/0x10c)
> >>>> [   93.862656] [<c0135c3b>] (kthread) from [<c0100159>] 
> >>>> (ret_from_fork+0x11/0x38)
> >>>> [   93.862661] Exception stack(0xd4167fb0 to 0xd4167ff8)
> >>>> [   93.862667] 7fa0:                                     00000000 
> >>>> 00000000 00000000 00000000
> >>>> [   93.862674] 7fc0: 00000000 00000000 00000000 00000000 00000000 
> >>>> 00000000 00000000 00000000
> >>>> [   93.862680] 7fe0: 00000000 00000000 00000000 00000000 00000013 
> >>>> 00000000
> >>>> [   93.862686] Mem-Info:
> >>>> [   93.862699] active_anon:3457 inactive_anon:6470 isolated_anon:32
> >>>>                  active_file:14148 inactive_file:75224 isolated_file:0
> >>>>                  unevictable:4 dirty:10374 writeback:151
> >>>>                  slab_reclaimable:4946 slab_unreclaimable:8951
> >>>>                  mapped:5557 shmem:414 pagetables:332 bounce:0
> >>>>                  free:5946 free_pcp:118 free_cma:4292
> >>>> [   93.862709] Node 0 active_anon:13828kB inactive_anon:26032kB 
> >>>> active_file:56592kB inactive_file:300896kB unevictable:16kB 
> >>>> isolated(anon):0kB isolated(file):0kB mapped:22228kB dirty:41496kB 
> >>>> writeback:604kB shmem:1656kB writeback_tmp:0kB all_unreclaimable? no
> >>>> [   93.862725] Normal free:23784kB min:6904kB low:7604kB high:8304kB 
> >>>> reserved_highatomic:0KB active_anon:13956kB inactive_anon:25800kB 
> >>>> active_file:56592kB inactive_file:301212kB unevictable:16kB 
> >>>> writepending:42024kB present:524288kB managed:503888kB mlocked:16kB 
> >>>> kernel_stack:1200kB pagetables:1328kB bounce:0kB free_pcp:472kB 
> >>>> local_pcp:196kB free_cma:17168kB
> >>>> [   93.862727] lowmem_reserve[]: 0 0 0
> >>>> [   93.862734] Normal: 95*4kB (UMEC) 122*8kB (UMEC) 45*16kB (UMEC) 
> >>>> 32*32kB (UMEC) 17*64kB (UMEC) 7*128kB (UMEC) 4*256kB (U) 3*512kB (UC) 
> >>>> 0*1024kB 0*2048kB 4*4096kB (C) = 24028kB
> >>>> [   93.862762] 89790 total pagecache pages
> >>>> [   93.862768] 0 pages in swap cache
> >>>> [   93.862771] Swap cache stats: add 0, delete 0, find 0/0
> >>>> [   93.862773] Free swap  = 251940kB
> >>>> [   93.862775] Total swap = 251940kB
> >>>> [   93.862777] 131072 pages RAM
> >>>> [   93.862780] 0 pages HighMem/MovableOnly
> >>>> [   93.862782] 5100 pages reserved
> >>>> [   93.862784] 32768 pages cma reserved
> >>>>
> >>>> I haven't tried lowering MAX_COMPRESS_LOG_SIZE in this kernel yet but 
> >>>> will test this when I can.
> >>>>
> >>>> On Tue, Aug 25, 2020, at 1:31 PM, 5kft wrote:
> >>>>> Note that I don't think that this particular problem is a memleak as it 
> >>>>> happens very quickly when simply copying files to the zstd-mounted 
> >>>>> filesystem - but I haven't been able to compare the 5.8.3 changes to 
> >>>>> 5.9-rc1 yet.  This particular board boots up with vm.min_free_kbytes = 
> >>>>> 2406, which seems pretty low, but the board only has 512MB RAM on it 
> >>>>> total.  Kind of crazy I know, but it's a good test case for this 
> >>>>> problem :-)  Also, again lz4 compression works fine at this low value.
> >>>>>
> >>>>> I'm not sure that this particular change (lowering 
> >>>>> MAX_COMPRESS_LOG_SIZE) helps significantly.  I'm still seeing the 
> >>>>> failures even with vm.min_free_kbytes = 32768 (and this seems like a 
> >>>>> rather high value compared to the default).
> >>>>>
> >>>>> On Tue, Aug 25, 2020, at 12:43 PM, Jaegeuk Kim wrote:
> >>>>>> So, if there's no memleak in f2fs but we need to do something like 
> >>>>>> that, I feel that something is misconfigured in f2fs wrt zstd.
> >>>>>> I took a look at the zstd initialization flow, and it seems f2fs is 
> >>>>>> asking for too much memory for the workspace compared with btrfs.
> >>>>>> Could you please check whether replacing the below "8" with "5" 
> >>>>>> mitigates the problem? ("5" is used in btrfs.)
> >>>>>>
> >>>>>> In fs/f2fs/f2fs.h,
> >>>>>> #define MAX_COMPRESS_LOG_SIZE           8
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Aug 25, 2020 at 12:30 PM, 5kft <5...@5kft.org> wrote:
> >>>>>>> Will do!  Quick question - should these changes handle a low 
> >>>>>>> "vm.min_free_kbytes" situation with f2fs?  I can work around it for 
> >>>>>>> now by increasing this value per-board, although I don't know how 
> >>>>>>> high to increase it (and I'm not sure typical users of f2fs with 
> >>>>>>> compression would know how to determine the right value either).
> >>>>>>>
> >>>>>>> On Tue, Aug 25, 2020, at 12:25 PM, Jaegeuk Kim wrote:
> >>>>>>>> Oh, can you try to get the diff from up-to-date f2fs?
> >>>>>>>>
> >>>>>>>> # cd <5.8.3_branch>
> >>>>>>>> # git diff <5.9-rc1_branch> fs/f2fs
> >>>>>>>>
> >>>>>>>> On Tue, Aug 25, 2020 at 11:45 AM, 5kft <5...@5kft.org> wrote:
> >>>>>>>>> Indeed these changes are present in 5.8.3 (copy from the compress.c 
> >>>>>>>>> on my build):
> >>>>>>>>>
> >>>>>>>>>                  err = f2fs_write_compressed_pages(cc, submitted,
> >>>>>>>>>                                                          wbc, 
> >>>>>>>>> io_type);
> >>>>>>>>>                  cops->destroy_compress_ctx(cc);
> >>>>>>>>>                  kfree(cc->cpages);
> >>>>>>>>>                  cc->cpages = NULL;
> >>>>>>>>>                  if (!err)
> >>>>>>>>>                          return 0;
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 25, 2020, at 11:37 AM, Jaegeuk Kim wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> Thank you for the test and report. :)
> >>>>>>>>>>
> >>>>>>>>>> Just to check whether any fixes are missing: I guess the gap is 
> >>>>>>>>>> the recent 5.9-rc1 updates.
> >>>>>>>>>> At a glance, a potential memory leak was fixed by the commit 
> >>>>>>>>>> below, among them. Could you give it a try?
> >>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable.git/commit/?h=linux-5.4.y&id=721ef9e46dec3091fa7cd955da99ce83a850ab32
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 25, 2020 at 11:09 AM, 5kft <5...@5kft.org> wrote:
> >>>>>>>>>>> I did a little quick testing further on this problem, and I found 
> >>>>>>>>>>> that if I increase "vm.min_free_kbytes" then the allocations (not 
> >>>>>>>>>>> surprisingly) work and the failures go away.  E.g., this appears 
> >>>>>>>>>>> to make it work fine:
> >>>>>>>>>>>
> >>>>>>>>>>>      sysctl -w vm.min_free_kbytes=65536
> >>>>>>>>>>>
> >>>>>>>>>>> I didn't bisect this to find out what the lowest/safe minimum 
> >>>>>>>>>>> should be...
> >>>>>>>>>>>
> >>>>>>>>>>> Is there a way that F2FS should indicate that a change like this 
> >>>>>>>>>>> may be necessary when using zstd compression on some platforms?  
> >>>>>>>>>>> Perhaps this is just a documentation addition?  I just want to 
> >>>>>>>>>>> save others from the pain of a potentially corrupted filesystem 
> >>>>>>>>>>> when using zstd compression because F2FS was internally running 
> >>>>>>>>>>> out of memory (which is what happened to me...)
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Aug 25, 2020, at 7:47 AM, 5kft wrote:
> >>>>>>>>>>>> Hi Jaegeuk,
> >>>>>>>>>>>>
> >>>>>>>>>>>> First, I'd like to apologize in advance if a direct email isn't 
> >>>>>>>>>>>> appropriate for reporting bugs in f2fs; I'm not sure what the 
> >>>>>>>>>>>> accepted process is for reporting issues in F2FS.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am a contributor to the Armbian project 
> >>>>>>>>>>>> (https://www.armbian.com/ and https://github.com/armbian), and 
> >>>>>>>>>>>> have been using compression in F2FS for some time now - very 
> >>>>>>>>>>>> nice work - LZ4 compression works great!  Unfortunately, 
> >>>>>>>>>>>> however, when I try using "zstd" compression, I consistently get 
> >>>>>>>>>>>> numerous kernel page allocation failures (and not surprisingly 
> >>>>>>>>>>>> in some cases corruption of data from the filesystem).  I've 
> >>>>>>>>>>>> been seeing this for some time but finally got a few minutes to 
> >>>>>>>>>>>> write this email to you.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What follows is an example of the problem on a small SBC (Nano 
> >>>>>>>>>>>> Pi NEO Air - 
> >>>>>>>>>>>> https://www.friendlyarm.com/index.php?route=product/product&product_id=151),
> >>>>>>>>>>>>  although I have reproduced this issue on some 64-bit ARM A53 
> >>>>>>>>>>>> boards as well (e.g., w/1GB RAM, including the Nano Pi NEO2, 
> >>>>>>>>>>>> NEO2 Black, etc.)  I have not tried zstd on an amd64 machine yet.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This filesystem is formatted with compression ("-O 
> >>>>>>>>>>>> extra_attr,enable_compression"), and mounted to use zstd 
> >>>>>>>>>>>> compression ("-o compress_algorithm=zstd"), and the root mount 
> >>>>>>>>>>>> directory has compression enabled ("chattr +c mntpt").  After 
> >>>>>>>>>>>> doing a simple test copy of a number of files to it, it started 
> >>>>>>>>>>>> giving page allocation failures - example traps are provided 
> >>>>>>>>>>>> below.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm not sure if there are some kernel memory parameters that 
> >>>>>>>>>>>> need to be changed or something, but even so it seems to me that 
> >>>>>>>>>>>> this sort of thing shouldn't happen by default with a filesystem 
> >>>>>>>>>>>> :-)  Here are a couple of example failure cases, running on 
> >>>>>>>>>>>> stable kernel 5.8.3:
> >>>>>>>>>>>>
> >>>>>>>>>>>> [168053.070957] F2FS-fs (mmcblk0p1): Found nat_bits in checkpoint
> >>>>>>>>>>>> [168053.742204] F2FS-fs (mmcblk0p1): Mounted with checkpoint 
> >>>>>>>>>>>> version = 37a48fb3
> >>>>>>>>>>>> [168170.268522] kworker/u8:1: page allocation failure: order:6, 
> >>>>>>>>>>>> mode:0x40c40(GFP_NOFS|__GFP_COMP), 
> >>>>>>>>>>>> nodemask=(null),cpuset=/,mems_allowed=0
> >>>>>>>>>>>> [168170.268556] CPU: 3 PID: 7830 Comm: kworker/u8:1 Tainted: G   
> >>>>>>>>>>>>       C        5.8.3-sunxi #trunk
> >>>>>>>>>>>> [168170.268559] Hardware name: Allwinner sun8i Family
> >>>>>>>>>>>> [168170.268580] Workqueue: writeback wb_workfn (flush-179:24)
> >>>>>>>>>>>> [168170.268611] [<c010d6d5>] (unwind_backtrace) from 
> >>>>>>>>>>>> [<c0109a55>] (show_stack+0x11/0x14)
> >>>>>>>>>>>> [168170.268624] [<c0109a55>] (show_stack) from [<c056d489>] 
> >>>>>>>>>>>> (dump_stack+0x75/0x84)
> >>>>>>>>>>>> [168170.268639] [<c056d489>] (dump_stack) from [<c0243b53>] 
> >>>>>>>>>>>> (warn_alloc+0xa3/0x104)
> >>>>>>>>>>>> [168170.268651] [<c0243b53>] (warn_alloc) from [<c024473b>] 
> >>>>>>>>>>>> (__alloc_pages_nodemask+0xb87/0xc40)
> >>>>>>>>>>>> [168170.268662] [<c024473b>] (__alloc_pages_nodemask) from 
> >>>>>>>>>>>> [<c02267c5>] (kmalloc_order+0x19/0x38)
> >>>>>>>>>>>> [168170.268672] [<c02267c5>] (kmalloc_order) from [<c02267fd>] 
> >>>>>>>>>>>> (kmalloc_order_trace+0x19/0x90)
> >>>>>>>>>>>> [168170.268685] [<c02267fd>] (kmalloc_order_trace) from 
> >>>>>>>>>>>> [<c047c805>] (zstd_init_compress_ctx+0x51/0xfc)
> >>>>>>>>>>>> [168170.268697] [<c047c805>] (zstd_init_compress_ctx) from 
> >>>>>>>>>>>> [<c047e2bd>] (f2fs_write_multi_pages+0x269/0x68c)
> >>>>>>>>>>>> [168170.268708] [<c047e2bd>] (f2fs_write_multi_pages) from 
> >>>>>>>>>>>> [<c0465163>] (f2fs_write_cache_pages+0x3bf/0x538)
> >>>>>>>>>>>> [168170.268718] [<c0465163>] (f2fs_write_cache_pages) from 
> >>>>>>>>>>>> [<c046550f>] (f2fs_write_data_pages+0x233/0x264)
> >>>>>>>>>>>> [168170.268730] [<c046550f>] (f2fs_write_data_pages) from 
> >>>>>>>>>>>> [<c0210db5>] (do_writepages+0x35/0x98)
> >>>>>>>>>>>> [168170.268745] [<c0210db5>] (do_writepages) from [<c0290c17>] 
> >>>>>>>>>>>> (__writeback_single_inode+0x2f/0x358)
> >>>>>>>>>>>> [168170.268757] [<c0290c17>] (__writeback_single_inode) from 
> >>>>>>>>>>>> [<c02910c5>] (writeback_sb_inodes+0x185/0x378)
> >>>>>>>>>>>> [168170.268766] [<c02910c5>] (writeback_sb_inodes) from 
> >>>>>>>>>>>> [<c02912e9>] (__writeback_inodes_wb+0x31/0x88)
> >>>>>>>>>>>> [168170.268776] [<c02912e9>] (__writeback_inodes_wb) from 
> >>>>>>>>>>>> [<c0291533>] (wb_writeback+0x1f3/0x264)
> >>>>>>>>>>>> [168170.268783] [<c0291533>] (wb_writeback) from [<c0292429>] 
> >>>>>>>>>>>> (wb_workfn+0x24d/0x3a4)
> >>>>>>>>>>>> [168170.268794] [<c0292429>] (wb_workfn) from [<c0130b2f>] 
> >>>>>>>>>>>> (process_one_work+0x15f/0x3b0)
> >>>>>>>>>>>> [168170.268803] [<c0130b2f>] (process_one_work) from 
> >>>>>>>>>>>> [<c0130e7b>] (worker_thread+0xfb/0x3e0)
> >>>>>>>>>>>> [168170.268813] [<c0130e7b>] (worker_thread) from [<c0135c3b>] 
> >>>>>>>>>>>> (kthread+0xeb/0x10c)
> >>>>>>>>>>>> [168170.268824] [<c0135c3b>] (kthread) from [<c0100159>] 
> >>>>>>>>>>>> (ret_from_fork+0x11/0x38)
> >>>>>>>>>>>> [168170.268829] Exception stack(0xccb67fb0 to 0xccb67ff8)
> >>>>>>>>>>>> [168170.268835] 7fa0:                                     
> >>>>>>>>>>>> 00000000 00000000 00000000 00000000
> >>>>>>>>>>>> [168170.268842] 7fc0: 00000000 00000000 00000000 00000000 
> >>>>>>>>>>>> 00000000 00000000 00000000 00000000
> >>>>>>>>>>>> [168170.268848] 7fe0: 00000000 00000000 00000000 00000000 
> >>>>>>>>>>>> 00000013 00000000
> >>>>>>>>>>>> [168170.268853] Mem-Info:
> >>>>>>>>>>>> [168170.268867] active_anon:2089 inactive_anon:5866 
> >>>>>>>>>>>> isolated_anon:0
> >>>>>>>>>>>>                   active_file:41402 inactive_file:37715 
> >>>>>>>>>>>> isolated_file:0
> >>>>>>>>>>>>                   unevictable:4 dirty:9162 writeback:90
> >>>>>>>>>>>>                   slab_reclaimable:5935 slab_unreclaimable:10851
> >>>>>>>>>>>>                   mapped:4694 shmem:881 pagetables:369 bounce:0
> >>>>>>>>>>>>                   free:12678 free_pcp:201 free_cma:11324
> >>>>>>>>>>>> [168170.268877] Node 0 active_anon:8356kB inactive_anon:23464kB 
> >>>>>>>>>>>> active_file:165608kB inactive_file:150860kB unevictable:16kB 
> >>>>>>>>>>>> isolated(anon):0kB isolated(file):0kB mapped:18776kB 
> >>>>>>>>>>>> dirty:36648kB writeback:360kB shmem:3524kB writeback_tmp:0kB 
> >>>>>>>>>>>> all_unreclaimable? no
> >>>>>>>>>>>> [168170.268891] Normal free:50712kB min:6500kB low:7100kB 
> >>>>>>>>>>>> high:7700kB reserved_highatomic:0KB active_anon:8356kB 
> >>>>>>>>>>>> inactive_anon:23464kB active_file:165764kB 
> >>>>>>>>>>>> inactive_file:150884kB unevictable:16kB writepending:36944kB 
> >>>>>>>>>>>> present:524288kB managed:503888kB mlocked:16kB 
> >>>>>>>>>>>> kernel_stack:1144kB pagetables:1476kB bounce:0kB free_pcp:828kB 
> >>>>>>>>>>>> local_pcp:116kB free_cma:45296kB
> >>>>>>>>>>>> [168170.268893] lowmem_reserve[]: 0 0 0
> >>>>>>>>>>>> [168170.268899] Normal: 1096*4kB (UMEC) 217*8kB (UMEC) 132*16kB 
> >>>>>>>>>>>> (UMEC) 82*32kB (UMEC) 283*64kB (UC) 72*128kB (C) 16*256kB (UC) 
> >>>>>>>>>>>> 9*512kB (UC) 4*1024kB (C) 0*2048kB 0*4096kB = 50984kB
> >>>>>>>>>>>> [168170.268927] 80105 total pagecache pages
> >>>>>>>>>>>> [168170.268933] 72 pages in swap cache
> >>>>>>>>>>>> [168170.268937] Swap cache stats: add 5255, delete 5182, find 
> >>>>>>>>>>>> 5492/6131
> >>>>>>>>>>>> [168170.268939] Free swap  = 232484kB
> >>>>>>>>>>>> [168170.268941] Total swap = 251940kB
> >>>>>>>>>>>> [168170.268944] 131072 pages RAM
> >>>>>>>>>>>> [168170.268946] 0 pages HighMem/MovableOnly
> >>>>>>>>>>>> [168170.268948] 5100 pages reserved
> >>>>>>>>>>>> [168170.268951] 32768 pages cma reserved
> >>>>>>>>>>>> [168182.775001] warn_alloc: 84 callbacks suppressed
> >>>>>>>>>>>> [168182.775115] kworker/u9:3: page allocation failure: order:9, 
> >>>>>>>>>>>> mode:0x40c40(GFP_NOFS|__GFP_COMP), 
> >>>>>>>>>>>> nodemask=(null),cpuset=/,mems_allowed=0
> >>>>>>>>>>>> [168182.775235] CPU: 3 PID: 8168 Comm: kworker/u9:3 Tainted: G   
> >>>>>>>>>>>>       C        5.8.3-sunxi #trunk
> >>>>>>>>>>>> [168182.775246] Hardware name: Allwinner sun8i Family
> >>>>>>>>>>>> [168182.775367] Workqueue: f2fs_post_read_wq f2fs_post_read_work
> >>>>>>>>>>>> [168182.775534] [<c010d6d5>] (unwind_backtrace) from 
> >>>>>>>>>>>> [<c0109a55>] (show_stack+0x11/0x14)
> >>>>>>>>>>>> [168182.775584] [<c0109a55>] (show_stack) from [<c056d489>] 
> >>>>>>>>>>>> (dump_stack+0x75/0x84)
> >>>>>>>>>>>> [168182.775658] [<c056d489>] (dump_stack) from [<c0243b53>] 
> >>>>>>>>>>>> (warn_alloc+0xa3/0x104)
> >>>>>>>>>>>> [168182.775689] [<c0243b53>] (warn_alloc) from [<c024473b>] 
> >>>>>>>>>>>> (__alloc_pages_nodemask+0xb87/0xc40)
> >>>>>>>>>>>> [168182.775731] [<c024473b>] (__alloc_pages_nodemask) from 
> >>>>>>>>>>>> [<c02267c5>] (kmalloc_order+0x19/0x38)
> >>>>>>>>>>>> [168182.775757] [<c02267c5>] (kmalloc_order) from [<c02267fd>] 
> >>>>>>>>>>>> (kmalloc_order_trace+0x19/0x90)
> >>>>>>>>>>>> [168182.775797] [<c02267fd>] (kmalloc_order_trace) from 
> >>>>>>>>>>>> [<c047c665>] (zstd_init_decompress_ctx+0x21/0x88)
> >>>>>>>>>>>> [168182.775837] [<c047c665>] (zstd_init_decompress_ctx) from 
> >>>>>>>>>>>> [<c047e9cf>] (f2fs_decompress_pages+0x97/0x228)
> >>>>>>>>>>>> [168182.775860] [<c047e9cf>] (f2fs_decompress_pages) from 
> >>>>>>>>>>>> [<c045d0ab>] (__read_end_io+0xfb/0x130)
> >>>>>>>>>>>> [168182.775871] [<c045d0ab>] (__read_end_io) from [<c045d141>] 
> >>>>>>>>>>>> (f2fs_post_read_work+0x61/0x84)
> >>>>>>>>>>>> [168182.775884] [<c045d141>] (f2fs_post_read_work) from 
> >>>>>>>>>>>> [<c0130b2f>] (process_one_work+0x15f/0x3b0)
> >>>>>>>>>>>> [168182.775893] [<c0130b2f>] (process_one_work) from 
> >>>>>>>>>>>> [<c0130e7b>] (worker_thread+0xfb/0x3e0)
> >>>>>>>>>>>> [168182.775905] [<c0130e7b>] (worker_thread) from [<c0135c3b>] 
> >>>>>>>>>>>> (kthread+0xeb/0x10c)
> >>>>>>>>>>>> [168182.775919] [<c0135c3b>] (kthread) from [<c0100159>] 
> >>>>>>>>>>>> (ret_from_fork+0x11/0x38)
> >>>>>>>>>>>> [168182.775924] Exception stack(0xcfd5ffb0 to 0xcfd5fff8)
> >>>>>>>>>>>> [168182.775930] ffa0:                                     
> >>>>>>>>>>>> 00000000 00000000 00000000 00000000
> >>>>>>>>>>>> [168182.775937] ffc0: 00000000 00000000 00000000 00000000 
> >>>>>>>>>>>> 00000000 00000000 00000000 00000000
> >>>>>>>>>>>> [168182.775943] ffe0: 00000000 00000000 00000000 00000000 
> >>>>>>>>>>>> 00000013 00000000
> >>>>>>>>>>>> [168182.775949] Mem-Info:
> >>>>>>>>>>>> [168182.775968] active_anon:2361 inactive_anon:4620 
> >>>>>>>>>>>> isolated_anon:0
> >>>>>>>>>>>>                   active_file:16267 inactive_file:15209 
> >>>>>>>>>>>> isolated_file:0
> >>>>>>>>>>>>                   unevictable:4 dirty:3287 writeback:0
> >>>>>>>>>>>>                   slab_reclaimable:5976 slab_unreclaimable:11441
> >>>>>>>>>>>>                   mapped:3760 shmem:485 pagetables:396 bounce:0
> >>>>>>>>>>>>                   free:60170 free_pcp:71 free_cma:25015
> >>>>>>>>>>>> [168182.775980] Node 0 active_anon:9444kB inactive_anon:18480kB 
> >>>>>>>>>>>> active_file:65068kB inactive_file:60836kB unevictable:16kB 
> >>>>>>>>>>>> isolated(anon):0kB isolated(file):0kB mapped:15040kB 
> >>>>>>>>>>>> dirty:13148kB writeback:0kB shmem:1940kB writeback_tmp:0kB 
> >>>>>>>>>>>> all_unreclaimable? no
> >>>>>>>>>>>> [168182.775995] Normal free:240680kB min:2404kB low:3004kB 
> >>>>>>>>>>>> high:3604kB reserved_highatomic:0KB active_anon:9444kB 
> >>>>>>>>>>>> inactive_anon:18480kB active_file:65068kB inactive_file:60836kB 
> >>>>>>>>>>>> unevictable:16kB writepending:13112kB present:524288kB 
> >>>>>>>>>>>> managed:503888kB mlocked:16kB kernel_stack:1168kB 
> >>>>>>>>>>>> pagetables:1584kB bounce:0kB free_pcp:280kB local_pcp:16kB 
> >>>>>>>>>>>> free_cma:100060kB
> >>>>>>>>>>>> [168182.775996] lowmem_reserve[]: 0 0 0
> >>>>>>>>>>>> [168182.776003] Normal: 4668*4kB (UMEC) 4945*8kB (UMEC) 
> >>>>>>>>>>>> 3001*16kB (UEC) 1684*32kB (UMEC) 584*64kB (UMEC) 157*128kB 
> >>>>>>>>>>>> (UMEC) 39*256kB (UMEC) 12*512kB (UMC) 7*1024kB (UMC) 0*2048kB 
> >>>>>>>>>>>> 0*4096kB = 240904kB
> >>>>>>>>>>>> [168182.776032] 32082 total pagecache pages
> >>>>>>>>>>>> [168182.776039] 66 pages in swap cache
> >>>>>>>>>>>> [168182.776043] Swap cache stats: add 6730, delete 6663, find 
> >>>>>>>>>>>> 5492/6140
> >>>>>>>>>>>> [168182.776045] Free swap  = 227108kB
> >>>>>>>>>>>> [168182.776047] Total swap = 251940kB
> >>>>>>>>>>>> [168182.776050] 131072 pages RAM
> >>>>>>>>>>>> [168182.776052] 0 pages HighMem/MovableOnly
> >>>>>>>>>>>> [168182.776054] 5100 pages reserved
> >>>>>>>>>>>> [168182.776056] 32768 pages cma reserved
> >>>>>>>>>>>>
> >>>>>>>>>>>> Again, I've had no issues on any of my boards when using lz4 
> >>>>>>>>>>>> compression, only with zstd.  (I have not had an opportunity to 
> >>>>>>>>>>>> try lzo-rle yet.)  I'm happy to try to provide more information 
> >>>>>>>>>>>> if necessary.  Thanks!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>
> > 
> > 
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 
>


