I don't think any special mount options are in use:
/dev/md127 on /data type btrfs
(rw,noatime,nodiratime,nospace_cache,subvolid=5,subvol=/)
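
To double-check which options are actually set, the option field from the
mount line above can be split programmatically. This is only a minimal Python
sketch: the function name `parse_mount_options` and the `BASELINE` set are
assumptions made up for illustration, not a list of btrfs defaults.

```python
# Hypothetical helper: split a mount option field such as
# "(rw,noatime,subvolid=5)" into a dict of option -> value.
def parse_mount_options(option_field):
    opts = {}
    for item in option_field.strip().strip("()").split(","):
        key, sep, value = item.partition("=")
        # Flag-style options (no "=") are stored as True.
        opts[key] = value if sep else True
    return opts

# Assumption: options treated as unremarkable for this check only.
BASELINE = {"rw", "subvolid", "subvol"}

opts = parse_mount_options(
    "(rw,noatime,nodiratime,nospace_cache,subvolid=5,subvol=/)"
)
non_baseline = [name for name in opts if name not in BASELINE]
print(non_baseline)
```

With the line above, this reports noatime, nodiratime and nospace_cache as
the options outside that baseline.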

OK, I will try the latest kernel.

Thanks.

On Thu, Dec 7, 2017 at 9:21 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017-12-07 09:19, Taibai Li wrote:
>> Thanks for the quick response. I tried to test this on a 4.4.100
>> kernel with quota disabled:
>> # btrfs qgroup show /data/
>> ERROR: can't list qgroups: quotas not enabled
>>
>> But it still seems to hit OOM after about 7 hours, having copied 144G
>> of files. Any other ideas? Maybe I will try testing with quota disabled
>> on the 4.14 kernel too.
>
> Trying the latest kernel is always a good idea.
>
> Apart from qgroup, I am not sure what the cause could be.
>
> Are any special mount options in use?
>
> Thanks,
> Qu
>
>>
>> thanks.
>>
>> On Wed, Dec 6, 2017 at 2:25 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>> On 2017-12-06 14:22, taibai li wrote:
>>>> Hi Guys,
>>>>
>>>> I hit OOM issues with a NAS box running a 4.4.x kernel, so I built
>>>> a 4.14.3 kernel to try that. The testbed is:
>>>> a NAS box with 2G memory and a single-disk RAID. I set up an NFS server
>>>> in sync mode, added the storage to ESXi 6.0 servers, and backed up all
>>>> the VMs to it with the ghettoVCB script. After about 14 hours, an
>>>> Input/Output error happened;
>>>> I checked the box and found OOM.
>>>> # cat /etc/exports
>>>> "/data/Videos" *(insecure,insecure_locks,no_subtree_check,crossmnt,anonuid=99,anongid=99,root_squash,rw,sync)
>>>> # uname -a
>>>> Linux lzx-314-desk 4.14.3.x86_64.1 #1 SMP Fri Dec 1 01:31:25 UTC 2017
>>>> x86_64 GNU/Linux
>>>> # btrfs fi show /data/
>>>> Label: '43f611ae:data' uuid: 6fefb319-a21d-476e-9642-565e0600a049
>>>> Total devices 1 FS bytes used 292.78GiB
>>>> devid 1 size 1.81TiB used 296.02GiB path /dev/md127
>>>>
>>>> The stack is:
>>>> Dec 04 23:47:43 lzx-314-desk kernel: nfsd: page allocation stalls for
>>>> 621031ms, order:0, mode:0x14000c0(GFP_KERNEL), nodemask=(null)
>>>> Dec 04 23:47:43 lzx-314-desk kernel: nfsd cpuset=/ mems_allowed=0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: CPU: 0 PID: 3376 Comm: nfsd Not
>>>> tainted 4.14.3.x86_64.1 #1
>>>> Dec 04 23:47:43 lzx-314-desk kernel: Hardware name: NETGEAR ReadyNAS
>>>> 314/To be filled by O.E.M., BIOS 4.6.5 11/05/2013
>>>> Dec 04 23:47:43 lzx-314-desk kernel: Call Trace:
>>>> Dec 04 23:47:43 lzx-314-desk kernel: dump_stack+0x4d/0x6a
>>>> Dec 04 23:47:43 lzx-314-desk kernel: warn_alloc+0xe3/0x180
>>>> Dec 04 23:47:43 lzx-314-desk kernel: __alloc_pages_nodemask+0xb1e/0xed0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: svc_recv+0x99/0x900
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? svc_process+0x241/0x690
>>>> Dec 04 23:47:43 lzx-314-desk kernel: nfsd+0xd2/0x150
>>>> Dec 04 23:47:43 lzx-314-desk kernel: kthread+0x11a/0x150
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? nfsd_destroy+0x60/0x60
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? kthread_create_on_node+0x40/0x40
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ret_from_fork+0x22/0x30
>>>> Dec 04 23:47:43 lzx-314-desk kernel: readynasd invoked oom-killer:
>>>> gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0,
>>>> oom_score_adj=-1000
>>>> Dec 04 23:47:43 lzx-314-desk kernel: readynasd cpuset=/ mems_allowed=0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: CPU: 3 PID: 3307 Comm: readynasd
>>>> Not tainted 4.14.3.x86_64.1 #1
>>>> Dec 04 23:47:43 lzx-314-desk kernel: Hardware name: NETGEAR ReadyNAS
>>>> 314/To be filled by O.E.M., BIOS 4.6.5 11/05/2013
>>>> Dec 04 23:47:43 lzx-314-desk kernel: Call Trace:
>>>> Dec 04 23:47:43 lzx-314-desk kernel: dump_stack+0x4d/0x6a
>>>> Dec 04 23:47:43 lzx-314-desk kernel: dump_header+0x9a/0x21b
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? pick_next_task_fair+0x1d5/0x4b0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? security_capable_noaudit+0x40/0x60
>>>> Dec 04 23:47:43 lzx-314-desk kernel: oom_kill_process+0x216/0x430
>>>> Dec 04 23:47:43 lzx-314-desk kernel: out_of_memory+0xf9/0x2e0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: __alloc_pages_nodemask+0xd6c/0xed0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: __read_swap_cache_async+0x11d/0x190
>>>> Dec 04 23:47:43 lzx-314-desk kernel: read_swap_cache_async+0x17/0x40
>>>> Dec 04 23:47:43 lzx-314-desk kernel: swapin_readahead+0x1f1/0x230
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? find_get_entry+0x19/0xf0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? pagecache_get_page+0x27/0x210
>>>> Dec 04 23:47:43 lzx-314-desk kernel: do_swap_page+0x432/0x590
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? do_swap_page+0x432/0x590
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? 
>>>> poll_select_copy_remaining+0x120/0x120
>>>> Dec 04 23:47:43 lzx-314-desk kernel: __handle_mm_fault+0x33e/0xa20
>>>> Dec 04 23:47:43 lzx-314-desk kernel: handle_mm_fault+0x14a/0x1d0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: __do_page_fault+0x212/0x440
>>>> Dec 04 23:47:43 lzx-314-desk kernel: page_fault+0x22/0x30
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RIP:
>>>> 0010:copy_user_generic_unrolled+0x89/0xc0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RSP: 0000:ffffc90000cdbd70 EFLAGS: 
>>>> 00010202
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RAX: 0000000000000000 RBX:
>>>> 0000000000000008 RCX: 0000000000000001
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RDX: 0000000000000000 RSI:
>>>> ffffc90000cdbdd8 RDI: 00007ff45e7fba80
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RBP: ffffc90000cdbee8 R08:
>>>> 0000000000000000 R09: 0000000000000104
>>>> Dec 04 23:47:43 lzx-314-desk kernel: R10: ffffc90000cdbd78 R11:
>>>> 0000000000000104 R12: 0000000000000000
>>>> Dec 04 23:47:43 lzx-314-desk kernel: R13: ffffc90000cdbdc0 R14:
>>>> 00007ff45e7fba80 R15: ffffc90000cdbdc0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? core_sys_select+0x208/0x2a0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? __handle_mm_fault+0x4fc/0xa20
>>>> Dec 04 23:47:43 lzx-314-desk kernel: ? ktime_get_ts64+0x44/0xe0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: SyS_select+0xa6/0xe0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: entry_SYSCALL_64_fastpath+0x13/0x94
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RIP: 0033:0x7ff488f05893
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RSP: 002b:00007ff45e7fb9f0
>>>> EFLAGS: 00000293 ORIG_RAX: 0000000000000017
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RAX: ffffffffffffffda RBX:
>>>> 0000000000000018 RCX: 00007ff488f05893
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RDX: 0000000000000000 RSI:
>>>> 00007ff45e7fba80 RDI: 0000000000000019
>>>> Dec 04 23:47:43 lzx-314-desk kernel: RBP: 00007ff48fc4b0f0 R08:
>>>> 00007ff45e7fba70 R09: 00007ff4480008c0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: R10: 0000000000000000 R11:
>>>> 0000000000000293 R12: 0000000000000000
>>>> Dec 04 23:47:43 lzx-314-desk kernel: R13: 0000000000000000 R14:
>>>> 0000000001999670 R15: 00007ff4480008c0
>>>> Dec 04 23:47:43 lzx-314-desk kernel: Mem-Info:
>>>> Dec 04 23:47:43 lzx-314-desk kernel: active_anon:0 inactive_anon:0
>>>> isolated_anon:0
>>>> active_file:322 inactive_file:461978 isolated_file:352
>>>> unevictable:0 dirty:136 writeback:0 unstable:0
>>>> slab_reclaimable:15780 slab_unreclaimable:5246
>>>> mapped:1 shmem:0 pagetables:1165 bounce:0
>>>> free:13867 free_pcp:60 free_cma:0
>>>> ......
>>>>
>>>> I tried formatting the md device as ext4, and then backing up all
>>>> the VMs works fine. It also works if I use the async option for NFS,
>>>> so it seems btrfs sometimes consumes more memory.
>>>>
>>>> I attached the full logs.
>>>>
>>>> Has anyone hit a similar issue, or does anyone have any ideas?
>>>
>>> Are you using btrfs qgroups (quota)?
>>>
>>> It's known that qgroup takes extra memory and may cause OOM if a lot of
>>> extents are modified in the current transaction.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> thanks so much.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>
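
A side note on the Mem-Info block quoted above: the counters are in pages, so
a rough conversion to MiB helps when reading them. The following is a minimal
Python sketch, assuming 4 KiB pages (usual for x86_64, but not stated in the
log). The helper names `parse_mem_info` and `pages_to_mib` are made up for
illustration.

```python
import re

# Assumption: 4 KiB base pages; the log itself does not state the page size.
PAGE_SIZE_BYTES = 4096

# Counters copied from the Mem-Info block quoted above.
MEM_INFO = """\
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:322 inactive_file:461978 isolated_file:352
unevictable:0 dirty:136 writeback:0 unstable:0
slab_reclaimable:15780 slab_unreclaimable:5246
mapped:1 shmem:0 pagetables:1165 bounce:0
free:13867 free_pcp:60 free_cma:0
"""

def parse_mem_info(text):
    """Return a dict of counter name -> page count."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", text)}

def pages_to_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)

counters = parse_mem_info(MEM_INFO)
for name in ("inactive_file", "free", "slab_reclaimable", "slab_unreclaimable"):
    print(f"{name:20s} {pages_to_mib(counters[name]):8.1f} MiB")
```

Under that page-size assumption, inactive_file comes to roughly 1.8 GiB, so
almost all of the box's 2G is counted as inactive page cache, while the anon
counters are zero.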