Hi David, this looks like the bug I have already reported twice:
https://www.spinics.net/lists/linux-btrfs/msg54394.html
https://www.spinics.net/lists/linux-btrfs/msg75104.html

The second thread contains Nikolay's debug patch, which can confirm whether you
are running out of global metadata reservations too.

Martin

On 21.4.2018 at 9:38, David Goodwin wrote:
> Hi,
>
> I'm running a 3TiB EBS based (2+1TiB devices) volume in EC2 which contains
> about 500 read-only snapshots.
>
> btrfs-progs v4.7.3
>
> There are two dmesg trace things below. The first one from a 4.9.77 kernel -
>
> ------------[ cut here ]------------
> BTRFS: error (device xvdg) in btrfs_run_delayed_refs:2967: errno=-28 No space left
> BTRFS info (device xvdg): forced readonly
> Apr 19 11:44:40 gateway1 kernel: [7648104.300115] WARNING: CPU: 2 PID: 963 at fs/btrfs/extent-tree.c:2967 btrfs_run_delayed_refs+0x27e/0x2b0 [btrfs]
> Apr 19 11:44:40 gateway1 kernel: [7648104.313268] BTRFS: Transaction aborted (error -28)
> Modules linked in: dm_mod nfsv3 ipt_REJECT nf_reject_ipv4 ipt_MASQUERADE
>  nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
>  nf_nat_ipv4 nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack xt_mu
>  nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc evdev
>  intel_rapl crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_pcsp snd_pcm
>  aesni_intel aes_x86_64 lrw gf128mul glue_helper snd_timer ablk_helper snd
>  cryptd soundcore ext4 crc16 jbd2 mbcache btrfs xor raid6_pq xen_netfront
>  xen_blkfront crc32c_intel
> CPU: 2 PID: 963 Comm: btrfs-transacti Not tainted 4.9.77-dg1 #1
> Apr 19 11:44:40 gateway1 kernel: [7648104.408561]
>  0000000000000000 ffffffff812f17a4 ffffc90043203d08 0000000000000000
>  ffffffff8107389e ffffffffa0157d5a ffffc90043203d58 ffff8802ccfd7170
>  ffff880394684800 ffff880394684800 000000000007315c ffffffff8107390f
> Call Trace:
>  [<ffffffff812f17a4>] ? dump_stack+0x5c/0x78
>  [<ffffffff8107389e>] ? __warn+0xbe/0xe0
>  [<ffffffff8107390f>] ? warn_slowpath_fmt+0x4f/0x60
>  [<ffffffffa00bd3fe>] ? btrfs_run_delayed_refs+0x27e/0x2b0 [btrfs]
>  [<ffffffffa00a7523>] ? btrfs_release_path+0x13/0x80 [btrfs]
>  [<ffffffffa00c1dc2>] ? btrfs_start_dirty_block_groups+0x2c2/0x450 [btrfs]
>  [<ffffffffa00d36ac>] ? btrfs_commit_transaction+0x14c/0xa30 [btrfs]
>  [<ffffffffa00d4026>] ? start_transaction+0x96/0x480 [btrfs]
>  [<ffffffffa00ce54c>] ? transaction_kthread+0x1dc/0x200 [btrfs]
>  [<ffffffffa00ce370>] ? btrfs_cleanup_transaction+0x550/0x550 [btrfs]
>  [<ffffffff81091ef7>] ? kthread+0xc7/0xe0
>  [<ffffffff81091e30>] ? kthread_park+0x60/0x60
>  [<ffffffff815a3174>] ? ret_from_fork+0x54/0x60
> ---[ end trace 69ca1332d91b4310 ]---
> BTRFS: error (device xvdg) in btrfs_run_delayed_refs:2967: errno=-28 No space left
> BTRFS error (device xvdg): parent transid verify failed on 5400398217216 wanted 1893543 found 1893366
>
>
>
> On checking btrfs fi us there was plenty of unallocated space left.
>
> % btrfs fi us /broken/
>
> Overall:
>     Device size:                   3.06TiB
>     Device allocated:              2.43TiB
>     Device unallocated:          643.09GiB
>     Device missing:                  0.00B
>     Used:                          2.43TiB
>     Free (estimated):            646.41GiB (min: 646.41GiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:              512.00MiB (used: 0.00B)
>
> ....
>
> The VM was then rebooted with a 4.16.2 kernel, which encountered what I
> assume is the same problem:
>
>
> ------------[ cut here ]------------
> BTRFS: Transaction aborted (error -28)
> WARNING: CPU: 2 PID: 981 at fs/btrfs/extent-tree.c:6990 __btrfs_free_extent.isra.63+0x3d2/0xd20 [btrfs]
> Modules linked in: nfsv3 ipt_REJECT nf_reject_ipv4 ipt_MASQUERADE
>  nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
>  nf_nat_ipv4 nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack libcrc32c
>  crc32c_generic xt_multiport iptable_filter ip_tables x_tables autofs4 nfsd
>  auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc intel_rapl
>  crct10dif_pclmul crc32_pclmul ghash_clmulni_intel evdev pcbc snd_pcsp
>  aesni_intel snd_pcm aes_x86_64 snd_timer crypto_simd glue_helper snd cryptd
>  soundcore ext4 crc16 mbcache jbd2 btrfs xor zstd_decompress zstd_compress
>  xxhash raid6_pq xen_netfront xen_blkfront crc32c_intel
> CPU: 2 PID: 981 Comm: btrfs-transacti Not tainted 4.16.2-dg1 #1
> RIP: e030:__btrfs_free_extent.isra.63+0x3d2/0xd20 [btrfs]
> RSP: e02b:ffffc900428d7c68 EFLAGS: 00010292
> RAX: 0000000000000026 RBX: 000001fb8031c000 RCX: 0000000000000006
> RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff88039a916650
> RBP: 00000000ffffffe4 R08: 0000000000000001 R09: 000000000000010a
> R10: 0000000000000001 R11: 000000000000010a R12: ffff8803957e6000
> R13: ffff88036f5a9e70 R14: 0000000000000000 R15: 0000000000000002
> FS:  0000000000000000(0000) GS:ffff88039a900000(0000) knlGS:ffff88039a900000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f2168f8c274 CR3: 000000038fd88000 CR4: 0000000000000660
> Call Trace:
>  ? btrfs_merge_delayed_refs+0x23c/0x3c0 [btrfs]
>  __btrfs_run_delayed_refs+0x320/0x1150 [btrfs]
>  btrfs_run_delayed_refs+0x105/0x1c0 [btrfs]
>  btrfs_commit_transaction+0x393/0x8a0 [btrfs]
>  ? start_transaction+0x93/0x420 [btrfs]
>  transaction_kthread+0x195/0x1b0 [btrfs]
>  kthread+0xf8/0x130
>  ? btrfs_cleanup_transaction+0x520/0x520 [btrfs]
>  ? kthread_create_worker_on_cpu+0x50/0x50
>  ret_from_fork+0x35/0x40
> Code: 48 8b 04 24 48 8b 40 50 f0 48 0f ba a8 d0 16 00 00 02 72 19 83 fd fb 0f 84 07 03 00 00 89 ee 48 c7 c7 28 29 16 a0 e8 9e b2 fb e0 <0f> 0b 48 8b 3c 24 89 e9 ba 4e 1b 00 00 48 c7 c6 80 b8 15 a0 e8
> ---[ end trace 7d4d4006f7a3a06e ]---
>
> BTRFS: error (device xvdg) in __btrfs_free_extent:6990: errno=-28 No space left
>
>
> The volume appears to be usable when mounted read-only.
>
> Hopefully the above might help someone remove what I ignorantly assume is a
> bug.
>
> David.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
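
For reference, the figures in the quoted "btrfs fi us" output can be checked with a few lines of Python. This is only a minimal sketch: the names parse_overall, UNITS, SIZE_RE and REPORT are made up for illustration, the numbers are copied by hand from the report above, and nothing here calls btrfs itself. It only restates the sizes in bytes; it cannot tell whether metadata reservations ran out, which is what the debug patch is for.

```python
import re

# Binary unit multipliers as they appear in the quoted output (B, MiB, GiB, TiB).
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Matches lines of the form "<label>:   <number><unit>", e.g. "Device size: 3.06TiB".
SIZE_RE = re.compile(r"^\s*([A-Za-z ()]+?):\s+([0-9.]+)(B|KiB|MiB|GiB|TiB)\b")


def parse_overall(text):
    """Return {label: size_in_bytes} for every sized line in text."""
    sizes = {}
    for line in text.splitlines():
        m = SIZE_RE.match(line)
        if m:
            label, number, unit = m.groups()
            sizes[label.strip()] = float(number) * UNITS[unit]
    return sizes


# Figures copied from the quoted report (hypothetical constant name).
REPORT = """\
Overall:
    Device size:           3.06TiB
    Device allocated:      2.43TiB
    Device unallocated:  643.09GiB
    Device missing:          0.00B
    Used:                  2.43TiB
    Free (estimated):    646.41GiB (min: 646.41GiB)
    Global reserve:      512.00MiB (used: 0.00B)
"""

sizes = parse_overall(REPORT)
unallocated_share = sizes["Device unallocated"] / sizes["Device size"]
print(f"unallocated: {unallocated_share:.1%} of the device")
```

With the quoted numbers, roughly a fifth of the device is still unallocated, which matches David's observation that the ENOSPC abort happened with plenty of unallocated space left.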