Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On 07/04/2012 11:15 PM, Marc MERLIN wrote: On Wed, Jul 04, 2012 at 01:58:31PM +0800, Liu Bo wrote: The dmesg log, sysrq log and stack dump info can usually be very helpful. From your report, we can see the csum error and hang on log, 'no csum' is not that bad while hanging-on is serious and dangerous. so can you please get any 'sysrq + w' log in the hanging-on case and paste them here, and the log may tell us who blocks other threads. Hi, thanks for the answer. I dumped all sysrq data, that was in my original Email. Here are two different sysrq+w runs, as well as aborted transaction messages from that Email. Sorry that the original was a bit long and contained a bunch of sysrq output. From doing further testing since then, it does seem that the code just start doing bad things, including the file corruption I saw, when I'm running low on free space. Anything else that would help? Thanks, Marc I'd expect to get some info from the following one, but I fails. Is it reproducible on your box? thanks, liubo Back to 3.2.16, I'm now seeing this: [ 840.516733] INFO: task VirtualBox:6818 blocked for more than 120 seconds. [ 840.516735] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [ 840.516736] VirtualBox D 8801fd134080 0 6818 6758 0x0080 [ 840.516740] 8801fd134080 0086 0050 880202e7f100 [ 840.516744] 00013580 8801c6f0dfd8 8801c6f0dfd8 8801fd134080 [ 840.516748] 8801c6f0da68 8801c6f0da68 88020a4e22f0 88023bc13e08 [ 840.516752] Call Trace: [ 840.516755] [810b5c67] ? __lock_page+0x66/0x66 [ 840.516758] [8134aea4] ? io_schedule+0x58/0x6f [ 840.516761] [810b5c6d] ? sleep_on_page+0x6/0xa [ 840.516764] [8134b1e5] ? __wait_on_bit_lock+0x3c/0x85 [ 840.516767] [810b5c62] ? __lock_page+0x61/0x66 [ 840.516770] [81060051] ? autoremove_wake_function+0x2a/0x2a [ 840.516785] [a01838d7] ? extent_write_cache_pages.isra.13.constprop.22+0xf6/0x278 [btrfs] [ 840.516789] [810ec9cb] ? __cache_free.isra.40+0x19/0x1a7 [ 840.516792] [8134ed52] ? sub_preempt_count+0x83/0x94 [ 840.516795] [8134c2dd] ? _raw_spin_unlock+0x24/0x30 [ 840.516811] [a0183c4b] ? extent_writepages+0x40/0x57 [btrfs] [ 840.516826] [a0177f5f] ? __btrfs_buffered_write+0x2bb/0x2dc [btrfs] [ 840.516841] [a016e88a] ? uncompress_inline.isra.44+0x116/0x116 [btrfs] [ 840.516844] [810b6aaf] ? __filemap_fdatawrite_range+0x4b/0x50 [ 840.516847] [810b6ad9] ? filemap_write_and_wait_range+0x25/0x4d [ 840.516863] [a01782ce] ? btrfs_file_aio_write+0x34e/0x490 [btrfs] [ 840.516866] [8103e092] ? get_parent_ip+0x9/0x1b [ 840.516882] [a0177f80] ? __btrfs_buffered_write+0x2dc/0x2dc [btrfs] [ 840.516886] [8112f19c] ? aio_rw_vect_retry+0x70/0x18e [ 840.516888] [8112f12c] ? aio_fsync+0x22/0x22 [ 840.516891] [8112fbc7] ? aio_run_iocb+0x72/0x11c [ 840.516894] [81130d9a] ? do_io_submit+0x6a4/0x7f9 [ 840.516898] [813508d2] ? system_call_fastpath+0x16/0x1b [ 1187.553635] btrfs: unlinked 8 orphans -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On Wed, Jul 04, 2012 at 01:58:31PM +0800, Liu Bo wrote: The dmesg log, sysrq log and stack dump info can usually be very helpful. From your report, we can see the csum error and hang on log, 'no csum' is not that bad while hanging-on is serious and dangerous. so can you please get any 'sysrq + w' log in the hanging-on case and paste them here, and the log may tell us who blocks other threads. Hi, thanks for the answer. I dumped all sysrq data, that was in my original Email. Here are two different sysrq+w runs, as well as aborted transaction messages from that Email. Sorry that the original was a bit long and contained a bunch of sysrq output. From doing further testing since then, it does seem that the code just start doing bad things, including the file corruption I saw, when I'm running low on free space. Anything else that would help? Thanks, Marc [84951.680847] Sched Debug Version: v0.10, 3.4.4-amd64-preempt-noide-20120410 #1 [84951.680850] ktime : 84951680.847904 [84951.680851] sched_clk : 55820767.832790 [84951.680853] cpu_clk : 84951680.846676 [84951.680855] jiffies : 4316130216 [84951.680857] sched_clock_stable : 0 [84951.680858] [84951.680859] sysctl_sched [84951.680860] .sysctl_sched_latency: 12.00 [84951.680862] .sysctl_sched_min_granularity: 1.50 [84951.680864] .sysctl_sched_wakeup_granularity : 2.00 [84951.680865] .sysctl_sched_child_runs_first : 0 [84951.680867] .sysctl_sched_features : 24119 [84951.680869] .sysctl_sched_tunable_scaling: 1 (logaritmic) [84951.680871] [84951.680872] cpu#0, 2527.110 MHz [84951.680873] .nr_running: 4 [84951.680875] .load : 1542 [84951.680876] .nr_switches : 298015357 [84951.680878] .nr_load_updates : 16858906 [84951.680879] .nr_uninterruptible: -7812 [84951.680881] .next_balance : 4316.130229 [84951.680883] .curr-pid : 6859 [84951.680884] .clock : 84951679.104864 [84951.680886] .cpu_load[0] : 1277 [84951.680887] .cpu_load[1] : 885 [84951.680889] .cpu_load[2] : 728 [84951.680890] .cpu_load[3] : 693 [84951.680892] .cpu_load[4] : 731 [84951.680894] [84951.680894] cfs_rq[0]:/autogroup-69 [84951.680896] .exec_clock: 0.00 [84951.680898] .MIN_vruntime : 0.01 [84951.680899] .min_vruntime : 194572.976125 [84951.680901] .max_vruntime : 0.01 [84951.680903] .spread: 0.00 [84951.680904] .spread0 : -52160826.593685 [84951.680906] .nr_spread_over: 0 [84951.680907] .nr_running: 0 [84951.680909] .load : 0 [84951.680910] .load_avg : 5179.920896 [84951.680912] .load_period : 8.310703 [84951.680913] .load_contrib : 623 [84951.680915] .load_tg : 623 [84951.680917] .se-exec_start: 84951675.234336 [84951.680918] .se-vruntime : 52355393.270692 [84951.680920] .se-sum_exec_runtime : 241715.485721 [84951.680922] .se-load.weight : 2 [84951.680923] [84951.680924] cfs_rq[0]:/autogroup-20 [84951.680925] .exec_clock: 0.00 [84951.680927] .MIN_vruntime : 0.01 [84951.680929] .min_vruntime : 13027.823103 [84951.680930] .max_vruntime : 0.01 [84951.680932] .spread: 0.00 [84951.680934] .spread0 : -52342371.746707 [84951.680935] .nr_spread_over: 0 [84951.680937] .nr_running: 0 [84951.680938] .load : 0 [84951.680940] .load_avg : 1279.999872 [84951.680941] .load_period : 6.444962 [84951.680943] .load_contrib : 198 [84951.680944] .load_tg : 257 [84951.680946] .se-exec_start: 84951664.094034 [84951.680948] .se-vruntime : 52355389.621584 [84951.680950] .se-sum_exec_runtime : 13082.655453 [84951.680951] .se-load.weight : 2 [84951.680953] [84951.680954] cfs_rq[0]:/autogroup-74 [84951.680955] .exec_clock: 0.00 [84951.680957] .MIN_vruntime : 1323610.934982 [84951.680959] .min_vruntime : 1323616.934982 [84951.680960]
Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On 07/03/2012 03:58 AM, Marc MERLIN wrote: On Fri, Jun 29, 2012 at 05:36:24AM -0700, Marc MERLIN wrote: On Tue, Jun 26, 2012 at 10:20:12PM -0700, Marc MERLIN wrote: On Tue, Jun 26, 2012 at 06:38:18PM -0700, Marc MERLIN wrote: Now, I'm also seeing these below and I have this again (86% CPU): 6076 root 20 0 000 R 86 0.0 29:40.11 btrfs-delalloc- How bad is it, doctor? I think I'll be going back to 3.2.16 for now though. I reverted to 3.2.16 and haven't had further problems after dropping the current snapshot that was corrupted in various ways. Now, I'm not sure when I should upgrade anymore since I haven't heard of any fixes for what I saw. Assuming I go forward again, is there something else I could have provided to help debug? Mmmh, ok. I understand that this code comes with no guarantees, and I have backups, but I'm reporting a problem that lead to corruption (I had multiple files that were corrupted in my latest snapshot and I had to drop it and revert to an older snapshot and then out of fear for 3.4.4, went back to 3.2.16). Hi Marc, Sorry for not replying this earlier. The dmesg log, sysrq log and stack dump info can usually be very helpful. From your report, we can see the csum error and hang on log, 'no csum' is not that bad while hanging-on is serious and dangerous. so can you please get any 'sysrq + w' log in the hanging-on case and paste them here, and the log may tell us who blocks other threads. I didn't see any problems with 3.2.16 (doesn't mean there weren't any, just that I didn't see any). Feel free to use the latest btrfs upstream, it always contains some fixes. thanks, liubo Since my filesystem was a bit full, and that triggers problems with btrfs, I freed up 70GB gandalfthegreat:~# btrfs fi show Label: 'btrfs_pool1' uuid: 873d526c-e911-4234-af1b-239889cd143d Total devices 1 FS bytes used 163.01GB devid1 size 231.02GB used 231.02GB path /dev/dm-0 I rebooted with 3.4.4 and started copying data, and for now I've gotten this: kernel: [ 832.108558] btrfs no csum found for inode 3896855 start 0 kernel: [ 832.108873] btrfs csum failed ino 3896855 off 0 csum 1150320628 private 0 How bad is this? More generally, what was missing from my previous report (I gave all the sysrq I could output) that no one seemed to be able to use it? Thanks, Marc Back to 3.2.16, I'm now seeing this: [ 840.516733] INFO: task VirtualBox:6818 blocked for more than 120 seconds. [ 840.516735] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [ 840.516736] VirtualBox D 8801fd134080 0 6818 6758 0x0080 [ 840.516740] 8801fd134080 0086 0050 880202e7f100 [ 840.516744] 00013580 8801c6f0dfd8 8801c6f0dfd8 8801fd134080 [ 840.516748] 8801c6f0da68 8801c6f0da68 88020a4e22f0 88023bc13e08 [ 840.516752] Call Trace: [ 840.516755] [810b5c67] ? __lock_page+0x66/0x66 [ 840.516758] [8134aea4] ? io_schedule+0x58/0x6f [ 840.516761] [810b5c6d] ? sleep_on_page+0x6/0xa [ 840.516764] [8134b1e5] ? __wait_on_bit_lock+0x3c/0x85 [ 840.516767] [810b5c62] ? __lock_page+0x61/0x66 [ 840.516770] [81060051] ? autoremove_wake_function+0x2a/0x2a [ 840.516785] [a01838d7] ? extent_write_cache_pages.isra.13.constprop.22+0xf6/0x278 [btrfs] [ 840.516789] [810ec9cb] ? __cache_free.isra.40+0x19/0x1a7 [ 840.516792] [8134ed52] ? sub_preempt_count+0x83/0x94 [ 840.516795] [8134c2dd] ? _raw_spin_unlock+0x24/0x30 [ 840.516811] [a0183c4b] ? extent_writepages+0x40/0x57 [btrfs] [ 840.516826] [a0177f5f] ? __btrfs_buffered_write+0x2bb/0x2dc [btrfs] [ 840.516841] [a016e88a] ? uncompress_inline.isra.44+0x116/0x116 [btrfs] [ 840.516844] [810b6aaf] ? __filemap_fdatawrite_range+0x4b/0x50 [ 840.516847] [810b6ad9] ? filemap_write_and_wait_range+0x25/0x4d [ 840.516863] [a01782ce] ? btrfs_file_aio_write+0x34e/0x490 [btrfs] [ 840.516866] [8103e092] ? get_parent_ip+0x9/0x1b [ 840.516882] [a0177f80] ? __btrfs_buffered_write+0x2dc/0x2dc [btrfs] [ 840.516886] [8112f19c] ? aio_rw_vect_retry+0x70/0x18e [ 840.516888] [8112f12c] ? aio_fsync+0x22/0x22 [ 840.516891] [8112fbc7] ? aio_run_iocb+0x72/0x11c [ 840.516894] [81130d9a] ? do_io_submit+0x6a4/0x7f9 [ 840.516898] [813508d2] ? system_call_fastpath+0x16/0x1b [ 1187.553635] btrfs: unlinked 8 orphans [ 3810.200064] e1000e :00:19.0: BAR 0: set to [mem 0xfc00-0xfc01] (PCI address [0xfc00-0xfc01]) [ 3810.200071] e1000e :00:19.0: BAR 1: set to [mem 0xfc025000-0xfc025fff] (PCI address [0xfc025000-0xfc025fff]) [ 3810.200076] e1000e :00:19.0: BAR 2: set to [io 0x1840-0x185f] (PCI address [0x1840-0x185f])
Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On Fri, Jun 29, 2012 at 05:36:24AM -0700, Marc MERLIN wrote: On Tue, Jun 26, 2012 at 10:20:12PM -0700, Marc MERLIN wrote: On Tue, Jun 26, 2012 at 06:38:18PM -0700, Marc MERLIN wrote: Now, I'm also seeing these below and I have this again (86% CPU): 6076 root 20 0 000 R 86 0.0 29:40.11 btrfs-delalloc- How bad is it, doctor? I think I'll be going back to 3.2.16 for now though. I reverted to 3.2.16 and haven't had further problems after dropping the current snapshot that was corrupted in various ways. Now, I'm not sure when I should upgrade anymore since I haven't heard of any fixes for what I saw. Assuming I go forward again, is there something else I could have provided to help debug? Mmmh, ok. I understand that this code comes with no guarantees, and I have backups, but I'm reporting a problem that lead to corruption (I had multiple files that were corrupted in my latest snapshot and I had to drop it and revert to an older snapshot and then out of fear for 3.4.4, went back to 3.2.16). I didn't see any problems with 3.2.16 (doesn't mean there weren't any, just that I didn't see any). Since my filesystem was a bit full, and that triggers problems with btrfs, I freed up 70GB gandalfthegreat:~# btrfs fi show Label: 'btrfs_pool1' uuid: 873d526c-e911-4234-af1b-239889cd143d Total devices 1 FS bytes used 163.01GB devid1 size 231.02GB used 231.02GB path /dev/dm-0 I rebooted with 3.4.4 and started copying data, and for now I've gotten this: kernel: [ 832.108558] btrfs no csum found for inode 3896855 start 0 kernel: [ 832.108873] btrfs csum failed ino 3896855 off 0 csum 1150320628 private 0 How bad is this? More generally, what was missing from my previous report (I gave all the sysrq I could output) that no one seemed to be able to use it? Thanks, Marc Back to 3.2.16, I'm now seeing this: [ 840.516733] INFO: task VirtualBox:6818 blocked for more than 120 seconds. [ 840.516735] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [ 840.516736] VirtualBox D 8801fd134080 0 6818 6758 0x0080 [ 840.516740] 8801fd134080 0086 0050 880202e7f100 [ 840.516744] 00013580 8801c6f0dfd8 8801c6f0dfd8 8801fd134080 [ 840.516748] 8801c6f0da68 8801c6f0da68 88020a4e22f0 88023bc13e08 [ 840.516752] Call Trace: [ 840.516755] [810b5c67] ? __lock_page+0x66/0x66 [ 840.516758] [8134aea4] ? io_schedule+0x58/0x6f [ 840.516761] [810b5c6d] ? sleep_on_page+0x6/0xa [ 840.516764] [8134b1e5] ? __wait_on_bit_lock+0x3c/0x85 [ 840.516767] [810b5c62] ? __lock_page+0x61/0x66 [ 840.516770] [81060051] ? autoremove_wake_function+0x2a/0x2a [ 840.516785] [a01838d7] ? extent_write_cache_pages.isra.13.constprop.22+0xf6/0x278 [btrfs] [ 840.516789] [810ec9cb] ? __cache_free.isra.40+0x19/0x1a7 [ 840.516792] [8134ed52] ? sub_preempt_count+0x83/0x94 [ 840.516795] [8134c2dd] ? _raw_spin_unlock+0x24/0x30 [ 840.516811] [a0183c4b] ? extent_writepages+0x40/0x57 [btrfs] [ 840.516826] [a0177f5f] ? __btrfs_buffered_write+0x2bb/0x2dc [btrfs] [ 840.516841] [a016e88a] ? uncompress_inline.isra.44+0x116/0x116 [btrfs] [ 840.516844] [810b6aaf] ? __filemap_fdatawrite_range+0x4b/0x50 [ 840.516847] [810b6ad9] ? filemap_write_and_wait_range+0x25/0x4d [ 840.516863] [a01782ce] ? btrfs_file_aio_write+0x34e/0x490 [btrfs] [ 840.516866] [8103e092] ? get_parent_ip+0x9/0x1b [ 840.516882] [a0177f80] ? __btrfs_buffered_write+0x2dc/0x2dc [btrfs] [ 840.516886] [8112f19c] ? aio_rw_vect_retry+0x70/0x18e [ 840.516888] [8112f12c] ? aio_fsync+0x22/0x22 [ 840.516891] [8112fbc7] ? aio_run_iocb+0x72/0x11c [ 840.516894] [81130d9a] ? do_io_submit+0x6a4/0x7f9 [ 840.516898] [813508d2] ? system_call_fastpath+0x16/0x1b [ 1187.553635] btrfs: unlinked 8 orphans [ 3810.200064] e1000e :00:19.0: BAR 0: set to [mem 0xfc00-0xfc01] (PCI address [0xfc00-0xfc01]) [ 3810.200071] e1000e :00:19.0: BAR 1: set to [mem 0xfc025000-0xfc025fff] (PCI address [0xfc025000-0xfc025fff]) [ 3810.200076] e1000e :00:19.0: BAR 2: set to [io 0x1840-0x185f] (PCI address [0x1840-0x185f]) [ 3810.200093] e1000e :00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) [ 3810.200115] e1000e :00:19.0: restoring config space at offset 0x1 (was 0x10, writing 0x100107) [ 3810.200147] e1000e :00:19.0: PME# disabled [ 3810.200224] e1000e :00:19.0: irq 45 for MSI/MSI-X [ 4671.144685] iwlwifi :03:00.0: Tx aggregation enabled on ra = 2c:b0:5d:3c:7d:f1 tid = 1 [ 4799.384107] btrfs: unlinked 8 orphans [ 8436.512513] btrfs:
Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On Tue, Jun 26, 2012 at 12:36:37PM -0700, Marc MERLIN wrote: I was fine with btrfs until 3.2.16, but when going to 3.4 or 3.4.4, I'm having my system randomly not wanting to suspend to RAM. My suspend light started flashing but the system wasn't suspending. I was able to unlock X, run top, saw that btrfs-delalloc-??? was taking 98% CPU, and any command that touched disk hung (so I could log in, but would not get a shell). I did the sysrq commands, which took a long time to work (they stacked up with nothing happening and maybe 10mn later, the system unhung itself and they ran, maybe too late, not sure). Interestingly the system did go to sleep and resumed ok after that, it just took 20mn to get there. It's all pasted below. gandalfthegreat:~# btrfs fi show Label: 'btrfs_pool1' uuid: 873d526c-e911-4234-af1b-239889cd143d Total devices 1 FS bytes used 214.44GB devid1 size 231.02GB used 231.02GB path /dev/dm-0 Btrfs Btrfs v0.19 gandalfthegreat:~# Now, I'm also seeing these below and I have this again (86% CPU): 6076 root 20 0 000 R 86 0.0 29:40.11 btrfs-delalloc- How bad is it, doctor? I think I'll be going back to 3.2.16 for now though. [100415.369301] BTRFS warning (device dm-0): Aborting unused transaction. [100415.739337] BTRFS warning (device dm-0): Aborting unused transaction. [100416.093839] BTRFS warning (device dm-0): Aborting unused transaction. [100416.433245] BTRFS warning (device dm-0): Aborting unused transaction. [100709.352445] usb 1-5: USB disconnect, device number 8 [100710.197602] thinkpad_acpi: EC reports that Thermal Table has changed [100710.199954] ACPI: \_SB_.GDCK - undocking [100711.358679] wlan0: deauthenticating from 00:24:6c:67:03:d1 by local choice (reason=3) [100713.044214] e1000e :00:19.0: irq 46 for MSI/MSI-X [101447.801190] btrfs no csum found for inode 3910588 start 5144576 [101447.802277] btrfs csum failed ino 3910588 off 5144576 csum 2096800889 private 0 [101447.957161] btrfs no csum found for inode 3910588 start 5996544 [101447.957990] btrfs csum failed ino 3910588 off 5996544 csum 2096800889 private 0 [101447.960459] btrfs no csum found for inode 3910588 start 6062080 [101447.961351] btrfs csum failed ino 3910588 off 6062080 csum 2096800889 private 0 [101447.962459] btrfs no csum found for inode 3910588 start 6225920 [101447.963543] btrfs csum failed ino 3910588 off 6225920 csum 2096800889 private 0 [101447.964369] btrfs no csum found for inode 3910588 start 6291456 [101447.965045] btrfs csum failed ino 3910588 off 6291456 csum 2096800889 private 0 [101447.966406] btrfs no csum found for inode 3910588 start 6356992 [101447.967392] btrfs csum failed ino 3910588 off 6356992 csum 2123070294 private 0 [101448.088912] btrfs no csum found for inode 3910588 start 6553600 [101448.088971] btrfs no csum found for inode 3910588 start 6586368 [101448.090058] btrfs csum failed ino 3910588 off 6586368 csum 2096800889 private 0 [101448.090305] btrfs csum failed ino 3910588 off 6553600 csum 4253301504 private 0 [101448.093231] btrfs no csum found for inode 3910588 start 6717440 [101448.094269] btrfs csum failed ino 3910588 off 6717440 csum 2096800889 private 0 [101448.096104] btrfs no csum found for inode 3910588 start 6848512 [101448.097002] btrfs csum failed ino 3910588 off 6848512 csum 3282939717 private 0 [101448.098249] btrfs no csum found for inode 3910588 start 6946816 [101448.101561] btrfs no csum found for inode 3910588 start 7045120 [101448.106352] btrfs no csum found for inode 3910588 start 7176192 [101448.108224] btrfs no csum found for inode 3910588 start 7241728 [101448.33] btrfs no csum found for inode 3910588 start 7307264 [101448.127081] btrfs no csum found for inode 3910588 start 7569408 [101448.144351] btrfs no csum found for inode 3910588 start 7864320 [101448.155857] btrfs no csum found for inode 3910588 start 8060928 [101448.165868] btrfs no csum found for inode 3910588 start 8257536 [101448.175275] btrfs no csum found for inode 3910588 start 8454144 [101448.176087] btrfs no csum found for inode 3910588 start 8552448 [101448.180927] btrfs no csum found for inode 3910588 start 8847360 [101448.183489] btrfs no csum found for inode 3910588 start 8978432 [101448.192909] btrfs no csum found for inode 3910588 start 9371648 [101448.198959] btrfs no csum found for inode 3910588 start 9568256 [101448.413716] btrfs no csum found for inode 3910588 start 5439488 [101448.413996] btrfs no csum found for inode 3910588 start 5537792 [101448.414379] btrfs no csum found for inode 3910588 start 5603328 [101448.416386] btrfs no csum found for inode 3910588 start 5734400 [101448.424108] btrfs no csum found for inode 3910588 start 5799936 [101448.721649] btrfs no csum found for inode 3910588 start 9699328 [101448.737377] btrfs no csum found for inode 3910588 start 9863168 [101448.743929] btrfs no csum found for inode 3910588 start 10027008 [101448.753186] btrfs no csum found for inode
Re: Long btrfs hangs during suspend to RAM / BTRFS warning (device dm-0): Aborting unused transaction
On Tue, Jun 26, 2012 at 06:38:18PM -0700, Marc MERLIN wrote: Now, I'm also seeing these below and I have this again (86% CPU): 6076 root 20 0 000 R 86 0.0 29:40.11 btrfs-delalloc- How bad is it, doctor? I think I'll be going back to 3.2.16 for now though. Back to 3.2.16, I'm now seeing this: [ 840.516733] INFO: task VirtualBox:6818 blocked for more than 120 seconds. [ 840.516735] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [ 840.516736] VirtualBox D 8801fd134080 0 6818 6758 0x0080 [ 840.516740] 8801fd134080 0086 0050 880202e7f100 [ 840.516744] 00013580 8801c6f0dfd8 8801c6f0dfd8 8801fd134080 [ 840.516748] 8801c6f0da68 8801c6f0da68 88020a4e22f0 88023bc13e08 [ 840.516752] Call Trace: [ 840.516755] [810b5c67] ? __lock_page+0x66/0x66 [ 840.516758] [8134aea4] ? io_schedule+0x58/0x6f [ 840.516761] [810b5c6d] ? sleep_on_page+0x6/0xa [ 840.516764] [8134b1e5] ? __wait_on_bit_lock+0x3c/0x85 [ 840.516767] [810b5c62] ? __lock_page+0x61/0x66 [ 840.516770] [81060051] ? autoremove_wake_function+0x2a/0x2a [ 840.516785] [a01838d7] ? extent_write_cache_pages.isra.13.constprop.22+0xf6/0x278 [btrfs] [ 840.516789] [810ec9cb] ? __cache_free.isra.40+0x19/0x1a7 [ 840.516792] [8134ed52] ? sub_preempt_count+0x83/0x94 [ 840.516795] [8134c2dd] ? _raw_spin_unlock+0x24/0x30 [ 840.516811] [a0183c4b] ? extent_writepages+0x40/0x57 [btrfs] [ 840.516826] [a0177f5f] ? __btrfs_buffered_write+0x2bb/0x2dc [btrfs] [ 840.516841] [a016e88a] ? uncompress_inline.isra.44+0x116/0x116 [btrfs] [ 840.516844] [810b6aaf] ? __filemap_fdatawrite_range+0x4b/0x50 [ 840.516847] [810b6ad9] ? filemap_write_and_wait_range+0x25/0x4d [ 840.516863] [a01782ce] ? btrfs_file_aio_write+0x34e/0x490 [btrfs] [ 840.516866] [8103e092] ? get_parent_ip+0x9/0x1b [ 840.516882] [a0177f80] ? __btrfs_buffered_write+0x2dc/0x2dc [btrfs] [ 840.516886] [8112f19c] ? aio_rw_vect_retry+0x70/0x18e [ 840.516888] [8112f12c] ? aio_fsync+0x22/0x22 [ 840.516891] [8112fbc7] ? aio_run_iocb+0x72/0x11c [ 840.516894] [81130d9a] ? do_io_submit+0x6a4/0x7f9 [ 840.516898] [813508d2] ? system_call_fastpath+0x16/0x1b [ 1187.553635] btrfs: unlinked 8 orphans [ 3810.200064] e1000e :00:19.0: BAR 0: set to [mem 0xfc00-0xfc01] (PCI address [0xfc00-0xfc01]) [ 3810.200071] e1000e :00:19.0: BAR 1: set to [mem 0xfc025000-0xfc025fff] (PCI address [0xfc025000-0xfc025fff]) [ 3810.200076] e1000e :00:19.0: BAR 2: set to [io 0x1840-0x185f] (PCI address [0x1840-0x185f]) [ 3810.200093] e1000e :00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) [ 3810.200115] e1000e :00:19.0: restoring config space at offset 0x1 (was 0x10, writing 0x100107) [ 3810.200147] e1000e :00:19.0: PME# disabled [ 3810.200224] e1000e :00:19.0: irq 45 for MSI/MSI-X [ 4671.144685] iwlwifi :03:00.0: Tx aggregation enabled on ra = 2c:b0:5d:3c:7d:f1 tid = 1 [ 4799.384107] btrfs: unlinked 8 orphans [ 8436.512513] btrfs: unlinked 7 orphans [11350.749850] btrfs no csum found for inode 3909426 start 0 [11350.750697] btrfs csum failed ino 3909426 off 0 csum 1419704114 private 0 [11652.088805] btrfs no csum found for inode 3910848 start 0 [11652.089524] btrfs csum failed ino 3910848 off 0 csum 3145117582 private 0 My firefox and chrome profiles were corrupted, so I had to restore them from an old snapshot. I can't prove it, but it looks like my corruption happened right at the same time than I rebooted to 3.4.4. Marc -- A mouse is a device used to point at the xterm you want to type in - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html