Re: dmesg: csum "varying number" expected csum 0x0 mirror 1 (with trace/oops)

2017-09-26 Thread Liu Bo
On Wed, Sep 27, 2017 at 12:21:55AM +0200, Kai Krakow wrote:
> Hello!
> 
> I came across noting some kernel messages which seem to be related to
> btrfs, should I worry?
> 
> I'm currently running scrub on the device now.
> 
> inode-resolve points to an unimportant, easily recoverable file:
> 
> $ sudo btrfs inspect-internal inode-resolve -v 1229624 
> /mnt/btrfs-pool/gentoo/usr/portage/
> ioctl ret=0, bytes_left=4033, bytes_missing=0, cnt=1, missed=0
> /mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak
> 
> The file wasn't modified in month and it has been in order previously
> because the backup didn't choke on it:
> 
> $ ls -alt
> -rw-r--r-- 1 root root 11241835 19. Apr 22:11 
> /mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak
> 
> I can read the file without problems, no new messages in dmesg:
> 
> $ cat 
> /mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak 
> >/dev/null
> 
> What's strange is the long time gaps between the btrfs reports and the
> kernel backtraces... Also, expected csum=0 looks strange.
> 
> That mount is subvol_id=0. Not sure if inode-resolve works that way. I
> retried for the important subvolumes and it resolved for none.
> 
> Just in case, I have a backlog of multiple daily backups.
> 
> $ uname -a
> Linux jupiter 4.12.14-ck #1 SMP PREEMPT Fri Sep 22 02:47:44 CEST 2017
> x86_64 Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz GenuineIntel GNU/Linux
> 
> 
> [88597.792462] BTRFS info (device bcache48): no csum found for inode 1229624 
> start 1650688
> [88597.793304] BTRFS warning (device bcache48): csum failed root 280 ino 
> 1229624 off 1650688 csum 0x953c1e92 expected csum 0x mirror 1
> [128569.451376] [ cut here ]
> [128569.451382] WARNING: CPU: 7 PID: 146 at kernel/workqueue.c:2041 
> process_one_work+0x44/0x310
> [128569.451383] Modules linked in: cifs ccm fuse bridge stp llc veth rfcomm 
> bnep cachefiles btusb af_packet btintel iTCO_wdt bluetooth tun 
> iTCO_vendor_support rfkill ecdh_generic kvm_intel snd_hda_codec_hdmi 
> snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_intel snd_hda_codec 
> snd_hda_core rtc_cmos snd_pcm snd_timer snd soundcore lpc_ich irqbypass uas 
> usb_storage r8168(O) nvidia_drm(PO) vboxpci(O) vboxnetadp(O) vboxnetflt(O) 
> vboxdrv(O) nvidia_modeset(PO) nvidia(PO) nct6775 hwmon_vid coretemp hwmon 
> efivarfs
> [128569.451406] CPU: 7 PID: 146 Comm: bcache Tainted: P   O
> 4.12.14-ck #1
> [128569.451407] Hardware name: To Be Filled By O.E.M. To Be Filled By 
> O.E.M./Z68 Pro3, BIOS L2.16A 02/22/2013
> [128569.451410] task: 880419bf8bc0 task.stack: c95c4000
> [128569.451412] RIP: 0010:process_one_work+0x44/0x310
> [128569.451412] RSP: :c95c7e78 EFLAGS: 00013002
> [128569.451413] RAX: 0007 RBX: 880419bdcf00 RCX: 
> 88042f2d8020
> [128569.451414] RDX: 88042f2d8018 RSI: 880120454f08 RDI: 
> 880419bdcf00
> [128569.451415] RBP: 88042f2d8000 R08:  R09: 
> 
> [128569.451415] R10: 8a209a609e79 R11:  R12: 
> 
> [128569.451416] R13: 88042f2e1b00 R14: 88042f2e1b80 R15: 
> 880419bdcf30
> [128569.451417] FS:  () GS:88042f3c() 
> knlGS:
> [128569.451418] CS:  0010 DS:  ES:  CR0: 80050033
> [128569.451418] CR2: 00b0 CR3: 0003f23ef000 CR4: 
> 001406e0
> [128569.451419] Call Trace:
> [128569.451422]  ? rescuer_thread+0x20b/0x370
> [128569.451424]  ? kthread+0xf2/0x130
> [128569.451425]  ? process_one_work+0x310/0x310
> [128569.451426]  ? kthread_create_on_node+0x40/0x40
> [128569.451428]  ? ret_from_fork+0x22/0x30
> [128569.451429] Code: 04 b8 00 00 00 00 4c 0f 44 e8 49 8b 45 08 44 8b a0 00 
> 01 00 00 41 83 e4 20 f6 45 10 04 75 0e 65 8b 05 79 38 f7 7e 3b 45 04 74 02 
> <0f> ff 48 ba eb 83 b5 80 46 86 c8 61 48 0f af d6 48 c1 ea 3a 48 
> [128569.451449] ---[ end trace d00a1585e5166d18 ]---
> [148997.934146] BUG: unable to handle kernel paging request at 
> c96af000
> [148997.934154] IP: memcpy_erms+0x6/0x10
> [148997.934155] PGD 41d021067
> [148997.934156] P4D 41d021067
> [148997.934157] PUD 41d022067
> [148997.934158] PMD 41961c067
> [148997.934158] PTE 0
> [148997.934162] Oops: 0002 [#1] PREEMPT SMP
> [148997.934163] Modules linked in: cifs ccm fuse bridge stp llc veth rfcomm 
> bnep cachefiles btusb af_packet btintel iTCO_wdt bluetooth tun 
> iTCO_vendor_support rfkill ecdh_generic kvm_intel snd_hda_codec_hdmi 
> snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_intel snd_hda_codec 
> snd_hda_core rtc_cmos snd_pcm snd_timer snd soundcore lpc_ich irqbypass uas 
> usb_storage r8168(O) nvidia_drm(PO) vboxpci(O) vboxnetadp(O) vboxnetflt(O) 
> vboxdrv(O) nvidia_modeset(PO) nvidia(PO) nct6775 hwmon_vid coretemp hwmon 
> efivarfs
> [148997.934188] CPU: 6 PID: 966 Comm: kworker/u16:16 Tainted: PW  O   
>  4.12.14-ck 

dmesg: csum "varying number" expected csum 0x0 mirror 1 (with trace/oops)

2017-09-26 Thread Kai Krakow
Hello!

I came across noting some kernel messages which seem to be related to
btrfs, should I worry?

I'm currently running scrub on the device now.

inode-resolve points to an unimportant, easily recoverable file:

$ sudo btrfs inspect-internal inode-resolve -v 1229624 
/mnt/btrfs-pool/gentoo/usr/portage/
ioctl ret=0, bytes_left=4033, bytes_missing=0, cnt=1, missed=0
/mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak

The file wasn't modified in month and it has been in order previously
because the backup didn't choke on it:

$ ls -alt
-rw-r--r-- 1 root root 11241835 19. Apr 22:11 
/mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak

I can read the file without problems, no new messages in dmesg:

$ cat 
/mnt/btrfs-pool/gentoo/usr/portage//packages/dev-lang/tcl/tcl-8.6.6-2.xpak 
>/dev/null

What's strange is the long time gaps between the btrfs reports and the
kernel backtraces... Also, expected csum=0 looks strange.

That mount is subvol_id=0. Not sure if inode-resolve works that way. I
retried for the important subvolumes and it resolved for none.

Just in case, I have a backlog of multiple daily backups.

$ uname -a
Linux jupiter 4.12.14-ck #1 SMP PREEMPT Fri Sep 22 02:47:44 CEST 2017
x86_64 Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz GenuineIntel GNU/Linux


[88597.792462] BTRFS info (device bcache48): no csum found for inode 1229624 
start 1650688
[88597.793304] BTRFS warning (device bcache48): csum failed root 280 ino 
1229624 off 1650688 csum 0x953c1e92 expected csum 0x mirror 1
[128569.451376] [ cut here ]
[128569.451382] WARNING: CPU: 7 PID: 146 at kernel/workqueue.c:2041 
process_one_work+0x44/0x310
[128569.451383] Modules linked in: cifs ccm fuse bridge stp llc veth rfcomm 
bnep cachefiles btusb af_packet btintel iTCO_wdt bluetooth tun 
iTCO_vendor_support rfkill ecdh_generic kvm_intel snd_hda_codec_hdmi 
snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_intel snd_hda_codec 
snd_hda_core rtc_cmos snd_pcm snd_timer snd soundcore lpc_ich irqbypass uas 
usb_storage r8168(O) nvidia_drm(PO) vboxpci(O) vboxnetadp(O) vboxnetflt(O) 
vboxdrv(O) nvidia_modeset(PO) nvidia(PO) nct6775 hwmon_vid coretemp hwmon 
efivarfs
[128569.451406] CPU: 7 PID: 146 Comm: bcache Tainted: P   O
4.12.14-ck #1
[128569.451407] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./Z68 Pro3, BIOS L2.16A 02/22/2013
[128569.451410] task: 880419bf8bc0 task.stack: c95c4000
[128569.451412] RIP: 0010:process_one_work+0x44/0x310
[128569.451412] RSP: :c95c7e78 EFLAGS: 00013002
[128569.451413] RAX: 0007 RBX: 880419bdcf00 RCX: 
88042f2d8020
[128569.451414] RDX: 88042f2d8018 RSI: 880120454f08 RDI: 
880419bdcf00
[128569.451415] RBP: 88042f2d8000 R08:  R09: 

[128569.451415] R10: 8a209a609e79 R11:  R12: 

[128569.451416] R13: 88042f2e1b00 R14: 88042f2e1b80 R15: 
880419bdcf30
[128569.451417] FS:  () GS:88042f3c() 
knlGS:
[128569.451418] CS:  0010 DS:  ES:  CR0: 80050033
[128569.451418] CR2: 00b0 CR3: 0003f23ef000 CR4: 
001406e0
[128569.451419] Call Trace:
[128569.451422]  ? rescuer_thread+0x20b/0x370
[128569.451424]  ? kthread+0xf2/0x130
[128569.451425]  ? process_one_work+0x310/0x310
[128569.451426]  ? kthread_create_on_node+0x40/0x40
[128569.451428]  ? ret_from_fork+0x22/0x30
[128569.451429] Code: 04 b8 00 00 00 00 4c 0f 44 e8 49 8b 45 08 44 8b a0 00 01 
00 00 41 83 e4 20 f6 45 10 04 75 0e 65 8b 05 79 38 f7 7e 3b 45 04 74 02 <0f> ff 
48 ba eb 83 b5 80 46 86 c8 61 48 0f af d6 48 c1 ea 3a 48 
[128569.451449] ---[ end trace d00a1585e5166d18 ]---
[148997.934146] BUG: unable to handle kernel paging request at c96af000
[148997.934154] IP: memcpy_erms+0x6/0x10
[148997.934155] PGD 41d021067
[148997.934156] P4D 41d021067
[148997.934157] PUD 41d022067
[148997.934158] PMD 41961c067
[148997.934158] PTE 0
[148997.934162] Oops: 0002 [#1] PREEMPT SMP
[148997.934163] Modules linked in: cifs ccm fuse bridge stp llc veth rfcomm 
bnep cachefiles btusb af_packet btintel iTCO_wdt bluetooth tun 
iTCO_vendor_support rfkill ecdh_generic kvm_intel snd_hda_codec_hdmi 
snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_intel snd_hda_codec 
snd_hda_core rtc_cmos snd_pcm snd_timer snd soundcore lpc_ich irqbypass uas 
usb_storage r8168(O) nvidia_drm(PO) vboxpci(O) vboxnetadp(O) vboxnetflt(O) 
vboxdrv(O) nvidia_modeset(PO) nvidia(PO) nct6775 hwmon_vid coretemp hwmon 
efivarfs
[148997.934188] CPU: 6 PID: 966 Comm: kworker/u16:16 Tainted: PW  O
4.12.14-ck #1
[148997.934190] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./Z68 Pro3, BIOS L2.16A 02/22/2013
[148997.934193] Workqueue: btrfs-endio btrfs_endio_helper
[148997.934194] task: 88001496a340 task.stack: c90011b68000
[148997.934196] RIP: