I really need your help, because this is the second time in a couple of
days that btrfs has eaten my data, and I can't use my laptop until I find
the culprit.
This was the mail I sent a couple of days ago:
https://www.spinics.net/lists/linux-btrfs/msg54754.html
I previously thought the culprit was a bug in kernel 4.6-rc, but I was
wrong.
Then I reinstalled the whole system (Arch Linux) from scratch, and after
just two days I lost some of my data again. Once again btrfs check
--repair got stuck in an infinite loop and I can't repair my fs. The system
has always been shut down properly, except for a single time when I had to
force a power-off just after boot because there was no signal on the
screen.
First the obvious things:
- memory is ok
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1VnJ0SE9fT1FZMTg)
- disk is ok
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1NGRhd2daVDRJVGc)
- tlp has SATA_LINKPWR_ON_BAT=max_performance
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1dFAwUE5ETVpNWGM)
- rootfs mount options:
rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@
- Command line: BOOT_IMAGE=/@/boot/vmlinuz-linux
root=UUID=4fc2278e-f6e8-4a21-8876-cabbf885bb2e rw rootflags=subvol=@
cryptdevice=/dev/disk/by-uuid/c7c8f501-507c-4bd2-a80a-8c7360651f02:cryptroot:allow-discards
quiet
- scrub didn't find any error:
$ sudo btrfs scrub status /
scrub status for 4fc2278e-f6e8-4a21-8876-cabbf885bb2e
scrub started at Thu May 5 00:57:30 2016 and finished after 00:00:45
total bytes scrubbed: 22.26GiB with 0 errors
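
Since discard comes up as a suspect, here is a minimal sketch that checks whether discard is enabled at both layers, using only the mount options and the cryptdevice= parameter listed above. The helper name parse_mount_options is made up for illustration, and the device:dmname:options layout of cryptdevice= is assumed from the Arch encrypt hook convention.

```python
def parse_mount_options(options: str) -> dict:
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "discard") map to True.
    """
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Values copied from the list above.
ROOTFS_OPTIONS = (
    "rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,"
    "subvolid=257,subvol=/@"
)
CRYPTDEVICE = (
    "/dev/disk/by-uuid/c7c8f501-507c-4bd2-a80a-8c7360651f02"
    ":cryptroot:allow-discards"
)

opts = parse_mount_options(ROOTFS_OPTIONS)
fs_discard = opts.get("discard", False) is True

# Assumed layout: device:dmname:options (options are comma-separated).
crypt_options = CRYPTDEVICE.split(":")[2].split(",")
crypt_discard = "allow-discards" in crypt_options

print("btrfs online discard:", fs_discard)
print("dm-crypt passes discards:", crypt_discard)
```

With the values above both checks come out true, so discards issued by btrfs are passed through dm-crypt to the SSD. That only confirms the configuration; it says nothing about whether discard is actually at fault.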
I have the whole rootfs encrypted, including boot. I followed these steps:
https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Btrfs_subvolumes_with_swap
Disk is a SAMSUNG SSD PM851 M.2 2280 256GB (Firmware Version: EXT25D0Q).
Laptop is a Dell XPS 13 9343 QHD+.
Distro is Arch Linux, kernel version is 4.5.1. btrfs-progs is 4.5.2.
Two days after the previous data loss I finished reinstalling my
distro from scratch, then I decided to do a full backup from a snapshot
using tar. This is what I got while trying to back up my data:
tar: usr/share/kig/icons/hicolor/32x32/actions/test.png: Read error at byte 0, while reading 810 bytes: Input/output error
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebpd.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointOnLine.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/bezierN.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/convexhull.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/centerofcurvature.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/en.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebps.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/directrix.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/beziercurves.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/segment_midpoint.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/distance.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebcl.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicb5p.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_polygon.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicasymptotes.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointxy.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/attacher.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/coniclineintersection.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectorsum.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/rbezier4.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/ellipsebffp.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/angle.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_text.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectordifference.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/segmentaxis.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/radicalline.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/polygonsides.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/projection.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/inversion.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/bezier4.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/equilateralhyperbolab4p.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/areaCircle.png: Cannot stat: Stale file handle
tar: var/lib/samba/private/msg.sock/666: socket ignored
tar: Exiting with failure status due to previous errors
[ 3057.008185] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.008195] BTRFS error (device dm-0): error loading props for ino
183988 (root 505): -5
[ 3057.008417] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.008631] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.009165] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.009389] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.009734] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.009960] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.010664] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.010888] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3057.011201] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3331.795474] verify_parent_transid: 57 callbacks suppressed
[ 3331.795480] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
[ 3331.795776] BTRFS error (device dm-0): parent transid verify failed on
528089088 wanted 3458764513820541211 found 283
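
The "wanted" and "found" values in these messages can be compared bit by bit. The following is a minimal sketch, assuming the message format shown above (with the wrapped line joined back into one); the function name differing_bits and the regular expression are illustrative and not part of any btrfs tool.

```python
import re

# One kernel message from the log above, with the line wrap removed.
LINE = (
    "[ 3057.008185] BTRFS error (device dm-0): parent transid verify "
    "failed on 528089088 wanted 3458764513820541211 found 283"
)

PATTERN = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def differing_bits(line: str):
    """Return (wanted, found, xor, bit positions where the two differ)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a parent transid message")
    _bytenr, wanted, found = (int(group) for group in match.groups())
    diff = wanted ^ found
    bits = [i for i in range(64) if (diff >> i) & 1]
    return wanted, found, diff, bits


wanted, found, diff, bits = differing_bits(LINE)
print(hex(wanted))  # 0x300000000000011b
print(hex(found))   # 0x11b
print(hex(diff))    # 0x3000000000000000
print(bits)         # [60, 61]
```

The low bits of the two values are identical (0x11b, i.e. 283); they differ only in bits 60 and 61. This does not identify the cause by itself, but it may be a useful detail for whoever examines the image.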
I made a copy of /dev/mapper/cryptroot with dd on an external drive and I
ran btrfs check on it (btrfs-progs 4.5.2):
https://drive.google.com/open?id=0Bwe9Wtc-5xF1SjJacXpMMU5mems (37MB)
Then I tried to run btrfs check --repair on it, but once again it got stuck
in an infinite loop like this one
(https://www.spinics.net/lists/linux-btrfs/msg54146.html), and after an hour
of looping and several hundred MB of logs I had to kill it. Here is
the log, truncated to 30MB:
https://drive.google.com/open?id=0Bwe9Wtc-5xF1SmRuVUlfeGRES3M
These are probably not needed, but here is the output of snapper -c @ list:
https://drive.google.com/open?id=0Bwe9Wtc-5xF1N0llOFpfVXVwNVk
and btrfs subvolume list -p /:
https://drive.google.com/open?id=0Bwe9Wtc-5xF1andCdWZzeV9VbDg
This is the link to the whole gdrive directory with all the logs:
https://drive.google.com/open?id=0Bwe9Wtc-5xF1UFltcXhtRmt4YjA
I really don't know what the problem may be. Maybe discard? I can't imagine
switching back to ext4 and losing snapshots, transactions,
compression, incremental send/receive backups etc.
I would really love to be able to do something to fix it, but I don't have
the slightest idea what the problem is. Hopefully someone here will be
smarter than me and find it, otherwise I will have to switch to
ext4 because I need my laptop to work.
Thanks,
Niccolò