Re[4]: btrfs check "Couldn't open file system" after error in transaction.c
Hello,

here the output of btrfsck:

Checking filesystem on /dev/sdd
UUID: a8af3832-48c7-4568-861f-e80380dd7e0b
checking extents
checking free space cache
checking fs root
checking csums
checking root refs
checking quota groups
Ignoring qgroup relation key 24544
Ignoring qgroup relation key 24610
Ignoring qgroup relation key 24611
Ignoring qgroup relation key 25933
Ignoring qgroup relation key 25934
Ignoring qgroup relation key 25935
Ignoring qgroup relation key 25936
Ignoring qgroup relation key 25937
Ignoring qgroup relation key 25938
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25941
Ignoring qgroup relation key 25942
Ignoring qgroup relation key 25958
Ignoring qgroup relation key 25959
Ignoring qgroup relation key 25960
Ignoring qgroup relation key 25961
Ignoring qgroup relation key 25962
Ignoring qgroup relation key 25963
Ignoring qgroup relation key 25964
Ignoring qgroup relation key 25965
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25968
Ignoring qgroup relation key 25970
Ignoring qgroup relation key 25971
Ignoring qgroup relation key 25972
Ignoring qgroup relation key 25975
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 567172078071971871
Ignoring qgroup relation key 567172078071971872
Ignoring qgroup relation key 567172078071971882
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Qgroup is already inconsistent before checking
Counts for qgroup id: 3102 are different
our:  referenced 174829252608 referenced compressed 174829252608
disk: referenced 174829252608 referenced compressed 174829252608
our:  exclusive 2899968 exclusive compressed 2899968
disk: exclusive 2916352 exclusive compressed 2916352
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25977 are different
our:  referenced 47249391616 referenced compressed 47249391616
disk: referenced 47249391616 referenced compressed 47249391616
our:  exclusive 90222592 exclusive compressed 90222592
disk: exclusive 90238976 exclusive compressed 90238976
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25978 are different
our:  referenced 174829252608 referenced compressed 174829252608
disk: referenced 174829252608 referenced compressed 174829252608
our:  exclusive 1064960 exclusive compressed 1064960
disk: exclusive 1081344 exclusive compressed 1081344
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 26162 are different
our:  referenced 65940500480 referenced compressed 65940500480
disk: referenced 65866997760 referenced compressed 65866997760
diff: referenced 73502720 referenced compressed 73502720
our:  exclusive 3991326720 exclusive compressed 3991326720
disk: exclusive 3960582144 exclusive compressed 3960582144
diff: exclusive 30744576 exclusive compressed 30744576
found 8423479726080 bytes used err is 1
total csum bytes: 8206766844
total tree bytes: 17669144576
total fs tree bytes: 7271251968
total extent tree bytes: 683851776
btree space waste bytes: 2859469730
file data blocks allocated: 16171232772096
 referenced 13512171663360

What does that tell us?

Greetings,
Hendrik

-- Original message --
From: "Hendrik Fr
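One pattern worth noting in the check output: for qgroups 3102, 25977 and 25978 the "exclusive" diff is -16384 bytes each time, which works out to exactly one metadata node, assuming the default 16 KiB btrfs nodesize. A quick sanity check of that arithmetic, with the byte values copied from the output above:

```shell
# Each qgroup's "exclusive" discrepancy above is exactly one metadata
# node, assuming the default 16 KiB nodesize.
# Pairs are "our:exclusive disk:exclusive" for qgroups 3102, 25977, 25978.
NODESIZE=16384
for pair in "2899968 2916352" "90222592 90238976" "1064960 1081344"; do
    set -- $pair
    echo "diff=$(( $1 - $2 )) bytes = $(( ($1 - $2) / NODESIZE )) node(s)"
done
# prints "diff=-16384 bytes = -1 node(s)" three times
```

A per-qgroup drift of one tree block is consistent with the qgroup counters having missed a single metadata block somewhere, rather than with large-scale data corruption.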
Re[3]: btrfs check "Couldn't open file system" after error in transaction.c
Hello again,

before overwriting the filesystem, some last questions:

> Maybe take advantage of the fact it does read only and recreate it.
> You could take a btrfs-image and btrfs-debug-tree first,

And what do I do with it?

> because there's some bug somewhere: somehow it became inconsistent,
> and can't be fixed at mount time or even with btrfs check.

Ok, so is there any way to help you find this bug? Anything I can do here?

Coming back to my objectives:
- Understand the reason behind the issue and prevent it in future.
  Finding the bug would help with the above.
- If it is not possible to repair the filesystem: understand whether the data that I read from the drive is valid or corrupted.

Can you answer this? As mentioned, I do have a backup, a month old. The data does not change so regularly, so most should be ok. Now I have two sources of data: the backup and the current degraded filesystem. If data differs, which one do I take? Is it safe to use the more recent one from the degraded filesystem?

And can you help me on these points?

FYI, I did run

btrfsck --init-csum-tree /dev/sdd
btrfs rescue zero-log (btrfs-zero-log)
btrfsck /dev/sdd

now. The last command is still running. It seems to be working; is there a way to be sure that the data is all ok again?

Regards,
Hendrik

-- 
Chris Murphy

---
This e-mail was checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
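On the "which copy do I take" question, a cheap first step is to enumerate the files that actually differ between the month-old backup and the read-only mounted degraded filesystem; everything identical on both sides is a non-issue. A minimal sketch (the helper name and the mount points in the usage comment are made up for illustration):

```shell
# compare_trees: report files whose contents differ between two trees,
# plus files that exist on only one side (diff -rq covers both cases).
compare_trees() {
    diff -rq "$1" "$2"
}

# usage (hypothetical mount points):
#   compare_trees /mnt/backup /mnt/degraded
```

It may then make sense to cross-check the differing files against the "no csum found for inode" messages in dmesg before preferring the newer copy.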
Re[2]: btrfs check "Couldn't open file system" after error in transaction.c
Hi Chris,

thanks for your reply - especially on a Sunday.

>> I have a filesystem (three disks with no raid)
> So it's data single *and* metadata single?

No:
Data, single: total=8.14TiB, used=7.64TiB
System, RAID1: total=32.00MiB, used=912.00KiB
Metadata, RAID1: total=18.00GiB, used=16.45GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

>> btrfs check will lead to "Couldn't open file system"
> Try all the most recent btrfs-progs to see if it's any different:
> 4.5.3, 4.6.1, 4.7 (or 4.7.1 if you can get it, it's days old).

Ok, I will try.

>> [ 98.534830] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
> That's pretty weird. It wants a LOWER generation number than what it
> found? By quite a bit. It's nearly 1500 generations different. I don't
> know what can cause this kind of confusion or how to fix it.

Ok, time to get the data off it (I do have backups, but of course some weeks old).

>> Answering this question: -If possible repair the filesystem
> NO
> Maybe take advantage of the fact it does read only and recreate it.
> You could take a btrfs-image and btrfs-debug-tree first,

And what do I do with it?

> because there's some bug somewhere: somehow it became inconsistent,
> and can't be fixed at mount time or even with btrfs check.

Ok, so is there any way to help you find this bug?

Coming back to my objectives:
- Understand the reason behind the issue and prevent it in future.
  Finding the bug would help with the above.
- If it is not possible to repair the filesystem: understand whether the data that I read from the drive is valid or corrupted.

Can you answer this? As mentioned, I do have a backup, a month old. The data does not change so regularly, so most should be ok. Now I have two sources of data: the backup and the current degraded filesystem. If data differs, which one do I take? Is it safe to use the more recent one from the degraded filesystem?
Greetings,
Hendrik

-- 
Chris Murphy
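The btrfs-image / btrfs-debug-tree suggestion above can be scripted; as a cautious sketch, the helper below only *prints* the commands so they can be reviewed before anything touches the disk (the device and output paths are placeholders; `-c9` is btrfs-image's maximum compression level and `-t4` uses four threads). btrfs-image dumps metadata only, no file contents, which is why it is safe to hand to developers.

```shell
# Sketch: print the metadata-capture commands suggested in the thread,
# for review before running them against the (unmounted) device.
# Device and output directory are placeholders.
capture_cmds() {
    local dev=$1 outdir=$2
    echo "btrfs-image -c9 -t4 $dev $outdir/metadata.img"
    echo "btrfs-debug-tree $dev > $outdir/debug-tree.txt"
}

capture_cmds /dev/sdd /tmp
```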
Re: btrfs check "Couldn't open file system" after error in transaction.c
Hello,

some more info: the system is Debian jessie with kernel 4.6.0 and btrfs-tools 4.6.

I did go through the recovery steps from the wiki:

- btrfs scrub to detect issues on live filesystems:
  see my original mail; it is aborted immediately.
- look at btrfs detected errors in syslog (look at Marc's blog above on how to use sec.pl to do this):
  see my original mail.
- mount -o ro,recovery to mount a filesystem with issues:
  does work, but I get many errors like this one:
  [ 325.360115] BTRFS info (device sdd): no csum found for inode 1703 start 2072977408
- btrfs-zero-log might help in specific cases. Go read Btrfs-zero-log:
  I would like to get your ok and instructions on this first.
- btrfs restore will help you copy data off a broken btrfs filesystem. See its page: Restore:
  see above; recovering the data does seem to work with ro,recovery.
  By the way: can I somehow be sure that the data is correct when I read it this way, despite the "no csum found for inode"?
- btrfs check --repair, aka btrfsck, is your last option if the ones above have not worked:
  does not work: "Couldn't open file system"

I also went through Marc's page:
http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
but without further hints.

So I now have these objectives:
- If possible, repair the filesystem.
- Understand the reason behind the issue and prevent it in future.
- If it is not possible to repair the filesystem: understand whether the data that I read from the drive is valid or corrupted.

I'd appreciate your help on this.

Greetings,
Hendrik

-- Original message --
From: "Hendrik Friedel" <hend...@friedels.name>
To: "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Sent: 28.08.2016 12:04:18
Subject: btrfs check "Couldn't open file system" after error in transaction.c

Hello, I have a filesystem (three disks with no raid) that I can still mount ro, but I cannot check or scrub it.
In dmesg I see:

[So Aug 28 11:33:22 2016] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[So Aug 28 11:33:22 2016] BTRFS warning (device sde): Skipping commit of aborted transaction.

(more complete at the end of this mail)

What I did up to now in order to recover:
- mount ro,recovery (http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html): that works.
- btrfs check will lead to "Couldn't open file system"

root@homeserver:~# btrfs scrub start /mnt/test
scrub started on /mnt/test, fsid a8af3832-48c7-4568-861f-e80380dd7e0b (pid=18953)
root@homeserver:~# btrfs scrub status /mnt/test
scrub status for a8af3832-48c7-4568-861f-e80380dd7e0b
        scrub started at Sun Aug 28 12:02:46 2016 and was aborted after 00:00:00
        total bytes scrubbed: 0.00B with 0 errors

First thing to do now is probably to check the backups. But is there a way to repair this filesystem? Besides this: what could be the reason for this error? Scrubs were regular and good. There was no power loss, and smartctl also looks fine on all three drives.

Greetings,
Hendrik

root@homeserver:~# btrfs fi show
Label: 'BigStorage'  uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
        Total devices 3 FS bytes used 7.66TiB
        devid    1 size 2.73TiB used 2.72TiB path /dev/sde
        devid    2 size 2.73TiB used 2.72TiB path /dev/sdc
        devid    3 size 2.73TiB used 2.73TiB path /dev/sdd

[ 98.534830] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534866] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534891] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534920] BTRFS warning (device sde): Skipping commit of aborted transaction.
[ 98.534921] ------------[ cut here ]------------
[ 98.534939] WARNING: CPU: 1 PID: 3643 at /home/zumbi/linux-4.6.4/fs/btrfs/transaction.c:1771 cleanup_transaction+0x96/0x300 [btrfs]
[ 98.534940] BTRFS: Transaction aborted (error -5)
[ 98.534940] Modules linked in: xt_nat(E) xt_tcpudp(E) veth(E) ftdi_sio(E) usbserial(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) xfrm_user(E) xfrm_algo(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E) nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) cpufreq_stats(E) cpufreq_userspace(E) cpufreq_conservative(E) cpufreq_powersave(E) binfmt_misc(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) snd_hda_codec_hdmi(E) iTCO_wdt(E) iTCO_vendor_support(E) stv6110x(E) lnbp21(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E)
btrfs check "Couldn't open file system" after error in transaction.c
Hello,

I have a filesystem (three disks with no raid) that I can still mount ro, but I cannot check or scrub it.

In dmesg I see:

[So Aug 28 11:33:22 2016] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[So Aug 28 11:33:22 2016] BTRFS warning (device sde): Skipping commit of aborted transaction.

(more complete at the end of this mail)

What I did up to now in order to recover:
- mount ro,recovery (http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html): that works.
- btrfs check will lead to "Couldn't open file system"

root@homeserver:~# btrfs scrub start /mnt/test
scrub started on /mnt/test, fsid a8af3832-48c7-4568-861f-e80380dd7e0b (pid=18953)
root@homeserver:~# btrfs scrub status /mnt/test
scrub status for a8af3832-48c7-4568-861f-e80380dd7e0b
        scrub started at Sun Aug 28 12:02:46 2016 and was aborted after 00:00:00
        total bytes scrubbed: 0.00B with 0 errors

First thing to do now is probably to check the backups. But is there a way to repair this filesystem? Besides this: what could be the reason for this error? Scrubs were regular and good. There was no power loss, and smartctl also looks fine on all three drives.

Greetings,
Hendrik

root@homeserver:~# btrfs fi show
Label: 'BigStorage'  uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
        Total devices 3 FS bytes used 7.66TiB
        devid    1 size 2.73TiB used 2.72TiB path /dev/sde
        devid    2 size 2.73TiB used 2.72TiB path /dev/sdc
        devid    3 size 2.73TiB used 2.73TiB path /dev/sdd

[ 98.534830] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534866] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534891] BTRFS error (device sde): parent transid verify failed on 22168481054720 wanted 1826943 found 1828546
[ 98.534920] BTRFS warning (device sde): Skipping commit of aborted transaction.
[ 98.534921] ------------[ cut here ]------------
[ 98.534939] WARNING: CPU: 1 PID: 3643 at /home/zumbi/linux-4.6.4/fs/btrfs/transaction.c:1771 cleanup_transaction+0x96/0x300 [btrfs]
[ 98.534940] BTRFS: Transaction aborted (error -5)
[ 98.534940] Modules linked in: xt_nat(E) xt_tcpudp(E) veth(E) ftdi_sio(E) usbserial(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) xfrm_user(E) xfrm_algo(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E) nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) cpufreq_stats(E) cpufreq_userspace(E) cpufreq_conservative(E) cpufreq_powersave(E) binfmt_misc(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) snd_hda_codec_hdmi(E) iTCO_wdt(E) iTCO_vendor_support(E) stv6110x(E) lnbp21(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) cryptd(E) pcspkr(E) serio_raw(E)
[ 98.534963] snd_hda_codec_realtek(E) snd_hda_codec_generic(E) i2c_i801(E) i915(E) stv090x(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) snd_pcm(E) drm_kms_helper(E) ngene(E) ddbridge(E) snd_timer(E) snd(E) lpc_ich(E) mei_me(E) dvb_core(E) mfd_core(E) drm(E) soundcore(E) mei(E) i2c_algo_bit(E) shpchp(E) evdev(E) battery(E) tpm_tis(E) video(E) tpm(E) processor(E) button(E) fuse(E) autofs4(E) btrfs(E) xor(E) raid6_pq(E) dm_mod(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sg(E) sd_mod(E) ahci(E) libahci(E) crc32c_intel(E) libata(E) psmouse(E) scsi_mod(E) fan(E) thermal(E) xhci_pci(E) xhci_hcd(E) fjes(E) e1000e(E) ptp(E) pps_core(E) ehci_pci(E) ehci_hcd(E) usbcore(E) usb_common(E)
[ 98.534988] CPU: 1 PID: 3643 Comm: btrfs-transacti Tainted: G E 4.6.0-0.bpo.1-amd64 #1 Debian 4.6.4-1~bpo8+1
[ 98.534989] Hardware name: /DH87RL, BIOS RLH8710H.86A.0325.2014.0417.1800 04/17/2014
[ 98.534990] 0286 007b2061 813124c5 8804a15f3d40
[ 98.534992] 8107af94 8804e884ac10 8804a15f3d98
[ 98.534993] 8804b9bbc500 fffb 8804e884ac10 fffb
[ 98.534995] Call Trace:
[ 98.535000] [] ? dump_stack+0x5c/0x77
[ 98.535003] [] ? __warn+0xc4/0xe0
[ 98.535005] [] ? warn_slowpath_fmt+0x5f/0x80
[ 98.535014] [] ? cleanup_transaction+0x96/0x300 [btrfs]
[ 98.535017] [] ? wait_woken+0x90/0x90
[ 98.535026] [] ? btrfs_commit_transaction+0x2b3/0xa30 [btrfs]
[ 98.535028] [] ? wait_woken+0x90/0x90
[ 98.535036] [] ? transaction_kthread+0x1ce/0x1f0 [btrfs]
[ 98.535043] [] ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]
[ 98.535045] [] ? kthread+0xdf/0x100
[ 98.535048] [] ? ret_from_fork+0x22/0x40
[ 98.535049] [] ? kthread_park+0x50/0x50
[ 98.535050] ---[ end trace
Debian Jessie: How to set rootflags=degraded
Hello,

I am using a raid1 under Debian Jessie, because I need to decrease the likelihood of unavailability of the system. Unfortunately I found that when removing one of the drives, the system will not boot up. Instead, initramfs shows up and tells me that the root volume could not be mounted.

I read here:
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg31265.html
that adding rootflags=degraded should help. Furthermore, I see that this is not yet the default - at least in Ubuntu:
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1229456?comments=all
But that it should work by modifying /etc/grub.d/10_linux:
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1229456/comments/3

GRUB_CMDLINE_LINUX="rootflags=subvol=${rootsubvol},degraded ${GRUB_CMDLINE_LINUX}"

When doing that and running grub-mkconfig and update-grub, I still get no entry with "degraded" in /etc/grub.cfg.

Can someone tell me how to achieve this? I have been searching for a very long time now, and I am surprised that I don't find any answer.

Greetings,
Hendrik
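The 10_linux edit from the Launchpad comment can be expressed as a one-line sed. A sketch, under the assumption that the stock /etc/grub.d/10_linux contains the literal `rootflags=subvol=${rootsubvol}` pattern (inspect your own copy first; the function name is made up, and grub-mkconfig still has to be re-run afterwards):

```shell
# Sketch: append ",degraded" to the subvol rootflags in a 10_linux-style
# GRUB script. The matched pattern is an assumption about the stock
# file; verify it against your own /etc/grub.d/10_linux before applying.
add_degraded() {
    sed 's/rootflags=subvol=${rootsubvol}/rootflags=subvol=${rootsubvol},degraded/' "$1"
}

# usage: add_degraded /etc/grub.d/10_linux > 10_linux.patched
```

Note that sed treats `$` and `{}` literally here (basic regular expressions), so the shell-variable syntax in the pattern is matched verbatim, not expanded.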
bad tree block start & failed to read chunk root
Hello,

from this
https://www.spinics.net/lists/linux-btrfs/msg57405.html
I still have a damaged btrfs file system (the partition was recovered, thanks Chris).

When mounting, I get:

[15681.255356] BTRFS info (device sda1): disk space caching is enabled
[15681.255690] BTRFS error (device sda1): bad tree block start 0 20987904
[15681.255786] BTRFS error (device sda1): bad tree block start 0 20987904
[15681.255805] BTRFS: failed to read chunk root on

So I ran chunk-recover (output below):
Total Chunks: 5
  Recoverable: 0
  Unrecoverable: 5

Is there anything else I can do, or is that it?

Regards,
Hendrik

root@homeserver:~# btrfs rescue chunk-recover -v /dev/sda1
All Devices:
        Device: id = 1, name = /dev/sda1
Scanning: DONE in dev0
DEVICE SCAN RESULT:
Filesystem Information:
        sectorsize: 4096
        leafsize: 16384
        tree root generation: 22444670
        chunk root generation: 233419
All Devices:
        Device: id = 1, name = /dev/sda1
All Block Groups:
        Block Group: start = 7776239616, len = 1073741824, flag = 1
        Block Group: start = 61496885248, len = 536870912, flag = 24
        Block Group: start = 62033756160, len = 536870912, flag = 24
        Block Group: start = 71697432576, len = 1073741824, flag = 1
        Block Group: start = 102835945472, len = 1073741824, flag = 1
All Chunks:
All Device Extents:
CHECK RESULT:
Recoverable Chunks:
Unrecoverable Chunks:
        Chunk: start = 7776239616, len = 1073741824, type = 1, num_stripes = 0
                Stripes list:
                Block Group: start = 7776239616, len = 1073741824, flag = 1
                No device extent.
        Chunk: start = 62033756160, len = 536870912, type = 24, num_stripes = 0
                Stripes list:
                Block Group: start = 62033756160, len = 536870912, flag = 24
                No device extent.
        Chunk: start = 61496885248, len = 536870912, type = 24, num_stripes = 0
                Stripes list:
                Block Group: start = 61496885248, len = 536870912, flag = 24
                No device extent.
        Chunk: start = 71697432576, len = 1073741824, type = 1, num_stripes = 0
                Stripes list:
                Block Group: start = 71697432576, len = 1073741824, flag = 1
                No device extent.
        Chunk: start = 102835945472, len = 1073741824, type = 1, num_stripes = 0
                Stripes list:
                Block Group: start = 102835945472, len = 1073741824, flag = 1
                No device extent.
Total Chunks: 5
  Recoverable: 0
  Unrecoverable: 5
Orphan Block Groups:
Orphan Device Extents:
Couldn't map the block 62034313216
No mapping for 62034313216-62034329600
Couldn't map the block 62034313216
bytenr mismatch, want=62034313216, have=0
Couldn't read tree root
open with broken chunk error
Chunk tree recovery failed
Re[2]: Chances to recover with bad partition table?
Hello Chris,

thanks for your help. I did run testdisk before, and it found one partition on the drive. But there should be at least one before that one, which was not found. However, I followed your instructions, and the result matches the find of testdisk:

1f510040  5f 42 48 52 66 53 5f 4d  7e 7a 56 01 00 00 00 00  |_BHRfS_M~zV.....|
23500040  5f 42 48 52 66 53 5f 4d  7e 7a 56 01 00 00 00 00  |_BHRfS_M~zV.....|

0x1f510040 - 0x10040 = 0x1f500000 = 525336576; 525336576 / 512 = 1026048

gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.10

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: present

Found valid MBR and GPT. Which do you want to use?
 1 - MBR
 2 - GPT
 3 - Create blank GPT
Your answer: 1
Disk /dev/sda: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 49D09617-2C48-42D3-B2CD-9A1B72BC6C07
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector boundaries
Total free space is 1026925 sectors (501.4 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1          1026048       234440703   111.3 GiB  8300  Linux filesystem

btrfs-show-super /dev/sda1
dev_item.total_bytes 119508303872

So, I have two problems now:
a) recover the partition before this one
b) get this one mounted

I understand that a) is off topic in this group, so let's focus on b). Having said that, I'd be grateful for hints also on a).

When trying to mount (-o ro,recovery) this disk, I get:

[36904.547011] BTRFS info (device sda1): disk space caching is enabled
[36904.547898] BTRFS error (device sda1): bad tree block start 0 20987904
[36904.549864] BTRFS error (device sda1): bad tree block start 0 20987904
[36904.551632] BTRFS: failed to read chunk root on sda1
[36904.568589] BTRFS: open_ctree failed

Thus, I tried to follow this tutorial:
http://ram.kossboss.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/
but unfortunately the options of btrfs restore have apparently changed (invalid option -F). I am using kernel 4.6 and btrfs-tools 4.4.

I ran:

btrfs rescue super-recover -v /dev/sda1
All Devices:
        Device: id = 1, name = /dev/sda1

Before Recovering:
        [All good supers]:
                device name = /dev/sda1
                superblock bytenr = 65536
                device name = /dev/sda1
                superblock bytenr = 67108864
        [All bad supers]:

All supers are valid, no need to recover

What would be the next step?

Regards,
Hendrik

-- Original message --
From: "Chris Murphy" <li...@colorremedies.com>
To:
Cc: "Hendrik Friedel" <hend...@friedels.name>; "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Sent: 24.07.2016 03:49:28
Subject: Re: Chances to recover with bad partition table?

> On Sat, Jul 23, 2016 at 7:46 PM, Chris Murphy <li...@colorremedies.com> wrote:
>> Something like this:
>> [root@f24s ~]# dd if=/dev/sda | hexdump -C | egrep '5f 42 48 52 66 53 5f'
>> 00110040  5f 42 48 52 66 53 5f 4d  8d 4f 04 00 00 00 00 00  |_BHRfS_M.O......|
>
> Ha so originally I was planning on putting in the dd portion a count
> limit, like count=20480 for a 10MiB search from the start of the drive.
> You could just do
>
> # hexdump -C /dev/sda | egrep '5f 42 48 52 66 53 5f'
>
> And then control-C to cancel it once the first super shows up.
>
> -- 
> Chris Murphy
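The offset arithmetic in the message above can be double-checked mechanically: the `_BHRfS_M` magic sits 0x40 bytes into the superblock, and the first superblock copy sits 64 KiB (0x10000) into the partition, so the start sector is (magic_offset - 0x10040) / 512. The second hit at 0x23500040 is consistent with this: it lies exactly 0x3ff0000 bytes (64 MiB minus 64 KiB) after the first, i.e. it is the second superblock copy of the same filesystem, matching the 65536/67108864 bytenrs super-recover reports. As a sketch:

```shell
# Given the byte offset on the whole disk at which the btrfs magic
# "_BHRfS_M" was found, compute the partition's start sector:
# subtract 0x40 (the magic's offset inside the superblock) plus
# 0x10000 (the first superblock's offset inside the partition),
# then divide by the 512-byte logical sector size.
super_start_sector() {
    echo $(( ($1 - 0x10040) / 512 ))
}

super_start_sector $(( 0x1f510040 ))   # prints 1026048
```

This reproduces the start sector 1026048 that testdisk wrote into the restored partition table.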
Chances to recover with bad partition table?
Hello,

this morning I had to face an unusual prompt on my machine. I found that the partition table of /dev/sda had vanished. I restored it with testdisk. It found one partition, but I am quite sure there was a /boot partition in front of it which was not found. Now, running btrfsck fails:

root@homeserver:~# fdisk -l /dev/sda

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders, total 234441648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *     1026048   234440703   116707328   83  Linux

root@homeserver:~# btrfsck /dev/sda1
checksum verify failed on 20987904 found E4E3BDB6 wanted
checksum verify failed on 20987904 found E4E3BDB6 wanted
checksum verify failed on 20987904 found E4E3BDB6 wanted
checksum verify failed on 20987904 found E4E3BDB6 wanted
bytenr mismatch, want=20987904, have=0
Couldn't read chunk root
Couldn't open file system

Is there a way to let btrfs search for the start of the partition? I do have a backup; thus it is not fatal. But some data on the disk is more recent than my backup (of course).

Regards,
Hendrik
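One way to search a raw disk for a lost btrfs partition is to scan for the superblock magic "_BHRfS_M" and note its byte offsets; this is what the `hexdump -C | egrep` approach suggested elsewhere in this thread does, and GNU grep can do it directly with offsets. A sketch (the helper name is made up; run it against the whole-disk device, not a partition):

```shell
# Sketch: print the byte offset of every btrfs superblock magic found on
# a device or image file (a grep variation on the thread's hexdump
# approach). -a treats binary as text, -b prints byte offsets, -o prints
# one match per line, -F matches a fixed string. The first superblock
# lives 0x10000 bytes into a partition, with the magic 0x40 bytes in.
find_btrfs_magic() {
    grep -aboF '_BHRfS_M' "$1"
}

# usage: find_btrfs_magic /dev/sda   # each line looks like "offset:_BHRfS_M"
```

From the first reported offset, subtracting 0x10040 and dividing by 512 gives a candidate start sector for the lost partition.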
Re: Status of SMR with BTRFS
Hello Austin,

thanks for your reply.

>> Ok, thanks; so TGMR does not say whether or not the device is SMR, right?
> I'm not 100% certain about that. Technically, the only non-firmware
> difference is in the read head and the tracking. If it were me, I'd be
> listing SMR instead of TGMR on the data sheet, but I'd be more than
> willing to bet that many drive manufacturers won't think like that.

>> While the data sheet does not mention SMR, and the 'Desktop' in the name rather than 'Archive' would indicate no SMR, some reviews indicate SMR (http://www.legitreviews.com/seagate-barracuda-st5000dm000-5tb-desktop-hard-drive-review_161241)
> Beyond that, I'm not sure, but I believe that their 'Desktop' branding
> still means it's TGMR and not SMR.

... which in the Seagate nomenclature might not exclude each other (TGMR could still be SMR). I will just ask them... How did you find out whether your drives use SMR?

> I'd very much suggest avoiding USB connected SMR drives though. USB is
> already poorly designed for storage devices (even with USB attached
> SCSI), and most of the filesystem issues I see personally (not just
> with BTRFS, but any other filesystem as well) are on USB connected
> storage, so I'd be very wary of adding all the potential issues with
> SMR drives on top of that as well.

Understood. But I use this drive as a backup: the drive must not be connected to the system unless a backup is being made. Otherwise a virus, or just a power problem (a surge due to a lightning strike), might wipe out both the source data and the backup at once (single point of failure).

Greetings,
Hendrik
Re: Status of SMR with BTRFS
Hello, and thanks for your replies.

>> It's a Seagate Expansion Desktop 5TB (USB3). It is probably a ST5000DM000.
>> this is TGMR not SMR disk:
> TGMR is a derivative of giant magneto-resistance, and is what's been
> used in hard disk drives for decades now. With limited exceptions in
> recent years and in ancient embedded systems, all modern hard drives
> are TGMR based.

Ok, thanks; so TGMR does not say whether or not the device is SMR, right? While the data sheet does not mention SMR, and the 'Desktop' in the name rather than 'Archive' would indicate no SMR, some reviews indicate SMR:
http://www.legitreviews.com/seagate-barracuda-st5000dm000-5tb-desktop-hard-drive-review_161241

>> In any case: the drive behaves like an SMR drive. I ran a benchmark on it with up to 200MB/s. When copying a file onto the drive in parallel, the rate in the benchmark dropped to 7MB/s, while that particular file was copied at 40MB/s.
> This type of performance degradation is actually not unexpected

Ok. I was not aware. I expected some, but less, impact.

> There's two things that should be clarified here: [...]

Thanks for clarifying.

>> Well, I'm no pro. But I found this:
>> https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt
>> And this does sound like improvements to BTRFS can be done for SMR in a
>> generic, not vendor/device specific manner. And I am wondering: [...]
>> b) whether these improvements have been made already
> Not yet.

Ok, thanks. So I conclude that on SMR drives, BTRFS has all the benefits it has on other devices, and there are no SMR-specific disadvantages related to BTRFS. Nevertheless, some improvements can be made to BTRFS in order to work better with these drives.

Greetings,
Hendrik
Re: Status of SMR with BTRFS
Hi Tomasz,

@Dave: I have added you to the conversation, as I refer to your notes (https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt)

Thanks for your reply!

>> It's a Seagate Expansion Desktop 5TB (USB3). It is probably a ST5000DM000.
> this is TGMR not SMR disk: http://www.seagate.com/www-content/product-content/desktop-hdd-fam/en-us/docs/100743772a.pdf So it still conforms to standard record strategy ...

I am not convinced. I had not heard of TGMR before, but I find TGMR described as a technology for the head: https://pics.computerbase.de/4/0/3/4/4/29-1080.455720475.jpg In any case, the drive behaves like an SMR drive: I ran a benchmark on it with up to 200MB/s. When copying a file onto the drive in parallel, the rate in the benchmark dropped to 7MB/s, while that particular file was copied at 40MB/s.

> There are two types: 1. SMR managed by device firmware. BTRFS sees that as a normal block device … problems you get are not related to BTRFS itself …

>> That for sure. But the way BTRFS uses/writes data could still cause problems in conjunction with these devices, no?
> I'm sorry but I'm confused now, what "magical way of using/writing data" you actually mean ? AFAIK btrfs sees the disk as a block device

Well, btrfs writes data very differently from many other file systems. On every write, the file data is copied to another place, even if just one bit is changed. That's special, and I am wondering whether that could cause problems.

> Now think slowly and thoroughly about it: who would write a code (and maintain it) for a file system that access device specific data for X amount of vendors with each having Y amount of model specific configurations/caveats/firmwares/protocols ... S.M.A.R.T. emerged to give a unifying interface to device statistics ... this is how bad it was ...

Well, I'm no pro. But I found this: https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt And this does sound like improvements to btrfs can be made for SMR in a generic, not vendor/device-specific manner.
And I am wondering:
a) whether it is advisable to use btrfs on these drives before these improvements have been made
   i) if not: are there specific btrfs features that should be avoided, or btrfs in general?
b) whether these improvements have been made already

> care about your data, do some research ... if not ... maybe ReiserFS is for you :)

You are right, for sure. And that's what I do here. But I am far away from being able to judge myself, so I rely on support.

Greetings, Hendrik
Re: Status of SMR with BTRFS
Hello Tomasz,

thanks for your reply.

> What disk are you using ?

It's a Seagate Expansion Desktop 5TB (USB3). It is probably a ST5000DM000.

> There are two types: 1. SMR managed by device firmware. BTRFS sees that as a normal block device … problems you get are not related to BTRFS itself …

That for sure. But the way btrfs uses/writes data could still cause problems in conjunction with these devices, no?

> 2. SMR managed by host system, BTRFS still does see this as a block device … just emulated by host system to look normal.

I am not sure what I am using. How can I find out?

> In case of funky technologies like that I would research how exactly data is stored in terms of "BAND" and experiment with setting leaf & sector size to match a band,

Sorry, but I have no idea where to start. It seems to me that, although the drive is a pure consumer drive, this is a 'pro' feature and I should avoid it with btrfs. I am just surprised that there is no hint in the wiki in that regard.

Greetings, Hendrik

> On 15 Jul 2016, at 19:29, Hendrik Friedel <hend...@friedels.name> wrote:
> Hello,
> I have a 5TB Seagate drive that uses SMR.
> I was wondering, if BTRFS is usable with this Harddrive technology. So, first I searched the BTRFS wiki -nothing. Then google.
> * I found this: https://bbs.archlinux.org/viewtopic.php?id=203696 But this turned out to be an issue not related to BTRFS.
> * Then this: http://www.snia.org/sites/default/files/SDC15_presentations/smr/HannesReinecke_Strategies_for_running_unmodified_FS_SMR.pdf "BTRFS operation matches SMR parameters very closely [...] High number of misaligned write accesses"; points to an issue with btrfs itself
> * Then this: http://superuser.com/questions/962257/fastest-linux-filesystem-on-shingled-disks The BTRFS performance seemed good.
> * Finally this: http://www.spinics.net/lists/linux-btrfs/msg48072.html "So you can get mixed results when trying to use the SMR devices but I'd say it will mostly not work.
> But, btrfs has all the fundamental features in place, we'd have to make adjustments to follow the SMR constraints:"
> [...] I have some notes at https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt
> So, now I am wondering, what the state is today. "We" (I am happy to do that; but not sure of access rights) should also summarize this in the wiki.
> My use-case by the way are back-ups. I am thinking of using some of the interesting BTRFS features for this (send/receive, deduplication)
> Greetings,
> Hendrik
Status of SMR with BTRFS
Hello,

I have a 5TB Seagate drive that uses SMR. I was wondering whether btrfs is usable with this hard-drive technology. So first I searched the btrfs wiki: nothing. Then Google.

* I found this: https://bbs.archlinux.org/viewtopic.php?id=203696 But this turned out to be an issue not related to btrfs.

* Then this: http://www.snia.org/sites/default/files/SDC15_presentations/smr/HannesReinecke_Strategies_for_running_unmodified_FS_SMR.pdf "BTRFS operation matches SMR parameters very closely [...] High number of misaligned write accesses"; this points to an issue with btrfs itself.

* Then this: http://superuser.com/questions/962257/fastest-linux-filesystem-on-shingled-disks The btrfs performance seemed good.

* Finally this: http://www.spinics.net/lists/linux-btrfs/msg48072.html "So you can get mixed results when trying to use the SMR devices but I'd say it will mostly not work. But, btrfs has all the fundamental features in place, we'd have to make adjustments to follow the SMR constraints:" [...] "I have some notes at https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt"

So now I am wondering what the state is today. "We" (I am happy to do that, but I'm not sure about access rights) should also summarize this in the wiki. My use case, by the way, is backups. I am thinking of using some of the interesting btrfs features for this (send/receive, deduplication).

Greetings, Hendrik
Re: How to move a btrfs volume to a smaller disk
Hello Hugo,

thanks for your ultrafast reply. Unfortunately, it does not work for me:

[root@homeserver mnt2]# btrfs filesystem resize 80G /mnt2/Data_Store/ && btrfs replace start /dev/sdb4 /dev/sda4 /mnt2/Data_Store/ -f && btrfs filesystem resize max /mnt2/Data_Store/
Resize '/mnt2/Data_Store/' of '80G'
ERROR: target device smaller than source device (required 119121379328 bytes)

[root@homeserver mnt2]# btrfs filesystem show /mnt2/Data_Store/
Label: 'Data_Store' uuid: 0ccc1e24-090d-42e2-9e61-d0a1b3101f93
Total devices 1 FS bytes used 47.95GiB
devid 1 size 80.00GiB used 66.03GiB path /dev/sdb4

[root@homeserver mnt2]# lsblk | grep sda4
└─sda4 8:4 0 103.5G 0 part

Greetings, Hendrik

On 09.03.2016 22:50, Hugo Mills wrote:
> On Wed, Mar 09, 2016 at 10:46:09PM +0100, Hendrik Friedel wrote:
>> Hello, I intend to move this subvolume to a new device.
>> btrfs fi show /mnt2/Data_Store/
>> Label: 'Data_Store' uuid: 0ccc1e24-090d-42e2-9e61-d0a1b3101f93
>> Total devices 1 FS bytes used 47.93GiB
>> devid 1 size 102.94GiB used 76.03GiB path /dev/sdb4
>> (fi usage at the bottom of this message)
>> The new device (sda4) is 8G smaller unfortunately.
>> sda 8:0 0 111.8G 0 disk
>> └─sda4 8:4 0 103.5G 0 part
>> sdb 8:16 0 119.2G 0 disk
>> └─sdb4 8:20 0 111G 0 part /mnt2/Data_Store
>> Thus, btrfs replace does not work. What would you suggest now to move the FS (it does contain many subvolumes)?
> btrfs dev resize to shrink it to (slightly smaller than) the replacement device, then btrfs replace should work. Then btrfs dev resize max to fill up the replacement device completely.
> Hugo.
>> I tried btrfs send /mnt2/Data_Store/read_only_snapshot/ | btrfs receive /mnt/sda4/ but this only created an empty subvolume /mnt/sda4/read_only_snapshot/
>> So, then
>> btrfs device add /dev/sda4 /mnt/Data_Store
>> btrfs balance start /mnt/Data_Store
>> btrfs device remove /dev/sdb4 /mnt/Data_Store
>> ? Or is there a better option?
Regards, Hendrik

btrfs fi usage /mnt2/Data_Store/
Overall:
Device size: 102.94GiB
Device allocated: 74.03GiB
Device unallocated: 28.91GiB
Device missing: 0.00B
Used: 47.96GiB
Free (estimated): 53.24GiB (min: 53.24GiB)
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 512.00MiB (used: 0.00B)
Data,single: Size:69.00GiB, Used:44.67GiB /dev/sdb4 69.00GiB
Metadata,single: Size:5.00GiB, Used:3.29GiB /dev/sdb4 5.00GiB
System,single: Size:32.00MiB, Used:16.00KiB /dev/sdb4 32.00MiB
Unallocated: /dev/sdb4 28.91GiB
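[Editor's note: for readers hitting the same "target device smaller than source device" error, the required size in the message (119121379328 bytes, roughly the 111G sdb4 partition) suggests that btrfs replace compares raw device sizes, so shrinking the filesystem alone may not be enough when the target partition is smaller than the source partition. A sketch of the add/remove migration path Hendrik proposes, with device names taken from this thread; note that `btrfs device delete` relocates all chunks itself, so a separate balance is not strictly needed:]

```shell
# Migrate to the smaller partition by adding it, then removing the old one.
# Assumes the actual data (about 48GiB here) fits on the new 103.5G partition.
btrfs device add /dev/sda4 /mnt2/Data_Store

# 'delete' (alias: 'remove') moves every chunk off the old device before
# detaching it; this can take a while on a well-filled filesystem.
btrfs device delete /dev/sdb4 /mnt2/Data_Store

# Verify that only /dev/sda4 remains in the filesystem.
btrfs filesystem show /mnt2/Data_Store
```

This is a sketch of destructive device-management commands, not something to run verbatim; double-check device names against `lsblk` first.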
How to move a btrfs volume to a smaller disk
Hello,

I intend to move this subvolume to a new device.

btrfs fi show /mnt2/Data_Store/
Label: 'Data_Store' uuid: 0ccc1e24-090d-42e2-9e61-d0a1b3101f93
Total devices 1 FS bytes used 47.93GiB
devid 1 size 102.94GiB used 76.03GiB path /dev/sdb4

(fi usage at the bottom of this message)

The new device (sda4) is 8G smaller, unfortunately.

sda 8:0 0 111.8G 0 disk
└─sda4 8:4 0 103.5G 0 part
sdb 8:16 0 119.2G 0 disk
└─sdb4 8:20 0 111G 0 part /mnt2/Data_Store

Thus, btrfs replace does not work. What would you suggest now to move the FS (it does contain many subvolumes)?

I tried
btrfs send /mnt2/Data_Store/read_only_snapshot/ | btrfs receive /mnt/sda4/
but this only created an empty subvolume /mnt/sda4/read_only_snapshot/

So, then:
btrfs device add /dev/sda4 /mnt/Data_Store
btrfs balance start /mnt/Data_Store
btrfs device remove /dev/sdb4 /mnt/Data_Store
? Or is there a better option?

Regards, Hendrik

btrfs fi usage /mnt2/Data_Store/
Overall:
Device size: 102.94GiB
Device allocated: 74.03GiB
Device unallocated: 28.91GiB
Device missing: 0.00B
Used: 47.96GiB
Free (estimated): 53.24GiB (min: 53.24GiB)
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 512.00MiB (used: 0.00B)
Data,single: Size:69.00GiB, Used:44.67GiB /dev/sdb4 69.00GiB
Metadata,single: Size:5.00GiB, Used:3.29GiB /dev/sdb4 5.00GiB
System,single: Size:32.00MiB, Used:16.00KiB /dev/sdb4 32.00MiB
Unallocated: /dev/sdb4 28.91GiB
Re: booting from BTRFS works only with one device in the pool
Hello Chris, thanks, I appreciate your help - 1. Install CentOS 7.0 to vda 2. reboot 3. btrfs dev add /dev/vdb / 4. reboot ## works 5. btrfs balance start / 6. reboot ## works Same thing when starting with CentOS 7.2 media. This is a NAS product using CentOS 7.2? My only guess is they've changed something broke this. - I confirm that it runs also for me on CentOS 7 Minimal. Thus we can rule out CentOS and myself as the Source of the problem. Rockstor is Based on Centos 7.2. So, I compared the grub.cfg of Centos 7.2 -which works for me to Rockstor. I see very few differences apart from UUIDs: 1) menuentry 'Rockstor' --class rhel fedora menuentry 'Centos ' --class centos 2) insmod ext2 insmod zfs I have attached the two files. Can you point me at other settings/files I could compare, please? Regards, Hendrik --- Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. https://www.avast.com/antivirus # # DO NOT EDIT THIS FILE # # It is automatically generated by grub2-mkconfig using templates # from /etc/grub.d and settings from /etc/default/grub # ### BEGIN /etc/grub.d/00_header ### set pager=1 if [ -s $prefix/grubenv ]; then load_env fi if [ "${next_entry}" ] ; then set default="${next_entry}" set next_entry= save_env next_entry set boot_once=true else set default="${saved_entry}" fi if [ x"${feature_menuentry_id}" = xy ]; then menuentry_id_option="--id" else menuentry_id_option="" fi export menuentry_id_option if [ "${prev_saved_entry}" ]; then set saved_entry="${prev_saved_entry}" save_env saved_entry set prev_saved_entry= save_env prev_saved_entry set boot_once=true fi function savedefault { if [ -z "${boot_once}" ]; then saved_entry="${chosen}" save_env saved_entry fi } function load_video { if [ x$feature_all_video_module = xy ]; then insmod all_video else insmod efi_gop insmod efi_uga insmod ieee1275_fb insmod vbe insmod vga insmod video_bochs insmod video_cirrus fi } terminal_output console if [ x$feature_timeout_style = xy ] ; then set 
timeout_style=menu set timeout=5 # Fallback normal timeout code in case the timeout_style feature is # unavailable. else set timeout=5 fi ### END /etc/grub.d/00_header ### ### BEGIN /etc/grub.d/00_tuned ### set tuned_params="" ### END /etc/grub.d/00_tuned ### ### BEGIN /etc/grub.d/01_users ### if [ -f ${prefix}/user.cfg ]; then source ${prefix}/user.cfg if [ -n "${GRUB2_PASSWORD}" ]; then set superusers="root" export superusers password_pbkdf2 root ${GRUB2_PASSWORD} fi fi ### END /etc/grub.d/01_users ### ### BEGIN /etc/grub.d/10_linux ### menuentry 'CentOS Linux (3.10.0-327.4.5.el7.x86_64) 7 (Core)' --class centos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-327.4.5.el7.x86_64-advanced-e37695d8-f6df-490a-9a0a-b13cdfbea99d' { load_video set gfxpayload=keep insmod gzio insmod part_msdos insmod xfs set root='hd0,msdos1' if [ x$feature_platform_search_hint = xy ]; then search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' e7d4883f-c825-49bc-907a-8ec9bd22e774 else search --no-floppy --fs-uuid --set=root e7d4883f-c825-49bc-907a-8ec9bd22e774 fi linux16 /vmlinuz-3.10.0-327.4.5.el7.x86_64 root=UUID=e37695d8-f6df-490a-9a0a-b13cdfbea99d ro rootflags=subvol=root crashkernel=auto rhgb quiet LANG=en_US.UTF-8 initrd16 /initramfs-3.10.0-327.4.5.el7.x86_64.img } menuentry 'CentOS Linux (0-rescue-e8569f127454499cb00018a87844b14d) 7 (Core)' --class centos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-0-rescue-e8569f127454499cb00018a87844b14d-advanced-e37695d8-f6df-490a-9a0a-b13cdfbea99d' { load_video insmod gzio insmod part_msdos insmod xfs set root='hd0,msdos1' if [ x$feature_platform_search_hint = xy ]; then search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' e7d4883f-c825-49bc-907a-8ec9bd22e774 else search --no-floppy --fs-uuid 
--set=root e7d4883f-c825-49bc-907a-8ec9bd22e774 fi linux16 /vmlinuz-0-rescue-e8569f127454499cb00018a87844b14d root=UUID=e37695d8-f6df-490a-9a0a-b13cdfbea99d ro rootflags=subvol=root crashkernel=auto rhgb quiet initrd16 /initramfs-0-rescue-e8569f127454499cb00018a87844b14d.img } ### END /etc/grub.d/10_linux ### ### BEGIN /etc/grub.d/20_linux_xen ### ### END /etc/grub.d/20_linux_xen ### ### BEGIN /etc/grub.d/20_ppc_terminfo ### ### END /etc/grub.d/20_ppc_terminfo ### ### BEGIN /etc/grub.d/30_os-prober ### ### END /etc/grub.d/30_os-prober ### ### BEGIN /etc/grub.d/40_custom ### # This file provides an easy
Re: booting from BTRFS works only with one device in the pool
Sorry, I missed this:
> What do you get for rpm -q grub2
grub2-2.02-0.34.el7.centos.x86_64

Greetings, Hendrik
Re: booting from BTRFS works only with one device in the pool
Hello Chris,

> That's a bit weird. This is BIOS or UEFI system? On UEFI, the prebaked grubx64.efi includes btrfs, so insmod isn't strictly needed. But on BIOS it would be.

It is a VirtualBox VM, running as a BIOS system.

> It might be as simple as manually mounting:
> btrfs dev scan
> btrfs fi show ## hopefully both devices are now associated with the volume
> mount /dev/sdXY /sysroot
> exit

The mount works. After entering "exit" I get the feedback "logout" and the system hangs.

> If it mounts, you can exit to continue the startup process. And then: dracut -f That'll rebuild the initramfs.

And dracut then somehow understands that btrfs dev scan is needed?

>> set root='hd0,msdos1' search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' 4a470ac6-f013-4e7b-a4f3-3a58cc4debc3
>> after removing sda3 from the pool again, the system boots normally.
> That's unexpected. Both devices should now refer to each other, so either device missing should fail; it's effectively raid0 except on a chunk level.

That's way beyond my understanding. I am not sure how this entry is generated, but it seems to be a default behavior (I have not done this).

Greetings, Hendrik
Re: booting from BTRFS works only with one device in the pool
Hello Hugo,

>> Here I am stuck in a recovery prompt.
> By far the simplest and most reliable method of doing this is to use an initramfs with the command "btrfs dev scan" in it somewhere before mounting. Most of the major distributions already have an initramfs set up (as does yours, I see), and will install the correct commands in the initramfs if you install the btrfs-progs package (btrfs-tools in Debian derivatives).

I would like to go the sensible way :-) But can you hint me how and where to add the btrfs device scan option to the initramfs?

Regards, Hendrik
Re: booting from BTRFS works only with one device in the pool
Hello,

>> I would like to go the sensible way :-) But can you hint me how and where to add the btrfs device scan option to the initramfs?
> If btrfs-progs 4.3.1 is installed already, dracut -f will rebuild the initramfs and should just drag in current tools which will include 'btrfs device scan'.

Yes, 4.3.1 is installed, and even in the recovery mode (i.e. if boot fails) btrfs reports version 4.3.1. Also, in this recovery mode, btrfs dev scan is working. Furthermore, I can mount the drive in this recovery mode (which I *suspect* is the initramfs). My understanding is/was that it is not sufficient for btrfs dev scan to be available; its execution must also be triggered. Is that right?

> Like I said above, just install the distribution's own btrfs-progs/btrfs-tools package, and it should do the right thing. You may have to tell the distribution to rebuild the initramfs

I have run dracut -f. It does not solve the issue.

> This is CentOS 7.2 installed to a single VDI file, and then you use 'btrfs dev add' to add another VDI file? I'd like to know how to reproduce the conditions so I can figure out what's wrong because it ought to work, seeing as it worked for me with Fedora 19 and CentOS 7.x is in the vicinity of Fedora 19/20.

Yes, it is Rockstor (CentOS 7.2 based) installed in the way you mention.

Regards, Hendrik
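[Editor's note: one concrete way to rule out a missing initramfs module, sketched here as an assumption on my part rather than something confirmed in the thread. dracut ships a btrfs module (90btrfs) that installs the udev rule which triggers 'btrfs device scan' at boot; forcing it in and inspecting the result looks like this:]

```shell
# Explicitly pull dracut's btrfs module into the initramfs and rebuild.
# The extra spaces inside the quotes are required by dracut.conf syntax.
echo 'add_dracutmodules+=" btrfs "' > /etc/dracut.conf.d/btrfs.conf
dracut -f

# Verify the btrfs udev rule and binary actually made it into the image.
lsinitrd | grep -i btrfs
```

If `lsinitrd` shows the 64-btrfs udev rule and the btrfs binary but boot still fails, the problem lies elsewhere (e.g. the initramfs mounting the root before the second device has appeared).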
booting from BTRFS works only with one device in the pool
Hello,

I am running CentOS from a btrfs root. This worked fine until I added a device to that pool:
btrfs device add /dev/sda3 /
reboot

This now causes the errors:
BTRFS: failed to read chunk tree on sdb3
BTRFS: open_ctree failed

Here I am stuck in a recovery prompt. btrfs fi show displays the file system correctly, with 2.1GiB used for sdb3 and 0.00GiB used on sda3. btrfs-tools reports btrfs-progs v4.3.1.

Now, I read that in case of this issue, I should add the second device of the pool to the command-line argument of the kernel/the boot options/grub.cfg. But I am not sure how to do this. I can mount /boot/, and /boot/grub2/grub.cfg contains:

insmod ext2 (but not btrfs!)
set root='hd0,msdos1'
search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' 4a470ac6-f013-4e7b-a4f3-3a58cc4debc3

After removing sda3 from the pool again, the system boots normally.

blkid gives:
/dev/sdb3: LABEL="rockstor_rockstor" UUID="f9de7c11-012e-4e5d-8b53-0e6d6c2916a3" UUID_SUB="24bdf07b-dbd3-44dc-9195-4b0bfedf974f" TYPE="btrfs" PARTLABEL="Linux filesystem" PARTUUID="c438bd3c-df9a-4e49-8607-47cd9b45e212"
(note /dev/sda3 is not shown here)

btrfs fi show
Label: 'rockstor_rockstor' uuid: f9de7c11-012e-4e5d-8b53-0e6d6c2916a3
Total devices 1 FS bytes used 1.38GiB
devid 1 size 6.87GiB used 2.10GiB path /dev/sdb3

It's a pity that the only NAS distribution built around btrfs does not support the full feature set of btrfs on its root partition. Could you please help me fix this? Below you find the complete grub.cfg.
Regards, Hendrik cat /boot/grub2/grub.cfg # # DO NOT EDIT THIS FILE # # It is automatically generated by grub2-mkconfig using templates # from /etc/grub.d and settings from /etc/default/grub # ### BEGIN /etc/grub.d/00_header ### set pager=1 if [ -s $prefix/grubenv ]; then load_env fi if [ "${next_entry}" ] ; then set default="${next_entry}" set next_entry= save_env next_entry set boot_once=true else set default="${saved_entry}" fi if [ x"${feature_menuentry_id}" = xy ]; then menuentry_id_option="--id" else menuentry_id_option="" fi export menuentry_id_option if [ "${prev_saved_entry}" ]; then set saved_entry="${prev_saved_entry}" save_env saved_entry set prev_saved_entry= save_env prev_saved_entry set boot_once=true fi function savedefault { if [ -z "${boot_once}" ]; then saved_entry="${chosen}" save_env saved_entry fi } function load_video { if [ x$feature_all_video_module = xy ]; then insmod all_video else insmod efi_gop insmod efi_uga insmod ieee1275_fb insmod vbe insmod vga insmod video_bochs insmod video_cirrus fi } terminal_output console if [ x$feature_timeout_style = xy ] ; then set timeout_style=menu set timeout=5 # Fallback normal timeout code in case the timeout_style feature is # unavailable. 
else set timeout=5 fi ### END /etc/grub.d/00_header ### ### BEGIN /etc/grub.d/00_tuned ### set tuned_params="" ### END /etc/grub.d/00_tuned ### ### BEGIN /etc/grub.d/01_users ### if [ -f ${prefix}/user.cfg ]; then source ${prefix}/user.cfg if [ -n "${GRUB2_PASSWORD}" ]; then set superusers="root" export superusers password_pbkdf2 root ${GRUB2_PASSWORD} fi fi ### END /etc/grub.d/01_users ### ### BEGIN /etc/grub.d/10_linux ### menuentry 'Rockstor (4.3.3-1.el7.elrepo.x86_64) 3 (Core)' --class rhel fedora --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-4.3.3-1.el7.elrepo.x86_64-advanced-f9de7c11-012e-4e5d-8b53-0e6d6c2916a3' { load_video set gfxpayload=keep insmod gzio insmod part_msdos insmod ext2 set root='hd0,msdos1' if [ x$feature_platform_search_hint = xy ]; then search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' 4a470ac6-f013-4e7b-a4f3-3a58cc4debc3 else search --no-floppy --fs-uuid --set=root 4a470ac6-f013-4e7b-a4f3-3a58cc4debc3 fi linux16 /vmlinuz-4.3.3-1.el7.elrepo.x86_64 root=UUID=f9de7c11-012e-4e5d-8b53-0e6d6c2916a3 ro rootflags=subvol=root crashkernel=auto rhgb quiet LANG=en_US.UTF-8 initrd16 /initramfs-4.3.3-1.el7.elrepo.x86_64.img } menuentry 'Rockstor (0-rescue-f5f625480f394bdc90d6d3c06be7fb88) 3 (Core)' --class rhel fedora --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-0-rescue-f5f625480f394bdc90d6d3c06be7fb88-advanced-f9de7c11-012e-4e5d-8b53-0e6d6c2916a3' { load_video insmod gzio insmod part_msdos insmod ext2 set root='hd0,msdos1' if [ x$feature_platform_search_hint = xy ]; then search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1'
Re: understanding btrfs fi df
Hello Hugo,

> It shouldn't happen, as I understand how the process works. Can you show the output of btrfs fi df /mnt/__Complete_Disk? Let's just check that everything is indeed RAID-5 still.

Here we go:
btrfs fi df /mnt/__Complete_Disk
Data, RAID5: total=3.79TiB, used=3.78TiB
System, RAID5: total=32.00MiB, used=416.00KiB
Metadata, RAID5: total=6.46GiB, used=4.85GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

As far as I see, it's all RAID5.

> After this, we're probably going to have to look at the device and chunk trees to work out what's going on.

Can you help me on this?

Regards, Hendrik
Re: understanding btrfs fi df
Hello Hugo,

thanks for your hint.

On 16.08.2015 16:57, Hugo Mills wrote:
> Here's your problem -- you've got a RAID 5 filesystem, which has a minimum allocation of 2 devices, but only one device has free space on it for allocation, so no more chunks can be allocated. I'm not sure how it ended up in this situation, but that's what's happened.
> 2) convert a few chunks of the existing FS, on devices 1 and 3, to single, which should free up some space across all the devices, and then run a full balance, and finally convert the single back to RAID 5:
> # btrfs balance start -dconvert=single,devid=1,limit=4 /mnt/__Complete_Disk
> # btrfs balance start -dconvert=single,devid=3,limit=4 /mnt/__Complete_Disk
> # btrfs balance start -dprofiles=raid5 /mnt/__Complete_Disk
> # btrfs balance start -dconvert=raid5,soft /mnt/__Complete_Disk

I did that. In order to run the first two commands I had to free some space, though, which was no problem. Now the output is:

root@homeserver:/media# btrfs fi df /mnt/__Complete_Disk
Data, RAID5: total=3.79TiB, used=3.78TiB
System, RAID5: total=32.00MiB, used=416.00KiB
Metadata, RAID5: total=6.46GiB, used=4.85GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

root@homeserver:/media# btrfs fi show
Label: none uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
Total devices 3 FS bytes used 3.79TiB
devid 1 size 2.73TiB used 2.43TiB path /dev/sdf
devid 2 size 2.73TiB used 2.43TiB path /dev/sdd
devid 3 size 2.73TiB used 1.38TiB path /dev/sde

How can only 1.38TiB be used on devid 3?

Greetings, Hendrik

-- Hendrik Friedel Auf dem Brink 12 28844 Weyhe Tel. 04203 8394854 Mobil 0178 1874363
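[Editor's note: a hedged suggestion for inspecting the uneven per-device usage reported above. Both commands exist in btrfs-progs of this era (`fi usage` is already used elsewhere in these threads) and break the allocation down per device, which would show whether some RAID5 chunks are striped over only two of the three disks:]

```shell
# Per-filesystem view: allocation split by profile, with per-device detail.
btrfs filesystem usage /mnt/__Complete_Disk

# Per-device view: how much of each device each chunk type occupies.
btrfs device usage /mnt/__Complete_Disk
```

If `btrfs device usage` shows RAID5 data spread as 2.43TiB/2.43TiB/1.38TiB, that would confirm chunks allocated as two-device stripes rather than three-device ones.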
Re: understanding btrfs fi df
Hi Hugo,

thanks for your help.

>> Now the output is:
>> root@homeserver:/media# btrfs fi df /mnt/__Complete_Disk
>> Data, RAID5: total=3.79TiB, used=3.78TiB
>> System, RAID5: total=32.00MiB, used=416.00KiB
>> Metadata, RAID5: total=6.46GiB, used=4.85GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>> root@homeserver:/media# btrfs fi show
>> Label: none uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
>> Total devices 3 FS bytes used 3.79TiB
>> devid 1 size 2.73TiB used 2.43TiB path /dev/sdf
>> devid 2 size 2.73TiB used 2.43TiB path /dev/sdd
>> devid 3 size 2.73TiB used 1.38TiB path /dev/sde
>> How can only 1.38TiB be used on devid 3?
> It shouldn't happen, as I understand how the process works. Can you show the output of btrfs fi df /mnt/__Complete_Disk? Let's just check that everything is indeed RAID-5 still.

Here we go:
btrfs fi df /mnt/__Complete_Disk
Data, RAID5: total=3.79TiB, used=3.78TiB
System, RAID5: total=32.00MiB, used=416.00KiB
Metadata, RAID5: total=6.46GiB, used=4.85GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Greetings, Hendrik

-- Hendrik Friedel Auf dem Brink 12 28844 Weyhe Tel. 04203 8394854 Mobil 0178 1874363
understanding btrfs fi df
Hello,

I am struggling to understand the output of btrfs fi df:

btrfs fi df /mnt/__Complete_Disk/
Data, RAID5: total=3.85TiB, used=3.85TiB
System, RAID5: total=32.00MiB, used=576.00KiB
Metadata, RAID5: total=6.46GiB, used=5.14GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I have three disks:

btrfs fi show
Label: none uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
Total devices 3 FS bytes used 3.85TiB
devid 1 size 2.73TiB used 2.73TiB path /dev/sdf
devid 2 size 2.73TiB used 2.24TiB path /dev/sdd
devid 3 size 2.73TiB used 2.73TiB path /dev/sde

So I would expect around 5.4TiB to be available. In fact I am currently using 4.4TiB, and the drive seems to be full (I cannot store more data). I went through Marc's blog post on this, but the hints there did not help. In particular:
btrfs balance start -dusage=95 /mnt/__Complete_Disk/ (started with 55)

What's wrong here?

Regards, Hendrik
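[Editor's note: as a sanity check on the "around 5.4TiB" expectation above: for RAID5 data over n equally sized devices, roughly (n - 1)/n of the raw capacity is usable, since one device's worth of space goes to parity. A small sketch of that back-of-the-envelope arithmetic, not a btrfs command:]

```shell
# Expected usable RAID5 capacity: (n - 1) * device_size for n equal devices.
# With three 2.73 TiB disks this gives the ~5.4 TiB expected in the post.
n=3
size_tib=2.73
usable=$(awk "BEGIN { print ($n - 1) * $size_tib }")
echo "expected usable: ${usable} TiB"
```

The gap between this figure and the ~4.4TiB actually achievable is what the rest of the thread investigates (uneven chunk distribution across the devices).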
Re: Data single *and* raid?
Hello Qu,

thanks for your reply.

>> But then:
>> root@homeserver:/mnt/__Complete_Disk# btrfs fi df /mnt/__Complete_Disk/
>> Data, RAID5: total=3.83TiB, used=3.78TiB
>> System, RAID5: total=32.00MiB, used=576.00KiB
>> Metadata, RAID5: total=6.46GiB, used=4.84GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
> GlobalReserve is not a chunk type, it just means a range of metadata reserved for overcommitting. And it's always single. Personally, I don't think it should be output in the fi df command, as it's at a higher level than chunks. At least for your case, there is nothing to worry about.

But this seems to be a RAID5 now, right? Well, that's what I want, but the command was:
btrfs balance start -dprofiles=single -mprofiles=raid1 /mnt/__Complete_Disk/
So, we would expect raid1 here, no?

Greetings, Hendrik

On 01.08.2015 22:44, Chris Murphy wrote:
> On Sat, Aug 1, 2015 at 2:32 PM, Hugo Mills <h...@carfax.org.uk> wrote:
>> On Sat, Aug 01, 2015 at 10:09:35PM +0200, Hendrik Friedel wrote:
>>> Hello, I converted an array to raid5 by
>>> btrfs device add /dev/sdd /mnt/new_storage
>>> btrfs device add /dev/sdc /mnt/new_storage
>>> btrfs balance start -dconvert=raid5 -mconvert=raid5 /mnt/new_storage/
>>> The balance went through. But now:
>>> Label: none uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
>>> Total devices 3 FS bytes used 5.28TiB
>>> devid 1 size 2.73TiB used 2.57TiB path /dev/sde
>>> devid 2 size 2.73TiB used 2.73TiB path /dev/sdc
>>> devid 3 size 2.73TiB used 2.73TiB path /dev/sdd
>>> btrfs-progs v4.1.1
>>> Already the 2.57TiB is a bit surprising:
>>> root@homeserver:/mnt# btrfs fi df /mnt/new_storage/
>>> Data, single: total=2.55TiB, used=2.55TiB
>>> Data, RAID5: total=2.73TiB, used=2.72TiB
>>> System, RAID5: total=32.00MiB, used=736.00KiB
>>> Metadata, RAID1: total=6.00GiB, used=5.33GiB
>>> Metadata, RAID5: total=3.00GiB, used=2.99GiB
>> Looking at the btrfs fi show output, you've probably run out of space during the conversion, probably due to an uneven distribution of the original single chunks.
>> I think I would suggest balancing the single chunks, and trying the
>> conversion (of the unconverted parts) again:
>>
>>   # btrfs balance start -dprofiles=single -mprofiles=raid1 /mnt/new_storage/
>>   # btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft /mnt/new_storage/
>
> Yep, I bet that's it also. btrfs fi usage might be better at exposing
> this case.

--
Hendrik Friedel
Auf dem Brink 12
28844 Weyhe
Tel. 04203 8394854
Mobil 0178 1874363
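Hugo's two-step suggestion (compact the leftovers first, then a soft conversion that skips chunks already in the target profile) can be scripted. The sketch below only prints the commands; the `run` wrapper is a dry-run device of mine, and `/mnt/new_storage` is the mount point from this thread:

```shell
MNT=/mnt/new_storage            # mount point from this thread; adjust as needed
run() { echo "+ $*"; }          # dry run: print only; change body to "$@" to execute

# 1) rewrite the remaining single/raid1 chunks without converting anything
run btrfs balance start -dprofiles=single -mprofiles=raid1 "$MNT"
# 2) convert only the not-yet-raid5 chunks ("soft" skips converted ones)
run btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft "$MNT"
```

Keeping the two steps separate means step 2 can be re-run after an ENOSPC without redoing work already converted.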
Re: Data single *and* raid?
Hello Hugo, hello Chris,

thanks for your advice. Now I am here:

  btrfs balance start -dprofiles=single -mprofiles=raid1 /mnt/__Complete_Disk/
  Done, had to relocate 0 out of 3939 chunks

  root@homeserver:/mnt/__Complete_Disk# btrfs fi show
  Label: none  uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
        Total devices 3  FS bytes used 3.78TiB
        devid 1 size 2.73TiB used 2.72TiB path /dev/sde
        devid 2 size 2.73TiB used 2.23TiB path /dev/sdc
        devid 3 size 2.73TiB used 2.73TiB path /dev/sdd

  btrfs-progs v4.1.1

So, that looks good. But then:

  root@homeserver:/mnt/__Complete_Disk# btrfs fi df /mnt/__Complete_Disk/
  Data, RAID5: total=3.83TiB, used=3.78TiB
  System, RAID5: total=32.00MiB, used=576.00KiB
  Metadata, RAID5: total=6.46GiB, used=4.84GiB
  GlobalReserve, single: total=512.00MiB, used=0.00B

Is the RAID5 expected here? I did not yet run:

  btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft /mnt/new_storage/

Regards,
Hendrik

On 01.08.2015 22:44, Chris Murphy wrote:
> On Sat, Aug 1, 2015 at 2:32 PM, Hugo Mills <h...@carfax.org.uk> wrote:
>> On Sat, Aug 01, 2015 at 10:09:35PM +0200, Hendrik Friedel wrote:
>>> Hello, I converted an array to raid5 by
>>>
>>>   btrfs device add /dev/sdd /mnt/new_storage
>>>   btrfs device add /dev/sdc /mnt/new_storage
>>>   btrfs balance start -dconvert=raid5 -mconvert=raid5 /mnt/new_storage/
>>>
>>> The balance went through. But now:
>>>
>>>   Label: none  uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
>>>         Total devices 3  FS bytes used 5.28TiB
>>>         devid 1 size 2.73TiB used 2.57TiB path /dev/sde
>>>         devid 2 size 2.73TiB used 2.73TiB path /dev/sdc
>>>         devid 3 size 2.73TiB used 2.73TiB path /dev/sdd
>>>
>>>   btrfs-progs v4.1.1
>>>
>>> Already the 2.57TiB is a bit surprising:
>>>
>>>   root@homeserver:/mnt# btrfs fi df /mnt/new_storage/
>>>   Data, single: total=2.55TiB, used=2.55TiB
>>>   Data, RAID5: total=2.73TiB, used=2.72TiB
>>>   System, RAID5: total=32.00MiB, used=736.00KiB
>>>   Metadata, RAID1: total=6.00GiB, used=5.33GiB
>>>   Metadata, RAID5: total=3.00GiB, used=2.99GiB
>>
>> Looking at the btrfs fi show output, you've probably run out of space
>> during the conversion, probably due to an uneven distribution of the
>> original single chunks. I think I would suggest balancing the single
>> chunks, and trying the conversion (of the unconverted parts) again:
>>
>>   # btrfs balance start -dprofiles=single -mprofiles=raid1 /mnt/new_storage/
>>   # btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft /mnt/new_storage/
>
> Yep, I bet that's it also. btrfs fi usage might be better at exposing
> this case.
Re: Data single *and* raid?
Hello,

> Looking at the btrfs fi show output, you've probably run out of space
> during the conversion, probably due to an uneven distribution of the
> original single chunks. I think I would suggest balancing the single
> chunks, and trying the conversion (of the unconverted parts) again:
>
>   # btrfs balance start -dprofiles=single -mprofiles=raid1 /mnt/new_storage/
>   # btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft /mnt/new_storage/
>
> Yep, I bet that's it also. btrfs fi usage might be better at exposing
> this case.

Thanks for your hints. The balance has now been running for about 11 hours. The status is a bit surprising to me:

  0 out of about 2619 chunks balanced (4165 considered), 100% left

btrfs fi usage is also surprising:

  Overall:
    Device size:           8.19TiB
    Device allocated:      2.56TiB
    Device unallocated:    5.62TiB
    Device missing:        0.00B
    Used:                  2.56TiB
    Free (estimated):     11.65TiB (min: 2.81TiB)
    Data ratio:            0.48
    Metadata ratio:        1.33
    Global reserve:      512.00MiB (used: 0.00B)

  Data,single: Size:2.55TiB, Used:2.55TiB
    /dev/sdc    1.60TiB
    /dev/sde  975.44GiB

  Data,RAID5: Size:2.73TiB, Used:2.72TiB
    /dev/sdc    1.12TiB
    /dev/sdd    2.73TiB
    /dev/sde    1.61TiB

  Metadata,RAID1: Size:6.00GiB, Used:5.33GiB
    /dev/sdc    5.00GiB
    /dev/sdd    1.00GiB
    /dev/sde    6.00GiB

  Metadata,RAID5: Size:3.00GiB, Used:2.99GiB
    /dev/sdc    2.00GiB
    /dev/sdd    1.00GiB
    /dev/sde    3.00GiB

  System,RAID5: Size:32.00MiB, Used:736.00KiB
    /dev/sdd   32.00MiB
    /dev/sde   32.00MiB

  Unallocated:
    /dev/sdc    1.02MiB
    /dev/sdd    1.02MiB
    /dev/sde  164.59GiB

I hope that is just because the RAID5 accounting is not fully implemented yet?

Regards,
Hendrik
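If the ratio fields mean what they appear to mean (raw bytes allocated divided by logical bytes stored), then the mix of single and RAID5 data chunks above should give a data ratio above 1, not the reported 0.48. A quick check under that assumption, using the sizes from the fi usage output:

```shell
# expected data ratio for 2.55TiB single (ratio 1.0) plus 2.73TiB RAID5
# on 3 devices (ratio n/(n-1) = 1.5); assumption: ratio = raw / logical
awk 'BEGIN {
    single = 2.55; raid5 = 2.73; n = 3
    ratio = (single * 1.0 + raid5 * n / (n - 1)) / (single + raid5)
    printf "%.2f\n", ratio          # -> 1.26
}'
```

That the tool prints 0.48 instead is consistent with the author's suspicion that raid5/6 space accounting was still incomplete in the tools of that era.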
Data single *and* raid?
Hello,

I converted an array to raid5 by

  btrfs device add /dev/sdd /mnt/new_storage
  btrfs device add /dev/sdc /mnt/new_storage
  btrfs balance start -dconvert=raid5 -mconvert=raid5 /mnt/new_storage/

The balance went through. But now:

  Label: none  uuid: a8af3832-48c7-4568-861f-e80380dd7e0b
        Total devices 3  FS bytes used 5.28TiB
        devid 1 size 2.73TiB used 2.57TiB path /dev/sde
        devid 2 size 2.73TiB used 2.73TiB path /dev/sdc
        devid 3 size 2.73TiB used 2.73TiB path /dev/sdd

  btrfs-progs v4.1.1

Already the 2.57TiB is a bit surprising:

  root@homeserver:/mnt# btrfs fi df /mnt/new_storage/
  Data, single: total=2.55TiB, used=2.55TiB
  Data, RAID5: total=2.73TiB, used=2.72TiB
  System, RAID5: total=32.00MiB, used=736.00KiB
  Metadata, RAID1: total=6.00GiB, used=5.33GiB
  Metadata, RAID5: total=3.00GiB, used=2.99GiB

Why is there Data in both single and RAID5? Why is Metadata in both RAID1 and RAID5? A scrub is currently running and has shown no errors yet.

Greetings,
Hendrik
kernel BUG at ctree.c:5196
Hello,

I recently added a third device to my raid and converted it from raid0 to raid5 via balance (dconvert, mconvert). Unfortunately, the new device was faulty. I wrote about this on this list in the thread "size 2.73TiB used 240.97GiB after balance". Initially the system was very unstable when trying to access this volume (even with ro, degraded, recovery), but now I was able to recover the data from the disk. I think it helped to attach the disk directly rather than in its USB enclosure.

Now I have two remaining points:
a) I want to contribute to making btrfs more stable
b) I would like to repair the volume

When running btrfs check --repair, I get:

  btrfs check --repair /dev/sdb
  enabling repair mode
  Checking filesystem on /dev/sdb
  UUID: b4a6cce6-dc9c-4a13-80a4-ed6bc5b40bb8
  checking extents
  parent transid verify failed on 8300132024320 wanted 114915 found 114837
  parent transid verify failed on 8300132024320 wanted 114915 found 114837
  checksum verify failed on 8300132024320 found C3925FB7 wanted 2771E201
  parent transid verify failed on 8300132024320 wanted 114915 found 114837
  Ignoring transid failure
  parent transid verify failed on 8300132024320 wanted 114915 found 114837
  Ignoring transid failure
  owner ref check failed [8300132024320 16384]
  Unable to find block group for 0
  extent-tree.c:289: find_search_start: Assertion `1` failed.
  btrfs[0x442311]
  btrfs(btrfs_reserve_extent+0x8a4)[0x4477f5]
  btrfs(btrfs_alloc_free_block+0x57)[0x447b43]
  btrfs(__btrfs_cow_block+0x163)[0x4389f7]
  btrfs(btrfs_cow_block+0xd0)[0x439304]
  btrfs(btrfs_search_slot+0x16f)[0x43b996]
  btrfs[0x420dcb]
  btrfs[0x4284d6]
  btrfs(cmd_check+0x14a9)[0x42a464]
  btrfs(main+0x153)[0x409f37]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f22b8161ec5]
  btrfs[0x409af9]

When now running a balance (dconvert=raid0 mconvert=raid0), this shows up in the syslog:

  Jul 19 22:43:33 homeserver kernel: [  219.670944] BTRFS: checking UUID tree
  Jul 19 22:43:33 homeserver kernel: [  219.670951] BTRFS info (device sde): continuing balance
  Jul 19 22:43:35 homeserver kernel: [  221.108745] BTRFS info (device sde): relocating block group 8175533162496 flags 1
  Jul 19 22:43:35 homeserver kernel: [  221.392879] BTRFS (device sde): parent transid verify failed on 8300132024320 wanted 114915 found 114837
  Jul 19 22:43:35 homeserver kernel: [  221.409515] BTRFS (device sde): parent transid verify failed on 8300132024320 wanted 114915 found 114837
  Jul 19 22:43:35 homeserver kernel: [  221.409564] kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:5196!
  Jul 19 22:43:35 homeserver kernel: [  221.409594] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables autofs4 bridge stp llc nfsd auth_rpcgss nfs_acl nfs lockd binfmt_misc grace sunrpc fscache stv6110x lnbp21 intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic stv090x cryptd snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core serio_raw snd_hwdep ddbridge snd_pcm i915 cxd2099(C) dvb_core snd_seq_midi media snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer lpc_ich drm_kms_helper snd drm i2c_algo_bit soundcore mei_me shpchp mei video mac_hid lp parport btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 hid_generic raid0 e1000e usbhid multipath linear psmouse ptp ahci libahci hid pps_core
  Jul 19 22:43:35 homeserver kernel: [  221.409923] CPU: 0 PID: 4529 Comm: btrfs-uuid Tainted: G C 4.1.0-040100rc5-generic #201505250235
  Jul 19 22:43:35 homeserver kernel: [  221.409989] RIP: 0010:[c01687b2] [c01687b2] btrfs_search_forward+0x312/0x330 [btrfs]
  Jul 19 22:43:35 homeserver kernel: [  221.410261] [c0160a47] ? btrfs_release_path+0x77/0xb0 [btrfs]
  Jul 19 22:43:35 homeserver kernel: [  221.410289] [c01b243c] btrfs_uuid_scan_kthread+0xec/0x3b0 [btrfs]
  Jul 19 22:43:35 homeserver kernel: [  221.410317] [c01b2700] ? btrfs_uuid_scan_kthread+0x3b0/0x3b0 [btrfs]
  Jul 19 22:43:35 homeserver kernel: [  221.410343] [c01b2732] btrfs_uuid_rescan_kthread+0x32/0x70 [btrfs]
  Jul 19 22:43:35 homeserver kernel: [  221.410525] RIP [c01687b2] btrfs_search_forward+0x312/0x330 [btrfs]

I can now play a bit with the file system if that is valuable for you. When able to recover, I would also profit, because I would save a bit of time getting all the data together again (now spread over drives), but it's not
Re: size 2.73TiB used 240.97GiB after balance
Hello Donald,

thanks for your reply. I appreciate your help.

> I would use recover to get the data if at all possible, then you can
> experiment with trying to fix the degraded condition live. If you have
> any chance of getting data from the pool, you reduce that chance every
> time you make a change.

Ok, you assume that btrfs restore is the most likely way of recovering data. But if mounting degraded, scrubbing, btrfsck, ... are more successful, your proposal is more risky, isn't it? With a dd image I can always go back to today's status.

> If btrfs did the balance like you said, it wouldn't be raid5. What you
> just described is raid4, where only one drive holds parity data. I
> can't say that I actually know for a fact that btrfs doesn't do this,
> but I'd be shocked, and some dev would need to eat their underwear if
> the balance job didn't distribute the parity also.

Ok, I was not aware of the difference between raid4 and raid5.

So, I did try a btrfs restore:

  warning devid 3 not found already
  Check tree block failed, want=8300102483968, have=65536
  Check tree block failed, want=8300102483968, have=65536
  Check tree block failed, want=8300102483968, have=65536
  read block failed check_tree_block
  Couldn't setup extent tree

[it is still running]

btrfs-find-root gives me:
http://paste.ubuntu.com/11844005/
http://paste.ubuntu.com/11844009/
(on the two disks)

btrfs-show-super:
http://paste.ubuntu.com/11844016/

Greetings,
Hendrik
Re: size 2.73TiB used 240.97GiB after balance
Hello,

yes, I will check the cables, thanks for the hint.

Before trying to recover the data, I would like to save the status quo. I have two new drives. Is it advisable to dd-copy the data onto the new drives and then to try to recover? I am asking because I suppose that dd will also copy the UUID, which might confuse btrfs (two drives with the same UUID attached)?

And then I have a technical question on btrfs balance when converting to raid5 (from raid1): does the balance create the parity information on the newly-added (empty) drive, so that the data on the two original disks is not touched at all?

Regards,
Hendrik

On 07.07.2015 15:14, Donald Pearson wrote:
> That's what it looks like. You may want to try reseating cables, etc.
> Instead of mounting and file copy, btrfs restore might be worth a shot
> to recover what you can.
>
> On Tue, Jul 7, 2015 at 12:42 AM, Hendrik Friedel <hend...@friedels.name> wrote:
>> Hello,
>>
>> while mounting works with the recovery option, the system locks after
>> reading. dmesg shows:
>>
>>   [  684.258246] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
>>   [  684.258249] ata6.00: irq_stat 0x4001
>>   [  684.258252] ata6.00: failed command: DATA SET MANAGEMENT
>>   [  684.258255] ata6.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 26 dma 512 out
>>   [  684.258255]          res 51/04:01:01:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
>>   [  684.258256] ata6.00: status: { DRDY ERR }
>>   [  684.258258] ata6.00: error: { ABRT }
>>   [  684.258266] sd 5:0:0:0: [sdd] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>>   [  684.258268] sd 5:0:0:0: [sdd] tag#26 Sense Key : Illegal Request [current] [descriptor]
>>   [  684.258270] sd 5:0:0:0: [sdd] tag#26 Add. Sense: Unaligned write command
>>   [  684.258272] sd 5:0:0:0: [sdd] tag#26 CDB: Write same(16) 93 08 00 00 00 00 00 01 d3 80 00 00 00 80 00 00
>>
>> So, also this drive is failing?!
>>
>> Regards, Hendrik
>>
>> On 07.07.2015 00:59, Donald Pearson wrote:
>>> Anything in dmesg?
>>> On Mon, Jul 6, 2015 at 5:07 PM, hend...@friedels.name <hend...@friedels.name> wrote:
>>>> Hello,
>>>>
>>>> it seems that mounting works, but the system locks up completely soon
>>>> after I start backing up.
>>>>
>>>> Greetings, Hendrik
>>>>
>>>> -- Original message --
>>>> From: Donald Pearson
>>>> Date: Mon, 6 July 2015 23:49
>>>> To: Hendrik Friedel
>>>> Cc: Omar Sandoval; Hugo Mills; Btrfs BTRFS
>>>> Subject: Re: size 2.73TiB used 240.97GiB after balance
>>>>
>>>> If you can mount it RO, first thing to do is back up any data that
>>>> you care about. According to the bug that Omar posted, you should not
>>>> try a device replace and you should not try a scrub with a missing
>>>> device. You may be able to just do a device delete missing, then
>>>> separately do a device add of a new drive, or rebalance back in to
>>>> raid1.
>>>>
>>>> On Mon, Jul 6, 2015 at 4:12 PM, Hendrik Friedel wrote:
>>>>> Hello,
>>>>>
>>>>> oh dear, I fear I am in trouble: recovery-mounted, I tried to save
>>>>> some data, but the system hung. So I re-booted and sdc is now
>>>>> physically disconnected.
>>>>>
>>>>>   Label: none  uuid: b4a6cce6-dc9c-4a13-80a4-ed6bc5b40bb8
>>>>>         Total devices 3  FS bytes used 4.67TiB
>>>>>         devid 1 size 2.73TiB used 2.67TiB path /dev/sdc
>>>>>         devid 2 size 2.73TiB used 2.67TiB path /dev/sdb
>>>>>   *** Some devices missing
>>>>>
>>>>> I try to mount the rest again:
>>>>>
>>>>>   mount -o recovery,ro /dev/sdb /mnt/__Complete_Disk
>>>>>   mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>>>>>          missing codepage or helper program, or other error
>>>>>          In some cases useful info is found in syslog - try
>>>>>          dmesg | tail or so
>>>>>
>>>>>   root@homeserver:~# dmesg | tail
>>>>>   [  447.059275] BTRFS info (device sdc): enabling auto recovery
>>>>>   [  447.059280] BTRFS info (device sdc): disk space caching is enabled
>>>>>   [  447.086844] BTRFS: failed to read chunk tree on sdc
>>>>>   [  447.110588] BTRFS: open_ctree failed
>>>>>   [  474.496778] BTRFS info (device sdc): enabling auto recovery
>>>>>   [  474.496781] BTRFS info (device sdc): disk space caching is enabled
>>>>>   [  474.519005] BTRFS: failed to read chunk tree on sdc
>>>>>   [  474.540627] BTRFS: open_ctree failed
>>>>>
>>>>>   mount -o degraded,ro /dev/sdb /mnt/__Complete_Disk
>>>>>
>>>>> does work now, though.
>>>>>
>>>>> So, how can I remove the reference to the failed disk and check the
>>>>> data for consistency (scrub I suppose, but is it safe?)?
>>>>>
>>>>> Regards, Hendrik
>>>>>
>>>>> On 06.07.2015 22:52, Omar Sandoval wrote:
>>>>>> On 07/06/2015 01:01 PM, Donald Pearson wrote:
>>>>>>> Based on my experience Hugo's advice is critical: get the bad
>>>>>>> drive out of the pool when in raid56, and do not try to replace or
>>>>>>> delete it while it's still attached and recognized. If you add a
>>>>>>> new device, mount degraded and rebalance. If you don't, mount
>>>>>>> degraded then device delete missing.
>>>>>>
>>>>>> Watch out, replacing a missing device in RAID 5/6 currently doesn't
>>>>>> work and will cause a kernel BUG(). See my patch series here:
>>>>>> http://www.spinics.net/lists/linux-btrfs/msg44874.html
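On the dd question in the message above: imaging each member to a file keeps the originals untouched, and the UUID concern is real — a dd clone carries the same filesystem UUID, so clone and original must never be attached at the same time. A dry-run sketch (the paths are placeholders, and the `run` wrapper only prints the commands):

```shell
BACKUP=/mnt/backup              # placeholder: must NOT live on the btrfs being imaged
run() { echo "+ $*"; }          # dry run: print only; change body to "$@" to execute

# image every remaining member read-only, tolerating unreadable sectors
for dev in /dev/sdb /dev/sdc; do
    run dd if="$dev" of="$BACKUP/$(basename "$dev").img" bs=64K conv=sync,noerror
done
```

Working on the images (e.g. via loop devices) instead of the disks means any repair experiment can be rolled back by re-copying.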
Re: size 2.73TiB used 240.97GiB after balance
Hello,

oh dear, I fear I am in trouble: recovery-mounted, I tried to save some data, but the system hung. So I re-booted and sdc is now physically disconnected.

  Label: none  uuid: b4a6cce6-dc9c-4a13-80a4-ed6bc5b40bb8
        Total devices 3  FS bytes used 4.67TiB
        devid 1 size 2.73TiB used 2.67TiB path /dev/sdc
        devid 2 size 2.73TiB used 2.67TiB path /dev/sdb
  *** Some devices missing

I try to mount the rest again:

  mount -o recovery,ro /dev/sdb /mnt/__Complete_Disk
  mount: wrong fs type, bad option, bad superblock on /dev/sdb,
         missing codepage or helper program, or other error
         In some cases useful info is found in syslog - try
         dmesg | tail or so

  root@homeserver:~# dmesg | tail
  [  447.059275] BTRFS info (device sdc): enabling auto recovery
  [  447.059280] BTRFS info (device sdc): disk space caching is enabled
  [  447.086844] BTRFS: failed to read chunk tree on sdc
  [  447.110588] BTRFS: open_ctree failed
  [  474.496778] BTRFS info (device sdc): enabling auto recovery
  [  474.496781] BTRFS info (device sdc): disk space caching is enabled
  [  474.519005] BTRFS: failed to read chunk tree on sdc
  [  474.540627] BTRFS: open_ctree failed

  mount -o degraded,ro /dev/sdb /mnt/__Complete_Disk

does work now, though.

So, how can I remove the reference to the failed disk and check the data for consistency (scrub I suppose, but is it safe?)?

Regards, Hendrik

On 06.07.2015 22:52, Omar Sandoval wrote:
> On 07/06/2015 01:01 PM, Donald Pearson wrote:
>> Based on my experience Hugo's advice is critical: get the bad drive
>> out of the pool when in raid56, and do not try to replace or delete it
>> while it's still attached and recognized. If you add a new device,
>> mount degraded and rebalance. If you don't, mount degraded then device
>> delete missing.
>
> Watch out, replacing a missing device in RAID 5/6 currently doesn't
> work and will cause a kernel BUG(). See my patch series here:
> http://www.spinics.net/lists/linux-btrfs/msg44874.html
size 2.73TiB used 240.97GiB after balance
Hello,

I started with a raid1:

  devid 1 size 2.73TiB used 2.67TiB path /dev/sdd
  devid 2 size 2.73TiB used 2.67TiB path /dev/sdb

Then I added a third device, /dev/sdc1, and ran a balance:

  btrfs balance start -dconvert=raid5 -mconvert=raid5 /mnt/__Complete_Disk/

Now the file system looks like this:

  Total devices 3  FS bytes used 4.68TiB
  devid 1 size 2.73TiB used 2.67TiB path /dev/sdd
  devid 2 size 2.73TiB used 2.67TiB path /dev/sdb
  devid 3 size 2.73TiB used 240.97GiB path /dev/sdc1

I am surprised by the 240.97GiB... In the syslog and dmesg I find several:

  [108274.415499] btrfs_dev_stat_print_on_error: 8 callbacks suppressed
  [108279.840334] btrfs_dev_stat_print_on_error: 12 callbacks suppressed

What's wrong here?

Regards,
Hendrik
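The suppressed btrfs_dev_stat_print_on_error messages mean per-device error counters are being incremented; `btrfs device stats <mountpoint>` prints them. A tiny filter (a sketch, assuming the usual `[...].counter value` output format, with `nonzero_stats` being a made-up name) that keeps only counters that are not zero:

```shell
# keep only lines whose last field (the error counter) is nonzero
nonzero_stats() { awk '$NF + 0 != 0'; }

# usage on a live system (hypothetical mount point):
#   btrfs device stats /mnt/__Complete_Disk | nonzero_stats
```

On a healthy array the pipeline prints nothing; any surviving line names the device and counter (read/write/flush/corruption/generation) that is accumulating errors.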
Re: size 2.73TiB used 240.97GiB after balance
Hello,

ok, sdc seems to have failed (sorry, I checked only the sdd and sdb SMART values, as sdc is brand new; maybe a bad assumption on my side). I have mounted the device with

  mount -o recovery,ro

So, what should I do now:

  btrfs device delete /dev/sdc /mnt

or

  mount -o degraded /dev/sdb /mnt
  btrfs device delete missing /mnt

I do have a backup of the most valuable data. But if you consider one of the above options risky, I might better get a new drive first, though this might take a couple of days (in which sdc could degrade further). What is your recommendation?

Regards,
Hendrik
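Given the advice later in this thread (do not delete a raid56 device while it is still attached and recognized), the second option would run along these lines. The sketch below is dry-run only — the `run` wrapper prints the commands instead of executing them, and `/mnt` is a placeholder:

```shell
MNT=/mnt                        # placeholder mount point
run() { echo "+ $*"; }          # dry run: print only; change body to "$@" to execute

# physically detach the failing disk first, then:
run mount -o degraded /dev/sdb "$MNT"
run btrfs device delete missing "$MNT"
```

The "missing" keyword tells btrfs to drop whichever member is absent, which avoids touching the failing disk at all.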
Re: size 2.73TiB used 240.97GiB after balance
Hello,

while mounting works with the recovery option, the system locks after reading. dmesg shows:

  [  684.258246] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
  [  684.258249] ata6.00: irq_stat 0x4001
  [  684.258252] ata6.00: failed command: DATA SET MANAGEMENT
  [  684.258255] ata6.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 26 dma 512 out
  [  684.258255]          res 51/04:01:01:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
  [  684.258256] ata6.00: status: { DRDY ERR }
  [  684.258258] ata6.00: error: { ABRT }
  [  684.258266] sd 5:0:0:0: [sdd] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  [  684.258268] sd 5:0:0:0: [sdd] tag#26 Sense Key : Illegal Request [current] [descriptor]
  [  684.258270] sd 5:0:0:0: [sdd] tag#26 Add. Sense: Unaligned write command
  [  684.258272] sd 5:0:0:0: [sdd] tag#26 CDB: Write same(16) 93 08 00 00 00 00 00 01 d3 80 00 00 00 80 00 00

So, also this drive is failing?!

Regards, Hendrik

On 07.07.2015 00:59, Donald Pearson wrote:
> Anything in dmesg?
>
> On Mon, Jul 6, 2015 at 5:07 PM, hend...@friedels.name <hend...@friedels.name> wrote:
>> Hello,
>>
>> it seems that mounting works, but the system locks up completely soon
>> after I start backing up.
>>
>> Greetings, Hendrik
>>
>> -- Original message --
>> From: Donald Pearson
>> Date: Mon, 6 July 2015 23:49
>> To: Hendrik Friedel
>> Cc: Omar Sandoval; Hugo Mills; Btrfs BTRFS
>> Subject: Re: size 2.73TiB used 240.97GiB after balance
>>
>> If you can mount it RO, first thing to do is back up any data that you
>> care about. According to the bug that Omar posted, you should not try
>> a device replace and you should not try a scrub with a missing device.
>> You may be able to just do a device delete missing, then separately do
>> a device add of a new drive, or rebalance back in to raid1.
>>
>> On Mon, Jul 6, 2015 at 4:12 PM, Hendrik Friedel wrote:
>>> Hello,
>>>
>>> oh dear, I fear I am in trouble: recovery-mounted, I tried to save
>>> some data, but the system hung. So I re-booted and sdc is now
>>> physically disconnected.
>>>
>>>   Label: none  uuid: b4a6cce6-dc9c-4a13-80a4-ed6bc5b40bb8
>>>         Total devices 3  FS bytes used 4.67TiB
>>>         devid 1 size 2.73TiB used 2.67TiB path /dev/sdc
>>>         devid 2 size 2.73TiB used 2.67TiB path /dev/sdb
>>>   *** Some devices missing
>>>
>>> I try to mount the rest again:
>>>
>>>   mount -o recovery,ro /dev/sdb /mnt/__Complete_Disk
>>>   mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>>>          missing codepage or helper program, or other error
>>>          In some cases useful info is found in syslog - try
>>>          dmesg | tail or so
>>>
>>>   root@homeserver:~# dmesg | tail
>>>   [  447.059275] BTRFS info (device sdc): enabling auto recovery
>>>   [  447.059280] BTRFS info (device sdc): disk space caching is enabled
>>>   [  447.086844] BTRFS: failed to read chunk tree on sdc
>>>   [  447.110588] BTRFS: open_ctree failed
>>>   [  474.496778] BTRFS info (device sdc): enabling auto recovery
>>>   [  474.496781] BTRFS info (device sdc): disk space caching is enabled
>>>   [  474.519005] BTRFS: failed to read chunk tree on sdc
>>>   [  474.540627] BTRFS: open_ctree failed
>>>
>>>   mount -o degraded,ro /dev/sdb /mnt/__Complete_Disk
>>>
>>> does work now, though.
>>>
>>> So, how can I remove the reference to the failed disk and check the
>>> data for consistency (scrub I suppose, but is it safe?)?
>>>
>>> Regards, Hendrik
>>>
>>> On 06.07.2015 22:52, Omar Sandoval wrote:
>>>> On 07/06/2015 01:01 PM, Donald Pearson wrote:
>>>>> Based on my experience Hugo's advice is critical: get the bad drive
>>>>> out of the pool when in raid56, and do not try to replace or delete
>>>>> it while it's still attached and recognized. If you add a new
>>>>> device, mount degraded and rebalance. If you don't, mount degraded
>>>>> then device delete missing.
>>>>
>>>> Watch out, replacing a missing device in RAID 5/6 currently doesn't
>>>> work and will cause a kernel BUG(). See my patch series here:
>>>> http://www.spinics.net/lists/linux-btrfs/msg44874.html
Very high load when reading/writing
Dear all,

I have a very high load when writing/reading to/from two of my btrfs volumes: one is sda1, mounted as /mnt/BTRFS, the other is sdd2/sde2 (raid), mounted as /. sda1 is a 3TB disk, whereas sdd2/sde2 are small SSDs of 16GB.

I wrote a small script to demonstrate it. It does:
- echo what it will do
- show the current load
- dd from one volume to the other
- show the current load
- sync and flush the cache
- sleep 300s in order to get the load down again

Here the output:

  Test from /mnt/BTRFS to /tmp
  1.05 0.55, 0.41
  124,553 s, 16,8 MB/s
  6.98 2.94, 1.30

  Test /mnt/BTRFS to /mnt/BTRFS
  0.23 1.32, 1.10
  127,008 s, 16,5 MB/s
  4.76 2.82, 1.69

  Test /mnt/BTRFS to /dev/null
  0.17 1.29, 1.39
  21,9972 s, 95,3 MB/s
  0.64 1.31, 1.39

  Test from /tmp to /mnt/BTRFS
  0.23 0.64, 1.08
  124,655 s, 16,8 MB/s
  8.63 3.44, 2.03

I'm sure this is not normal, is it? What I mean: the load is very high and the data rate is very low. Below is some information on the filesystems and disks. I'd appreciate any help to understand what's wrong.

Regards,
Hendrik

  # ~/btrfs/integration/devel/btrfs fi show /mnt/BTRFS/Video/
  Label: 'Daten'  uuid: d3ba0e97-24ae-4f94-b407-05bf2cd4ddf4
        Total devices 1  FS bytes used 2.31TiB
        devid 1 size 2.73TiB used 2.35TiB path /dev/sda1

  Btrfs this-will-become-v3.13-48-g57c3600

  # ~/btrfs/integration/devel/btrfs fi show /
  Label: 'ROOT_BTRFS_RAID'  uuid: a2d5f2db-04ca-413a-aee1-cb754aa8fba5
        Total devices 2  FS bytes used 7.50GiB
        devid 1 size 14.85GiB used 14.36GiB path /dev/sde2
        devid 2 size 14.65GiB used 14.36GiB path /dev/sdd2

  uname -r
  3.14.0-031400rc4-generic

  ./btrfsck /dev/sda1
  Checking filesystem on /dev/sda1
  UUID: d3ba0e97-24ae-4f94-b407-05bf2cd4ddf4
  checking extents
  checking free space cache
  checking fs roots
  checking csums
  checking root refs
  found 1264928671538 bytes used err is 0
  total csum bytes: 2475071700
  total tree bytes: 2829418496
  total fs tree bytes: 55672832
  total extent tree bytes: 72744960
  btree space waste bytes: 210148896
  file data blocks allocated: 2535102173184
   referenced 2533075963904
  Btrfs this-will-become-v3.13-48-g57c3600

  Checking filesystem on /dev/sdd2
  UUID: a2d5f2db-04ca-413a-aee1-cb754aa8fba5
  checking extents
  checking free space cache
  checking fs roots
  checking csums
  checking root refs
  found 423637793 bytes used err is 0
  total csum bytes: 8078432
  total tree bytes: 421920768
  total fs tree bytes: 393560064
  total extent tree bytes: 18857984
  btree space waste bytes: 71825111
  file data blocks allocated: 16775815168
   referenced 8751009792
  Btrfs this-will-become-v3.13-48-g57c3600

  smartctl -a /dev/sdd2
  smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.14.0-031400rc4-generic] (local build)
  Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

  === START OF INFORMATION SECTION ===
  Device Model:     MXSSD2MSLD16G-V
  Serial Number:    0YWOMT24NF16IB8U
  Firmware Version: 20130221
  User Capacity:    15.837.691.904 bytes [15,8 GB]
  Sector Size:      512 bytes logical/physical
  Device is:        Not in smartctl database [for details use: -P showall]
  ATA Version is:   8
  ATA Standard is:  Exact ATA specification draft version not indicated
  Local Time is:    Thu May  1 19:48:58 2014 CEST
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED

  General SMART Values:
  Offline data collection status:  (0x00) Offline data collection activity
                                          was never started.
                                          Auto Offline Data Collection: Disabled.
  Total time to complete Offline
  data collection:                    (0) seconds.
  Offline data collection
  capabilities:                    (0x00) Offline data collection not supported.
  SMART capabilities:            (0x0002) Does not save SMART data before
                                          entering power-saving mode.
                                          Supports SMART auto save timer.
  Error logging capability:        (0x00) Error logging NOT supported.
                                          No General Purpose Logging support.

  SMART Attributes Data Structure revision number: 1
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate     0x      100   100   050    Old_age  Offline  -           0
    5 Reallocated_Sector_Ct   0x0002  100   100   050    Old_age  Always   -           0
    9 Power_On_Hours          0x      100   100   050    Old_age  Offline  -           4690
   12 Power_Cycle_Count       0x      100   100   050    Old_age  Offline  -           15
  160 Unknown_Attribute       0x      100   100   050    Old_age  Offline  -           0
  161 Unknown_Attribute       0x      100   100   050    Old_age  Offline  -           136
  162 Unknown_Attribute       0x      100   100   050    Old_age  Offline  -
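The benchmark script described above ("echo what it will do, show the load, dd, show the load, sync and flush the cache, sleep") can be sketched as below. `bench` is a hypothetical name, the paths are placeholders, and the cache drop requires root:

```shell
loadavg() { cut -d' ' -f1-3 /proc/loadavg; }     # 1/5/15-minute load averages

# bench SRC DST SIZE_MB -- copy SIZE_MB MiB and report load before/after
bench() {
    src=$1; dst=$2; size_mb=$3
    echo "Test from $src to $dst"
    loadavg                                      # load before
    dd if="$src" of="$dst" bs=1M count="$size_mb" 2>&1 | tail -n 1
    loadavg                                      # load after
    sync
    echo 3 > /proc/sys/vm/drop_caches            # flush page cache (root only)
    sleep 300                                    # let the load decay again
}

# e.g.: bench /mnt/BTRFS/testfile /tmp/testfile 2000
```

Dropping the page cache between runs is what keeps each dd an honest cold-cache measurement rather than a memcpy from RAM.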
Re: free space inode generation (0) did not match free space cache generation
Hello,

after merely 5 days, I have the same problem:

  root@homeserver:~# ./btrfs/integration/devel/btrfs fi df /mnt/test1/
  Disk size:            29.50GiB
  Disk allocated:       29.30GiB
  Disk unallocated:    202.00MiB
  Used:                 13.84GiB
  Free (Estimated):    929.95MiB (Max: 1.01GiB, min: 929.95MiB)
  Data to disk ratio:       50 %

  root@homeserver:~# ./btrfs/integration/devel/btrfs fi show /mnt/test1/
  Label: 'ROOT_BTRFS_RAID'  uuid: a2d5f2db-04ca-413a-aee1-cb754aa8fba5
        Total devices 2  FS bytes used 13.84GiB
        devid 1 size 14.85GiB used 14.65GiB path /dev/sde2
        devid 2 size 14.65GiB used 14.65GiB path /dev/sdd2

  Btrfs this-will-become-v3.13-48-g57c3600

  root@homeserver:~# time ./btrfs/integration/devel/btrfs balance start -dusage=0 /mnt/test1
  Done, had to relocate 0 out of 22 chunks

  real    0m2.734s
  user    0m0.000s
  sys     0m0.022s

I increased dusage until I got:

  root@homeserver:~# time ./btrfs/integration/devel/btrfs balance start -dusage=90 /mnt/test1
  ERROR: error during balancing '/mnt/test1' - No space left on device
  There may be more info in syslog - try dmesg | tail

Before I could do a full balance I had to delete all snapshots: ~20 on my root subvolume, ~40 on my /home and /root subvolumes. I do not find this an extraordinarily high number of snapshots; others should have higher numbers when they use snapper. Any idea what could be the reason here?

Regards,
Hendrik

On 25.03.2014 21:10, Hugo Mills wrote:
> On Tue, Mar 25, 2014 at 09:03:26PM +0100, Hendrik Friedel wrote:
>> Hi,
>>
>>> Well, given the relative immaturity of btrfs as a filesystem at this
>>> point in its lifetime, I think it's acceptable/tolerable. However, for
>>> a filesystem feted[1] to ultimately replace the ext* series as an
>>> assumed Linux default, I'd definitely argue that the current situation
>>> should be changed such that btrfs can automatically manage its own
>>> de-allocation at some point, yes, and that said some point really
>>> needs to come before that point at which btrfs can be considered an
>>> appropriate replacement for ext2/3/4 as the assumed default Linux
>>> filesystem of the day.
>>
>> Agreed! I hope this is on the ToDo list?!
>
> https://btrfs.wiki.kernel.org/index.php/Project_ideas#Block_group_reclaim
>
> Yes. :)
>
>>> [1] feted: celebrated, honored. I had to look it up to be sure my
>>> intuition on usage was correct, and indeed I had spelled it wrong :-)
>
> Did you mean fated: intended, destined?
>
> Hugo.
Re: free space inode generation (0) did not match free space cache generation
Hi, Well, given the relative immaturity of btrfs as a filesystem at this point in its lifetime, I think it's acceptable/tolerable. However, for a filesystem feted[1] to ultimately replace the ext* series as an assumed Linux default, I'd definitely argue that the current situation should be changed such that btrfs can automatically manage its own de-allocation at some point, yes, and that said some point really needs to come before that point at which btrfs can be considered an appropriate replacement for ext2/3/4 as the assumed default Linux filesystem of the day. Agreed! I hope, this is on the ToDo List?! [1] feted: celebrated, honored. I had to look it up to be sure my intuition on usage was correct, and indeed I had spelled it wrong :-) Greetings, Hendrik
Re: free space inode generation (0) did not match free space cache generation
Hello, I read through the FAQ you mentioned, but I must admit that I do not fully understand it. My experience is that it takes a bit of time to soak in. Between time, previous Linux experience, and reading this list for a while, things do make more sense now, but my understanding has definitely changed and deepened over time. Yes, I'm progressing. But I am a bit behind you :-) What I am wondering about is what caused this problem to arise. The filesystem was hardly a week old, never mistreated (powered down without unmounting or so) and not even half full. So what caused all the data chunks to be allocated? I can't really say, but it's worth noting that btrfs can normally allocate chunks, but doesn't (yet?) automatically deallocate them. To deallocate, you balance. Btrfs can reuse areas that have been deleted as the same thing, data or metadata, but it can't switch between them without a balance. Ok, I do understand that. I don't know why it could not automatically deallocate them. But then at least I'd expect it to automatically detect this problem and do a balance when needed. Note that this problem caused my system to become unavailable, and it took days to find out how to fix it (even if the fix was then very quick, thanks to your help). So the most obvious thing is that if you copy a bunch of stuff around so the filesystem is nearing full, then delete a bunch of it, consider checking your btrfs filesystem df/show stats and see whether you need a balance. But like I said, that's obvious. Yes. I did not really do much with the system. I copied everything onto the filesystem, rebooted and let it run for a week. The only thing that I could think of is that I created hourly snapshots with snapper. In fact, in order to be able to do the balance, I had to delete something, so I deleted the snapshots. One possibility off the top of my head: Do you have noatime set in your mount options?
That's definitely recommended with snapshotting, since otherwise, atime updates will be changes to the filesystem metadata since the last snapshot, and thus will add to the difference between snapshots that must be stored. If you're doing hourly snapshots and are accessing much of the filesystem each hour, that'll add up! Really? I do have noatime set, but I would expect the access time to be stored in the metadata. So when snapshotting, only the changed metadata would have to be stored for the files that have been accessed between the two snapshots. That should not be a problem, should it? Additionally, I recommend snapshot thinning. Hourly snapshots are nice, but after some time they just become noise. Will you really know or care which specific hour it was if you're having to retrieve a snapshot from a month ago? In fact, snapper does that for me. Also, it may or may not apply to you, but internal-rewrite (as opposed to simply appended) files are bad news for COW-based filesystems such as btrfs. I don't see any applications that do internal re-writes on my system. Interesting nevertheless, esp. wrt. the possible solution. Thanks. Besides this: You recommend monitoring the output of btrfs fi show and to do a balance, whenever unallocated space drops too low. I can monitor this and let monit send me a message once that happens. Still, I'd like to know how to make this less likely. I haven't had a problem with it here, but then I haven't been doing much snapshotting (and always manual when I do it), I don't run any VMs or large databases, I mounted with the autodefrag option from the beginning, and I've used noatime for nearing a decade now as it was also recommended for my previous filesystem, reiserfs. The only differences are that my snapshotting is automated and the autodefrag is not set. No databases, no VMs, noatime set. It's a simple install of Ubuntu.
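For reference, noatime is set per mount in /etc/fstab; a sketch of such a line (the UUID and subvolume name here are placeholders, not taken from this system):

```
# /etc/fstab: mount the root subvolume without access-time updates
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  btrfs  defaults,noatime,subvol=@  0  0
```

A `mount -o remount,noatime /` applies the same option without a reboot.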
But regardless of my experience with my own usage pattern, I suspect that with reasonable monitoring, you'll eventually become familiar with how fast the chunks are allocated and possibly with what sort of actions beyond the obvious active moving stuff around on the filesystem triggers those allocations, for your specific usage pattern, and can then adapt as necessary. Yes, that's a workaround. But really, that makes one a slave to the filesystem. That's not really acceptable, is it? Regards, Hendrik
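A monit- or cron-friendly check along those lines might parse `btrfs fi show` and warn when any device's unallocated space (size minus used) drops below a threshold. This is a sketch that assumes the GiB-formatted output shown in this thread; the function name is mine:

```shell
#!/bin/sh
# Warn when any device's unallocated space drops below a limit (GiB).
# Reads `btrfs fi show` output on stdin, e.g.:
#   btrfs fi show /mnt/test1 | unalloc_warn 1
unalloc_warn() {
    awk -v limit="$1" '
        $1 == "devid" {
            size = $4; used = $6
            gsub(/GiB/, "", size); gsub(/GiB/, "", used)
            free = size - used
            if (free < limit)
                printf "devid %s: only %.2fGiB unallocated\n", $2, free
        }'
}
```

Hooked into monit or a cron mail, this gives the early warning discussed above without having to eyeball the output by hand.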
free space inode generation (0) did not match free space cache generation
Hello, I have a file-system on which I cannot write anymore (no space left on device, which is not true:

root@homeserver:~/btrfs/integration/devel# df -h
Dateisystem  Größe  Benutzt  Verf.  Verw%  Eingehängt auf
/dev/sdd2    30G    24G      5,1G   83%    /mnt/test1

) About the filesystem:

root@homeserver:~/btrfs/integration/devel# ./btrfs fi show /mnt/test1
Label: 'ROOT_BTRFS_RAID' uuid: a2d5f2db-04ca-413a-aee1-cb754aa8fba5
Total devices 2 FS bytes used 11.84GiB
devid 1 size 14.85GiB used 14.67GiB path /dev/sde2
devid 2 size 14.65GiB used 14.65GiB path /dev/sdd2
Btrfs this-will-become-v3.13-48-g57c3600

Check of the filesystem:

root@homeserver:~/btrfs/integration/devel# umount /mnt/test1
root@homeserver:~/btrfs/integration/devel# ./btrfsck /dev/sdd2
Checking filesystem on /dev/sdd2
UUID: a2d5f2db-04ca-413a-aee1-cb754aa8fba5
checking extents
checking free space cache
free space inode generation (0) did not match free space cache generation (41)
free space inode generation (0) did not match free space cache generation (7380)
free space inode generation (0) did not match free space cache generation (3081)
checking fs roots
checking csums
checking root refs
found 3680170466 bytes used err is 0
total csum bytes: 10071956
total tree bytes: 2398781440
total fs tree bytes: 2308784128
total extent tree bytes: 74203136
btree space waste bytes: 372004575
file data blocks allocated: 341759610880 referenced 75292241920
Btrfs this-will-become-v3.13-48-g57c3600

Before the btrfsck I did a "mount -o clear_cache /dev/sdd2 /mnt/test1/", which in fact reduced the number of error messages (did not match free space cache generation) from more than ten to just three. I do have a backup of the FS, and in fact it would have been quicker just wiping the disk and restoring the backup (just 16GB) than writing this message. But: is it of interest for someone to look at fixing this, so that the development of btrfs can profit from it, or should I just wipe the disc?
Greetings, Hendrik
Re: free space inode generation (0) did not match free space cache generation
Hello, thanks for your help, I appreciate your hint. I think it fixed my problem (a reboot into the system with the fs mounted as root is still outstanding). I read through the FAQ you mentioned, but I must admit that I do not fully understand it. What I am wondering about is what caused this problem to arise. The filesystem was hardly a week old, never mistreated (powered down without unmounting or so) and not even half full. So what caused all the data chunks to be allocated? The only thing that I could think of is that I created hourly snapshots with snapper. In fact, in order to be able to do the balance, I had to delete something, so I deleted the snapshots. Can you tell me where I can read about the causes of this problem? Besides this: you recommend monitoring the output of btrfs fi show and doing a balance whenever unallocated space drops too low. I can monitor this and let monit send me a message once that happens. Still, I'd like to know how to make this less likely. Greetings, Hendrik
Re: Snapper on Ubuntu
Hello, ok, thanks for the explanation. I would find it more intuitive if, by default, all configurations were used (i.e. no -c option means that a snapshot is taken for all configurations). I'll get used to it though :-) Greetings, Hendrik
Re: Snapper on Ubuntu
Hello, Just a recommendation about the config names. At least on openSUSE root is used for /. I would suggest to use home_root for /root like the pam-snapper module does. Thanks for the advice. In fact, on a previous try I had -by chance- used exactly this nomenclature. Then I restarted. Now, can I just rename the files in /etc/snapper/configs and the entries in /etc/sysconfig/snapper? Or do I have to start from scratch (remove all snapshots, all configurations)? Regards, Hendrik
Snapper on Ubuntu
Hello, I am not sure whether this is the right place to ask this question; if not, please advise. Ubuntu installs on btrfs, creating subvolumes for the homes (/home), the root home (/root) and the root (/), named @home, @root and @ respectively. When I install snapper, I configure it like this:

snapper -c rt create-config /
snapper -c home create-config /home
snapper -c root create-config /root
snapper -c Video create-config /mnt/BTRFS/Video/

After executing snapper create several times, this results in:

#btrfs subvolume list /
ID 258 gen 2615 top level 5 path @
ID 259 gen 2611 top level 5 path @root
ID 260 gen 2555 top level 5 path @home
ID 281 gen 2555 top level 5 path @home/.snapshots
ID 282 gen 2606 top level 5 path @root/.snapshots
ID 283 gen 2562 top level 5 path @root/.snapshots/1/snapshot
ID 284 gen 2563 top level 5 path @root/.snapshots/2/snapshot
ID 285 gen 2573 top level 5 path @root/.snapshots/3/snapshot
ID 286 gen 2577 top level 5 path @root/.snapshots/4/snapshot
ID 287 gen 2582 top level 5 path @root/.snapshots/5/snapshot
ID 288 gen 2583 top level 5 path @root/.snapshots/6/snapshot
ID 290 gen 2605 top level 258 path .snapshots
ID 291 gen 2599 top level 5 path @root/.snapshots/7/snapshot
ID 292 gen 2600 top level 5 path @root/.snapshots/8/snapshot
ID 293 gen 2605 top level 5 path @root/.snapshots/9/snapshot

#btrfs subvolume list /mnt/BTRFS/Video/
ID 258 gen 4560 top level 5 path Video
ID 259 gen 4557 top level 258 path VDR
ID 275 gen 672 top level 258 path Filme
ID 284 gen 816 top level 258 path Homevideo
ID 288 gen 1048 top level 258 path VideoSchnitt
ID 1874 gen 1288 top level 5 path rsnapshot
ID 1875 gen 4245 top level 5 path backups
ID 2265 gen 4560 top level 258 path .snapshots

So, this all works for @root only, not for the other subvolumes. Do you have any suggestions how to find the cause?
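One way to start debugging would be to check, for each snapper config, which subvolume its SUBVOLUME= setting points at, and compare that to where the .snapshots subvolumes actually landed. A sketch (the function name is mine, and the config directory is a parameter so it can be tested outside /etc):

```shell
#!/bin/sh
# Print "<config name> -> <subvolume>/.snapshots" for every snapper
# config file in a directory (normally /etc/snapper/configs).
list_snapper_targets() {
    for f in "$1"/*; do
        sub=$(sed -n 's/^SUBVOLUME="\(.*\)"/\1/p' "$f")
        printf '%s -> %s/.snapshots\n' "$(basename "$f")" "$sub"
    done
}
# e.g.: list_snapper_targets /etc/snapper/configs
```

If a config's target does not match one of the @-prefixed subvolumes, that would explain snapshots ending up under the wrong parent.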
Regards, Hendrik
Re: btrfsck does not fix
Hi Chris, It might be worth finding large files to defragment. See the ENOSPC errors during raid1 rebalance thread. It sounds like it might be possible for some fragmented files to be stuck across multiple chunks, preventing conversion. I moved 400GB from my other (but full) disc to the btrfs disc. This freed up 400GB on the full disc, so that I could move the other 400GB to the non-btrfs disc. Essentially, I think this also defragmented all files, as they were freshly written (and as single, so that in fact a balance probably was not necessary anymore). After this, the balance worked, and also the device delete! Nevertheless: I find it concerning that this problem occurred (remember, it was a raid with no SMART errors) and could not be fixed. My understanding was that this should not happen to btrfs even in its current state. Thanks! Greetings, Hendrik
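Chris's "find large files to defragment" suggestion could be approximated in place, without moving everything off and back. A dry-run sketch (it only prints the defragment commands, and the +1G threshold is an arbitrary choice of mine):

```shell
#!/bin/sh
# List defragment commands for every file above a size threshold;
# remove the `echo` in -exec to actually run them.
defrag_candidates() {      # $1 = mount point, $2 = find size (e.g. +1G)
    find "$1" -xdev -type f -size "$2" \
        -exec echo btrfs filesystem defragment {} \;
}
# e.g.: defrag_candidates /mnt/BTRFS/Video +1G
```

Defragmenting the big files first rewrites their extents contiguously, which is roughly what the copy-off-and-back achieved here.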
Re: btrfsck does not fix
Hi Chris, hi Ducan, time ./btrfs balance start -dconvert=single,soft /mnt/BTRFS/Video/ ERROR: error during balancing '/mnt/BTRFS/Video/' - No space left on device There may be more info in syslog - try dmesg | tail real0m23.803s user0m0.000s sys 0m1.070s dmesg: [697498.761318] btrfs: relocating block group 19874593112064 flags 9 [697507.614140] btrfs: relocating block group 19715679322112 flags 9 [697516.218690] btrfs: 2 enospc errors during balance You could try mounting with enospc_debug option and retrying, see if there's more information dmesg. I did this (this is on 3.14 rc4 now): [ 2631.094438] BTRFS info (device sda1): block group has cluster?: no [ 2631.094439] BTRFS info (device sda1): 0 blocks of free space at or bigger than bytes is [ 2631.094440] BTRFS: block group 24946983043072 has 1073741824 bytes, 0 used 0 pinned 0 reserved [ 2631.094441] BTRFS critical (device sda1): entry offset 24946983043072, bytes 1073741824, bitmap no [ 2631.105072] BTRFS info (device sda1): block group has cluster?: no [ 2631.105073] BTRFS info (device sda1): 0 blocks of free space at or bigger than bytes is [ 2631.105074] BTRFS: block group 24948056784896 has 1073741824 bytes, 0 used 0 pinned 0 reserved [ 2631.105075] BTRFS critical (device sda1): entry offset 24948056784896, bytes 1073741824, bitmap no [ 2631.115594] BTRFS info (device sda1): block group has cluster?: no [ 2631.115595] BTRFS info (device sda1): 0 blocks of free space at or bigger than bytes is [ 2631.115596] BTRFS: block group 24949130526720 has 1073741824 bytes, 0 used 0 pinned 0 reserved [ 2631.115597] BTRFS critical (device sda1): entry offset 24949130526720, bytes 1073741824, bitmap no [ 2631.126096] BTRFS info (device sda1): block group has cluster?: no [ 2631.126097] BTRFS info (device sda1): 0 blocks of free space at or bigger than bytes is [ 2635.099492] BTRFS info (device sda1): 2 enospc errors during balance So hopefully the data you need is already backed up and you can just blow this file system 
away. I am lacking space, which is why I did the balance (to free one of the two discs). So, unless the above helps, it seems I need to buy another HDD? David's integration-branch btrfsck tells me: ./btrfsck /dev/sda1 Checking filesystem on /dev/sda1 UUID: 989306aa-d291-4752-8477-0baf94f8c42f checking extents checking free space cache checking fs roots root 256 inode 9579 errors 100, file extent discount root 256 inode 9580 errors 100, file extent discount root 256 inode 14258 errors 100, file extent discount root 256 inode 14259 errors 100, file extent discount root 14155 inode 9579 errors 100, file extent discount root 14155 inode 9580 errors 100, file extent discount root 14155 inode 14258 errors 100, file extent discount root 14155 inode 14259 errors 100, file extent discount root 14251 inode 9579 errors 100, file extent discount root 14251 inode 9580 errors 100, file extent discount ... root 17083 inode 14259 errors 100, file extent discount found 141999189 bytes used err is 1 total csum bytes: 2488683616 total tree bytes: 2811752448 total fs tree bytes: 49192960 total extent tree bytes: 129617920 btree space waste bytes: 83558397 file data blocks allocated: 2585349570560 referenced 2583623929856 Btrfs this-will-become-v3.13-48-g57c3600 So, nothing new, as far as I can tell... Greetings, Hendrik
Re: btrfsck does not fix
Hi Chris, thanks for your hint. No. You said you need to recreate the file system, and only have these two devices and therefore must remove one device. You can't achieve that with raid1 which requires minimum two devices. -dconvert=single -mconvert=dup -sconvert=dup Actually, I'm reminded with multiple devices that dup might not be possible. Instead you might have to use single for all of them. Then remove the device you want removed. And then do another conversion for just -mconvert=dup -sconvert=dup, and do not specify -dconvert. That way the single metadata profile is converted to duplicate. I think it didn't work. btrfs balance start -dconvert=single -mconvert=single -sconvert=single --force /mnt/BTRFS/Video/ After 10h: btrfs balance status /mnt/BTRFS/Video/ No balance found on '/mnt/BTRFS/Video/' root@homeserver:~# btrfs fi df /mnt/BTRFS/Video/ Data, RAID0: total=4.00GB, used=4.00GB Data: total=2.29TB, used=2.29TB System: total=32.00MB, used=256.00KB Metadata: total=4.00GB, used=2.57GB root@homeserver:~# btrfs fi show Label: none uuid: 989306aa-d291-4752-8477-0baf94f8c42f Total devices 2 FS bytes used 2.29TB devid 2 size 2.73TB used 1.15TB path /dev/sdc1 devid 1 size 2.73TB used 1.15TB path /dev/sdb1 (you see that I cleaned up beforehand, so that enough space is available, generally). Do you have an idea what could be wrong? Thanks and Regards, Hendrik -- Hendrik Friedel Auf dem Brink 12 28844 Weyhe Mobil 0178 1874363
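As I read Chris's suggestion, the overall sequence would be roughly the following. Shown as a dry run that only prints the commands (pipe the output to sh to execute); the mount point and device are the ones from this thread:

```shell
#!/bin/sh
# Dry run of the convert-then-remove sequence:
#  1. convert everything to single so one device can hold it all,
#  2. remove the unwanted device,
#  3. restore duplicated metadata/system on the remaining device.
migration_plan() {             # $1 = mount point, $2 = device to drop
    echo "btrfs balance start -dconvert=single -mconvert=single -sconvert=single --force $1"
    echo "btrfs device delete $2 $1"
    echo "btrfs balance start -mconvert=dup -sconvert=dup --force $1"
}
migration_plan /mnt/BTRFS/Video /dev/sdc1
```

The --force is needed because changing the system chunk profile is otherwise refused; step 3 deliberately leaves out -dconvert so data stays single.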
Re: btrfsck does not fix
Hi Chris, thanks for your reply. ./btrfs filesystem show /dev/sdb1 Label: none uuid: 989306aa-d291-4752-8477-0baf94f8c42f Total devices 2 FS bytes used 3.47TiB devid1 size 2.73TiB used 1.74TiB path /dev/sdb1 devid2 size 2.73TiB used 1.74TiB path /dev/sdc1 I don't understand the no spare part. You have 3.47T of data, and yet the single device size is 2.73T. There is no way to migrate 1.74T from sdc1 to sdb1 because there isn't enough space. Fair point. I summed up manually (with du) and apparently missed some data. I can move the 0.8TiB out of the way. I just don't have 3.5TiB 'spare'. btrfs device delete /dev/sdc1 /mnt/BTRFS/rsnapshot/ btrfs device delete /dev/sdc1 /mnt/BTRFS/backups/ btrfs device delete /dev/sdc1 /mnt/BTRFS/Video/ btrfs filesystem balance start /mnt/BTRFS/Video/ I don't understand this sequence because I don't know what you've mounted where, I'm sorry. here you go: /btrfs subvolume list /mnt/BTRFS/Video ID 256 gen 226429 top level 5 path Video -- /mnt/BTRFS/Video/ ID 1495 gen 226141 top level 5 path rsnapshot -- /mnt/BTRFS/rsnapshot ID gen 226429 top level 256 path Snapshot -- not mounted ID 5845 gen 226375 top level 5 path backups -- /mnt/BTRFS/backups but in any case maybe it's a bug that you're not getting errors for each of these commands because you can't delete sdc1 from a raid0 volume. That makes sense. I read that procedure somewhere in the -totally unvalidated- Internet. In case the missing Error-Message is a Bug: Is this place here sufficient to report it, or is there a Bug-Tracker? You'd first have to convert the data, metadata, and system profiles to single (metadata can be set to dup). And then you'd be able to delete a device so long as there's room on remaining devices, which you don't have. Yes, but I can create that space. So, for me the next steps would be to: -generate enough room on the filesystem -btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/BTRFS/Video -btrfs device delete /dev/sdc1 /mnt/BTRFS/Video Right? 
next, I'm doing the balance for the subvolume /mnt/BTRFS/backups You told us above you deleted that subvolume. So how are you balancing it? Yes, that was my understanding from my research: You tell btrfs, that you want to remove one disc from the filesystem and then balance it to move the data on the remaining disc. I did find this logical. I was expecting that I possibly need a further command to tell btrfs that it's not a raid anymore, but I thought this could also be automagical. I understand, that's not the way it is implemented, but it's not a crazy idea, is it? And also, balance applies to a mountpoint, and even if you mount a subvolume to that mountpoint, the whole file system is balanced. Not just the mounted subvolume. That is confusing. (I mean: I understand what you are saying, but it's counterintuitive). Why is this the case? In parallel, I try to delete /mnt/BTRFS/rsnapshot, but it fails: btrfs subvolume delete /mnt/BTRFS/rsnapshot/ Delete subvolume '/mnt/BTRFS/rsnapshot' ERROR: cannot delete '/mnt/BTRFS/rsnapshot' - Inappropriate ioctl for device Why's that? But even more: How do I free sdc1 now?! Well I'm pretty confused because again, I can't tell if your paths refer to subvolumes or if they refer to mount points. Now I am confused. These paths are the paths to which I mounted the subvolumes: my (abbreviated) fstab: UUID=xy /mnt/BTRFS/Video btrfs subvol=Video UUID=xy /mnt/BTRFS/rsnapshot btrfs subvol=rsnapshot UUID=xy /mnt/BTRFS/backups btrfs subvol=backups The balance and device delete commands all refer to a mount point, which is the path returned by the df command. So this: /dev/sdb1 5,5T3,5T 2,0T 64% /mnt/BTRFS/Video /dev/sdb1 5,5T3,5T 2,0T 64% /mnt/BTRFS/backups /dev/sdc1 5,5T3,5T 2,0T 64% /mnt/BTRFS/rsnapshot The subvolume delete command needs a path to subvolume that starts with the mount point. Sorry, this I do not understand, no matter how hard I think about it.. What would it be in my case? Thanks for your help! I appreciate it. 
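My reading of the last point: `btrfs subvolume delete` wants a path to the subvolume beneath some mount of the filesystem, and a subvolume that is itself the thing mounted at that path cannot be addressed that way. A hedged sketch for a top-level subvolume like rsnapshot (the /mnt/top scratch mount point is my invention; shown as a dry run that only prints the commands):

```shell
#!/bin/sh
# Dry run: mount the top-level subvolume (id 5), delete the target
# subvolume by its path beneath that mount, then unmount.
delete_toplevel_subvol() {     # $1 = device, $2 = subvolume name
    echo "mount -o subvolid=5 $1 /mnt/top"
    echo "btrfs subvolume delete /mnt/top/$2"
    echo "umount /mnt/top"
}
delete_toplevel_subvol /dev/sdb1 rsnapshot
```

This also explains the "Inappropriate ioctl for device" above: the delete ioctl was aimed at the mount point itself rather than at a subvolume path inside a mount.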
Greetings, Hendrik
Re: btrfsck does not fix
Hello, Ok. I think I do/did have some symptoms, but I cannot exclude other reasons: -High load without high CPU usage (io was the bottleneck) -Just now: transfer from one directory to the other on the same subvolume (from /mnt/subvol/A/B to /mnt/subvol/A) I get 1.2MB/s instead of 60. -For some of the files I even got a no space left on device error. This is without any messages in dmesg or syslog related to btrfs. As I don't see that I can fix this, I intend to re-create the file-system. For that, I need to remove one of the two discs from the raid/filesystem, then create a new fs on it and move the data to it (I have no spare). Could you please advise me whether this will be successful? First, some information on the filesystem: ./btrfs filesystem show /dev/sdb1 Label: none uuid: 989306aa-d291-4752-8477-0baf94f8c42f Total devices 2 FS bytes used 3.47TiB devid 1 size 2.73TiB used 1.74TiB path /dev/sdb1 devid 2 size 2.73TiB used 1.74TiB path /dev/sdc1 /btrfs subvolume list /mnt/BTRFS/Video ID 256 gen 226429 top level 5 path Video ID 1495 gen 226141 top level 5 path rsnapshot ID gen 226429 top level 256 path Snapshot ID 5845 gen 226375 top level 5 path backups btrfs fi df /mnt/BTRFS/Video/ Data, RAID0: total=3.48TB, used=3.47TB System, RAID1: total=32.00MB, used=260.00KB Metadata, RAID1: total=4.49GB, used=3.85GB What I did already yesterday was: btrfs device delete /dev/sdc1 /mnt/BTRFS/rsnapshot/ btrfs device delete /dev/sdc1 /mnt/BTRFS/backups/ btrfs device delete /dev/sdc1 /mnt/BTRFS/Video/ btrfs filesystem balance start /mnt/BTRFS/Video/ next, I'm doing the balance for the subvolume /mnt/BTRFS/backups In parallel, I try to delete /mnt/BTRFS/rsnapshot, but it fails: btrfs subvolume delete /mnt/BTRFS/rsnapshot/ Delete subvolume '/mnt/BTRFS/rsnapshot' ERROR: cannot delete '/mnt/BTRFS/rsnapshot' - Inappropriate ioctl for device Why's that? But even more: How do I free sdc1 now?!
Greetings, Hendrik
Re: btrfsck does not fix
Hello again: I think I do/did have some symptoms, but I cannot exclude other reasons: -High load without high CPU usage (io was the bottleneck) -Just now: transfer from one directory to the other on the same subvolume (from /mnt/subvol/A/B to /mnt/subvol/A) I get 1.2MB/s instead of 60. -For some of the files I even got a no space left on device error. And the first symptom is also there: top - 21:00:58 up 22:19, 5 users, load average: 1.08, 1.15, 1.09 Tasks: 204 total, 1 running, 203 sleeping, 0 stopped, 0 zombie Cpu(s): 1.5%us, 2.7%sy, 0.3%ni, 66.6%id, 28.6%wa, 0.3%hi, 0.0%si, 0.0%st Mem: 3795584k total, 3614088k used, 181496k free, 367820k buffers Swap: 8293372k total, 45464k used, 8247908k free, 2337704k cached Greetings, Hendrik -- Hendrik Friedel Auf dem Brink 12 28844 Weyhe Mobil 0178 1874363
Re: btrfsck does not fix
Hello, Yes. Here I mount the three subvolumes: Does scrubbing the volume give any errors? The last time I did a scrub (that was after I discovered the first errors in btrfsck), it found no errors. But I will re-check ASAP. As to the error messages: I do not know how critical those are. I usually just scrub my filesystems once in a while and would only try btrfs check on one that fails the scrubbing or has problems mounting or (in some cases) yields strange messages in dmesg. Ok. I think I do/did have some symptoms, but I cannot exclude other reasons: -High load without high CPU usage (io was the bottleneck) -Just now: transfer from one directory to the other on the same subvolume (from /mnt/subvol/A/B to /mnt/subvol/A) I get 1.2MB/s instead of 60. -For some of the files I even got a no space left on device error. This is without any messages in dmesg or syslog related to btrfs. Greetings, Hendrik
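For completeness, the scrub re-check mentioned above would look something like this (shown as a dry run that only prints the commands; the mount point is the one from this thread):

```shell
#!/bin/sh
# Dry run: print the scrub commands. -B waits for the scrub to
# finish and prints its statistics; `status` works on a running one.
scrub_plan() {                 # $1 = mount point
    echo "btrfs scrub start -B $1"
    echo "btrfs scrub status $1"
}
scrub_plan /mnt/BTRFS/Video
```

Scrub reads every copy and verifies checksums, so a clean run here rules out on-disk corruption even when btrfsck still complains about reference counts.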
Re: btrfsck does not fix
Hello, Kernel version? 3.12.0-031200-generic It mounts OK with no kernel messages? Yes. Here I mount the three subvolumes: dmesg: [105152.392900] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105152.394332] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105152.394663] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105152.394759] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105152.394845] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105152.395941] btrfs: disk space caching is enabled [105195.320249] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105195.320256] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105195.320263] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105195.320290] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105195.320308] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105208.832997] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105208.833005] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105208.833026] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 [105208.833030] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 [105208.833032] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Syslog: Jan 12 23:25:43 homeserver kernel: [105152.392900] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:25:43 homeserver kernel: [105152.394332] btrfs: device fsid 
989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:25:43 homeserver kernel: [105152.394663] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:25:43 homeserver kernel: [105152.394759] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:25:43 homeserver kernel: [105152.394845] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:25:43 homeserver kernel: [105152.395941] btrfs: disk space caching is enabled Jan 12 23:26:26 homeserver kernel: [105195.320249] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:26:26 homeserver kernel: [105195.320256] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:26:26 homeserver kernel: [105195.320263] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:26:26 homeserver kernel: [105195.320290] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:26:26 homeserver kernel: [105195.320308] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:26:39 homeserver kernel: [105208.832997] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:26:39 homeserver kernel: [105208.833005] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:26:39 homeserver kernel: [105208.833026] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 164942 /dev/sdb1 Jan 12 23:26:39 homeserver kernel: [105208.833030] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 Jan 12 23:26:39 homeserver kernel: [105208.833032] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 164942 /dev/sdc1 What do you get for: 
btrfs fi show

./btrfs/btrfs-progs/btrfs fi show
Label: none  uuid: 989306aa-d291-4752-8477-0baf94f8c42f
        Total devices 2 FS bytes used 4.37TiB
        devid 1 size 2.73TiB used 2.73TiB path /dev/sdb1
        devid 2 size 2.73TiB used 2.73TiB path /dev/sdc1

Btrfs v3.12

btrfs fi df mp

./btrfs/btrfs-progs/btrfs fi df /mnt/BTRFS/rsnapshot/
Data, RAID0: total=5.45TiB, used=4.37TiB
System, RAID1: total=8.00MiB, used=396.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=6.00GiB, used=5.41GiB
Metadata, single: total=8.00MiB, used=0.00

(for all subvolumes)
Re: btrfsck does not fix
Hello,

I was wondering whether I am doing something wrong in how, or in what, I am asking. My understanding is that btrfsck is not yet able to fix this error. So I am surprised that, apparently, no one is interested in this?

Regards,
Hendrik Friedel

Am 07.01.2014 21:38, schrieb Hendrik Friedel:

Hello,

I ran btrfsck on my volume with the repair option. When I re-run it, I get the same errors as before.

It mounts without errors? So why then btrfsck/btrfs repair? What precipitated the repair?

I don't know what caused the damage, but a check revealed this:

Checking filesystem on /dev/sdb1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
Extent back ref already exists for 2994950590464 parent 863072366592 root 0
Extent back ref already exists for 2994950836224 parent 863072366592 root 0
Extent back ref already exists for 862762737664 parent 863072366592 root 0
Extent back ref already exists for 2994950877184 parent 863072366592
[...]
Incorrect global backref count on 2995767250944 found 1 wanted 2
backpointer mismatch on [2995767250944 4096]
ref mismatch on [2995767304192 4096] extent item 1, found 2
Incorrect global backref count on 2995767304192 found 1 wanted 2
backpointer mismatch on [2995767304192 4096]
ref mismatch on [2995768258560 4096] extent item 1, found 2
Incorrect global backref count on 2995768258560 found 1 wanted 2
backpointer mismatch on [2995768258560 4096]
ref mismatch on [2995768459264 4096] extent item 1, found 2
Incorrect global backref count on 2995768459264 found 1 wanted 2
backpointer mismatch on [2995768459264 4096]
Errors found in extent allocation tree or chunk allocation
ref mismatch on [2995768459264 4096] extent item 1, found 2
Incorrect global backref count on 2995768459264 found 1 wanted 2
backpointer mismatch on [2995768459264 4096]
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
root 256 inode 9579 errors 100, file extent discount
root 256 inode 9580 errors 100, file extent discount
root 256 inode 14258 errors 100, file extent discount
root 256 inode 14259 errors 100, file extent discount
root inode 9579 errors 100, file extent discount
root inode 9580 errors 100, file extent discount
root inode 14258 errors 100, file extent discount
root inode 14259 errors 100, file extent discount
found 1993711951581 bytes used err is 1
total csum bytes: 4560615360
total tree bytes: 5643403264
total fs tree bytes: 139776000
total extent tree bytes: 263602176
btree space waste bytes: 504484726
file data blocks allocated: 6557032402944
 referenced 6540949323776
Btrfs v3.12

This made me run btrfsck with the repair option:

Extent back ref already exists for 2994950590464 parent 863072366592 root 0
ref mismatch on [32935936 4096] extent item 1, found 2
repair deleting extent record: key 32935936 168 4096
adding new tree backref on start 32935936 len 4096 parent 2994784206848 root 2994784206848
Incorrect global backref count on 32935936 found 1 wanted 2
backpointer mismatch on [32935936 4096]
ref mismatch on [32997376 4096] extent item 1, found 2
repair deleting extent record: key 32997376 168 4096
adding new tree backref on start 32997376 len 4096 parent 2994824708096 root 2994824708096
Incorrect global backref count on 32997376 found 1 wanted 2
backpointer mismatch on [32997376 4096]
Incorrect global backref count on 8988365651968 found 1 wanted 0
backpointer mismatch on [8988365651968 4096]
repaired damaged extent references
checking free space cache
checking fs roots
root 256 inode 9579 errors 100, file extent discount
root 256 inode 9580 errors 100, file extent discount
root 256 inode 14258 errors 100, file extent discount
root 256 inode 14259 errors 100, file extent discount
root inode 9579 errors 100, file extent discount
root inode 9580 errors 100, file extent discount
root inode 14258 errors 100, file extent discount
root inode 14259 errors 100, file extent discount
enabling repair mode
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
cache and super generation don't match, space cache will be invalidated
found 827360733827 bytes used err is 1
total csum bytes: 4446455380
total tree bytes: 5506977792
total fs tree bytes: 137293824
total extent tree bytes: 258691072
btree space waste bytes: 496921489
file data blocks allocated: 6440132583424
 referenced 6424163344384
Btrfs v3.12

After this, I ran a check without the repair option again, and the same errors persist.

Greetings,
Hendrik

--
Hendrik Friedel
Auf dem Brink 12
28844 Weyhe
Mobil 0178 1874363

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfsck does not fix
Hello,

What messages in dmesg do you get when you use recovery?

I'll find out tomorrow (I can't access the disk just now).

Here it is:

[90098.989872] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 162460 /dev/sdc1

That's all. The same in the syslog.

Do you have further suggestions to fix the filesystem?

Regards,
Hendrik
Re: btrfsck does not fix
Hi Chris,

I ran btrfsck on my volume with the repair option. When I re-run it, I get the same errors as before.

Did you try mounting with -o recovery first? https://btrfs.wiki.kernel.org/index.php/Problem_FAQ

No, I did not. In fact, I had visited the FAQ before, and my understanding was that -o recovery is used/needed when mounting is impossible. That is not the case here; the disk works without obvious problems.

What messages in dmesg do you get when you use recovery?

I'll find out tomorrow (I can't access the disk just now).

Greetings,
Hendrik
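For reference, the recovery mount suggested above amounts to something like the following sketch. The device and mount point are taken from this thread and are assumptions for any other system; the commands need root and a real btrfs device, so treat this as an illustration rather than a tested recipe:

```shell
# Try the recovery mount option, then look at what the kernel logged
# during the mount attempt (btrfs prints its findings to dmesg).
mount -o recovery /dev/sdb1 /mnt/BTRFS/rsnapshot
dmesg | tail -n 20
```

The recovery option asks btrfs to fall back to older tree roots if the newest ones are damaged; it is read-mostly and does not rewrite metadata the way btrfsck --repair does.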
Re: btrfsck errors: is it safe to fix?
Hello,

Possible? Yes. Although I did not explicitly mention it, you would combine clear_cache and nospace_cache; that should do the trick. Then unmount and check.

Thanks, and mentally noted for future reference. I didn't think about combining the options, but it makes perfect sense now that I have, thanks to you. =:^)

For me, it unfortunately did not work:

mount /dev/sdc1 /mnt/BTRFS/Video/VDR -o clear_cache,nospace_cache
[wait a day or two]
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 256 inode 9579 errors 100, file extent discount
root 256 inode 9580 errors 100, file extent discount
root 256 inode 14258 errors 100, file extent discount
root 256 inode 14259 errors 100, file extent discount
root inode 9579 errors 100, file extent discount
root inode 9580 errors 100, file extent discount
root inode 14258 errors 100, file extent discount
root inode 14259 errors 100, file extent discount
found 2928473450130 bytes used err is 1
total csum bytes: 3206482672
total tree bytes: 3902070784
total fs tree bytes: 38912000
total extent tree bytes: 136044544
btree space waste bytes: 411777432
file data blocks allocated: 3447164817408
 referenced 3446445981696
Btrfs v0.20-rc1-596-ge9ac73b

Same as before (just a bit more verbose). Does it help to delete the files at the affected inodes? How do I find which files are stored at these inodes?

Greetings,
Hendrik
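Mapping an inode number from the check output back to a path can be done with plain POSIX tools; newer btrfs-progs also provide btrfs inspect-internal inode-resolve. A sketch of the generic approach (the temporary directory and file name below are made up purely to demonstrate the technique; on the real filesystem one would run find inside the subvolume that matches the "root 256" part, e.g. under /mnt/BTRFS/Video/VDR, with the inode number btrfsck reported, e.g. 9579):

```shell
# Resolve an inode number back to its path(s) with find -inum.
tmpdir=$(mktemp -d)
touch "$tmpdir/example-recording"
ino=$(stat -c %i "$tmpdir/example-recording")   # the inode number, as btrfsck reports it
resolved=$(find "$tmpdir" -xdev -inum "$ino")   # -xdev keeps find on one filesystem
echo "$resolved"
rm -r "$tmpdir"
```

Caveat: btrfs inode numbers are per subvolume, so the same number can exist in several subvolumes; run the search inside the subvolume named in the error line.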
Re: btrfsck errors: is it safe to fix?
Hello,

I re-post this:

To answer the "is it safe to fix" question... In that context, yes, it's safe to btrfsck --repair, because you're prepared to lose the entire filesystem if worst comes to worst in any case, so even if btrfsck --repair makes things worse instead of better, you've not lost anything you're particularly worried about anyway.

I do have a daily backup of the important data. There is other data that is (a bit more than) nice to keep (TV recordings). It all still seems readable, so I can also back it up if I can free some space.

So, I have run btrfsck --repair:

---
root@homeserver:~/btrfs/btrfs-progs# git pull
remote: Counting objects: 124, done.
remote: Compressing objects: 100% (52/52), done.
remote: Total 99 (delta 55), reused 89 (delta 47)
Unpacking objects: 100% (99/99), done.
From git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs
 d1570a0..c652e4e integration - origin/integration
Already up-to-date.
---

The repair:

---
./btrfsck --repair /dev/sdc1
enabling repair mode
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 2895817096773 bytes used err is 1
total csum bytes: 3206482672
total tree bytes: 3901480960
total fs tree bytes: 38912000
total extent tree bytes: 135892992
btree space waste bytes: 411727425
file data blocks allocated: 3446512275456
 referenced 3445793439744
Btrfs v0.20-rc1-358-g194aa4
---

After the repair, another check reveals the same errors as before:

---
./btrfsck /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 2895817096773 bytes used err is 1
total csum bytes: 3206482672
total tree bytes: 3901480960
total fs tree bytes: 38912000
total extent tree bytes: 135892992
btree space waste bytes: 411727425
file data blocks allocated: 3446512275456
 referenced 3445793439744
Btrfs v0.20-rc1-358-g194aa4a
---

The only messages in syslog/dmesg regarding btrfs are:

[299517.270322] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140436 /dev/sdc1
[299525.805867] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140436 /dev/sdb1
[299525.807148] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140436 /dev/sdc1
[299525.808277] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140436 /dev/sdb1
(repeating several times)

Can we find out why btrfsck does not fix the errors? I got no reply to this.

Now I have two intentions:
- help improving btrfs(ck)
- make the system usable again

Please let me know if it is of interest to work with this example on btrfsck, which apparently is not able to fix this problem, and what information you would need from me. Otherwise, I will proceed to the second point.

Greetings,
Hendrik
Re: btrfsck errors: is it safe to fix?
Hello,

thanks for your reply.

To answer the "is it safe to fix" question... In that context, yes, it's safe to btrfsck --repair, because you're prepared to lose the entire filesystem if worst comes to worst in any case, so even if btrfsck --repair makes things worse instead of better, you've not lost anything you're particularly worried about anyway.

I do have a daily backup of the important data. There is other data that is (a bit more than) nice to keep (TV recordings). It all still seems readable, so I can also back it up if I can free some space.

So, I have run btrfsck --repair:

---
root@homeserver:~/btrfs/btrfs-progs# git pull
remote: Counting objects: 124, done.
remote: Compressing objects: 100% (52/52), done.
remote: Total 99 (delta 55), reused 89 (delta 47)
Unpacking objects: 100% (99/99), done.
From git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs
 d1570a0..c652e4e integration - origin/integration
Already up-to-date.
---

The repair:

---
./btrfsck --repair /dev/sdc1
enabling repair mode
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 2895817096773 bytes used err is 1
total csum bytes: 3206482672
total tree bytes: 3901480960
total fs tree bytes: 38912000
total extent tree bytes: 135892992
btree space waste bytes: 411727425
file data blocks allocated: 3446512275456
 referenced 3445793439744
Btrfs v0.20-rc1-358-g194aa4
---

After the repair, another check reveals the same errors as before:

---
./btrfsck /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 2895817096773 bytes used err is 1
total csum bytes: 3206482672
total tree bytes: 3901480960
total fs tree bytes: 38912000
total extent tree bytes: 135892992
btree space waste bytes: 411727425
file data blocks allocated: 3446512275456
 referenced 3445793439744
Btrfs v0.20-rc1-358-g194aa4a
---

The only messages in syslog/dmesg regarding btrfs are:

[299517.270322] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140436 /dev/sdc1
[299525.805867] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140436 /dev/sdb1
[299525.807148] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140436 /dev/sdc1
[299525.808277] btrfs: device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140436 /dev/sdb1
(repeating several times)

Can we find out why btrfsck does not fix the errors?

Greetings,
Hendrik
Re: btrfsck errors: is it safe to fix?
Hello again,

can someone please help me with this?

Regards,
Hendrik

Am 06.11.2013 07:45, schrieb Hendrik Friedel:

Hello,

sorry, I was totally unaware of still being on 3.11-rc2. I re-ran btrfsck with the same result:

./btrfs-progs/btrfsck /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 1992865028914 bytes used err is 1
total csum bytes: 3207847732
total tree bytes: 3902865408
total fs tree bytes: 38875136
total extent tree bytes: 135864320
btree space waste bytes: 411665032
file data blocks allocated: 3426722545664
 referenced 3426000965632
Btrfs v0.20-rc1-358-g194aa4a

Now dmesg and the syslog stay clear of entries related to btrfs. But I think that might also be a coincidence: I ran the old kernel for weeks until this error came, whereas I have run this kernel for merely 12 hours.

Now: does it make sense to try further to find a possible bug, or do we suspect it is fixed? If so, how can I help? And: can I fix these problems safely with btrfsck?

Regards,
Hendrik

Am 05.11.2013 03:03, schrieb cwillu:

On Mon, Nov 4, 2013 at 3:14 PM, Hendrik Friedel hend...@friedels.name wrote:

Hello,
the list was quite full with patches, so this might have been hidden. Here the complete stack. Does this help? Is this what you needed?

[95764.899294] CPU: 1 PID: 21798 Comm: umount Tainted: GFCIO 3.11.0-031100rc2-generic #201307211535

Can you reproduce the problem under the released 3.11 or 3.12? An -rc2 is still pretty early in the release cycle, and I wouldn't be at all surprised if it was a bug added and fixed in a later rc.
Re: btrfsck errors: is it safe to fix?
Hello,

sorry, I was totally unaware of still being on 3.11-rc2. I re-ran btrfsck with the same result:

./btrfs-progs/btrfsck /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 1992865028914 bytes used err is 1
total csum bytes: 3207847732
total tree bytes: 3902865408
total fs tree bytes: 38875136
total extent tree bytes: 135864320
btree space waste bytes: 411665032
file data blocks allocated: 3426722545664
 referenced 3426000965632
Btrfs v0.20-rc1-358-g194aa4a

Now dmesg and the syslog stay clear of entries related to btrfs. But I think that might also be a coincidence: I ran the old kernel for weeks until this error came, whereas I have run this kernel for merely 12 hours.

Now: does it make sense to try further to find a possible bug, or do we suspect it is fixed? If so, how can I help? And: can I fix these problems safely with btrfsck?

Regards,
Hendrik

Am 05.11.2013 03:03, schrieb cwillu:

On Mon, Nov 4, 2013 at 3:14 PM, Hendrik Friedel hend...@friedels.name wrote:

Hello,
the list was quite full with patches, so this might have been hidden. Here the complete stack. Does this help? Is this what you needed?

[95764.899294] CPU: 1 PID: 21798 Comm: umount Tainted: GFCIO 3.11.0-031100rc2-generic #201307211535

Can you reproduce the problem under the released 3.11 or 3.12? An -rc2 is still pretty early in the release cycle, and I wouldn't be at all surprised if it was a bug added and fixed in a later rc.
Re: btrfsck errors: is it safe to fix?
Hello,

sorry about that:

[  126.444603] init: plymouth-stop pre-start process (3446) terminated with status 1
[11189.299864] hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[94999.489736] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140408 /dev/sdc1
[94999.489755] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140408 /dev/sdb1
[95394.400840] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140420 /dev/sdb1
[95394.400872] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140420 /dev/sdc1
[95585.149738] init: smbd main process (1168) killed by TERM signal
[95725.171156] nfsd: last server has exited, flushing export cache
[95764.899173] ------------[ cut here ]------------
[95764.899216] WARNING: CPU: 1 PID: 21798 at /home/apw/COD/linux/fs/btrfs/disk-io.c:3423 free_fs_root+0x99/0xa0 [btrfs]()
[95764.899219] Modules linked in: nvram pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc kvm_intel kvm nfsd nfs_acl auth_rpcgss nfs fscache binfmt_misc lockd sunrpc ftdi_sio usbserial stv6110x lnbp21 snd_hda_codec_realtek snd_hda_intel stv090x snd_hda_codec snd_hwdep snd_pcm ddbridge dvb_core snd_timer snd soundcore snd_page_alloc cxd2099(C) mei_me psmouse i915 drm_kms_helper mei drm lpc_ich i2c_algo_bit serio_raw video mac_hid coretemp lp parport hid_generic usbhid hid btrfs raid6_pq e1000e ptp pps_core ahci libahci xor zlib_deflate libcrc32c
[95764.899294] CPU: 1 PID: 21798 Comm: umount Tainted: GFCIO 3.11.0-031100rc2-generic #201307211535
[95764.899297] Hardware name: /DH87RL, BIOS RLH8710H.86A.0320.2013.0606.1802 06/06/2013
[95764.899300] 0d5f 880118b59cb8 8171e74d 0007
[95764.899306] 880118b59cf8 8106532c 880118b59d08
[95764.899311] 8801184cb800 8801184cb800 880118118000 880118b59d78
[95764.899315] Call Trace:
[95764.899324] [8171e74d] dump_stack+0x46/0x58
[95764.899331] [8106532c] warn_slowpath_common+0x8c/0xc0
[95764.899336] [8106537a] warn_slowpath_null+0x1a/0x20
[95764.899359] [a00d9a59] free_fs_root+0x99/0xa0 [btrfs]
[95764.899384] [a00dd653] btrfs_drop_and_free_fs_root+0x93/0xc0 [btrfs]
[95764.899408] [a00dd74f] del_fs_roots+0xcf/0x130 [btrfs]
[95764.899433] [a00ddac6] close_ctree+0x146/0x270 [btrfs]
[95764.899441] [811cd24e] ? evict_inodes+0xce/0x130
[95764.899461] [a00b4eb9] btrfs_put_super+0x19/0x20 [btrfs]
[95764.899467] [811b47e2] generic_shutdown_super+0x62/0xf0
[95764.899475] [811b4906] kill_anon_super+0x16/0x30
[95764.899493] [a00b754a] btrfs_kill_super+0x1a/0x90 [btrfs]
[95764.899500] [811b512d] deactivate_locked_super+0x4d/0x80
[95764.899505] [811b57ae] deactivate_super+0x4e/0x70
[95764.899510] [811d1266] mntput_no_expire+0x106/0x160
[95764.899515] [811d2b79] SyS_umount+0xa9/0xf0
[95764.899520] [817333ef] tracesys+0xe1/0xe6
[95764.899524] ---[ end trace 0024dfebf572e76c ]---
[95764.985245] VFS: Busy inodes after unmount of sdb1. Self-destruct in 5 seconds. Have a nice day...
[95790.079663] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140425 /dev/sdb1
[95790.101778] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140425 /dev/sdc1
[95790.162960] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140425 /dev/sdb1
[95790.163825] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140425 /dev/sdc1
[95924.393344] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140425 /dev/sdb1
[95924.421118] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140425 /dev/sdc1
[95924.676571] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 1 transid 140425 /dev/sdb1
[95924.677046] device fsid 989306aa-d291-4752-8477-0baf94f8c42f devid 2 transid 140425 /dev/sdc1

Greetings,
Hendrik

Am 02.11.2013 09:12, schrieb cwillu:

Now that I am searching, I see this in dmesg:
[95764.899359] [a00d9a59] free_fs_root+0x99/0xa0 [btrfs]
[95764.899384] [a00dd653] btrfs_drop_and_free_fs_root+0x93/0xc0 [btrfs]
[95764.899408] [a00dd74f] del_fs_roots+0xcf/0x130 [btrfs]
[95764.899433] [a00ddac6] close_ctree+0x146/0x270 [btrfs]
[95764.899461] [a00b4eb9] btrfs_put_super+0x19/0x20 [btrfs]
[95764.899493] [a00b754a] btrfs_kill_super+0x1a/0x90 [btrfs]

Need to see the rest of the trace this came from.
btrfsck errors: is it safe to fix?
Hello,

I have noticed that my server experiences a high load average when writing to it. So I checked the filesystem and found errors:

./btrfsck /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 989306aa-d291-4752-8477-0baf94f8c42f
checking extents
checking free space cache
checking fs roots
root 256 inode 9579 errors 100
root 256 inode 9580 errors 100
root 256 inode 14258 errors 100
root 256 inode 14259 errors 100
root inode 9579 errors 100
root inode 9580 errors 100
root inode 14258 errors 100
root inode 14259 errors 100
found 1478386534452 bytes used err is 1
total csum bytes: 3207847732
total tree bytes: 3902853120
total fs tree bytes: 38875136
total extent tree bytes: 135856128
btree space waste bytes: 411653937
file data blocks allocated: 3426722545664
 referenced 3426000965632
Btrfs v0.20-rc1-358-g194aa4a

It is a system striped over two physical disks. Now, what concerns me is that I found no indication of problems whatsoever, except for the performance. Nothing in the syslog. Now that I am searching, I see this in dmesg:

[95764.899359] [a00d9a59] free_fs_root+0x99/0xa0 [btrfs]
[95764.899384] [a00dd653] btrfs_drop_and_free_fs_root+0x93/0xc0 [btrfs]
[95764.899408] [a00dd74f] del_fs_roots+0xcf/0x130 [btrfs]
[95764.899433] [a00ddac6] close_ctree+0x146/0x270 [btrfs]
[95764.899461] [a00b4eb9] btrfs_put_super+0x19/0x20 [btrfs]
[95764.899493] [a00b754a] btrfs_kill_super+0x1a/0x90 [btrfs]

Now, the fact that the load went up indicates to me that the system struggled reading or writing. Can't this struggling be detected and reported? Wouldn't that contribute to data safety?

And the, for me, now more pressing question: how can I fix the problem?

Greetings,
Hendrik
Re: Mount multiple-device-filesystem by UUID
Thanks for your replies. I will try.

Greetings,
Hendrik
Mount multiple-device-filesystem by UUID
Hello,

As stated in the wiki, multiple-device filesystems (e.g. RAID 1) will only mount after a btrfs device scan, or if all devices are passed in the mount options. I remember that for Ubuntu 12.04 I changed the initrd, but after a re-install I have to do this again, and I don't remember how I did it. So the other option would be passing the devices in the fstab. But here I'd prefer UUIDs rather than device names, as the latter can change. Is this possible? What is the syntax?

Regards,
Hendrik
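One detail that helps here: all members of a multi-device btrfs share a single filesystem UUID, so UUID= in fstab selects the filesystem, and the device= mount option tells the kernel where the other members are. A sketch of such an fstab entry, using this thread's UUID and devices as the assumed values (device= takes device node paths; for stable naming the /dev/disk/by-id/ symlinks can be substituted):

```shell
# /etc/fstab sketch for a two-device btrfs (values from this thread, adjust to taste):
UUID=989306aa-d291-4752-8477-0baf94f8c42f  /mnt/BTRFS  btrfs  device=/dev/sdb1,device=/dev/sdc1  0  0
```

This avoids relying on a btrfs device scan having run before mount; whether it is needed at all depends on whether the distribution's initramfs already performs the scan.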
raid0, raid1, raid5, what to choose?
Hello,

I'd appreciate your recommendation on this: I have three HDDs with 3 TB each. I intend to use them as raid5 eventually. Currently I use them like this:

# mount | grep sd
/dev/sda1 on /mnt/Datenplatte type ext4
/dev/sdb1 on /mnt/BTRFS/Video type btrfs
/dev/sdb1 on /mnt/BTRFS/rsnapshot type btrfs

# df -h
/dev/sda1 2,7T 1,3T 1,3T 51% /mnt/Datenplatte
/dev/sdb1 5,5T 5,4T 93G 99% /mnt/BTRFS/Video
/dev/sdb1 5,5T 5,4T 93G 99% /mnt/BTRFS/rsnapshot

Now, what surprises me (and here I lack memory) is that sdb appears twice. I think I created a raid1, but how can I find out?

# ~/btrfs/btrfs-progs/btrfs fi show /dev/sdb1
Label: none  uuid: 989306aa-d291-4752-8477-0baf94f8c42f
	Total devices 2 FS bytes used 2.68TB
	devid 2 size 2.73TB used 2.73TB path /dev/sdc1
	devid 1 size 2.73TB used 2.73TB path /dev/sdb1

Now I wanted to convert it to raid0, because I lack space and redundancy is not important for the videos and the backup, but this fails:

~/btrfs/btrfs-progs/btrfs fi balance start -dconvert=raid0 /mnt/BTRFS/
ERROR: error during balancing '/mnt/BTRFS/' - Inappropriate ioctl for device

dmesg does not help here. Anyway, this gave me some time to think about it. In fact, as soon as raid5 is stable, I want to have all three as a raid5. Will this be possible with a balance command? If so, will it be possible as soon as raid5 is stable, or will I have to wait longer? What approach do you recommend?

Greetings,
Hendrik
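[Editorial note: two plausible causes for the "Inappropriate ioctl for device" error here, offered as assumptions: the path passed to balance must be the btrfs mount point itself (in this thread the filesystem is mounted at /mnt/BTRFS/Video, while /mnt/BTRFS is just a plain directory), and the conversion filters require a restriper-capable kernel (3.3 or newer). A command sketch under those assumptions:]

```sh
# Sketch, not a verified procedure: point balance at the actual btrfs
# mount point, on a kernel >= 3.3 that understands -dconvert/-mconvert.
btrfs balance start -dconvert=raid0 /mnt/BTRFS/Video

# Later, to grow onto the third disk and convert to raid5 (once raid5 is
# considered stable). WARNING: "device add" repurposes sda1 and its ext4
# data; back it up first.
btrfs device add /dev/sda1 /mnt/BTRFS/Video
btrfs balance start -dconvert=raid5 -mconvert=raid5 /mnt/BTRFS/Video
```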
Needed change in Wiki
Hello,

I don't see how to change the wiki, but it needs an update:

apt-get build-dep btrfs-tools
-or-
apt-get install uuid-dev libattr1-dev zlib1g-dev libacl1-dev e2fslibs-dev

Here libblkid-dev is missing, at least for the latest git version of the btrfs-progs.

Greetings,
Hendrik
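[Editorial note: the corrected dependency line plus a build, as a sketch for Debian/Ubuntu of that era; package names may differ on newer releases, and the repository URL is the one recommended elsewhere in this digest.]

```sh
# Install build prerequisites (including the missing libblkid-dev),
# then build btrfs-progs from git.
apt-get install uuid-dev libattr1-dev zlib1g-dev libacl1-dev e2fslibs-dev libblkid-dev
git clone git://github.com/josefbacik/btrfs-progs.git
cd btrfs-progs && make
```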
Re: experimental raid5/6 code in git
Hi Chris,

I've been keen for raid5/6 in btrfs since I heard of it. I cannot give you any feedback, but I'd like to take the opportunity to thank you, and all contributors (thinking of David for the raid), for your work.

Regards,
Hendrik
Re: segmentation-fault in btrfsck (git-version)
Hello,

I re-send this message, hoping that someone can give me a hint?

Regards,
Hendrik

Am 18.12.2012 23:17, schrieb Hendrik Friedel:
> Hi Mitch, hi all,
>
> thanks for your hint. I used btrfs-debug-tree now. With -e, the output is empty. But without -e I do get a big output file. When I search for filenames that I am missing, I get:
>
> grep Sting big_output_file | grep Berlin
> namelen 20 datalen 0 name: Sting_Live_in_Berlin
> namelen 20 datalen 0 name: Sting_Live_in_Berlin
> inode ref index 29 namelen 20 name: Sting_Live_in_Berlin
>
> That looks good. That raises two questions now: Can I restore the file? And can I do that for a whole path (e.g. ./Video/)?
>
> Greetings/Thanks!
> Hendrik
>
> Am 15.12.2012 23:24, schrieb Mitch Harder:
>> On Sat, Dec 15, 2012 at 1:40 PM, Hendrik Friedel <hend...@friedels.name> wrote:
>>> Hello Mitch, hello all,
>>
>> Since btrfs has significant improvements and fixes in each kernel release, and since very few of these changes are backported, it is recommended to use the latest kernels available.
>
> Ok, it's 3.7 now.
>
>> The root ### inode # errors 400 are an indication that there is an inconsistency in the inode size. There was a patch included in the 3.1 or 3.2 kernel to address this issue (http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=f70a9a6b94af86fca069a7552ab672c31b457786). But I don't believe this patch fixed existing occurrences of this error.
>
> Apparently not. It's still there.
>
>> At this point, the quickest solution for you may be to rebuild and reformat this RAID assembly, and restore this data from backups.
>
> Yepp, I did that. But in fact some data is missing. It is not essential, but nice to have.
>
>> If you don't have a backup of this data, and since your array seems to be working pretty well in a degraded state, this would be a really good time to look at a strategy of getting a backup of this data before doing many more attempts at rescue.
>
> Done. It's all safe on another ext4 drive. So, let's play ;-) Could you please help me try to restore the missing data? What I tried so far was:
>
> ./btrfs-restore /dev/sdc1 /mnt/restore/
>
> It worked, in the sense that it restored what I already had. What's odd as well is that btrfs scrub ran through without errors. So the missing data could have been (accidentally) deleted by me, but I don't think so... nevertheless I cannot exclude it. What I know is the (original) path of the data.
>
>> You could try btrfs-debug-tree, and search for any traces of your file. However, be ready to sift through a massive amount of output.

--
Hendrik Friedel
Auf dem Brink 12
28844 Weyhe
Mobil 0178 1874363
Re: segmentation-fault in btrfsck (git-version)
Hi Mitch, hi all,

thanks for your hint. I used btrfs-debug-tree now. With -e, the output is empty. But without -e I do get a big output file. When I search for filenames that I am missing, I get:

grep Sting big_output_file | grep Berlin
namelen 20 datalen 0 name: Sting_Live_in_Berlin
namelen 20 datalen 0 name: Sting_Live_in_Berlin
inode ref index 29 namelen 20 name: Sting_Live_in_Berlin

That looks good. That raises two questions now: Can I restore the file? And can I do that for a whole path (e.g. ./Video/)?

Greetings/Thanks!
Hendrik

Am 15.12.2012 23:24, schrieb Mitch Harder:
> On Sat, Dec 15, 2012 at 1:40 PM, Hendrik Friedel <hend...@friedels.name> wrote:
>> Hello Mitch, hello all,
>
> Since btrfs has significant improvements and fixes in each kernel release, and since very few of these changes are backported, it is recommended to use the latest kernels available.

Ok, it's 3.7 now.

> The root ### inode # errors 400 are an indication that there is an inconsistency in the inode size. There was a patch included in the 3.1 or 3.2 kernel to address this issue (http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=f70a9a6b94af86fca069a7552ab672c31b457786). But I don't believe this patch fixed existing occurrences of this error.

Apparently not. It's still there.

> At this point, the quickest solution for you may be to rebuild and reformat this RAID assembly, and restore this data from backups.

Yepp, I did that. But in fact some data is missing. It is not essential, but nice to have.

> If you don't have a backup of this data, and since your array seems to be working pretty well in a degraded state, this would be a really good time to look at a strategy of getting a backup of this data before doing many more attempts at rescue.

Done. It's all safe on another ext4 drive. So, let's play ;-) Could you please help me try to restore the missing data? What I tried so far was:

./btrfs-restore /dev/sdc1 /mnt/restore/

It worked, in the sense that it restored what I already had. What's odd as well is that btrfs scrub ran through without errors. So the missing data could have been (accidentally) deleted by me, but I don't think so... nevertheless I cannot exclude it. What I know is the (original) path of the data.

> You could try btrfs-debug-tree, and search for any traces of your file. However, be ready to sift through a massive amount of output.
Re: segmentation-fault in btrfsck (git-version)
Hello Mitch, hello all,

> Since btrfs has significant improvements and fixes in each kernel release, and since very few of these changes are backported, it is recommended to use the latest kernels available.

Ok, it's 3.7 now.

> The root ### inode # errors 400 are an indication that there is an inconsistency in the inode size. There was a patch included in the 3.1 or 3.2 kernel to address this issue (http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=f70a9a6b94af86fca069a7552ab672c31b457786). But I don't believe this patch fixed existing occurrences of this error.

Apparently not. It's still there.

> At this point, the quickest solution for you may be to rebuild and reformat this RAID assembly, and restore this data from backups.

Yepp, I did that. But in fact some data is missing. It is not essential, but nice to have.

> If you don't have a backup of this data, and since your array seems to be working pretty well in a degraded state, this would be a really good time to look at a strategy of getting a backup of this data before doing many more attempts at rescue.

Done. It's all safe on another ext4 drive. So, let's play ;-) Could you please help me try to restore the missing data? What I tried so far was:

./btrfs-restore /dev/sdc1 /mnt/restore/

It worked, in the sense that it restored what I already had. What's odd as well is that btrfs scrub ran through without errors. So the missing data could have been (accidentally) deleted by me, but I don't think so... nevertheless I cannot exclude it. What I know is the (original) path of the data.

Greetings,
Hendrik
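[Editorial note: restoring a single subtree is possible with the `--path-regex` option that newer btrfs-progs `restore` versions (e.g. from josefbacik's repository mentioned later in this digest) provide; check `btrfs restore --help` for your build. The regex must match every parent directory on the way down as well as the files themselves, hence the nested empty alternations. A sketch for the /Video path from this thread:]

```sh
# Sketch, assuming a btrfs-progs build whose restore knows --path-regex:
# restore only /Video and everything below it from the unmounted device.
./btrfs restore -v --path-regex '^/(|Video(|/.*))$' /dev/sdc1 /mnt/restore/
```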
Re: no activity in kernel.org btrfs-progs git repo?
Hello,

> Try git://github.com/josefbacik/btrfs-progs
> I just spent a whole day debugging btrfs-restore, fixing signed/unsigned comparisons, adding another mirror retry, only to find out it is all already done in this repository. D'oh!

But it has no --repair option.

Greetings,
Hendrik
Re: segmentation-fault in btrfsck (git-version)
Hello,

thanks for letting me know. Indeed, it would be good to replace the segmentation fault with "btrfsck does not yet know how to handle this condition".

> Future refinements of btrfsck will probably include proper error messages for issues that can't be handled, or perhaps even fix the error. It might be interesting for you to try a newer kernel, and use scrub on this volume if you have the two disks RAIDed.

I will try that.

Greetings,
Hendrik
Re: segmentation-fault in btrfsck (git-version)
Dear Mitch,

thanks for your help and suggestion:

> It might be interesting for you to try a newer kernel, and use scrub on this volume if you have the two disks RAIDed.

I have now scrubbed the disk:

./btrfs scrub status /mnt/other/
scrub status for a15eede9-1a92-47d8-940a-adc7cf97352d
	scrub started at Sun Dec 9 13:48:57 2012 and finished after 3372 seconds
	total bytes scrubbed: 1.10TB with 0 errors

That's odd, as in one folder data is missing (I could have deleted it, but I'd be very surprised...). Also, when I run btrfsck, I get errors. On sdc1:

root 261 inode 64370 errors 400
root 261 inode 64373 errors 400
root 261 inode 64375 errors 400
root 261 inode 64376 errors 400
found 1203899371520 bytes used err is 1
total csum bytes: 1173983136
total tree bytes: 1740640256
total fs tree bytes: 280260608
btree space waste bytes: 212383383
file data blocks allocated: 28032005304320
 referenced 1190305632256
Btrfs v0.20-rc1-37-g91d9eec

On sdb1:

root 261 inode 64373 errors 400
root 261 inode 64375 errors 400
root 261 inode 64376 errors 400
found 1203899371520 bytes used err is 1
total csum bytes: 1173983136
total tree bytes: 1740640256
total fs tree bytes: 280260608
btree space waste bytes: 212383383
file data blocks allocated: 28032005304320
 referenced 1190305632256
Btrfs v0.20-rc1-37-g91d9eec

And when I try to mount one of the two RAIDed disks, I get:

[ 1173.773861] device fsid a15eede9-1a92-47d8-940a-adc7cf97352d devid 1 transid 140194 /dev/sdb1
[ 1173.774695] btrfs: failed to read the system array on sdb1
[ 1173.774854] btrfs: open_ctree failed

while the other works:

[ 1177.927096] device fsid a15eede9-1a92-47d8-940a-adc7cf97352d devid 2 transid 140194 /dev/sdc1

Do you have hints for me? The kernel now is 3.3.7-030307-generic (anything more recent I would have to compile myself, which I will do if you suggest it).

Greetings,
Hendrik

Am 06.12.2012 20:09, schrieb Mitch Harder:
> On Wed, Dec 5, 2012 at 2:50 PM, Hendrik Friedel <hend...@friedels.name> wrote:
>> Dear all,
>>
>> thanks for developing btrfsck! Now I'd like to contribute, as far as I can. I'm not a developer, but I do have some Linux experience. I've been using btrfs on two 3TB HDDs (mirrored) for a while now under kernel 3.0. Now it's corrupt. I had some hard resets of the machine, which might have contributed. I do have a backup of the data, at least of the important stuff. Some TV recordings are missing.
>>
>> The reason I am writing is to support the development. Unfortunately, btrfsck (latest git version) crashes with a segmentation fault when trying to repair this. Here's the backtrace:
>>
>> root 261 inode 64375 errors 400
>> root 261 inode 64376 errors 400
>> btrfsck: disk-io.c:382: __commit_transaction: Assertion `!(!eb || eb->start != start)' failed.
>>
>> Program received signal SIGABRT, Aborted.
>> 0x7784c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>> (gdb) backtrace
>> #0 0x7784c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>> #1 0x7784fb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
>> #2 0x778450ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> #3 0x77845192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
>> #4 0x0040d3ae in __commit_transaction (trans=0x62e010, root=0xb66ae0) at disk-io.c:382
>> #5 0x0040d4d8 in btrfs_commit_transaction (trans=0x62e010, root=0xb66ae0) at disk-io.c:415
>> #6 0x0040743d in main (ac=<optimized out>, av=<optimized out>) at btrfsck.c:3587
>>
>> Now, here's where my debugging knowledge ends. Are you interested in debugging this further, or is it a known bug?
>
> Line 382 in disk-io.c is:
>
> BUG_ON(!eb || eb->start != start);
>
> So, basically, btrfsck is intentionally crashing because it doesn't know how to handle this condition. Future refinements of btrfsck will probably include proper error messages for issues that can't be handled, or perhaps even fix the error. It might be interesting for you to try a newer kernel, and use scrub on this volume if you have the two disks RAIDed.
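[Editorial note: "failed to read the system array" when mounting a single member of a multi-device filesystem often means either that the kernel has not yet been told about the other member, or that this member's superblock is damaged. A hedged sketch of the usual two steps, with the mount point from this thread:]

```sh
# Make the kernel aware of all btrfs member devices, then mount normally.
btrfs device scan
mount /dev/sdc1 /mnt/other

# If one member really is unreadable, a two-disk RAID1 can usually still
# be mounted read-write from the healthy member:
mount -o degraded /dev/sdc1 /mnt/other
```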
segmentation-fault in btrfsck (git-version)
Dear all,

thanks for developing btrfsck! Now I'd like to contribute, as far as I can. I'm not a developer, but I do have some Linux experience. I've been using btrfs on two 3TB HDDs (mirrored) for a while now under kernel 3.0. Now it's corrupt. I had some hard resets of the machine, which might have contributed. I do have a backup of the data, at least of the important stuff. Some TV recordings are missing.

The reason I am writing is to support the development. Unfortunately, btrfsck (latest git version) crashes with a segmentation fault when trying to repair this. Here's the backtrace:

root 261 inode 64375 errors 400
root 261 inode 64376 errors 400
btrfsck: disk-io.c:382: __commit_transaction: Assertion `!(!eb || eb->start != start)' failed.

Program received signal SIGABRT, Aborted.
0x7784c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) backtrace
#0 0x7784c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x7784fb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x778450ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x77845192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0040d3ae in __commit_transaction (trans=0x62e010, root=0xb66ae0) at disk-io.c:382
#5 0x0040d4d8 in btrfs_commit_transaction (trans=0x62e010, root=0xb66ae0) at disk-io.c:415
#6 0x0040743d in main (ac=<optimized out>, av=<optimized out>) at btrfsck.c:3587

Now, here's where my debugging knowledge ends. Are you interested in debugging this further, or is it a known bug?

Regards,
Hendrik