So, I forgot to mention that it's my main media and backup server that got
corrupted. Yes, I do actually have a backup of a backup server, but it's
going to take days to recover due to the amount of data to copy back, not
counting lots of manual typing due to the number of subvolumes, btrfs
send/receive relationships and so forth.

Really, I should be able to roll back all writes from the last 24H, run a
check --repair/scrub on top just to be sure, and be back on track.

In the meantime, the good news is that the filesystem doesn't crash the
kernel (the crash posted below) now that I was able to cancel the btrfs
balance, but it goes read-only at the drop of a hat, even when I'm trying
to delete recent snapshots and all data that was potentially written in
the last 24H.

On Mon, May 01, 2017 at 10:06:41AM -0700, Marc MERLIN wrote:
> I have a filesystem that sadly got corrupted by a SAS card I just installed 
> yesterday.
> 
> I don't think that in a case like this there is a way to roll back all
> writes across all subvolumes in the last 24H, correct?
> 
> Is the best thing to go in each subvolume, delete the recent snapshots and
> rename the one from 24H as the current one?
 
Well, just like I expected, it's a pain in the rear, and this can't even help
fix the top level mountpoint, which doesn't have snapshots, so I can't roll
it back.
btrfs should really have an easy way to roll back X hours or days to
recover from garbage written after a known good point, given that it is COW
after all.

Is there a way to do this with check --repair maybe?

In the meantime, I got stuck while trying to delete snapshots:

Let's say I have this:
ID 428 gen 294021 top level 5 path backup
ID 2023 gen 294021 top level 5 path Soft
ID 3021 gen 294051 top level 428 path backup/debian32
ID 4400 gen 294018 top level 428 path backup/debian64
ID 4930 gen 294019 top level 428 path backup/ubuntu

I can easily
Delete subvolume (no-commit): '/mnt/btrfs_pool2/Soft'
and then:
gargamel:/mnt/btrfs_pool2# mv Soft_rw.20170430_01:50:22 Soft
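
Picking which snapshot to promote is the part that gets tedious across many
subvolumes. A minimal sketch in Python, assuming every snapshot is named
<subvolume>_rw.YYYYMMDD_HH:MM:SS like the one above. The function name, the
second snapshot name and the cutoff time are made up for illustration, and
nothing here touches the filesystem:

```python
from datetime import datetime

def pick_rollback_snapshot(names, subvol, cutoff):
    """Return the newest '<subvol>_rw.<timestamp>' name taken before cutoff, or None."""
    prefix = subvol + "_rw."
    candidates = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            taken = datetime.strptime(name[len(prefix):], "%Y%m%d_%H:%M:%S")
        except ValueError:
            continue  # not a timestamped snapshot name, skip it
        if taken < cutoff:
            candidates.append((taken, name))
    # max() compares the timestamps first, so this is the latest snapshot before cutoff
    return max(candidates)[1] if candidates else None

# Hypothetical directory listing; only the 20170430 snapshot is from the real system.
names = ["Soft", "Soft_rw.20170430_01:50:22", "Soft_rw.20170501_01:50:19"]
cutoff = datetime(2017, 4, 30, 12, 0, 0)  # assumed time the bad writes started

print(pick_rollback_snapshot(names, "Soft", cutoff))
```

The returned name is the one that would then be renamed over the deleted
subvolume, as in the mv above.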

But I can't delete backup, which is really just a directory containing
other things (in hindsight I shouldn't have made that a subvolume):
Delete subvolume (no-commit): '/mnt/btrfs_pool2/backup'
ERROR: cannot delete '/mnt/btrfs_pool2/backup': Directory not empty

This is because backup has a lot of subvolumes due to btrfs send/receive
relationships.
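
To see which nested subvolumes are keeping a parent from being deleted, the
"top level" column of the listing above is enough. A rough sketch in Python
that parses text in that format (hard-coded here from the listing above) and
prints the subvolumes whose top level is the given parent. The helper names
are hypothetical:

```python
import re

# Subvolume listing copied from above.
LISTING = """\
ID 428 gen 294021 top level 5 path backup
ID 2023 gen 294021 top level 5 path Soft
ID 3021 gen 294051 top level 428 path backup/debian32
ID 4400 gen 294018 top level 428 path backup/debian64
ID 4930 gen 294019 top level 428 path backup/ubuntu
"""

LINE_RE = re.compile(r"^ID (\d+) gen (\d+) top level (\d+) path (.+)$")

def parse_listing(text):
    """Turn each 'ID ... path ...' line into a dict; other lines are ignored."""
    subvols = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            subvols.append({
                "id": int(m.group(1)),
                "gen": int(m.group(2)),
                "top_level": int(m.group(3)),
                "path": m.group(4),
            })
    return subvols

def children_of(subvols, parent_path):
    """Paths of subvolumes whose top level is the subvolume at parent_path."""
    parent_ids = {s["id"] for s in subvols if s["path"] == parent_path}
    return [s["path"] for s in subvols if s["top_level"] in parent_ids]

subvols = parse_listing(LISTING)
for path in children_of(subvols, "backup"):
    print(path)
```

For backup this lists the three subvolumes with top level 428, which are
what makes the directory "not empty". For Soft it lists nothing.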

Is it possible to recover there? Can you reparent subvolumes to a different
subvolume without doing a full copy via btrfs send/receive?

Thanks,
Marc

> BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
> BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
> BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
> BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
> BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
> BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
> BTRFS info (device dm-5): forced readonly
> BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: __del_reloc_root+0x3f/0xa6
> PGD 189a0e067
> PUD 189a0f067
> PMD 0
> 
> Oops: 0000 [#1] PREEMPT SMP
> Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev 
> lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc 
> ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT 
> nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 
> nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 
> dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 
> nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat 
> nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm 
> irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic 
> snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi 
> snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10
>  asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore 
> rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds 
> mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e 
> ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper 
> dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx 
> crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 
> crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas 
> ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last 
> unloaded: ftdi_sio]
> CPU: 0 PID: 9056 Comm: btrfs Tainted: G     U          
> 4.11.0-amd64-preempt-sysrq-20170406 #2
> Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 
> 04/27/2013
> task: ffff88374d2a60c0 task.stack: ffffa6f226424000
> RIP: 0010:__del_reloc_root+0x3f/0xa6
> RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246
> RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2
> RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568
> RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480
> R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0
> R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570
> FS:  00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0
> Call Trace:
>  free_reloc_roots+0x4f/0x5d
>  merge_reloc_roots+0x159/0x1ba
>  relocate_block_group+0x410/0x492
>  btrfs_relocate_block_group+0x12d/0x253
>  btrfs_relocate_chunk+0x3e/0xb1
>  btrfs_balance+0xd16/0xf36
>  btrfs_ioctl_balance+0x24f/0x2cd
>  ? __alloc_pages_nodemask+0x134/0x1e0
>  btrfs_ioctl+0x1447/0x1e22
>  ? mem_cgroup_charge_statistics+0x1e/0x88
>  ? get_page+0x9/0x26
>  ? __lru_cache_add+0x2a/0x6c
>  ? set_pte_at+0x9/0xd
>  ? __handle_mm_fault+0x61d/0xa6f
>  vfs_ioctl+0x21/0x38
>  ? vfs_ioctl+0x21/0x38
>  do_vfs_ioctl+0x4ef/0x537
>  ? current_kernel_time64+0x10/0x36
>  ? __audit_syscall_entry+0xc2/0xe6
>  ? syscall_trace_enter+0x1ac/0x20e
>  SyS_ioctl+0x57/0x7b
>  do_syscall_64+0x6b/0x7d
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7facd097ecc7
> RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7
> RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003
> RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040
> R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003
> R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001
> Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 
> c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc 
> ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89
> RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40
> CR2: 0000000000000000
> ---[ end trace 64c3fa4dc953d295 ]---
> Kernel panic - not syncing: Fatal exception
> Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 
> 0xffffffff80000000-0xffffffffbfffffff)
> Rebooting in 20 seconds..
> ACPI MEMORY or I/O RESET_REG.
> 

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  