On Fri, 14 Feb 2014 15:33:10 +0100, Saint Germain <saint...@gmail.com>
wrote :

> On 11 February 2014 03:30, Saint Germain <saint...@gmail.com> wrote:
> >> > I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with
> >> > backported kernel 3.12-0.bpo.1-amd64) using a motherboard with
> >> > UEFI.
> >>
> >> > I have installed Debian with the following partitions on the first
> >> > hard drive (no BTRFS subsystem):
> >> > /dev/sda1: for / (BTRFS)
> >> > /dev/sda2: for /home (BTRFS)
> >> > /dev/sda3: for swap
> >> >
> >> > Then I added another drive for a RAID1 configuration (with btrfs
> >> > balance) and I installed grub on the second hard drive with
> >> > "grub-install /dev/sdb".
> >>
> >> You should be able to mount a two-device btrfs raid1 filesystem
> >> with only a single device with the degraded mount option, tho I
> >> believe current kernels refuse a read-write mount in that case, so
> >> you'll have read-only access until you btrfs device add a second
> >> device, so it can do normal raid1 mode once again.
> >>
> >> Meanwhile, I don't believe it's on the wiki, but it's worth noting
> >> my experience with btrfs raid1 mode in my pre-deployment tests.
> >> Actually, with the (I believe) mandatory read-only mount if raid1
> >> is degraded below two devices, this problem's going to be harder
> >> to run into than it was in my testing several kernels ago, but
> >> here's what I found:
> >>
> >> But as I said, if btrfs only allows read-only mounts of filesystems
> >> without enough devices to properly complete the raidlevel, that
> >> shouldn't be as big an issue these days, since it should be more
> >> difficult or impossible to get the two devices separately mounted
> >> writable in the first place, with the consequence that the
> >> differing copies issue will be difficult or impossible to trigger
> >> in the first place. =:^)
> >>
> 
> Hello,
> 
> With your advice and Chris's, I now have a (clean?) partition layout to
> start experimenting with RAID1 (and which boots correctly in UEFI
> mode):
> sda1 = BIOS Boot partition
> sda2 = EFI System Partition
> sda3 = BTRFS partition
> sda4 = swap partition
> For the moment I haven't created subvolumes (for "/" and for "/home"
> for instance) to keep things simple.
> 
> The idea is then to create a RAID1 with a sdb drive (duplicate sda
> partitioning, add/balance/convert sdb3 + grub-install on sdb, add sdb
> swap UUID in /etc/fstab), shutdown and remove sda to check the
> procedure to replace it.
> 
> I read the last thread on the subject "lost with degraded RAID1", but
> I would like to confirm what the currently approved procedure is and
> whether it will remain valid for future BTRFS versions (especially
> regarding the read-only mount).
> 
> So what should I do from there ?
> Here are a few questions:
> 
> 1) Boot in degraded mode: currently with my kernel
> (3.12-0.bpo.1-amd64, from Debian wheezy-backports) it seems that I can
> mount in read-write mode.
> However, with future kernels it seems that I will only be able to
> mount read-only? See here:
> http://www.spinics.net/lists/linux-btrfs/msg20164.html
> https://bugzilla.kernel.org/show_bug.cgi?id=60594
> 
> 2) If I am able to mount read-write, is this the correct procedure:
>   a) place a new drive in another physical location sdc (I don't think
> I can use the same sda physical location ?)
>   b) boot in degraded mode on sdb
>   c) use the 'replace' command to replace sda by sdc
>   d) perhaps a 'balance' is necessary ?
> 
> 3) Can I also use the above procedure if I am only allowed to mount
> read-only ?
> 
> 4) If I want to use my system without RAID1 support (dangerous I
> know), after booting in degraded mode with read-write, can I convert
> back sdb from RAID1 to RAID0 in a safe way ?
> (btrfs balance start -dconvert=raid0 -mconvert=raid0 /)
> 

To continue with this RAID1 recovery procedure (Debian stable with
kernel 3.12-0.bpo.1-amd64), I tried to reproduce Duncan's setup and the
result is not good: I ended up with kernel errors at boot and during scrub.

Starting with a clean setup of 2 hard drives in RAID1 (sda and sdb) and
a clean snapshot of the rootfs:
1) poweroff, disconnect sda and boot on sdb with rootflags=ro,degraded
2) sdb is mounted ro but automatically remounted read-write by initramfs
3) create a file witness1 and modify a file test.txt with 'alpha' inside
4) poweroff, connect sda, disconnect sdb and boot on sda
5) create a file witness2 and modify a file test.txt with 'beta' inside
6) poweroff, connect sdb and boot on sda
7) the modifications from step 3 are there (but not those from step 5)
8) launch scrub: a lot of errors are detected but no unrepairable errors
9) poweroff, disconnect sdb, boot on sda
10) the modifications from step 3 are there (but not those from step 5)
11) poweroff, boot on sda: kernel panic on startup
12) reboot, boot is possible
13) launch scrub: a lot of errors and a kernel error
14) reboot, error on boot, and same error as step 13 with scrub
15) boot on the clean snapshot taken before step 1: same error on boot
and same error as step 13 with scrub.
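
The divergence created in steps 3 and 5 can in principle be spotted before
both drives are connected together again, by comparing the superblock
generation of each device. A minimal sketch, assuming the BTRFS partitions
are /dev/sda3 and /dev/sdb3 and a btrfs-progs that provides
"btrfs inspect-internal dump-super" (older releases ship the same tool as
"btrfs-show-super"); both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the superblock generation of one device.
# Assumes "btrfs inspect-internal dump-super" is available; on older
# btrfs-progs, use "btrfs-show-super" instead.
super_generation() {
    btrfs inspect-internal dump-super "$1" |
        awk '$1 == "generation" { print $2; exit }'
}

# Compare two generation numbers and report the result.
compare_generations() {
    if [ "$1" -eq "$2" ]; then
        echo "same generation ($1)"
    else
        echo "generations differ ($1 vs $2)"
    fi
}

# Usage (as root), not run automatically:
#   compare_generations "$(super_generation /dev/sda3)" \
#                       "$(super_generation /dev/sdb3)"
```

Differing generations indicate that the two devices are out of sync. Equal
numbers do not prove the copies are identical, since each side may have
been written independently while the other was disconnected.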


I hope that this will be useful for someone. It seems that mounting a
degraded RAID1 read-write is really not a good idea (I still have to find
out how to force ro with Debian). The RAID1 configuration is a bit fragile
and even snapshots won't protect you once it is broken.
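
Since step 2 shows that rootflags=ro does not guarantee that the root
filesystem stays read-only, it may be worth checking the effective mount
mode after each degraded boot. A minimal sketch that reads /proc/mounts
style lines; mount_mode is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style lines on stdin and print
# "ro" or "rw" for the given mount point. If the mount point is listed
# several times, the last entry wins.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            mode = ""
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]
        }
        END { if (mode != "") print mode }
    '
}

# Usage, not run automatically:
#   mount_mode / < /proc/mounts
```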

If someone has some insights, please let me know.

For reference, here is the kernel error at boot:
[   37.575270] BUG: unable to handle kernel NULL pointer dereference at 
00000000000000c0
[   37.575299] IP: [<ffffffffa01a8a3b>] __btrfs_cow_block+0x28b/0x540 [btrfs]
[   37.575328] PGD 0 
[   37.575337] Oops: 0000 [#1] SMP 
[   37.575351] Modules linked in: cpufreq_userspace cpufreq_stats 
cpufreq_powersave cpufreq_conservative nfsd auth_rpcgss oid_registry nfs_acl 
nfs lockd fscache sunrpc nls_utf8 nls_cp437 vfat fat loop x86_pkg_temp_thermal 
coretemp kvm_intel snd_hda_codec_hdmi snd_hda_codec_realtek kvm snd_hda_intel 
snd_hda_codec lib80211_crypt_tkip snd_hwdep crct10dif_pclmul crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel wl(PO) snd_pcm aes_x86_64 lrw 
gf128mul glue_helper snd_page_alloc ablk_helper snd_timer iTCO_wdt 
iTCO_vendor_support cryptd psmouse acpi_cpufreq snd cfg80211 lib80211 i915 
serio_raw drm_kms_helper drm processor button video thermal_sys joydev evdev 
rfkill i2c_i801 i2c_algo_bit lpc_ich i2c_core pcspkr mei_me mei mfd_core 
soundcore btrfs xor hid_generic usbhid hid raid6_pq crc32c libcrc32c sg sd_mod 
sr_mod cdrom crc_t10dif crct10dif_common ahci libahci libata scsi_mod xhci_hcd 
ehci_pci ehci_hcd e1000e ptp pps_core usbcore usb_common
[   37.575667] CPU: 0 PID: 297 Comm: btrfs-transacti Tainted: P           O 
3.12-0.bpo.1-amd64 #1 Debian 3.12.9-1~bpo70+1
[   37.575695] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./Z87E-ITX, BIOS P2.10 10/04/2013
[   37.575719] task: ffff8800610747c0 ti: ffff880061060000 task.ti: 
ffff880061060000
[   37.575738] RIP: 0010:[<ffffffffa01a8a3b>]  [<ffffffffa01a8a3b>] 
__btrfs_cow_block+0x28b/0x540 [btrfs]
[   37.575766] RSP: 0018:ffff8800610618e8  EFLAGS: 00010217
[   37.575781] RAX: 0000160000000000 RBX: ffff88006124f800 RCX: ffff880060f92ea8
[   37.575799] RDX: ffff880000000000 RSI: 000000000000027b RDI: ffff880064a79270
[   37.575818] RBP: ffff880064948920 R08: ffff880061061944 R09: 0000000000000000
[   37.575837] R10: ffff88003734a000 R11: 0000000000000000 R12: ffff880060fe5960
[   37.575856] R13: ffff880064a79270 R14: 0000000000000000 R15: 0000000000000000
[   37.575874] FS:  0000000000000000(0000) GS:ffff880100200000(0000) 
knlGS:0000000000000000
[   37.575895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.575910] CR2: 00000000000000c0 CR3: 000000000180c000 CR4: 00000000001407f0
[   37.575928] Stack:
[   37.575935]  ffff880000000000 0000000300000000 0000000000000000 
ffffffffa01cb08d
[   37.575959]  ffff880061061a38 0000000000000000 ffff880064948920 
0000000000000000
[   37.575983]  0000000000001000 000000b8a21aab99 0000000001000000 
0000000100000000
[   37.576007] Call Trace:
[   37.576022]  [<ffffffffa01cb08d>] ? btrfs_buffer_uptodate+0x6d/0x80 [btrfs]
[   37.576044]  [<ffffffffa01a8ecb>] ? btrfs_cow_block+0x13b/0x200 [btrfs]
[   37.576069]  [<ffffffffa02044e2>] ? btrfs_set_lock_blocking_rw+0xa2/0xe0 
[btrfs]
[   37.576092]  [<ffffffffa01ad33f>] ? btrfs_search_slot+0x46f/0x9c0 [btrfs]
[   37.576115]  [<ffffffffa01b365d>] ? lookup_inline_extent_backref+0xcd/0x5e0 
[btrfs]
[   37.576139]  [<ffffffffa01b3be4>] ? insert_inline_extent_backref+0x74/0x120 
[btrfs]
[   37.576163]  [<ffffffffa01b5a7f>] ? update_block_group.isra.75+0xbf/0x270 
[btrfs]
[   37.576183]  [<ffffffff8116f97c>] ? kmem_cache_alloc+0x1bc/0x1f0
[   37.576203]  [<ffffffffa01b44ef>] ? __btrfs_inc_extent_ref+0xaf/0x250 [btrfs]
[   37.576227]  [<ffffffffa01bb47b>] ? run_clustered_refs+0xd4b/0xef0 [btrfs]
[   37.576251]  [<ffffffffa0214804>] ? btrfs_find_ref_cluster+0x74/0x170 [btrfs]
[   37.576274]  [<ffffffffa01bf370>] ? btrfs_run_delayed_refs+0xd0/0x530 [btrfs]
[   37.576300]  [<ffffffffa01e8683>] ? btrfs_run_ordered_operations+0x213/0x2b0 
[btrfs]
[   37.576326]  [<ffffffffa01cf90a>] ? btrfs_commit_transaction+0x5a/0x9f0 
[btrfs]
[   37.576350]  [<ffffffffa01c9385>] ? transaction_kthread+0x1b5/0x220 [btrfs]
[   37.576373]  [<ffffffffa01c91d0>] ? btree_readpage_end_io_hook+0x2d0/0x2d0 
[btrfs]
[   37.576394]  [<ffffffff81082333>] ? kthread+0xb3/0xc0
[   37.576408]  [<ffffffff81082280>] ? flush_kthread_worker+0xa0/0xa0
[   37.576426]  [<ffffffff814cb70c>] ? ret_from_fork+0x7c/0xb0
[   37.576442]  [<ffffffff81082280>] ? flush_kthread_worker+0xa0/0xa0
[   37.576458] Code: 00 48 39 2b 0f 84 e6 01 00 00 48 83 bb d7 01 00 00 f8 48 
c7 44 24 38 00 00 00 00 0f 84 bf 01 00 00 48 b8 00 00 00 00 00 16 00 00 <49> 03 
87 c0 00 00 00 48 ba b7 6d db b6 6d db b6 6d 48 c1 f8 03 
[   37.576578] RIP  [<ffffffffa01a8a3b>] __btrfs_cow_block+0x28b/0x540 [btrfs]
[   37.576601]  RSP <ffff8800610618e8>
[   37.576611] CR2: 00000000000000c0
[   37.576622] ---[ end trace eb7650bbae358a3a ]---

And here is the scrub error:
[  390.015858] BTRFS critical (device sdb3): unable to find logical 
687194767360 len 4096
[  390.015957] ------------[ cut here ]------------
[  390.015989] kernel BUG at 
/build/linux-SMWX37/linux-3.12.9/fs/btrfs/inode.c:1595!
[  390.016037] invalid opcode: 0000 [#1] SMP 
[  390.016070] Modules linked in: cpufreq_userspace cpufreq_stats 
cpufreq_powersave cpufreq_conservative nfsd auth_rpcgss oid_registry nfs_acl 
nfs lockd fscache sunrpc nls_utf8 nls_cp437 vfat fat loop x86_pkg_temp_thermal 
coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel aesni_intel aes_x86_64 snd_hda_codec_hdmi 
snd_hda_codec_realtek lrw gf128mul snd_hda_intel snd_hda_codec snd_hwdep 
lib80211_crypt_tkip wl(PO) iTCO_wdt iTCO_vendor_support glue_helper ablk_helper 
snd_pcm cryptd acpi_cpufreq pcspkr psmouse mei_me mei i915 lpc_ich video 
processor drm_kms_helper drm mfd_core i2c_algo_bit snd_page_alloc snd_timer snd 
cfg80211 lib80211 rfkill i2c_i801 soundcore thermal_sys serio_raw i2c_core 
joydev evdev button hid_generic usbhid hid btrfs xor raid6_pq crc32c libcrc32c 
sg sd_mod sr_mod cdrom crc_t10dif crct10dif_common ahci libahci xhci_hcd 
ehci_pci ehci_hcd e1000e ptp pps_core libata scsi_mod usbcore usb_common
[  390.016840] CPU: 0 PID: 3045 Comm: btrfs Tainted: P           O 
3.12-0.bpo.1-amd64 #1 Debian 3.12.9-1~bpo70+1
[  390.016899] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./Z87E-ITX, BIOS P2.10 10/04/2013
[  390.016957] task: ffff88006a9207c0 ti: ffff8800641be000 task.ti: 
ffff8800641be000
[  390.017004] RIP: 0010:[<ffffffffa01b5671>]  [<ffffffffa01b5671>] 
btrfs_merge_bio_hook+0x71/0x80 [btrfs]
[  390.017084] RSP: 0018:ffff8800641bf4f8  EFLAGS: 00010282
[  390.017119] RAX: 00000000ffffffea RBX: 0000000000001000 RCX: 0000000000000000
[  390.017163] RDX: ffff88010020ffa8 RSI: ffff88010020e4b8 RDI: 0000000000000246
[  390.017207] RBP: 0000000000001000 R08: 0000000000000000 R09: ffff8800606c5970
[  390.017250] R10: 00000000000002fb R11: ffffffff8164e348 R12: ffff880063c08c28
[  390.017294] R13: 0000000000000000 R14: ffff880064bd88b0 R15: 0000000050000008
[  390.017338] FS:  00007fe25f8fe700(0000) GS:ffff880100200000(0000) 
knlGS:0000000000000000
[  390.017388] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  390.017424] CR2: 00007fe25f933f80 CR3: 0000000036435000 CR4: 00000000001407f0
[  390.017493] Stack:
[  390.017518]  0000000000001000 0000000000001000 ffff880064bd88b0 
ffff8800641bf768
[  390.017616]  0000000000001000 ffffffffa01cf313 0000000000000000 
ffffea000159bff0
[  390.017712]  0000000000000000 ffff880100240380 ffffffffa01cf700 
0000000000000020
[  390.017809] Call Trace:
[  390.017867]  [<ffffffffa01cf313>] ? submit_extent_page.isra.38+0xf3/0x250 
[btrfs]
[  390.017967]  [<ffffffffa01cf700>] ? repair_io_failure+0x240/0x240 [btrfs]
[  390.018039]  [<ffffffffa01d0587>] ? __do_readpage+0x4b7/0x720 [btrfs]
[  390.018092]  [<ffffffffa01cf700>] ? repair_io_failure+0x240/0x240 [btrfs]
[  390.018150]  [<ffffffffa01a99d0>] ? verify_parent_transid+0x190/0x190 [btrfs]
[  390.018207]  [<ffffffffa01d08a4>] ? __extent_read_full_page+0xb4/0xd0 [btrfs]
[  390.018264]  [<ffffffffa01a99d0>] ? verify_parent_transid+0x190/0x190 [btrfs]
[  390.018320]  [<ffffffffa01a99d0>] ? verify_parent_transid+0x190/0x190 [btrfs]
[  390.018380]  [<ffffffffa01d312d>] ? read_extent_buffer_pages+0x25d/0x350 
[btrfs]
[  390.018437]  [<ffffffffa01a99d0>] ? verify_parent_transid+0x190/0x190 [btrfs]
[  390.018493]  [<ffffffffa01aaf59>] ? 
btree_read_extent_buffer_pages.constprop.128+0xa9/0x110 [btrfs]
[  390.018559]  [<ffffffffa01ad5d3>] ? read_tree_block+0x33/0x60 [btrfs]
[  390.018610]  [<ffffffffa018dc65>] ? 
read_block_for_search.isra.45+0x195/0x3d0 [btrfs]
[  390.018669]  [<ffffffffa01938b9>] ? btrfs_next_old_leaf+0x2f9/0x490 [btrfs]
[  390.018726]  [<ffffffffa0206056>] ? scrub_stripe+0x5f6/0x1200 [btrfs]
[  390.018771]  [<ffffffff81082d00>] ? add_wait_queue+0x40/0x60
[  390.018820]  [<ffffffffa0206d8c>] ? scrub_chunk.isra.18+0x12c/0x150 [btrfs]
[  390.018875]  [<ffffffffa0207013>] ? scrub_enumerate_chunks+0x263/0x5a0 
[btrfs]
[  390.018922]  [<ffffffff81082d00>] ? add_wait_queue+0x40/0x60
[  390.018971]  [<ffffffffa0208756>] ? btrfs_scrub_dev+0x1e6/0x570 [btrfs]
[  390.019027]  [<ffffffffa01e5e31>] ? btrfs_ioctl+0xe91/0x1d30 [btrfs]
[  390.019079]  [<ffffffffa01e5e8b>] ? btrfs_ioctl+0xeeb/0x1d30 [btrfs]
[  390.019123]  [<ffffffff814c305d>] ? rwsem_down_read_failed+0x9d/0xf0
[  390.019165]  [<ffffffff8128ef54>] ? call_rwsem_down_read_failed+0x14/0x30
[  390.019210]  [<ffffffff814c75b8>] ? __do_page_fault+0x2b8/0x540
[  390.019252]  [<ffffffff811971ca>] ? do_vfs_ioctl+0x8a/0x4f0
[  390.019289]  [<ffffffff81062c1d>] ? do_exit+0x6fd/0xa80
[  390.019325]  [<ffffffff810135f1>] ? __switch_to+0x171/0x4c0
[  390.019363]  [<ffffffff811976d0>] ? SyS_ioctl+0xa0/0xc0
[  390.019399]  [<ffffffff814cb7b9>] ? system_call_fastpath+0x16/0x1b
[  390.019437] Code: c9 45 31 c0 89 fe 48 89 c7 48 89 6c 24 08 e8 97 5f 02 00 
85 c0 78 14 48 01 eb 31 c0 48 3b 5c 24 08 0f 97 c0 48 83 c4 18 5b 5d c3 <0f> 0b 
66 66 66 66 2e 0f 1f 84 00 00 00 00 00 53 48 89 fb 89 f1 
[  390.019733] RIP  [<ffffffffa01b5671>] btrfs_merge_bio_hook+0x71/0x80 [btrfs]
[  390.019793]  RSP <ffff8800641bf4f8>
[  390.019838] ---[ end trace 74720f4e8a3bc0fa ]---
