Bug#687456: btrfs: Oops when adding a device on a degraded raid1 filesystem
tags 687456 + moreinfo
quit

Hi Antoine,

Antoine Sirinelli wrote:

> This bug has been fixed by this commit
>
> commit 99f5944b8477914406173b47b4f261356286730b
> Btrfs: do not strdup non existent strings

Sorry for the slow response. Unfortunately 99f5944b8477 fixes a regression introduced in v3.5-rc3~1^2~14 (Btrfs: use rcu to protect device-name, 2012-06-04), so it can't be the fix on its own.

How did you track down that patch? E.g., did you bisect?

Curious,
Jonathan

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20121125081703.GA14629@elie.Belkin
Bug#687456: btrfs: Oops when adding a device on a degraded raid1 filesystem
On Sun, Nov 25, 2012 at 12:17:03AM -0800, Jonathan Nieder wrote:

> Unfortunately 99f5944b8477 fixes a regression introduced in v3.5-rc3~1^2~14 (Btrfs: use rcu to protect device-name, 2012-06-04), so it can't be the fix on its own.
>
> How did you track down that patch? E.g., did you bisect?

No I was lazy, I just asked on btrfs mailing list:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/19716

The answer and the patch looked reasonable and as it was working with 3.6.0-rc5, I was happy. I could try to apply this patch to the current Debian kernel to see if it solves the issue.

Do you think this bug should be considered as RC as it prevents people to change disk in btrfs-raid configuration?

Here is the patch:

commit 99f5944b8477914406173b47b4f261356286730b
Author: Josef Bacik jba...@fusionio.com
Date: Thu Aug 2 10:23:59 2012 -0400

    Btrfs: do not strdup non existent strings

    When we close devices we add back empty devices for some reason that escapes me. In the case of a missing dev we don't allocate an rcu_string for it's name, so check to see if the device has a name and if it doesn't don't bother strdup()'ing it. Thanks,

    Signed-off-by: Josef Bacik jba...@fusionio.com

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b8708f9..3b39450 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -569,9 +569,11 @@ static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 		memcpy(new_device, device, sizeof(*new_device));
 
 		/* Safe because we are under uuid_mutex */
-		name = rcu_string_strdup(device->name->str, GFP_NOFS);
-		BUG_ON(device->name && !name); /* -ENOMEM */
-		rcu_assign_pointer(new_device->name, name);
+		if (device->name) {
+			name = rcu_string_strdup(device->name->str, GFP_NOFS);
+			BUG_ON(device->name && !name); /* -ENOMEM */
+			rcu_assign_pointer(new_device->name, name);
+		}
 		new_device->bdev = NULL;
 		new_device->writeable = 0;
 		new_device->in_fs_metadata = 0;

Antoine
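For readers who do not have the kernel tree at hand, the idea behind the patch can be sketched in plain userland C: only duplicate the name when the source device actually has one. This is an illustration under assumptions, not the kernel code. The types `struct name_buf` and `struct device` and the helpers `name_buf_dup()` and `clone_device()` are hypothetical stand-ins for `struct rcu_string`, `struct btrfs_device`, `rcu_string_strdup()` and the copy loop in `__btrfs_close_devices()`; there is no RCU or locking here, and allocation failure is reported with a return code rather than `BUG_ON()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userland stand-in for the kernel's struct rcu_string. */
struct name_buf {
    char str[64];
};

/* Hypothetical stand-in for struct btrfs_device; only the fields used here. */
struct device {
    struct name_buf *name;   /* NULL when the device is missing */
    int writeable;
};

/* Duplicate a name; returns NULL if the allocation fails. */
static struct name_buf *name_buf_dup(const char *src)
{
    struct name_buf *copy = calloc(1, sizeof(*copy));

    if (!copy)
        return NULL;
    /* calloc zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(copy->str, src, sizeof(copy->str) - 1);
    return copy;
}

/*
 * Copy src into dst, giving dst its own copy of the name.
 * A missing device (src->name == NULL) keeps a NULL name instead of
 * dereferencing it, which is the guard the patch adds.
 * Returns 0 on success, -1 if the name could not be allocated.
 */
static int clone_device(const struct device *src, struct device *dst)
{
    assert(src && dst);

    memcpy(dst, src, sizeof(*dst));
    dst->name = NULL;            /* never alias the source's name */
    if (src->name) {
        dst->name = name_buf_dup(src->name->str);
        if (!dst->name)
            return -1;
    }
    dst->writeable = 0;
    return 0;
}
```

Calling `clone_device()` on a device whose `name` is NULL returns 0 and leaves the copy's `name` NULL; without the `if (src->name)` check, the `src->name->str` access would be the NULL dereference that the commit message describes.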
Bug#687456: btrfs: Oops when adding a device on a degraded raid1 filesystem
tags 687456 patch
thanks

This bug has been fixed by this commit

commit 99f5944b8477914406173b47b4f261356286730b
Btrfs: do not strdup non existent strings

This is available in 3.6.0-rc5. Can this patch be cherry-picked?

Antoine

--
Archive: http://lists.debian.org/20120918193836.GA8840@kabis
Bug#687456: btrfs: Oops when adding a device on a degraded raid1 filesystem
severity 687456 normal
thanks

In fact I created the RAID1-like filesystem the wrong way (btrfs wiki is wrong). I was using a RAID1 strategy only for the metadata and not the data. With the following command-line to create the fs, I no longer get the Oops:

# mkfs.btrfs -m raid1 -d raid1 /dev/vdb /dev/vdc

(-d flag added).

Nevertheless, we still have a condition where we got a kernel Oops where we should not so I am leaving the bug open but with a normal severity.

Antoine

--
Archive: http://lists.debian.org/20120913201008.GA4496@kabis
Bug#687456: btrfs: Oops when adding a device on a degraded raid1 filesystem
Package: src:linux
Version: 3.2.23-1
Severity: grave

On a very fresh wheezy install I was able to crash the kernel by playing with btrfs. I am reporting it using a kvm image but I first had the bug on real boxes.

To reproduce the bug, I am creating a raid1 btrfs filesystem, creating some file and stopping the system:

# mkfs.btrfs -m raid1 /dev/vdb /dev/vdc
# mount /dev/vdb /mnt
# dd if=/dev/zero of=/mnt/zeros count=1M
# umount /mnt
# shutdown -h now

While it is stopped, I wipe one of the drive to make it invalid (/dev/vdc). I can now restart the system, mount the btrfs filesystem in degraded mode and when I want to add a new device to replace the faulty one, it crashes:

# mount -o degraded /dev/vdb /mnt
# btrfs device add /dev/vdc /mnt

Here is the output of the kernel crash:

[7.209204] btrfs: error reading free space cache
[7.209403] BUG: unable to handle kernel NULL pointer dereference at 0001
[7.209962] IP: [a013cd22] io_ctl_drop_pages+0x1e/0x48 [btrfs]
[7.210210] PGD 37546067 PUD 3c174067 PMD 0
[7.210566] Oops: 0002 [#1] SMP
[7.210821] CPU 0
[7.210919] Modules linked in: netconsole configfs nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc loop joydev usbhid hid virtio_balloon processor psmouse thermal_sys evdev serio_raw button snd_pcm snd_page_alloc snd_timer snd soundcore i2c_piix4 i2c_core pcspkr ext4 crc16 jbd2 mbcache btrfs crc32c libcrc32c zlib_deflate sg sr_mod cdrom ata_generic virtio_net virtio_blk floppy ata_piix uhci_hcd ehci_hcd virtio_pci virtio_ring virtio libata scsi_mod usbcore usb_common [last unloaded: scsi_wait_scan]
[7.212225]
[7.212225] Pid: 2244, comm: btrfs Not tainted 3.2.0-3-amd64 #1 Bochs Bochs
[7.212225] RIP: 0010:[a013cd22] [a013cd22] io_ctl_drop_pages+0x1e/0x48 [btrfs]
[7.212225] RSP: 0018:8800376f7528 EFLAGS: 00010203
[7.212225] RAX: 0001 RBX: 8800376f75c0 RCX: 0034
[7.212225] RDX: RSI: eadfbc68 RDI: eadfbc68
[7.212225] RBP: 000d R08: 88003d332520 R09: 0020
[7.212225] R10: 0001 R11: R12: 000d
[7.212225] R13: 000c R14: 0002005a R15: 0001
[7.212225] FS: 7fd9e9b5f760() GS:88003fc0() knlGS:
[7.212225] CS: 0010 DS: ES: CR0: 8005003b
[7.212225] CR2: 0001 CR3: 3b9f CR4: 06f0
[7.212225] DR0: DR1: DR2:
[7.212225] DR3: DR6: 0ff0 DR7: 0400
[7.212225] Process btrfs (pid: 2244, threadinfo 8800376f6000, task 88003bf716d0)
[7.212225] Stack:
[7.212225] 8800376f75c0 000c 88003de7e5b8 a013d581
[7.212225] 0282 eadfbc68 0014 88003cb3cdc0
[7.212225] 88003de7e588 0008 0014 88003d332520
[7.212225] Call Trace:
[7.212225] [a013d581] ? io_ctl_prepare_pages.isra.24+0xc1/0x10c [btrfs]
[7.212225] [a013ea60] ? __load_free_space_cache+0x185/0x463 [btrfs]
[7.212225] [a013ee08] ? load_free_space_cache+0xca/0x15b [btrfs]
[7.212225] [a0103033] ? cache_block_group+0x1d9/0x34d [btrfs]
[7.212225] [8102bb64] ? pvclock_clocksource_read+0x42/0xb2
[7.212225] [8105f5f3] ? add_wait_queue+0x3c/0x3c
[7.212225] [a0105200] ? find_free_extent.constprop.71+0x3e0/0x930 [btrfs]
[7.212225] [a0107d7c] ? btrfs_reserve_extent+0xb0/0x18e [btrfs]
[7.212225] [a0108284] ? btrfs_alloc_free_block+0x15d/0x284 [btrfs]
[7.212225] [a00fb7bf] ? __btrfs_cow_block+0x102/0x33a [btrfs]
[7.212225] [a00fbaee] ? btrfs_cow_block+0xf7/0x143 [btrfs]
[7.212225] [a00fe546] ? btrfs_search_slot+0x225/0x64e [btrfs]
[7.212225] [a00ff5d9] ? btrfs_insert_empty_items+0x5c/0xad [btrfs]
[7.212225] [a0106cf1] ? run_clustered_refs+0x386/0x6aa [btrfs]
[7.212225] [a00fa2a0] ? leaf_space_used+0x4a/0x6f [btrfs]
[7.212225] [a01070de] ? btrfs_run_delayed_refs+0xc9/0x176 [btrfs]
[7.212225] [8134a1b0] ? mutex_lock+0xd/0x2d
[7.212225] [a0114153] ? btrfs_commit_transaction+0x8f/0x6f9 [btrfs]
[7.212225] [8105f5f3] ? add_wait_queue+0x3c/0x3c
[7.212225] [81036457] ? should_resched+0x5/0x23
[7.212225] [a01324fa] ? btrfs_init_new_device+0x871/0xa56 [btrfs]
[7.212225] [a01377bd] ? btrfs_ioctl+0x699/0xe1c [btrfs]
[7.212225] [810eb5d3] ? __kmalloc_track_caller+0xfe/0x110
[7.212225] [81036457] ? should_resched+0x5/0x23
[7.212225] [a01377df] ? btrfs_ioctl+0x6bb/0xe1c [btrfs]
[7.212225] [8103ad6e] ? check_preempt_curr+0x52/0x5f
[