Mysterious Kernel Bug
Hi!

I'm using btrfs on this system:

Linux qmap02 3.11.0-18-generic #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

During a task with heavy I/O (it's a database server), every disk-accessing process suddenly freezes. dmesg outputs the following when this happens:

[38108.895509] ------------[ cut here ]------------
[38108.895663] kernel BUG at /build/buildd/linux-3.11.0/fs/btrfs/ctree.c:2964!
[38108.895825] invalid opcode: [#1] SMP
[38108.895970] Modules linked in: cirrus ttm drm_kms_helper drm syscopyarea sysfillrect sysimgblt kvm_intel kvm virtio_balloon i2c_piix4 mac_hid virtio_console psmouse serio_raw microcode lp parport btrfs xor zlib_deflate raid6_pq libcrc32c floppy
[38108.896013] CPU: 1 PID: 29950 Comm: btrfs-endio-wri Not tainted 3.11.0-18-generic #32-Ubuntu
[38108.896013] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[38108.896013] task: 880dcf2c2ee0 ti: 8809b923 task.ti: 8809b923
[38108.896013] RIP: 0010:[] [] btrfs_set_item_key_safe+0x161/0x170 [btrfs]
[38108.896013] RSP: 0018:8809b9231b20 EFLAGS: 00010286
[38108.896013] RAX: RBX: 0018 RCX: 17466000
[38108.896013] RDX: RSI: 8809b9231c26 RDI: 8809b9231b3f
[38108.896013] RBP: 8809b9231b78 R08: 00017340 R09:
[38108.896013] R10: 17468000 R11: R12: 8809b9231b2e
[38108.896013] R13: 880dced2bd80 R14: 8809b9231c26 R15: 880b95815680
[38108.896013] FS: () GS:880e1fc4() knlGS:
[38108.896013] CS: 0010 DS: ES: CR0: 8005003b
[38108.896013] CR2: 0085f001 CR3: 000c4dbcf000 CR4: 06e0
[38108.896013] Stack:
[38108.896013] 880cee276000 081a 006c 1a174640
[38108.896013] 6c08 17464000 880b95815680 17466000
[38108.896013] 0ad3 880dced2bd80 8809b9231c70
[38108.896013] Call Trace:
[38108.896013] [] __btrfs_drop_extents+0x421/0xae0 [btrfs]
[38108.896013] [] btrfs_drop_extents+0x60/0x90 [btrfs]
[38108.896013] [] insert_reserved_file_extent.constprop.58+0x6c/0x290 [btrfs]
[38108.896013] [] ? _raw_spin_lock+0xe/0x20
[38108.896013] [] btrfs_finish_ordered_io+0x4ef/0x910 [btrfs]
[38108.896013] [] ? mempool_free_slab+0x17/0x20
[38108.896013] [] ? mempool_free+0x49/0x90
[38108.896013] [] ? bio_put+0x7d/0xa0
[38108.896013] [] ? end_compressed_bio_write+0xe8/0xf0 [btrfs]
[38108.896013] [] finish_ordered_fn+0x15/0x20 [btrfs]
[38108.896013] [] worker_loop+0x15e/0x550 [btrfs]
[38108.896013] [] ? btrfs_queue_worker+0x320/0x320 [btrfs]
[38108.896013] [] kthread+0xc0/0xd0
[38108.896013] [] ? kthread_create_on_node+0x120/0x120
[38108.896013] [] ret_from_fork+0x7c/0xb0
[38108.896013] [] ? kthread_create_on_node+0x120/0x120
[38108.896013] Code: 48 8b 45 bf 48 8d 7d c7 4c 89 f6 48 89 45 d0 0f b6 45 be 88 45 cf 48 8b 45 b6 48 89 45 c7 e8 c7 f2 ff ff 85 c0 0f 8f 48 ff ff ff <0f> 0b 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
[38108.896013] RIP [] btrfs_set_item_key_safe+0x161/0x170 [btrfs]
[38108.896013] RSP
[38108.947225] ---[ end trace f5739e163e2811a8 ]---

Does anybody have any idea what causes this problem and how to solve it?

Thank you a lot!

Regards,
Juergen Fitschen
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Any use for mkfs.btrfs -d raid5 -m raid1 ?
While raid1 metadata with raid5 data would seem to be a non-useful configuration, I've taken to using raid10 metadata with raid0 data (and if I'm right, my logic could probably be extended to claim that r10 metadata would be a good choice with r5 data).

In theory this would preserve some copy of the metadata in the case of a pair of blown drives in r5, or in the case of any blown drive in r0. I have not done any testing, but my assumption in setting it up this way is that even though my data is r0 and losing a drive would obviously send that particular data into the weeds, there would still be a good metadata structure, so I'd be able to recover data that lay elsewhere on the BTRFS.

My setup here, BTW, is 10x 28TB LUNs (250+TB btrfs filesystem), BTRFS raid0 for data, raid10 for metadata. This is just log archive data, so total data redundancy isn't much of an issue. Obviously if one of my 28TB LUNs pooped the bed, I'd lose all of the data there, but presumably since the metadata is redundant across other LUNs, I'd still be able to rescue a decent amount of the remaining data.

Now that I've written this entire email out, I'm thinking I should probably actually test the implication of a total LUN loss though, because otherwise I'm just wasting space mirroring the striped metadata (not that it really matters that much though, storage is cheap).

-ben

On Mar 23, 2014, at 7:11 PM, Marc MERLIN wrote:
> On Sun, Mar 23, 2014 at 10:52:29PM +, Hugo Mills wrote:
>> On Sun, Mar 23, 2014 at 03:44:35PM -0700, Marc MERLIN wrote:
>>> If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my
>>> metadata.
>>> From there, would I indeed have small files that would be stored entirely on
>>> some of the drives that didn't go missing, and therefore I could recover
>>> some data with 2 missing drives?
>>
>> btrfs's RAID-1 is two copies only, so you may well have lost some
>> of your metadata. n-copies RAID-1 is coming Real Soon Now™ (Chris has
>> it on his todo list, along with fixing all the parity RAID stuff).
>
> Oh, right, I forgot about that. Then I'm not coming up with many good
> reasons why raid1 metadata with raid5 data would be useful.
> Actually raid5 metadata should be faster since it's striped on more drives.
>
> I'll update the doc I just posted, thanks.
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems
> what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/

--
Benjamin O'Connor
TechOps Systems Administrator
TripAdvisor Media Group
bocon...@tripadvisor.com
c. 617-312-9072
Re: for Chris Mason ( iowatcher graphs)
2014-03-23 21:47 GMT+04:00 Hugo Mills :
>> Hello. Sorry for writing to btrfs mailing list, but personal mail
>> reject my message.
>> Saying "
>> : host 10.101.1.19[10.101.1.19] said: 554 5.4.6 Hop
>> count exceeded - possible mail loop (in reply to end of DATA command)
>
> He's moved to Facebook now.

Thanks! I'll resend to c...@fb.com

--
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru
btrfs: kernel BUG at fs/btrfs/extent_io.c:676!
Hi all, While fuzzing with trinity inside KVM tools guest running latest -next kernel I've stumbled on the following spew. This is a result of a failed allocation in alloc_extent_state_atomic() which triggers a BUG_ON when the return value is NULL. It's a bit weird that it BUGs on failed allocations, since it's obviously not a critical failure. [ 447.705167] kernel BUG at fs/btrfs/extent_io.c:676! [ 447.706201] invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 447.707732] Dumping ftrace buffer: [ 447.708473](ftrace buffer empty) [ 447.709684] Modules linked in: [ 447.710246] CPU: 17 PID: 4195 Comm: kswapd17 Tainted: GW 3.14.0-rc7-next-20140321-sasha-00018-g0516fe6-dirty #265 [ 447.710253] task: 88066be9b000 ti: 88066be82000 task.ti: 88066be82000 [ 447.710253] RIP: clear_extent_bit (fs/btrfs/extent_io.c:676) [ 447.710253] RSP: :88066be83768 EFLAGS: 00010246 [ 447.710253] RAX: RBX: 00d00fff RCX: 0006 [ 447.710253] RDX: 58e0 RSI: 88066be9bd60 RDI: 0286 [ 447.710253] RBP: 88066be837e8 R08: R09: [ 447.710253] R10: 0001 R11: 454a4e495f544c55 R12: 01ff [ 447.710253] R13: R14: 88007b89fd08 R15: 00d0 [ 447.710253] FS: () GS:8804acc0() knlGS: [ 447.710253] CS: 0010 DS: ES: CR0: 8005003b [ 447.710253] CR2: 02aec968 CR3: 05e29000 CR4: 06a0 [ 447.710253] DR0: 00698000 DR1: 00698000 DR2: [ 447.710253] DR3: DR6: 0ff0 DR7: 0600 [ 447.710253] Stack: [ 447.710253] 88066be83788 844fc4d5 8804ab4800e8 [ 447.710253] 0001 8804ab4800c8 fbf7 [ 447.710253] 88066be837c8 0006 ea0007aaf340 [ 447.710253] Call Trace: [ 447.710253] ? 
_raw_spin_unlock (arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183) [ 447.710253] try_release_extent_mapping (fs/btrfs/extent_io.c:3998 fs/btrfs/extent_io.c:4058) [ 447.710253] __btrfs_releasepage (fs/btrfs/inode.c:7521) [ 447.710253] btrfs_releasepage (fs/btrfs/inode.c:7534) [ 447.710253] try_to_release_page (mm/filemap.c:2984) [ 447.710253] invalidate_inode_page (mm/truncate.c:165 mm/truncate.c:215) [ 447.710253] invalidate_mapping_pages (mm/truncate.c:517) [ 447.710253] inode_lru_isolate (arch/x86/include/asm/current.h:14 include/linux/swap.h:33 fs/inode.c:724) [ 447.710253] ? insert_inode_locked (fs/inode.c:687) [ 447.710253] list_lru_walk_node (mm/list_lru.c:89) [ 447.710253] prune_icache_sb (fs/inode.c:759) [ 447.710253] super_cache_scan (fs/super.c:96) [ 447.710253] shrink_slab_node (mm/vmscan.c:306) [ 447.710253] shrink_slab (mm/vmscan.c:381) [ 447.710253] kswapd_shrink_zone (mm/vmscan.c:2909) [ 447.710253] kswapd (mm/vmscan.c:3090 mm/vmscan.c:3296) [ 447.710253] ? mem_cgroup_shrink_node_zone (mm/vmscan.c:3213) [ 447.710253] kthread (kernel/kthread.c:219) [ 447.710253] ? __tick_nohz_task_switch (arch/x86/include/asm/paravirt.h:809 kernel/time/tick-sched.c:272) [ 447.710253] ? kthread_create_on_node (kernel/kthread.c:185) [ 447.710253] ret_from_fork (arch/x86/kernel/entry_64.S:555) [ 447.710253] ? kthread_create_on_node (kernel/kthread.c:185) [ 447.710253] Code: e9 a9 00 00 00 0f 1f 00 48 39 c3 0f 82 87 00 00 00 4c 39 e3 0f 83 7e 00 00 00 48 8b 7d a0 e8 45 ef ff ff 48 85 c0 49 89 c5 75 05 <0f> 0b 0f 1f 00 48 8b 7d b0 48 8d 4b 01 48 89 c2 4c 89 f6 e8 c5 [ 447.710253] RIP clear_extent_bit (fs/btrfs/extent_io.c:676) [ 447.710253] RSP Thanks, Sasha -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Any use for mkfs.btrfs -d raid5 -m raid1 ?
On Sun, Mar 23, 2014 at 10:52:29PM +, Hugo Mills wrote:
> On Sun, Mar 23, 2014 at 03:44:35PM -0700, Marc MERLIN wrote:
> > If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my
> > metadata.
> > From there, would I indeed have small files that would be stored entirely on
> > some of the drives that didn't go missing, and therefore I could recover
> > some data with 2 missing drives?
>
> btrfs's RAID-1 is two copies only, so you may well have lost some
> of your metadata. n-copies RAID-1 is coming Real Soon Now™ (Chris has
> it on his todo list, along with fixing all the parity RAID stuff).

Oh, right, I forgot about that. Then I'm not coming up with many good
reasons why raid1 metadata with raid5 data would be useful.
Actually raid5 metadata should be faster since it's striped on more drives.

I'll update the doc I just posted, thanks.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Btrfs and raid5 status with kernel 3.14, documentation, and howto
Ok, thanks to the help I got from you, and my own experiments, I've written this:
http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html

If someone reminds me how to edit the btrfs wiki, I'm happy to copy that there, or give anyone permission to take part or all of what I wrote and use it for any purpose.

The highlights, if you're coming from the mdadm raid5 world, are:

- btrfs does not yet seem to know that if you removed a drive from an array and you plug it back in later, that drive is out of date. It will auto-add an out of date drive back to an array, and that will likely cause data loss by hiding files you had but the old drive didn't have. This means you should wipe a drive cleanly before you put it back into an array it used to be part of.

- btrfs does not deal well with a drive that is present but not working. It does not know how to kick it from the array, nor can the drive be removed (btrfs device delete), because this causes reads from the drive that isn't working. This means btrfs will try to write to the bad drive forever. The solution is to umount the array, remount it with the bad drive missing (it cannot be seen by btrfs, or it'll get automounted/added), and then rebuild on a new drive or rebuild/shrink the array to be one drive smaller (this is explained below).

- You can add and remove drives from an array and rebalance to grow/shrink an array without umounting it. Note that this is slow since it forces rewriting of all data blocks, and this takes about 3H per 100GB (or 30H per terabyte) with 10 drives on a dual core duo.

- If you are missing a drive, btrfs will refuse to mount the array and give an obscure error unless you mount with -o degraded.

- btrfs has no special rebuild procedure. Rebuilding is done by rebalancing the array. You could actually rebalance a degraded array to a smaller array by rebuilding/balancing without adding a drive, or you can add a drive and rebalance onto it, which will force a read/rewrite of all data blocks and restripe them nicely.

- btrfs replace does not work, but you can easily do btrfs device add, and btrfs device delete of the other drive, and this will do the same thing.

- btrfs device add will not cause an auto rebalance. You could choose not to rebalance existing data and only have new data be balanced properly.

- btrfs device delete will force all data from the deleted drive to be rebalanced, and the command completes when the drive has been freed up.

- The magic command to delete an unused drive from an array while it is missing from the system is
  btrfs device delete missing .

- btrfs doesn't easily tell you that your array is in degraded mode (run btrfs fi show, and it'll show a missing drive as well as how much of your total data is still on it). This does mean you can have an array that is half degraded: half the files are striped over the current drives because they were written after the drive was removed, or were written by a rebalance that hasn't finished, while the other half of your data could be in degraded mode.
  You can see this by looking at the amount of data on each drive: anything on drive 11 is properly striped 10 way, while anything on drive 3 is in degraded mode:

polgara:~# btrfs fi show
Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
        Total devices 11 FS bytes used 564.54GiB
        devid    1 size 465.76GiB used 63.14GiB path /dev/dm-0
        devid    2 size 465.76GiB used 63.14GiB path /dev/dm-1
        devid    3 size 465.75GiB used 30.00GiB path   <- this device is missing
        devid    4 size 465.76GiB used 63.14GiB path /dev/dm-2
        devid    5 size 465.76GiB used 63.14GiB path /dev/dm-3
        devid    6 size 465.76GiB used 63.14GiB path /dev/dm-4
        devid    7 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdi1
        devid    8 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdj1
        devid    9 size 465.76GiB used 63.14GiB path /dev/dm-7
        devid   10 size 465.76GiB used 63.14GiB path /dev/dm-8
        devid   11 size 465.76GiB used 33.14GiB path /dev/mapper/crypt_sde1   <- this device was added

Hope this helps,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
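The check described in the last bullet (spot the missing device and the unevenly filled drives in `btrfs fi show`) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is three lines copied from the listing above without the hand-written "<-" annotations, an empty path is assumed to be how a missing device appears, and `parse_fi_show`, `missing_devices` and `rebalance_hours` are hypothetical helper names, not part of btrfs-progs. The time estimate simply scales the "about 3H per 100GB" figure quoted above.

```python
import re

# Three lines copied from the listing above, annotations removed.
# Assumption: a missing device is printed with an empty path.
SAMPLE = """\
devid    1 size 465.76GiB used 63.14GiB path /dev/dm-0
devid    3 size 465.75GiB used 30.00GiB path
devid   11 size 465.76GiB used 33.14GiB path /dev/mapper/crypt_sde1
"""

DEVID_RE = re.compile(
    r"devid\s*(\d+)\s+size\s+(\S+)\s+used\s+(\S+)\s+path[ \t]*(\S*)"
)


def parse_fi_show(text):
    """Return one dict per 'devid' line; path is None when it is empty."""
    devices = []
    for line in text.splitlines():
        match = DEVID_RE.search(line)
        if match:
            devid, size, used, path = match.groups()
            devices.append(
                {"devid": int(devid), "size": size, "used": used,
                 "path": path or None}
            )
    return devices


def missing_devices(devices):
    """List the devids whose path is empty (assumed to mean 'missing')."""
    return [dev["devid"] for dev in devices if dev["path"] is None]


def rebalance_hours(data_gb, hours_per_100_gb=3.0):
    """Scale the rough '3H per 100GB' figure quoted in the message."""
    return data_gb / 100.0 * hours_per_100_gb


if __name__ == "__main__":
    devices = parse_fi_show(SAMPLE)
    print("missing devids:", missing_devices(devices))
    print("estimated hours for 1000GB:", rebalance_hours(1000))
```

The rate is specific to the hardware described in the message (10 drives, dual core), so the estimate is at best an order-of-magnitude guide for other setups.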
Re: Any use for mkfs.btrfs -d raid5 -m raid1 ?
On Sun, Mar 23, 2014 at 03:44:35PM -0700, Marc MERLIN wrote:
> If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my
> metadata.
> From there, would I indeed have small files that would be stored entirely on
> some of the drives that didn't go missing, and therefore I could recover
> some data with 2 missing drives?

btrfs's RAID-1 is two copies only, so you may well have lost some
of your metadata. n-copies RAID-1 is coming Real Soon Now™ (Chris has
it on his todo list, along with fixing all the parity RAID stuff).

> Or is it kind of pointless/waste of space?
>
> Actually, would it make btrfs faster for metadata work since it can read
> from n drives in parallel and get data just a bit faster, or is that mostly
> negligible?

I don't think we've got good benchmarks from anyone on any of this
kind of thing.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Great oxymorons of the world, no. 9: Standard Deviation ---
Any use for mkfs.btrfs -d raid5 -m raid1 ?
If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my
metadata.
From there, would I indeed have small files that would be stored entirely on
some of the drives that didn't go missing, and therefore I could recover
some data with 2 missing drives?

Or is it kind of pointless/waste of space?

Actually, would it make btrfs faster for metadata work since it can read
from n drives in parallel and get data just a bit faster, or is that mostly
negligible?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 12:10:17PM -0700, Marc MERLIN wrote:
> On Sun, Mar 23, 2014 at 05:34:09PM +, Hugo Mills wrote:
> > xaba on IRC has just pointed out that it looks like you're running
> > this on a mounted filesystem -- it needs to be unmounted for
> > btrfs-image to work reliably.
>
> Sorry, I didn't realize that, although it makes sense. btrfs-image really
> should fail though:
> https://bugzilla.kernel.org/show_bug.cgi?id=72791
>
> I got the image, but it's 1.5GB (1TB filesystem), and kernel bugzilla times
> out after 1% of the upload (repeatedly)
> Placeholder bug:
> https://bugzilla.kernel.org/show_bug.cgi?id=72801

Here is the image: http://marc.merlins.org/tmp/pool2.image

After taking that image for you, I got the filesystem to recover:

legolas:/mnt/btrfs_pool2# for i in *weekly*; do btrfs subvolume delete $i; done
Delete subvolume '/mnt/btrfs_pool2/home_weekly_20140309_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/home_weekly_20140316_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/root_weekly_20140309_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/root_weekly_20140316_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/tmp_weekly_20140309_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/tmp_weekly_20140316_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/usr_weekly_20140309_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/usr_weekly_20140316_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/var_weekly_20140309_00:06:01'
Delete subvolume '/mnt/btrfs_pool2/var_weekly_20140316_00:06:01'

legolas:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=800.42GiB, used=636.91GiB
System, DUP: total=8.00MiB, used=92.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=9.50GiB
Metadata, single: total=8.00MiB, used=0.00

legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=0
Done, had to relocate 91 out of 823 chunks

legolas:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=709.01GiB, used=603.85GiB
System, DUP: total=8.00MiB, used=88.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=7.39GiB
Metadata, single: total=8.00MiB, used=0.00

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
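To make the effect of the `-dusage=0` balance concrete, the two `btrfs fi df` listings above can be compared mechanically. The sketch below is an illustration only: it parses the lines exactly as they appear in this message, the helper names (`parse_fi_df`, `allocated_gib`) are made up for the example, and treating a bare `0.00` as zero bytes is an assumption about the output format.

```python
import re

UNITS = {"": 1, "B": 1, "KiB": 1024, "MiB": 1024**2,
         "GiB": 1024**3, "TiB": 1024**4}
GIB = 1024**3

LINE_RE = re.compile(
    r"^(\w+), (\w+): total=([\d.]+)(\w*), used=([\d.]+)(\w*)$"
)

# Excerpts of the two listings quoted in the message above.
BEFORE = """\
Data, single: total=800.42GiB, used=636.91GiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=9.50GiB
"""
AFTER = """\
Data, single: total=709.01GiB, used=603.85GiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=7.39GiB
"""


def parse_fi_df(text):
    """Return {(type, profile): (total_bytes, used_bytes)}."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            kind, profile, total, t_unit, used, u_unit = match.groups()
            rows[(kind, profile)] = (
                float(total) * UNITS[t_unit],
                float(used) * UNITS[u_unit],
            )
    return rows


def allocated_gib(rows, kind, profile):
    """Allocated (total) space for one block-group type, in GiB."""
    return rows[(kind, profile)][0] / GIB


if __name__ == "__main__":
    before = parse_fi_df(BEFORE)
    after = parse_fi_df(AFTER)
    freed = (allocated_gib(before, "Data", "single")
             - allocated_gib(after, "Data", "single"))
    print(f"data allocation released by the balance: {freed:.2f} GiB")
```

The data allocation drops from 800.42GiB to 709.01GiB, a difference of 91.41GiB, which is consistent with the 91 relocated chunks reported by the balance if each data chunk is roughly 1GiB.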
Improve btrfs man page
Hi,

I am going through the process of replacing a bad drive in a RAID 1 mirror. The filesystem wouldn't mount because of the missing device, and the btrfs man pages were not helpful in resolving it.

Specifically, it would have been very useful to me if this part of the wiki were included somewhere:
https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices

There is no mention in the man pages of the -o degraded mount option, which is needed to even manipulate the filesystem; nor is there any mention of the special option "missing" for device delete.

I'm sure I'm not the only one who would turn to man btrfs first when facing a failed drive.

Thanks,
Per
Re: How to handle a RAID5 array with a failing drive? -> raid5 mostly works, just no rebuilds
On Wed, Mar 19, 2014 at 10:53:33AM -0600, Chris Murphy wrote:
>
> On Mar 19, 2014, at 9:40 AM, Marc MERLIN wrote:
> >
> > After adding a drive, I couldn't quite tell if it was striping over 11
> > drives or 10, but it felt that at least at times, it was striping over 11
> > drives with write failures on the missing drive.
> > I can't prove it, but I'm thinking the new data I was writing was being
> > striped in degraded mode.
>
> Well it does sound fragile after all to add a drive to a degraded array,
> especially when it's not expressly treating the faulty drive as faulty. I
> think iotop will show what block devices are being written to. And in a VM
> it's easy (albeit rudimentary) with sparse files, as you can see them grow.
>
> >
> > Yes, although it's limited, you apparently only lose new data that was added
> > after you went into degraded mode and only if you add another drive where
> > you write more data.
> > In real life this shouldn't be too common, even if it is indeed a bug.
>
> It's entirely plausible a drive power/data cable becomes loose, runs for hours
> degraded before the wayward device is reseated. It'll be common enough. It's
> definitely not OK for all of that data in the interim to vanish just because
> the volume has resumed from degraded to normal. Two states of data, normal vs
> degraded, is scary. It sounds like totally silent data loss. So yeah if it's
> reproducible it's worthy of a separate bug.

I just got around to filing that bug:
https://bugzilla.kernel.org/show_bug.cgi?id=72811

In other news, I was able to
1) remove a drive
2) mount degraded
3) add a new drive
4) rebalance (that took 2 days with little data, 4 deadlocks and reboots though)
5) remove the missing drive from the filesystem
6) remount the array without -o degraded

Now, I'm testing
1) add a new drive
2) remove a working drive
3) automatic rebalance from #2 should rebuild on the new drive automatically

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
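The first six-step procedure above can be written down as a dry-run command plan. This is a sketch, not the exact invocations used in the message (which does not quote them): the commands are the ones named elsewhere in this thread (`mount -o degraded`, `btrfs device add`, `btrfs balance start`, `btrfs device delete missing`), the device paths and mount point are hypothetical placeholders, and the function name `degraded_rebuild_plan` is invented for the example. Nothing is executed; the plan is only printed.

```python
def degraded_rebuild_plan(surviving_dev, new_dev, mountpoint):
    """Return the steps after physically removing the bad drive, as argv lists.

    All three arguments are placeholders chosen by the caller; nothing here
    touches a real filesystem.
    """
    return [
        # 2) mount degraded
        ["mount", "-o", "degraded", surviving_dev, mountpoint],
        # 3) add a new drive
        ["btrfs", "device", "add", new_dev, mountpoint],
        # 4) rebalance so existing data is restriped onto the new drive
        ["btrfs", "balance", "start", mountpoint],
        # 5) remove the missing drive from the filesystem
        ["btrfs", "device", "delete", "missing", mountpoint],
        # 6) remount the array without -o degraded
        ["umount", mountpoint],
        ["mount", surviving_dev, mountpoint],
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for argv in degraded_rebuild_plan("/dev/sdb1", "/dev/sdm1", "/mnt/array"):
        print(" ".join(argv))
```

Keeping the plan as data rather than running it directly makes it easy to review each step first, which seems prudent given the deadlocks and reboots reported for step 4.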
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 05:34:09PM +, Hugo Mills wrote:
> xaba on IRC has just pointed out that it looks like you're running
> this on a mounted filesystem -- it needs to be unmounted for
> btrfs-image to work reliably.

Sorry, I didn't realize that, although it makes sense. btrfs-image really
should fail though:
https://bugzilla.kernel.org/show_bug.cgi?id=72791

I got the image, but it's 1.5GB (1TB filesystem), and kernel bugzilla times
out after 1% of the upload (repeatedly)
Placeholder bug:
https://bugzilla.kernel.org/show_bug.cgi?id=72801

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: Cannot add device "is mounted" for unmounted drive that used to be in raidset that is mounted
On Sun, Mar 23, 2014 at 11:09:07AM -0700, Marc MERLIN wrote:
> I found out that a drive that used to be part of a raid system that is mounted
> and running without it, btrfs apparently decides that the drive is part of
> the mounted
> raidset and in use.
> As a result, I had to eventually dd 0's over it, btrfs device scan, and
> finally
> I was able to use it again.

I filed a bug for this. It's a more minor issue, but worth fixing eventually:
https://bugzilla.kernel.org/show_bug.cgi?id=72771

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Cannot add device "is mounted" for unmounted drive that used to be in raidset that is mounted
I found out that when a drive used to be part of a raid system that is mounted
and running without it, btrfs apparently decides that the drive is part of the mounted
raidset and in use.
As a result, I had to eventually dd 0's over it, run btrfs device scan, and finally
I was able to use it again.

btrfs should probably improve its check to see that the drive is not really
used and let me format it.

Longer details:

On a running system, I re-added a drive, and it showed up as /dev/sdm1.
I decrypted it:
cryptsetup luksOpen /dev/sdm1 crypt_sdm1

But I can't add it:
polgara:/mnt/btrfs_backupcopy# btrfs device add -f /dev/mapper/crypt_sdm1 .
/dev/mapper/crypt_sdm1 is mounted

However, I can format it as ext4, mount it, write to it, unmount it, and I still get the same error:
polgara:/mnt/btrfs_backupcopy# mke2fs -t ext4 /dev/mapper/crypt_sdm1
polgara:/mnt/btrfs_backupcopy# mount /dev/mapper/crypt_sdm1 /mnt/mnt
polgara:/mnt/btrfs_backupcopy# Mar 23 10:48:31 polgara kernel: [38645.955263] EXT4-fs (dm-10): mounted filesystem with ordered data mode. Opts: (null)
polgara:/mnt/btrfs_backupcopy# umount /mnt/mnt
polgara:/mnt/btrfs_backupcopy# btrfs device add -f /dev/mapper/crypt_sdm1 .
/dev/mapper/crypt_sdm1 is mounted
polgara:/mnt/btrfs_backupcopy# fuser -v /dev/mapper/crypt_sdm1
polgara:/mnt/btrfs_backupcopy# lsof -n | grep -E '(dm-10|sdm)'
polgara:/mnt/btrfs_backupcopy# mkfs.btrfs -f /dev/mapper/crypt_sdm1
Error: /dev/mapper/crypt_sdm1 is mounted
polgara:/mnt/btrfs_backupcopy#

In the end, I had to run
dd if=/dev/zero of=/dev/mapper/crypt_sdm1 bs=1M
for btrfs to stop telling me the filesystem was mounted.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: for Chris Mason ( iowatcher graphs)
On Sun, Mar 23, 2014 at 09:36:19PM +0400, Vasiliy Tolstov wrote:
> Hello. Sorry for writing to btrfs mailing list, but personal mail
> reject my message.
> Saying "
> : host 10.101.1.19[10.101.1.19] said: 554 5.4.6 Hop
> count exceeded - possible mail loop (in reply to end of DATA command)

He's moved to Facebook now.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Unix: For controlling fungal diseases in crops. ---
for Chris Mason ( iowatcher graphs)
Hello. Sorry for writing to the btrfs mailing list, but the personal mail address rejects my message, saying:

"
: host 10.101.1.19[10.101.1.19] said: 554 5.4.6 Hop
count exceeded - possible mail loop (in reply to end of DATA command)
Final-Recipient: rfc822; chris.ma...@fusionio.com
Action: failed
Status: 5.0.0
Diagnostic-Code: X-Spam-&-Virus-Firewall; host 10.101.1.19[10.101.1.19] said:
554 5.4.6 Hop count exceeded - possible mail loop (in reply to end of DATA command)
"

My question is:
I'm trying to see what happens when a virtual machine writes to disk, so I run blktrace from dom0 like this:

blktrace -w 60 -d /dev/disk/vbd/21-920 -o - > test.trace

/dev/disk/vbd/21-920 is a software raid containing 2 lv volumes; each lv volume is created on a big srp attached disk.

Inside the vm I try to do some work via fio:

[global]
rw=randread
size=128m
directory=/tmp
ioengine=libaio
iodepth=4
invalidate=1
direct=1

[bgwriter]
rw=randwrite
iodepth=32

[queryA]
iodepth=1
ioengine=mmap
direct=0
thinktime=3

[queryB]
iodepth=1
ioengine=mmap
direct=0
thinktime=5

[bgupdater]
rw=randrw
iodepth=16
thinktime=40
size=128m

After that I try to get a graph like this:

iowatcher -t test.trace -o trace.svg

But the svg contains unreadable images. What am I doing wrong and how can I fix that?
The svg looks like http://62.76.182.4/trace.svg

Thank you for the good tools!

--
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 10:03:14AM -0700, Marc MERLIN wrote:
> On Sun, Mar 23, 2014 at 04:28:25PM +, Hugo Mills wrote:
> > Before you do this, can you take a btrfs-image of your metadata,
> > and add a report to bugzilla.kernel.org? You're not the only person
> > who's had this problem recently, and I suspect there's something
> > still lurking in there that needs attention.
> >
> > > Hopefully this will be handled better in later code.
> >
> > With the info you can provide from a btrfs-image... let's hope so. :)
>
> Mmmh, this may not be good :-/
>
> legolas:/mnt/btrfs_pool2# btrfs-image -c 9 -t 6 /dev/mapper/disk2
> /tmp/pool2.image
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> Ignoring transid failure
> leaf parent key incorrect 295965446144
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
[snip]
> Segmentation fault
>
> Is it a bug in the tool, or do I have real corruption?
>
> Are there magic options I can give it to make it work around this?

xaba on IRC has just pointed out that it looks like you're running
this on a mounted filesystem -- it needs to be unmounted for
btrfs-image to work reliably.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Unix: For controlling fungal diseases in crops. ---
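When a tool prints the same "parent transid verify failed" line many times, as in the quoted btrfs-image output, it can help to collapse the repeats before filing a report. The following is a small illustrative sketch, assuming only the message format visible in the quote; `summarize_transid_errors` is a hypothetical helper, not a btrfs-progs feature.

```python
import re
from collections import Counter

TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)

# Lines copied from the quoted output above (quote markers included on purpose,
# to show that they do not get in the way of the pattern).
SAMPLE = """\
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> parent transid verify failed on 295965446144 wanted 51493 found 51495
> Ignoring transid failure
> leaf parent key incorrect 295965446144
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
> parent transid verify failed on 106205184 wanted 51468 found 51528
"""


def summarize_transid_errors(log_text):
    """Count how often each (block, wanted, found) triple was reported."""
    return Counter(
        tuple(int(group) for group in match.groups())
        for match in TRANSID_RE.finditer(log_text)
    )


if __name__ == "__main__":
    for (block, wanted, found), count in summarize_transid_errors(SAMPLE).items():
        print(f"block {block}: wanted {wanted}, found {found} ({count} times)")
```

A summary like this only describes what the tool printed. As the reply above notes, running btrfs-image on a mounted filesystem is not reliable, so these messages are not by themselves evidence of on-disk corruption.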
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 04:28:25PM +, Hugo Mills wrote:
>    Before you do this, can you take a btrfs-image of your metadata,
> and add a report to bugzilla.kernel.org? You're not the only person
> who's had this problem recently, and I suspect there's something
> still lurking in there that needs attention.
>
> > Hopefully this will be handled better in later code.
>
>    With the info you can provide from a btrfs-image... let's hope so. :)

Mmmh, this may not be good :-/

legolas:/mnt/btrfs_pool2# btrfs-image -c 9 -t 6 /dev/mapper/disk2 /tmp/pool2.image
parent transid verify failed on 295965446144 wanted 51493 found 51495
parent transid verify failed on 295965446144 wanted 51493 found 51495
parent transid verify failed on 295965446144 wanted 51493 found 51495
parent transid verify failed on 295965446144 wanted 51493 found 51495
Ignoring transid failure
leaf parent key incorrect 295965446144
parent transid verify failed on 106205184 wanted 51468 found 51528
parent transid verify failed on 106205184 wanted 51468 found 51528
parent transid verify failed on 106205184 wanted 51468 found 51528
parent transid verify failed on 106205184 wanted 51468 found 51528
Ignoring transid failure
parent transid verify failed on 267162120192 wanted 51469 found 51528
parent transid verify failed on 267162120192 wanted 51469 found 51528
parent transid verify failed on 267162120192 wanted 51469 found 51528
parent transid verify failed on 267162120192 wanted 51469 found 51528
Ignoring transid failure
parent transid verify failed on 107315200 wanted 51468 found 51528
parent transid verify failed on 107315200 wanted 51468 found 51528
parent transid verify failed on 107315200 wanted 51468 found 51528
parent transid verify failed on 107315200 wanted 51468 found 51528
Ignoring transid failure
parent transid verify failed on 108081152 wanted 51468 found 51528
parent transid verify failed on 108081152 wanted 51468 found 51528
parent transid verify failed on 108081152 wanted 51468 found 51528
parent transid verify failed on 108081152 wanted 51468 found 51528
Ignoring transid failure
parent transid verify failed on 267162124288 wanted 51469 found 51528
parent transid verify failed on 267162124288 wanted 51469 found 51528
parent transid verify failed on 267162124288 wanted 51469 found 51528
parent transid verify failed on 267162124288 wanted 51469 found 51528
Ignoring transid failure
parent transid verify failed on 108351488 wanted 51468 found 51528
parent transid verify failed on 108351488 wanted 51468 found 51528
parent transid verify failed on 108351488 wanted 51468 found 51528
parent transid verify failed on 108351488 wanted 51468 found 51528
Ignoring transid failure
parent transid verify failed on 275021406208 wanted 51469 found 51529
parent transid verify failed on 275021406208 wanted 51469 found 51529
parent transid verify failed on 275021406208 wanted 51469 found 51529
parent transid verify failed on 275021406208 wanted 51469 found 51529
Ignoring transid failure
parent transid verify failed on 108466176 wanted 51468 found 51528
parent transid verify failed on 108466176 wanted 51468 found 51528
parent transid verify failed on 108466176 wanted 51468 found 51528
parent transid verify failed on 108466176 wanted 51468 found 51528
Ignoring transid failure
Segmentation fault

Is it a bug in the tool, or do I have real corruption?

Are there magic options I can give it to make it work around this?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 09:20:00AM -0700, Marc MERLIN wrote:
> Both
> legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=5 /mnt/btrfs_pool2
> legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
> failed unfortunately.
>
> On Sun, Mar 23, 2014 at 12:26:32PM +, Duncan wrote:
> > When it rains, it pours. What you're missing is that this is now the
> > third thread in three days with exactly the same out-of-space-when-there-
> > appears-to-be-plenty problem, which is well explained and a solution
> > presented, along with further discussion, on those threads.
> >
> > Evidently you haven't read the others, but rather than rewrite a similar
> > reply here with exactly the same explanation and fix, I'll just refer you
> > to them.
>
> Thanks. Indeed, while I spent most of yesterday dealing with 3 btrfs
> filesystems, the one here that was hanging my laptop, the raid5 one that
> was hanging repeatedly during balance, and then my main server were one
> FS is so slow that it takes 8H to do an reflink copy or delete a backup
> with 1 million inodes, I got behind on reading the list :)
>
> Thanks for the pointers
>
> > btrfs balance start -dusage=5 `pwd`
> >
> > Tweak the N in usage=N as needed.
>
> I had actually tried this, but it failed too:
> legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=5 /mnt/btrfs_pool2
> Dumping filters: flags 0x1, state 0x0, force is off
> DATA (flags 0x2): balancing, usage=5
> ERROR: error during balancing '/mnt/btrfs_pool2' - No space left on device
>
> But I now just found
> https://btrfs.wiki.kernel.org/index.php/Balance_Filters
> and tried -dusage=0
>
> On Sun, Mar 23, 2014 at 11:47:12AM +, Hugo Mills wrote:
> >    I think you probably shouldn't be doing a full balance, but a
> > filtered one:
> >
> > # btrfs balance start -dusage=5 /mnt/btrfs_pool
> >
> > which should only try to clean up chunks which have little usage (so
> > it's much faster to run).
>
> Thanks for the other answer Hugo.
>
> So, now I'm down to
> legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
> Dumping filters: flags 0x1, state 0x0, force is off
> DATA (flags 0x2): balancing, usage=0
> ERROR: error during balancing '/mnt/btrfs_pool2' - No space left on device
>
> Looks like there is no good way out of this, so I'll start deleting
> snapshots.

   Before you do this, can you take a btrfs-image of your metadata, and add a report to bugzilla.kernel.org? You're not the only person who's had this problem recently, and I suspect there's something still lurking in there that needs attention.

> Hopefully this will be handled better in later code.

   With the info you can provide from a btrfs-image... let's hope so. :)

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Anyone using a computer to generate random numbers is, of --- course, in a state of sin.
Re: btrfs-tools missing "btrfs device delete devid=x path" ?
On Sun, Mar 23, 2014 at 04:18:43PM +, Hugo Mills wrote:
> On Sun, Mar 23, 2014 at 08:25:17AM -0700, Marc MERLIN wrote:
> > I'm still doing some testing so that I can write some howto.
> >
> > I got that far after a rebalance (mmmh, that took 2 days with little
> > data, and unfortunately 5 deadlocks and reboots.
> >
> > polgara:/mnt/btrfs_backupcopy# btrfs fi show
> > Label: backupcopy uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
> > Total devices 11 FS bytes used 114.35GiB
> > devid 1 size 465.76GiB used 32.14GiB path /dev/dm-0
> > devid 2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
> > devid 3 size 465.75GiB used 0.00 path < drive is freed up now.
> > devid 4 size 465.76GiB used 32.14GiB path /dev/dm-2
> > devid 5 size 465.76GiB used 32.14GiB path /dev/dm-3
> > devid 6 size 465.76GiB used 32.14GiB path /dev/dm-4
> > devid 7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
> > devid 8 size 465.76GiB used 32.14GiB path /dev/dm-6
> > devid 9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
> > devid 10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
> > devid 11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1
> > Btrfs v3.12
> >
> > What's the syntax for removing a drive that isn't there?
>
>    btrfs dev del missing /path
>
>    Removes all the missing devices.

Aaah, this worked, thank you.

This isn't documented in the 3.12 man page, but I'm guessing it'll be in the upcoming 3.14. I'll document this in the raid5 page I'm currently writing.

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: ERROR: error during balancing '.' - No space left on device
Both
legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=5 /mnt/btrfs_pool2
legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
failed, unfortunately.

On Sun, Mar 23, 2014 at 12:26:32PM +, Duncan wrote:
> When it rains, it pours. What you're missing is that this is now the
> third thread in three days with exactly the same out-of-space-when-there-
> appears-to-be-plenty problem, which is well explained and a solution
> presented, along with further discussion, on those threads.
>
> Evidently you haven't read the others, but rather than rewrite a similar
> reply here with exactly the same explanation and fix, I'll just refer you
> to them.

Thanks. Indeed, I spent most of yesterday dealing with 3 btrfs filesystems (the one here that was hanging my laptop, the raid5 one that was hanging repeatedly during balance, and my main server, where one FS is so slow that it takes 8H to do a reflink copy or delete a backup with 1 million inodes), so I got behind on reading the list :)

Thanks for the pointers.

> btrfs balance start -dusage=5 `pwd`
>
> Tweak the N in usage=N as needed.

I had actually tried this, but it failed too:
legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=5 /mnt/btrfs_pool2
Dumping filters: flags 0x1, state 0x0, force is off
DATA (flags 0x2): balancing, usage=5
ERROR: error during balancing '/mnt/btrfs_pool2' - No space left on device

But I have now found
https://btrfs.wiki.kernel.org/index.php/Balance_Filters
and tried -dusage=0.

On Sun, Mar 23, 2014 at 11:47:12AM +, Hugo Mills wrote:
>    I think you probably shouldn't be doing a full balance, but a
> filtered one:
>
> # btrfs balance start -dusage=5 /mnt/btrfs_pool
>
> which should only try to clean up chunks which have little usage (so
> it's much faster to run).

Thanks for the other answer, Hugo.

So, now I'm down to
legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
Dumping filters: flags 0x1, state 0x0, force is off
DATA (flags 0x2): balancing, usage=0
ERROR: error during balancing '/mnt/btrfs_pool2' - No space left on device

Looks like there is no good way out of this, so I'll start deleting snapshots.

Hopefully this will be handled better in later code.

Thanks for the answer,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: btrfs-tools missing "btrfs device delete devid=x path" ?
On Sun, Mar 23, 2014 at 08:25:17AM -0700, Marc MERLIN wrote:
> I'm still doing some testing so that I can write some howto.
>
> I got that far after a rebalance (mmmh, that took 2 days with little
> data, and unfortunately 5 deadlocks and reboots.
>
> polgara:/mnt/btrfs_backupcopy# btrfs fi show
> Label: backupcopy uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
> Total devices 11 FS bytes used 114.35GiB
> devid 1 size 465.76GiB used 32.14GiB path /dev/dm-0
> devid 2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
> devid 3 size 465.75GiB used 0.00 path < drive is freed up now.
> devid 4 size 465.76GiB used 32.14GiB path /dev/dm-2
> devid 5 size 465.76GiB used 32.14GiB path /dev/dm-3
> devid 6 size 465.76GiB used 32.14GiB path /dev/dm-4
> devid 7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
> devid 8 size 465.76GiB used 32.14GiB path /dev/dm-6
> devid 9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
> devid 10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
> devid 11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1
> Btrfs v3.12
>
> What's the syntax for removing a drive that isn't there?

   btrfs dev del missing /path

   Removes all the missing devices.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Anyone using a computer to generate random numbers is, of --- course, in a state of sin.
btrfs-tools missing "btrfs device delete devid=x path" ?
I'm still doing some testing so that I can write a howto.

I got this far after a rebalance (mmmh, that took 2 days with little data, and unfortunately 5 deadlocks and reboots).

polgara:/mnt/btrfs_backupcopy# btrfs fi show
Label: backupcopy uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
Total devices 11 FS bytes used 114.35GiB
devid 1 size 465.76GiB used 32.14GiB path /dev/dm-0
devid 2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
devid 3 size 465.75GiB used 0.00 path < drive is freed up now.
devid 4 size 465.76GiB used 32.14GiB path /dev/dm-2
devid 5 size 465.76GiB used 32.14GiB path /dev/dm-3
devid 6 size 465.76GiB used 32.14GiB path /dev/dm-4
devid 7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
devid 8 size 465.76GiB used 32.14GiB path /dev/dm-6
devid 9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
devid 10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
devid 11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1
Btrfs v3.12

What's the syntax for removing a drive that isn't there? I'm looking for something like

polgara:/mnt/btrfs_backupcopy# btrfs device delete devid=3 .
ERROR: error removing the device 'devid=3' - No such file or directory

Until I can remove it, not surprisingly, I still need to mount degraded even though device 3 is now unused:

polgara:~# mount -v -t btrfs -o compress=zlib,space_cache,noatime LABEL=backupcopy /mnt/btrfs_backupcopy
mount: wrong fs type, bad option, bad superblock on /dev/mapper/crypt_sde1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
[28865.527656] BTRFS: open devid=3 failed
[29777.507022] BTRFS: device label backupcopy devid 11 transid 5137 /dev/mapper/crypt_sde1
[29777.558874] BTRFS info (device dm-9): disk space caching is enabled
[29777.566997] BTRFS: failed to read chunk tree on dm-9
[29777.612041] BTRFS: open_ctree failed

polgara:~# mount -v -t btrfs -o compress=zlib,space_cache,noatime,degraded LABEL=backupcopy /mnt/btrfs_backupcopy
/dev/mapper/crypt_sde1 on /mnt/btrfs_backupcopy type btrfs (rw,noatime,compress=zlib,space_cache,degraded)
polgara:~#

This works.

I think if I get that last step working, I will have successfully removed a raid5 drive, added a new drive, rebalanced existing data onto it, and removed the old one.

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: Rebalance makes BTRFS 10x slower
On Sunday, 23 March 2014, 15:51:34 you wrote:
> I was expecting either a speed improvement after rebalance, or no
> noticeable effect, but I am extremely disappointed to see that now (and
> after having rebooted), my system has become slow like hell, takes at least
> 10x longer to boot and operate, to the point it has become hardly usable
>
> I would have thought a rebalance would have improved the filesystem
> organization, looks like it's the absolute contrary

I also found this to be the case when I rebalanced the root filesystem of my Debian installation on this ThinkPad T520 with an Intel SSD 320. It doubled the boot time that systemd-analyze reported back then. apt-get dist-upgrade was noticeably slower too.

Thus I avoid balance unless I really need it.

When migrating /home to BTRFS RAID 1 on Intel SSD 320 + Crucial mSATA M500 SSD, I did a balance to switch to RAID 1. /home would probably be faster if I recreated it from scratch and restored from backup, though. MySQL-based Akonadi in particular seems to be slower than with /home on a single SSD, but that might also be due to the number of mails in my Linux kernel-ml folder having grown very large (and to inefficiencies in Akonadi).

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Rebalance makes BTRFS 10x slower
Hi there,

# uname -r
3.13.6-1-ARCH
# btrfs --version
Btrfs v3.12

After having read the recent discussion about rebalance, I ran it as a test on my laptop with a 1TB HD. The current situation (after the rebalance) is:

# btrfs fi sh
Label: TETHYS uuid: 9a1ca6f4-1c1b-4a62-a84b-b388066084dc
Total devices 1 FS bytes used 575.56GiB
devid 1 size 845.00GiB used 580.06GiB path /dev/dm-3

Label: BOOT uuid: 6a16d133-4b99-47b2-876f-148a8266f58f
Total devices 1 FS bytes used 67.99MiB
devid 1 size 1.00GiB used 144.00MiB path /dev/dm-0

Btrfs v3.12

# btrfs fi df /
Data, single: total=574.00GiB, used=573.60GiB
System, DUP: total=32.00MiB, used=72.00KiB
Metadata, DUP: total=3.00GiB, used=1.96GiB

# df -h /boot /
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/dm-0   1,0G  68M   924M   7%    /boot
/dev/dm-3   845G  578G  266G   69%   /

The rebalance of the biggest BTRFS took about 19 hours.

I was expecting either a speed improvement after rebalance, or no noticeable effect, but I am extremely disappointed to see that now (and after having rebooted), my system has become slow like hell and takes at least 10x longer to boot and operate, to the point it has become hardly usable :-(

I would have thought a rebalance would have improved the filesystem organization; it looks like it's the absolute contrary :-(

--
Swâmi Petaramesh http://petaramesh.org PGP 9076E32E
Partition won't mount after forced shutdown
I was using my laptop when, suddenly, it froze. I forced it to shut down, but when I tried to turn it back on, /home, a btrfs partition, couldn't be mounted.

I tried to mount it with the recovery mount option, but it didn't help: http://pastebin.com/8C8MEyK9.

Then I tried to recover it with btrfs-zero-log; still nothing: http://pastebin.com/g6gviKpP.

Next, following https://btrfs.wiki.kernel.org/index.php/Restore, I tried to recover the files from the partition, but only a handful of config files were successfully recovered, even when using the -i option (output: http://pastebin.com/fp2GiSAz). And I can't find the find-root executable on my system or in btrfs-progs.

Here's my /var/log/messages: https://www.dropbox.com/s/ytn21srae858brn/msgs

What else should I do before trying btrfsck? I was going to try to follow the guide in this post: https://bbs.archlinux.org/viewtopic.php?pid=1321989#p1321989.

Regards,
--
Luiz Romário Santana Rios
Re: ERROR: error during balancing '.' - No space left on device
Marc MERLIN posted on Sun, 23 Mar 2014 00:01:44 -0700 as excerpted:

> legolas:/mnt/btrfs_pool2# btrfs balance .
> ERROR: error during balancing '.' - No space left on device

> But:
> # btrfs fi show `pwd`
> Label: btrfs_pool2 uuid: [...]
> Total devices 1 FS bytes used 646.41GiB
> devid 1 size 820.45GiB used 820.45GiB path /dev/mapper/disk2

> # [btrfs fi df `pwd` (unused single metadata/system chunks omitted)]
> Data, single: total=800.42GiB, used=636.91GiB
> System, DUP: total=8.00MiB, used=92.00KiB
> Metadata, DUP: total=10.00GiB, used=9.50GiB
>
> I can't see how I'm full, and now that I can't run balance to fix
> things, this is making things worse.
> Kernel is 3.14.
>
> What am I missing?

When it rains, it pours. What you're missing is that this is now the third thread in three days with exactly the same out-of-space-when-there-appears-to-be-plenty problem, which is well explained and a solution presented, along with further discussion, on those threads.

Evidently you haven't read the others, but rather than rewrite a similar reply here with exactly the same explanation and fix, I'll just refer you to them.

First thread, subject, original poster, message-id, gmane link:

fresh btrfs filesystem, out of disk space, hundreds of gigs free
Jon Nelson
http://permalink.gmane.org/gmane.comp.file-systems.btrfs/33640

Second thread (same problem altho he chose as a subject a btrfsck error that wasn't actually related to it) ...

free space inode generation (0) did not match free space cache generation
Hendrik Friedel <532dd2dc.3030...@friedels.name>
http://permalink.gmane.org/gmane.comp.file-systems.btrfs/33647

Very briefly, from your btrfs fi show:

devid 1 size 820.45GiB used 820.45GiB

That's total size and total of allocated chunks. They're equal, so there's no space left to allocate. Normally you'd use a balance, but a normal balance won't work without some space left to write the new chunks it's going to copy the old data/metadata into.

There is however a solution, balance filters, using a command something like this (-d for data only since df says that's what has the extra chunks that need freed, usage=percent-full):

btrfs balance start -dusage=5 `pwd`

Tweak the N in usage=N as needed.

The explanation for all that is in the other threads. If that doesn't work (tho it most likely will), there's other solutions available, see the no-space discussion in the FAQ on the wiki, again as linked in the other threads.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
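The size-versus-used comparison described above can be sketched as a small shell helper. This is only an illustration, not part of btrfs-progs: the function name unallocated_per_device is made up here, and it assumes devid lines shaped like the one quoted in this thread (btrfs-progs v3.12); other versions may print a different layout.

```shell
#!/bin/sh
# Sketch: report unallocated space per device from `btrfs fi show`-style text.
# Assumption: devid lines look like the one quoted in this thread, e.g.
#   devid 1 size 820.45GiB used 820.45GiB path /dev/mapper/disk2
# The function name is hypothetical; it is not a btrfs-progs command.
unallocated_per_device() {
    awk '
    # Convert a value such as "820.45GiB" to GiB.
    function to_gib(v,    n) {
        n = v + 0                      # awk keeps the leading number
        if (v ~ /TiB$/) return n * 1024
        if (v ~ /GiB$/) return n
        if (v ~ /MiB$/) return n / 1024
        if (v ~ /KiB$/) return n / 1048576
        return n / 1073741824          # no suffix: assume plain bytes
    }
    $1 == "devid" {
        for (i = 1; i < NF; i++) {
            if ($i == "size") size = to_gib($(i + 1))
            if ($i == "used") used = to_gib($(i + 1))
            if ($i == "path") path = $(i + 1)
        }
        # "used" on a devid line means allocated chunks, so the
        # difference is what is still free to allocate.
        printf "%s unallocated=%.2fGiB\n", path, size - used
    }'
}

# Example with the figures from this thread (a stand-in for piping in
# real `btrfs fi show` output):
echo "devid 1 size 820.45GiB used 820.45GiB path /dev/mapper/disk2" |
    unallocated_per_device
```

A result of 0.00GiB is the situation described above: every byte of the device is already allocated to chunks, so a plain balance has nowhere to write new ones.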
Re: fresh btrfs filesystem, out of disk space, hundreds of gigs free
Jon Nelson posted on Sat, 22 Mar 2014 18:21:02 -0500 as excerpted:

>>> # btrfs fi df /
>>> Data, single: total=1.80TiB, used=832.22GiB
>>> System, DUP: total=8.00MiB, used=204.00KiB
>>> Metadata, DUP: total=5.50GiB, used=5.00GiB

[The 0-used single listings left over from filesystem creation omitted.]

>> Metadata is the red-flag here. Metadata chunks are 256 MiB in size,
>> but in default DUP mode, two are allocated at once, thus 512 MiB at a
>> time. And you're [below that so close to needing more allocated].
>
> The size of the chunks allocated is especially useful information. I've
> not seen that anywhere else, and does explain a fair bit.

I actually had to dig a little bit for that information, but like you I found it quite useful, so the digging was worth it. =:^)

>> But for a complete picture you need the filesystem show output, below,
>> as well...
>>
>>> # btrfs fi show Label: none uuid: [...]
>>> Total devices 1 FS bytes used 837.22GiB
>>> devid 1 size 1.81TiB used 1.81TiB path /dev/sda3
>>
>> OK. Here we see the root problem. Size 1.81 TiB, used 1.81 TiB. No
>> unallocated space at all. Whichever runs out of space first, data or
>> metadata, you'll be stuck.
>
> Now it's at this point that I am unclear. I thought the above said:
> "1 device on this filesystem, 837.22 GiB used."
> and
> [devID #1, /dev/sda3, is 1.81TiB in size, with btrfs using it all.]
>
> Which I interpret differently. Can you go into more detail as to how
> (from btrfs fi show) we can say "the _filesystem_ (not the device) is
> full"?

FWIW, there has been some discussion about changing the way both df and show present their information, giving a bit more than they are now and ideally presenting the core information you need both commands to see now, in one. I expect that to eventually happen, but meanwhile, the output of filesystem show in particular /is/ a bit confusing.

I actually think they need to omit or change the size displayed on the total devices line entirely, as the information it gives, without the information from filesystem df as well, really isn't useful on its own, and is an invitation to confusion and misinterpretation, much like you found yourself with, because it really isn't related to the numbers given (in show) for the individual devices at all, and is only useful in the context of filesystem df, which is where it belongs, NOT in show! My opinion, of course. =:^)

Anyway, if you compare the numbers from filesystem df and do the math, you'll quickly find what the total size in show is actually telling you:

From df: data-used + metadata-used + system-used = ...
From show: filesystem total used.

Given the numbers posted above:

From df:
data-used = 832.22 GiB (out of 1.8 TiB allocated/total data)
metadata-used = 5.00 GiB (out of 5.5 GiB allocated metadata)
system-used = (insignificant, well under a MiB)

From show, the total:
total-used = 837.22 GiB

The PROBLEM is that the numbers the REST of show is giving you are something entirely different, only tangentially related:

From show: per device:

1) Total filesystem size on that device.

2) Total of all chunk allocations (*NOT* what's actually used from those allocations) on that device, altho it's /labeled/ "used" in show's individual device listings.

Again, comparing from df it's quickly apparent where the numbers come from, the totals (*NOT* used) of data+metadata+system allocations (labeled total in df, but it's the allocated). Given the posting above, that'd be:

From df:
data-allocated (total) = 1.80 TiB
metadata-allocated = 0.005 TiB (5.5 GiB)
system-allocated = (trivial, a few MiB)

From show, adding up all individual devices, in your case just one:
total-allocated = 1.81 TiB (obviously rounded slightly)

3) What's *NOT* shown but can be easily deduced by subtracting allocated (labeled used) from total is the unallocated, thus still free to allocate.

In this case, that's zero, since the filesystem size on that (single) device is 1.81 TiB and 1.81 TiB is allocated. So there's nothing left available to allocate and you're running out of metadata space and need to allocate some more, thus the problem, despite the fact that both normal (not btrfs) df and btrfs filesystem show, APPEAR to say there's plenty of room left. Btrfs filesystem show ACTUALLY says there's NO room left, at least for further chunk allocations, but you really have to understand what information it's presenting and how, in order to actually get what it's telling you.

Like I said, I really wish show's total used size either wasn't even there, or likewise corresponded to the allocation, not what's used from that allocation, as all the device lines do. But that /does/ explain why the total of all the device used (in your case just one) doesn't equal the total used of the filesystem -- they're reporting two *ENTIRELY* diffe
Re: ERROR: error during balancing '.' - No space left on device
On Sun, Mar 23, 2014 at 12:01:44AM -0700, Marc MERLIN wrote:
> legolas:/mnt/btrfs_pool2# btrfs balance .
> ERROR: error during balancing '.' - No space left on device
> There may be more info in syslog - try dmesg | tail
> [ 8454.159635] BTRFS info (device dm-1): relocating block group 288329039872 flags 1
> [ 8590.167294] BTRFS info (device dm-1): relocating block group 232494465024 flags 1
> [ 9200.801177] BTRFS info (device dm-1): relocating block group 85928706048 flags 1
> [ 9533.830623] BTRFS info (device dm-1): 824 enospc errors during balance
>
> But:
> legolas:/mnt/btrfs_pool2# btrfs fi show `pwd`
> Label: btrfs_pool2 uuid: 6afd4707-876c-46d6-9de2-21c4085b7bed
> Total devices 1 FS bytes used 646.41GiB
> devid 1 size 820.45GiB used 820.45GiB path /dev/mapper/disk2
> Btrfs v3.12
> legolas:/mnt/btrfs_pool2#
> Data, single: total=800.42GiB, used=636.91GiB
> System, DUP: total=8.00MiB, used=92.00KiB
> System, single: total=4.00MiB, used=0.00
> Metadata, DUP: total=10.00GiB, used=9.50GiB
                                       ^^^

   This is where you're full. There's a block reserve here that (should be) usable for doing a balance, and it seems to be about 500 MiB or so that's the point at which problems show up.

> Metadata, single: total=8.00MiB, used=0.00
>
> I can't see how I'm full, and now that I can't run balance to fix
> things, this is making things worse.

   I think you probably shouldn't be doing a full balance, but a filtered one:

# btrfs balance start -dusage=5 /mnt/btrfs_pool

which should only try to clean up chunks which have little usage (so it's much faster to run).

> Kernel is 3.14.
>
> What am I missing?

   Not much. We do seem to have a problem with not being able to run balance in recent kernels under some circumstances -- you're not the only person who's reported this kind of problem lately.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Don't worry, he's not drunk. He's like that all the time. ---
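The "about 500 MiB" observation above is the gap between total and used on the Metadata line. A rough shell sketch of that subtraction for every line of `btrfs fi df` output follows; the function name chunk_headroom is hypothetical, and the parsing assumes the "Type, profile: total=X, used=Y" layout quoted in this thread.

```shell
#!/bin/sh
# Sketch: for each `btrfs fi df`-style line, print how much room is left
# inside the chunks already allocated to that type (total minus used).
# Assumption: lines look like "Metadata, DUP: total=10.00GiB, used=9.50GiB".
# The function name is hypothetical; it is not a btrfs-progs command.
chunk_headroom() {
    awk '
    # Convert a value such as "9.50GiB" to MiB.
    function to_mib(v,    n) {
        n = v + 0                      # awk keeps the leading number
        if (v ~ /TiB/) return n * 1048576
        if (v ~ /GiB/) return n * 1024
        if (v ~ /MiB/) return n
        if (v ~ /KiB/) return n / 1024
        return n / 1048576             # no suffix: assume plain bytes
    }
    /total=/ && /used=/ {
        total = $0; sub(/.*total=/, "", total); sub(/,.*/, "", total)
        used = $0;  sub(/.*used=/, "", used)
        label = $0; sub(/: total=.*/, "", label)
        printf "%s: %.0fMiB free in allocated chunks\n", label, to_mib(total) - to_mib(used)
    }'
}

# Example with the metadata line from this thread:
echo "Metadata, DUP: total=10.00GiB, used=9.50GiB" | chunk_headroom
```

For the metadata line quoted above this gives 512MiB, which matches the "about 500 MiB or so" figure in the reply. Whether that threshold holds on other kernels is not something this sketch can tell you.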
Re: Send/Receive howto and script for others to use (was Re: Is anyone using btrfs send/receive)
On 2014/03/22 11:11 PM, Marc MERLIN wrote:
> Please consider adding a blank line between quotes, it makes them just a bit more readable :)

Np.

> On Sat, Mar 22, 2014 at 11:02:24PM +0200, Brendan Hide wrote:
>>> - it doesn't create writeable snapshots on the destination in case you want to use the copy as a live filesystem
>>
>> One of the issues with doing writeable snapshots by default is that the backup is (ever so slightly) less safe from "fat-finger syndrome". If I want a writeable snapshot, I'll make it from the read-only snapshot, thereby reducing the chances of accidentally "tainting" or deleting data in the backup. I actually *did* accidentally delete my entire filesystem (hence the paranoid umounts). But, of course, my script *first* created read-only snapshots from which recovery took only a few minutes. ;)
>
> The writeable snapshot I create is on top of the read-only one used by btrfs receive. So, I can play with it, but it won't upset/break anything for the backup. The historical snapshots I keep give me cheap backups to go back to and get a file I may have deleted 3 days ago and want back now, even though my btrfs send/receive runs hourly.

Ah. In that case my comment is moot. I could add support for something like this but I'm unlikely to use it.

[snip]

>>> - Your comments say shlock isn't safe and that's documented. I don't see that in the man page http://manpages.ubuntu.com/manpages/trusty/man1/shlock.1.html
>>
>> That man page looks newer than the one I last looked at - specifically the part saying "improved by Berend Reitsma to solve a race condition." The previous documentation on shlock indicated it was safe for hourly crons - but not in the case where a cron might be executed twice simultaneously. Shlock was recommended by a colleague until I realised this potential issue, thus my template doesn't use it. I should update the comment with some updated information.
>
> It's not super important, it was more my curiosity. If a simple lock program in C isn't atomic, what's the point of it? I never looked at the source code, but maybe I should...

Likely the INN devs needed something outside of a shell environment. Based on the man page, shlock should be atomic now.

>>> I'd love to have details on this if I shouldn't be using it
>>> - Is set -o noclobber; echo $$ > $lockfile really atomic and safer than shlock? If so, great, although I would then wonder why shlock even exists :)
>>
>> The part that brings about an atomic lock is "noclobber", which sets it so that we are not allowed to "clobber"/"overwrite" an existing file. Thus, if the file exists, the command fails. If it successfully creates the new file, the command returns true.
>
> I understand how it's supposed to work, I just wondered if it was really atomic as it should be, since there would be no reason for shlock to even exist with that line of code you wrote.

When I originally came across the feature I wasn't sure it would work and did extensive testing: for example, spawn 30 000 processes, each of which tried to take the lock. After the machine became responsive again ;) only 1 lock ever turned out to have succeeded. Since then it's been in production use across various scripts on hundreds of servers. My guess (see above) is that the INN devs couldn't or didn't want to use it.

>> The original page where I learned about noclobber: http://www.davidpashley.com/articles/writing-robust-shell-scripts/
>
> Thanks for the info
> Marc

No problem - and thanks. :)

--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
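
A minimal sketch of the noclobber lock discussed above, assuming a POSIX-style shell such as bash or dash. The lock path and the helper names acquire_lock and release_lock are hypothetical and are not taken from Brendan's script; the 50-process contest is only a scaled-down stand-in for the 30 000-process test he describes.

```shell
# Hypothetical lock path, chosen for this demo only.
lockfile="${TMPDIR:-/tmp}/noclobber-demo.$$.lock"
rm -f "$lockfile"

# Try to take the lock. With noclobber set, ">" refuses to overwrite an
# existing regular file, so the redirection (and therefore echo) fails
# when another process already holds the lock. The subshell keeps the
# noclobber setting from leaking into the rest of the script.
acquire_lock() {
    ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null
}

# Drop the lock again.
release_lock() {
    rm -f "$lockfile"
}

# Contest: start 50 background contenders that all race for the lock.
# Each contender prints "won" only if its redirection succeeded.
results=$(
    for i in $(seq 1 50); do
        ( acquire_lock && echo won ) &
    done
    wait
)

# Count how many contenders actually got the lock.
winners=$(printf '%s\n' "$results" | grep -c '^won$' || true)
echo "contenders that took the lock: $winners"

release_lock
```

In a real cron script the usual pattern would be to exit early when acquire_lock fails and to call release_lock from an EXIT trap, so a crashed run does not leave a stale lock file behind.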
ERROR: error during balancing '.' - No space left on device
legolas:/mnt/btrfs_pool2# btrfs balance .
ERROR: error during balancing '.' - No space left on device
There may be more info in syslog - try dmesg | tail

[ 8454.159635] BTRFS info (device dm-1): relocating block group 288329039872 flags 1
[ 8590.167294] BTRFS info (device dm-1): relocating block group 232494465024 flags 1
[ 9200.801177] BTRFS info (device dm-1): relocating block group 85928706048 flags 1
[ 9533.830623] BTRFS info (device dm-1): 824 enospc errors during balance

But:

legolas:/mnt/btrfs_pool2# btrfs fi show `pwd`
Label: btrfs_pool2  uuid: 6afd4707-876c-46d6-9de2-21c4085b7bed
        Total devices 1 FS bytes used 646.41GiB
        devid 1 size 820.45GiB used 820.45GiB path /dev/mapper/disk2

Btrfs v3.12
legolas:/mnt/btrfs_pool2#
Data, single: total=800.42GiB, used=636.91GiB
System, DUP: total=8.00MiB, used=92.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=9.50GiB
Metadata, single: total=8.00MiB, used=0.00

I can't see how I'm full, and now that I can't run balance to fix things, this is making things worse.

Kernel is 3.14.

What am I missing?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901