On Mon, Nov 14, 2011 at 05:12:13PM -0500, Jérôme Carretero wrote:
> I have a couple of questions concerning btrfs reliability.
>
> I'm currently using btrfs in my internal drives (strong advantages) and have
> used it on external drives, but I've recently migrated the external ones to
> ext4, for reliability reasons.
> The kernel seems to be able to handle ext4 partition disconnections (drive
> error, cable gets eaten by rodent, or most commonly, unplugged too early on
> removable drives...) quite gracefully.
> This is not yet the case for btrfs partitions (deadlocks, various oopses,
> need to reboot).
> Any idea when this will be available ?
>

At some point in the future. We know it's a problem, but as you can see from
your nice list at the end, we have lots of problems working properly when the
drives are plugged in and behaving fine ;).

> How to handle bad blocks (sometimes, they are very localized on HDDs, and
> they will happen on old SSDs) ?
>
> Imagine the following use case:
> - get untrusted drive from dumpster
> - check that it runs, and has an acceptable amount of bad block clusters
> - add the drive to a btrfs pool, which guarantees that its data will be
> duplicated somewhere else
> - enjoy the drive while it lasts
> - ability to retrieve bad blocks map later on
> - ability to cleanly remove the drive from the pool if it becomes useless
> (found a better one) or dies (see first question)
> when that happens, data gets replicated to other locations...
> data replication could be done automatically by background scrubbing with
> some mount flag or ioctl
> How far are we from that ? Will we get there some day ?
>
>
> Since I'm here, a few random and useless notes, as I'm currently testing
> v3.2-rc1-284-g52e4c2a and I see a few bugs, deadlocks and weirdnesses.
> I don't know if it's normal for -rc1, maybe.
> My current workload is "rsync 1.5TB from SATA to USB2+3 (500+1000GB in raid0)
> and vice versa".
> The load average can grow to 15.
> I've ran into BUG at fs/btrfs/inode.c:1795
> (http://comments.gmane.org/gmane.comp.file-systems.btrfs/14128).
> I've ran into WARNING: at fs/btrfs/free-space-cache.c:1847
> btrfs_remove_free_space+0x1a3/0x287() [1]
> I've also ran into INFO: task btrfs-transacti:1465 blocked for more than 120
> seconds. [2]
> Sometimes linux is writing I don't know what for a looooooong time on drives,
> and there's nothing in cache.
> Sometimes rsync stops, doing nothing. It will somehow restart after I do a
> "echo 3 > /proc/sys/vm/drop_caches"...
> I see that a lot of features will be added for 3.2 but I hope they will be
> well tested !

So the inode one was fixed recently; Chris sent the patch to Linus this
weekend, so upgrade to that and you should be good. The free_space_cache one
is a new one on me; how often do you hit it? And as for the transaction thing,
if it happens again can you do sysrq+w? Sometimes the guy hanging everybody up
doesn't get printed out, so we only get part of the picture; sysrq+w will give
us all waiters so we can figure out what's going on. Thanks,

Josef
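
For the sysrq+w request above, here is a rough sketch of one way to trigger
the dump without a keyboard: writing "w" to /proc/sysrq-trigger asks the
kernel to log all blocked (uninterruptible) tasks. The function name
dump_blocked_tasks and its optional path argument are made up for
illustration (the argument only exists so the helper can be tried against a
scratch file); they are not part of any btrfs tooling.

```shell
#!/bin/sh
# Hypothetical helper: request a dump of blocked (state D) tasks,
# the scripted counterpart of pressing Alt+SysRq+W on the console.
# $1 is an optional trigger path; it defaults to the real sysrq trigger.
dump_blocked_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    # Bail out early if the trigger is missing or not writable (not root?).
    if [ ! -w "$trigger" ]; then
        echo "cannot write to $trigger (need root?)" >&2
        return 1
    fi
    # A single 'w' is the sysrq command for "show blocked tasks".
    printf 'w\n' > "$trigger"
}
```

Run as root, something like "dump_blocked_tasks && dmesg | tail -n 200"
should then show the stack of every waiter in the kernel log, assuming the
log buffer is large enough to hold them all.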