Re: Replicate snapshot to second machine fails
This isn't a btrfs-send or a btrfs-receive question:

    $ echo hi | ssh machine.local sudo echo test
    sudo: no tty present and no askpass program specified

How were you planning on providing credentials to sudo?

On Sun, Feb 8, 2015 at 9:17 AM, Thomas Schneider wrote:
>
> Hi,
>
> I want to replicate a snapshot from PC1 to a virtual machine using this command:
>
> user@pc1-gigabyte ~ $ sudo btrfs send "/home/.snapshots/lmde_home_2015-02-07_02:38:21" | ssh vm1-debian sudo btrfs receive /home/.snapshots/
> At subvol /home/.snapshots/lmde_home_2015-02-07_02:38:21
> sudo: Kein TTY vorhanden und kein »askpass«-Programm angegeben
> [i.e. "sudo: no TTY present and no askpass program specified"]
>
> Unfortunately I cannot detect the root cause of the failure.
> Any ideas?
>
> On both machines I have installed the programs ssh-askpass and ssh-askpass-gnome.
>
> user@pc1-gigabyte ~ $ uname -a
> Linux pc1-gigabyte 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) i686 GNU/Linux
> user@pc1-gigabyte ~ $ sudo btrfs --version
> Btrfs v3.17
> user@pc1-gigabyte ~ $ sudo btrfs fi show
> Label: none  uuid: 236fe36a-3187-4955-977d-f22cd818c424
>         Total devices 1 FS bytes used 103.96GiB
>         devid 1 size 147.00GiB used 147.00GiB path /dev/sda5
> Btrfs v3.17
> user@pc1-gigabyte ~ $ sudo btrfs fi df /home
> Data, single: total=142.97GiB, used=103.03GiB
> System, DUP: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=2.00GiB, used=957.17MiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=320.00MiB, used=0.00B

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
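The failure is the remote sudo trying to prompt for a password with no terminal attached: the pipe occupies stdin, so ssh allocates no pty and sudo has nothing to ask on. A couple of hedged workarounds (hostnames and the snapshot path are taken from the thread; the sudoers path to btrfs may differ on your system):

```shell
# Option 1: allow the remote user to run btrfs without a password.
# Add via visudo on vm1-debian (assumes the login user is "user" and
# btrfs lives at /bin/btrfs there):
#   user ALL=(root) NOPASSWD: /bin/btrfs
sudo btrfs send "/home/.snapshots/lmde_home_2015-02-07_02:38:21" \
    | ssh vm1-debian sudo btrfs receive /home/.snapshots/

# Option 2: log in as root on the far end so no sudo is needed
# (requires PermitRootLogin and a root authorized_key on vm1-debian):
sudo btrfs send "/home/.snapshots/lmde_home_2015-02-07_02:38:21" \
    | ssh root@vm1-debian btrfs receive /home/.snapshots/

# Note: "ssh -t" does not help here, because ssh refuses to allocate a
# tty when stdin is a pipe, and forcing one with -tt can mangle the
# binary send stream.
```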
Re: [RFC][PATCH v2] mount.btrfs helper
Sorry, misread "initrdless" as "initramfs".

In #btrfs, I usually say something like "do you gain enough by not using an initfs for this to be worth the hassle?", but of course, that's not an argument against making mount smarter.

On Sun, Nov 30, 2014 at 4:57 PM, Dimitri John Ledkov wrote:
> On 30 November 2014 at 22:31, cwillu wrote:
>>
>> In ubuntu, the initfs runs a btrfs dev scan, which should catch
>> anything that would be missed there.
>>
>
> I'm sorry, udev rule(s) is not sufficient in the initramfs-less case,
> as outlined.
>
> In case of booting with initramfs, indeed, both Debian & Ubuntu
> include snippets there to run btrfs scan.
>
> --
> Regards,
>
> Dimitri.
Re: [RFC][PATCH v2] mount.btrfs helper
In ubuntu, the initfs runs a btrfs dev scan, which should catch anything that would be missed there.

On Sun, Nov 30, 2014 at 4:11 PM, Dimitri John Ledkov wrote:
> Hello,
>
> On 30 November 2014 at 17:43, Goffredo Baroncelli wrote:
>> Hi all,
>>
>> this patch provides a "mount.btrfs" helper for the mount command.
>> A btrfs filesystem can span several disks. This helper scans all the
>> partitions to discover all the disks required to mount a filesystem,
>> so it is no longer necessary to "scan" the partitions before mounting.
>>
>
> I would welcome this as a general idea. At the moment in Debian &
> Ubuntu, the btrfs tools package ships udev rules to call "btrfs scan"
> whenever device nodes appear.
>
> If scanning is built into mount, I would be able to drop that udev rule.
> There are also some reports (not yet re-verified) that such a udev rule
> is not effective, i.e. that a btrfs mount fails when attempted before
> udev has been run - e.g. an initrdless boot trying to mount btrfs
> filesystems before udev-trigger has processed "cold-plug" events.
>
> --
> Regards,
>
> Dimitri.
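For reference, the udev rule being discussed has roughly this shape (an approximation of what the Debian/Ubuntu btrfs-tools package shipped at the time; the exact match keys and binary path vary between releases):

```
# /lib/udev/rules.d/70-btrfs.rules (approximate)
# Register every btrfs-formatted block device with the kernel as it
# appears, so a later multi-device mount can find all members.
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="btrfs", \
    RUN+="/sbin/btrfs device scan $env{DEVNAME}"
```

The initrdless failure mode follows directly: this rule only fires once udev processes the device events, so a mount attempted before udev-trigger runs sees only the one device named on the command line.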
Re: TEST PING
On Sun, Oct 12, 2014 at 2:45 PM, royy walls wrote:
>
> --

http://www.tux.org/lkml/#s3

"Test" messages are very, very inappropriate on the lkml or any other list, for that matter. If you want to know whether the subscribe succeeded, wait for a couple of hours after you get a reply from the mailing list software saying it did. You'll undoubtedly get a number of list messages.

If you want to know whether you can post, you must have something important to say, right? After you have read the following paragraphs, compose a real letter, not a test message, in an editor, saving the body of the letter in the off chance your post doesn't succeed. Then post your letter to lkml. Please remember that there are quite a number of subscribers, and it will take a while for your letter to be reflected back to you. An hour is not too long to wait.
Re: What is the vision for btrfs fs repair?
If -o recovery is necessary, then you're either running into a btrfs bug, or your hardware is lying about when it has actually written things to disk. The first case isn't unheard of, although it's far less common than it used to be, and it should continue to improve with time. In the second case, you're potentially screwed regardless of the filesystem, short of hacks like "wait a good long time before returning from fsync in the hope that the disk might actually have gotten around to performing the write it said had already finished."

On Fri, Oct 10, 2014 at 5:12 AM, Bob Marley wrote:
> On 10/10/2014 12:59, Roman Mamedov wrote:
>>
>> On Fri, 10 Oct 2014 12:53:38 +0200
>> Bob Marley wrote:
>>
>>> On 10/10/2014 03:58, Chris Murphy wrote:
>>>>
>>>> * mount -o recovery
>>>>   "Enable autorecovery attempts if a bad tree root is found at
>>>>   mount time."
>>>> I'm confused why it's not the default yet. Maybe it's continuing to
>>>> evolve at a pace that suggests something could sneak in that makes
>>>> things worse? It is almost an oxymoron in that I'm manually enabling
>>>> an autorecovery. If true, maybe the closest indication we'd get of
>>>> btrfs stability is the default enabling of autorecovery.
>>>
>>> No way!
>>> I wouldn't want a default like that.
>>>
>>> If you think of distributed transactions: suppose a sync was issued on
>>> both sides of a distributed transaction, then power was lost on one
>>> side
>>
>> What distributed transactions? Btrfs is not a clustered filesystem[1]; it
>> does not support, and likely never will support, being mounted from
>> multiple hosts at the same time.
>>
>> [1] http://en.wikipedia.org/wiki/Clustered_file_system
>>
>
> This is not the only way to do a distributed transaction.
> Databases can be hosted on the filesystem, and those can do distributed
> transactions.
> Think of two bank accounts, one on btrfs fs1 here, and another bank
> account in a database on whatever filesystem in another country.
> You want to debit one account and credit the other one: the filesystems
> at the two sides *must not roll back their state*!! (especially not
> transparently, without human intervention)
Re: [PATCH RFC] btrfs-progs: Add simple python front end to the search ioctl
Damn you gmail...
Re: [PATCH RFC] btrfs-progs: Add simple python front end to the search ioctl
On Tue, Sep 23, 2014 at 10:39 AM, Chris Mason wrote: > > This is a starting point for a debugfs style python interface using > the search ioctl. For now it can only do one thing, which is to > print out all the extents in a file and calculate the compression ratio. > > Over time it will grow more features, especially for the kinds of things > we might run btrfs-debug-tree to find out. Expect the usage and output > to change dramatically over time (don't hard code to it). > > Signed-off-by: Chris Mason > --- > btrfs-debugfs | 296 > ++ > 1 file changed, 296 insertions(+) > create mode 100755 btrfs-debugfs > > diff --git a/btrfs-debugfs b/btrfs-debugfs > new file mode 100755 > index 000..cf1d285 > --- /dev/null > +++ b/btrfs-debugfs > @@ -0,0 +1,296 @@ > +#!/usr/bin/env python2 > +# > +# Simple python program to print out all the extents of a single file > +# LGPLv2 license > +# Copyright Facebook 2014 > + > +import sys,os,struct,fcntl,ctypes,stat > + > +# helpers for max ints > +maxu64 = (1L << 64) - 1 > +maxu32 = (1L << 32) - 1 > + > +# the inode (like form stat) > +BTRFS_INODE_ITEM_KEY = 1 > +# backref to the directory > +BTRFS_INODE_REF_KEY = 12 > +# backref to the directory v2 > +BTRFS_INODE_EXTREF_KEY = 13 > +# xattr items > +BTRFS_XATTR_ITEM_KEY = 24 > +# orphans for list files > +BTRFS_ORPHAN_ITEM_KEY = 48 > +# treelog items for dirs > +BTRFS_DIR_LOG_ITEM_KEY = 60 > +BTRFS_DIR_LOG_INDEX_KEY = 72 > +# dir items and dir indexes both hold filenames > +BTRFS_DIR_ITEM_KEY = 84 > +BTRFS_DIR_INDEX_KEY = 96 > +# these are the file extent pointers > +BTRFS_EXTENT_DATA_KEY = 108 > +# csums > +BTRFS_EXTENT_CSUM_KEY = 128 > +# root item for subvols and snapshots > +BTRFS_ROOT_ITEM_KEY = 132 > +# root item backrefs > +BTRFS_ROOT_BACKREF_KEY = 144 > +BTRFS_ROOT_REF_KEY = 156 > +# each allocated extent has an extent item > +BTRFS_EXTENT_ITEM_KEY = 168 > +# optimized extents for metadata only > +BTRFS_METADATA_ITEM_KEY = 169 > +# backrefs for extents > 
+BTRFS_TREE_BLOCK_REF_KEY = 176 > +BTRFS_EXTENT_DATA_REF_KEY = 178 > +BTRFS_EXTENT_REF_V0_KEY = 180 > +BTRFS_SHARED_BLOCK_REF_KEY = 182 > +BTRFS_SHARED_DATA_REF_KEY = 184 > +# one of these for each block group > +BTRFS_BLOCK_GROUP_ITEM_KEY = 192 > +# dev extents records which part of each device is allocated > +BTRFS_DEV_EXTENT_KEY = 204 > +# dev items describe devs > +BTRFS_DEV_ITEM_KEY = 216 > +# one for each chunk > +BTRFS_CHUNK_ITEM_KEY = 228 > +# qgroup info > +BTRFS_QGROUP_STATUS_KEY = 240 > +BTRFS_QGROUP_INFO_KEY = 242 > +BTRFS_QGROUP_LIMIT_KEY = 244 > +BTRFS_QGROUP_RELATION_KEY = 246 > +# records balance progress > +BTRFS_BALANCE_ITEM_KEY = 248 > +# stats on device errors > +BTRFS_DEV_STATS_KEY = 249 > +BTRFS_DEV_REPLACE_KEY = 250 > +BTRFS_STRING_ITEM_KEY = 253 > + > +# in the kernel sources, this is flattened > +# btrfs_ioctl_search_args_v2. It includes both the btrfs_ioctl_search_key > +# and the buffer. We're using a 64K buffer size. > +# > +args_buffer_size = 65536 > +class btrfs_ioctl_search_args(ctypes.Structure): Put comments like these in triple-quoted strings just inside the class or function you're defining; this makes them accessible using the standard help() system: class foo(bar): """ In the kernel sources, this is > +_pack_ = 1 > +_fields_ = [ ("tree_id", ctypes.c_ulonglong), > + ("min_objectid", ctypes.c_ulonglong), > + ("max_objectid", ctypes.c_ulonglong), > + ("min_offset", ctypes.c_ulonglong), > + ("max_offset", ctypes.c_ulonglong), > + ("min_transid", ctypes.c_ulonglong), > + ("max_transid", ctypes.c_ulonglong), > + ("min_type", ctypes.c_uint), > + ("max_type", ctypes.c_uint), > + ("nr_items", ctypes.c_uint), > + ("unused", ctypes.c_uint), > + ("unused1", ctypes.c_ulonglong), > + ("unused2", ctypes.c_ulonglong), > + ("unused3", ctypes.c_ulonglong), > + ("unused4", ctypes.c_ulonglong), > + ("buf_size", ctypes.c_ulonglong), > + ("buf", ctypes.c_ubyte * args_buffer_size), > + ] > + > +# the search ioctl resturns one header for each item > 
+class btrfs_ioctl_search_header(ctypes.Structure): > +_pack_ = 1 > +_fields_ = [ ("transid", ctypes.c_ulonglong), > + ("objectid", ctypes.c_ulonglong), > + ("offset", ctypes.c_ulonglong), > + ("type", ctypes.c_uint), > + ("len", ctypes.c_uint), > + ] > + > +# the type field in btrfs_file_extent_item > +BTRFS_FILE_EXTENT_INLINE = 0 > +BTRFS_FILE_EXTENT_REG = 1 > +BTRFS_FILE_EXTENT_PREALLOC = 2 > + > +class btrfs_file_extent_item(ctypes.LittleEndianStructure): > +_pack_ = 1 > +_fields_ = [ ("generation", ctypes.c_ulonglong), > + ("ram_bytes", ctypes.c_ul
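The docstring suggestion in the review can be sketched as follows. This is a trimmed, hypothetical version of the patch's search-args structure (only a few of its fields), just to show that a triple-quoted string inside the class body becomes visible to `help()` and `__doc__`, which a leading `#` comment never is:

```python
import ctypes

args_buffer_size = 65536

class btrfs_ioctl_search_args(ctypes.Structure):
    """In the kernel sources this is flattened into
    btrfs_ioctl_search_args_v2: the search key followed by a 64K
    result buffer.  (Trimmed to a few fields for illustration.)"""
    _pack_ = 1
    _fields_ = [
        ("tree_id", ctypes.c_ulonglong),
        ("min_objectid", ctypes.c_ulonglong),
        ("max_objectid", ctypes.c_ulonglong),
        ("nr_items", ctypes.c_uint),
        ("buf", ctypes.c_ubyte * args_buffer_size),
    ]

# The docstring now shows up in help(btrfs_ioctl_search_args) and in
# btrfs_ioctl_search_args.__doc__; a '#' comment above the class would
# be invisible to both.
```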
Re: [systemd-devel] Slow startup of systemd-journal on BTRFS
It's not an mmap problem, it's a small-writes-with-an-msync-or-fsync-after-each-one problem.

For the case of sequential writes (via write or mmap), padding writes out to page boundaries would help, if the wasted space isn't an issue. Another approach, again assuming all other writes are appends, would be to periodically (but frequently enough that the pages are still in cache) read a chunk of the file and write it back in place, with or without an fsync. On the other hand, if you can afford to lose some logs on a crash, not fsyncing/msyncing after each write will also eliminate the fragmentation.

(Worth pointing out that none of that is conjecture; I just spent 30 minutes testing those cases while composing this ;p)

Josef has mentioned in irc that a piece of Chris' raid5/6 work will also fix this when it lands.

On Mon, Jun 16, 2014 at 1:52 PM, Martin wrote:
> On 16/06/14 17:05, Josef Bacik wrote:
>>
>> On 06/16/2014 03:14 AM, Lennart Poettering wrote:
>>> On Mon, 16.06.14 10:17, Russell Coker (russ...@coker.com.au) wrote:
>>>
>>>> I am not really following though why this trips up btrfs though. I am
>>>> not sure I understand why this breaks btrfs COW behaviour. I mean,
>>>> I don't believe that fallocate() makes any difference to
>>>> fragmentation on BTRFS. Blocks will be allocated when writes occur
>>>> so regardless of an fallocate() call the usage pattern in
>>>> systemd-journald will cause fragmentation.
>>>
>>> journald's write pattern looks something like this: append something to
>>> the end, make sure it is written, then update a few offsets stored at
>>> the beginning of the file to point to the newly appended data. This is
>>> of course not easy to handle for COW file systems. But then again, it's
>>> probably not too different from access patterns of other database or
>>> database-like engines...
>
> Even though this appears to be a problem case for btrfs/COW, is there a
> more favourable write/access sequence possible that is easily
> implemented that is favourable for both ext4-like fs /and/ COW fs?
>
> Database-like writing is known 'difficult' for filesystems: can a data
> log be a simpler case?
>
>> Was waiting for you to show up before I said anything since most systemd
>> related emails always devolve into how evil you are rather than what is
>> actually happening.
>
> Ouch! Hope you two know each other!! :-P :-)
>
> [...]
>> since we shouldn't be fragmenting this badly.
>>
>> Like I said what you guys are doing is fine; if btrfs falls on its face
>> then it's not your fault. I'd just like an exact idea of when you guys
>> are fsync'ing so I can replicate it in a smaller way. Thanks,
>
> Good if COW can be so resilient. I have about 2GBytes of data logging
> files and I must defrag those as part of my backups to stop the system
> fragmenting to a stop (I use "cp -a" to defrag the files to a new area
> and restart the data software logger on that).
>
> Random thoughts:
>
> Would using a second small file just for the mmap-ed pointers help avoid
> repeated rewriting of random offsets in the log file causing excessive
> fragmentation?
>
> Align the data writes to 16kByte or 64kByte boundaries/chunks?
>
> Are mmap-ed files a similar problem to using a swap file and so should
> the same "btrfs file swap" code be used for both?
>
> Not looked over the code so all random guesses...
>
> Regards,
> Martin
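The write pattern Lennart describes (append an entry, fsync, rewrite a header offset at the start of the file, fsync again) can be modelled in a few lines. This is a toy sketch, not journald's actual file format; the 8-byte header and `append_entry` helper are invented for illustration. Each entry dirties both the tail and the first page, which is exactly the pattern that fragments badly under COW:

```python
import os
import struct
import tempfile

def append_entry(fd, payload):
    """Append payload at EOF, fsync, then update the header offset at
    the start of the file to point at the new entry, and fsync again.
    Two synced writes per entry, one of them always to page zero."""
    entry_off = os.lseek(fd, 0, os.SEEK_END)
    os.write(fd, payload)
    os.fsync(fd)
    os.pwrite(fd, struct.pack("<Q", entry_off), 0)  # header: last-entry offset
    os.fsync(fd)
    return entry_off

fd, path = tempfile.mkstemp()
os.write(fd, struct.pack("<Q", 0))  # 8-byte header placeholder
os.fsync(fd)
offsets = [append_entry(fd, b"entry %d\n" % i) for i in range(3)]
```

Skipping the per-entry `os.fsync` calls (or batching entries before the header rewrite) is the "can afford to lose some logs" variant described above.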
Re: [systemd-devel] Slow startup of systemd-journal on BTRFS
Fallocate is a red herring, except insofar as it's a hint that btrfs isn't making much use of: you see the same behaviour with small writes to an mmap'ed file that's msync'ed after each write, and likewise with plain old appending small writes with an fsync after each write, with or without fallocating the file first. Looking at the fiemap output while doing either of those, you'll see a new 4k extent being made, and then the physical location of that extent will increment until the writes move on to the next 4k extent.

cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test
>>> f=open('/tmp/test', 'r+')
>>> m=mmap.mmap(f.fileno(), size)
>>> for x in xrange(size):
...     m[x] = " "
...     m.flush(x / 4096 * 4096, 4096)  # msync(self->data + offset, size, MS_SYNC)

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
start: 0, length: 80
fs_ioc_fiemap 3223348747d
File /tmp/test has 3 extents:
#  Logical  Physical  Length  Flags
0: 000b3d9c 1000
1: 1000 00069f012000 0000003ff000
2: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b3daf3000 1000
1: 1000 00069f012000 0000003ff000
2: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b3dc38000 1000
1: 1000 00069f012000 0000003ff000
2: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b3dc9f000 1000
1: 1000 000b3d2b7000 1000
2: 2000 00069f013000 0000003fe000
3: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b3dc9f000 1000
1: 1000 000b3d424000 1000
2: 2000 00069f013000 0000003fe000
3: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b3dc9f000 1000
1: 1000 000b3d563000 1000
2: 2000 00069f013000 003fe000
3: 0040 000b419d1000 0040 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ rm /tmp/test
cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test
>>> f=open('/tmp/test', 'r+')
>>> f.truncate(size)
>>> m=mmap.mmap(f.fileno(), size)
>>> for x in xrange(size):
...     m[x] = " "
...     m.flush(x / 4096 * 4096, 4096)  # msync(self->data + offset, size, MS_SYNC)

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
start: 0, length: 80
fs_ioc_fiemap 3223348747d
File /tmp/test has 1 extents:
#  Logical  Physical  Length  Flags
0: 000b47f11000 00001000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b48006000 1000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b48183000 1000
1: 1000 000b48255000 1000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b48183000 1000
1: 1000 000b48353000 1000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b48183000 1000
1: 1000 000b493ed000 00001000
2: 2000 000b4a68f000 1000
3: 3000 000b4b36f000 1000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0: 000b48183000 1000
1: 1000 000b493ed000 00001000
2: 2000 000b4a68f000 1000
3: 3000 000b4b4cf000 1000 0001
cwillu@cwillu-home:~/work/btrfs/e2fs$ rm /tmp/test
cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test
>>> f=open('/tmp/test', '
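The truncated transcript above can be reproduced with this cleaned-up, runnable version of the repro (fiemap inspection omitted, since it needs the raw FS_IOC_FIEMAP ioctl). On btrfs, each flushed page is what shows up relocating in the fiemap output; the transcript flushed after every single byte, while this sketch flushes once per page to keep it quick:

```python
import mmap
import os
import tempfile

PAGE = 4096
size = 4 * PAGE

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    f.truncate(size)                       # like the second run in the transcript
    m = mmap.mmap(f.fileno(), size)
    for off in range(0, size, PAGE):
        m[off:off + PAGE] = b" " * PAGE
        # msync just the dirtied page, mirroring the transcript's
        # msync(self->data + offset, size, MS_SYNC)
        m.flush(off, PAGE)
    m.close()
```

Running `fiemap` (or `filefrag -v`) against the file between flushes on a btrfs mount is what produces the moving-extent output shown above.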
Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log
On Fri, Apr 4, 2014 at 12:46 PM, Marc MERLIN wrote:
> On Wed, Apr 02, 2014 at 04:29:35PM +0800, Qu Wenruo wrote:
>> Convert man page for btrfs-zero-log
>>
>> Signed-off-by: Qu Wenruo
>> ---
>>  Documentation/Makefile           |  2 +-
>>  Documentation/btrfs-zero-log.txt | 39 +++
>>  2 files changed, 40 insertions(+), 1 deletion(-)
>>  create mode 100644 Documentation/btrfs-zero-log.txt
>>
>> diff --git a/Documentation/Makefile b/Documentation/Makefile
>> index e002d53..de06629 100644
>> --- a/Documentation/Makefile
>> +++ b/Documentation/Makefile
>> @@ -11,7 +11,7 @@ MAN8_TXT += btrfs-image.txt
>>  MAN8_TXT += btrfs-map-logical.txt
>>  MAN8_TXT += btrfs-show-super.txt
>>  MAN8_TXT += btrfstune.txt
>> -#MAN8_TXT += btrfs-zero-log.txt
>> +MAN8_TXT += btrfs-zero-log.txt
>>  #MAN8_TXT += fsck.btrfs.txt
>>  #MAN8_TXT += mkfs.btrfs.txt
>>
>> diff --git a/Documentation/btrfs-zero-log.txt b/Documentation/btrfs-zero-log.txt
>> new file mode 100644
>> index 000..e3041fa
>> --- /dev/null
>> +++ b/Documentation/btrfs-zero-log.txt
>> @@ -0,0 +1,39 @@
>> +btrfs-zero-log(8)
>> +=
>> +
>> +NAME
>> +
>> +btrfs-zero-log - clear out log tree
>> +
>> +SYNOPSIS
>> +
>> +'btrfs-zero-log'
>> +
>> +DESCRIPTION
>> +---
>> +'btrfs-zero-log' will remove the log tree if the log tree is corrupt, which
>> +will allow you to mount the filesystem again.
>> +
>> +The common case where this happens has been fixed a long time ago,
>> +so it is unlikely that you will see this particular problem.
>
> A note on this one: this can happen if your SSD writes things in the
> wrong order, or potentially writes garbage when power is lost or before
> locking up.
> I hit this problem about 10 times and it wasn't a btrfs bug, just the
> drive doing bad things.

And -o recovery didn't work around it? My understanding is that -o recovery will skip reading the log.
Re: Building a brtfs filesystem < 70M?
Have you tried the -M option to mkfs.btrfs? I'm not sure if we select it automatically (or if we do, whether you have recent enough tools to have that).
Re: [Repost] Is BTRFS "bedup" maintained ?
Bedup was/is a third-party project; not sure if its developer follows this list. Might be worth filing a bug or otherwise poking the author on https://github.com/g2p/bedup

On Wed, Mar 5, 2014 at 2:43 PM, Marc MERLIN wrote:
> On Wed, Mar 05, 2014 at 06:24:40PM +0100, Swāmi Petaramesh wrote:
>> Hello,
>>
>> (Not having received a single answer, I repost this...)
>
> I got your post, and posted myself about bedup not working at all for me,
> and got no answer either.
>
> As far as I can tell, it's entirely unmaintained and was likely just a proof
> of concept until the kernel can do it itself, and that's not entirely
> finished from what I understand.
>
> It's a bit disappointing, but hopefully it'll get fixed eventually.
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/
Re: No space left on device (again)
Try btrfs filesystem balance start -dusage=15 /home, and gradually increase it until you see it relocate at least one chunk.

On Tue, Feb 25, 2014 at 2:27 PM, Marcus Sundman wrote:
> On 25.02.2014 22:19, Hugo Mills wrote:
>>
>> On Tue, Feb 25, 2014 at 01:05:51PM -0500, Jim Salter wrote:
>>>
>>> 370GB of 410GB used isn't really "fine", it's over 90% usage.
>>>
>>> That said, I'd be interested to know why btrfs fi show /dev/sda3
>>> shows 412.54G used, but btrfs fi df /home shows 379G used...
>>
>> This is an FAQ...
>>
>> btrfs fi show tells you how much is allocated out of the available
>> pool on each disk. btrfs fi df then shows how much of that allocated
>> space (in each category) is used.
>
> What is the difference between the "used 371.11GB" and the "used 412.54GB"
> displayed by "btrfs fi show"?
>
>> The problem here is also in the FAQ: the metadata is close to full
>> -- typically something like 500-750 MiB of headroom is needed in
>> metadata. The FS can't allocate more metadata because it's allocated
>> everything already (total=used in btrfs fi show), so the solution is
>> to do a filtered balance:
>>
>> btrfs balance start -dusage=5 /mountpoint
>
> Of course that was the first thing I tried, and it didn't help *at* *all*:
>
>> # btrfs filesystem balance start -dusage=5 /home
>> Done, had to relocate 0 out of 415 chunks
>> #
>
> ... and it really didn't free anything.
>
>>> On 02/25/2014 11:49 AM, Marcus Sundman wrote:
>>>> Hi
>>>>
>>>> I get "No space left on device" and it is unclear why:
>>>>
>>>>> # df -h|grep sda3
>>>>> /dev/sda3 413G 368G 45G 90% /home
>>>>> # btrfs filesystem show /dev/sda3
>>>>> Label: 'home' uuid: 46279061-51f4-40c2-afd0-61d6faab7f60
>>>>> Total devices 1 FS bytes used 371.11GB
>>>>> devid 1 size 412.54GB used 412.54GB path /dev/sda3
>>>>>
>>>>> Btrfs v0.20-rc1
>>>>> # btrfs filesystem df /home
>>>>> Data: total=410.52GB, used=369.61GB
>>>>> System: total=4.00MB, used=64.00KB
>>>>> Metadata: total=2.01GB, used=1.50GB
>>>>> #
>>>>
>>>> So, 'data' and 'metadata' seem to be fine(?), but 'system' is a bit
>>>> low. Is that it? If so, can I do something about it? Or should I look
>>>> somewhere else?
>>>>
>>>> I really wish I could get a warning before running out of disk space,
>>>> instead of everything breaking suddenly when there seems to be lots
>>>> and lots of space left.
>>>>
>>>> - Marcus
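The reason raising -dusage can help where -dusage=5 did nothing: the filter only considers data chunks whose usage is at or below the given percentage, and a filesystem driven to ENOSPC tends to have only nearly-full chunks. A toy model of the selection (not real btrfs code; the function name is mine):

```python
def chunks_selected(chunk_usage_percent, dusage):
    """Return indices of data chunks a 'balance -dusage=N' would
    consider: those whose used/total ratio is at most N percent."""
    return [i for i, u in enumerate(chunk_usage_percent) if u <= dusage]

# Chunks that are all nearly full explain the report
# "had to relocate 0 out of 415 chunks" at -dusage=5:
chunks = [99, 97, 90, 40, 12]
print(chunks_selected(chunks, 5))    # -> []
print(chunks_selected(chunks, 15))   # -> [4]
print(chunks_selected(chunks, 50))   # -> [3, 4]
```

Each selected chunk is rewritten and its surviving data packed into other chunks, which is what frees whole chunks back to the unallocated pool for metadata to use.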
Re: What to do about df and btrfs fi df
On Mon, Feb 10, 2014 at 7:02 PM, Roger Binns wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/02/14 10:24, cwillu wrote:
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space;
>
> I think the flipside of the above works well. How large a group of files
> can you expect to create before you will get ENOSPC?
>
> That, for example, is the check made by code that looks at df - "I need
> to put in X GB of files - will it fit?" It is also what users do.

But the answer changes dramatically depending on whether it's a large number of small files or a small number of large files, and the conservative worst-case choice means we report a number that is half what is probably expected.
Re: What to do about df and btrfs fi df
> In the past [1] I proposed the following approach.
>
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:             400.00GB
> Disk allocated:          8.04GB
> Disk unallocated:      391.97GB
> Used:                   11.29MB
> Free (Estimated):      250.45GB (Max: 396.99GB, min: 201.00GB)
> Data to disk ratio:         63 %

Note that a big chunk of the problem is "what do we do with the regular system df output". I don't mind this as a btrfs fi df summary, though.
Re: What to do about df and btrfs fi df
>> IMO, used should definitely include metadata, especially given that we
>> inline small files.
>>
>> I can convince myself both that this implies that we should roll it
>> into b_avail, and that we should go the other way and only report the
>> actual used number for metadata as well, so I might just plead
>> insanity here.
>
> I could be convinced to do this. So we have
>
> total: (total disk bytes) / (raid multiplier)
> used:  (total used in data block groups) +
>        (total used in metadata block groups)
> avail: total - (total used in data block groups +
>        total metadata block groups)
>
> That seems like the simplest to code up. Then we can argue about whether to
> use the total metadata size or just the used metadata size for b_avail.
> Seem reasonable?

I can't think of any situations where this results in tears.
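The proposed accounting above is simple enough to state directly in code. A sketch under the quoted formulas (function and variable names are mine, not from any real btrfs code; "avail" charges data at its used size but metadata at its full allocated chunk size, as discussed):

```python
def df_numbers(total_disk_bytes, raid_multiplier,
               data_used, metadata_used, metadata_allocated):
    """statfs-style (total, used, avail), per the thread's proposal:
    total divides raw bytes by the raid multiplier; used counts bytes
    used in data and metadata block groups; avail subtracts used data
    plus *allocated* metadata from total."""
    total = total_disk_bytes // raid_multiplier
    used = data_used + metadata_used
    avail = total - (data_used + metadata_allocated)
    return total, used, avail

GiB = 1 << 30
# Toy RAID1 filesystem: 200 GiB raw, 30 GiB of data, 2 GiB of metadata
# chunks allocated of which 1 GiB is actually used.
total, used, avail = df_numbers(200 * GiB, 2, 30 * GiB, 1 * GiB, 2 * GiB)
print(total // GiB, used // GiB, avail // GiB)   # -> 100 31 68
```

This also makes Josef's earlier 80 GB example concrete: the leftover allocated-but-unused metadata chunk reduces avail even after the data is deleted.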
Re: What to do about df and btrfs fi df
IMO, used should definitely include metadata, especially given that we inline small files.

I can convince myself both that this implies that we should roll it into b_avail, and that we should go the other way and only report the actual used number for metadata as well, so I might just plead insanity here.

On Mon, Feb 10, 2014 at 12:28 PM, Josef Bacik wrote:
>
> On 02/10/2014 01:24 PM, cwillu wrote:
>>
>> I concur.
>>
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space; the statfs syscall isn't
>> rich enough to cover all the bases even in the face of dup metadata
>> and single data (i.e., the common case), and a truly conservative
>> estimate (report based on the highest-usage raid level in use) would
>> report space/2 on that same common case. "Highest-usage data raid
>> level in use" is probably the best compromise, with a big warning that
>> large numbers of small files will not actually fit, posted in some
>> mythical place that users look.
>>
>> I would like to see the information from btrfs fi df and btrfs fi show
>> summarized somewhere (ideally as a new btrfs fi df output), as both
>> sets of numbers are really necessary, or at least have btrfs fi df
>> include the amount of space not allocated to a block group.
>>
>> Re regular df: are we adding space allocated to a block group (raid1,
>> say) but not in actual use in a file as the N/2 space available in the
>> block group, or the N space it takes up on disk? This probably
>> matters a bit less than it used to, but if it's N/2, that leaves us
>> open to "empty filesystem, 100GB free, write an 80GB file and then
>> delete it, wtf, only 60GB free now?" reporting issues.
>
> The only case where we add the actual allocated chunk space is for
> metadata; for data we only add the actual used number. So say you write
> an 80gb file and then delete it, but during the writing we allocated a
> 1 gig chunk for metadata: you'll see only 99gb free. Make sense? We
> could (should?) roll this into the b_avail magic and make "used" really
> only reflect data usage; opinions on this? Thanks,
>
> Josef
Re: What to do about df and btrfs fi df
I concur. The regular df data used number should be the amount of space required to hold a backup of that content (assuming that the backup maintains reflinks and compression and so forth). There's no good answer for available space; the statfs syscall isn't rich enough to cover all the bases even in the face of dup metadata and single data (i.e., the common case), and a truly conservative estimate (report based on the highest-usage raid level in use) would report space/2 on that same common case. "Highest-usage data raid level in use" is probably the best compromise, with a big warning that large numbers of small files will not actually fit, posted in some mythical place that users look. I would like to see the information from btrfs fi df and btrfs fi show summarized somewhere (ideally as a new btrfs fi df output), as both sets of numbers are really necessary, or at least have btrfs fi df include the amount of space not allocated to a block group. Re regular df: are we adding space allocated to a block group (raid1, say) but not in actual use in a file as the N/2 space available in the block group, or the N space it takes up on disk? This probably matters a bit less than it used to, but if it's N/2, that leaves us open to "empty filesystem, 100GB free, write an 80GB file and then delete it, wtf, only 60GB free now?" reporting issues.
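For concreteness, the fields in question can be poked at from userspace. Here is a minimal sketch (editor's illustration in Python, not btrfs code) of the statvfs fields that df consumes, with the "highest-usage raid level" halving heuristic from the message above applied to the available-space estimate:

```python
import os

# Editor's sketch (not btrfs's actual accounting): the statfs/statvfs
# fields df reads, plus the conservative "divide by number of copies"
# estimate discussed above for RAID1-style data profiles.
def df_style_report(path, data_copies=1):
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # what df prints as "Size"
    avail = st.f_bavail * st.f_frsize   # what df prints as "Avail"
    # Each byte of data occupies data_copies bytes of raw space, so a
    # conservative estimate shrinks the reported free space accordingly.
    return total, avail // data_copies

total, avail = df_style_report(".")
assert 0 <= avail <= total
```

Note that the halving is exactly the "report space/2 on the common case" problem: a single-profile filesystem queried with data_copies=2 would understate its real capacity by half.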
Re: Provide a better free space estimate on RAID1
Everyone who has actually looked at what the statfs syscall returns and how df (and everyone else) uses it, keep talking. Everyone else, go read that source code first. There is _no_ combination of values you can return in statfs which will not be grossly misleading in some common scenario that someone cares about.
Re: Are nocow files snapshot-aware
On Thu, Feb 6, 2014 at 6:32 PM, Kai Krakow wrote: > Duncan <1i5t5.dun...@cox.net> schrieb: > >>> Ah okay, that makes it clear. So, actually, in the snapshot the file is >>> still nocow - just for the exception that blocks being written to become >>> unshared and relocated. This may introduce a lot of fragmentation but it >>> won't become worse when rewriting the same blocks over and over again. >> >> That also explains the report of a NOCOW VM-image still triggering the >> snapshot-aware-defrag-related pathology. It was a _heavily_ auto- >> snapshotted btrfs (thousands of snapshots, something like every 30 >> seconds or more frequent, without thinning them down right away), and the >> continuing VM writes would nearly guarantee that many of those snapshots >> had unique blocks, so the effect was nearly as bad as if it wasn't NOCOW >> at all! > > The question here is: Does it really make sense to create such snapshots of > disk images currently online and running a system. They will probably be > broken anyway after rollback - or at least I'd not fully trust the contents. > > VM images should not be part of a subvolume of which snapshots are taken at > a regular and short interval. The problem will go away if you follow this > rule. > > The same applies to probably any kind of file which you make nocow - e.g. > database files. Most of those file implement their own way of transaction > protection or COW system, e.g. look at InnoDB files. Neither they gain > anything from using IO schedulers (because InnoDB internally does block > sorting and prioritizing and knows better, doing otherwise even hurts > performance), nor they gain from file system semantics like COW (because it > does its own transactions and atomic updates and probably can do better for > its use case). Similar applies to disk images (imagine ZFS, NTFS, ReFS, or > btrfs images on btrfs). 
Snapshots can only do harm here (the only > "protection" use case would be to have a backup, but snapshots are no > backups), and COW will probably hurt performance a lot. The only use case is > taking _controlled_ snapshots - and doing it every 30 seconds is by all means > NOT controlled, it's completely nondeterministic. If the database/virtual machine/whatever is crash safe, then the atomic state that a snapshot grabs will be useful.
Re: BTRFS corrupted by combination of mistreatment of hibernation and accidental power loss.
You'd have been better off to just throw away the hibernated image: mounting the filesystem would look like any other recovery from a crash, and would have replayed the log and committed a new transaction, in addition to whatever other disk writes happened due to boot logs and so forth. In this case, I suspect you'd have been perfectly fine. When resuming the hibernated image at that point however, the kernel will have its own ideas about the state on disk (i.e., whatever state it had in memory), partially undoing a subset of the changes from the previous boot and generally making a mess of things. That said, have you tried mounting with -o recovery yet? I wouldn't be surprised if btrfs-restore was also able to retrieve most of everything. Either way, I'd be suspicious of the filesystem, and would look to restore from backup to a fresh fs. On Wed, Jan 29, 2014 at 8:50 AM, Adam Ryczkowski wrote: > I have two independent Linux installations on my notebook, both sharing the > same btrfs partition as root file system, but installed on different > subvolumes. > > I hibernated one Linux (Mint 15 64 bit). Hibernation data is stored on the > swap file, which is used exclusively by this system. > > Then 2 events happened. > > 1) I accidentally ran the other system, which wasn't hibernated - Ubuntu > 12.10. Realizing the problem, I waited until the system booted up, and then > shut it down. > > Then I opened the hibernated Mint 15. Restoration was successful, and I > never thought I was in trouble. > > 2) Immediately after that, by coincidence, the battery ran out, brutally > powering down the computer. > > After that, I am unable to repair/mount the root btrfs partition, however I > try (I built the current btrfs-tools from git). Dmesg displays only one > error entry: btrfs: open_ctree failed. > > I know that if one of those two events had happened separately, there would be no > problem. The problem arose only when those two events happened > simultaneously.
> > So I guess I am experiencing one of the corner cases. > > What are my prospects of restoring my data? I have several subvolumes on the > hard drive, some of them were not touched by the accident at all. > > Adam Ryczkowski
Re: btrfs and ECC RAM
On Fri, Jan 17, 2014 at 6:23 PM, Ian Hinder wrote: > Hi, > > I have been reading a lot of articles online about the dangers of using ZFS > with non-ECC RAM. Specifically, the fact that when good data is read from > disk and compared with its checksum, a RAM error can cause the read data to > be incorrect, causing a checksum failure, and the bad data might now be > written back to the disk in an attempt to correct it, corrupting it in the > process. This would be exacerbated by a scrub, which could run through all > your data and potentially corrupt it. There is a strong current of opinion > that using ZFS without ECC RAM is "suicide for your data". That sounds entirely silly: a scrub will only write data to the disk that has actually passed a checksum. In order for that to corrupt something on disk, you'd have to have a perfect storm of correct and corrupt reads, and in every such case that I can think of, you'd be worse off without checksums than if you had them.
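To make the argument concrete, here is a toy model (editor's sketch, not ZFS or btrfs code) of why a checksum-verifying scrub doesn't blindly propagate bad reads: it only ever writes back data that has already passed its recorded checksum.

```python
import zlib

# Toy model of a checksumming scrub over a two-copy (RAID1-style) block.
# The checksum was recorded at write time; a bad RAM read fails
# verification instead of being "repaired" onto disk.
def scrub(mirror_a, mirror_b, checksum):
    def ok(data):
        return zlib.crc32(data) == checksum
    if ok(mirror_a) and not ok(mirror_b):
        return mirror_a, mirror_a   # repair b from the verified copy a
    if ok(mirror_b) and not ok(mirror_a):
        return mirror_b, mirror_b   # repair a from the verified copy b
    return mirror_a, mirror_b       # both good, or unrepairable: no blind write

good = b"hello world"
csum = zlib.crc32(good)
# A bit-flipped copy is repaired from the copy that still verifies:
assert scrub(good, b"hellp world", csum) == (good, good)
# If neither read verifies (the "perfect storm"), nothing is written back:
assert scrub(b"xxxxx", b"yyyyy", csum) == (b"xxxxx", b"yyyyy")
```

The failure mode the articles describe requires the corrupt in-memory data to *pass* verification against the checksum of the good data, which is precisely the perfect storm of correct and corrupt reads described above.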
Re: Subvolume creation returns file exists
On Fri, Nov 15, 2013 at 9:27 AM, Hugo Mills wrote: > On Fri, Nov 15, 2013 at 02:33:58PM +, Alin Dobre wrote: >> We are using btrfs filesystems in our infrastructure and, at some >> point of time, they start refusing to create new subvolumes. >> >> Each file system is being quota initialized immediately after its >> creation (with "btrfs quota enable") and then all subfolders under >> the root directory are created as subvolumes (btrfs subvolume >> create). Over time, these subvolumes may also be deleted. What's >> under subvolumes are just various files and directories, should not >> be related to this problem. >> >> After a while of using this setup, without any obvious steps to >> reproduce it, the filesystem goes into a state where the following >> happens: >> # btrfs subvolume create btrfs_mount/test_subvolume >> Create subvolume 'btrfs_mount/test_subvolume' >> ERROR: cannot create subvolume - File exists > >We've had someone else with this kind of symptom (snapshot/subvol > creation fails unexpectedly) on IRC recently. I don't think they've > got to the bottom of it yet, but the investigation is ongoing. I've > cc'd Carey in on this, because he was the one trying to debug it. > >Hugo. > >> In regards to data, the filesystem is pretty empty, it only has a >> single empty directory. I don't know about the metadata, at this >> point. >> >> The problem goes away if we disable and re-enable the quota. It all >> seems to be some dead metadata lying around. And indeed, it turns out I did have quotas enabled, and disabling them restores the ability to create subvolumes.
Re: Odp: Re: Odp: Btrfs might be gradually slowing the boot process
On Fri, Nov 8, 2013 at 1:46 PM, Hugo Mills wrote: > On Fri, Nov 08, 2013 at 08:37:37PM +0100, y...@wp.pl wrote: >> Sure; >> >> the kernel line from grub.cfg: >> linux /boot/vmlinuz-linux root=UUID=c26e6d9a-0bbb-436a-a217-95c738b5b9c6 >> rootflags=noatime,space_cache rw quiet > >OK, this may be your problem. You're generating the space cache > every time you boot. You only need it once; let the disk activity on > boot finish (it may take a while, depending on how big your filesystem > is, and how much data it has, and how fragmented it is), and remove > the space_cache option from your rootflags. When you next boot, it > will use the existing cache rather than generating it again from > scratch. While it's true you only need to mount with it once, mounting with space_cache will only generate it if it doesn't already exist. The existence of a valid space cache generation in the super actually enables exactly the same flag that space_cache/nospace_cache toggles, in the very same function as the mount option is checked (this is basically how the "you only need to mount with it once" magic is implemented). super.c:

int btrfs_parse_options(struct btrfs_root *root, char *options)
{
	...
	cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
	if (cache_gen)
		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
	...
	case Opt_space_cache:
		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
		break;
	...
}
Re: btrfs and default directory link count
On Fri, Nov 8, 2013 at 5:07 AM, Andreas Schneider wrote: > Hello, > > I did run the Samba testsuite and have a failing test > (samba.vfstest.stream_depot). It revealed that it only fails on btrfs. The > reason is that a simple check fails: > > if (smb_fname_base->st.st_ex_nlink == 2) > > If you create a directory on btrfs and check stat: > > $ mkdir x > $ stat x > File: ‘x’ > Size: 0 Blocks: 0 IO Block: 4096 directory > Device: 2bh/43d Inode: 3834720 Links: 1 > Access: (0755/drwxr-xr-x) Uid: ( 1000/ asn) Gid: ( 100/ users) > Access: 2013-11-08 11:54:32.431040963 +0100 > Modify: 2013-11-08 11:54:32.430040956 +0100 > Change: 2013-11-08 11:54:32.430040956 +0100 > Birth: - > > then you see Links: 1. On ext4 or other filesystems: > > mkdir x > stat x > File: ‘x’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fd00h/64768d Inode: 8126886 Links: 2 > Access: (0755/drwxr-xr-x) Uid: ( 1000/ asn) Gid: ( 100/ users) > Access: 2013-11-08 11:54:55.428212340 +0100 > Modify: 2013-11-08 11:54:55.427212319 +0100 > Change: 2013-11-08 11:54:55.427212319 +0100 > Birth: - > > the link count for a directory differs: Links: 2. > > Why is btrfs different here? Could someone explain this? As I understand it, inferring the number of directory entries from st_nlink is an optimization that isn't universally valid. If that count is 1, it must be considered invalid, and programs that don't handle this correctly are broken. Coreutils handle this, at least...
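A portable caller therefore treats st_nlink == 1 as "no information" and falls back to counting entries. A sketch of that convention (editor's illustration, assuming the traditional Unix rule that a directory's link count is 2 plus its number of subdirectories):

```python
import os
import tempfile

# Portable subdirectory count: the classic "nlink - 2" shortcut is only
# valid when the filesystem maintains the convention (nlink >= 2).
# btrfs reports nlink == 1, which must be treated as "unknown" and
# resolved by actually counting directory entries.
def subdir_count(path):
    nlink = os.stat(path).st_nlink
    if nlink >= 2:
        return nlink - 2   # subtract the "." entry and the parent's entry
    # nlink == 1: no information; count subdirectories directly
    return sum(1 for e in os.scandir(path) if e.is_dir(follow_symlinks=False))

with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, "a"))
    os.mkdir(os.path.join(d, "b"))
    # Correct on both conventions: nlink 4 on ext4-style, nlink 1 on btrfs
    assert subdir_count(d) == 2
```

The Samba check quoted above (st_ex_nlink == 2 meaning "no subdirectories") is exactly the shortcut that breaks on btrfs.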
Re: btrfsck errors, is it safe to fix?
On Mon, Nov 4, 2013 at 3:14 PM, Hendrik Friedel wrote: > Hello, > > the list was quite full with patches, so this might have been hidden. > Here the complete Stack. > Does this help? Is this what you needed? >> [95764.899294] CPU: 1 PID: 21798 Comm: umount Tainted: GFCIO >> 3.11.0-031100rc2-generic #201307211535 Can you reproduce the problem under the released 3.11 or 3.12? An -rc2 is still pretty early in the release cycle, and I wouldn't be at all surprised if it was a bug added and fixed in a later rc.
Re: btrfsck errors, is it safe to fix?
> Now that I am searching, I see this in dmesg: > [95764.899359] [] free_fs_root+0x99/0xa0 [btrfs] > [95764.899384] [] btrfs_drop_and_free_fs_root+0x93/0xc0 > [btrfs] > [95764.899408] [] del_fs_roots+0xcf/0x130 [btrfs] > [95764.899433] [] close_ctree+0x146/0x270 [btrfs] > [95764.899461] [] btrfs_put_super+0x19/0x20 [btrfs] > [95764.899493] [] btrfs_kill_super+0x1a/0x90 [btrfs] Need to see the rest of the trace this came from.
Re: kernel BUG at fs/btrfs/relocation.c:1060 during rebalancing
Another user has just reported this in irc on 3.11.2 kernel BUG at fs/btrfs/relocation.c:1055! invalid opcode: [#1] SMP Modules linked in: ebtable_nat nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables bnep ip6table_filter ip6_tables arc4 x86_pkg_temp_thermal coretemp kvm_intel ath9k_htc joydev ath9k_common ath9k_hw ath kvm snd_hda_codec_hdmi mac80211 cfg80211 iTCO_wdt iTCO_vendor_support ath3k r8169 btusb snd_hda_codec_realtek snd_hda_intel mii snd_hda_codec snd_hwdep serio_raw snd_seq snd_seq_device mxm_wmi snd_pcm bluetooth mei_me microcode i2c_i801 rfkill shpchp lpc_ich mfd_core mei wmi mperf snd_page_alloc snd_timer snd soundcore uinput btrfs libcrc32c xor zlib_deflate raid6_pq dm_crypt hid_logitech_dj i915 crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_algo_bit drm_kms_helper drm i2c_core video CPU: 1 PID: 564 Comm: btrfs-balance Not tainted 3.11.2-201.fc19.x86_64 #1 Hardware name: ECS Z77H2-AX/Z77H2-AX, BIOS 4.6.5 10/25/2012 task: 8807ee1c1e80 ti: 8807f1cc8000 task.ti: 8807f1cc8000 RIP: 0010:[] [] build_backref_tree+0x1077/0x1130 [btrfs] RSP: 0018:8807f1cc9ab8 EFLAGS: 00010246 RAX: RBX: 8807eef77480 RCX: dead00200200 RDX: 8807f1cc9b28 RSI: 8807f1cc9b28 RDI: 8807ef5896d0 RBP: 8807f1cc9b98 R08: 8807ef5896d0 R09: 0001 R10: a01f5483 R11: R12: 8807ef5896d0 R13: 8807ef5896c0 R14: 8807f22ee360 R15: 8807f0e62000 FS: () GS:88081f24() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f7d97749a90 CR3: 0007e38ef000 CR4: 001407e0 Stack: 8807f0e62580 8807eef77a80 8807ef5899e0 8807eef77780 8807ea5ab000 8807f22ee360 8807eef777c0 8807f22ee000 8807f0e62120 8807eef77a80 8807f0e62020 Call Trace: [] relocate_tree_blocks+0x1d8/0x630 [btrfs] [] ? 
add_data_references+0x248/0x280 [btrfs] [] relocate_block_group+0x280/0x690 [btrfs] [] btrfs_relocate_block_group+0x19f/0x2e0 [btrfs] [] btrfs_relocate_chunk.isra.32+0x6f/0x740 [btrfs] [] ? btrfs_set_path_blocking+0x39/0x80 [btrfs] [] ? btrfs_search_slot+0x382/0x940 [btrfs] [] ? free_extent_buffer+0x4f/0xa0 [btrfs] [] btrfs_balance+0x8e7/0xe80 [btrfs] [] balance_kthread+0x70/0x80 [btrfs] [] ? btrfs_balance+0xe80/0xe80 [btrfs] [] kthread+0xc0/0xd0 [] ? insert_kthread_work+0x40/0x40 [] ret_from_fork+0x7c/0xb0 [] ? insert_kthread_work+0x40/0x40 Code: 4c 89 f7 e8 0c 0c f9 ff 48 8b bd 58 ff ff ff e8 00 0c f9 ff 48 83 bd 38 ff ff ff 00 0f 85 1e fe ff ff 31 c0 e9 5d f0 ff ff 0f 0b <0f> 0b 48 8b 73 18 48 89 c7 e8 49 f3 01 00 48 8b 85 38 ff ff ff RIP [] build_backref_tree+0x1077/0x1130 [btrfs] RSP On Wed, Sep 25, 2013 at 11:26 PM, Guenther Starnberger wrote: > On Wed, Sep 25, 2013 at 04:46:41PM +0200, David Sterba wrote: > >> 3.12-rc really? I'd like to see the stacktrace then. > > Yes - this also happens on 3.12-rc kernels. Here's the stacktrace for 4b97280 > (which is several commits ahead of 3.12-rc2): > > [ 126.735598] btrfs: disk space caching is enabled > [ 126.737038] btrfs: has skinny extents > [ 144.769929] BTRFS debug (device dm-0): unlinked 1 orphans > [ 144.836240] btrfs: continuing balance > [ 153.441134] btrfs: relocating block group 1542996361216 flags 1 > [ 295.780293] btrfs: found 18 extents > [ 310.107200] [ cut here ] > [ 310.108496] kernel BUG at fs/btrfs/relocation.c:1060! 
> [ 310.109709] invalid opcode: [#1] PREEMPT SMP > [ 310.110268] Modules linked in: btrfs raid6_pq crc32c libcrc32c xor xts > gf128mul dm_crypt dm_mod usb_storage psmouse ppdev e1000 evdev pcspkr > serio_raw joydev microcode snd_intel8x0 snd_ac97_codec i2c_piix4 i2c_core > ac97_bus snd_pcm snd_page_alloc snd_timer parport_pc parport snd soundcore > intel_agp button battery processor ac intel_gtt ext4 crc16 mbcache jbd2 > hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi ohci_pci > ata_piix ahci libahci ohci_hcd ehci_pci ehci_hcd usbcore usb_common libata > scsi_mod > [ 310.110268] CPU: 0 PID: 366 Comm: btrfs-balance Not tainted > 3.12.0-1-00083-g4b97280-dirty #1 > [ 310.110268] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS > VirtualBox 12/01/2006 > [ 310.110268] task: 880078b0 ti: 880078afe000 task.ti: > 880078afe000 > [ 310.110268] RIP: 0010:[] [] > build_backref_tree+0x112a/0x11d0 [btrfs] > [ 310.110268] RSP: 0018:880078affab8 EFLAGS: 00010246 > [ 310.110268] RAX: RBX: 8800784d4000 RCX: > 88006a2a9d90 > [ 310.110268] RDX: 880078affb30 RSI: 8800784d4020 RDI: > 88006a2a9d80 > [ 310.110268] RBP: 880078affba0 R08: 880077d07e00 R09: > 880078affa
Re: [PATCH] Drop unused parameter from btrfs_item_nr
On Mon, Sep 16, 2013 at 8:58 AM, Ross Kirk wrote: > Unused parameter cleanup > > Ross Kirk (1): > btrfs: drop unused parameter from btrfs_item_nr > > fs/btrfs/backref.c|2 +- > fs/btrfs/ctree.c | 34 +- > fs/btrfs/ctree.h | 13 ++--- > fs/btrfs/dir-item.c |2 +- > fs/btrfs/inode-item.c |2 +- > fs/btrfs/inode.c |4 ++-- > fs/btrfs/print-tree.c |2 +- > fs/btrfs/send.c |4 ++-- > 8 files changed, 31 insertions(+), 32 deletions(-) > > -- > 1.7.7.6 Something appears to be missing...
Re: Manual deduplication would be useful
On Tue, Jul 23, 2013 at 9:47 AM, Rick van Rein wrote: > Hello, > > For over a year now, I've been experimenting with stacked filesystems as a > way to save on resources. A basic OS layer is shared among Containers, each > of which stacks a layer with modifications on top of it. This approach means > that Containers share buffer cache and loaded executables. Concrete > technology choices aside, the result is rock-solid and the efficiency > improvements are incredible, as documented here: > > http://rickywiki.vanrein.org/doku.php?id=openvz-aufs > > One problem with this setup is updating software. In lieu of > stacking-support in package managers, it is necessary to do this on a > per-Container basis, meaning that each installs their own versions, including > overwrites of the basic OS layer. Deduplication could remedy this, but the > generic mechanism is known from ZFS to be fairly inefficient. > > Interestingly however, this particular use case demonstrates that a much > simpler deduplication mechanism than normally considered could be useful. It > would suffice if the filesystem could check on manual hints, or > stack-specifying hints, to see if overlaid files share the same file > contents; when they do, deduplication could commence. This saves searching > through the entire filesystem for every file or block written. It might also > mean that the actual stacking is not needed, but instead a basic OS could be > cloned to form a new basic install, and kept around for this hint processing. 
> > I'm not sure if this should ideally be implemented inside the stacking > approach (where it would be stacking-implementation-specific) or in the > filesystem (for which it might be too far off the main purpose) but I thought > it wouldn't hurt to start a discussion on it, given that (1) filesystems > nowadays service multiple instances, (2) filesystems like Btrfs are based on > COW, and (3) deduplication is a goal but the generic mechanism could use some > efficiency improvements. > > I hope having seen this approach is useful to you! > > Please reply-all? I'm not on this list. > > Cheers, > -Rick There are patches providing offline dedup (i.e., manually telling the kernel which files to consider) floating around: http://lwn.net/Articles/547542/
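The hint-driven idea can be sketched in userspace (editor's illustration; the same_content helper is hypothetical, and the actual extent-sharing step, the ioctl added by those offline-dedup patches, is deliberately omitted): rather than scanning the whole filesystem, take candidate pairs, such as the same path in a base OS layer and in a container layer, and verify byte identity before asking the kernel to share extents.

```python
import hashlib
import os
import tempfile

# Hint-driven dedup candidate check: given two paths suspected to be
# identical (the "hint"), confirm it cheaply by size and then by content
# hash. Only confirmed pairs would be handed to the kernel for actual
# extent sharing (that ioctl call is omitted here).
def same_content(path_a, path_b, chunk=1 << 20):
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    ha, hb = hashlib.sha256(), hashlib.sha256()
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            da, db = fa.read(chunk), fb.read(chunk)
            ha.update(da)
            hb.update(db)
            if not da:   # equal sizes, so both files end together
                break
    return ha.digest() == hb.digest()

with tempfile.TemporaryDirectory() as d:
    base, layer = os.path.join(d, "base"), os.path.join(d, "layer")
    for p in (base, layer):
        with open(p, "wb") as f:
            f.write(b"same bytes" * 1000)
    assert same_content(base, layer)   # dedup candidate confirmed
```

This is exactly the efficiency win described above: the expensive comparison runs only on hinted pairs, not on every block written.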
Re: raid 10 corruption from single drive failure
> Making this with all 6 devices from the beginning and btrfsck doesn't > segfault. But it also doesn't repair the system enough to make it > mountable. ( neither does -o recover, however -o degraded works, and > files > are then accessible ) Not sure I entirely follow: mounting with -o degraded (not -o recovery) is how you're supposed to mount if there's a disk missing.
Re: My multi-device btrfs (3*2TB) won't mount anymore.
Does anything show up in dmesg when you mount? If mount just hangs, do an alt-sysrq-w, and then post what that sends to dmesg.
Re: raid0, raid1, raid5, what to choose?
On Thu, Jun 13, 2013 at 3:21 PM, Hugo Mills wrote: > On Thu, Jun 13, 2013 at 11:09:00PM +0200, Hendrik Friedel wrote: >> Hello, >> >> I'd appreciate your recommendation on this: >> >> I have three hdd with 3TB each. I intend to use them as raid5 eventually. >> currently I use them like this: >> >> # mount|grep sd >> /dev/sda1 on /mnt/Datenplatte type ext4 >> /dev/sdb1 on /mnt/BTRFS/Video type btrfs >> /dev/sdb1 on /mnt/BTRFS/rsnapshot type btrfs >> >> #df -h >> /dev/sda1 2,7T 1,3T 1,3T 51% /mnt/Datenplatte >> /dev/sdb1 5,5T 5,4T 93G 99% /mnt/BTRFS/Video >> /dev/sdb1 5,5T 5,4T 93G 99% /mnt/BTRFS/rsnapshot >> >> Now, what surprises me, and here I lack memory- is that sdb appears >> twice.. I think, I created a raid1, but how can I find out? > >Appearing twice in that list is more an indication that you have > multiple subvolumes -- check the subvol= options in /etc/fstab > >> #/usr/local/smarthome# ~/btrfs/btrfs-progs/btrfs fi show /dev/sdb1 >> Label: none uuid: 989306aa-d291-4752-8477-0baf94f8c42f >> Total devices 2 FS bytes used 2.68TB >> devid2 size 2.73TB used 2.73TB path /dev/sdc1 >> devid1 size 2.73TB used 2.73TB path /dev/sdb1 >> >> Now, I wanted to convert it to raid0, because I lack space and >> redundancy is not important for the Videos and the Backup, but this >> fails: >> ~/btrfs/btrfs-progs/btrfs fi balance start -dconvert=raid0 /mnt/BTRFS/ >> ERROR: error during balancing '/mnt/BTRFS/' - Inappropriate ioctl for device > >/mnt/BTRFS isn't a btrfs subvol, according to what you have listed > above. It's a subdirectory in /mnt which is contains two subdirs > (Video and rsnapshot) which are used as mountpoints for subvolumes. > >Try running the above command with /mnt/BTRFS/Video instead (or > rsnapshot -- it doesn't matter which). > >> dmesg does not help here. >> >> Anyway: This gave me some time to think about this. In fact, as soon >> as raid5 is stable, I want to have all three as a raid5. Will this >> be possible with a balance command? 
If so: will this be possible as >> soon as raid5 is stable, or will I have to wait longer? > >Yes, it's possible to convert to RAID-5 right now -- although the > code's not settled down into its final form quite yet. Note that > RAID-5 over two devices won't give you any space benefits over RAID-1 > over two devices. (Or any reliability benefits either). > >Hugo. > > -- > === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === > PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk >--- "Are you the man who rules the Universe?" "Well, I --- > try not to." Raid5 currently is only suitable for testing: it's known and expected to break on power cuts, for instance. The parity logging stuff is waiting on the skip-list implementation you may have read about on lwn, otherwise the performance overhead wasn't acceptable or some such.
Re: Recommended settings for SSD
On Sun, May 26, 2013 at 9:16 AM, Harald Glatt wrote: > I don't know a better way to check than doing df -h before and > after... If you use space_cache you have to clear_cache though to make > the numbers be current for sure each time before looking at df. Not sure what you're thinking of; space_cache is just a mount-time optimization, storing and loading a memory structure to disk so that it doesn't have to be regenerated. As I understand it, if it's ever wrong, it's a serious bug.
Re: Recommended settings for SSD
> At the moment I am using: > defaults,noatime,nodiratime,ssd,subvol=@home No need to specify ssd, it's automatically detected.
Re: hard freezes with 3.9.0 during io-intensive loads
On Sun, May 5, 2013 at 10:10 AM, Kai Krakow wrote: > Hello list, > > Kai Krakow schrieb: > >> I've upgraded to 3.9.0 mainly for the snapshot-aware defragging patches. >> I'm running bedup[1] on a regular basis and it is now the third time that >> I got back to my PC just to find it hard-frozen and I needed to use the >> reset button. >> >> It looks like this happens only while running bedup on my two btrfs >> filesystems but I'm not sure if it happens for any of the filesystems or >> only one. This is my setup: >> >> # cat /etc/fstab (shortened) >> UUID=d2bb232a-2e8f-4951-8bcc-97e237f1b536 / btrfs >> compress=lzo,subvol=root64 0 1 # /dev/sd{a,b,c}3 >> LABEL=usb-backup /mnt/private/usb-backup btrfs noauto,compress- >> force=zlib,subvolid=0,autodefrag,comment=systemd.automount 0 0 # external >> usb3 disk >> >> # btrfs filesystem show >> Label: 'usb-backup' uuid: 7038c8fa-4293-49e9-b493-a9c46e5663ca >> Total devices 1 FS bytes used 1.13TB >> devid1 size 1.82TB used 1.75TB path /dev/sdd1 >> >> Label: 'system' uuid: d2bb232a-2e8f-4951-8bcc-97e237f1b536 >> Total devices 3 FS bytes used 914.43GB >> devid3 size 927.26GB used 426.03GB path /dev/sdc3 >> devid2 size 927.26GB used 426.03GB path /dev/sdb3 >> devid1 size 927.26GB used 427.07GB path /dev/sda3 >> >> Btrfs v0.20-rc1 >> >> Since the system hard-freezes I have no messages from dmesg. But I suspect >> it to be related to the defragmentation option in bedup (I've switched to >> bedub with --defrag since 3.9.0, and autodefrag for the backup drive). >> Just in case, I'm going to try without this option now and see if it won't >> freeze. >> >> I was able to take a "physical" screenshot with a real camera of a kernel >> backtrace one time when the freeze happened. I wonder if it is useful to >> you and where to send it. I just don't want to upload jpegs right here to >> the list without asking first. 
>> >> The big plus is: Altough I had to hard-reset the frozen system several >> times now, btrfs survived the procedure without any impact (just boot >> times increases noticeably, probably due to log-replays or something). So >> thumbs up for the developers on that point. > > Thanks to the great cwillu netcat service here's my backtrace: > > 4,1072,17508258745,-;[ cut here ] > 2,1073,17508258772,-;kernel BUG at fs/btrfs/ctree.c:1144! > 4,1074,17508258791,-;invalid opcode: [#1] SMP > 4,1075,17508258811,-;Modules linked in: bnep bluetooth af_packet vmci(O) > vmmon(O) vmblock(O) vmnet(O) vsock reiserfs snd_usb_audio snd_usbmidi_lib > snd_rawmidi snd_seq_device gspca_sonixj gpio_ich gspca_main videodev > coretemp hwmon kvm_intel kvm crc32_pclmul crc32c_intel 8250 serial_core > lpc_ich microcode mfd_core i2c_i801 pcspkr evdev usb_storage zram(C) unix > 4,1076,17508258966,-;CPU 0 > 4,1077,17508258977,-;Pid: 7212, comm: btrfs-endio-wri Tainted: G C O > 3.9.0-gentoo #2 To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Pro3 > 4,1078,17508259023,-;RIP: 0010:[] [] > __tree_mod_log_rewind+0x4c/0x121 > 4,1079,17508259064,-;RSP: 0018:8801966718e8 EFLAGS: 00010293 > 4,1080,17508259085,-;RAX: 0003 RBX: 8801ee8d33b0 RCX: > 880196671888 > 4,1081,17508259112,-;RDX: 0a4596a4 RSI: 0eee RDI: > 8804087be700 > 4,1082,17508259138,-;RBP: 0071 R08: 1000 R09: > 880196671898 > 4,1083,17508259165,-;R10: R11: R12: > 880406c2e000 > 4,1084,17508259191,-;R13: 8a11 R14: 8803b5aa1200 R15: > 0001 > 4,1085,17508259218,-;FS: () GS:88041f20() > knlGS: > 4,1086,17508259248,-;CS: 0010 DS: ES: CR0: 80050033 > 4,1087,17508259270,-;CR2: 026f0390 CR3: 01a0b000 CR4: > 000407f0 > 4,1088,17508259297,-;DR0: DR1: DR2: > > 4,1089,17508259323,-;DR3: DR6: 0ff0 DR7: > 0400 > 4,1090,17508259350,-;Process btrfs-endio-wri (pid: 7212, threadinfo > 88019667, task 8801b82e5400) > 4,1091,17508259383,-;Stack: > 4,1092,17508259391,-; 8801ee8d38f0 880021b6f360 88013a5b2000 > 8a11 > 4,1093,17508259423,-; 8802d0a14000 81167606 
0246 > 8801ee8d33b0 > 4,1094,17508259455,-; 880406c2e000 8801966719bf 880021b6f360 > > 4,1095,17508259
Re: Panic while running defrag
On Mon, Apr 29, 2013 at 9:20 PM, Stephen Weinberg wrote: > I ran into a panic while running find -xdev | xargs brtfs fi defrag '{}'. I > don't remember the exact command because the history was not saved. I also > started and stopped it a few times however. > > The kernel logs were on a different filesystem. Here is the > kern.log:http://fpaste.org/9383/36729191/ Apr 28 15:24:05 hotel kernel: [614592.785065] [ cut here ] Apr 28 15:24:05 hotel kernel: [614592.785146] WARNING: at /build/buildd-linux_3.8.5-1~experimental.1-amd64-_t_ZfP/linux-3.8.5/fs/btrfs/locking.c:46 btrfs_set_lock_blocking_rw+0x6f/0xe7 [btrfs]() Apr 28 15:24:05 hotel kernel: [614592.785152] Hardware name: BK169AAR-ABA HPE-210f Apr 28 15:24:05 hotel kernel: [614592.785157] Modules linked in: nls_utf8 nls_cp437 vfat fat cbc ecb parport_pc ppdev lp parport bnep rfcomm bluetooth binfmt_misc nfsd auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc loop ecryptfs arc4 snd_hda_codec_hdmi snd_hda_codec_realtek ath9k ath9k_common ath9k_hw ath mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm acpi_cpufreq snd_page_alloc radeon mperf ttm snd_seq cfg80211 kvm_amd kvm sp5100_tco snd_seq_device snd_timer rfkill drm_kms_helper drm snd edac_mce_amd edac_core k10temp i2c_piix4 i2c_algo_bit soundcore microcode i2c_core button processor psmouse evdev serio_raw thermal_sys pcspkr ext4 crc16 jbd2 mbcache btrfs zlib_deflate crc32c libcrc32c hid_generic usbhid hid usb_storage sg sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ehci_pci ohci_hcd ehci_hcd ahci libahci libata usbcore r8169 mii scsi_mod usb_common Apr 28 15:24:05 hotel kernel: [614592.785279] Pid: 24757, comm: btrfs Not tainted 3.8-trunk-amd64 #1 Debian 3.8.5-1~experimental.1 Apr 28 15:24:05 hotel kernel: [614592.785284] Call Trace: Apr 28 15:24:05 hotel kernel: [614592.785299] [] ? warn_slowpath_common+0x76/0x8a Apr 28 15:24:05 hotel kernel: [614592.785350] [] ? 
btrfs_set_lock_blocking_rw+0x6f/0xe7 [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785386] [] ? btrfs_realloc_node+0xef/0x380 [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785434] [] ? btrfs_defrag_leaves+0x242/0x304 [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785479] [] ? btrfs_defrag_root+0x4f/0x9e [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785526] [] ? btrfs_ioctl_defrag+0xb2/0x194 [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785574] [] ? btrfs_ioctl+0x771/0x175a [btrfs] Apr 28 15:24:05 hotel kernel: [614592.785584] [] ? handle_mm_fault+0x1eb/0x239 Apr 28 15:24:05 hotel kernel: [614592.785594] [] ? __do_page_fault+0x2d7/0x375 Apr 28 15:24:05 hotel kernel: [614592.785605] [] ? vfs_ioctl+0x1e/0x31 Apr 28 15:24:05 hotel kernel: [614592.785613] [] ? do_vfs_ioctl+0x3ee/0x430 Apr 28 15:24:05 hotel kernel: [614592.785624] [] ? kmem_cache_free+0x44/0x80 Apr 28 15:24:05 hotel kernel: [614592.785632] [] ? sys_ioctl+0x4d/0x7c Apr 28 15:24:05 hotel kernel: [614592.785642] [] ? system_call_fastpath+0x16/0x1b Apr 28 15:24:05 hotel kernel: [614592.785648] ---[ end trace 2150df5c163b6833 ]--- Apr 28 15:24:05 hotel kernel: [614592.785693] [ cut here ] Apr 28 15:24:05 hotel kernel: [614592.785813] kernel BUG at /build/buildd-linux_3.8.5-1~experimental.1-amd64-_t_ZfP/linux-3.8.5/fs/btrfs/locking.c:265! 
Apr 28 15:24:05 hotel kernel: [614592.786054] invalid opcode: [#1] SMP Apr 28 15:24:05 hotel kernel: [614592.786158] Modules linked in: nls_utf8 nls_cp437 vfat fat cbc ecb parport_pc ppdev lp parport bnep rfcomm bluetooth binfmt_misc nfsd auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc loop ecryptfs arc4 snd_hda_codec_hdmi snd_hda_codec_realtek ath9k ath9k_common ath9k_hw ath mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm acpi_cpufreq snd_page_alloc radeon mperf ttm snd_seq cfg80211 kvm_amd kvm sp5100_tco snd_seq_device snd_timer rfkill drm_kms_helper drm snd edac_mce_amd edac_core k10temp i2c_piix4 i2c_algo_bit soundcore microcode i2c_core button processor psmouse evdev serio_raw thermal_sys pcspkr ext4 crc16 jbd2 mbcache btrfs zlib_deflate crc32c libcrc32c hid_generic usbhid hid usb_storage sg sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ehci_pci ohci_hcd ehci_hcd ahci libahci libata usbcore r8169 mii scsi_mod usb_common Apr 28 15:24:05 hotel kernel: [614592.788282] CPU 2 Apr 28 15:24:05 hotel kernel: [614592.788337] Pid: 24757, comm: btrfs Tainted: GW3.8-trunk-amd64 #1 Debian 3.8.5-1~experimental.1 HP-Pavilion BK169AAR-ABA HPE-210f/ALOE Apr 28 15:24:05 hotel kernel: [614592.788634] RIP: 0010:[] [] btrfs_assert_tree_locked+0x7/0xa [btrfs] Apr 28 15:24:05 hotel kernel: [614592.788899] RSP: 0018:88021b017c20 EFLAGS: 00010246 Apr 28 15:24:05 hotel kernel: [614592.789023] RAX: RBX: 88003396f310 RCX: 05ad05ad Apr 28 15:24:05 hotel kernel: [614592.789186] RDX: fa56 RSI: 0046 RDI: 88003396f310 Apr 28 15:24:05 hotel kernel: [614592.789349] RBP: 0
Re: Btrfs performance problem; metadata size to blame?
[how'd that send button get there] space_cache is the default, set by mkfs, for a year or so now. It's sticky, so even if it wasn't, you'd only need to mount with it once.
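The "sticky" behaviour can be demonstrated from a root shell; this is a hedged sketch, and /dev/sdXN and /mnt are placeholders rather than paths from the thread:

```shell
# Mount once with space_cache; the free-space cache is then written to disk
# and used on every later mount without repeating the option.
mount -o space_cache /dev/sdXN /mnt
umount /mnt
mount /dev/sdXN /mnt    # dmesg should again report "disk space caching is enabled"
```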
Re: Btrfs performance problem; metadata size to blame?
On Sun, Apr 28, 2013 at 2:17 PM, Roger Binns wrote: > On 28/04/13 12:57, Harald Glatt wrote: >> If you want better answers ... > > There is a lot of good information at the wiki and it does see regular > updates. For example the performance mount options are on this page: > > https://btrfs.wiki.kernel.org/index.php/Mount_options > > Roger
Re: scrub "correcting" tons of errors ?
> Actually instead of netconsole we have an awesome service provided by Carey, > you > can just do > > nc cwillu.com 10101 < /dev/kmsg ... at a root prompt. > after you've run sysrq+w and then reply with the URL it spits out. Thanks,
Re: minimum kernel version for btrfsprogs.0.20?
On Thu, Mar 28, 2013 at 11:41 PM, Chris Murphy wrote: > Creating a btrfs file system using > btrfs-progs-0.20.rc1.20130308git704a08c-1.fc19, and either kernel > 3.6.10-4.fc18 or 3.9.0-0.rc3.git0.3.fc19, makes a file system that cannot be > mounted by kernel 3.6.10-4.fc18. It can be mounted by kernel 3.8.4. I haven't > tested any other 3.8, or any 3.7 kernels. > > Is this expected? > > dmesg reports: > [ 300.014764] btrfs: disk space caching is enabled > [ 300.024137] BTRFS: couldn't mount because of unsupported optional features > (40). > [ 300.034148] btrfs: open_ctree failed commit 1a72afaa "btrfs-progs: mkfs support for extended inode refs" unconditionally enables extended irefs (which permits more than 4k links to the same inode). It's the right default imo, but there probably should have been a mkfs option to disable it.
Re: question about replacing a drive in raid10
On Thu, Mar 28, 2013 at 1:54 AM, Joeri Vanthienen wrote: > Hi all, > > I have a question about replacing a drive in raid10 (and linux kernel 3.8.4). > A bad disk was physically removed from the server. After this a new disk > was added with "btrfs device add /dev/sdg /btrfs" to the raid10 btrfs > FS. > After this the server was rebooted and I mounted the filesystem in > degraded mode. It seems that a previously started balance continued. > > At this point I want to remove the missing device from the pool (btrfs > device delete missing /btrfs). Is this safe to do? Yep. > The disk usage numbers look weird to me, also the limited amount of > data written to the new disk after the balance. You're not actually looking at the data on the disk, but the size of the block groups allocated on that disk. I expect the data got spread across all of the remaining disks, including the new one. Probably worth running a scrub anyway though.
Re: No space left on device (28)
On Fri, Mar 22, 2013 at 12:39 AM, Stefan Priebe - Profihost AG wrote: > Already tried with value 5 did not help ;-( and it also happens with plain cp > copying a 15gb file and aborts at about 80% You tried -musage=5? Your original email said -dusage=5.
Re: No space left on device (28)
On Fri, Mar 22, 2013 at 12:13 AM, Roman Mamedov wrote: > On Thu, 21 Mar 2013 20:42:28 +0100 > Stefan Priebe wrote: > > I might be wrong here, but doesn't this > >> rsync: rename >> "/mnt/.software/kernel/linux-3.9-rc3/drivers/infiniband/hw/amso1100/"" >> -> >> ".software/kernel/linux-3.9-rc3/drivers/infiniband/hw/amso1100/c2_ae.h": > > ...try to move a file from > > "/mnt/.software/" > > to > > ".software/" > > (relative to current dir)?? No; that's rsync giving the full path, and then the target path relative to the command it was given. The filename itself (".c2_ae.h.WEhLGP") is a semi-random filename rsync uses to write to temporarily, so it can mv it over the original in an atomic fashion... Stefan: ...which means that the actual copy succeeded, which suggests that this is more of a metadata enospc thing. You might try btrfs balance start -musage=5 (instead of -dusage), and if that doesn't report any chunks balanced, try a higher number until it does.
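The "try a higher number until it does" suggestion can be scripted as an escalating loop; a rough sketch, assuming root, a mountpoint of /mnt (not a path from the thread), and a btrfs-progs new enough to support balance filters:

```shell
# Raise the metadata usage filter until balance actually relocates a chunk.
# Balance prints "Done, had to relocate N out of M chunks" when it finishes.
for pct in 5 10 25 50 75; do
    out=$(btrfs balance start -musage=$pct /mnt)
    echo "-musage=$pct: $out"
    # Stop as soon as at least one chunk was relocated.
    echo "$out" | grep -q "relocate 0 " || break
done
```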
Re: How to recover uncorrectable errors ?
>> # rm -rf * >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle >> ... > > You are trying to remove the files from an NFS client. Stale NFS file > handle just means that the NFS handle is no longer valid. NFS clients refer to files by a file handle composed of filesystem id and > inode number. Maybe a change in there? > > Anyway, to find the real error message it's necessary to try to delete > the files on the server. Because even if there is a real BTRFS issue, the > NFS client likely won't report helpful error messages. Don't read too much into that "Stale NFS file handle" message; ESTALE doesn't imply anything about NFS being involved, despite the standard error string for that value.
Re: Btrfs in multiple different disc -d sigle -m raid1 one drive failure...
On Mon, Mar 18, 2013 at 12:32 PM, Jan Beranek wrote: > Hi all, > I'm preparing a storage pool for large data with quite low importance > - there will be at least 3 hdd in "-d single" and "-m raid1" > configuration. > > mkfs.btrfs -d single -m raid1 /dev/sda /dev/sdb /dev/sdc > > What happens if one hdd fails? Do I lose everything from all three > discs or only data from one disc? (if from only one disc, then it is > acceptable, otherwise not...) I just finished doing some testing to check: It will work, kinda sorta. You'll be forced to mount read-only, and any reads of file extents that existed on the missing disk will return an io error. As I understand it, single doesn't force files to be on a single disk; instead it _doesn't_ force them to be on _several_ disks, the implication being that a large file (say, a 4gb movie) may still end up with pieces on each disk.
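This kind of failure test can be reproduced safely with loop devices instead of real disks; a rough sketch only (requires root; the image file names and /mnt are arbitrary, and /some/test/data is a hypothetical source directory):

```shell
# Build a 3-device "-d single -m raid1" filesystem on sparse files,
# then remove one device to observe the degraded, read-only behaviour.
truncate -s 4G disk0.img disk1.img disk2.img
losetup -f disk0.img; losetup -f disk1.img; losetup -f disk2.img
mkfs.btrfs -d single -m raid1 /dev/loop0 /dev/loop1 /dev/loop2
mount /dev/loop0 /mnt && cp -r /some/test/data /mnt && umount /mnt
losetup -d /dev/loop2                # simulate the failed disk
mount -o degraded /dev/loop0 /mnt    # reads of extents that lived on loop2 return EIO
```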
Re: multiple btrfsck runs
On Sat, Mar 16, 2013 at 6:44 AM, Marc MERLIN wrote: > On Sat, Mar 16, 2013 at 06:24:47AM -0600, cwillu wrote: >> On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker wrote: >> > Is it expected that running btrfsck more than once will keep reporting >> > errors? >> >> Without options, btrfsck does not write to the disk. > > Ah, that explains why I never got it to work the day I wanted to try > it. > I should note that what you're saying is neither documented in the man > page, nor in https://btrfs.wiki.kernel.org/index.php/Btrfsck > > For that matter, the wiki actually states there are no options. > > Is that mostly intentional so that whoever isn't reading the source > doesn't really run the tool because it's not ready? At least a bit, yeah. People have been getting better about updating the documentation recently though.
Re: multiple btrfsck runs
On Sat, Mar 16, 2013 at 6:46 AM, Russell Coker wrote: > On Sat, 16 Mar 2013, cwillu wrote: >> On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker wrote: >> > Is it expected that running btrfsck more than once will keep reporting >> > errors? >> >> Without options, btrfsck does not write to the disk. > > The man page for the version in Debian doesn't document any options. > > The source indicates that --repair might be the one that's desired, is that > correct? Yes. However, unless something is actually broken, or you've been advised by a developer, I'd stick with btrfs scrub.
Re: multiple btrfsck runs
On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker wrote: > Is it expected that running btrfsck more than once will keep reporting errors? Without options, btrfsck does not write to the disk.
Re: Drive low space / huge performance hit.
On Thu, Mar 7, 2013 at 11:05 AM, Steve Heyns wrote: > hi > > I am using lzo compression on my 350GB partition; I have 2 subvolumes > on this partition. My kernel is 3.7, BTRFS v0.19. > > According to my system (df -h) that partition has 75Gb available. > According to btrfs > > btrfs fi df /mnt/DevSystem/ > Data: total=260.01GB, used=259.09GB > System, DUP: total=8.00MB, used=36.00KB > System: total=4.00MB, used=0.00 > Metadata, DUP: total=8.00GB, used=7.87GB > Metadata: total=8.00MB, used=0.00 Please post the output of btrfs fi show /dev/whatever
Re: Does defragmenting even work
On Thu, Feb 28, 2013 at 8:35 AM, Swâmi Petaramesh wrote: > BTW... > > I'm not even sure that "btrfs filesystem defrag " actually > does anything... > > If I run "filefrag " afterwards, it typically shows the same > number of fragments that it did prior to running defrag... > > I'm not sure about how it actually works and what I should expect... Can't explain something if I can't see the data I'm explaining :p
Re: copy on write misconception
On Fri, Feb 22, 2013 at 11:41 AM, Mike Power wrote: > On 02/22/2013 09:16 AM, Hugo Mills wrote: >> >> On Fri, Feb 22, 2013 at 09:11:28AM -0800, Mike Power wrote: >>> >>> I think I have a misconception of what copy on write in btrfs means >>> for individual files. >>> >>> I had originally thought that I could create a large file: >>> time dd if=/dev/zero of=10G bs=1G count=10 >>> 10+0 records in >>> 10+0 records out >>> 10737418240 bytes (11 GB) copied, 100.071 s, 107 MB/s >>> >>> real1m41.082s >>> user0m0.000s >>> sys0m7.792s >>> >>> Then if I copied this file no blocks would be copied until they are >>> written. Hence the two files would use the same blocks underneath. >>> But specifically that copy would be fast. Since it would only need >>> to write some metadata. But when I copy the file: >>> time cp 10G 10G2 >>> >>> real3m38.790s >>> user0m0.124s >>> sys0m10.709s >>> >>> Oddly enough it actually takes longer then the initial file >>> creation. So I am guessing that the long duration copy of the file >>> is expected and that is not one of the virtues of btrfs copy on >>> write. Does that sound right? >> >> You probably want cp --reflink=always, which makes a CoW copy of >> the file's metadata only. The resulting files have the semantics of >> two different files, but share their blocks until a part of one of >> them is modified (at which point, the modified blocks are no longer >> shared). >> >> Hugo. >> > I see, and it works great: > time cp --reflink=always 10G 10G3 > > real0m0.028s > user0m0.000s > sys0m0.000s > > So from the user perspective I might say I want to opt out of this feature > not optin. I want all copies by all applications done as a copy on write. > But if my understanding is correct that is up to the application being > called (in this case cp) and how it in turns makes calls to the system. > > In short I can't remount the btrfs filesystem with some new args that says > always copy on write files because that is what it already. 
There's no "copy a file" syscall; when a program copies a file, it opens a new file, and writes all the bytes from the old to the new. Converting this to a reflink would require btrfs to implement full de-dup (which is rather expensive), and still wouldn't prevent the program from reading and writing all 10gb (and so wouldn't be any faster). You can set an alias in your shell to make cp --reflink=auto the default, but that won't affect other programs, nor other users.
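The behaviour is easy to try from the command line; cp --reflink=auto attempts the CoW clone and silently falls back to an ordinary byte copy on filesystems without reflink support, so it is safe to use anywhere:

```shell
# Clone a file: on btrfs this shares blocks (near-instant even for a 10GB
# file); on other filesystems it degrades to a normal copy.
printf 'some file data\n' > original.txt
cp --reflink=auto original.txt clone.txt
cmp original.txt clone.txt && echo "contents identical"
```

Making this the default for interactive shells is just `alias cp='cp --reflink=auto'` in ~/.bashrc, with the caveat noted above: the alias only affects that shell, not other programs or users.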
Re: copy on write misconception
> Then if I copied this file no blocks would be copied until they are written. > Hence the two files would use the same blocks underneath. But specifically > that copy would be fast. Since it would only need to write some metadata. > But when I copy the file: > time cp 10G 10G2 cp without arguments still does a regular copy; btrfs does nothing to de-duplicate writes. "cp --reflink 10G 10G2" will give you the results you expect.
Re: Fwd: Current State of BTRFS
On Fri, Feb 8, 2013 at 4:56 PM, Florian Hofmann wrote: > Oh ... I should have mentioned that btrfs is running on top of LUKS. > > 2013/2/8 Florian Hofmann : >> $ btrfs fi df / >> Data: total=165.00GB, used=164.19GB >> System, DUP: total=32.00MB, used=28.00KB >> System: total=4.00MB, used=0.00 >> Metadata, DUP: total=2.00GB, used=1.40GB >> >> $ btrfs fi show >> failed to read /dev/sr0 >> Label: none uuid: b4ec0b14-2a42-47e3-a0cd-1257e789ed25 >> Total devices 1 FS bytes used 165.59GB >> devid 1 size 600.35GB used 169.07GB path /dev/dm-0 >> >> Btrfs Btrfs v0.19 >> >> --- >> >> I just noticed that I can force 'it' by transferring a large file from >> my NAS. I did the sysrq-trigger thing, but there is no suspicious >> output in dmesg (http://pastebin.com/swrCdC3U). >> >> Anything else? The pastebin didn't include any output from sysrq-w; even if there's nothing to report there would still be a dozen lines or so per cpu; at the absolute minimum there should be a line for each time you ran it: [4477369.680307] SysRq : Show Blocked State Note that you need to echo as root, or use the keyboard combo alt-sysrq-w to trigger.
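For reference, the two ways of triggering the blocked-task dump mentioned here; both need root, and the output lands in dmesg:

```shell
# Ask the kernel to dump blocked (uninterruptible) tasks,
# equivalent to pressing alt-sysrq-w on the console.
echo w > /proc/sysrq-trigger
dmesg | grep -A 20 'Show Blocked State'
# If nothing appears, sysrq may be disabled: echo 1 > /proc/sys/kernel/sysrq
```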
Re: System unmountable RW
> then I do: mount -o rw,remount /backup/ > > Feb 1 22:32:38 frozen kernel: [ 65.780686] btrfs: force zlib compression > Feb 1 22:32:38 frozen kernel: [ 65.780700] btrfs: not using ssd allocation > scheme > Feb 1 22:32:38 frozen kernel: [ 65.780706] btrfs: disk space caching is > enabled > > > I let that mount run for days, without any success. It stays running, and I can't > interrupt it (CTRL+C or kill). Hit alt-sysrq-w at that point, and then post your dmesg; there should be at least one stacktrace in there (possibly many), which should give a good idea where it's hanging up.
Re: [RFC] Abort on memory allocation failure
On Fri, Jan 25, 2013 at 4:55 PM, Ian Kumlien wrote: > Hi, > > Could someone do a sanity check of this? I have removed some of the > checking code that is no longer needed, but I would prefer to have > reviewers. I haven't looked much at the code, mainly been focusing on > the grunt work ;) > > Anyway, thanks for looking at it! Include patches inline in your email rather than as an attachment.
Re: [PATCH] Btrfs-progs: Exit if not running as root
On Fri, Jan 25, 2013 at 9:04 AM, Gene Czarcinski wrote: > OK, I think I have gotten the message that this is a bad idea as implemented > and that it should be dropped as such. I believe that there are some things > ("btrfs fi show" comes to mind) which will need root and I am going to > explore doing something for that case. And it also might be reasonable for > some situations to issue the message about root if something errors-out. Eh? That's one of the clearest cases where you _may not_ need root. cwillu@cwillu-home:~$ groups cwillu adm dialout cdrom audio video plugdev mlocate lpadmin admin sambashare cwillu@cwillu-home:~$ btrfs fi show /dev/sda3 failed to read /dev/sda failed to read /dev/sda1 failed to read /dev/sda2 failed to read /dev/sda3 failed to read /dev/sdb Btrfs v0.19-152-g1957076 cwillu@cwillu-home:~$ sudo addgroup cwillu disk cwillu@cwillu-home:~$ su cwillu cwillu@cwillu-home:~$ groups cwillu adm disk dialout cdrom audio video plugdev mlocate lpadmin admin sambashare cwillu@cwillu-home:~$ btrfs fi show /dev/sda3 Label: none uuid: ede59711-6230-474f-992d-f1e3deeddab7 Total devices 1 FS bytes used 72.12GB devid 1 size 104.34GB used 104.34GB path /dev/sda3 Btrfs v0.19-152-g1957076
Re: scrub questtion
On Tue, Jan 15, 2013 at 8:21 AM, Gene Czarcinski wrote: > When you start btrfs scrub and point at one subvolume, what is "scrubbed"? > > Just that subvolume or the entire volume? The entire volume.
Re: obscure out of space, df and fi df are way off
>>> [root@localhost tmp]# df >>> Filesystem 1K-blocks Used Available Use% Mounted on >>> /dev/sda3 3746816 3193172 1564 100% /mnt/sysimage >>> /dev/sda1 495844 31509 438735 7% >>> /mnt/sysimage/boot >>> /dev/sda3 3746816 3193172 1564 100% >>> /mnt/sysimage/home >>> >>> So there's 1.5M of free space left according to conventional df. However: >>> >>> [root@localhost tmp]# btrfs fi show >>> Label: 'fedora_f18v' uuid: 0c9b2b62-5ec1-4610-ab2f-9f00c909428a >>> Total devices 1 FS bytes used 2.87GB >>> devid 1 size 3.57GB used 3.57GB path /dev/sda3 >>> >>> [root@localhost tmp]# btrfs fi df /mnt/sysimage >>> Data: total=2.69GB, used=2.69GB >>> System, DUP: total=8.00MB, used=4.00KB >>> System: total=4.00MB, used=0.00 >>> Metadata, DUP: total=438.94MB, used=183.36MB >>> Metadata: total=8.00MB, used=0.00 > So if I assume 2.7GiB for data, and add up the left side of fi df I get > 3224MB rounded up, which is neither 3.57GB nor 3.57GiB. I'm missing 346MB at > least. That is what I should have said from the outset. 2.69 + (438.94 / 1000 * 2) + (8.0 / 1000 / 1000 * 2) + (4.0 / 1000 / 1000) + (8.0 / 1000 / 1000 * 2) = 3.567916 Looks like 3.57GB to me :p > So is the Metadata DUP Total 438.94MB allocated value actually twice that, but > only 438.94MB is displayed because that's what's available (since the > metadata is duplicated)? The capacity of the metadata group is 438.94MB; the actual size on disk is twice that. >> Note that the -M option to mkfs.btrfs is intended for this use-case: >> filesystems where the size of a block allocation is large compared to >> the size of the filesystem. It should let you squeeze out most of the >> rest of that 400MB (200MB, DUP). > > Is there a simple rule of thumb an installer could use to know when to use > -M? I know mkfs.btrfs will do this for smaller filesystems than this.
I'm > thinking this is a testing edge case that a desktop installer shouldn't be > concerned about, but rather should just gracefully fail from, or better yet, > insist on a larger install destination than this, in particular when using Btrfs. I tend to go with "any filesystem smaller than 32GB", but a more accurate rule is probably along the lines of "any filesystem that you expect to normally run within half a GB of full".
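The sum quoted above can be checked mechanically; the terms mirror the post's arithmetic exactly (data in GB, the rest in MB, DUP groups counted twice on disk; the post also doubles the 8MB single metadata group, which is negligible):

```shell
# Re-add the allocations reported by btrfs fi df and compare with the
# 3.57GB that btrfs fi show reports for the device.
awk 'BEGIN {
    total  = 2.69                   # Data
    total += 2 * 438.94 / 1000      # Metadata, DUP (twice on disk)
    total += 2 * 8.0 / 1000000      # System, DUP (twice on disk)
    total += 4.0 / 1000000          # System, single
    total += 2 * 8.0 / 1000000      # Metadata, single (doubled as in the post)
    printf "%.2f\n", total          # prints 3.57
}'
```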
Re: obscure out of space, df and fi df are way off
On Fri, Jan 11, 2013 at 11:50 PM, Chris Murphy wrote: > Very low priority. > No user data at risk. > 8GB virtual disk being installed to, and the installer is puking. I'm trying > to figure out why. > > I first get an rsync error 12, followed by the installer crashing. What's > interesting is this, deleting irrelevant source file systems, just showing > the mounts for the installed system: > > [root@localhost tmp]# df > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/sda3 3746816 3193172 1564 100% /mnt/sysimage > /dev/sda1 495844 31509 438735 7% /mnt/sysimage/boot > /dev/sda3 3746816 3193172 1564 100% /mnt/sysimage/home > > So there's 1.5M of free space left according to conventional df. However: > > [root@localhost tmp]# btrfs fi show > Label: 'fedora_f18v' uuid: 0c9b2b62-5ec1-4610-ab2f-9f00c909428a > Total devices 1 FS bytes used 2.87GB > devid 1 size 3.57GB used 3.57GB path /dev/sda3 > > [root@localhost tmp]# btrfs fi df /mnt/sysimage > Data: total=2.69GB, used=2.69GB > System, DUP: total=8.00MB, used=4.00KB > System: total=4.00MB, used=0.00 > Metadata, DUP: total=438.94MB, used=183.36MB > Metadata: total=8.00MB, used=0.00 > > And absolutely nothing in dmesg. > > This is confusing. fi show says 3.57GB available and used. Whereas fi df says > 2.69 available and used. So is it 3.57GB? Or is it 2.69? I suppose the simple > answer is, it doesn't matter, in either case it's full. But it seems like the > installer is underestimating Btrfs requirements and should be more > conservative, somehow so I'd like to better understand the allocation. The reporting is correct, but a bit obscure. We have a FAQ item on how to read the output on the wiki, but it's a known sore spot.
btrfs fi show reports 3.57GB allocated to block groups (so everything is assigned to metadata or data); btrfs fi df reports how that 3.57GB is being used: of 2.69GB allocated to data block groups, 2.69GB (i.e., all of it) is in use by file data; of 438.94MB of metadata (or 0.87GB after DUP), 183.36MB is in use by metadata (which may include small files that have been inlined). In other words, the tools are saying that the filesystem is basically full. :) Note that the -M option to mkfs.btrfs is intended for this use-case: filesystems where the size of a block allocation is large compared to the size of the filesystem. It should let you squeeze out most of the rest of that 400MB (200MB, DUP).
Re: Option LABEL
On Thu, Jan 3, 2013 at 11:57 AM, Helmut Hullen wrote: > But other filesystems don't put the label onto more than 1 device. > There's the problem for/with btrfs. Other filesystems don't exist on more than one device, so of course they don't put a label on more than one device.
Re: parent transid verify failed on -- After moving btrfs closer to the beginning of drive with dd
On Sat, Dec 29, 2012 at 7:14 AM, Jordan Windsor wrote: > Also here's the output of btrfs-find-root: > > ./btrfs-find-root /dev/sdb1 > Super think's the tree root is at 1229060866048, chunk root 1259695439872 > Went past the fs size, exiting > > Not sure where to go from here. I can't say for certain, but that suggests that the move-via-dd didn't succeed / wasn't correct, and/or the partitioning changes didn't match, and/or the dd happened from a mounted filesystem (which would also explain the transid errors, if there wasn't an unclean umount involved). btrfs-restore might be able to pick out files, but you may be in restore-from-backup territory.
Re: parent transid verify failed on -- After moving btrfs closer to the beginning of drive with dd
On Fri, Dec 28, 2012 at 12:09 PM, Jordan Windsor wrote: > Hello, > I moved my btrfs to the beginning of my drive & updated the partition > table & also restarted. I'm currently unable to mount it; here's the > output in dmesg. > > [ 481.513432] device label Storage devid 1 transid 116023 /dev/sdb1 > [ 481.514277] btrfs: disk space caching is enabled > [ 481.522611] parent transid verify failed on 1229060423680 wanted > 116023 found 116027 > [ 481.522789] parent transid verify failed on 1229060423680 wanted > 116023 found 116027 > [ 481.522790] btrfs: failed to read tree root on sdb1 > [ 481.523656] btrfs: open_ctree failed > > What command should I run from here? The filesystem wasn't cleanly unmounted, likely on an older kernel. Try mounting with -o recovery
Re: HIT WARN_ON WARNING: at fs/btrfs/extent-tree.c:6339 btrfs_alloc_free_block+0x126/0x330 [btrfs]()
On Wed, Dec 19, 2012 at 9:12 AM, Rock Lee wrote: > Hi all, > > Has anyone met this problem before? When doing the tests, I hit > the WARN_ON. Does this log make sense, or has someone already fixed the problem? > If needed, I can supply the detailed log and the testcase source file. That'd be good, as well as the specific kernel version.
Re: btrfs subvolume snapshot performance problem
On Tue, Dec 18, 2012 at 7:06 AM, Sylvain Alain wrote: > So, if I don't use the discard command, how often do I need to run the > fstrim command ? If your ssd isn't a pile of crap, never. SSD's are always over-provisioned, and so every time an erase block fills up, the drive knows that there must be one erase-block worth of garbage which could be compacted, erased, and added to the pool of empty blocks. The crappiest ones only do this as needed (which is why their write speed plummets with use), and really benefit from people forcing the issue with -o discard or occasional fstrim. Everything else should get along fine without it, although an occasional fstrim certainly won't hurt: it just shouldn't help much. > I found this thread : https://patrick-nagel.net/blog/archives/337 It's worth noting that there's a large number of very effective tricks that an ssd can perform to almost completely negate the caveat mentioned there. It really is a solved problem in a modern ssd.
Re: unmountable partition and live distro (no space left)
Try booting with bootflags=ro,recovery in grub (with the latest possible kernel), or mounting with -o recovery from the livecd (likewise). If it works, then you're done, you should be able to boot normally after a clean umount and shutdown. If it doesn't, post dmesg from the attempt. > I've been told this is missing relevant details. > The original kernel version was 3.2.0-something (standard Ubuntu 12.04 LTS). > I've since upgraded to 3.7 but this has made no difference. > Right now I don't have the dmesg, I'll post it later. > > Currently I've been able to mount the partition with btrfs-restore and am > trying to rsync it on another ext4 volume. Terminology note: btrfs-restore doesn't "mount" anything, it just copies files directly from a device.
Re: Intel 120G SSD write performance with 3.2.0-4-amd64
On Sat, Dec 15, 2012 at 5:23 PM, Russell Coker wrote: > I've got a system running Debian kernel 3.2.0-4-amd64 with root on a SSD that > identifies itself as "INTEL SSDSC2CT12 300i" (it's an Intel 120G device). 3.2 is massively old in btrfs terms, with lots of fun little stability and performance bugs. > Here is the /proc/mounts entry which shows that ssd and discard options are > enabled. > > /dev/disk/by-uuid/7939c405-c656-4e85-a6a0-29f17be09585 / btrfs > rw,seclabel,nodev,noatime,ssd,discard,space_cache 0 0 Don't use discard; it's a non-queuing command, which means your performance will suck unless your device is _really_ terrible at garbage collection (in which case, it's just the lesser of two evils).
Re: Encryption
On Wed, Dec 12, 2012 at 2:06 PM, wrote: > On Wed, Dec 12, 2012, at 10:48, cwillu wrote: >> Sayeth the FAQ: > > Oh pardon me, it's BTRFS RAID that's a no-go, which is just as critical > to me as I have a 4 disk 8TB array. > The FAQ goeth on to Say: > --- > This pretty much forbids you to use btrfs' cool RAID features if you > need encryption. Using a RAID implementation on top of several encrypted > disks is much slower than using encryption on top of a RAID device. So > the RAID implementation must be on a lower layer than the encryption, > which is not possible using btrfs' RAID support. > --- > > You saw that I need RAID above. Were you just trying to criticize my > memory of the FAQ cwillu? It's not asking for trouble, it's just asking for poor performance, and I suspect even that will depend greatly on the workload. Snapshots still have nothing to do with it: you could have btrfs (with snapshots) on dm-crypt on mdraid. Btrfs would just lose the ability to try alternate mirrors and similar; snapshots would still work just fine.
Re: Encryption
On Wed, Dec 12, 2012 at 12:38 PM, wrote: > > On Wed, Dec 12, 2012, at 10:31, Mitch Harder wrote: >> I run btrfs on top of LUKS encryption on my laptop. You should be able to >> do the same. >> >> You could then run rsync through ssh. However, rsync will have no knowledge >> of any blocks shared under subvolume snapshots. >> >> Btrfs does not yet have internal encryption. > The FAQ says specifically to NOT run BTRFS with any kind of volume > encryption, so you're asking for trouble. Sayeth the FAQ: Does Btrfs work on top of dm-crypt? This is deemed safe since 3.2 kernels. Corruption has been reported before that, so you want a recent kernel. The reason was improper passing of device barriers that are a requirement of the filesystem to guarantee consistency. > And clearly encryption is not possible if you need snapshots. Snapshots don't come into this at all: btrfs doesn't care where the block devices it's on come from. Things like dm-crypt show btrfs (or whatever filesystem you put on it) a decrypted view of the device.
Re: Can't mount luks partition after reboot
On Mon, Dec 3, 2012 at 7:22 PM, Travis LaDuke wrote: > This is kind of silly, but may be salvageable... > I made a btrfs on top of luks partition and tried it for a couple days. Then > I made another luks partition on another drive then added and balanced that > new drive as btrfs raid1. A lot of time passed and the balance finished. > > Then I rebooted. The original partition will luksOpen, but btrfs won't mount > it. The 2nd one is in worse shape, it won't even luksOpen. > I haven't tried btrfsck yet. Is there something else I should try first? > What debug info can I post? > > halp "Help" ;p Try mounting the original with -o degraded, and post the dmesg of the attempt if it doesn't work.
Re: High-sensitivity fs checker (not repairer) for btrfs
On Sat, Nov 10, 2012 at 4:32 PM, Bob Marley wrote: > On 11/10/12 22:23, Hugo Mills wrote: >> >> The closest thing is btrfsck. That's about as picky as we've got to >> date. >> >> What exactly is your use-case for this requirement? > > > We need a decently-available system. We can rollback filesystem to > last-known-good if the "test" detects an inconsistency on current btrfs > filesystem, but we need a very good test for that (i.e. if last-known-good > is actually bad we get into serious troubles). Scrub is probably more useful as a check, combined with "does the filesystem actually mount". > So do you think btrfsck can return a false "OK" result? can it "not-see" an > inconsistency? No set of checks will ever be perfect, so yes.
Re: (late) REQUEST: Default mkfs.btrfs block size
On Mon, Nov 5, 2012 at 10:06 AM, David Sterba wrote: > On Wed, Oct 31, 2012 at 12:20:39PM +, Alex wrote: >> As one 'stuck' with 4k leaves on my main machine for the moment, can I >> request >> the btrfs-progs v0.20 defaults to more efficient decent block sizes before >> release. Most distro install programs for the moment don't give access to the >> options at install time and there seems to be is a significant advantage to >> 16k >> or 32k > > IMHO this should be fixed inside the installer, changing defaults for a > core utility will affect everybody. 4k is the most tested option and > thus can be considered "safe for everybody". > > The installer may let you to enter a shell and create the filesystem by > hand, then point it to use it for installation. If we know a better setting, we should default to it. Punting the decision to the distro just means I'll spend the next 3 years telling people "yeah, distro X doesn't set it to the recommended setting (which isn't the mkfs default), and there's no way to change it without wiping and reinstalling using manual partitioning blah blah blah."
Re: [PATCH][BTRFS-PROGS] Enhance btrfs fi df
> do you have more information about raid ? When it will land on the btrfs > earth ? :-) An unnamed source recently said "today I'm fixing parity rebuild in the middle of a read/modify/write. its one of my last blockers", at which point several gags about progress meters were made.
Re: What's the minimum size I can shrink my FS to?
Run "btrfs balance start -musage=1 -dusage=1 /home/jordan/Storage", and then try it again. This may require updated btrfs tools, however. On Fri, Nov 2, 2012 at 10:09 PM, Jordan Windsor wrote: > Hello, > I'm trying to shrink my Btrfs filesystem to the smallest size it can > go, here's the information: > > failed to read /dev/sr0 > Label: 'Storage' uuid: 717d4a43-38b3-495f-841b-d223068584de > Total devices 1 FS bytes used 491.86GB > devid1 size 612.04GB used 605.98GB path /dev/sda6 > > Btrfs Btrfs v0.19 > > Data: total=580.90GB, used=490.88GB > System, DUP: total=32.00MB, used=76.00KB > System: total=4.00MB, used=0.00 > Metadata, DUP: total=12.51GB, used=1001.61MB > > Here's the command I use to resize: > > [root@archpc ~]# btrfs file res 500g /home/jordan/Storage/ > Resize '/home/jordan/Storage/' of '500g' > ERROR: unable to resize '/home/jordan/Storage/' - No space left on device > > I was wondering if that size doesn't work then what's the minimum I > can shrink to? > > Thanks.
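As a sanity check on whether 500g is even reachable, here is a back-of-envelope floor computed from the quoted df output (a sketch; the figures are copied from the report above, and DUP profiles occupy twice their "used" figure on disk — a real resize also needs slack for chunk allocation):

```python
# Figures taken verbatim from the quoted "btrfs fi df" output above.
data_used_gb = 490.88            # Data, single profile
metadata_used_gb = 1001.61 / 1024  # Metadata used, in GB
system_used_gb = 76.0 / 1024 / 1024  # System used (76 KB), in GB

# DUP profiles store two copies, so they count twice toward the floor.
floor_gb = data_used_gb + 2 * metadata_used_gb + 2 * system_used_gb
print(round(floor_gb, 2))  # 492.84 -> 500g is feasible once balance compacts the chunks
```

The resize failed not because the data wouldn't fit, but because 605.98GB of the device was still allocated to (mostly empty) chunks; the usage-filtered balance above frees those.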
Re: [Request for review] [RFC] Add label support for snapshots and subvols
> Below is a demo of this new feature.
>
> btrfs fi label -t /btrfs/sv1 "Prod-DB"
>
> btrfs fi label -t /btrfs/sv1
> Prod-DB
>
> btrfs su snap /btrfs/sv1 /btrfs/snap1-sv1
> Create a snapshot of '/btrfs/sv1' in '/btrfs/snap1-sv1'
> btrfs fi label -t /btrfs/snap1-sv1
>
> btrfs fi label -t /btrfs/snap1-sv1 "Prod-DB-sand-box-testing"
>
> btrfs fi label -t /btrfs/snap1-sv1
> Prod-DB-sand-box-testing

Why is this better than:

# btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
# mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
# ls /btrfs/
Prod-DB  Prod-DB-production-test
Re: Why btrfs inline small file by default?
On Wed, Oct 31, 2012 at 4:48 AM, Ahmet Inan wrote: >>> i also dont see any benefit from inlining small files: > >>> with defaults (inlining small files): >>> real 4m39.253s >>> Data: total=10.01GB, used=9.08GB >>> Metadata, DUP: total=2.00GB, used=992.48MB > >>> without inline: >>> real 4m42.085s >>> Data: total=11.01GB, used=10.85GB >>> Metadata, DUP: total=1.00GB, used=518.59MB >> >> I suggest you take a closer look at your numbers. > > both use 12GiB in total and both need 280 seconds. > am i missing something? 9.08GB + 992.48MB*2 == 11.02GB 10.85GB + 518MB*2 == 11.86GB That's nearly a GB smaller.
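For anyone re-checking the arithmetic, here is the same calculation spelled out (numbers taken from the quoted df output; DUP metadata occupies twice its "used" figure on disk):

```python
# On-disk footprint = data used + two copies of DUP metadata used.
def footprint_gb(data_gb, metadata_mb):
    return data_gb + 2 * metadata_mb / 1024

inline = footprint_gb(9.08, 992.48)      # defaults (small files inlined)
no_inline = footprint_gb(10.85, 518.59)  # max_inline=0

print(round(inline, 2), round(no_inline, 2))  # 11.02 11.86
print(round(no_inline - inline, 2))           # 0.84 -> inlining saves ~0.84 GB here
```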
Re: Why btrfs inline small file by default?
On Wed, Oct 31, 2012 at 2:48 AM, Ahmet Inan wrote: > i also dont see any benefit from inlining small files: > with defaults (inlining small files): > real 4m39.253s > Data: total=10.01GB, used=9.08GB > Metadata, DUP: total=2.00GB, used=992.48MB > without inline: > real 4m42.085s > Data: total=11.01GB, used=10.85GB > Metadata, DUP: total=1.00GB, used=518.59MB I suggest you take a closer look at your numbers.
Re: Why btrfs inline small file by default?
On Tue, Oct 30, 2012 at 5:47 PM, ching wrote:
> On 10/31/2012 06:19 AM, Hugo Mills wrote:
>> On Tue, Oct 30, 2012 at 10:14:12PM +, Hugo Mills wrote:
>>> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
>>>> On 10/30/2012 08:17 PM, cwillu wrote:
>>>>>>> If there is a lot of small files, then the size of metadata will be undesirable due to deduplication
>>>>>> Yes, that is a fact, but if that really matters depends on the use-case (e.g., the small files to large files ratio, ...). But as btrfs is designed explicitly as a general purpose file system, you usually want the good performance instead of the better disk-usage (especially as disk space isn't expensive anymore).
>>>>> As I understand it, in basically all cases the total storage used by inlining will be _smaller_, as the allocation doesn't need to be aligned to the sector size.
>>>> if i have 10G small files in total, then it will consume 20G by default.
>>> If those small files are each 128 bytes in size, then you have approximately 80 million of them, and they'd take up 80 million pages, or 320 GiB of total disk space.
>> Sorry, to make that clear -- I meant if they were stored in Data. If they're inlined in metadata, then they'll take approximately 20 GiB as you claim, which is a lot less than the 320 GiB they'd be if they're not.
>>
>> Hugo.
>
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k

import os
import sys

data = "1" * 1024 * 3
for x in xrange(100 * 1000):
    with open('%s/%s' % (sys.argv[1], x), 'a') as f:
        f.write(data)

root@repository:~$ mount -o loop ~/inline /mnt
root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2

root@repository:~$ time python test.py /mnt
real    0m11.105s
user    0m1.328s
sys     0m5.416s

root@repository:~$ time python test.py /mnt2
real    0m21.905s
user    0m1.292s
sys     0m5.460s

root@repository:/$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=652.70MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.12MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=60.98MB
Metadata: total=8.00MB, used=0.00

3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.

root@repository:~$ mkfs.btrfs inline -l 64k
root@repository:~$ mkfs.btrfs noninline -l 64k
...

root@repository:~$ time python test.py /mnt
real    0m12.244s
user    0m1.396s
sys     0m8.101s

root@repository:~$ time python test.py /mnt2
real    0m13.047s
user    0m1.436s
sys     0m7.772s

root@repository:/$ btr\fs fi df /mnt
Data: total=8.00MB, used=256.00KB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=342.06MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btr\fs fi df /mnt2
Data: total=1.01GB, used=391.10MB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=50.06MB
Metadata: total=8.00MB, used=0.00

3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller

data = "1" * 1024 * 32
... (mkfs, mount, etc)

root@repository:~$ time python test.py /mnt
real    0m17.834s
user    0m1.224s
sys     0m4.772s

root@repository:~$ time python test.py /mnt2
real    0m20.521s
user    0m1.304s
sys     0m6.344s

root@repository:/$ btrfs fi df /mnt
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=54.00MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=53.56MB
Metadata: total=8.00MB, used=0.00

32k data, 64k leaf: inline is still 10% faster, and is now the same size (not dead sure why, probably some interaction with the size of the actual write that happens)

data = "1" * 1024 * 7
... etc

root@repository:~$ time python test.py /mnt
real    0m9.628s
user    0m1.368s
sys     0m4.188s

root@repository:~$ time python test.py /mnt2
real    0m13.455s
user    0m1.608s
sys     0m7.884s

root@repository:/$ btrfs fi df /mnt
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74
Re: Why btrfs inline small file by default?
On Tue, Oct 30, 2012 at 3:40 PM, ching wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
>>>> If there is a lot of small files, then the size of metadata will be undesirable due to deduplication
>>>
>>> Yes, that is a fact, but if that really matters depends on the use-case (e.g., the small files to large files ratio, ...). But as btrfs is designed explicitly as a general purpose file system, you usually want the good performance instead of the better disk-usage (especially as disk space isn't expensive anymore).
>> As I understand it, in basically all cases the total storage used by inlining will be _smaller_, as the allocation doesn't need to be aligned to the sector size.
>
> if i have 10G small files in total, then it will consume 20G by default.
>
> ching

No. No they will not. As I already explained.

root@repository:/mnt$ mount ~/inline /mnt -o loop
root@repository:/mnt$ mount ~/inline /mnt2 -o loop,max_inline=0
root@repository:/mnt$ mount
/dev/loop0 on /mnt type btrfs (rw)
/dev/loop1 on /mnt2 type btrfs (rw,max_inline=0)

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt/$x; done
real    1m5.447s
user    0m38.422s
sys     0m18.493s

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt2/$x; done
real    1m49.880s
user    0m40.379s
sys     0m26.210s

root@repository:/mnt$ df /mnt /mnt2
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/loop0      10485760   266952   8359680   4% /mnt
/dev/loop1      10485760  1311620   7384236  16% /mnt2

root@repository:/mnt$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=130.22MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi df /mnt2
Data: total=2.01GB, used=953.05MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=164.03MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi show
Label: none  uuid: e5440337-9f44-4b2d-9889-80ab0ab8f245
        Total devices 1 FS bytes used 130.47MB
        devid    1 size 10.00GB used 3.04GB path /dev/loop0

Label: none  uuid: cfcc4149-3102-465d-89b8-0a6bb6a4749a
        Total devices 1 FS bytes used 1.09GB
        devid    1 size 10.00GB used 4.04GB path /dev/loop1

Btrfs Btrfs v0.19

Any questions?
Re: Why btrfs inline small file by default?
>> If there is a lot of small files, then the size of metadata will be >> undesirable due to deduplication > > > Yes, that is a fact, but if that really matters depends on the use-case > (e.g., the small files to large files ratio, ...). But as btrfs is designed > explicitly as a general purpose file system, you usually want the good > performance instead of the better disk-usage (especially as disk space isn't > expensive anymore). As I understand it, in basically all cases the total storage used by inlining will be _smaller_, as the allocation doesn't need to be aligned to the sector size.
Re: btrfs defrag problem
On Tue, Oct 30, 2012 at 5:47 AM, ching wrote: > Hi all, > > I try to defrag my btrfs root partition (run by root privilege) > > find / -type f -o -type d -print0 | xargs --null --no-run-if-empty btrfs > filesystem defragment -t $((32*1024*1024)) > > > 1. This kind of error message is prompted: > > failed to open /bin/bash > open:: Text file busy > total 1 failures > failed to open /lib64/ld-2.15.so > open:: Text file busy > total 1 failures > failed to open /sbin/agetty > open:: Text file busy > failed to open /sbin/btrfs > open:: Text file busy > failed to open /sbin/dhclient > open:: Text file busy > failed to open /sbin/init > open:: Text file busy > failed to open /sbin/udevd > > It seems that locked files cannot be defragged, is it expected behaviour? I can't reproduce that behaviour here, although maybe you're running an older kernel with some bug that's since been fixed? > 2. Btrfs Wiki mentions that defrag directory will defrag metadata, is > symlink/hardlink considered as metadata? > > P.S. inline data is already disabled by "max_inline=0" Well, that's a silly thing to do, causing every small file to take up a separate 4kb block rather than its size * 2, and requiring extra seeks to read/write them (i.e., if you have a million 10 byte files, they'll now take up 4GB instead of 20MB). > 3. Is it possible to online defrag a btrfs partition without being hindered by > mount point/polyinstantiated directories? If you're asking if you can defrag an unmounted btrfs, not at this time. It's possible in principle, nobody has cared enough to implement it yet.
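The 4GB-vs-20MB figure can be reproduced with some quick arithmetic (a rough sketch; it ignores per-item metadata overhead, and "size * 2" reflects the two copies kept by DUP metadata):

```python
# A million 10-byte files: max_inline=0 pins each to a 4 KiB data block,
# while inlining stores the 10 bytes twice in DUP metadata.
n_files, file_size, block = 1_000_000, 10, 4096

no_inline_gib = n_files * block / 2**30          # one 4 KiB block per file
inline_mib = n_files * file_size * 2 / 2**20     # two inlined copies per file

print(round(no_inline_gib, 2), "GiB vs", round(inline_mib, 1), "MiB")  # 3.81 GiB vs 19.1 MiB
```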
Re: Naming of subvolumes
On Fri, Oct 26, 2012 at 9:54 AM, Chris Murphy wrote: > > On Oct 26, 2012, at 2:27 AM, Richard Hughes wrote: > > >>> And if you're going to apply the upgrade to the snapshot, or to the top >>> level file system? >> >> That's a very good question. I was going to apply the upgrade to the >> top level file system, and then roll-back to the old snapshot if the >> new upgrade state does not boot to a GUI. It would work equally well >> applying the update to the snapshot and then switching to that, >> although I don't know how well userspace is going to cope switching to >> a new subvolume at early-runtime given the fact we're running the >> upgrade itself from the filesystem and not in a special initrd or >> something. Also, we need to ready out lots of upgrade data (200Mb+?) >> from somewhere, so doing this on the snapshot would mean creating the >> snapshot and copying the data there *before* we reboot to install >> upgrades, and we really want to create the snapshot when we're sure >> the system will boot. > > a. Upgrade top level: > Maybe download upgrade data to /tmp, upon completion snapshot the current > system state, then copy the upgrade data to persistent storage and reboot; > upgrade of top level boot root begins. The snapshot is the regressive state > in case of failure. > > b. Upgrade snapshot: > Create snapshot, mount it somewhere; download upgrade data to a location > within the snapshot; reboot from the snapshot, it upgrades itself and cleans > up. The top level is the regressive state in case of failure. > > Either way, 200MB of downloaded (and undeleted) upgrade data isn't stuck in a > snapshot after it's used. And either way the snapshot is bootable. > > If you get sufficient metadata in the snapshot, then you can name/rename the > snapshots whatever you want. I'd also point out it's valid for the user to > prefer a different organization, i.e. 
instead of Fedora taking over the top > level of a btrfs volume, to create subvolumes Fedora 17, Fedora 18, Ubuntu > 12X, etc., at the top level, and insert boot and root and possibly home in > those. In which case the upgrade mechanism should still work. > >> >>> So I'm going to guess that you will actually create a subvolume named >>> something like @system-upgrade-20121025, and then snapshot root, boot, and >>> home into that subvol? >> >> Not /home. Packages shouldn't be installing stuff there anyway, and >> /home is going to typically much bigger than /root or /boot. > > OK so small problem here is that today /etc/fstab is pointing to the home > subvolume in a relative location to the default subvolume. The fstab mount > option is subvol=home, not subvol=/home, not subvolid=xxx. > > So if you want to use changing default subvolumes to make the switch between > the current updated state, and rollback states, (which is milliseconds fast), > which also means no changes needed to grub's core.img or grub.cfg (since > those use relative references for boot and root), a change is needed for home > to use an absolute reference: either subvol=/home or use subvolid= in the > fstab. > > While a bit more obscure in the /etc/fstab the subvolid= is more reliable. > That home can be renamed or moved anywhere and it'll still work. I think it's > legitimate for a user to create or want > > >>> If the upgrade is not successful, you change the default subvolume ID to >>> that of @system-upgrade-20121025. >> >> I was actually thinking of using btrfs send | btrfs receive to roll >> back the root manually. It would be better if btrfs could swap the >> subvolume ID's of @system-upgrade-20121025 and 0, as then we don't get >> a snapshot that's useless. > > I haven't tried btrfs send/receive for this purpose, so I can't compare. But > btrfs subvolume set-default is faster than the release of my finger from the > return key. 
And it's easy enough the user could do it themselves if they had > reasons for regression to a snapshot that differ than the automagic > determination of the upgrade pass/fail. > > The one needed change, however, is to get /etc/fstab to use an absolute > reference for home. > > > Chris Murphy I'd argue that everything should be absolute references to subvolumes (/@home, /@, etc), and neither set-default nor subvolume IDs should be touched. There's no need, as you can simply mv those around (even while mounted). More importantly, it doesn't result in a case where the fstab in one snapshot points its mountpoint to a different snapshot, with all the hilarity that would cause over time, and also allows multiple distros to be installed on the same filesystem without having them stomp on each other's set-defaults: /@fedora, /@rawhide, /@ubuntu, /@home, etc.
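A minimal sketch of the fstab layout cwillu describes (the UUID placeholder and subvolume names are illustrative, not taken from the thread):

```
# /etc/fstab -- absolute subvolume references; set-default never touched
UUID=<fs-uuid>  /      btrfs  subvol=/@fedora  0 0
UUID=<fs-uuid>  /home  btrfs  subvol=/@home    0 0
```

A rollback is then just renaming subvolumes at the top level, e.g. mv /@fedora /@fedora-broken followed by mv /@fedora-20121025 /@fedora, which works even while the subvolumes are mounted; every snapshot's fstab keeps pointing at the same names.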
Re: [RFC] New attempt to a better "btrfs fi df"
On Thu, Oct 25, 2012 at 8:33 PM, Chris Murphy wrote: > So what's the intended distinction between 'fi df' and 'fi show'? Because for > months using btrfs I'd constantly be confused which command was going to show > me what information I wanted, and that tells me there should be some better > distinction between the commands. Or the distinction should be removed, which is what this patch effectively does.
Re: [RFC] New attempt to a better "btrfs fi df"
On Thu, Oct 25, 2012 at 2:36 PM, Chris Murphy wrote: > My suggestion is that by default a summary similar to the existing df command > be mimicked, where it makes sense, for btrfs fi df. > > - I like the Capacity %. If there is a reliable equivalent, it need not be > inode based, that would be great. > > - I care far less about the actual physical device information, more about > the btrfs volume(s) as a whole. How big is the volume, how much of that is > used, and how much is available? > > I understand the challenges behind estimating the amount available. So if the > value for available/free is precided by a ~ in each case, or as a heading > "~Avail" or "~Free" I'd be OK with that disclaimer. > > I think the examples so far are reporting too much information and it's > difficult to get just what I want. Plain old "/bin/df" is adequate for that though, and in the mean time one _does_ need _all_ of that information to work with the filesystem. However, the detailed breakdown is vital to answer many questions: "Why can't I write to my filesystem with 80gb free? Oh, because metadata is raid1 and the one disk is 80gb smaller than the other." "How much data is on this disk that started giving SMART errors?" "How many GB of vm image files (or other large files) can I probably fit on this fs?" "How many GB of mail (or other tiny files) can I probably fit on this fs?" "Is there enough space to remove this disk from the fs, and how much free space will I have then?" And the all-important "Could you please run btrfs fi df and pastebin the output so we can tell what the hell is going on?" :)
Re: [RFC] New attempt to a better "btrfs fi df"
On Thu, Oct 25, 2012 at 2:03 PM, Chris Murphy wrote:
>
> On Oct 25, 2012, at 1:21 PM, Goffredo Baroncelli wrote:
>>
>> Moreover I still didn't understand how btrfs was using the disks.
>
> This comment has less to do with the RFC, and more about user confusion in a
> specific case of the existing fi df behavior. But since I have the same
> misunderstanding of how btrfs is using the disks, I decided to reply to this
> thread.
>
> While working with Fedora 18's new System Storage Manager [1], I came across
> this problem. For reference, the bug report [2] seems less of a bug with
> ssm than a peculiarity of btrfs chunk allocation and how fi df reports usage.
>
> 80GB VDI, VirtualBox VM, containing Fedora 18: installed and yum updated 2-3
> times. That's it, yet for some reason 76GB of chunks have been allocated
> and they're all full? This doesn't make sense when there's just under 4GB of
> data on this single device.
>
> [root@f18v ~]# btrfs fi show
> Label: 'fedora' uuid: 780b8553-4097-4136-92a4-c6fd48779b0c
> Total devices 1 FS bytes used 3.93GB
> devid 1 size 76.06GB used 76.06GB path /dev/sda1
>
> [root@f18v ~]# btrfs fi df /
> Data: total=72.03GB, used=3.67GB
> System, DUP: total=8.00MB, used=16.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=2.00GB, used=257.54MB
> Metadata: total=8.00MB, used=0.00
>
> I decided to rebalance, and while things became a lot more sensible, I'm
> still confused:
>
> [chris@f18v ~]$ sudo btrfs fi show
> failed to read /dev/sr0
> Label: 'fedora' uuid: 780b8553-4097-4136-92a4-c6fd48779b0c
> Total devices 1 FS bytes used 3.91GB
> devid 1 size 76.06GB used 9.13GB path /dev/sda1
>
> [chris@f18v ~]$ sudo btrfs fi df /
> Data: total=5.00GB, used=3.66GB
> System, DUP: total=64.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=2.00GB, used=256.84MB
>
> Points of confusion:
>
> 1. Why is FS bytes used = 3.91GB, yet devid 1 used is 9.13GB?

"FS bytes used" is what du -sh would show. "devid 1 used" is space allocated to some block group (without that block group itself necessarily being entirely used).

> 2. Why before a rebalance does 'fi df' show extra lines, and then after
> rebalance there are fewer lines? Another case with raid10, 'fi df' shows six
> lines of data, but then after rebalance it shows three lines?

A bug in mkfs causes some tiny block groups with the wrong profile to be created; as they're unused, they get cleaned up by the balance.

> 3. How does Data: total=72GB before rebalance become 5GB after rebalance?
> This was a brand new file system, freshly installed, with maybe 2-3
> updates, and a dozen or two reboots. That's it. No VMs were created on that
> volume (it's a VDI itself), and the VDI file itself never grew beyond 9GB.

Combine the previous two answers: you had 72GB allocated to block groups which are mostly empty. After the balance, the contents of those groups have been shuffled around such that most of them could be freed.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
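The gap between the two numbers above can be quantified directly from the pre-balance outputs; a sketch of the arithmetic (figures copied from the report above, plain awk so it runs anywhere):

```shell
# "devid 1 used" from 'btrfs fi show' counts space allocated to block
# groups; "Data ... used" from 'btrfs fi df' counts actual file data.
# The difference is roughly what a balance can hand back as unallocated.
alloc_gb=76.06     # devid 1 used (chunk allocation)
data_used_gb=3.67  # Data used

awk -v a="$alloc_gb" -v u="$data_used_gb" \
    'BEGIN { printf "allocated but unused: %.2f GB\n", a - u }'
# → allocated but unused: 72.39 GB
```

On later kernels a cheaper remedy than a full rebalance is a filtered balance, e.g. `btrfs balance start -dusage=5 /` (the usage filter needs a reasonably recent kernel and progs; the 5% threshold is just an illustrative choice).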
Re: [RFC] New attempt to a better "btrfs fi df"
>>> Allocated_area:
>>> Data,RAID0: Size:921.75MB, Used:256.00KB
>>> /dev/vdc 307.25MB
>>> /dev/vdb 307.25MB
>>> /dev/vdd 307.25MB
>>>
>>> Data,Single: Size:8.00MB, Used:0.00
>>> /dev/vdb 8.00MB
>>>
>>> System,RAID1: Size:8.00MB, Used:4.00KB
>>> /dev/vdd 8.00MB
>>> /dev/vdc 8.00MB
>>>
>>> System,Single: Size:4.00MB, Used:0.00
>>> /dev/vdb 4.00MB
>>>
>>> Metadata,RAID1: Size:460.94MB, Used:24.00KB
>>> /dev/vdb 460.94MB
>>> /dev/vdd 460.94MB
>>>
>>> Metadata,Single: Size:8.00MB, Used:0.00
>>> /dev/vdb 8.00MB
>>>
>>> Unused:
>>> /dev/vdb 2.23GB
>>> /dev/vdc 2.69GB
>>> /dev/vdd 2.24GB
>>
>> Couple of minor things, in order of personal opinion of severity:
>>
>> * Devices should be listed in a consistent order; device names are
>> just too consistently similar
>
> Could you elaborate? I didn't understand well.

Thereby demonstrating the problem :)

Data,RAID0: Size:921.75MB, Used:256.00KB
/dev/vdc 307.25MB
/dev/vdb 307.25MB
/dev/vdd 307.25MB

Unused:
/dev/vdb 2.23GB
/dev/vdc 2.69GB
/dev/vdd 2.24GB

The first goes c, b, d; the second goes b, c, d.
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
On Thu, Oct 25, 2012 at 1:58 PM, Marc MERLIN wrote:
> Howdy,
>
> I can wait a day or maybe 2 before I have to wipe and restore from backup.
> Please let me know if you have a patch against 3.6.3 you'd like me to try
> to mount/recover this filesystem, or whether you'd like me to try btrfsck.
>
> My laptop had a problem with its boot drive which prevented linux
> from writing to it, and in turn caused btrfs to have incomplete writes
> to it.
> After reboot, the boot drive was fine, but the btrfs filesystem has
> a corruption that prevents it from being mounted.
>
> Unfortunately the mount crash prevents writing of crash data to even another
> drive, since linux stops before the crash data can be written to syslog.
>
> Picture #1 shows a dump from when my laptop crashed (before reboot):
> btrfs no csum found for inode X start Y
> http://marc.merlins.org/tmp/crash.jpg
>
> Mounting with 3.5.0 and 3.6.3 gives the same error:
>
> gandalfthegreat:~# mount -o recovery,skip_balance,ro /dev/mapper/bootdsk
>
> shows:
> btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> (there are 2 lines, not sure why)
>
> kernel BUG at fs/btrfs/volumes.c:3707
> int btrfs_num_copies(struct btrfs_mapping_tree *map_tree, u64 logical, u64 len)
> {
>         struct extent_map *em;
>         struct map_lookup *map;
>         struct extent_map_tree *em_tree = &map_tree->map_tree;
>         int ret;
>
>         read_lock(&em_tree->lock);
>         em = lookup_extent_mapping(em_tree, logical, len);
>         read_unlock(&em_tree->lock);
>         BUG_ON(!em); <---
>
> If the snapshot helps (sorry, hard to read, but usable):
> http://marc.merlins.org/tmp/btrfs_bug.jpg
>
> Questions:
> 1) Any better way to get a proper dump without a serial console?
> (I hate to give you pictures)
>
> 2) Should I try btrfsck now, or are there other mount options than
> mount -o recovery,skip_balance,ro /dev/mapper/bootdsk
> I should try?
>
> 3) Want me to try btrfsck, although it may make it impossible for me to
> reproduce the bug and test a fix, as well as potentially break the filesystem
> more? (Last time I tried btrfsck, it output thousands of lines and never
> converged to a state it was happy with.)

This looks like something btrfs-zero-log would work around (although -o recovery should do mostly the same things). That would destroy the evidence though, and may just make things (slightly) worse, so I'd wait to see if anyone suggests something better before trying it. If you're ultimately going to end up restoring from backup anyway, it may at least save you that effort.
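For reference, the work-around mentioned above looks roughly like this. It is destructive (it throws away the tree log), so image the device first; the backup destination path here is hypothetical:

```shell
# Preserve the evidence so the bug can still be reproduced afterwards:
dd if=/dev/mapper/bootdsk of=/backup/bootdsk.img bs=1M

# Discard the (possibly corrupt) tree log, losing the last few seconds
# of writes, then retry a read-only recovery mount:
btrfs-zero-log /dev/mapper/bootdsk
mount -o recovery,ro /dev/mapper/bootdsk /mnt
```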
Re: [RFC] New attempt to a better "btrfs fi df"
> I didn't publish the patches because they aren't in good shape. However I
> really like the output. The example is a filesystem based on three
> disks of 3GB.
>
> It is clear that:
> - RAID0 uses all the disks
> - RAID1 uses two different disks
>
> Comments are welcome.
>
> Known bugs:
> - if a filesystem uses a disk but there isn't any chunk on it, the disk is
> not shown (solvable)
> - this command needs root capability (I use BTRFS_IOC_TREE_SEARCH
> to get the chunk info, so that is unavoidable)
>
> ghigo@emulato:~$ sudo ./btrfs fi df /mnt/btrfs1/
> [sudo] password for ghigo:
> Path: /mnt/btrfs1
> Summary:
> Disk_size: 9.00GB
> Disk_allocated: 1.83GB
> Disk_unallocated: 7.17GB
> Used: 284.00KB
> Free_(Estimated): 6.76GB (Max: 8.54GB, min: 4.96GB)
> Data_to_disk_ratio: 75 %
>
> Allocated_area:
> Data,RAID0: Size:921.75MB, Used:256.00KB
> /dev/vdc 307.25MB
> /dev/vdb 307.25MB
> /dev/vdd 307.25MB
>
> Data,Single: Size:8.00MB, Used:0.00
> /dev/vdb 8.00MB
>
> System,RAID1: Size:8.00MB, Used:4.00KB
> /dev/vdd 8.00MB
> /dev/vdc 8.00MB
>
> System,Single: Size:4.00MB, Used:0.00
> /dev/vdb 4.00MB
>
> Metadata,RAID1: Size:460.94MB, Used:24.00KB
> /dev/vdb 460.94MB
> /dev/vdd 460.94MB
>
> Metadata,Single: Size:8.00MB, Used:0.00
> /dev/vdb 8.00MB
>
> Unused:
> /dev/vdb 2.23GB
> /dev/vdc 2.69GB
> /dev/vdd 2.24GB

Couple of minor things, in order of personal opinion of severity:

* Devices should be listed in a consistent order; device names are
just too consistently similar
* System chunks shouldn't be listed between data and metadata; really,
they're just noise 99% of the time anyway
* I think it may be more useful to display each disk, with the
profiles in use underneath. With a larger number of disks, that would
make it _much_ easier to tell at a glance what is currently on a disk
(that I may want to remove, or which I may suspect to be unreliable).
* I'd rename "Unused" to "Unallocated" for consistency with the
section title
* (and I still detest the_underscores_between_all_the_words; it
doesn't make parsing significantly easier, and it's an eyesore)
* Three coats of blue paint plus a clear-coat is the One True Paint-Job.
Re: btrfs seems to do COW while inode has NODATACOW set
On Thu, Oct 25, 2012 at 12:35 PM, Alex Lyakas wrote:
> Hi everybody,
> I need some help understanding the nodatacow behavior.
>
> I have set up a large file (5GiB), which has very few EXTENT_DATAs
> (all are real, not bytenr=0). The file has the NODATASUM and NODATACOW
> flags set (flags=0x3):
> item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
> inode generation 5 transid 5 size 5368709120 nbytes 5368709120
> owner[0:0] mode 100644
> inode blockgroup 0 nlink 1 flags 0x3 seq 0
> item 7 key (257 EXTENT_DATA 131072) itemoff 3469 itemsize 53
> item 8 key (257 EXTENT_DATA 33554432) itemoff 3416 itemsize 53
> item 9 key (257 EXTENT_DATA 67108864) itemoff 3363 itemsize 53
> item 10 key (257 EXTENT_DATA 67112960) itemoff 3310 itemsize 53
> item 11 key (257 EXTENT_DATA 67117056) itemoff 3257 itemsize 53
> item 12 key (257 EXTENT_DATA 67121152) itemoff 3204 itemsize 53
> item 13 key (257 EXTENT_DATA 67125248) itemoff 3151 itemsize 53
> item 14 key (257 EXTENT_DATA 67129344) itemoff 3098 itemsize 53
> item 15 key (257 EXTENT_DATA 67133440) itemoff 3045 itemsize 53
> item 16 key (257 EXTENT_DATA 67137536) itemoff 2992 itemsize 53
> item 17 key (257 EXTENT_DATA 67141632) itemoff 2939 itemsize 53
> item 18 key (257 EXTENT_DATA 67145728) itemoff 2886 itemsize 53
> item 19 key (257 EXTENT_DATA 67149824) itemoff 2833 itemsize 53
> item 20 key (257 EXTENT_DATA 67153920) itemoff 2780 itemsize 53
> item 21 key (257 EXTENT_DATA 67158016) itemoff 2727 itemsize 53
> item 22 key (257 EXTENT_DATA 67162112) itemoff 2674 itemsize 53
> item 23 key (257 EXTENT_DATA 67166208) itemoff 2621 itemsize 53
> item 24 key (257 EXTENT_DATA 67170304) itemoff 2568 itemsize 53
> item 25 key (257 EXTENT_DATA 67174400) itemoff 2515 itemsize 53
> extent data disk byte 67174400 nr 5301534720
> extent data offset 0 nr 5301534720 ram 5301534720
> extent compression 0
> As you can see from the last extent, the file size is exactly 5GiB.
>
> Then I also mount btrfs with the nodatacow option.
>
> root@vc:/btrfs-progs# ./btrfs fi df /mnt/src/
> Data: total=5.47GB, used=5.00GB
> System: total=32.00MB, used=4.00KB
> Metadata: total=512.00MB, used=28.00KB
>
> (I have set up block groups myself by playing with mkfs code and
> conversion code to learn about the extent tree. The filesystem passes
> btrfsck fine, with no errors. All superblock copies are consistent.)
>
> Then I run parallel random IOs on the file, and almost immediately hit
> ENOSPC. When looking at the file, I see that now it has a huge amount
> of EXTENT_DATAs:
> item 4 key (257 INODE_ITEM 0) itemoff 3593 itemsize 160
> inode generation 5 transid 21 size 5368709120 nbytes 5368709120
> owner[0:0] mode 100644
> inode blockgroup 0 nlink 1 flags 0x3 seq 130098
> item 6 key (257 EXTENT_DATA 0) itemoff 3525 itemsize 53
> item 7 key (257 EXTENT_DATA 131072) itemoff 3472 itemsize 53
> item 8 key (257 EXTENT_DATA 262144) itemoff 3419 itemsize 53
> item 9 key (257 EXTENT_DATA 524288) itemoff 3366 itemsize 53
> item 10 key (257 EXTENT_DATA 655360) itemoff 3313 itemsize 53
> item 11 key (257 EXTENT_DATA 1310720) itemoff 3260 itemsize 53
> item 12 key (257 EXTENT_DATA 1441792) itemoff 3207 itemsize 53
> item 13 key (257 EXTENT_DATA 2097152) itemoff 3154 itemsize 53
> item 14 key (257 EXTENT_DATA 2228224) itemoff 3101 itemsize 53
> item 15 key (257 EXTENT_DATA 2752512) itemoff 3048 itemsize 53
> item 16 key (257 EXTENT_DATA 2883584) itemoff 2995 itemsize 53
> item 17 key (257 EXTENT_DATA 11927552) itemoff 2942 itemsize 53
> item 18 key (257 EXTENT_DATA 12058624) itemoff 2889 itemsize 53
> item 19 key (257 EXTENT_DATA 13238272) itemoff 2836 itemsize 53
> item 20 key (257 EXTENT_DATA 13369344) itemoff 2783 itemsize 53
> item 21 key (257 EXTENT_DATA 16646144) itemoff 2730 itemsize 53
> item 22 key (257 EXTENT_DATA 16777216) itemoff 2677 itemsize 53
> item 23 key (257 EXTENT_DATA 17432576) itemoff 2624 itemsize 53
> ...
>
> and:
> root@vc:/btrfs-progs# ./btrfs fi df /mnt/src/
> Data: total=5.47GB, used=5.46GB
> System: total=32.00MB, used=4.00KB
> Metadata: total=512.00MB, used=992.00KB
>
> Kernel is the for-linus branch from Chris's tree, up to
> f46dbe3dee853f8a860f889cb2b7ff4c624f2a7a (the last commit there now).
>
> I was under the impression that if a file is marked NODATACOW, new
> writes will never allocate EXTENT_DATAs if appropriate EXTENT_DATAs
> already exist. However, that is clearly not the case, or maybe I am
> doing something wrong.
>
> Can anybody please help me debug further and understand why this is
> happening?

Have there been any snapshots taken, and/or was the filesystem converted from ext? In those cases, there will be one final copy taken for the write.
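For anyone reproducing this without patching mkfs, the usual way to get a NODATACOW file is to set the flag on an empty file before writing any data (it does not reliably apply to files that already have contents); paths here are illustrative:

```shell
# On a btrfs mount, mark an empty file NOCOW before writing to it;
# lsattr should then show the 'C' attribute.
touch /mnt/btrfs/bigfile
chattr +C /mnt/btrfs/bigfile
lsattr /mnt/btrfs/bigfile

# Preallocate the extents, then run the random-write workload against it.
fallocate -l 5G /mnt/btrfs/bigfile
```

Note that, as the answer above says, a snapshot (or an ext-converted filesystem) still forces one CoW copy per overwritten range, which can produce exactly the extent explosion shown here.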
Re: [PATCH 2/2] Btrfs: do not delete a subvolume which is in a R/O subvolume
On Wed, Oct 24, 2012 at 4:03 AM, Miao Xie wrote:
> On Mon, 22 Oct 2012 05:57:12 -0600, cwillu wrote:
>> On Mon, Oct 22, 2012 at 5:39 AM, Miao Xie wrote:
>>> Step to reproduce:
>>> # mkfs.btrfs
>>> # mount
>>> # btrfs sub create /subv0
>>> # btrfs sub snap /subv0/snap0
>>> # change /subv0 from R/W to R/O
>>> # btrfs sub del /subv0/snap0
>>>
>>> We deleted the snapshot successfully. I think we should not be able to
>>> delete the snapshot since the parent subvolume is R/O.
>>
>> snap0 isn't read-only in that case, right? From a user interaction
>> standpoint, this seems like it just forces a user to rm -rf rather than
>> btrfs sub del, which strikes me as a bit ham-handed when all we really
>> care about is leaving a (the?) directory entry where snap0 used to be.
>
> I don't think we can identify "btrfs sub del" with "rm -rf", because "rm -rf"
> will check the permission of the parent directory of each file/directory which
> is going to be deleted, but "btrfs sub del" doesn't do that; it sees all the
> files/directories in the subvolume as one, so I think it is more like a special
> "rmdir". From this standpoint, deleting a snapshot whose parent subvolume
> is readonly should be forbidden.

Sorry; reading back, I misunderstood you to mean that subv0 was marked as a readonly subvolume, as opposed to marking the mountpoint readonly. The former can't work at all (it would make the pair undeletable, as subv0 can't be deleted while it contains another subvolume). I'm still not sure that the latter is quite right, but I care a lot less, as one could always remount it rw (unlike ro subvolumes, as I understand them).
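The two kinds of "read-only" being distinguished above can be reproduced like so (paths are illustrative; readonly snapshots need a reasonably recent kernel and btrfs-progs):

```shell
# A readonly *subvolume*: the ro flag travels with the subvolume itself.
btrfs subvolume snapshot -r /mnt/subv0 /mnt/subv0-ro

# A readonly *mount*: only this mountpoint is ro, and it can simply be
# remounted rw again, which is why the restriction matters less here.
mount -o remount,ro /mnt
mount -o remount,rw /mnt
```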
Re: [PATCH 2/2] Btrfs: do not delete a subvolume which is in a R/O subvolume
On Mon, Oct 22, 2012 at 5:39 AM, Miao Xie wrote:
> Step to reproduce:
> # mkfs.btrfs
> # mount
> # btrfs sub create /subv0
> # btrfs sub snap /subv0/snap0
> # change /subv0 from R/W to R/O
> # btrfs sub del /subv0/snap0
>
> We deleted the snapshot successfully. I think we should not be able to delete
> the snapshot since the parent subvolume is R/O.

snap0 isn't read-only in that case, right? From a user interaction standpoint, this seems like it just forces a user to rm -rf rather than btrfs sub del, which strikes me as a bit ham-handed when all we really care about is leaving a (the?) directory entry where snap0 used to be.
Re: unrecognized mount option 'compression=lzo' and defragment -c errors
> 1. I also added mount options 'compression=lzo' and 'io_cache' to /home at
> first.

Neither io_cache nor compression=lzo is an option that exists. You probably meant compress=lzo for the first, but I really don't know what you wanted for io_cache (inode_cache? that's not really a performance thing).

You need to check what the actual parameters are before you change things. Making stuff up simply doesn't work.
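For reference, the spellings that do exist look like this (the fstab line is illustrative; check the btrfs wiki or your mount(8) for the option list your kernel actually supports):

```shell
# Valid:  compress=lzo   (not "compression=lzo")
mount -o remount,compress=lzo /home

# Or persistently, in /etc/fstab:
# UUID=...  /home  btrfs  defaults,compress=lzo  0  0

# Compression only applies to newly written data; existing files can be
# rewritten compressed with defragment (per-file here, since the
# recursive flag may not exist in older progs):
btrfs filesystem defragment -c lzo /home/somefile
```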
Re: Weird Warning
On Fri, Oct 19, 2012 at 3:51 PM, Jérôme Poulin wrote:
> After updating to 3.5.5, I get this on boot, and listing some dirs freezes.
> I don't have anything important on that volume, but I'm willing to
> debug the problem if needed. Would I need a more recent kernel?

Probably worth trying 3.7-rc1, or at least cmason's for-linus (which is 3.6.0 + the btrfs changes that went into 3.7).
Re: Weird Warning
On Fri, Oct 19, 2012 at 2:54 PM, Jérôme Poulin wrote:
> I've got this weird WARNING in my system log on a freshly created FS.
> I'm using ACLs with Samba; this is the only difference I could tell
> from any other FSes. It is also using Debian Wheezy's kernel, which is
> quite old. Should I just ignore this or update the btrfs module?

I would strongly recommend updating, even if you hadn't seen any warnings.
Re: initramfs take a long time to load[135s]
On Fri, Oct 19, 2012 at 1:02 PM, Marguerite Su wrote:
> On Sat, Oct 20, 2012 at 2:35 AM, cwillu wrote:
>> Without space_cache (once), btrfs has to repopulate that information
>> the slow way every mount; with it, it can just load the data from the
>> last unmount (modulo some consistency checks).
>>
>> The setting is sticky, so you don't actually need it in fstab any more
>> (although it won't hurt anything either).
>
> Thanks, cwillu!
>
> I forwarded the message to the openSUSE bugzilla and asked them to help
> make that happen by default in openSUSE.
>
> Marguerite

Apparently mkfs.btrfs does set it by default now, so perhaps your filesystem predates the change, or suse's btrfs-progs is too old.

mkfs.btrfs /dev/whatever followed by mounting with no options should print "btrfs: disk space caching is enabled" to dmesg if your mkfs is new enough, if you wish to test.
Re: initramfs take a long time to load[135s]
On Fri, Oct 19, 2012 at 12:33 PM, Marguerite Su wrote:
> On Sat, Oct 20, 2012 at 2:26 AM, cwillu wrote:
>> That would work, but it's only necessary to mount with it once (and
>> it's probably been done already with /home), hence the -o
>> remount,space_cache
>
> Now my kernel loads in 10s, another 4s for userspace... then -.mount
> and all the systemd services.
>
> It boots like an animal!

Without space_cache (once), btrfs has to repopulate that information the slow way every mount; with it, it can just load the data from the last unmount (modulo some consistency checks).

The setting is sticky, so you don't actually need it in fstab any more (although it won't hurt anything either).
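Putting the advice in this thread together, enabling the cache once and verifying it stuck looks like this (mountpoint illustrative; needs root):

```shell
# Enable the free-space cache; the setting is sticky after one mount,
# so it does not need to live in fstab afterwards.
mount -o remount,space_cache /
sync; sync

# After the next reboot (or a plain remount), confirm it's in use;
# expect one line per btrfs filesystem:
dmesg | grep "disk space caching is enabled"
```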
Re: initramfs take a long time to load[135s]
On Fri, Oct 19, 2012 at 11:02 AM, Marguerite Su wrote:
> On Sat, Oct 20, 2012 at 12:55 AM, cwillu wrote:
>> It appears space_cache isn't enabled on your rootfs; can you do a
>> "mount / -o remount,space_cache", sync a couple times, make some
>> coffee, and then reboot, and see if it's better?
>>
>> You should see two instances of "btrfs: disk space caching is enabled"
>> in your dmesg, one for / and the second for /home.
>>
>> Also, make sure to reply-all so that others interested can still follow
>> along.
>
> like this
>
> UUID=9b9aa9d9-760e-445c-a0ab-68e102d9f02e / btrfs defaults,space_cache,comment=systemd.automount 1 0
>
> UUID=559dec06-4fd0-47c1-97b8-cc4fa6153fa0 /home btrfs defaults,space_cache,comment=systemd.automount 1 0
>
> in /etc/fstab?

That would work, but it's only necessary to mount with it once (and it's probably been done already with /home), hence the -o remount,space_cache
Re: initramfs take a long time to load[135s]
On Fri, Oct 19, 2012 at 10:18 AM, Marguerite Su wrote:
> On Fri, Oct 19, 2012 at 11:41 PM, cwillu wrote:
>> Also, next time just put the output directly in the email, that way
>> it's permanently around to look at and search for.
>
> Hi,
>
> I did it. Here's my dmesg:
> I made the snapshot at:
>
> mount -o rw,defaults,comment=systemd.automount -t btrfs /dev/root /root
>
> and
>
> Starting Tell Plymouth To Write Out Runtime Data...
> Started Recreate Volatile Files and Directories
>
> is it useful this time?

More useful every time! Can you post the full output of dmesg, or at least the first couple hundred seconds of it?
Re: initramfs take a long time to load[135s]
On Fri, Oct 19, 2012 at 9:28 AM, Marguerite Su wrote:
> On Thu, Oct 18, 2012 at 9:28 PM, Chris Mason wrote:
>> If it isn't the free space cache, it'll be a fragmentation problem. The
>> easiest way to tell the difference is to get a few sysrq-w snapshots
>> during the boot.
>
> Hi, Chris,
>
> with some help from the openSUSE community, I learnt what sysrq
> snapshots are (alt+printscreen+w in tty1)...
>
> and here's my log:
>
> http://paste.opensuse.org/31094916

You need to hit alt-sysrq-w during the slowness you're trying to instrument; the pastebin is from an hour later.

Also, next time just put the output directly in the email, that way it's permanently around to look at and search for.
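If the keyboard chord is awkward to hit at the right moment, the same dump can be triggered from a second terminal while the mount is hanging (needs root, and sysrq must be enabled):

```shell
# Permit sysrq (1 enables everything; a distro may restrict it by default):
echo 1 > /proc/sys/kernel/sysrq

# While the boot/mount is stuck, trigger the blocked-task dump:
echo w > /proc/sysrq-trigger

# The backtraces of uninterruptible tasks land in the kernel log:
dmesg | tail -n 100
```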
Re: BTRFS filesystem is not mountable after crash
On Sat, Oct 13, 2012 at 11:51 AM, Alfred Zastrow wrote:
> On 26.08.2012 08:17, Liu Bo wrote:
>> On 08/26/2012 01:27 PM, Alfred Zastrow wrote:
>>
>> Hello,
>>
>> has really nobody a hint for me?
>>
>> Is compiling chris's latest for-linus helpful?
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>
>> thanks,
>> liubo
>
> Hi devs,
>
> I was not able to install chris's latest for-linus under F17, but I tried
> the latest 3.6.1 kernel, which was recently released.
> Same shit.. :-(

Chris's for-linus is currently all the btrfs changes that will be going into 3.7; 3.6.1 won't likely have any of them.