Re: BTRFS and databases
On 1 August 2018 at 04:45, MegaBrutal wrote:
> But there is still one question that I can't get over: if you store a
> database (e.g. MySQL), would you prefer having a BTRFS volume mounted
> with nodatacow, or would you just simply use ext4?
>
> I know that with nodatacow, I take away most of the benefits of BTRFS
> (those are actually hurting database performance – the exact CoW
> nature that is elsewhere a blessing, with databases it's a drawback).
> But are there any advantages of still sticking to BTRFS for a database
> albeit CoW is disabled, or should I just return to the old and
> reliable ext4 for those applications?

Also note that no data CoW implies no data checksums too.
https://btrfs.wiki.kernel.org/index.php/FAQ#Can_I_have_nodatacow_.28or_chattr_.2BC.29_but_still_have_checksumming.3F

Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: notification about corrupt files from "btrfs scrub" in cron
On 23 November 2017 at 11:47, ST wrote:
>
>> > I have following cron job to scrub entire root filesystem (total ca.
>> > 7.2TB and 2.3TB of them used) once a week:
>> > /bin/btrfs scrub start -r / > /dev/null
>> >
>> > Such scrubbing takes ca. 2 hours. How should I get notified that a
>> > corrupt file was discovered? Does this command return some error code
>> > back to cron so it can send an email as usual? Will cron wait 2 hours to
>> > get that code?
>> >
>> > I tried that command once without "> /dev/null" but got no email
>> > notification about the results (eventhough the check was OK) - why?
>>
>> See the btrfs-scrub manpage...
>>
>> Note that normally btrfs scrub start is asynchronous and should return
>> effectively immediately, the only possible errors therefore being for
>> example if the given path doesn't point to a btrfs or btrfs-device (which
>> would return a status code of 1, scrub couldn't be performed), etc.
>>
>> Status can be checked via btrfs scrub status, and/or, or you can use the
>> btrfs scrub start's -B (don't background) switch, which will cause it to
>> wait until the scrub is finished and print a summary report. That should
>> allow you to check for a status code of 3, scrub found uncorrectable
>> errors, as well.
>
> Thank you for the response! Does it mean that if write:
>
> /bin/btrfs scrub start -r -B /
>
> cron will hang for 2 hours (is it problematic?) and then send me an
> email with the summary report (even if everything was OK), and if I
> write:
>
> /bin/btrfs scrub start -r -B / > /dev/null
>
> after 2 hours it will send an email, only if there was an error with
> whatever error code (1-3)?

Cron starts configured jobs at the scheduled time asynchronously, i.e. it
doesn't block waiting for each command to finish. When a job finishes, cron
emails any output the job wrote to stdout and/or stderr to the user. So no,
a 2 hour job is not a problem for cron.

With btrfs scrub's stdout redirected to /dev/null, only stderr output will
be captured by cron and emailed to you. (Unfortunately I don't know btrfs
scrub well enough to know if it reports errors to stderr.)

Mike
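Since cron mails on output rather than on exit status, one way to get mail
only on failure is a small wrapper that stays silent unless the scrub exits
non-zero. This is a rough sketch, not a tested recipe: scrub_status_message
and run_scrub are made-up helper names, the script path in the crontab
comment is hypothetical, and the meaning of status 2 is taken from my
reading of btrfs-scrub(8) (statuses 1 and 3 are the ones mentioned in the
thread).

```shell
#!/bin/sh
# Sketch of a cron wrapper: print nothing on success, print a report on
# failure so that cron has something to email.

# Map a "btrfs scrub start" exit status to a message.
# 1 and 3 are described in the thread; 2 is assumed from btrfs-scrub(8).
scrub_status_message() {
    case "$1" in
        0) echo "scrub finished without errors" ;;
        1) echo "scrub could not be performed" ;;
        2) echo "nothing to resume" ;;
        3) echo "scrub found uncorrectable errors" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

# Run a read-only, foreground (-B) scrub of the given mount point.
# The summary is captured and only shown when the status is non-zero.
run_scrub() {
    output=$(/bin/btrfs scrub start -B -r "$1" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "btrfs scrub of $1: $(scrub_status_message "$status")"
        echo "$output"
    fi
    return "$status"
}

# Only scrub when a mount point is given, e.g. from a crontab entry like
#   0 3 * * 0  /usr/local/sbin/scrub-notify.sh /
if [ "$#" -gt 0 ]; then
    run_scrub "$1"
fi
```

With this, cron stays quiet on a clean scrub and mails the captured summary
plus a one-line reason otherwise.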
Re: [PATCH] btrfs-progs: allow "no" to disable compression for convenience
On 17 September 2017 at 01:36, Satoru Takeuchi wrote:
> It's messy to use "" to disable compression. Introduce the new value "no"
> which can also be used for this purpose.

From an English language point of view, "none" would be better. "None"
means the absence of something, whereas "no" is a more general negative.

Mike
Re: [PATCH 0/7] cleanup __btrfs_map_block
On 18 February 2017 at 01:28, Liu Bo wrote:
> This is attempting to make __btrfs_map_block less scary :)
>
> The major changes are
>
> 1) split operations for discard out of __btrfs_map_block and we don't copy
> discard operations for the target device of dev replace since they're not
> as important as writes.
>
> 2) put dev replace stuff into helpers since they're basically
> self-independant.

Just a note on English. "Self-independant" / "self-independent" is a
made-up word (used in PATCH 0/7, 3/7 and 4/7). I assume you intended to
use either "self-contained" or "independent".

Thanks,
Mike
Re: Free space missing
On 6 December 2016 at 12:40, Кравцов Роман Владимирович wrote:
> Hello.
>
> Why 'BTRFS used size' and 'du -s' are different?
>
>
> [root@OraCI2 ~]# btrfs --version
> btrfs-progs v4.8.3
> [root@OraCI2 ~]# btrfs fi show /dev/nvme0n1
> Label: 'OLD_PES'  uuid: bd542f37-6baa-4af8-b87a-20d4e335e4b3
>         Total devices 1 FS bytes used 2.53TiB
>         devid    1 size 2.91TiB used 2.90TiB path /dev/nvme0n1
> [root@OraCI2 ~]# btrfs fi usage /mnt/OLD/
> Overall:
>     Device size:          2.91TiB
>     Device allocated:     2.90TiB
>     Device unallocated:   9.97GiB
>     Device missing:       0.00B
>     Used:                 2.53TiB
>     Free (estimated):     385.31GiB (min: 385.31GiB)
>     Data ratio:           1.00
>     Metadata ratio:       1.00
>     Global reserve:       512.00MiB (used: 0.00B)
>
> Data,single: Size:2.89TiB, Used:2.53TiB    <-
>    /dev/nvme0n1    2.89TiB
>
> Metadata,single: Size:6.01GiB, Used:3.52GiB
>    /dev/nvme0n1    6.01GiB
>
> System,single: Size:32.00MiB, Used:432.00KiB
>    /dev/nvme0n1   32.00MiB
>
> Unallocated:
>    /dev/nvme0n1    9.97GiB
>
> [root@OraCI2 ~]# df -h | grep nvme0n1
> /dev/nvme0n1    3,0T  2,6T  386G  88% /mnt/OLD
> [root@OraCI2 ~]# du -sh /mnt/OLD/
> 2,0T    /mnt/OLD/    <-
> [root@OraCI2 ~]#
> /dev/nvme0n1 on /mnt/OLD type btrfs (rw,relatime,ssd,discard,space_cache,subvolid=5,subvol=/)
>
> WBR,
> Roman Kravtsov

These entries in the FAQ should help:
https://btrfs.wiki.kernel.org/index.php/FAQ#How_much_free_space_do_I_have.3F

Mike
Re: [PATCH] btrfs-progs: add dev stats returncode option
On 1 December 2016 at 18:43, Austin S. Hemmelgarn wrote:
> Currently, `btrfs device stats` returns non-zero only when there was an
> error getting the counter values. This is fine for when it gets run by a
> user directly, but is a serious pain when trying to use it in a script or
> for monitoring since you need to parse the (not at all machine friendly)
> output to check the counter values.
>
> This patch adds an option ('-s') which causes `btrfs device stats`
> to set bit 7 in the return code if any of the counters are non-zero.
> This greatly simplifies checking from a script or monitoring software
> if any errors have been recorded. In the event that this switch is
> passed and an error occurs reading the stats, the return code will have
> bit 0 set (so if there are errors reading counters, and the counters
> which were read were non-zero, the return value will be 129).

I don't think using bit 7 is a good idea. Bash (and I think all shells)
report exit status 128+SIGNUM when the process is killed by a signal, i.e.
status 129 would be returned when a process is killed by SIGHUP. Perhaps
bit 6 would be OK to use.

Thanks,
Mike

https://tiswww.case.edu/php/chet/bash/bashref.html#Exit-Status
"Exit statuses fall between 0 and 255, though, as explained below, the
shell may use values above 125 specially. ... When a command terminates on
a fatal signal whose number is N, Bash uses the value 128+N as the exit
status. ... If a command is not found, the child process created to execute
it returns a status of 127. If a command is found but is not executable,
the return status is 126."
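The clash can be seen from a shell. This is only a sketch, assuming a POSIX
sh such as bash or dash on Linux, where SIGHUP is signal 1; describe_status
is a made-up helper and not part of btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper: how a monitoring script would have to read an exit
# status if bit 7 were used as a "counters are non-zero" flag.
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "ambiguous: signal $(( $1 - 128 )) or bit 7 plus status $(( $1 - 128 ))"
    else
        echo "unambiguous: exit status $1"
    fi
}

# A child that kills itself with SIGHUP; the parent shell reports 128 + N.
hup_status=0
sh -c 'kill -HUP $$' || hup_status=$?

# The value from the patch description: bit 7 (counters non-zero) together
# with bit 0 (error reading the stats).
proposed=$(( (1 << 7) | 1 ))

# The same two flags with bit 6 instead of bit 7.
alternative=$(( (1 << 6) | 1 ))

echo "child killed by SIGHUP: $hup_status"
echo "bit 7 | bit 0:          $proposed"
echo "bit 6 | bit 0:          $alternative"
describe_status "$proposed"
describe_status "$alternative"
```

With bit 6 the combined value stays below the 126, 127 and 128+N values the
shell uses specially, so a caller can tell the two cases apart.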
Re: Small fs
On 12 September 2016 at 19:55, Austin S. Hemmelgarn wrote:
> I'm not sure about gparted, but the default behavior for mkfs is as follows:
> 1. Is the device rotational? (check /sys/block//rotational). If
> not, do some extra stuff to try and ID it as an SSD. If it is an SSD, use
> SINGLE mode for metadata, otherwise use DUP mode for metadata.
> 2. Is the FS set for mixed-bg? If so, use the same profile for data as
> metadata, otherwise use SINGLE mode for data.
>
> It would not surprise me if gparted switches to single metadata mode for a
> small enough FS, but I'm not certain. I do think that they just use the
> default selection for mixed-bg though, which means not using it in current
> btrfs-progs versions.

GParted always just takes the mkfs.btrfs defaults.
https://git.gnome.org/browse/gparted/tree/src/btrfs.cc?h=GPARTED_0_26_1#n154
Re: commands like "du", "df", and "btrfs fs sync" hang
On 1 May 2016 at 13:47, Duncan <1i5t5.dun...@cox.net> wrote:
> Direct from that section of my /etc/sysctl.conf:
>
>
> # Virtual-machine: swap, write-cache

Hi Duncan,

You mean virtual memory. Quoting from the kernel documentation
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/sysctl/vm.txt
"
This file contains the documentation for the sysctl files in
/proc/sys/vm and is valid for Linux kernel version 2.6.29.

The files in this directory can be used to tune the operation
of the virtual memory (VM) subsystem of the Linux kernel and
the writeout of dirty data to disk.
"

> # vm.vfs_cache_pressure = 100
> # vm.laptop_mode = 0
> # vm.swappiness = 60
> vm.swappiness = 100
>
> # write-cache, foreground/background flushing
> # vm.dirty_ratio = 10 (% of RAM)
> # make it 3% of 16G ~ half a gig
> vm.dirty_ratio = 3
> # vm.dirty_bytes = 0
>
> # vm.dirty_background_ratio = 5 (% of RAM)
> # make it 1% of 16G ~ 160 M
> vm.dirty_background_ratio = 1
> # vm.dirty_background_bytes = 0
>
> # vm.dirty_expire_centisecs = 2999 (30 sec)
> # vm.dirty_writeback_centisecs = 499 (5 sec)
> # make it 10 sec
> vm.dirty_writeback_centisecs = 1000
>

Thanks,
Mike
Re: [PATCH v2 1/5] btrfs-progs: introduce framework to check kernel supported features
On 23 November 2015 at 12:56, Anand Jain wrote:
> In the newer kernel, supported kernel features can be known from
>   /sys/fs/btrfs/features
> however this interface was introduced only after 3.14, and most the
> incompatible FS features were introduce before 3.14.
>
> This patch proposes to maintain kernel version against the feature list,
> and so that will be the minimum kernel version needed to use the feature.
>
> Further, for features supported later than 3.14 this list can still be
> updated, so it serves as a repository which can be displayed for easy
> reference.
>
> Signed-off-by: Anand Jain
> ---
> v2: Check for condition that what happens when we fail to read kernel
>     version. Now the code will fail back to use the default as set by
>     the progs.
>
>  utils.c | 80
> -
>  utils.h |  1 +
>  2 files changed, 76 insertions(+), 5 deletions(-)
>
> diff --git a/utils.c b/utils.c
> index b754686..24042e5 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -32,10 +32,12 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #include "kerncompat.h"
> @@ -567,21 +569,28 @@ out:
>  	return ret;
>  }
>
> +/*
> + * min_ker_ver: update with minimum kernel version at which the feature
> + * was integrated into the mainline. For the transit period, that is
> + * feature not yet in mainline but in mailing list and for testing,
> + * please use "0.0" to indicate the same.
> + */
>  static const struct btrfs_fs_feature {
>  	const char *name;
>  	u64 flag;
>  	const char *desc;
> +	const char *min_ker_ver;
>  } mkfs_features[] = {
>  	{ "mixed-bg", BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS,
> -		"mixed data and metadata block groups" },
> +		"mixed data and metadata block groups", "2.7.31"},

I think you mean 2.6.37 here.
67377734fd24c3 "Btrfs: add support for mixed data+metadata block groups"

Thanks,
Mike

>  	{ "extref", BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF,
> -		"increased hardlink limit per file to 65536" },
> +		"increased hardlink limit per file to 65536", "3.7"},
>  	{ "raid56", BTRFS_FEATURE_INCOMPAT_RAID56,
> -		"raid56 extended format" },
> +		"raid56 extended format", "3.9"},
>  	{ "skinny-metadata", BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA,
> -		"reduced-size metadata extent refs" },
> +		"reduced-size metadata extent refs", "3.10"},
>  	{ "no-holes", BTRFS_FEATURE_INCOMPAT_NO_HOLES,
> -		"no explicit hole extents for files" },
> +		"no explicit hole extents for files", "3.14"},
>  	/* Keep this one last */
>  	{ "list-all", BTRFS_FEATURE_LIST_ALL, NULL }
>  };
> @@ -3077,3 +3086,64 @@ unsigned int get_unit_mode_from_arg(int *argc, char *argv[], int df_mode)
>
>  	return unit_mode;
>  }
> +
> +static int version_to_code(char *v)
> +{
> +	int i = 0;
> +	char *b[3] = {NULL};
> +	char *save_b = NULL;
> +
> +	for (b[i] = strtok_r(v, ".", &save_b);
> +		b[i] != NULL;
> +		b[i] = strtok_r(NULL, ".", &save_b))
> +		i++;
> +
> +	if (b[2] == NULL)
> +		return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), 0);
> +	else
> +		return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), atoi(b[2]));
> +
> +}
> +
> +static int get_kernel_code()
> +{
> +	int ret;
> +	struct utsname utsbuf;
> +	char *version;
> +
> +	ret = uname(&utsbuf);
> +	if (ret)
> +		return -ret;
> +
> +	if (!strlen(utsbuf.release))
> +		return -EINVAL;
> +
> +	version = strtok(utsbuf.release, "-");
> +
> +	return version_to_code(version);
> +}
> +
> +u64 btrfs_features_allowed_by_kernel(void)
> +{
> +	int i;
> +	int local_kernel_code = get_kernel_code();
> +	u64 features = 0;
> +
> +	/*
> +	 * When system did not provide the kernel version then just
> +	 * return 0, the caller has to depend on the intelligence as
> +	 * per btrfs-progs version
> +	 */
> +	if (local_kernel_code <= 0)
> +		return 0;
> +
> +	for (i = 0; i < ARRAY_SIZE(mkfs_features) - 1; i++) {
> +		char *ver = strdup(mkfs_features[i].min_ker_ver);
> +
> +		if (local_kernel_code >= version_to_code(ver))
> +			features |= mkfs_features[i].flag;
> +
> +		free(ver);
> +	}
> +	return (features);
> +}
> diff --git a/utils.h b/utils.h
> index 192f3d1..9044643 100644
> --- a/utils.h
> +++ b/utils.h
> @@ -104,6 +104,7 @@ void btrfs_list_all_fs_features(u64 mask_disallowed);
>  char* btrfs_parse_fs_features(char *namelist, u64 *flags);
>  void btrfs_process_fs_features(u64 flags);
>  void btrfs_parse_features_to_string(char
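To illustrate why the "2.7.31" typo matters for the comparison in the patch,
here is a rough shell sketch of the same arithmetic. It assumes the classic
KERNEL_VERSION(a, b, c) packing of (a << 16) + (b << 8) + c; the function
names mirror the patch but are only illustrative, and kernel_supports is a
made-up helper.

```shell
#!/bin/sh
# Pack "major.minor[.patch]" into one integer using the classic
# KERNEL_VERSION(a, b, c) layout: (a << 16) + (b << 8) + c.
# A missing component counts as 0.
version_to_code() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the version string on dots
    IFS=$old_ifs
    echo $(( ($1 << 16) + (${2:-0} << 8) + ${3:-0} ))
}

# Succeeds when running kernel version $1 is at least minimum version $2.
kernel_supports() {
    [ "$(version_to_code "$1")" -ge "$(version_to_code "$2")" ]
}

# A 2.6.39 kernel has mixed block groups (merged in 2.6.37), but a table
# entry of "2.7.31" would make the check say otherwise.
for min in 2.6.37 2.7.31; do
    if kernel_supports 2.6.39 "$min"; then
        echo "minimum $min: mixed-bg allowed on 2.6.39"
    else
        echo "minimum $min: mixed-bg refused on 2.6.39"
    fi
done
```

Any 3.x or later kernel compares greater than both values, so the typo would
only be noticed on 2.6.37 to 2.6.39 kernels.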
Re: BTRFS as image store for KVM?
On 17 September 2015 at 18:56, Gert Menke wrote:
> MD+LVM is very close to what I want, but md has no way to cope with silent
> data corruption. So if I'd want to use a guest filesystem that has no
> checksums either, I'm out of luck.
> I'm honestly a bit confused here - isn't checksumming one of the most
> obvious things to want in a software RAID setup? Is it a feature that might
> appear in the future? Maybe I should talk to the md guys...
...
> Any comments on that? Am I missing something?

How about using file integrity checking tools for cases when the chosen
storage stack doesn't provide data checksumming? E.g.
aide - http://aide.sourceforge.net/
cfv - http://cfv.sourceforge.net/
tripwire - http://sourceforge.net/projects/tripwire/

I don't use them myself; just providing options.

Mike
Re: [PATCH] Btrfs: fix file read corruption after extent cloning and fsync
On 19 August 2015 at 11:11, <fdman...@kernel.org> wrote:
> From: Filipe Manana <fdman...@suse.com>
>
> If we partially clone one extent of a file into a lower offset of the
> file, fsync the file, power fail and then mount the fs to trigger log
> replay, we can get multiple checksum items in the csum tree that overlap
> each other and result in checksum lookup failures later. Those failures
> can make file data read requests assume a checksum value of 0, but they
> will not return an error (-EIO for example) to userspace exactly because
> the expected checksum value 0 is a special value that makes the read bio
> endio callback return success and set all the bytes of the corresponding
> page with the value 0x01 (at fs/btrfs/inode.c:__readpage_endio_check()).
> From a userspace perspective this is equivalent to file corruption
> because we are not returning what was written to the file.
>
> Details about how this can happen, and why, are included inline in the
> following reproducer test case for fstests and the comment added to
> tree-log.c.
>
>   seq=`basename $0`
>   seqres=$RESULT_DIR/$seq
>   echo "QA output created by $seq"
>   tmp=/tmp/$$
>   status=1	# failure is the default!
>   trap "_cleanup; exit \$status" 0 1 2 3 15
>
>   _cleanup()
>   {
>       _cleanup_flakey
>       rm -f $tmp.*
>   }
>
>   # get standard environment, filters and checks
>   . ./common/rc
>   . ./common/filter
>   . ./common/dmflakey
>
>   # real QA test starts here
>   _need_to_be_root
>   _supported_fs btrfs
>   _supported_os Linux
>   _require_scratch
>   _require_dm_flakey
>   _require_cloner
>   _require_metadata_journaling $SCRATCH_DEV
>
>   rm -f $seqres.full
>
>   _scratch_mkfs >>$seqres.full 2>&1
>   _init_flakey
>   _mount_flakey
>
>   # Create our test file with a single 100K extent starting at file
>   # offset 800K. We fsync the file here to make the fsync log tree gets
>   # a single csum item that covers the whole 100K extent, which causes
>   # the second fsync, done after the cloning operation below, to not
>   # leave in the log tree two csum items covering two sub-ranges
>   # ([0, 20K[ and [20K, 100K[)) of our extent.
>   $XFS_IO_PROG -f -c "pwrite -S 0xaa 800K 100K" \
>                   -c "fsync" \
>                    $SCRATCH_MNT/foo | _filter_xfs_io
>
>   # Now clone part of our extent into file offset 400K. This adds a file
>   # extent item to our inode's metadata that points to the 100K extent
>   # we created before, using a data offset of 20K and a data length of
>   # 20K, so that it refers to the sub-range [20K, 40K[ of our original
>   # extent.
>   $CLONER_PROG -s $((800 * 1024 + 20 * 1024)) -d $((400 * 1024)) \
>       -l $((20 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo
>
>   # Now fsync our file to make sure the extent cloning is durably
>   # persisted. This fsync will not add a second csum item to the log
>   # tree containing the checksums for the blocks in the sub-range
>   # [20K, 40K[ of our extent, because there was already a csum item in
>   # the log tree covering the whole extent, added by the first fsync
>   # we did before.
>   $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
>
>   echo "File digest before power failure:"
>   md5sum $SCRATCH_MNT/foo | _filter_scratch
>
>   # Silently drop all writes and ummount to simulate a crash/power
>   # failure.
>   _load_flakey_table $FLAKEY_DROP_WRITES
>   _unmount_flakey
>
>   # Allow writes again, mount to trigger log replay and validate file
>   # contents.
>   # The fsync log replay first processes the file extent item
>   # corresponding to the file offset 400K (the one which refers to the
>   # [20K, 40K[ sub-range of our 100K extent) and then processes the file
>   # extent item for file offset 800K. It used to happen that when
>   # processing the later, it erroneously left in the csum tree 2 csum
>   # items that overlapped each other, 1 for the sub-range [20K, 40K[ and
>   # 1 for the whole range of our extent. This introduced a problem where
>   # subsequent lookups for the checksums of blocks within the range
>   # [40K, 100K[ of our extent would not find anything because lookups in
>   # the csum tree ended up looking only at the smaller csum item, the
>   # one covering the subrange [20K, 40K[. This made read requests assume
>   # an expected checksum with a value of 0 for those blocks, which caused
>   # checksum verification failure when the read operations finished.
>   # However those checksum failure did not result in read requests
>   # returning an error to user space (like -EIO for e.g.) because the
>   # expected checksum value had the special value 0, and in that case
>   # btrfs set all bytes of the corresponding pages with the value 0x01
>   # and produce the following warning in dmesg/syslog:
>   #
>   #  BTRFS warning (device dm-0): csum failed ino 257 off 917504 csum\
>   #  1322675045 expected csum 0
>   #
>   _load_flakey_table $FLAKEY_ALLOW_WRITES
>   _mount_flakey
>
>   echo "File digest after log replay:"
>   # Must match the same digest he had after cloning the extent and
>   # before the power failure happened.
Re: btrfs raid1 metadata, single data
On 7 August 2015 at 10:47, Sjoerd <sjo...@sjomar.eu> wrote:
> While we're at it: any idea why the default for SSD's is single for meta
> data as described on the wiki?
> (https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Filesystem_creation)
>
> I was looking for an answer why my SSD just had single metadata, while I
> expected it to be DUP and stumbled on this wiki article. Can't find a
> reason for why a SSD would be different?
>
> Cheers,
> Sjoerd

I would assume that it is because some SSD controllers deduplicate by
default [1]. The developers probably think that when it comes to your data
the truth, no matter how ugly, is preferable to a false sense of security
(Btrfs thinking it has 2 copies of metadata when the SSD has actually
stored only 1 copy).

[1] How SSDs can hose your data
http://www.zdnet.com/article/how-ssds-can-hose-your-data/
"Researchers found that at least 1 Sandforce SSD controller - the SF1200 -
does block-level deduplication by default. Which can be a problem. Many
file systems - NTFS, most Unix/Linux FSs, ZFS are some - write critical
metadata to multiple blocks in case one copy gets corrupted. But what if,
unbeknownst to you, your SSD de-duplicates that block, leaving your file
system with only 1 copy?"

Thanks,
Mike
Re: btrfs replace seems to corrupt the file system
On 29 June 2015 at 09:08, Duncan <1i5t5.dun...@cox.net> wrote:
> Meanwhile, unlike many filesystems, btrfs uses the UUID as part of the
> metadata, so changing the UUID isn't as simple as rewriting a superblock;
> the metadata must be rewritten to the new UUID. There's actually a tool
> now available to do just that, but it's new enough I'm not even sure it's
> available in release form yet; if so, it'll be latest releases.
> Otherwise, it'd be in integration branch.

FYI, btrfstune with the capability to change the file system UUID was
included in btrfs-progs 4.1, released last week on Mon, 22 Jun 2015.
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg44182.html

Mike
btrfs-progs 4.1-rc1: btrfstune -u reporting incorrect current fsid?
Hi,

I've done a quick test on changing the UUID of a btrfs. It worked, but
btrfstune -u didn't print the same current UUID that btrfs fi sh does. It
also upper-cases the UUID, whereas btrfs fi sh and blkid don't.

Thanks,
Mike

# btrfs filesystem show /dev/sdb1 | fgrep uuid
Label: none  uuid: b2813976-4d8b-4976-9d59-cbfbd588399c
# ~fedora/programming/c/btrfs-progs-unstable/btrfstune -f -u /dev/sdb1
Current fsid: ---00B0-8F12937F
New fsid: D294F3F3-F2B7-4407-B83A-DE5A4F8CBAB1
Set superblock flag CHANGING_FSID
Change fsid in extents
Change fsid on devices
Clear superblock flag CHANGING_FSID
Fsid change finished
# btrfs filesystem show /dev/sdb1 | fgrep uuid
Label: none  uuid: d294f3f3-f2b7-4407-b83a-de5a4f8cbab1
# blkid | fgrep sdb1
/dev/sdb1: UUID="d294f3f3-f2b7-4407-b83a-de5a4f8cbab1" UUID_SUB="70065403-5ec1-462c-93a4-26cff8b6aea2" TYPE="btrfs" PARTUUID="b309c48c-486f-4882-896c-34d4d0aeb529"
Re: [PATCH] btrfs-progs: optionally enforce chroot for btrfs receive
On 18 April 2015 at 14:59, Lauri Võsandi <lauri.vosa...@gmail.com> wrote:
> This patch forces btrfs receive to issue chroot before
> parsing the btrfs stream using command-line flag -C
> to confine the process and minimize damage that could
> be done via malicious btrfs stream.
>
> Signed-off-by: Lauri Võsandi <lauri.vosa...@gmail.com>
> ---
>  cmds-receive.c | 37 -
>  1 file changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/cmds-receive.c b/cmds-receive.c
> index 44ef27e..73bd88b 100644
> --- a/cmds-receive.c
> +++ b/cmds-receive.c
> @@ -61,6 +61,7 @@ struct btrfs_receive
>  	char *root_path;
>  	char *dest_dir_path; /* relative to root_path */
>  	char *full_subvol_path;
> +	int dest_dir_chroot;
>
>  	struct subvol_info *cur_subvol;
>
> @@ -867,14 +868,27 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd,
>  		goto out;
>  	}
>
> -	/*
> -	 * find_mount_root returns a root_path that is a subpath of
> -	 * dest_dir_full_path. Now get the other part of root_path,
> -	 * which is the destination dir relative to root_path.
> -	 */
> -	r->dest_dir_path = dest_dir_full_path + strlen(r->root_path);
> -	while (r->dest_dir_path[0] == '/')
> -		r->dest_dir_path++;
> +	if (r->dest_dir_chroot) {
> +		if (chroot(dest_dir_full_path)) {
> +			ret = -errno;
> +			fprintf(stderr,
> +				"ERROR: failed to chroot to %s, %s\n",
> +				dest_dir_full_path,
> +				strerror(-ret));
> +			goto out;
> +		}
> +		if(chdir("/")) {
> +			ret = -errno;
> +			fprintf(stderr,
> +				"ERROR: failed to chdir to /, %s\n",
> +				strerror(-ret));

There appears to be a goto out missing here.

> +		}
> +		if (g_verbose >= 1) {
> +			fprintf(stderr, "chrooted to %s\n",
> +				dest_dir_full_path);
> +		}
> +		r->root_path = r->dest_dir_path = strdup("/");
> +	}
>
>  	ret = subvol_uuid_search_init(r->mnt_fd, &r->sus);
>  	if (ret < 0)
> @@ -940,6 +954,7 @@ int cmd_receive(int argc, char **argv)
>  	r.write_fd = -1;
>  	r.dest_dir_fd = -1;
>  	r.explicit_parent = NULL;
> +	r.dest_dir_chroot = 0;
>
>  	while (1) {
>  		int c;
> @@ -948,7 +963,7 @@ int cmd_receive(int argc, char **argv)
>  			{ NULL, 0, NULL, 0 }
>  		};
>
> -		c = getopt_long(argc, argv, "evf:p:", long_opts, NULL);
> +		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
>  		if (c < 0)
>  			break;
>
> @@ -962,6 +977,9 @@ int cmd_receive(int argc, char **argv)
>  		case 'e':
>  			r.honor_end_cmd = 1;
>  			break;
> +		case 'C':
> +			r.dest_dir_chroot = 1;
> +			break;
>  		case 'E':
>  			max_errors = arg_strtou64(optarg);
>  			break;
> @@ -1014,6 +1032,7 @@ const char * const cmd_receive_usage[] = {
>  	"                 in the data stream. Without this option,",
>  	"                 the receiver terminates only if an error",
>  	"                 is recognized or on EOF.",
> +	"-C               Confine the process to mount using chroot",
>  	"--max-errors N   Terminate as soon as N errors happened while",
>  	"                 processing commands from the send stream.",
>  	"                 Default value is 1. A value of 0 means no limit.",
> --
> 1.9.1

Mike
Re: [PATCH 2/2] btrfs-progs: fix wrong num_devices for btrfs fi show with seed devices
On 7 November 2014 18:16, David Sterba <dste...@suse.cz> wrote:
> On Fri, Nov 07, 2014 at 10:07:43AM +0800, Gui Hecheng wrote:
>> The @fi_args->num_devices in @get_fs_info() does not include seed devices.
>> We could just correct it by searching the chunk tree and count how
>> many dev_items there are in total which includes seed devices.
>>
>> Signed-off-by: Gui Hecheng <guihc.f...@cn.fujitsu.com>
>> ---
>> *Note*
>> This is just a temporary workaround to fix this problem in order to
>> make users happy, because a new ioctl or sysfs interface to handle
>> this problem needs more discussions and efforts. After the work
>> implemented and accepted, we could drop this.
>
> Nice, thanks. I agree that this kind of workaround is best possible for
> the moment, and I'm glad to see that it's not that much code to get the
> seeding devices right.
>
> This would also work with older kernels without the updated sysfs/ioctl
> interfaces, so this is likely to stay for a long time.
>
>> +u64 find_max_id(struct btrfs_ioctl_search_args *search_args, int nr_items)
>
> That's a very generic name for a function that does a very specialized
> thing, but I don't have a suggestion right now.

Is find_max_device_id a suitable name?

>> +int correct_fs_info(int fd, struct btrfs_ioctl_fs_info_args *fi_args)
>
> Same here, make fs_info correct but in what way? A comment would be good
> as well.

Sorry, no suggestion for this function name.

Thanks,
Mike
Re: [PATCH v2] btrfs-progs: introduce test_issubvolname() for simplicity
On 30 July 2014 04:25, Satoru Takeuchi <takeuchi_sat...@jp.fujitsu.com> wrote:
> From: Satoru Takeuchi <takeuchi_sat...@jp.fujitsu.com>
>
> There are many duplicated codes to check if the given string is
> correct subvolume name. Introduce test_issubvolname() for this
> purpose for simplicity.
>
> Signed-off-by: Satoru Takeuchi <takeuchi_sat...@jp.fujitsu.com>
> Cc: David Sterba <dste...@suse.cz>
>
> ---
> changelog:
> v2: Move test_issubvolname() to utils.c.
>     Change the target branch to integ-20140729.
> ---
>  cmds-subvolume.c | 9 +++--
>  utils.c | 12
>  utils.h | 1 +
>  3 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/cmds-subvolume.c b/cmds-subvolume.c
> index 8bdc447..c075fb2 100644
> --- a/cmds-subvolume.c
> +++ b/cmds-subvolume.c
> @@ -127,8 +127,7 @@ static int cmd_subvol_create(int argc, char **argv)
>  	dupdir = strdup(dst);
>  	dstdir = dirname(dupdir);
>
> -	if (!strcmp(newname, ".") || !strcmp(newname, "..") ||
> -	     strchr(newname, '/') ){
> +	if (!test_issubvolname(newname)) {
>  		fprintf(stderr, "ERROR: incorrect subvolume name '%s'\n",
>  			newname);
>  		goto out;
> @@ -302,8 +301,7 @@ again:
>  	vname = basename(dupvname);
>  	free(cpath);
>
> -	if (!strcmp(vname, ".") || !strcmp(vname, "..") ||
> -	     strchr(vname, '/')) {
> +	if (!test_issubvolname(vname)) {
>  		fprintf(stderr, "ERROR: incorrect subvolume name '%s'\n",
>  			vname);
>  		ret = 1;
> @@ -711,8 +709,7 @@ static int cmd_snapshot(int argc, char **argv)
>  		dstdir = dirname(dupdir);
>  	}
>
> -	if (!strcmp(newname, ".") || !strcmp(newname, "..") ||
> -	     strchr(newname, '/') ){
> +	if (!test_issubvolname(newname)) {
>  		fprintf(stderr, "ERROR: incorrect snapshot name '%s'\n",
>  			newname);
>  		goto out;
> diff --git a/utils.c b/utils.c
> index d61cbec..e2a2acd 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -2685,3 +2685,15 @@ int btrfs_read_sysfs(char path[PATH_MAX])
>  	close(fd);
>  	return atoi((const char *)val);
>  }
> +
> +/*
> + * test if name is a correct subvolume name
> + * this function return
> + * 0-> name is not a correct subvolume name
> + * 1-> name is a correct subvolume name
> + */
> +int test_issubvolname(char *name)

(const char *name) please. The string is not modified by the function.

> +{
> +	return name[0] != '\0' && !strchr(name, '/') &&
> +		strcmp(name, ".") && strcmp(name, "..");
> +}
> diff --git a/utils.h b/utils.h
> index 0c9b65f..d635c84 100644
> --- a/utils.h
> +++ b/utils.h
> @@ -133,6 +133,7 @@ int get_fslist(struct btrfs_ioctl_fslist **out_fslist, u64 *out_count);
>  int fsid_to_mntpt(__u8 *fsid, char *mntpt, int *mnt_cnt);
>
>  int test_minimum_size(const char *file, u32 leafsize);
> +int test_issubvolname(char *name);

And in the prototype to match.

>
>  /*
>   * Btrfs minimum size calculation is complicated, it should include at least:
> --
> 1.9.3

Thanks,
Mike
Re: Partition tables / Output of parted
On 4 June 2014 14:30, Russell Coker <russ...@coker.com.au> wrote:
> On Wed, 4 Jun 2014 13:19:16 Stefan Malte Schumacher wrote:
>> I have created multiple filesystems with btrfs, in all cases directly
>> on the devices themself without creating partitions beforehand.
>
> I do that sometimes, it works well. I've done the same thing with Ext2/3
> in the past as well, it's no big deal.
>
>> Now, if I open the disks containing the multi-device filesystem in
>> parted it outputs the partion table as loop and shows one partition
>> with btrfs which covers the whole disk.
>
> http://lists.alioth.debian.org/pipermail/parted-devel/2009-May/002840.html
>
> A Google search on "Partition Table: loop" turned up the above
> explanation as the third result.
>
>> I am unsure how to interpret this output. Two possible explanations
>> come to mind: a) Btrfs does create partitions, but only if a
>> filesystem spans multiple devices or b) the output of parted is faulty
>> and no actual partition is created in both cases.
>
> BTRFS doesn't create partitions.

c) Parted (libparted) is merely displaying a pretend loop partition table
as a way to represent a file system covering the whole disk, in its view
of the world where all disks have a partition table.

Mike
Re: [PATCH] btrfs-progs: Improve the parse_size() error message.
On 20 May 2014 08:51, Qu Wenruo quwen...@cn.fujitsu.com wrote:
> When using parse_size(), even non-numeric value is passed, it will only
> give error message "ERROR: size value is empty", which is quite confusing
> for end users.
>
> This patch will introduce more meaningful error message for the following
> new cases
> 1) Invalid size string (non-numeric string)
> 2) Minus size value (like -1K)
>
> Also this patch will take full use of endptr returned by strtoll() to
> reduce unneeded loop.
>
> Reported-by: Hidetoshi Seto seto.hideto...@jp.fujitsu.com
> Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
> ---
>  utils.c | 53 +++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 33 insertions(+), 20 deletions(-)
>
> diff --git a/utils.c b/utils.c
> index 392c5cf..1cbe102 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -1612,18 +1612,20 @@ scan_again:
>
>  u64 parse_size(char *s)
>  {
> -	int i;
>  	char c;
>  	u64 mult = 1;
> -
> -	for (i = 0; s && s[i] && isdigit(s[i]); i++) ;
> -	if (!i) {
> -		fprintf(stderr, "ERROR: size value is empty\n");
> -		exit(50);
> -	}
> -
> -	if (s[i]) {
> -		c = tolower(s[i]);
> +	long long int ret = 0;
> +	char *endptr;
> +
> +	if (!s)
> +		goto empty;
> +	ret = strtoll(s, &endptr, 10);
> +	if (endptr == s)
> +		goto invalid;
> +	if (endptr[0] && endptr[1])
> +		goto suffix;
> +	if (endptr[0]) {
> +		c = tolower(endptr[0]);
>  		switch (c) {
>  		case 'e':
>  			mult *= 1024;
> @@ -1646,18 +1648,29 @@ u64 parse_size(char *s)
>  		case 'b':
>  			break;
>  		default:
> -			fprintf(stderr, "ERROR: Unknown size descriptor "
> -				"'%c'\n", c);
> -			exit(1);
> +			goto descriptor;
>  		}
>  	}
> -	if (s[i] && s[i+1]) {
> -		fprintf(stderr, "ERROR: Illegal suffix contains "
> -			"character '%c' in wrong position\n",
> -			s[i+1]);
> -		exit(51);
> -	}
> -	return strtoull(s, NULL, 10) * mult;
> +	ret *= mult;
> +	if (ret <= 0)
> +		goto minus;
> +	return (u64) ret;
> +empty:
> +	fprintf(stderr, "ERROR: Size value is empty\n");
> +	exit(50);
> +suffix:
> +	fprintf(stderr, "ERROR: Illegal suffix contains character '%c' in wrong position\n",
> +		endptr[1]);
> +	exit(51);
> +descriptor:
> +	fprintf(stderr, "ERROR: Unknown size descriptor '%c'\n", c);
> +	exit(52);
> +invalid:
> +	fprintf(stderr, "ERROR: Size value '%s' is invalid\n",
> +		s);
> +	exit(53);
> +minus:
> +	fprintf(stderr, "ERROR: Size value '%s' is less equal than 0\n", s);
> +	exit(54);
>  }
>
>  int open_file_or_dir3(const char *fname, DIR **dirstream, int open_flags)
> --
> 1.9.2

IMHO this "if condition, goto, print error, exit" pattern makes the code
harder to read. Gotos are normally used when a function creates state
which needs tearing down in reverse order on error before returning. I
think the errors should be printed at the point of detection. Like this:

+	if (!s) {
+		fprintf(stderr, "ERROR: Size value is empty\n");
+		exit(1);
+	}
+	ret = strtoll(s, &endptr, 10);
+	if (endptr == s) {
+		fprintf(stderr, "ERROR: Size value '%s' is invalid\n", s);
+		exit(1);
+	}

etc.

Thanks,
Mike
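For illustration, here is a minimal, self-contained sketch of the "report the error where it is detected" style suggested above. It is not the btrfs-progs code: the name parse_size_sketch() is hypothetical, and it returns -1 instead of calling exit() purely so that it can be exercised from a test. The overflow check is also an addition that the patch under review does not have.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the btrfs-progs parse_size()): parses strings
 * such as "10", "4k" or "2M" into a byte count. Each error is reported at
 * the point where it is detected. Returns 0 on success, -1 on error.
 */
static int parse_size_sketch(const char *s, unsigned long long *out)
{
	char *endptr;
	long long value;
	unsigned long long mult = 1;

	if (!s || !*s) {
		fprintf(stderr, "ERROR: Size value is empty\n");
		return -1;
	}

	errno = 0;
	value = strtoll(s, &endptr, 10);
	if (endptr == s || errno == ERANGE) {
		fprintf(stderr, "ERROR: Size value '%s' is invalid\n", s);
		return -1;
	}
	if (value <= 0) {
		fprintf(stderr, "ERROR: Size value '%s' is not greater than 0\n", s);
		return -1;
	}
	if (endptr[0] && endptr[1]) {
		fprintf(stderr, "ERROR: Illegal suffix contains character '%c' in wrong position\n",
			endptr[1]);
		return -1;
	}

	/* Each case multiplies once and falls through to the smaller units. */
	switch (tolower((unsigned char)endptr[0])) {
	case 'e': mult *= 1024; /* fall through */
	case 'p': mult *= 1024; /* fall through */
	case 't': mult *= 1024; /* fall through */
	case 'g': mult *= 1024; /* fall through */
	case 'm': mult *= 1024; /* fall through */
	case 'k': mult *= 1024; /* fall through */
	case 'b':
	case '\0':
		break;
	default:
		fprintf(stderr, "ERROR: Unknown size descriptor '%c'\n", endptr[0]);
		return -1;
	}

	/* Extra check, not in the patch: refuse results that overflow. */
	if ((unsigned long long)value > ULLONG_MAX / mult) {
		fprintf(stderr, "ERROR: Size value '%s' is too large\n", s);
		return -1;
	}

	*out = (unsigned long long)value * mult;
	return 0;
}
```

With this shape no labels are needed, because the function holds no state that has to be torn down before returning.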
Re: [PATCH 00/14] Enhanced df - followup
On 29 April 2014 16:56, David Sterba dste...@suse.cz wrote:
> Changes:
> * btrfs filesystem disk_usage - renamed to usage
> * added a section of overall filesystem usage, that used to be in the
>   'fi df' output
> * btrfs device disk_usage - renamed to usage
> * the device size prints both block device size and the size that's
>   occupied by the filesystem
> * device ID is printed

>  cmds-fi-disk_usage.c | 631 +++
>  cmds-fi-disk_usage.h |  35 ++-

After renaming the commands "btrfs ... disk_usage" to "btrfs ... usage",
should the source files also be renamed to cmds-fi-usage.[ch]?

Thanks,
Mike
Re: [PATCH 07/14] btrfs-progs: Print more info about device sizes
On 29 April 2014 17:02, David Sterba dste...@suse.cz wrote:
> The entire device size may not be available to the filesystem, eg. if
> it's modified via resize. Print this information if it can be obtained
> from the DEV_INFO ioctl.
>
> Print the device ID on the same line as the device name and move size to
> the next line.
>
> Sample:
> /dev/sda7, ID: 3
>    Device size:            10.00GiB
>    FS occuppied:            5.00GiB

Spelling mistake. s/occuppied/occupied/.

>    Data,RAID10:           512.00MiB
>    Metadata,RAID10:       512.00MiB
>    System,RAID10:           4.00MiB
>    Unallocated:             9.00GiB
>
> Signed-off-by: David Sterba dste...@suse.cz
> ---
>  cmds-device.c        |  6 +++---
>  cmds-fi-disk_usage.c | 13 -
>  cmds-fi-disk_usage.h |  6 +-
>  3 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/cmds-device.c b/cmds-device.c
> index 7a9d808b36dd..519725f83e8c 100644
> --- a/cmds-device.c
> +++ b/cmds-device.c
> @@ -447,9 +447,9 @@ static int _cmd_device_usage(int fd, char *path, int mode)
>  	}
>
>  	for (i = 0; i < device_info_count; i++) {
> -		printf("%s\t%10s\n", device_info_ptr[i].path,
> -			df_pretty_sizes(device_info_ptr[i].size, mode));
> -
> +		printf("%s, ID: %llu\n", device_info_ptr[i].path,
> +				device_info_ptr[i].devid);
> +		print_device_sizes(fd, &device_info_ptr[i], mode);
>  		print_device_chunks(fd, device_info_ptr[i].devid,
>  				device_info_ptr[i].size,
>  				info_ptr, info_count,
> diff --git a/cmds-fi-disk_usage.c b/cmds-fi-disk_usage.c
> index 067c60078710..ddb064cc4c66 100644
> --- a/cmds-fi-disk_usage.c
> +++ b/cmds-fi-disk_usage.c
> @@ -499,7 +499,8 @@ int load_device_info(int fd, struct device_info **device_info_ptr,
>
>  		info[ndevs].devid = dev_info.devid;
>  		strcpy(info[ndevs].path, (char *)dev_info.path);
> -		info[ndevs].size = get_partition_size((char *)dev_info.path);
> +		info[ndevs].device_size = get_partition_size((char *)dev_info.path);
> +		info[ndevs].size = dev_info.total_size;
>  		++ndevs;
>  	}
>
> @@ -879,5 +880,15 @@ void print_device_chunks(int fd, u64 devid, u64 total_size,
>  	printf(" Unallocated: %*s%10s\n",
>  		(int)(20 - strlen("Unallocated")), "",
>  		df_pretty_sizes(total_size - allocated, mode));
> +}
>
> +void print_device_sizes(int fd, struct device_info *devinfo, int mode)
> +{
> +	printf(" Device size: %*s%10s\n",
> +		(int)(20 - strlen("Device size")), "",
> +		df_pretty_sizes(devinfo->device_size, mode));
> +	printf(" FS occuppied:%*s%10s\n",

Here too. s/occuppied/occupied/.

> +		(int)(20 - strlen("FS occupied")), "",
> +		df_pretty_sizes(devinfo->size, mode));
> + }
>  }
> diff --git a/cmds-fi-disk_usage.h b/cmds-fi-disk_usage.h
> index 787b4eb56acf..79cc2a115bc5 100644
> --- a/cmds-fi-disk_usage.h
> +++ b/cmds-fi-disk_usage.h
> @@ -27,7 +27,10 @@ int cmd_filesystem_usage(int argc, char **argv);
>  struct device_info {
>  	u64	devid;
>  	char	path[BTRFS_DEVICE_PATH_NAME_MAX];
> -	u64	size;
> +	/* Size of the block device */
> +	u64	device_size;
> +	/* Size that's occupied by the filesystem, can be changed via resize */
> +	u64	size;
>  };
>
>  /*
> @@ -50,5 +53,6 @@ char *df_pretty_sizes(u64 size, int mode);
>  void print_device_chunks(int fd, u64 devid, u64 total_size,
>  		struct chunk_info *chunks_info_ptr,
>  		int chunks_info_count, int mode);
> +void print_device_sizes(int fd, struct device_info *devinfo, int mode);
>
>  #endif
> --
> 1.9.0

The same spelling mistake (occuppied) also occurs in the following patches:
[PATCH 08/14] btrfs-progs: compare unallocated space against the correct value
[PATCH 12/14] btrfs-progs: replace df_pretty_sizes with pretty_size_mode

Thanks,
Mike
Re: How to identify if a partition containing a btrfs volume is mounted and where
On 25 February 2014 10:28, Mike Fleetwood mike.fleetw...@googlemail.com wrote:
> On 25 February 2014 03:01, Anand Jain anand.j...@oracle.com wrote:
>>> # mkfs.btrfs /dev/sdb2 /dev/sdb3 /dev/sdb4
>>> # mount /dev/sdb2 /mnt/1
>>> # btrfs device delete /dev/sdb2 /mnt/1
>>>
>>> So /dev/sdb2 is no longer part of the file system, but it's still
>>> mounted using it.
>>>
>>> # grep btrfs /proc/mounts
>>> /dev/sdb2 /mnt/1 btrfs rw,seclabel,relatime,ssd,space_cache 0 0
>>
>> This bug isn't there is the current btrfs-next.
>> I couldn't reproduce.
>
> I've tested this again. With linux 3.12.10 on Fedora 20 it is still
> reproducible, but after upgrading to linux 3.13.3 /proc/mounts
> automatically changes to show another device when the initial mounting
> one is removed from the btrfs.

Hi Anand,

I've done some more testing and stracing and would appreciate you
confirming that the following understanding is correct:

1) btrfs-tools >= 3.12 (possibly earlier) will always display correct
   results from "btrfs fs show" because it uses ioctls to get information
   from the kernel for each mounted btrfs, and reads the disks for
   non-mounted ones.
2) Old btrfs tools (tested with 0.20-rc1) only read the disks and so may
   get out of date information for mounted file systems, sometimes
   showing "*** Some devices missing".
3) Using "btrfs fs sync /mnt" makes the cached changes to device
   membership get flushed to disk, thus avoiding the stale data seen in (2).

So I plan to use /proc/mounts to determine FS busy status and
"btrfs fs show" to determine btrfs device membership. As per my previous
email and the details above, this will be correct for linux >= 3.13 and
current btrfs-tools, but for older kernels using /proc/mounts will be
wrong if the original mounting device has been removed from the btrfs.

Thanks,
Mike
Re: How to identify if a partition containing a btrfs volume is mounted and where
On 25 February 2014 03:01, Anand Jain anand.j...@oracle.com wrote:
>> # mkfs.btrfs /dev/sdb2 /dev/sdb3 /dev/sdb4
>> # mount /dev/sdb2 /mnt/1
>> # btrfs device delete /dev/sdb2 /mnt/1
>>
>> So /dev/sdb2 is no longer part of the file system, but it's still
>> mounted using it.
>>
>> # grep btrfs /proc/mounts
>> /dev/sdb2 /mnt/1 btrfs rw,seclabel,relatime,ssd,space_cache 0 0
>
> This bug isn't there is the current btrfs-next.
> I couldn't reproduce.

I've tested this again. With linux 3.12.10 on Fedora 20 it is still
reproducible, but after upgrading to linux 3.13.3 /proc/mounts
automatically changes to show another device when the initial mounting
one is removed from the btrfs.

Thanks,
Mike
How to identify if a partition containing a btrfs volume is mounted and where
Hi,

I am trying to enhance GParted (http://www.gparted.org/) to better
support btrfs, specifically multi-device file systems. GParted displays
the busy status (mounted or not) and the mount point of each partition.

For a single device file system this is easy. The entry in /proc/mounts
for the partition shows that it is mounted and provides the mount point.

In the general case for btrfs I don't know how to get from the name of a
device containing a btrfs volume to knowing whether it is mounted and
where. "btrfs filesystem show" can identify the devices in a btrfs, but
if the mounting device was removed from the file system this linkage is
broken.

# mkfs.btrfs /dev/sdb2 /dev/sdb3 /dev/sdb4
# mount /dev/sdb2 /mnt/1
# btrfs device delete /dev/sdb2 /mnt/1

So /dev/sdb2 is no longer part of the file system, but it's still
mounted using it.

# grep btrfs /proc/mounts
/dev/sdb2 /mnt/1 btrfs rw,seclabel,relatime,ssd,space_cache 0 0
# btrfs filesystem show /dev/sdb3
Label: none  uuid: d1e98472-e562-466c-8fa4-ddcaee757c20
	Total devices 2 FS bytes used 156.00KB
	devid    3 size 2.00GB used 961.56MB path /dev/sdb4
	devid    2 size 2.00GB used 552.00MB path /dev/sdb3

So is there a way to determine whether a specific partition containing a
btrfs volume is mounted, and on what mount point?

Thanks,
Mike
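As a rough sketch of the single-device case described above, the following looks up a device name in a mounts table using glibc's getmntent(). The function name find_mount_point_sketch() is hypothetical and this is not GParted code. It only does an exact match on the device field, so it inherits the limitation raised in this thread: if the mounting device was removed from a multi-device btrfs, the table may still name it.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan an already opened mounts table (for example
 * the stream returned by setmntent("/proc/mounts", "r")) for an entry
 * whose device field equals 'device'. On a match the mount point is
 * copied into 'out' and 0 is returned; otherwise -1 is returned.
 */
static int find_mount_point_sketch(FILE *mounts, const char *device,
				   char *out, size_t outlen)
{
	struct mntent *ent;

	while ((ent = getmntent(mounts)) != NULL) {
		if (strcmp(ent->mnt_fsname, device) != 0)
			continue;
		if (strlen(ent->mnt_dir) >= outlen)
			return -1;	/* caller's buffer is too small */
		strcpy(out, ent->mnt_dir);
		return 0;
	}
	return -1;	/* device not listed, so not mounted under this name */
}
```

A caller would typically open the table with setmntent("/proc/mounts", "r"), call the helper, and close the stream with endmntent().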
Re: [PATCH 2/4] btrfs-progs: fix segment fault when exec btrfs-debug-tree as non-root
On 20 February 2014 02:49, Gui Hecheng guihc.f...@cn.fujitsu.com wrote:
> When exec btrfs-debug-tree as non-root user, we get a segment fault.
> Because the btrfs_scan_block_devices return a success 0 when we fail
> to open a device. Now we just return the errno if this case happens.
>
> Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
> ---
>  utils.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/utils.c b/utils.c
> index 97e23d5..1878abc 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -1517,7 +1517,8 @@ scan_again:
>  		scans++;
>  		goto scan_again;
>  	}
> -	return 0;
> +
> +	return errno ? -errno : 0;
>  }
>
>  u64 parse_size(char *s)
> --
> 1.8.1.4

Hi Gui,

This strikes me as not right, because errno is only documented as being
set when open() returns -1 on failure. In the success case errno is not
set, so you can't assume it will be 0.

I think the following might work:
1) Initialise ret = 0 at the start of btrfs_scan_block_devices()
2) In the open(fullpath, O_RDONLY) failure case set ret = fd
3) return ret

Thanks,
Mike
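To make the point about errno concrete, here is a small sketch of a scan loop that records a failure at the place where open() fails and never consults errno after a successful call. It is not the btrfs-progs function: scan_paths_sketch() and its arguments are hypothetical, and as a variation on step 2 above it stores -errno, captured immediately after the failed open(), rather than the raw fd value.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a device scan loop. Returns 0 if every path
 * could be opened, otherwise the negated errno of the first failure.
 */
static int scan_paths_sketch(const char *const *paths, size_t count)
{
	int ret = 0;	/* step 1: start from success */
	size_t i;

	for (i = 0; i < count; i++) {
		int fd = open(paths[i], O_RDONLY);

		if (fd < 0) {
			/* errno is only meaningful right after the failed call */
			if (ret == 0)
				ret = -errno;
			continue;
		}
		close(fd);
	}
	return ret;	/* step 3: never derived from a stale errno */
}
```

Because the return value is built from ret alone, a stale errno left over from some earlier, unrelated call cannot turn a successful scan into a reported failure.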
Re: [PATCH v2 01/11] btrfs: Add barrier option to support -o remount,barrier
On 3 January 2014 06:10, Qu Wenruo quwen...@cn.fujitsu.com wrote:
> Btrfs can be remounted without barrier, but there is no barrier option
> so nobody can remount btrfs back with barrier on. Only umount and
> mount again can re-enable barrier.(Quite awkward)
>
> Also the mount options in the document is also changed slightly for the
> further pairing options changes.
>
> Reported-by: Daniel Blueman dan...@quora.org
> Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
> Cc: David Sterba dste...@suse.cz
> ---
> changelog:
> v1: Add barrier option
> v2: Change the document style to fit pairing options better
> ---
>  Documentation/filesystems/btrfs.txt | 13 +++--
>  fs/btrfs/super.c                    |  8 +++-
>  2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt
> index 5dd282d..2d2e016 100644
> --- a/Documentation/filesystems/btrfs.txt
> +++ b/Documentation/filesystems/btrfs.txt
> @@ -38,7 +38,7 @@ Mount Options
>  =
>
>  When mounting a btrfs filesystem, the following option are accepted.
> -Unless otherwise specified, all options default to off.
> +Options with (*) are default options and will not show in the mount options.
>
>    alloc_start=<bytes>
>  	Debugging option to force all block allocations above a certain
> @@ -138,12 +138,13 @@ Unless otherwise specified, all options default to off.
>  	Disable support for Posix Access Control Lists (ACLs). See the
>  	acl(5) manual page for more information about ACLs.
>
> +  barrier(*)
>    nobarrier
> -	Disables the use of block layer write barriers. Write barriers ensure
> -	that certain IOs make it through the device cache and are on persistent
> -	storage. If used on a device with a volatile (non-battery-backed)
> -	write-back cache, this option will lead to filesystem corruption on a
> -	system crash or power loss.
> +	Disable/enable the use of block layer write barriers. Write barriers

Please use "Enable/Disable ..." to match the order of the options,
barrier(*) then nobarrier, immediately above.

> +	ensure that certain IOs make it through the device cache and are on
> +	persistent storage. If used on a device with a volatile

And: "... If disabled on a device with a volatile", to make more sense
when both the enable and disable options are listed.

> +	(non-battery-backed) write-back cache, this option will lead to
> +	filesystem corruption on a system crash or power loss.
>
>    nodatacow
>  	Disable data copy-on-write for newly created files. Implies nodatasum,
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index e9c13fb..fe9d8a6 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -323,7 +323,7 @@ enum {
>  	Opt_no_space_cache, Opt_recovery, Opt_skip_balance,
>  	Opt_check_integrity, Opt_check_integrity_including_extent_data,
>  	Opt_check_integrity_print_mask, Opt_fatal_errors, Opt_rescan_uuid_tree,
> -	Opt_commit_interval,
> +	Opt_commit_interval, Opt_barrier,
>  	Opt_err,
>  };
>
> @@ -335,6 +335,7 @@ static match_table_t tokens = {
>  	{Opt_nodatasum, "nodatasum"},
>  	{Opt_nodatacow, "nodatacow"},
>  	{Opt_nobarrier, "nobarrier"},
> +	{Opt_barrier, "barrier"},
>  	{Opt_max_inline, "max_inline=%s"},
>  	{Opt_alloc_start, "alloc_start=%s"},
>  	{Opt_thread_pool, "thread_pool=%d"},
> @@ -494,6 +495,11 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
>  			btrfs_clear_opt(info->mount_opt, SSD);
>  			btrfs_clear_opt(info->mount_opt, SSD_SPREAD);
>  			break;
> +		case Opt_barrier:
> +			if (btrfs_test_opt(root, NOBARRIER))
> +				btrfs_info(root->fs_info, "turning on barriers");
> +			btrfs_clear_opt(info->mount_opt, NOBARRIER);
> +			break;
>  		case Opt_nobarrier:
>  			btrfs_info(root->fs_info, "turning off barriers");
>  			btrfs_set_opt(info->mount_opt, NOBARRIER);
> --
> 1.8.5.2

Thanks,
Mike
Re: [PATCH] btrfs-progs: add options for changing size representations
On 11 March 2013 10:12, Audrius Butkevicius audrius.butkevic...@elastichosts.com wrote: Add '--si', '-h'/'--human-readable' and '--block-size' global options, which allow users to customize the way sizes are displayed. Options and their format tries to mimic GNU ls utility. Signed-off-by: Audrius Butkevicius audrius.butkevic...@elastichosts.com --- btrfs.c |3 ++ utils.c | 146 +++ utils.h |6 +++ 3 files changed, 138 insertions(+), 17 deletions(-) diff --git a/btrfs.c b/btrfs.c index 691adef..6a8fc30 100644 --- a/btrfs.c +++ b/btrfs.c @@ -22,6 +22,8 @@ #include crc32c.h #include commands.h #include version.h +#include ctree.h +#include utils.h static const char * const btrfs_cmd_group_usage[] = { btrfs [--help] [--version] group [group...] command [args], @@ -291,6 +293,7 @@ int main(int argc, char **argv) crc32c_optimization_init(); + handle_size_unit_args(argc, argv); fixup_argv0(argv, cmd-token); exit(cmd-fn(argc, argv)); } diff --git a/utils.c b/utils.c index d660507..58c1919 100644 --- a/utils.c +++ b/utils.c @@ -16,6 +16,7 @@ * Boston, MA 021110-1307, USA. 
*/ +#define _GNU_SOURCE #define _XOPEN_SOURCE 700 #define __USE_XOPEN2K8 #define __XOPEN2K8 /* due to an error in dirent.h, to get dirfd() */ @@ -1095,33 +1096,144 @@ out: return ret; } -static char *size_strs[] = { , KB, MB, GB, TB, - PB, EB, ZB, YB}; +static int sizes_format = SIZES_FORMAT_BYTES; +static u64 sizes_divisor = 1; + +void remove_arg(int i, int *argc, char ***argv) +{ + while (i++ *argc) + (*argv)[i - 1] = (*argv)[i]; + (*argc)--; +} + +void handle_size_unit_args(int *argc, char ***argv) +{ + int k; + int base = 1024; + char *suffix; + char *block_size; + u64 value; + + for (k = *argc - 1; k = 0; k--) { +if (!strcmp((*argv)[k], -h) || +!strcmp((*argv)[k], --human-readable)) { + sizes_format = SIZES_FORMAT_HUMAN; + remove_arg(k, argc, argv); +} else if (!strcmp((*argv)[k], --si)) { + sizes_format = SIZES_FORMAT_SI; + remove_arg(k, argc, argv); +} else if (!strncmp((*argv)[k], --block-size, 12)) { + if (strlen((*argv)[k]) 14 || (*argv)[k][12] != '=') { + fprintf(stderr, +--block-size requires an argument\n); + exit(1); + } + + sizes_format = SIZES_FORMAT_BLOCK; + block_size = strchr((*argv)[k], '='); + + errno = 0; + value = strtoull(++block_size, suffix, 10); + if (errno == ERANGE value == ULLONG_MAX) { + fprintf(stderr, +--block-size argument '%s' too large\n, +block_size); + exit(1); + } + if (suffix == block_size) + value = 1; + + if (strlen(suffix) == 1 value 0) { + base = 1024; + } else if (strlen(suffix) == 2 suffix[1] == 'B' +value 0) { + base = 1000; + /* Allow non-zero values without a suffix */ + } else if (strlen(suffix) != 0 || value == 0) { + fprintf(stderr, +invalid --block-size argument '%s'\n, +block_size); + exit(1); + } + + if (strlen(suffix) 0) { + switch(suffix[0]) { + case 'E': + sizes_divisor *= base; + case 'P': + sizes_divisor *= base; + case 'T': + sizes_divisor *= base; + case 'G': + sizes_divisor *= base; + case 'M': + sizes_divisor *= base; + case 'K': + sizes_divisor *= base; + break; +
Re: lvm volume like support
On 25 February 2013 23:35, Suman C schakr...@gmail.com wrote:
> Hi,
>
> I think it would be great if there is a lvm volume or zfs zvol type
> support in btrfs. As far as I can tell, there's nobody actively
> working on this feature. I want to know what the core developers think
> of this feature, is it technically possible? any strong opinions?
> implementation ideas?
>
> I'd be happy to work towards this feature, but want your feedback
> before proceeding.

Btrfs already has the capability to add and remove block devices on the
fly. Data can be striped or mirrored or both. Raid 5/6 is in testing at
the moment.

https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
https://btrfs.wiki.kernel.org/index.php/UseCases#RAID

Which specific features do you think btrfs is lacking?

Mike
Re: [PATCH] btrfs-progs: add '-b' option to filesystem df and show
On 20 February 2013 13:05, Audrius Butkevicius audrius.butkevic...@elastichosts.com wrote:
> On 01/02/2013 10:30, Hugo Mills wrote:
>> On Fri, Feb 01, 2013 at 09:59:49AM, Audrius Butkevicius wrote:
>>> Add '-b' and '--bytes' options to btrfs filesystem df and show, for
>>> easier integration with scripts. This causes all sizes to be displayed
>>> in decimal bytes instead of pretty-printed with human-readable suffices
>>> KB, MB, etc.
>>
>> Please, not this way.
>
> Hi Hugo,
>
> Just wanted to check which approach you'd prefer to see me adopt:
>
> 1. Providing an option which is handled near the entry point (prior
>    going to commands), which would toggle a global flag to indicate the
>    format.
> 2. An option in every function which uses pretty sizes. (Though -B seems
>    to be used by scrub, -b is used by calc-size and mkfs utils, -u is
>    used by subvolume list and so on, meaning the option might have to be
>    different for different commands)
> 3. An environment variable BTRFS_UNITS, which when set to b[ytes],
>    changes the behaviour of pretty printing. Avoids having to touch the
>    multiple sets of ad-hoc option parsing code, but is perhaps a
>    slightly non-standard interface.
>
> Thanks,
> Audrius.

Just for reference, du and df use the following options:

-b, --bytes                (du only)
-h, --human-readable
-k                         (equivalent to --block-size=1K)
--block-size=n[SUFFIX]     (suffixes KB, MB, GB, ... mean 1000, 1000^2,
                           1000^3, ... and K, M, G, ... mean 1024,
                           1024^2, 1024^3, ...)

Mike
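The suffix convention listed above for --block-size can be sketched as follows. This is an illustration only, not code from coreutils or btrfs-progs: the function name suffix_multiplier_sketch() is hypothetical, and it handles only the K to E units with an optional trailing B, as described in the message.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: map a size suffix to its multiplier, following the
 * convention that K, M, G, ... are powers of 1024 while KB, MB, GB, ...
 * are powers of 1000. An empty suffix means a multiplier of 1.
 * Returns 0 on success and -1 for an unrecognised suffix.
 */
static int suffix_multiplier_sketch(const char *suffix, unsigned long long *mult)
{
	static const char units[] = "KMGTPE";
	unsigned long long base;
	unsigned long long result = 1;
	size_t len = strlen(suffix);
	const char *pos;
	int power;

	if (len == 0) {
		*mult = 1;
		return 0;
	}

	pos = strchr(units, suffix[0]);
	if (pos == NULL)
		return -1;

	if (len == 1)
		base = 1024;		/* "K", "M", ... */
	else if (len == 2 && suffix[1] == 'B')
		base = 1000;		/* "KB", "MB", ... */
	else
		return -1;

	/* K is base^1, M is base^2, and so on up to E as base^6. */
	for (power = (int)(pos - units) + 1; power > 0; power--)
		result *= base;

	*mult = result;
	return 0;
}
```

Both 1024^6 and 1000^6 fit in an unsigned 64-bit value, so no overflow check is needed for the units covered here.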
Re: [PATCH 2/3] Btrfs-progs Add make archive
On 22 January 2013 17:22, Gene Czarcinski g...@czarc.net wrote:
> This adds the archive target to the Makefile which simply
> executes do-archive.sh. It also adds the remove of btrfs-progs.spec.in to

I think you mean btrfs-progs.spec, without the .in, as that's the
generated file.

> make clean.
>
> Signed-off-by: Gene Czarcinski g...@czarc.net
> ---
>  Makefile | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile b/Makefile
> index d524f69..6812258 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -111,9 +111,13 @@ manpages:
>  install-man:
>  	cd man; $(MAKE) install
>
> +archive:
> +	bash do-archive.sh
> +
>  clean :
>  	rm -f $(progs) cscope.out *.o .*.d btrfs-convert btrfs-image btrfs-select-super \
> -	      btrfs-zero-log btrfstune dir-test ioctl-test quick-test version.h
> +	      btrfs-zero-log btrfstune dir-test ioctl-test quick-test version.h \
> +	      btrfs-progs.spec
>  	cd man; $(MAKE) clean
>
>  install: $(progs) install-man
> --
> 1.8.1

Mike
Re: How to remove btrfs information
On 17 November 2012 16:04, Goffredo Baroncelli kreij...@gmail.com wrote:
> On 11/17/2012 03:52 AM, Yangtse Su wrote:
>> I have a btrfs partition on /dev/sda5. Then I installed Windows 8 with
>> the Windows 8 installer. I removed this btrfs partition and installed
>> Windows 8 on it. Now in my Linux, 'btrfs filesystem show' still shows
>> /dev/sda5 as a btrfs partition, but it is now a Microsoft Reserved
>> partition, only 128MB.
>
> There was a patch about that [1] and David posted a perl script that
> does the same [2]. Moreover it seems that wipefs is also able to wipe
> out a btrfs super-block. I never tried it.
>
>> Here is some information: http://paste.ubuntu.org.cn/155508
>
> Please the next time put all these info in the email
>
> # blkid
> ...
> /dev/sda5: UUID=9c3e097a-bab0-4f18-b074-5cd2f081c8c7 UUID_SUB=ef0e296c-f554-415e-9aa9-31b1cb9aef31 TYPE=btrfs
> /dev/sda6: UUID=3AFE3D50FE3D0623 TYPE=ntfs
>
> # btrfs filesystem show
> Label: none  uuid: 9c3e097a-bab0-4f18-b074-5cd2f081c8c7
> 	Total devices 1 FS bytes used 284.00KB
> 	devid    1 size 33.48GB used 2.04GB path /dev/sda5
> ...
> Btrfs Btrfs v0.19
>
> # gdisk /dev/sda
> p
> Number  Start (sector)  End (sector)  Size        Code  Name
> ..
>    5    493025280       493287423     128.0 MiB   0C01  Microsoft reserved part
>    6    493287424       625141759     62.9 GiB    0700  Basic data partition
>
> [1] http://comments.gmane.org/gmane.comp.file-systems.btrfs/17065
> [2] http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16197.html

Yes, wipefs is the simplest method.

Check first:
# wipefs /dev/sda5

Do it second:
# wipefs -a /dev/sda5

Mike
Re: btrfs_print_tree?
On 1 July 2012 05:53, Zhi Yong Wu zwu.ker...@gmail.com wrote:
> HI,
>
> Do anyone know where btrfs_print_tree is invoked? thanks.
>
> --
> Regards,
>
> Zhi Yong Wu

Is this the answer you are after?

$ grep -r btrfs_print_tree fs/btrfs/
fs/btrfs/print-tree.c:void btrfs_print_tree(struct btrfs_root *root, struct extent_buffer *c)
fs/btrfs/print-tree.c:			btrfs_print_tree(root, next);
fs/btrfs/print-tree.h:void btrfs_print_tree(struct btrfs_root *root, struct extent_buffer *t);

Mike
Re: Interpreting Output of btrfs fi show
On 30 April 2012 18:10, Hubert Kario h...@qbs.com.pl wrote:
> On Sunday 29 of April 2012 08:13:48 Martin Steigerwald wrote:
>> On Thursday, 26 April 2012, Bart Noordervliet wrote:
>>> On Thu, Apr 26, 2012 at 11:06, Thomas Rohwer troh...@ennit.de wrote:
>>>>> As for the two filesystems shown in btrfs fi show... I have no
>>>>> clue what that is about. Did you maybe make a mistake to create a
>>>>> btrfs filesystem on the whole disk at first?
>>>>
>>>> That is possible. But afterwards I certainly repartitioned the
>>>> device and created a btrfs filesystem on /dev/sda1. Maybe this info
>>>> is only in the partition table? I understand that I should avoid
>>>> mounting /dev/sda in this situation.
>>>
>>> Well I think there is a btrfs superblock still present from the
>>> full-disk filesystem. Due to the offset of the first partition from
>>> the start of the disk, this superblock was not overwritten when you
>>> created the filesystem inside the partition. But they very much
>>> overlap and the full-disk superblock will probably eventually be
>>> overwritten by elements from the partition filesystem. How you would
>>> go about erasing the stale superblock and whether it is safe to do
>>> so I can't say though.
>>
>> There is the command wipefs. Whether it's safe to use here I do not
>> know. I wouldn't try without a backup.
>
> Sorry, but I'm unable to find it. Is it a `btrfs` tool option or is it
> a standalone application (in similar form as is the `btrfs-zero-log`)?

Google is your friend. wipefs has been part of util-linux since 2.17,
circa Jan-2010.

Mike
Re: [PATCH] Btrfs: Don't error on resizing FS to same size
On Fri, Nov 18, 2011 at 03:52:00PM +1100, Chris Samuel wrote:
> On 18/11/11 08:04, Mike Fleetwood wrote:
>
>> It seems overly harsh to fail a resize of a btrfs file system to the
>> same size when a shrink or grow would succeed. User app GParted trips
>> over this error. Allow it by bypassing the shrink or grow operation.
>
> OK - I'm a newbie with the code (and I'm looking at Linus's current
> git rather than any dev tree of Chris's), but...
>
>> Signed-off-by: Mike Fleetwood mike.fleetw...@googlemail.com
>> ---
> [...]
>> ---
>>  fs/btrfs/ioctl.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index dae5dfe..00b7024 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -1251,7 +1251,7 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
>>  		}
>>  		ret = btrfs_grow_device(trans, device, new_size);
>>  		btrfs_commit_transaction(trans, root);
>> -	} else {
>> +	} else if (new_size > old_size) {
>
> shouldn't that be:
>
> +	} else if (new_size < old_size) {
>
> otherwise you'll never try and shrink if new_size is < old_size..
>
>>  		ret = btrfs_shrink_device(device, new_size);
>>  	}

Chris, you're correct. I messed up a one-line patch by rushing. I will
send a corrected patch after some more testing!

Embarrassed,
Mike
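The three-way decision being discussed (grow, shrink, or nothing to do) can be sketched outside the kernel as follows. The enum and the function pick_resize_action() are hypothetical names for illustration; the real logic lives in btrfs_ioctl_resize() and operates on kernel structures.

```c
#include <stdint.h>

/* Hypothetical names, used only to illustrate the comparison. */
enum resize_action {
	RESIZE_NONE,	/* same size: succeed without doing any work */
	RESIZE_GROW,
	RESIZE_SHRINK
};

/* Decide what a resize request from old_size to new_size requires. */
static enum resize_action pick_resize_action(uint64_t old_size,
					     uint64_t new_size)
{
	if (new_size > old_size)
		return RESIZE_GROW;
	if (new_size < old_size)
		return RESIZE_SHRINK;	/* the branch the first patch got wrong */
	return RESIZE_NONE;
}
```

Writing the shrink test as "new_size < old_size" leaves equality as the only remaining case, which is exactly the one the patch wants to treat as a successful no-op.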
[PATCH v2] Btrfs: Don't error on resizing FS to same size
It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.

Signed-off-by: Mike Fleetwood mike.fleetw...@googlemail.com
---

v2: Fix FS shrink prevention error spotted by Chris Samuel

Example failed resize:

# strace -e trace=ioctl btrfs filesystem resize max /mnt/0
Resize '/mnt/0' of 'max'
ioctl(3, 0x50009403, 0xbfa5029c) = -1 EINVAL (Invalid argument)
ERROR: unable to resize '/mnt/0' - Invalid argument
# echo $?
30
# dmesg | tail -1
[426094.235018] new size for /dev/loop1 is 1073741824
---
 fs/btrfs/ioctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index dae5dfe..bd32cce 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1251,7 +1251,7 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 		}
 		ret = btrfs_grow_device(trans, device, new_size);
 		btrfs_commit_transaction(trans, root);
-	} else {
+	} else if (new_size < old_size) {
 		ret = btrfs_shrink_device(device, new_size);
 	}

--
1.7.4.4
[PATCH] Btrfs: Don't error on resizing FS to same size
It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.

Signed-off-by: Mike Fleetwood mike.fleetw...@googlemail.com
---

Example failed resize:

# strace -e trace=ioctl btrfs filesystem resize max /mnt/0
Resize '/mnt/0' of 'max'
ioctl(3, 0x50009403, 0xbfa5029c) = -1 EINVAL (Invalid argument)
ERROR: unable to resize '/mnt/0' - Invalid argument
# echo $?
30
# dmesg | tail -1
[426094.235018] new size for /dev/loop1 is 1073741824
---
 fs/btrfs/ioctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index dae5dfe..00b7024 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1251,7 +1251,7 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 		}
 		ret = btrfs_grow_device(trans, device, new_size);
 		btrfs_commit_transaction(trans, root);
-	} else {
+	} else if (new_size > old_size) {
 		ret = btrfs_shrink_device(device, new_size);
 	}

--
1.7.4.4
Re: BTRFS and power loss ~= corruption?
On 26 August 2011 07:37, Arne Jansen sensi...@gmx.net wrote:
On 26.08.2011 01:01, Gregory Maxwell wrote:
On Wed, Aug 24, 2011 at 9:11 AM, Berend Dekens bt...@cyberwizzard.nl wrote:

It seems to me that if someone created a block device which recorded
all write operations, a rather excellent test could be constructed where
a btrfs filesystem is recorded under load and then every partial replay
is mounted and checked for corruption/data loss.

This would result in high confidence that no power loss event could
destroy data given the offered load, assuming well behaved
(non-reordering) hardware. If it recorded barrier operations then a tool
could also try many (but probably not all) permissible reorderings at
every truncation offset.

I like the idea. Some more thoughts:

- instead of trying all reorderings it might be enough to just always
  deliver the oldest possible copy
- the order in which btrfs writes the data probably depends on the
  order in which the device acknowledges the requests. You might need to
  add some reordering there, too
- you need to produce a wide variety of workloads, as problems might
  only occur with a particular kind of workload (directIO, fsync,
  snapshots...)
- if there really is a regression somewhere, it would be good to also
  include the full block layer in the test, as the regression might not
  be in btrfs at all
- as a first small step one could just use blktrace to record the write
  order and analyze the order on mount as well

It seems to me that the existence of this kind of testing is something
that should be expected of a modern filesystem before it sees widescale
production use.

This article describes evaluating ext3, reiserfs and jfs by fault
injection with a custom Linux block device driver.

Model-Based Failure Analysis of Journaling File Systems
http://www.cs.wisc.edu/adsl/Publications/sfa-dsn05.pdf
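The record-and-replay idea discussed in this thread can be sketched on a toy scale without any block device at all. In the sketch below a "write log" is just a text file with one "OFFSET DATA" entry per line, every prefix of the log is replayed into a scratch image, and a checker is run on each result. All names (replay_prefix, check_all_prefixes, marker_implies_payload) and the log format are assumptions made for illustration; a real test would record writes below the file system and mount each replayed image. Reordering is not modelled here.

```shell
#!/bin/sh
# Toy model only: one recorded write per line, in the form "OFFSET DATA".

# replay_prefix LOG N IMAGE: rebuild IMAGE from only the first N writes,
# as if power had been lost right after write N.
replay_prefix() {
    rp_log=$1 rp_n=$2 rp_image=$3
    : > "$rp_image"
    [ "$rp_n" -gt 0 ] || return 0
    head -n "$rp_n" "$rp_log" | while read -r rp_off rp_data; do
        printf '%s' "$rp_data" |
            dd of="$rp_image" bs=1 seek="$rp_off" conv=notrunc 2>/dev/null
    done
}

# check_all_prefixes LOG CHECKER: replay every truncation point and run
# CHECKER (a command taking the image path) on each replayed image.
check_all_prefixes() {
    cap_log=$1 cap_checker=$2
    cap_total=$(($(wc -l < "$cap_log")))
    cap_image=$(mktemp) || return 1
    cap_n=0 cap_failures=0
    while [ "$cap_n" -le "$cap_total" ]; do
        replay_prefix "$cap_log" "$cap_n" "$cap_image"
        if ! "$cap_checker" "$cap_image"; then
            echo "inconsistent after $cap_n writes"
            cap_failures=$((cap_failures + 1))
        fi
        cap_n=$((cap_n + 1))
    done
    rm -f "$cap_image"
    [ "$cap_failures" -eq 0 ]
}

# Example invariant: if the 2-byte commit marker "OK" is present at offset 0,
# the 4-byte payload at offset 2 must already read "DATA".
marker_implies_payload() {
    mip_marker=$(dd if="$1" bs=1 count=2 2>/dev/null | tr -d '\000')
    mip_payload=$(dd if="$1" bs=1 skip=2 count=4 2>/dev/null | tr -d '\000')
    [ "$mip_marker" != "OK" ] || [ "$mip_payload" = "DATA" ]
}
```

A log that writes the payload before the marker passes at every truncation point, while a log that writes the marker first is flagged at the point where the marker exists without its payload.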
Re: cannot remove files: No space left on device
On 12 June 2011 00:32, Tomasz Chmielewski man...@wpkg.org wrote:

I'm trying to remove some files on a btrfs filesystem which has 26 GB
free:

/dev/sdb4 336G 310G 26G 93% /mnt/btrfs

Unfortunately, removing some of the files fails, due to No space left on
device:

root@dom:/mnt/btrfs# rm -rfv postgresql-noindex
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16508.6'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16508.7'
rm: cannot remove `postgresql-noindex/postgresql/8.4/main/base/16384/16508.8': No space left on device
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16508.9'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16508_fsm'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16511'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16513'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16516'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16516_fsm'
removed `postgresql-noindex/postgresql/8.4/main/base/16384/16521'

This is a snapshot of a different directory (with some changes).
Is it expected? I'm running 2.6.39.1 kernel.

--
Tomasz Chmielewski

Check out the btrfs FAQ about space usage:
https://btrfs.wiki.kernel.org/index.php/FAQ#Why_are_there_so_many_ways_to_check_the_amount_of_free_space.3F

and try these commands too:

btrfs filesystem df /mnt/btrfs
btrfs filesystem show /dev/sdb4

I'm no btrfs expert, but it's worth trying to delete one file and
syncing the fs, then repeating.

Mike
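The "delete one file, sync, repeat" suggestion can be scripted roughly as below. This is a sketch only: remove_stepwise is a hypothetical helper name, it assumes file names without newlines, and nothing here guarantees that syncing between deletions actually avoids the ENOSPC error on a given kernel.

```shell
#!/bin/sh
# remove_stepwise DIR: delete everything under DIR one entry at a time,
# calling sync after each removal so pending changes are committed before
# the next unlink is attempted. Hypothetical helper for illustration.
remove_stepwise() {
    find "$1" ! -type d | while IFS= read -r rs_path; do
        rm -f -- "$rs_path" || echo "could not remove: $rs_path" >&2
        sync
    done
    # Directories go last, deepest first; non-empty ones are left in place.
    find "$1" -depth -type d -exec rmdir {} \; 2>/dev/null
    # Succeed only if the whole tree is gone.
    [ ! -e "$1" ]
}
```

With the directory from the report this would be invoked as `remove_stepwise /mnt/btrfs/postgresql-noindex`; any file that still fails is reported on stderr and can be retried after checking the `btrfs filesystem df` output.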
Re: Atomic file data replace API
On 6 January 2011 20:01, Olaf van der Spek olafvds...@gmail.com wrote:

Hi,

Does btrfs support atomic file data replaces?

Hi Olaf,

Yes, btrfs does support atomic replace, since kernel 2.6.30 (circa June
2009). [1] Special handling was added to ext3, ext4, btrfs (and probably
other Linux FSs) for your replace-via-truncate and the alternative
replace-via-rename application patterns. Try reading the "Delayed
allocation and the zero-length file problem" article by Ted Ts'o, and
its comments, for further discussion. [2]

Mike
--
[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5a3f23d515a2ebf0c750db80579ca57b28cbce6d
[2] http://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/
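For reference, the replace-via-rename pattern mentioned above looks roughly like this from a shell script. It is a sketch under stated assumptions: atomic_replace is a hypothetical helper name, a whole-system sync stands in for an fsync of just the new file, and the temporary file is created by mktemp with mode 0600, so the original permissions are not preserved.

```shell
#!/bin/sh
# atomic_replace TARGET: replace TARGET with the data read from stdin using
# the replace-via-rename pattern. Hypothetical helper for illustration.
atomic_replace() {
    ar_target=$1
    ar_dir=$(dirname "$ar_target")
    # The temporary file must live on the same file system as TARGET so that
    # mv is a single rename(2) rather than a copy followed by a delete.
    ar_tmp=$(mktemp "$ar_dir/.replace.XXXXXX") || return 1
    # Write the new data, flush it to disk, then rename over the old file.
    if cat > "$ar_tmp" && sync && mv -f "$ar_tmp" "$ar_target"; then
        return 0
    fi
    rm -f "$ar_tmp"
    return 1
}
```

Typical use is `generate_config | atomic_replace /etc/example.conf`, where both the generator and the path are placeholders. Readers of the target see either the complete old contents or the complete new contents, never a partially written file.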
Re: Disk space accounting and subvolume delete
On 12 May 2010 06:02, Yan, Zheng yanzh...@21cn.com wrote:
On Tue, May 11, 2010 at 11:45 PM, Bruce Guenter br...@untroubled.org wrote:
On Tue, May 11, 2010 at 08:10:38AM +0800, Yan, Zheng wrote:

This is because the snapshot deleting ioctl only removes a link.

Right, I understand that. That part is not unexpected, as it works just
like unlink would. However...

The corresponding tree is dropped in the background by a kernel thread.

The surprise is that 'sync', in any form I was able to try, does not
wait until all or even most of the I/O is completed. Apparently the
standards spec for sync(2) says it is not required to wait for I/O to
complete, but AFAIK all other Linux FSs do wait (the man page for sync(2)
implies as much, as does the info page for sync in glibc). The only way
I've found so far to force this behavior is to unmount, and that's
rather intrusive to other users of the FS.

We could probably add another ioctl that waits until the tree has been
completely dropped.

Since the expected behavior for sync is to wait until all pending I/O
has been completed, I would argue this should be the default action for
sync. Am I misunderstanding something?

Dropping a tree can be lengthy. It's not good to let sync wait for
hours. For most Linux FSs, 'sync' just forces a transaction/journal
commit. I don't think they wait for large operations that can span
multiple transactions to complete.

Disclaimer: I know nothing about the internals of Btrfs!

I have an analogy as a way of thinking about what deleting a snapshot
entails (which I hope isn't totally bogus). Deleting a clone of a file
system is not like unlinking a single file. It is analogous to deleting
a directory tree. Syncing in the middle of a recursive delete will wait
for the in-flight I/O to complete, but it would not wait for the unlink
requests from the portion of the directory tree not yet traversed. The
same would be true when the kernel thread deletes the snapshot by
recursing through its tree.

Mike