[PATCH 1/1] btrfs: fix improper return value
In function btrfs_uuid_tree_iterate(), errno is assigned to variable ret on errors. However, it directly returns 0. It may be better to return ret. This patch also removes the warning, because the caller already prints a warning. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188731 Signed-off-by: Pan Bian--- fs/btrfs/uuid-tree.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c index 7fc89e4..83bb2f2 100644 --- a/fs/btrfs/uuid-tree.c +++ b/fs/btrfs/uuid-tree.c @@ -351,7 +351,5 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info, out: btrfs_free_path(path); - if (ret) - btrfs_warn(fs_info, "btrfs_uuid_tree_iterate failed %d", ret); - return 0; + return ret; } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: system hangs due to qgroups
On Sat, Dec 03, 2016 at 10:46:40PM +0100, Marc Joliet wrote: > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied > by mount. So I can't tell if there are dmesg entries, however, when this > happens during a normal running system, I never saw any dmesg entries. You can use "open" (might be named "openvt") to spawn a shell on tty2/tty3/etc. And if you have "screen" installed, Ctrl-a c spawns new terminals (Ctrl-a n/p/0-9 to switch). > The output of sysrq+t is too big to capture all of it (i.e., I can't scroll > back to the beginning) You may use netconsole to log everything kernel says to another machine. I can't provide you with the incantations from the top of my head (got working serial (far more reliable) on all my dev boxes, and it doesn't work with bridging ie containers on production), but as your rescue shell has no network sharing, assuming your network card driver supports a feature netconsole needs _and_ stars were aligned right when your network card was manufactured, netconsole is a valuable aid. The system might be not dead enough to stop userland network logging from getting through, too. Meow! -- The bill declaring Jesus as the King of Poland fails to specify whether the addition is at the top or end of the list of kings. What should the historians do? -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/18] btrfs: call functions that overwrite their root parameter with fs_info
On 12/2/16 5:32 PM, Omar Sandoval wrote: > On Fri, Dec 02, 2016 at 12:07:21AM -0500, je...@suse.com wrote: >> From: Jeff Mahoney>> >> There are 11 functions that accept a root parameter and immediately >> overwrite it. We can pass those an fs_info pointer instead. >> >> Signed-off-by: Jeff Mahoney >> --- >> fs/btrfs/ctree.h| 4 ++-- >> fs/btrfs/disk-io.c | 4 ++-- >> fs/btrfs/extent-tree.c | 17 +++--- >> fs/btrfs/file-item.c| 5 ++--- >> fs/btrfs/free-space-cache.c | 5 ++--- >> fs/btrfs/free-space-cache.h | 2 +- >> fs/btrfs/transaction.c | 9 >> fs/btrfs/tree-log.c | 6 ++--- >> fs/btrfs/volumes.c | 55 >> + >> fs/btrfs/volumes.h | 4 ++-- >> 10 files changed, 52 insertions(+), 59 deletions(-) >> > > [snip] > >> int btrfs_remove_chunk(struct btrfs_trans_handle *trans, >> - struct btrfs_root *root, u64 chunk_offset) >> + struct btrfs_fs_info *fs_info, u64 chunk_offset) >> { >> +struct btrfs_root *root = fs_info->chunk_root; >> struct extent_map_tree *em_tree; >> struct extent_map *em; >> struct btrfs_root *extent_root = root->fs_info->extent_root; >> @@ -2812,8 +2811,7 @@ int btrfs_remove_chunk(struct btrfs_trans_handle >> *trans, >> struct btrfs_fs_devices *fs_devices = root->fs_info->fs_devices; >> >> /* Just in case */ >> -root = root->fs_info->chunk_root; >> -em_tree = >fs_info->mapping_tree.map_tree; >> +em_tree = _info->mapping_tree.map_tree; > > I think this "Just in case" comment was referring to the reassignment of > the root, so it's just confusing now. Agreed. Thanks for the review. -Jeff -- Jeff Mahoney SUSE Labs signature.asc Description: OpenPGP digital signature
Re: system hangs due to qgroups
On Sat, Dec 3, 2016 at 2:46 PM, Marc Joliet <mar...@gmx.de> wrote: > On Saturday 03 December 2016 13:42:42 Chris Murphy wrote: >> On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <mar...@gmx.de> wrote: >> > Hello all, >> > >> > I'm having some trouble with btrfs on a laptop, possibly due to qgroups. >> > Specifically, some file system activities (e.g., snapshot creation, >> > baloo_file_extractor from KDE Plasma) cause the system to hang for up to >> > about 40 minutes, maybe more. >> >> Do you get any blocked tasks kernel messages? If so, issue sysrq+w >> during the hang, and then check the system log (dmesg may not contain >> everything if the command fills the message buffer). If it's a hang >> without any kernel messages, then issue sysrq+t. >> >> https://www.kernel.org/doc/Documentation/sysrq.txt > > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied by > mount. So I can't tell if there are dmesg entries, however, when this happens > during a normal running system, I never saw any dmesg entries. Anyway, I ran > both. OK so this is root fs? I would try to work on it from another volume. An advantage of openSUSE Tumbleweed is they claim to fully support qgroups, where upstream uses much more guarded language about its stability. Whereas last night's Fedora Rawhide has kernel 4.9-rc7 and btrfs-progs 4.8.5. https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20161203.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-netinst-x86_64-Rawhide-20161203.n.0.iso You can use dd to write the ISO to a USB stick, it supports BIOS and UEFI and Secure Boot. Troubleshooting > Rescue a Fedora system > option 3 to get to a shell The sysrq+t and sysrq+w can be written out in their entirety with monotonic time using 'journalctl -b -k -o short-monotonic > kernelmessages.log' Unfortunately this is not a live system, so you can't (as far as I know) install script to more easily capture everything to a single file; 'btrfs check > btrfscheck.log' should capture most of the output, but it misses a few early lines for some reason. And then scp those files to another system, or mount another stick and copy locally. > > Should I take photos? That'll be annoying to do with all the scrolling, but I > can do that if need be. I can't decipher it anyway, it's mainly for a dev who wanders across this thread or if you file a bug report. But you can get the complete output using the method above. > >> > After I next turned on the laptop, the balance resumed, causing bootup to >> > fail, after which I remembered about the skip_balance mount option, which >> > I >> > tried in a rescue shell from an initramfs. >> >> The file system is the root filesystem? If so, skip_balance may not be >> happening soon enough. Use kernel parameter rootflags=skip_balance >> which will apply this mount option at the very first moment the file >> system is mounted during boot. > > Yes, it's the root file system (there's that plus a swap partition). I > believe I tried rootflags, but I think it also failed, which is why I'm using > a rescue shell now. I can try it again, though, if anybody thinks that > there's no point in waiting, especially if btrfs_scrub_pause in the btrfs- > transaction call trace is significant. It sounds like it's resuming a scrub. That won't happen if you boot from an alternate volume. There's a scrub file found at /var/lib/btrfs/ that tracks the progress of scrubs for each btrfs volume - that directory with an inprogress scrub for your file system is actually in the directory on that file system. If you haven't had luck with btrfs scrub cancel, you can just remove the files in that directory when you get a chance to rw mount the volume. > >> > Since I couldn't use skip_balance, and logically can't destroy qgroups on >> > a >> > read-only file system, I decided to wait for a regular mount to finish. >> > That has been running since Tuesday, and I am slowly growing impatient. >> Haha, no kidding! I think that's very patient. > > Heh :) . I've still got my main desktop (as ancient as it may be), so I'm > content with waiting for now, but I don't want to wait forever, especially if > there might not even be a point. How big is the file system? Sounds like it's a single device volume on a laptop so I'm guessing at most 1TB, and that'd mean at most 100GiB of metadata, which should mean around 15 minutes max to completely read and process all the metadata, and maybe a few hours to do a scrub. I'd bail after a few hours for sure. > >> > Thus I arrive at my question(s): is there anything else I can try, short >> > of >> > reformatting and restoring from back
Re: system hangs due to qgroups
On Saturday 03 December 2016 13:42:42 Chris Murphy wrote: > On Sat, Dec 3, 2016 at 11:40 AM, Marc Jolietwrote: > > Hello all, > > > > I'm having some trouble with btrfs on a laptop, possibly due to qgroups. > > Specifically, some file system activities (e.g., snapshot creation, > > baloo_file_extractor from KDE Plasma) cause the system to hang for up to > > about 40 minutes, maybe more. > > Do you get any blocked tasks kernel messages? If so, issue sysrq+w > during the hang, and then check the system log (dmesg may not contain > everything if the command fills the message buffer). If it's a hang > without any kernel messages, then issue sysrq+t. > > https://www.kernel.org/doc/Documentation/sysrq.txt As it's a rescue shell, I have only the one shell AFAIK, and it's occupied by mount. So I can't tell if there are dmesg entries, however, when this happens during a normal running system, I never saw any dmesg entries. Anyway, I ran both. The output of sysrq+w mentions two tasks: "btrfs-transaction" with btrfs_scrub_pause+0xbe/0xd0 as the top-most entry in the call trace, and "mount" with its top-most entry at schedule+0x33/0x90 (it looks like it's still in the "early" processing, since there's also "btrfs_parse_early_options+0190/0x190" in the call trace). The output of sysrq+t is too big to capture all of it (i.e., I can't scroll back to the beginning), but just looking at the task names that I *can* see, there are: btrfs-fixup, various btrfs-endio*, btrfs-rmw, btrfs-freespace, btrfs-delayed-m (cut off), btrfs-readahead, btrfs-qgroup-re (cut off), btrfs- extent-re (cut off), btrfs-cleaner, and btrfs-transaction. Oh, and a bunch of kworkers. Should I take photos? That'll be annoying to do with all the scrolling, but I can do that if need be. > > After I next turned on the laptop, the balance resumed, causing bootup to > > fail, after which I remembered about the skip_balance mount option, which > > I > > tried in a rescue shell from an initramfs. > > The file system is the root filesystem? If so, skip_balance may not be > happening soon enough. Use kernel parameter rootflags=skip_balance > which will apply this mount option at the very first moment the file > system is mounted during boot. Yes, it's the root file system (there's that plus a swap partition). I believe I tried rootflags, but I think it also failed, which is why I'm using a rescue shell now. I can try it again, though, if anybody thinks that there's no point in waiting, especially if btrfs_scrub_pause in the btrfs- transaction call trace is significant. > > Since I couldn't use skip_balance, and logically can't destroy qgroups on > > a > > read-only file system, I decided to wait for a regular mount to finish. > > That has been running since Tuesday, and I am slowly growing impatient. > Haha, no kidding! I think that's very patient. Heh :) . I've still got my main desktop (as ancient as it may be), so I'm content with waiting for now, but I don't want to wait forever, especially if there might not even be a point. > > Thus I arrive at my question(s): is there anything else I can try, short > > of > > reformatting and restoring from backup? Can I use btrfs-check here, or > > any > > other tool? Or...? > > Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's > something wrong it should find it and if not that's a bug of its own. The initramfs has 4.8.4, but it looks like 4.8.5 was "only" an urgent bug fix, with no changes to qgroups handling, so I can use that, too. Can it repair qgroups problems, too? > > Also, should I be able to avoid reformatting: how do I properly disable > > quota support? > > 'btrfs quota disable' is the only command that applies to this and it > requires rw mount; there's no 'noquota' mount option. OK, thanks. So what should I try next? I'm sick at home, so I can spend more time on this than usual. -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
[PATCH] btrfs: limit async_work allocation and worker func duration
Problem statement: unprivileged user who has read-write access to more than one btrfs subvolume may easily consume all kernel memory (eventually triggering oom-killer). Reproducer (./mkrmdir below essentially loops over mkdir/rmdir): [root@kteam1 ~]# cat prep.sh DEV=/dev/sdb mkfs.btrfs -f $DEV mount $DEV /mnt for i in `seq 1 16` do mkdir /mnt/$i btrfs subvolume create /mnt/SV_$i ID=`btrfs subvolume list /mnt |grep "SV_$i$" |cut -d ' ' -f 2` mount -t btrfs -o subvolid=$ID $DEV /mnt/$i chmod a+rwx /mnt/$i done [root@kteam1 ~]# sh prep.sh [maxim@kteam1 ~]$ for i in `seq 1 16`; do ./mkrmdir /mnt/$i 2000 2000 & done [root@kteam1 ~]# for i in `seq 1 4`; do grep "kmalloc-128" /proc/slabinfo | grep -v dma; sleep 60; done kmalloc-12810144 10144128 321 : tunables000 : slabdata317317 0 kmalloc-128 9992352 9992352128 321 : tunables000 : slabdata 312261 312261 0 kmalloc-128 24226752 24226752128 321 : tunables000 : slabdata 757086 757086 0 kmalloc-128 42754240 42754240128 321 : tunables000 : slabdata 1336070 1336070 0 The huge numbers above come from insane number of async_work-s allocated and queued by btrfs_wq_run_delayed_node. The problem is caused by btrfs_wq_run_delayed_node() queuing more and more works if the number of delayed items is above BTRFS_DELAYED_BACKGROUND. The worker func (btrfs_async_run_delayed_root) processes at least BTRFS_DELAYED_BATCH items (if they are present in the list). So, the machinery works as expected while the list is almost empty. As soon as it is getting bigger, worker func starts to process more than one item at a time, it takes longer, and the chances to have async_works queued more than needed is getting higher. The problem above is worsened by another flaw of delayed-inode implementation: if async_work was queued in a throttling branch (number of items >= BTRFS_DELAYED_WRITEBACK), corresponding worker func won't quit until the number of items < BTRFS_DELAYED_BACKGROUND / 2. So, it is possible that the func occupies CPU infinitely (up to 30sec in my experiments): while the func is trying to drain the list, the user activity may add more and more items to the list. The patch fixes both problems in straightforward way: refuse queuing too many works in btrfs_wq_run_delayed_node and bail out of worker func if at least BTRFS_DELAYED_WRITEBACK items are processed. Signed-off-by: Maxim Patlasov--- fs/btrfs/async-thread.c |8 fs/btrfs/async-thread.h |1 + fs/btrfs/delayed-inode.c |6 -- 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c index e0f071f..29f6252 100644 --- a/fs/btrfs/async-thread.c +++ b/fs/btrfs/async-thread.c @@ -86,6 +86,14 @@ btrfs_work_owner(struct btrfs_work *work) return work->wq->fs_info; } +bool btrfs_workqueue_normal_congested(struct btrfs_workqueue *wq) +{ + int thresh = wq->normal->thresh != NO_THRESHOLD ? + wq->normal->thresh : num_possible_cpus(); + + return atomic_read(>normal->pending) > thresh * 2; +} + BTRFS_WORK_HELPER(worker_helper); BTRFS_WORK_HELPER(delalloc_helper); BTRFS_WORK_HELPER(flush_delalloc_helper); diff --git a/fs/btrfs/async-thread.h b/fs/btrfs/async-thread.h index 8e52484..1f95973 100644 --- a/fs/btrfs/async-thread.h +++ b/fs/btrfs/async-thread.h @@ -84,4 +84,5 @@ void btrfs_workqueue_set_max(struct btrfs_workqueue *wq, int max); void btrfs_set_work_high_priority(struct btrfs_work *work); struct btrfs_fs_info *btrfs_work_owner(struct btrfs_work *work); struct btrfs_fs_info *btrfs_workqueue_owner(struct __btrfs_workqueue *wq); +bool btrfs_workqueue_normal_congested(struct btrfs_workqueue *wq); #endif diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 3eeb9cd..de946dd 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1356,7 +1356,8 @@ release_path: total_done++; btrfs_release_prepared_delayed_node(delayed_node); - if (async_work->nr == 0 || total_done < async_work->nr) + if ((async_work->nr == 0 && total_done < BTRFS_DELAYED_WRITEBACK) || + total_done < async_work->nr) goto again; free_path: @@ -1372,7 +1373,8 @@ static int btrfs_wq_run_delayed_node(struct btrfs_delayed_root *delayed_root, { struct btrfs_async_delayed_work *async_work; - if (atomic_read(_root->items) < BTRFS_DELAYED_BACKGROUND) + if (atomic_read(_root->items) < BTRFS_DELAYED_BACKGROUND || + btrfs_workqueue_normal_congested(fs_info->delayed_workers)) return 0; async_work = kmalloc(sizeof(*async_work), GFP_NOFS); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at
Re: system hangs due to qgroups
On Sat, Dec 3, 2016 at 11:40 AM, Marc Jolietwrote: > Hello all, > > I'm having some trouble with btrfs on a laptop, possibly due to qgroups. > Specifically, some file system activities (e.g., snapshot creation, > baloo_file_extractor from KDE Plasma) cause the system to hang for up to about > 40 minutes, maybe more. Do you get any blocked tasks kernel messages? If so, issue sysrq+w during the hang, and then check the system log (dmesg may not contain everything if the command fills the message buffer). If it's a hang without any kernel messages, then issue sysrq+t. https://www.kernel.org/doc/Documentation/sysrq.txt > > After I next turned on the laptop, the balance resumed, causing bootup to > fail, after which I remembered about the skip_balance mount option, which I > tried in a rescue shell from an initramfs. The file system is the root filesystem? If so, skip_balance may not be happening soon enough. Use kernel parameter rootflags=skip_balance which will apply this mount option at the very first moment the file system is mounted during boot. > Since I couldn't use skip_balance, and logically can't destroy qgroups on a > read-only file system, I decided to wait for a regular mount to finish. That > has been running since Tuesday, and I am slowly growing impatient. Haha, no kidding! I think that's very patient. > Thus I arrive at my question(s): is there anything else I can try, short of > reformatting and restoring from backup? Can I use btrfs-check here, or any > other tool? Or...? Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's something wrong it should find it and if not that's a bug of its own. > Also, should I be able to avoid reformatting: how do I properly disable quota > support? 'btrfs quota disable' is the only command that applies to this and it requires rw mount; there's no 'noquota' mount option. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] btrfs-progs: utils: negative numbers are more plausible than sizes over 8 EiB
I got tired of seeing "16.00EiB" whenever btrfs-progs encounters a negative size value, e.g. during resize: Unallocated: /dev/mapper/datamd18 16.00EiB This version is much more useful: Unallocated: /dev/mapper/datamd18 -26.29GiB Signed-off-by: Zygo Blaxell--- v2: change the function prototype so it's easier to see that the mangling implied by the name "pretty" includes "reinterpretation of the u64 value as a signed quantity." --- utils.c | 12 ++-- utils.h | 4 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/utils.c b/utils.c index 69b580a..07e8443 100644 --- a/utils.c +++ b/utils.c @@ -2575,7 +2575,7 @@ out: * Note: this function uses a static per-thread buffer. Do not call this * function more than 10 times within one argument list! */ -const char *pretty_size_mode(u64 size, unsigned mode) +const char *pretty_size_mode(s64 size, unsigned mode) { static __thread int ps_index = 0; static __thread char ps_array[10][32]; @@ -2594,20 +2594,20 @@ static const char* unit_suffix_binary[] = static const char* unit_suffix_decimal[] = { "B", "kB", "MB", "GB", "TB", "PB", "EB"}; -int pretty_size_snprintf(u64 size, char *str, size_t str_size, unsigned unit_mode) +int pretty_size_snprintf(s64 size, char *str, size_t str_size, unsigned unit_mode) { int num_divs; float fraction; - u64 base = 0; + s64 base = 0; int mult = 0; const char** suffix = NULL; - u64 last_size; + s64 last_size; if (str_size == 0) return 0; if ((unit_mode & ~UNITS_MODE_MASK) == UNITS_RAW) { - snprintf(str, str_size, "%llu", size); + snprintf(str, str_size, "%lld", size); return 0; } @@ -2642,7 +2642,7 @@ int pretty_size_snprintf(u64 size, char *str, size_t str_size, unsigned unit_mod num_divs = 0; break; default: - while (size >= mult) { + while ((size < 0 ? -size : size) >= mult) { last_size = size; size /= mult; num_divs++; diff --git a/utils.h b/utils.h index 366ca29..525bde9 100644 --- a/utils.h +++ b/utils.h @@ -174,9 +174,9 @@ int check_mounted_where(int fd, const char *file, char *where, int size, int btrfs_device_already_in_root(struct btrfs_root *root, int fd, int super_offset); -int pretty_size_snprintf(u64 size, char *str, size_t str_bytes, unsigned unit_mode); +int pretty_size_snprintf(s64 size, char *str, size_t str_bytes, unsigned unit_mode); #define pretty_size(size) pretty_size_mode(size, UNITS_DEFAULT) -const char *pretty_size_mode(u64 size, unsigned mode); +const char *pretty_size_mode(s64 size, unsigned mode); u64 parse_size(char *s); u64 parse_qgroupid(const char *p); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: utils: negative numbers are more plausible than sizes over 8 EiB
On Sat, Dec 03, 2016 at 10:25:17AM -0800, Omar Sandoval wrote: > On Sat, Dec 03, 2016 at 01:19:38AM -0500, Zygo Blaxell wrote: > > I got tired of seeing "16.00EiB" whenever btrfs-progs encounters a > > negative size value. > > > > e.g. during filesystem shrink we see: > > > > Unallocated: > >/dev/mapper/testvol0 16.00EiB > > > > Interpreting this as a signed quantity is much more useful: > > > > Unallocated: > >/dev/mapper/testvol0 -26.29GiB > > > > Signed-off-by: Zygo Blaxell> > --- > > utils.c | 13 - > > 1 file changed, 8 insertions(+), 5 deletions(-) > > > > diff --git a/utils.c b/utils.c > > index 69b580a..bd2b66e 100644 > > --- a/utils.c > > +++ b/utils.c > > @@ -2594,20 +2594,23 @@ static const char* unit_suffix_binary[] = > > static const char* unit_suffix_decimal[] = > > { "B", "kB", "MB", "GB", "TB", "PB", "EB"}; > > > > -int pretty_size_snprintf(u64 size, char *str, size_t str_size, unsigned > > unit_mode) > > +int pretty_size_snprintf(u64 usize, char *str, size_t str_size, unsigned > > unit_mode) > > { > > int num_divs; > > float fraction; > > - u64 base = 0; > > + s64 base = 0; > > int mult = 0; > > const char** suffix = NULL; > > - u64 last_size; > > + s64 last_size; > > > > if (str_size == 0) > > return 0; > > > > + /* Negative numbers are more plausible than sizes over 8 EiB. */ > > + s64 size = (s64)usize; > > Just make pretty_size_snprintf() take an s64 size so it's clear from the > function signature that it's signed instead of hidden in the definition. I intentionally buried the unsigned -> signed conversion in the lowest level function so I wouldn't trigger signed/unsigned conversion warnings at all 46 call sites for pretty_size_mode. The btrfs code uses u64 endemically for all size data, and I wasn't about to try to change that. The word "pretty" in the function name should imply that what comes out is a possibly lossy transformation of what goes in. Since "16.00EiB" is much more lossy than "-29.96GiB", I believe I am merely reducing the lossiness quantitatively rather than qualitatively. On the other hand, the signed/unsigned warning isn't enabled by default in this project. I can certainly do it that way if you prefer. > > + > > if ((unit_mode & ~UNITS_MODE_MASK) == UNITS_RAW) { > > - snprintf(str, str_size, "%llu", size); > > + snprintf(str, str_size, "%lld", size); > > return 0; > > } > > > > @@ -2642,7 +2645,7 @@ int pretty_size_snprintf(u64 size, char *str, size_t > > str_size, unsigned unit_mod > >num_divs = 0; > >break; > > default: > > - while (size >= mult) { > > + while ((size < 0 ? -size : size) >= mult) { > > last_size = size; > > size /= mult; > > num_divs++; > > -- > > 2.1.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > > the body of a message to majord...@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > signature.asc Description: Digital signature
Re: missing checksums on reboot
On Fri, Dec 2, 2016 at 1:36 PM, Blake Lewiswrote: > Well, 3.10 is what you get with the RHEL7.x distributions, so that's > why people are running it. > Apparently, it is "good enough" for many purposes. Nope. Btrfs in RHEL 7 is a technology preview, and ... "Technology Preview features are currently unsupported, may not be functionally complete, and are not suitable for deployment in production. " > My real goal here is to understand the scope of the bug and whether > any mitigation is > possible. Of course, I don't expect anyone else to make a patch for > me (or even to accept > my patch), but if I knew what the bug is and what was done to fix it, > I'd be in a much better > position to decide what to do. If anyone can shed any light on this, > I'd be very grateful. https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/diff/?id=v4.8.12=v3.19=2 There's more than 1 lines changed between 3.19 and 4.8.12. You're in the right place, but it's a shot in the dark if any developer remembers the particular problem you're having compared to the massive pile of bugs fixed and changes made since then. The RHEL 3.10 kernel as I understand it has v3.19 Btrfs code. You might test with a 4.1.36 kernel to see if your problem reproduces. There was some O_DIRECTbug fixes a while back but I have no idea what kernel versions those were. If the fix is relatively recent, it simply may not have been backported very far because kernels go EOL and backporting just doesn't happen, not that it can't be done. So you might search the list for O_DIRECT, and see if anything looks familiar. If that doesn't pan out you can start doing simple bisects with longterm kernel.org kernels, starting with the oldest (which is where I got 4.1.36) and see if you can find a commit that fixes the problem, and then if it applies cleanly to the much older kernel you're using. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] btrfs: fix improper return value
On Sat, Dec 03, 2016 at 06:55:16PM +0800, Pan Bian wrote: > In function btrfs_uuid_tree_iterate(), errno is assigned to variable > ret on errors. However, it directly returns 0. It may be better to > return ret. > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188731 > > Signed-off-by: Pan Bian> --- > fs/btrfs/uuid-tree.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c > index 7fc89e4..44bcc1f 100644 > --- a/fs/btrfs/uuid-tree.c > +++ b/fs/btrfs/uuid-tree.c > @@ -353,5 +353,5 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info, > btrfs_free_path(path); > if (ret) > btrfs_warn(fs_info, "btrfs_uuid_tree_iterate failed %d", ret); > - return 0; > + return ret; This one makes sense, since the caller is checking the return value. The caller is already printing a warning, can you also get rid of the btrfs_warn(fs_info, "btrfs_uuid_tree_iterate failed %d", ret)? -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
system hangs due to qgroups
Hello all, I'm having some trouble with btrfs on a laptop, possibly due to qgroups. Specifically, some file system activities (e.g., snapshot creation, baloo_file_extractor from KDE Plasma) cause the system to hang for up to about 40 minutes, maybe more. It always causes (most of) my desktop to hang, (although I can usually navigate between pre-existing Konsole tabs) and prevents new programs from starting. I've seen the system load go up to >30 before the laptop suddenly resumes normal operation. I've been seeing this since Linux 4.7, maybe already 4.6. Now, I thought that maybe this was (indirectly) due to an overly full file system (~90% full), so I deleted some things I didn't need to get it up to 15% free. (For the record, I also tried mounting with ssd_spread.) After that, I ran a balance with -dusage=50, which started out promising, but then went back to the "bad" behaviour. *But* it seemed better than before overall, so I started a balance with -musage=10, then -musage=50. That turned out to be a mistake. Since I had to transport the laptop, and couldn't wait for "balance cancel" to return (IIUC it only returns after the next block (group?) is freed), I forced the laptop off. After I next turned on the laptop, the balance resumed, causing bootup to fail, after which I remembered about the skip_balance mount option, which I tried in a rescue shell from an initramfs. But wait, that failed, too! Specifically, the stack trace I get whenever I try it includes as one of the last lines: "RIP [] qgroup_fix_relocated_data_extents+0x1f/0x2a8" (I can take photos of the full stack trace if requested.) So then I ran "btrfs qgroup show /sysroot/", which showed many quota groups, much to my surprise. On the upside, at least now I discovered the likely reason for the performance problems. (I actually think I know why I'm seeing qgroups: at one point I was trying out various snapshot/backup tools for btrfs, and one (I forgot which) unconditionally activated quota support, which infuriated me, but I promptly deactivated it, or so I thought. Is quota support automatically enabled when qgroups are discovered, or did I perhaps not disable quota support properly?) Since I couldn't use skip_balance, and logically can't destroy qgroups on a read-only file system, I decided to wait for a regular mount to finish. That has been running since Tuesday, and I am slowly growing impatient. Thus I arrive at my question(s): is there anything else I can try, short of reformatting and restoring from backup? Can I use btrfs-check here, or any other tool? Or...? Also, should I be able to avoid reformatting: how do I properly disable quota support? (BTW, searching for qgroup_fix_relocated_data_extents turned up the ML thread "[PATCH] Btrfs: fix endless loop in balancing block groups", could that be related?) The laptop is currently running Gentoo with Linux 4.8.10 and btrfs-progs 4.8.4. Greetings -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
Re: [PATCH 1/1] btrfs: volumes: fix improper return value
On Sat, Dec 03, 2016 at 07:01:45PM +0800, Pan Bian wrote: > Variable ret takes the errno on failures. However, it directly returns 0. > It may be better to return "ret". > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188741 > > Signed-off-by: Pan Bian> --- > fs/btrfs/volumes.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 71a60cc..32fd903 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -4222,7 +4222,7 @@ static int btrfs_uuid_scan_kthread(void *data) > else > set_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, _info->flags); > up(_info->uuid_tree_rescan_sem); > - return 0; > + return ret; This is a kthread, no one checks the return value. We already let the user know if it fails: btrfs_warn(fs_info, "btrfs_uuid_scan_kthread failed %d", ret); > } > > /* > -- > 1.9.1 > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: utils: negative numbers are more plausible than sizes over 8 EiB
On Sat, Dec 03, 2016 at 01:19:38AM -0500, Zygo Blaxell wrote: > I got tired of seeing "16.00EiB" whenever btrfs-progs encounters a > negative size value. > > e.g. during filesystem shrink we see: > > Unallocated: >/dev/mapper/testvol0 16.00EiB > > Interpreting this as a signed quantity is much more useful: > > Unallocated: >/dev/mapper/testvol0 -26.29GiB > > Signed-off-by: Zygo Blaxell> --- > utils.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/utils.c b/utils.c > index 69b580a..bd2b66e 100644 > --- a/utils.c > +++ b/utils.c > @@ -2594,20 +2594,23 @@ static const char* unit_suffix_binary[] = > static const char* unit_suffix_decimal[] = > { "B", "kB", "MB", "GB", "TB", "PB", "EB"}; > > -int pretty_size_snprintf(u64 size, char *str, size_t str_size, unsigned > unit_mode) > +int pretty_size_snprintf(u64 usize, char *str, size_t str_size, unsigned > unit_mode) > { > int num_divs; > float fraction; > - u64 base = 0; > + s64 base = 0; > int mult = 0; > const char** suffix = NULL; > - u64 last_size; > + s64 last_size; > > if (str_size == 0) > return 0; > > + /* Negative numbers are more plausible than sizes over 8 EiB. */ > + s64 size = (s64)usize; Just make pretty_size_snprintf() take an s64 size so it's clear from the function signature that it's signed instead of hidden in the definition. > + > if ((unit_mode & ~UNITS_MODE_MASK) == UNITS_RAW) { > - snprintf(str, str_size, "%llu", size); > + snprintf(str, str_size, "%lld", size); > return 0; > } > > @@ -2642,7 +2645,7 @@ int pretty_size_snprintf(u64 size, char *str, size_t > str_size, unsigned unit_mod > num_divs = 0; > break; > default: > - while (size >= mult) { > + while ((size < 0 ? -size : size) >= mult) { > last_size = size; > size /= mult; > num_divs++; > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs_destroy_inode warn (outstanding extents)
On Thu, Dec 01, 2016 at 10:32:09AM -0500, Dave Jones wrote: > http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents.txt > > Also same bug, different run, but a different traceview > http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents-function-graph.txt > > (function-graph screws up the RIP for some reason, 'return_to_handler' > should actually be btrfs_destroy_inode) Chris pointed me at a pending patch that took care of this warning. > Anyways, I've got some code that works pretty well for dumping the > ftrace buffer now when things go awry. I just need to run it enough > times that I hit that bad page state instead of this, or a lockdep bug first. Which allowed me to run long enough to get this trace.. http://codemonkey.org.uk/junk/bad-page-state.txt Does this shed any light ? The interesting process here seems to be kworker/u8:17, and the trace captures some of what that was doing before that bad page was hit. I've got another run going now, so I'll compare that trace when it happens to see if it matches my current theory that it's something to do with that btrfs_scrubparity_helper. I've seen that show up in stack traces a few times while chasing this. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
`btrfs check --repair` stuck in a loop // filesystem repair case study
Hi, I have a btrfs filesystem on a 4TB HDD connected with USB 2.0. Some time ago I accidentally disconnected the drive while doing heavy writes. After reconnecting it seemed like the filesystem still works (it mounted fine and I could read some files chosen at random), but I ran `btrfs scrub` to be sure. `btrfs scrub` aborted itself after ~20 hours, after reading ~3.5TB of data. `dmesg` contained a single line: #v+ BTRFS error (device dm-0): bad tree block start 0 3527021166592 #v- I couldn't find any further details anywhere in logs. I assume this means that some data have actually been lost from this filesystem. I have backups of data from this drive, so I decided to play a little trying out btrfs recovery strategies. I checked whether there are any bad blocks on the raw device — all blocks were read successfully. I created a devicemapper snapshot/overlay to keep the raw device data read only and track the changes made by any recovery procedures. I ran `btrfstune -u` on the overlay to avoid having two devices with the same uuid. This was done using a dedicated VM which did not see the raw device (suggested by `Ke` on IRC). BTW, this command resulted in the overlay device growing by ~25GB, which IIUC means that around 6M 4096-byte blocks were changed in the process (is that expected?). I was recommended to run `btrfs check`. The result is here: [1] (323 lines of output), and IIRC it finished in few hours. [1] https://gist.github.com/liori/f8c5e69677e8c9d6038d2e3e4db9aa42 (5 data checksum errors are a preexisting condition, I knew about them before the incident). I then started `btrfs check --repair`. This was about a week ago, and it is still going. The partial output is here: [2] (already almost 18k lines). The same problems are being found again and again in a loop, as if it was stuck. [2] https://gist.github.com/liori/01494afbe63cd19ba49be663be937d84 I do observe that the ctime of the overlay file is updated every once a while, but the file itself does not grow anymore after some initial change of ~70k blocks. My interpretation is that even if the repair process writes anything, it only keeps writing in the same places again and again. I did not have any snapshots on this filesystem. I did have some deduplicated content, but no more than 4 copies of any data block, and deduplication resulted in saving ~1TB of space total. The device was never a part of a multi-device setup. Is there anything more I can do with this filesystem to bring it to a state where I can `btrfs scrub` it, know what have been lost, etc? Is this behavior of `btrfs scrub --repair` expected and will it ever finish? Thank you, -- Tomasz Melcer -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] btrfs: volumes: fix improper return value
Variable ret takes the errno on failures. However, it directly returns 0. It may be better to return "ret". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188741 Signed-off-by: Pan Bian--- fs/btrfs/volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 71a60cc..32fd903 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -4222,7 +4222,7 @@ static int btrfs_uuid_scan_kthread(void *data) else set_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, _info->flags); up(_info->uuid_tree_rescan_sem); - return 0; + return ret; } /* -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] btrfs: fix improper return value
In function btrfs_uuid_tree_iterate(), errno is assigned to variable ret on errors. However, it directly returns 0. It may be better to return ret. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188731 Signed-off-by: Pan Bian--- fs/btrfs/uuid-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c index 7fc89e4..44bcc1f 100644 --- a/fs/btrfs/uuid-tree.c +++ b/fs/btrfs/uuid-tree.c @@ -353,5 +353,5 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info, btrfs_free_path(path); if (ret) btrfs_warn(fs_info, "btrfs_uuid_tree_iterate failed %d", ret); - return 0; + return ret; } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html