Re: [PATCH 13/20] btrfs-progs: use cmd_struct as command entry point
On 3/7/18 9:40 PM, je...@suse.com wrote: > diff --git a/cmds-filesystem.c b/cmds-filesystem.c > index 62112705..ec038f2f 100644 > --- a/cmds-filesystem.c > +++ b/cmds-filesystem.c > @@ -1075,6 +1078,7 @@ next: > > return !!defrag_global_errors; > } > +static DEFINE_SIMPLE_COMMAND(filesystem_defrag, "defrag"); "defragment" -Jeff -- Jeff Mahoney SUSE Labs signature.asc Description: OpenPGP digital signature
Re: [PATCH 13/20] btrfs-progs: use cmd_struct as command entry point
On 3/7/18 9:40 PM, je...@suse.com wrote: > From: Jeff Mahoney> diff --git a/cmds-inspect.c b/cmds-inspect.c > index afd7fe48..12f200b3 100644 > --- a/cmds-inspect.c > +++ b/cmds-inspect.c > @@ -625,33 +629,27 @@ static int cmd_inspect_min_dev_size(int argc, char > **argv) > out: > return !!ret; > } > +static DEFINE_SIMPLE_COMMAND(inspect_min_dev_size, "min-dev-size"); > > static const char inspect_cmd_group_info[] = > "query various internal information"; > > -const struct cmd_group inspect_cmd_group = { > +static const struct cmd_group inspect_cmd_group = { > inspect_cmd_group_usage, inspect_cmd_group_info, { > - { "inode-resolve", cmd_inspect_inode_resolve, > - cmd_inspect_inode_resolve_usage, NULL, 0 }, > - { "logical-resolve", cmd_inspect_logical_resolve, > - cmd_inspect_logical_resolve_usage, NULL, 0 }, > - { "subvolid-resolve", cmd_inspect_subvolid_resolve, > - cmd_inspect_subvolid_resolve_usage, NULL, 0 }, > - { "rootid", cmd_inspect_rootid, cmd_inspect_rootid_usage, NULL, > - 0 }, > - { "min-dev-size", cmd_inspect_min_dev_size, > - cmd_inspect_min_dev_size_usage, NULL, 0 }, > - { "dump-tree", cmd_inspect_dump_tree, > - cmd_inspect_dump_tree_usage, NULL, 0 }, > - { "dump-super", cmd_inspect_dump_super, > - cmd_inspect_dump_super_usage, NULL, 0 }, > - { "tree-stats", cmd_inspect_tree_stats, > - cmd_inspect_tree_stats_usage, NULL, 0 }, > - NULL_CMD_STRUCT > + _struct_inspect_inode_resolve, > + _struct_inspect_logical_resolve, > + _struct_inspect_subvolid_resolve, > + _struct_inspect_rootid, > + _struct_inspect_min_dev_size, > + _struct_inspect_dump_tree, > + _struct_inspect_dump_super, > + _struct_inspect_tree_stats, > + NULL > } > }; > > -int cmd_inspect(int argc, char **argv) > +static int cmd_inspect(int argc, char **argv) > { > return handle_command_group(_cmd_group, argc, argv); > } > +DEFINE_GROUP_COMMAND(inspect, "inspect"); "inspect-internal" -Jeff -- Jeff Mahoney SUSE Labs signature.asc Description: OpenPGP digital signature
Re: [PATCH v2] btrfs-progs: dump-tree: add degraded option
On 2018年03月10日 21:54, Anand Jain wrote: > btrfs inspect dump-tree cli picks the disk with the largest generation > to read the root tree, even when all the devices were not provided in > the cli. But in 2 disks RAID1 you may need to know what's in the disks > individually, so this option -x | --noscan indicates to use only the > given disk to dump. > > Signed-off-by: Anand Jain> --- > v1->v2: rename --degraded to --noscan > > cmds-inspect-dump-tree.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c > index df44bb635c9c..d2676ce55af7 100644 > --- a/cmds-inspect-dump-tree.c > +++ b/cmds-inspect-dump-tree.c > @@ -198,6 +198,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { > "-u|--uuid print only the uuid tree", > "-b|--block print info from the specified block only", > "-t|--tree print only tree with the given id (string or > number)", > + "-x|--noscanuse the disk in the arg, do not scan for the > disks (for raid1)", Still it looks a little too restrict. What about some equivalent just like kernel "device=" mount option? Thanks, Qu > NULL > }; > > @@ -234,10 +235,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) > { "uuid", no_argument, NULL, 'u'}, > { "block", required_argument, NULL, 'b'}, > { "tree", required_argument, NULL, 't'}, > + { "noscan", no_argument, NULL, 'x'}, > { NULL, 0, NULL, 0 } > }; > > - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); > + c = getopt_long(argc, argv, "deb:rRut:x", long_options, NULL); > if (c < 0) > break; > switch (c) { > @@ -286,6 +288,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > break; > } > + case 'x': > + open_ctree_flags |= OPEN_CTREE_NO_DEVICES; > + break; > default: > usage(cmd_inspect_dump_tree_usage); > } > signature.asc Description: OpenPGP digital signature
Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
On Fri, Mar 09, 2018 at 01:10:30PM -0800, Linus Torvalds wrote: > On Fri, Mar 9, 2018 at 12:05 PM, Kees Cookwrote: > > When max() is used in stack array size calculations from literal values > > (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler > > thinks this is a dynamic calculation due to the single-eval logic, which > > is not needed in the literal case. This change removes several accidental > > stack VLAs from an x86 allmodconfig build: > > Ok, looks good. > > I just have a couple of questions about applying it. > > In particular, if this will help people working on getting rid of > VLA's in the short term, I can apply it directly. But if people who > are looking at it (anybody else than Kees?) don't much care, then this > might be a 4.17 thing or at least "random -mm queue"? It's easy enough to work on the other VLA removals without basing on this, no rush. thanks, Tobin. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Ongoing Btrfs stability issues
On Sun, 2018-03-11 at 18:51 +0100, Goffredo Baroncelli wrote: > > COW is needed to properly checksum the data. Otherwise is not > possible to ensure the coherency between data and checksum (however I > have to point out that BTRFS fails even in this case [*]). > We could rearrange this sentence, saying that: if you want checksum, > you need COW... No,... not really... the meta-data is anyway always CoWed... so if you do checksum *and* notdatacow,..., the only thing that could possibly happen (in the worst case) is, that data that actually made it correctly to the disk is falsely determined bad, as the metadata (i.e. the checksums) weren't upgraded correctly. That however is probably much less likely than the other way round,.. i.e. bad data went to disk and would be detected with checksuming. I had lots of discussions about this here on the list, and no one ever brought up a real argument against it... I also had an off-list discussion with Chris Mason who IIRC confirmed that it would actually work as I imagine it... with the only two problems: - good data possibly be marked bad because of bad checksums - reads giving back EIO where people would rather prefer bad data (not really sure if this were really his two arguments,... I'd have to look it up, so don't nail me down). Long story short: In any case, I think giving back bad data without EIO is unacceptable. If someone really doesn't care (e.g. because he has higher level checksumming and possibly even repair) he could still manually disable checksumming. The little chance of having a false positive weights IMO far less that have very large amounts of data (DBs, VM images are our typical cases) completely unprotected. And not having checksumming with notdatacow breaks any safe raid repair (so in that case "repair" may even overwrite good data),... which is IMO also unacceptable. And the typical use cases for nodatacow (VMs, DBs) are in turn not so uncommon to want RAID. I really like btrfs,... and it's not that other fs (which typically have no checksumming at all) would perform better here... but not having it for these major use case is a big disappointment for me. Cheers, Chris. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
On Sun, Mar 11, 2018 at 4:05 AM, Ingo Molnarwrote: > > BTW., while I fully agree with everything you said, it's not entirely correct > to > claim that if a C compiler can generate VLA code it is necessarily able to > parse > and evaluate constant array sizes "just fine". > > Constant expressions are typically parsed very early on, at the preprocessing > stage. They can be used with some preprocessor directives as well, such as > '#if' > (with some further limitations on their syntax). Yes. But constant simplification and CSE etc is just a very fundamental part of a compiler, and anybody who actually implements VLA's would have to do it anyway. So no, a message like warning: Array declaration is not a C90 constant expression, resulting in VLA code generation would be moronic. Only some completely mindless broken shit would do "oh, it's not a parse-time constant, so it will be variable". The two just do not follow AT ALL. So the message might be about _possibly_ resulting in VLA code generation, but honestly, at that point you should just add the warning when you actually generate the code to do the stack allocation. Because at that point, you know whether it's variable or not. And trust me, it won't be variable for things like (2,3), or even for our "max()" thing with other odd builtins. Not unless the compiler doesn't really support VLA at all (maybe some bolted-on crazy thing that just turns a VLA at the front-end time into just an alloca), or the user has explicitly asked to disable some fundamental optimization phase. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Ongoing Btrfs stability issues
On 03/10/2018 03:29 PM, Christoph Anton Mitterer wrote: > On Sat, 2018-03-10 at 14:04 +0200, Nikolay Borisov wrote: >> So for OLTP workloads you definitely want nodatacow enabled, bear in >> mind this also disables crc checksumming, but your db engine should >> already have such functionality implemented in it. > > Unlike repeated claims made here on the list and other places... I > woudln't now *any* DB system which actually does this per default and > or in a way that would be comparable to filesystem lvl checksumming. > I agree with you, also nobody warn that without checksum in case of RAID filesystem BTRFS is not capable anymore to check if a stripe is correct or not > > Look back in the archives... when I've asked several times for > checksumming support *with* nodatacow, I evaluated the existing status > for the big ones (postgres,mysql,sqlite,bdb)... and all of them had > this either not enabled per default, not at all, or requiring special > support for the program using the DB. > > > Similar btw: no single VM image type I've evaluated back then had any > form of checksumming integrated. > > > Still, one of the major deficiencies (not in comparison to other fs, > but in comparison to how it should be) of btrfs unfortunately :-( COW is needed to properly checksum the data. Otherwise is not possible to ensure the coherency between data and checksum (however I have to point out that BTRFS fails even in this case [*]). We could rearrange this sentence, saying that: if you want checksum, you need COW... > > > Cheers, > Chris. > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > [*] https://www.spinics.net/lists/linux-btrfs/msg69185.html -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Raid1 volume stuck as read-only: How to dump, recreate and restore its content?
On Sun, Mar 11, 2018 at 11:28:08PM +0700, Andreas Hild wrote: > Following a physical disk failure of a RAID1 array, I tried to mount > the remaining volume of a root partition with "-o degraded". For some > reason it ended up as read-only as described here: > https://btrfs.wiki.kernel.org/index.php/Gotchas#raid1_volumes_only_mountable_once_RW_if_degraded > > > How to precisely do this: dump, recreate and restore its contents? > Could someone please provided more details how to recover this volume > safely? > Linux debian 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3 (2017-12-03) x86_64 > GNU/Linux > [ 1313.279140] BTRFS warning (device sdb2): missing devices (1) > exceeds the limit (0), writeable mount is not allowed I'd recommend instead going with kernel 4.14 or newer (available in stretch-backports), which handles this case well without the need to restore. If there's no actual data loss (there shouldn't be, it's RAID1 with only a single device missing), you can mount degraded normally, then balance the data onto the new disk. Recovery with 4.9 is unpleasant. Meow! -- ⢀⣴⠾⠻⢶⣦⠀ ⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can. ⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener. ⠈⠳⣄ A master species delegates. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Raid1 volume stuck as read-only: How to dump, recreate and restore its content?
Dear All, Following a physical disk failure of a RAID1 array, I tried to mount the remaining volume of a root partition with "-o degraded". For some reason it ended up as read-only as described here: https://btrfs.wiki.kernel.org/index.php/Gotchas#raid1_volumes_only_mountable_once_RW_if_degraded How to precisely do this: dump, recreate and restore its contents? Could someone please provided more details how to recover this volume safely? Device 4 is missing as shown below. /dev/sda4 is a new disk available now. I'm currently using a live disk to recover this root volume. Many thanks! Best wishes, -Andreas *uname -a Linux debian 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3 (2017-12-03) x86_64 GNU/Linux *btrfs --version btrfs-progs v4.7.3 *btrfs fi show warning, device 4 is missing Label: none uuid: ce222f62-b31b-44d6-94ac-79de595325be Total devices 2 FS bytes used 27.55GiB devid1 size 50.00GiB used 35.31GiB path /dev/sdb2 *** Some devices missing Label: none uuid: f946ebb5-bdf9-43f4-8252-5c2d997f7021 Total devices 1 FS bytes used 146.54GiB devid1 size 407.69GiB used 153.16GiB path /dev/sdb3 Label: none uuid: 9e312f97-b2f5-4a53-9fd0-b4fc65653dd1 Total devices 1 FS bytes used 384.00KiB devid1 size 50.00GiB used 2.02GiB path /dev/sda4 *dmesg > dmesg.log [ 1193.006992] BTRFS error (device sda4): unable to start balance with target data profile 16 [ 1312.905264] BTRFS info (device sdb2): allowing degraded mounts [ 1312.905271] BTRFS info (device sdb2): disk space caching is enabled [ 1312.905275] BTRFS info (device sdb2): has skinny extents [ 1312.907400] BTRFS warning (device sdb2): devid 4 uuid 18d9aa16-4543-41fa-90c8-560944e61b2c is missing [ 1312.957486] BTRFS info (device sdb2): bdev (null) errs: wr 120118005, rd 38429879, flush 8815049, corrupt 0, gen 0 [ 1313.279140] BTRFS warning (device sdb2): missing devices (1) exceeds the limit (0), writeable mount is not allowed [ 1313.312077] BTRFS error (device sdb2): open_ctree failed -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()
* Linus Torvaldswrote: > So an error message like > >warning: ISO C90 requires array sizes to be constant-expressions > > would be technically correct and useful from a portability angle. It > tells you when you're doing something non-portable, and should be > automatically enabled with "-ansi -pedantic", for example. > > So what's misleading is actually the name of the warning and the > message, not that it happens. The warning isn't about "variable > length", it's literally about the rules for what a > "constant-expression" is. > > And in C, the expression (2,3) has a constant _value_ (namely 3), but > it isn't a constant-expression as specified by the language. > > Now, the thing is that once you actually do variable length arrays, > those old front-end rules make no sense any more (outside of the "give > portability warnings" thing). > > Because once you do variable length arrays, you obviously _parse_ > everything just fine, and you're doing to evaluate much more complex > expressions than some limited constant-expression rule. BTW., while I fully agree with everything you said, it's not entirely correct to claim that if a C compiler can generate VLA code it is necessarily able to parse and evaluate constant array sizes "just fine". Constant expressions are typically parsed very early on, at the preprocessing stage. They can be used with some preprocessor directives as well, such as '#if' (with some further limitations on their syntax). If VLA support is implemented in a later stage, and results in heavy-handed code generation that will technically work for constant value expressions as well but results in suboptimal code, then a warning should probably be emitted - and it wouldn't be pedantic. The existing warning is still very misleading: warning: ISO C90 forbids variable length array ‘array’ [-Wvla] ... and if my above theory is correct then I think a better warning would be something like: warning: Array declaration is not a C90 constant expression, resulting in VLA code generation ... and note that in this specific case it's not misleading to talk about VLAs in the warning text, because the array size, even if it's constant value, results in VLA code generation. I don't know whether GCC has such a limitation, but a quick experiment with GCC 7.2 suggests that a (2,3) array size expression results in a lot more code being generated than with a constant expression. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html