On Sun, Jun 15, 2025 at 01:42:22PM +0530, Bharadwaj Raju wrote: > On Sun, Jun 15, 2025 at 5:49 AM Kent Overstreet > <kent.overstr...@linux.dev> wrote: > > > > On Sun, Jun 15, 2025 at 12:27:40AM +0530, Bharadwaj Raju wrote: > > > After cd3cdb1ef706 ("Single err message for btree node reads"), > > > all errors caused __btree_err to return BCH_ERR_fsck_fix no matter what > > > the actual error type was if the recovery pass was scanning for btree > > > nodes. This lead to the code continuing despite things like bad node > > > formats when they earlier would have caused a jump to fsck_err, because > > > btree_err only jumps when the return from __btree_err does not match > > > fsck_fix. Ultimately this lead to undefined behavior by attempting to > > > unpack a key based on an invalid format. > > > > Hang on, -BCH_ERR_fsck_fix should've caused us to fix fixable errors, > > not cause undefined behaviour. > > -BCH_ERR_fsck_fix in btree_err (as _ret from __btree_err) prevents a jump > to fsck_err, so if that is the case, the code afterwards continues > as normal. That's where the UB in this bug comes from. > > I think I should explain the path of the bug: > 1. bch2_btree_node_read_done calls validate_bset, with a jump to > fsck_err if it returns an error. > 2. validate_bset has btree_err_on(bch2_bkey_format_invalid(...), ...), > but in the bug case we were in btree-node-scan so __btree_err returns > fsck_fix, which means ret isn't modified and we don't jump to > fsck_err. > 3. validate_bset doesn't return an error code, so > bch2_btree_node_read_done goes ahead and sets the invalid format -- > and then UB happens when trying to unpack keys based on it. > > > Or is the issue that we're returning -BCH_ERR_fsck_fix for non-fixable > > errors? > > > > Glancing at the code, I think the bug might not be limited to btree node > > scan; we now seem to be passing FSCK_CAN_FIX for all errors in the > > non-btree-node-scan case, and I don't think that's right for > > BCH_ERR_btree_node_read_err_must_retry cases. > > > > But I'll have to go digging through the git history to confirm that, and > > it sounds like you've already looked - does that sound like it? > > Isn't the bch2_fsck_err_opt(c, FSCK_CAN_FIX, err_type) call only in > the node_read_err_fixable case?
You're right, it is. Ok, so it's just the scan_for_btree_nodes path that's busted. But then you're calling __bch2_topology_error()? When we're in scan_for_btree_nodes we're just checking if the node is readable, we shouldn't be doing anything that flags errors/recovery passes elsewhere. I think the correct fix would be more like if (scan_for_btree_nodes) return ret == -BCH_ERR_btree_node_read_err_fixable ? bch_err_throw(c, fsck_fix) : ret; /* or -BCH_ERR_btree_node_read_err_bad_node? */ The error codes feel pretty awkward here, we're converting between different classes of error codes more than we should be which makes things hard to follow - this is code that predates modern errcode.h errcodes and was converted from a private enum, I think I could've done that better - but I think either way works.