On Sat, Oct 26, 2024 at 12:23:40AM +0000, Piotr Zalewski wrote:
>
> Sent with Proton Mail secure email.
>
> On Saturday, October 26th, 2024 at 2:16 AM, Kent Overstreet <[email protected]> wrote:
>
> > On Fri, Oct 25, 2024 at 08:11:50PM -0400, Kent Overstreet wrote:
> >
> > > On Wed, Oct 23, 2024 at 03:33:22PM +0800, Alan Huang wrote:
> > >
> > > > On Oct 23, 2024, at 15:21, Piotr Zalewski [email protected] wrote:
> > > >
> > > > > Add NULL check for key returned from bch2_btree_and_journal_iter_peek in
> > > > > btree_node_iter_and_journal_peek to avoid NULL ptr dereference in
> > > > > bch2_bkey_buf_reassemble.
> > > >
> > > > It would be helpful if the commit message explained why k.k is null in
> > > > this case
> > >
> > > This code is only for iterating over interior btree nodes - k.k is only
> > > null when we have a bad btree topology (gaps).
> > >
> > > Piotr, could you add a comment to that effect?
> >
> > Actually, not just that - when this happens we should flag the
> > filesystem as having topology repairs, and possibly start topology
> > repair.
> >
> > Calling bch2_topology_error() will do that.
> >
> > We definitely want to log an error message, too; it should reference the
> > btree node we're iterating over and explain that it's missing child
> > nodes.
>
> Thanks for the clarification. I will send v2 tomorrow :)

Also, make sure we're returning an error code. Your patch didn't do that, and we (obviously) can't continue the btree lookup once we've hit a missing child. bch2_topology_error() will give you the error code you want: it tells recovery to rewind and run topology repair if we're in recovery, and otherwise does something sensible.
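
To make the shape of that concrete, here's a rough userspace sketch of the control flow I have in mind: check for NULL, log which node is missing children, flag the topology error, and return its error code instead of carrying on. Everything prefixed mock_ or MOCK_ is a hypothetical stand-in invented for illustration, not the real bcachefs types or signatures, and the error number is made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the real bcachefs definitions. */
#define MOCK_ERR_NEED_TOPOLOGY_REPAIR 1	/* made-up error number */

struct mock_fs {
	int topology_error;	/* set once a topology problem has been seen */
};

struct mock_key {
	unsigned long long offset;
};

struct mock_node_iter {
	const struct mock_key *keys;	/* child keys of one interior node */
	size_t nr;
	size_t pos;
};

/* Stand-in for the peek: NULL when no child key exists where one is expected. */
static const struct mock_key *mock_iter_peek(struct mock_node_iter *iter)
{
	return iter->pos < iter->nr ? &iter->keys[iter->pos] : NULL;
}

/* Stand-in for bch2_topology_error(): flag the filesystem, hand back an error. */
static int mock_topology_error(struct mock_fs *c)
{
	c->topology_error = 1;
	return -MOCK_ERR_NEED_TOPOLOGY_REPAIR;
}

/*
 * Copy the next child key into *out. Returns 0 on success, or a negative
 * error code if the child is missing, so the caller stops the lookup.
 */
static int mock_node_iter_peek(struct mock_fs *c, struct mock_node_iter *iter,
			       unsigned level, struct mock_key *out)
{
	const struct mock_key *k = mock_iter_peek(iter);

	if (!k) {
		/* Interior nodes only: a NULL key here means a gap in the topology. */
		fprintf(stderr, "btree node at level %u is missing child nodes\n",
			level);
		return mock_topology_error(c);
	}

	*out = *k;	/* only dereference after the NULL check */
	return 0;
}
```

The point is just that the NULL case becomes an error return the caller has to propagate, rather than a silent skip; the real message should identify the actual btree node, which this sketch only approximates with a level number.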
