On Sat, Mar 16, 2019 at 9:17 AM Heikki Linnakangas <hlinn...@iki.fi> wrote: > Hmm. "re-find", maybe? We use that term in a few error messages already, > to mean something similar.
WFM. > At first, I thought this would be horrendously expensive, but thinking > about it a bit more, nearby tuples will always follow the same path > through the upper nodes, so it'll all be cached. So maybe it's not quite > so bad. That's deliberate, though you could call bt_relocate_from_root() from anywhere else if you wanted to. It's a bit like a big nested loop join, where the inner side has locality. > Then, a cosmic ray changes the 20 on the root page to 18. That causes > the the leaf tuple 19 to become non-re-findable; if you descend the for > 19, you'll incorrectly land on page 6. But it also causes the high key > on page 2 to be different from the downlink on the root page. Wouldn't > the existing checks catch this? No, the existing checks will not check that. I suppose something closer to the existing approach *could* detect this issue, by making sure that the "seam of identical high keys" from the root to the leaf are a match, but we don't use the high key outside of its own page. Besides, there is something useful about having the code actually rely on _bt_search(). There are other similar cases, where it's not obvious how you can do verification without either 1) crossing multiple levels, or 2) retaining a "low key" as well as a high key (this is what Goetz Graefe calls "retaining fence keys to solve the cousin verification problem" in Modern B-Tree Techniques). What if the corruption was in the leaf page 6 from your example, which had a 20 at the start? We wouldn't have compared the downlink from the parent to the child, because leaf page 6 is the leftmost child, and so we only have "-inf". The lower bound actually comes from the root page, because we truncate "-inf" attributes during page splits, even though we don't have to. Most of the time they're not "absolute minus infinity" -- they're "minus infinity in this subtree". Maybe you could actually do something with the high key from leaf page 5 to detect the stray value "20" in leaf page 6, but again, we don't do anything like that right now. -- Peter Geoghegan