Hi, Alexander! On Fri, 17 May 2024 at 14:11, Alexander Korotkov <aekorot...@gmail.com> wrote:
> On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov <aekorot...@gmail.com> > wrote: > > On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov > > <aekorot...@gmail.com> wrote: > > > On Sat, May 11, 2024 at 4:13 AM Mark Dilger > > > <mark.dil...@enterprisedb.com> wrote: > > > > > On May 10, 2024, at 12:05 PM, Alexander Korotkov < > aekorot...@gmail.com> wrote: > > > > > The only bt_target_page_check() caller is > > > > > bt_check_level_from_leftmost(), which overrides state->target in > the > > > > > next iteration anyway. I think the patch is just refactoring to > > > > > eliminate the confusion pointer by Peter Geoghegan upthread. > > > > > > > > I find your argument unconvincing. > > > > > > > > After bt_target_page_check() returns at line 919, and before > bt_check_level_from_leftmost() overrides state->target in the next > iteration, bt_check_level_from_leftmost() conditionally fetches an item > from the page referenced by state->target. See line 963. > > > > > > > > I'm left with four possibilities: > > > > > > > > > > > > 1) bt_target_page_check() never gets to the code that uses > "rightpage" rather than "state->target" in the same iteration where > bt_check_level_from_leftmost() conditionally fetches an item from > state->target, so the change you're making doesn't matter. > > > > > > > > 2) The code prior to v2-0003 was wrong, having changed > state->target in an inappropriate way, causing the wrong thing to happen at > what is now line 963. The patch fixes the bug, because state->target no > longer gets overwritten where you are now using "rightpage" for the value. > > > > > > > > 3) The code used to work, having set up state->target correctly in > the place where you are now using "rightpage", but v2-0003 has broken that. > > > > > > > > 4) It's been broken all along and your patch just changes from > wrong to wrong. > > > > > > > > > > > > If you believe (1) is true, then I'm complaining that you are > relying far to much on action at a distance, and that you are not > documenting it. Even with documentation of this interrelationship, I'd be > unhappy with how brittle the code is. I cannot easily discern that the two > don't ever happen in the same iteration, and I'm not at all convinced one > way or the other. I tried to set up some Asserts about that, but none of > the test cases actually reach the new code, so adding Asserts doesn't help > to investigate the question. > > > > > > > > If (2) is true, then I'm complaining that the commit message doesn't > mention the fact that this is a bug fix. Bug fixes should be clearly > documented as such, otherwise future work might assume the commit can be > reverted with only stylistic consequences. > > > > > > > > If (3) is true, then I'm complaining that the patch is flat busted. > > > > > > > > If (4) is true, then maybe we should revert the entire feature, or > have a discussion of mitigation efforts that are needed. > > > > > > > > Regardless of which of 1..4 you pick, I think it could all do with > more regression test coverage. > > > > > > > > > > > > For reference, I said something similar earlier today in another > email to this thread: > > > > > > > > This patch introduces a change that stores a new page into variable > "rightpage" rather than overwriting "state->target", which the old > implementation most certainly did. That means that after returning from > bt_target_page_check() into the calling function > bt_check_level_from_leftmost() the value in state->target is not what it > would have been prior to this patch. Now, that'd be irrelevant if nobody > goes on to consult that value, but just 44 lines further down in > bt_check_level_from_leftmost() state->target is clearly used. So the > behavior at that point is changing between the old and new versions of the > code, and I think I'm within reason to ask if it was wrong before the > patch, wrong after the patch, or something else? Is this a bug being > introduced, being fixed, or ... ? > > > > > > Thank you for your analysis. I'm inclined to believe in 2, but not > > > yet completely sure. It's really pity that our tests don't cover > > > this. I'm investigating this area. > > > > It seems that I got to the bottom of this. Changing > > BtreeCheckState.target for a cross-page unique constraint check is > > wrong, but that happens only for leaf pages. After that > > BtreeCheckState.target is only used for setting the low key. The low > > key is only used for non-leaf pages. So, that didn't lead to any > > visible bug. I've revised the commit message to reflect this. > > > > So, the picture for the patches is the following now. > > 0001 – optimization, but rather simple and giving huge effect > > 0002 – refactoring > > 0003 – fix for the bug > > 0004 – better error reporting > > I think the thread contains enough motivation on why 0002, 0003 and > 0004 are material for post-FF. They are fixes and refactoring for > new-in-v17 feature. I'm going to push them if no objections. > > Regarding 0001, I'd like to ask Tom and Mark if they find convincing > that given that optimization is small, simple and giving huge effect, > it could be pushed post-FF? Otherwise, this could wait for v18. > In my view, patches 0002-0004 are worth pushing. 0001 is ready in my view. But I see no problem pushing it into v18 regarding that this optimization could be not eligible for post-FF. I don't know the criteria for this just let's be safe about it. Regards, Pavel Borisov