Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20546 )
Change subject: [compaction] Skip memory allocation for ancient undo deltas ...................................................................... Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG@7 PS1, Line 7: [compaction] Skip memory allocation for ancient undo deltas > There is no update in functionality here. The only difference is that we ta Thank you for putting together the explanation above, iterating over the same points 2nd (or 3rd?) time :) I hope I understand what this patch is about, and by 'the updated functionality' I meant the change in the code that do exactly as you explained above, not something exposed to the end-user, no. I think it's great that you added the new scenario TestNoGenerateUndoOnRowSetCompaction. Yes, I see the newly added scenario TestNoGenerateUndoOnRowSetCompaction addresses some of the concerns that I had. Essentially, those were about: 1. Exercising the modified code path under various conditions, especially for the case when many or even all the REDO deltas are considered ancient, at least to make sure the code isn't crashing with SIGABRT due to existing CHECK() macros in the control paths, or due to SIGSEGV if there any mistake in dereferncing :) 2. Making sure to get rid of the DELTAs that have been considered ancient, so they aren't lingering in the list of deltas forever. 3. Preserving the consistency of the data after running all the compaction stages that are possible to have for a tablet. Basically, that about returning correct data for reads/scans at various snapshots, and that's so even if the row is updated/deleted/re-inserted multiple times and those updates placed across different sides of the AHM (i.e. something about situations of "ghost" rows -- deleted on one rowset, reinserted on another). It seems items 1 and 2 are addressed with TestNoGenerateUndoOnRowSetCompaction, right? As far as I can see, the item 3 is only partially addressed by the new scenario, and I don't know how to make sure that with the updated code we aren't going to see any flukes w.r.t. "ghost" rows, for example. Sure, feel free to add the metrics for the allocations when processing REDO deltas if you think it's worth it, but I didn't asked about those specifically. I'm rather worried about introducing issues similar to "ghost" rows once the code has been updated. Basically, I'm not sure the existing tests provide enough coverage to make sure the changes in ApplyMutationsAndGenerateUndos() are covered. If, in addition to the high-level explanation of '... taking note of "ancient" REDO mutations quite early in the processing rather than in later part of compaction' you could point to existing test scenarios that provide us with higher level of confidence in not having issues similar to "ghost" rows (or add a new test scenario to check for that), I'd consider my concerns fully addressed. Thanks a lot! -- To view, visit http://gerrit.cloudera.org:8080/20546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb41636a3ac063478fe181560d4ec85dadeb0ef3 Gerrit-Change-Number: 20546 Gerrit-PatchSet: 1 Gerrit-Owner: Ashwani Raina <ara...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <ale...@apache.org> Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka <zmarto...@cloudera.com> Gerrit-Comment-Date: Mon, 23 Oct 2023 20:25:47 +0000 Gerrit-HasComments: Yes