[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has removed a vote on this change. Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/21017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: deleteVote Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 21017 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Yingchun Lai
[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21017 ) Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. Patch Set 1: Verified+1 unrelated test failures -- To view, visit http://gerrit.cloudera.org:8080/21017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: comment Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 21017 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Thu, 08 Feb 2024 06:18:54 + Gerrit-HasComments: No
[kudu-CR] [catalog manager] Tighten leader UUID fallback
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21004 ) Change subject: [catalog_manager] Tighten leader UUID fallback .. Patch Set 2: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/21004/1//COMMIT_MSG Commit Message: PS1: > I added an extension to the TestRecreateCMeta so that it is tested in singl The reason for KUDU-2335 is still a mystery to me. Maybe, that's indeed some race, but at least your fix helps to address one of the cases when the DCHECK() in quorum_util.cc on non-healthy leader triggered. Thanks! http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc@5456 PS1, Line 5456: // previous cstate, and the leader was known for that term. > Done This comment says 'Done', but I could not see the requested extra comment in PS2. Did I miss something? -- To view, visit http://gerrit.cloudera.org:8080/21004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43 Gerrit-Change-Number: 21004 Gerrit-PatchSet: 2 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Ádám Bakai Gerrit-Comment-Date: Thu, 08 Feb 2024 06:18:19 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-613: Introduce SLRU cache
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20607 ) Change subject: KUDU-613: Introduce SLRU cache .. Patch Set 10: (4 comments) Just glanced over the tests, and there is one question I'd like to clarify before going over the rest of the code. http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc File src/kudu/util/slru_cache-test.cc: http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@76 PS10, Line 76: nit: the indent should be 2 + 4 = 6 spaces here http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@80 PS10, Line 80: { } nit: wrong indent http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@209 PS10, Line 209: RETURN_IF_NO_NVM_CACHE(GetParam().first); Here and below: does it mean the test is run only on NVM cache? If so, how to test SLRU cache on DRAM then? And why running/skipping SLRU-related tests is predicated in NVM cache presence at all? http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@244 PS10, Line 244: Erase Does it make sense to test the erase operation for both the cases when an entry is in probationary and protected segments? -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 10 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 08 Feb 2024 06:12:33 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-613: Introduce SLRU cache
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20607 ) Change subject: KUDU-613: Introduce SLRU cache .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache.h@136 PS10, Line 136: // Returns the lookup handle of entry from probationary segment, same entry is moved to It would be great to document the parameters of this NewSLRUCache() function, in particular, it's not exactly clear what 'lookups' might be necessary for if somebody is looking at this function the very first time. http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h@50 PS8, Line 50: class HandleDeleter { :public: : explicit HandleDeleter(SLRUCache* c) : : c_(c) { : } : : void operator()(Handle* h) const { : if (h != nullptr) { : c_->Release(h); : } : } : : SLRUCache* cache() const { : return c_; : } : :private: : SLRUCache* c_; : }; > I did so, but for any uses of this templatized class both Free() and Releas I guess that's OK. -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 8 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 08 Feb 2024 05:57:37 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-613: Cleanup of cache code
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21018 ) Change subject: KUDU-613: Cleanup of cache code .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/cache.h@81 PS2, Line 81: // Number of lookups for this entry. Used for upgrading to protected segment in SLRU cache. : // Resets to 0 when moved between probationary and protected segments in both directions. : uint32_t lookups; Is it possible to introduce this field only in the patch implementing the SLRU cache? http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h File src/kudu/util/nvm_cache.h: http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h@34 PS2, Line 34: struct LRUHandle { Any chance to unify Cache::RLHandle and this handle by templatizing? Alignment-related functionality might be introduced as an option (a template parameter). http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h@46 PS2, Line 46: // Number of lookups for this entry. Used for upgrading to protected segment in SLRU cache. : // Resets to 0 when moved between probationary and protected segments in both directions. : uint32_t lookups; Ditto: is it possible to introduce this field only in the patch implementing the SLRU cache? -- To view, visit http://gerrit.cloudera.org:8080/21018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf Gerrit-Change-Number: 21018 Gerrit-PatchSet: 2 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 08 Feb 2024 02:31:31 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-613: Cleanup of cache code
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21018 to look at the new patch set (#2). Change subject: KUDU-613: Cleanup of cache code .. KUDU-613: Cleanup of cache code This patch moves some classes out of the anonymous namespace and into the headers of the cache and nvm_cache files. These classes will be used by the new SLRU cache. This patch also templatizes the HandleDeleter and the PendingHandleDeleter class as it will also be used by the SLRU cache. Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf --- M src/kudu/cfile/block_cache.h M src/kudu/master/table_locations_cache.cc M src/kudu/util/cache.cc M src/kudu/util/cache.h M src/kudu/util/file_cache.cc M src/kudu/util/nvm_cache.cc M src/kudu/util/nvm_cache.h M src/kudu/util/ttl_cache.h 8 files changed, 297 insertions(+), 286 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/21018/2 -- To view, visit http://gerrit.cloudera.org:8080/21018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf Gerrit-Change-Number: 21018 Gerrit-PatchSet: 2 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] KUDU-613: Introduce SLRU cache
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20607 ) Change subject: KUDU-613: Introduce SLRU cache .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/block_cache_metrics.h File src/kudu/util/block_cache_metrics.h: PS8: > +1 Moved the block cache metrics out of this patch, will be part of a follow up patch. http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/cache.h@67 PS8, Line 67: // Recency list handle. An entry is a variable length heap-allocated structure. > +1 Done http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h@50 PS8, Line 50: // UniquePendingHandle -- a wrapper around opaque PendingHandle structure : // to facilitate automatic reference counting newly allocated cache's handles. : typedef std::unique_ptr> UniquePendingHandle; : : // Sets probationary and protected segments metrics to update corresponding counters accordingly. : // : // Entries are reconstructed on any movement between the segments (upgrade/downgrade). : // : // An upgrade of one entry will increment the number of evictions from the probationary segment by : // one and the number of insertions into the protected segment by one. : // : // A downgrade of one entry will increment the number of evictions from the protected segment by : // one and the number of insertions into the probationary segment by one. : // : // To calculate the number of inserts for the SLRU cache, subtract the number of evictions : // of the protected segment from the number of inserts of the probationary segment. : // : // > nit: maybe, it's time to templatize this class and re-use it throughout var I did so, but for any uses of this templatized class both Free() and Release() will have to be made public as it was done so for SLRUCache. If this is not an acceptable tradeoff, let me know and I can revert these changes. -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 9 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 07 Feb 2024 20:35:21 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-613: Introduce SLRU cache
Hello Tidy Bot, Marton Greber, Alexey Serbin, Attila Bukor, Kudu Jenkins, Abhishek Chennaka, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20607 to look at the new patch set (#9). Change subject: KUDU-613: Introduce SLRU cache .. KUDU-613: Introduce SLRU cache This patch introduces the SLRU cache that has two internal segments, the probationary and protected, to protect the cache from long/random reads. The SLRU cache has a parameter named 'lookups_threshold_' that determines the minimum amount of times an entry can be accessed before it's upgraded to the protected segment. Any random scans would then only evict entries from the probationary segment. Both the protected and probationary segment have their own configurable capacities. When the protected segment is at capacity, any entries evicted will be added to the MRU end of the probationary segment. TODO: - Add performance tests - Use SLRU cache in BlockCache - Write a similar cache for NVM memory Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb --- M src/kudu/util/CMakeLists.txt A src/kudu/util/slru_cache-test.cc A src/kudu/util/slru_cache.cc A src/kudu/util/slru_cache.h 4 files changed, 1,399 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/20607/9 -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 9 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] KUDU-613: Cleanup of cache code
Mahesh Reddy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21018 Change subject: KUDU-613: Cleanup of cache code .. KUDU-613: Cleanup of cache code This patch moves some classes out of the anonymous namespace and into the headers of the cache and nvm_cache files. These classes will be used by the new SLRU cache. This patch also templatizes the HandleDeleter and the PendingHandleDeleter class as it will also be used by the SLRU cache. Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf --- M src/kudu/cfile/block_cache.h M src/kudu/master/table_locations_cache.cc M src/kudu/util/cache.cc M src/kudu/util/cache.h M src/kudu/util/nvm_cache.cc M src/kudu/util/nvm_cache.h M src/kudu/util/ttl_cache.h 7 files changed, 296 insertions(+), 285 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/21018/1 -- To view, visit http://gerrit.cloudera.org:8080/21018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf Gerrit-Change-Number: 21018 Gerrit-PatchSet: 1 Gerrit-Owner: Mahesh Reddy
[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21017 Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. KUDU-3535 Clear log cache while tombstoning a tablet replica. The log cache of a replica still exists even if the replica has been already tombstoned. This problem might take place if we decrease the replication factor of a table with high throughput. So we should clear the log cache while deleting the replica with delete type TABLET_DATA_TOMBSTONED. Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Reviewed-on: http://gerrit.cloudera.org:8080/20822 Reviewed-by: Alexey Serbin Tested-by: Alexey Serbin (cherry picked from commit 368225e87f77851f8cdf98fc4b7670aaac7a773e) --- M src/kudu/consensus/consensus_queue-test.cc M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/consensus_queue.h M src/kudu/consensus/log_cache.cc M src/kudu/consensus/log_cache.h 5 files changed, 31 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/21017/1 -- To view, visit http://gerrit.cloudera.org:8080/21017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: newchange Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 21017 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Song Jiacheng
[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20822 ) Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. Patch Set 10: Verified+1 unrelated test failures -- To view, visit http://gerrit.cloudera.org:8080/20822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 20822 Gerrit-PatchSet: 10 Gerrit-Owner: Song Jiacheng Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Comment-Date: Wed, 07 Feb 2024 19:59:42 + Gerrit-HasComments: No
[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20822 ) Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. KUDU-3535 Clear log cache while tombstoning a tablet replica. The log cache of a replica still exists even if the replica has been already tombstoned. This problem might take place if we decrease the replication factor of a table with high throughput. So we should clear the log cache while deleting the replica with delete type TABLET_DATA_TOMBSTONED. Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Reviewed-on: http://gerrit.cloudera.org:8080/20822 Reviewed-by: Alexey Serbin Tested-by: Alexey Serbin --- M src/kudu/consensus/consensus_queue-test.cc M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/consensus_queue.h M src/kudu/consensus/log_cache.cc M src/kudu/consensus/log_cache.h 5 files changed, 31 insertions(+), 6 deletions(-) Approvals: Alexey Serbin: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/20822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 20822 Gerrit-PatchSet: 11 Gerrit-Owner: Song Jiacheng Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has removed a vote on this change. Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/20822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 20822 Gerrit-PatchSet: 10 Gerrit-Owner: Song Jiacheng Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20822 ) Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica. .. Patch Set 10: Code-Review+2 Thank you for the fix! -- To view, visit http://gerrit.cloudera.org:8080/20822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9 Gerrit-Change-Number: 20822 Gerrit-PatchSet: 10 Gerrit-Owner: Song Jiacheng Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Comment-Date: Wed, 07 Feb 2024 19:59:32 + Gerrit-HasComments: No
[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21007 ) Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges .. KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges This patch fixes WriteAsPrometheus() implementation for string-based gauges. The original changelist that introduced Prometheus metrics had a proper implementation of WriteAsPrometheus() only for StringGauge, but any FunctionGauge for a non-arithmetic type (e.g., for std::string) would still output a string value in Prometheus text format, and that could not be consumed by Prometheus. The patch also contains a test scenario that would fail without the fix. This is a follow-up to 00efc6826ac9a1f5d10750296c7357790a041fec. Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b Reviewed-on: http://gerrit.cloudera.org:8080/20990 Tested-by: Alexey Serbin Reviewed-by: Marton Greber Reviewed-by: Attila Bukor (cherry picked from commit 6b1c1eb0c97a2349e0b3fa098bf40f8147b43a60) Conflicts: src/kudu/util/metrics.cc src/kudu/util/metrics.h Reviewed-on: http://gerrit.cloudera.org:8080/21007 Reviewed-by: Yingchun Lai --- M src/kudu/util/metrics-test.cc M src/kudu/util/metrics.cc M src/kudu/util/metrics.h 3 files changed, 98 insertions(+), 71 deletions(-) Approvals: Yingchun Lai: Looks good to me, approved Alexey Serbin: Verified -- To view, visit http://gerrit.cloudera.org:8080/21007 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: merged Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b Gerrit-Change-Number: 21007 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Yingchun Lai
[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
Alexey Serbin has removed a vote on this change. Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/21007 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: deleteVote Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b Gerrit-Change-Number: 21007 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Yingchun Lai
[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21007 ) Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges .. Patch Set 1: Verified+1 unrelated test failures -- To view, visit http://gerrit.cloudera.org:8080/21007 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: comment Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b Gerrit-Change-Number: 21007 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Wed, 07 Feb 2024 17:18:24 + Gerrit-HasComments: No
[kudu-CR] [log block manager] Write lock for deletion
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20901 to look at the new patch set (#14). Change subject: [log_block_manager] Write lock for deletion .. [log_block_manager] Write lock for deletion Both removing blocks and compacting metadata manipulates the live_blocks_ value and file content. This way it can happen that on thread A live_blocks_ is updated, on thread B metadata is compacted and live_blocks_ value is overwritten, then thread A writes deletion records in the metadata file. This way it creates inconsistency between file content and live_blocks_ value, which leads to incorrect behaviour to skip compaction later. This fix makes sure that CompactMetadata and deletion can not overlap each other. No new test is needed because it fixes LogBlockManagerTest:: TestContainerBlockLimitingByMetadataSizeWithCompaction flakiness. Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f --- M src/kudu/fs/log_block_manager.cc 1 file changed, 2 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/20901/14 -- To view, visit http://gerrit.cloudera.org:8080/20901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f Gerrit-Change-Number: 20901 Gerrit-PatchSet: 14 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [log block manager-test] Improve random selection
Hello Mahesh Reddy, Tidy Bot, Marton Greber, Zoltan Martonka, Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20899 to look at the new patch set (#11). Change subject: [log_block_manager-test] Improve random selection .. [log_block_manager-test] Improve random selection The previous solution to remove random blocks was not good, because it was possible that way less blocks are removed than expected because it is possible that rand()%100 returns 0 or 99 every time and that means in extreme cases zero or all the elements are deleted. The new approach removes exactly the same number of blocks every time but randomizes which blocks are removed and the order of removal is randomized, too. Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce --- M src/kudu/fs/log_block_manager-test.cc 1 file changed, 5 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/20899/11 -- To view, visit http://gerrit.cloudera.org:8080/20899 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce Gerrit-Change-Number: 20899 Gerrit-PatchSet: 11 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Reviewer: Ádám Bakai
[kudu-CR] [log block manager-test] Improve random selection
Hello Mahesh Reddy, Tidy Bot, Marton Greber, Zoltan Martonka, Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20899 to look at the new patch set (#10). Change subject: [log_block_manager-test] Improve random selection .. [log_block_manager-test] Improve random selection The previous solution to remove random blocks was not good, because it was possible that way less blocks are removed than expected because it is possible that rand()%100 returns 0 or 99 every time and that means in extreme cases zero or all the elements are deleted. The new approach removes exactly the same number of blocks every time but randomizes which blocks are removed and the order of removal is randomized, too. Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce --- M src/kudu/fs/log_block_manager-test.cc 1 file changed, 5 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/20899/10 -- To view, visit http://gerrit.cloudera.org:8080/20899 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce Gerrit-Change-Number: 20899 Gerrit-PatchSet: 10 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Reviewer: Ádám Bakai
[kudu-CR] [log block manager] Write lock for deletion
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20901 to look at the new patch set (#13). Change subject: [log_block_manager] Write lock for deletion .. [log_block_manager] Write lock for deletion Both removing blocks and compacting metadata manipulates the live_blocks_ value and file content. This way it can happen that on thread A live_blocks_ is updated, on thread B metadata is compacted and live_blocks_ value is overwritten, then thread A writes deletion records in the metadata file. This way it creates inconsistency between file content and live_blocks_ value, which leads to incorrect behaviour to skip compaction later. This fix makes sure that CompactMetadata and deletion can not overlap each other. No new test is needed because it fixes LogBlockManagerTest:: TestContainerBlockLimitingByMetadataSizeWithCompaction flakiness. Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f --- M src/kudu/fs/log_block_manager.cc 1 file changed, 2 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/20901/13 -- To view, visit http://gerrit.cloudera.org:8080/20901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f Gerrit-Change-Number: 20901 Gerrit-PatchSet: 13 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [catalog manager] Tighten leader UUID fallback
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21004 to look at the new patch set (#2). Change subject: [catalog_manager] Tighten leader UUID fallback .. [catalog_manager] Tighten leader UUID fallback It is safe to assume that if the term is the same in the current cstate as in the previous cstate then even if the leader is not set, it will be the same. But it is possible that cmeta file is deleted then recreated with "local_replica cmeta unsafe_recreate" command. In this case the leader_uuid is empty in the new cmeta file. This means that the peer doesn't consider itself a leader, so no health report is generated in tablet report and it has no leader_uuid set either. When a master receives tablet report like this and there isn't a new term, then the catalog master will treat this peer as a leader, but it will fail on a check because the leader has to be in healthy status. This happened in ToolTest::TestRecreateCMeta. As a reproduction step, the same test now runs with a single TServer configuration, too. In this configuration the error is reproducible 100% of the times, since the term is not increased and the leader's cmeta file is changed. The solution is that catalog manager only assumes the previous leader for the peer if the previous leader is not the peer itself. This gives time for the peers to form a consensus about the leader. Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43 --- M src/kudu/master/catalog_manager.cc M src/kudu/tools/kudu-tool-test.cc 2 files changed, 11 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/21004/2 -- To view, visit http://gerrit.cloudera.org:8080/21004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43 Gerrit-Change-Number: 21004 Gerrit-PatchSet: 2 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Ádám Bakai
[kudu-CR] [catalog manager] Tighten leader UUID fallback
Ádám Bakai has posted comments on this change. ( http://gerrit.cloudera.org:8080/21004 ) Change subject: [catalog_manager] Tighten leader UUID fallback .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21004/1//COMMIT_MSG Commit Message: PS1: > Thank you for getting to the bottom of this! I added an extension to the TestRecreateCMeta so that it is tested in single TServer configuration,too. That is a sure way to reproduce the error. Maybe it's enough to only run with a single TServer, but I thought multiple TServer test is nice to have, too. If I understand correctly, the KUDU-2335 is still present. I tried to reproduce it without the patch and I had no luck. I think it happens because some very rare race condition between the raft consensus thread and of tablet report generation thread that is triggered by the voting process. http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc@5456 PS1, Line 5456: // previous cstate, and the leader was known for that term. > Could you please add an extra comment, explaining the essence of the extra Done -- To view, visit http://gerrit.cloudera.org:8080/21004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43 Gerrit-Change-Number: 21004 Gerrit-PatchSet: 1 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Ádám Bakai Gerrit-Comment-Date: Wed, 07 Feb 2024 15:30:09 + Gerrit-HasComments: Yes
[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/21007 ) Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21007 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.17.x Gerrit-MessageType: comment Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b Gerrit-Change-Number: 21007 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Wed, 07 Feb 2024 09:29:07 + Gerrit-HasComments: No