[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 21017
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21017 )

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..


Patch Set 1: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 21017
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Thu, 08 Feb 2024 06:18:54 +
Gerrit-HasComments: No


[kudu-CR] [catalog manager] Tighten leader UUID fallback

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21004 )

Change subject: [catalog_manager] Tighten leader UUID fallback
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21004/1//COMMIT_MSG
Commit Message:

PS1:
> I added an extension to the TestRecreateCMeta so that it is tested in singl
The reason for KUDU-2335 is still a mystery to me.  Maybe, that's indeed some 
race, but at least your fix helps to address one of the cases when the DCHECK() 
in quorum_util.cc on non-healthy leader triggered.

Thanks!


http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc@5456
PS1, Line 5456: // previous cstate, and the leader was known for that 
term.
> Done
This comment says 'Done', but I could not see the requested extra comment in 
PS2.  Did I miss something?



--
To view, visit http://gerrit.cloudera.org:8080/21004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43
Gerrit-Change-Number: 21004
Gerrit-PatchSet: 2
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Thu, 08 Feb 2024 06:18:19 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 10:

(4 comments)

Just glanced over the tests, and there is one question I'd like to clarify 
before going over the rest of the code.

http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc
File src/kudu/util/slru_cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@76
PS10, Line 76:
nit: the indent should be 2 + 4 = 6 spaces here


http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@80
PS10, Line 80:  { }
nit: wrong indent


http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@209
PS10, Line 209:   RETURN_IF_NO_NVM_CACHE(GetParam().first);
Here and below: does it mean the test is run only on NVM cache?  If so, how to 
test SLRU cache on DRAM then?

And why running/skipping SLRU-related tests is predicated in NVM cache presence 
at all?


http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache-test.cc@244
PS10, Line 244: Erase
Does it make sense to test the erase operation for both the cases when an entry 
is in probationary and protected segments?



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 10
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 08 Feb 2024 06:12:33 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache.h
File src/kudu/util/slru_cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/10/src/kudu/util/slru_cache.h@136
PS10, Line 136:   // Returns the lookup handle of entry from probationary 
segment, same entry is moved to
It would be great to document the parameters of this NewSLRUCache() function, 
in particular, it's not exactly clear what 'lookups' might be necessary for if 
somebody is looking at this function the very first time.


http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h
File src/kudu/util/slru_cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h@50
PS8, Line 50:   class HandleDeleter {
:public:
: explicit HandleDeleter(SLRUCache* c)
: : c_(c) {
: }
:
: void operator()(Handle* h) const {
:   if (h != nullptr) {
: c_->Release(h);
:   }
: }
:
: SLRUCache* cache() const {
:   return c_;
: }
:
:private:
: SLRUCache* c_;
:   };
> I did so, but for any uses of this templatized class both Free() and Releas
I guess that's OK.



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 8
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 08 Feb 2024 05:57:37 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Cleanup of cache code

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21018 )

Change subject: KUDU-613: Cleanup of cache code
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/cache.h@81
PS2, Line 81: // Number of lookups for this entry. Used for upgrading to 
protected segment in SLRU cache.
: // Resets to 0 when moved between probationary and protected 
segments in both directions.
: uint32_t lookups;
Is it possible to introduce this field only in the patch implementing the SLRU 
cache?


http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h
File src/kudu/util/nvm_cache.h:

http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h@34
PS2, Line 34: struct LRUHandle {
Any chance to unify Cache::RLHandle and this handle by templatizing?  
Alignment-related functionality might be introduced as an option (a template 
parameter).


http://gerrit.cloudera.org:8080/#/c/21018/2/src/kudu/util/nvm_cache.h@46
PS2, Line 46:   // Number of lookups for this entry. Used for upgrading to 
protected segment in SLRU cache.
:   // Resets to 0 when moved between probationary and protected 
segments in both directions.
:   uint32_t lookups;
Ditto: is it possible to introduce this field only in the patch implementing 
the SLRU cache?



--
To view, visit http://gerrit.cloudera.org:8080/21018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Gerrit-Change-Number: 21018
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 08 Feb 2024 02:31:31 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Cleanup of cache code

2024-02-07 Thread Mahesh Reddy (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21018

to look at the new patch set (#2).

Change subject: KUDU-613: Cleanup of cache code
..

KUDU-613: Cleanup of cache code

This patch moves some classes out of the
anonymous namespace and into the headers
of the cache and nvm_cache files. These
classes will be used by the new SLRU cache.

This patch also templatizes the HandleDeleter
and the PendingHandleDeleter class as it
will also be used by the SLRU cache.

Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
---
M src/kudu/cfile/block_cache.h
M src/kudu/master/table_locations_cache.cc
M src/kudu/util/cache.cc
M src/kudu/util/cache.h
M src/kudu/util/file_cache.cc
M src/kudu/util/nvm_cache.cc
M src/kudu/util/nvm_cache.h
M src/kudu/util/ttl_cache.h
8 files changed, 297 insertions(+), 286 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/21018/2
--
To view, visit http://gerrit.cloudera.org:8080/21018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Gerrit-Change-Number: 21018
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-02-07 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/block_cache_metrics.h
File src/kudu/util/block_cache_metrics.h:

PS8:
> +1
Moved the block cache metrics out of this patch, will be part of a follow up 
patch.


http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/cache.h@67
PS8, Line 67:   // Recency list handle. An entry is a variable length 
heap-allocated structure.
> +1
Done


http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h
File src/kudu/util/slru_cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/8/src/kudu/util/slru_cache.h@50
PS8, Line 50:   // UniquePendingHandle -- a wrapper around opaque PendingHandle 
structure
:   // to facilitate automatic reference counting newly allocated 
cache's handles.
:   typedef std::unique_ptr> 
UniquePendingHandle;
:
:   // Sets probationary and protected segments metrics to update 
corresponding counters accordingly.
:   //
:   // Entries are reconstructed on any movement between the 
segments (upgrade/downgrade).
:   //
:   // An upgrade of one entry will increment the number of 
evictions from the probationary segment by
:   // one and the number of insertions into the protected segment 
by one.
:   //
:   // A downgrade of one entry will increment the number of 
evictions from the protected segment by
:   // one and the number of insertions into the probationary 
segment by one.
:   //
:   // To calculate the number of inserts for the SLRU cache, 
subtract the number of evictions
:   // of the protected segment from the number of inserts of the 
probationary segment.
:   //
:   //
> nit: maybe, it's time to templatize this class and re-use it throughout var
I did so, but for any uses of this templatized class both Free() and Release() 
will have to be made public as it was done so for SLRUCache. If this is not an 
acceptable tradeoff, let me know and I can revert these changes.



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 9
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 07 Feb 2024 20:35:21 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-02-07 Thread Mahesh Reddy (Code Review)
Hello Tidy Bot, Marton Greber, Alexey Serbin, Attila Bukor, Kudu Jenkins, 
Abhishek Chennaka,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20607

to look at the new patch set (#9).

Change subject: KUDU-613: Introduce SLRU cache
..

KUDU-613: Introduce SLRU cache

This patch introduces the SLRU cache that has two internal segments,
the probationary and protected, to protect the cache from long/random
reads. The SLRU cache has a parameter named 'lookups_threshold_' that
determines the minimum amount of times an entry can be accessed
before it's upgraded to the protected segment. Any random scans would
then only evict entries from the probationary segment.

Both the protected and probationary segment have their own
configurable capacities. When the protected segment is at capacity,
any entries evicted will be added to the MRU end of the
probationary segment.

TODO:
- Add performance tests
- Use SLRU cache in BlockCache
- Write a similar cache for NVM memory

Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
---
M src/kudu/util/CMakeLists.txt
A src/kudu/util/slru_cache-test.cc
A src/kudu/util/slru_cache.cc
A src/kudu/util/slru_cache.h
4 files changed, 1,399 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/20607/9
--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 9
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-613: Cleanup of cache code

2024-02-07 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21018


Change subject: KUDU-613: Cleanup of cache code
..

KUDU-613: Cleanup of cache code

This patch moves some classes out of the
anonymous namespace and into the headers
of the cache and nvm_cache files. These
classes will be used by the new SLRU cache.

This patch also templatizes the HandleDeleter
and the PendingHandleDeleter class as it
will also be used by the SLRU cache.

Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
---
M src/kudu/cfile/block_cache.h
M src/kudu/master/table_locations_cache.cc
M src/kudu/util/cache.cc
M src/kudu/util/cache.h
M src/kudu/util/nvm_cache.cc
M src/kudu/util/nvm_cache.h
M src/kudu/util/ttl_cache.h
7 files changed, 296 insertions(+), 285 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/21018/1
--
To view, visit http://gerrit.cloudera.org:8080/21018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Gerrit-Change-Number: 21018
Gerrit-PatchSet: 1
Gerrit-Owner: Mahesh Reddy 


[kudu-CR](branch-1.17.x) KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21017


Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..

KUDU-3535 Clear log cache while tombstoning a tablet replica.

The log cache of a replica still exists even if the replica has been
already tombstoned. This problem might take place if we decrease
the replication factor of a table with high throughput.

So we should clear the log cache while deleting the replica with
delete type TABLET_DATA_TOMBSTONED.

Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Reviewed-on: http://gerrit.cloudera.org:8080/20822
Reviewed-by: Alexey Serbin 
Tested-by: Alexey Serbin 
(cherry picked from commit 368225e87f77851f8cdf98fc4b7670aaac7a773e)
---
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log_cache.cc
M src/kudu/consensus/log_cache.h
5 files changed, 31 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/21017/1
--
To view, visit http://gerrit.cloudera.org:8080/21017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 21017
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Song Jiacheng 


[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20822 )

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..


Patch Set 10: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/20822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 20822
Gerrit-PatchSet: 10
Gerrit-Owner: Song Jiacheng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Wed, 07 Feb 2024 19:59:42 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20822 )

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..

KUDU-3535 Clear log cache while tombstoning a tablet replica.

The log cache of a replica still exists even if the replica has been
already tombstoned. This problem might take place if we decrease
the replication factor of a table with high throughput.

So we should clear the log cache while deleting the replica with
delete type TABLET_DATA_TOMBSTONED.

Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Reviewed-on: http://gerrit.cloudera.org:8080/20822
Reviewed-by: Alexey Serbin 
Tested-by: Alexey Serbin 
---
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log_cache.cc
M src/kudu/consensus/log_cache.h
5 files changed, 31 insertions(+), 6 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 20822
Gerrit-PatchSet: 11
Gerrit-Owner: Song Jiacheng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/20822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 20822
Gerrit-PatchSet: 10
Gerrit-Owner: Song Jiacheng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR] KUDU-3535 Clear log cache while tombstoning a tablet replica.

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20822 )

Change subject: KUDU-3535 Clear log cache while tombstoning a tablet replica.
..


Patch Set 10: Code-Review+2

Thank you for the fix!


-- 
To view, visit http://gerrit.cloudera.org:8080/20822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cf545e604f80d41e7ebd9660acfd2e928cd27a9
Gerrit-Change-Number: 20822
Gerrit-PatchSet: 10
Gerrit-Owner: Song Jiacheng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Wed, 07 Feb 2024 19:59:32 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21007 )

Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
..

KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges

This patch fixes WriteAsPrometheus() implementation for string-based
gauges.  The original changelist that introduced Prometheus metrics
had a proper implementation of WriteAsPrometheus() only for StringGauge,
but any FunctionGauge for a non-arithmetic type (e.g., for std::string)
would still output a string value in Prometheus text format, and that
could not be consumed by Prometheus.  The patch also contains a test
scenario that would fail without the fix.

This is a follow-up to 00efc6826ac9a1f5d10750296c7357790a041fec.

Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b
Reviewed-on: http://gerrit.cloudera.org:8080/20990
Tested-by: Alexey Serbin 
Reviewed-by: Marton Greber 
Reviewed-by: Attila Bukor 
(cherry picked from commit 6b1c1eb0c97a2349e0b3fa098bf40f8147b43a60)
  Conflicts:
src/kudu/util/metrics.cc
src/kudu/util/metrics.h
Reviewed-on: http://gerrit.cloudera.org:8080/21007
Reviewed-by: Yingchun Lai 
---
M src/kudu/util/metrics-test.cc
M src/kudu/util/metrics.cc
M src/kudu/util/metrics.h
3 files changed, 98 insertions(+), 71 deletions(-)

Approvals:
  Yingchun Lai: Looks good to me, approved
  Alexey Serbin: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges

2024-02-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21007 )

Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
..


Patch Set 1: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Wed, 07 Feb 2024 17:18:24 +
Gerrit-HasComments: No


[kudu-CR] [log block manager] Write lock for deletion

2024-02-07 Thread Code Review
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20901

to look at the new patch set (#14).

Change subject: [log_block_manager] Write lock for deletion
..

[log_block_manager] Write lock for deletion

Both removing blocks and compacting metadata manipulates the
live_blocks_ value and file content. This way it can happen that on
thread A live_blocks_ is updated, on thread B metadata is
compacted and live_blocks_ value is overwritten, then thread A writes
deletion records in the metadata file. This way it creates inconsistency
between file content and live_blocks_ value, which leads to incorrect
behaviour to skip compaction later.
This fix makes sure that CompactMetadata and deletion can not overlap
each other. No new test is needed because it fixes LogBlockManagerTest::
TestContainerBlockLimitingByMetadataSizeWithCompaction flakiness.

Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f
---
M src/kudu/fs/log_block_manager.cc
1 file changed, 2 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/20901/14
--
To view, visit http://gerrit.cloudera.org:8080/20901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f
Gerrit-Change-Number: 20901
Gerrit-PatchSet: 14
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [log block manager-test] Improve random selection

2024-02-07 Thread Code Review
Hello Mahesh Reddy, Tidy Bot, Marton Greber, Zoltan Martonka, Kudu Jenkins, 
Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20899

to look at the new patch set (#11).

Change subject: [log_block_manager-test] Improve random selection
..

[log_block_manager-test] Improve random selection

The previous solution to remove random blocks was not good, because it
was possible that way less blocks are removed than expected because it
is possible that rand()%100 returns 0 or 99 every time and that means in
extreme cases zero or all the elements are deleted. The new approach
removes exactly the same number of blocks every time but randomizes
which blocks are removed and the order of removal is randomized, too.

Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce
---
M src/kudu/fs/log_block_manager-test.cc
1 file changed, 5 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/20899/11
--
To view, visit http://gerrit.cloudera.org:8080/20899
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce
Gerrit-Change-Number: 20899
Gerrit-PatchSet: 11
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [log block manager-test] Improve random selection

2024-02-07 Thread Code Review
Hello Mahesh Reddy, Tidy Bot, Marton Greber, Zoltan Martonka, Kudu Jenkins, 
Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20899

to look at the new patch set (#10).

Change subject: [log_block_manager-test] Improve random selection
..

[log_block_manager-test] Improve random selection

The previous solution to remove random blocks was not good, because it
was possible that way less blocks are removed than expected because it
is possible that rand()%100 returns 0 or 99 every time and that means in
extreme cases zero or all the elements are deleted. The new approach
removes exactly the same number of blocks every time but randomizes
which blocks are removed and the order of removal is randomized, too.

Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce
---
M src/kudu/fs/log_block_manager-test.cc
1 file changed, 5 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/20899/10
--
To view, visit http://gerrit.cloudera.org:8080/20899
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce
Gerrit-Change-Number: 20899
Gerrit-PatchSet: 10
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [log block manager] Write lock for deletion

2024-02-07 Thread Code Review
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20901

to look at the new patch set (#13).

Change subject: [log_block_manager] Write lock for deletion
..

[log_block_manager] Write lock for deletion

Both removing blocks and compacting metadata manipulates the
live_blocks_ value and file content. This way it can happen that on
thread A live_blocks_ is updated, on thread B metadata is
compacted and live_blocks_ value is overwritten, then thread A writes
deletion records in the metadata file. This way it creates inconsistency
between file content and live_blocks_ value, which leads to incorrect
behaviour to skip compaction later.
This fix makes sure that CompactMetadata and deletion can not overlap
each other. No new test is needed because it fixes LogBlockManagerTest::
TestContainerBlockLimitingByMetadataSizeWithCompaction flakiness.

Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f
---
M src/kudu/fs/log_block_manager.cc
1 file changed, 2 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/20901/13
--
To view, visit http://gerrit.cloudera.org:8080/20901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I73712a5fd8eccf23621f8abda3e99ebdf10a769f
Gerrit-Change-Number: 20901
Gerrit-PatchSet: 13
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [catalog manager] Tighten leader UUID fallback

2024-02-07 Thread Code Review
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21004

to look at the new patch set (#2).

Change subject: [catalog_manager] Tighten leader UUID fallback
..

[catalog_manager] Tighten leader UUID fallback

It is safe to assume that if the term is the same in the current cstate
as in the previous cstate then even if the leader is not set, it will
be the same. But it is possible that cmeta file is deleted then
recreated with "local_replica cmeta unsafe_recreate" command. In this
case the leader_uuid is empty in the new cmeta file. This means that the
peer doesn't consider itself a leader, so no health report is generated
in tablet report and it has no leader_uuid set either. When a master
receives tablet report like this and there isn't a new term, then the
catalog master will treat this peer as a leader, but it will fail on a
check because the leader has to be in healthy status. This happened in
ToolTest::TestRecreateCMeta. As a reproduction step, the same test now
runs with a single TServer configuration, too. In this configuration the
error is reproducible 100% of the times, since the term is not increased
and the leader's cmeta file is changed.

The solution is that catalog manager only assumes the previous leader
for the peer if the previous leader is not the peer itself. This gives
time for the peers to form a consensus about the leader.

Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43
---
M src/kudu/master/catalog_manager.cc
M src/kudu/tools/kudu-tool-test.cc
2 files changed, 11 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/21004/2
--
To view, visit http://gerrit.cloudera.org:8080/21004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43
Gerrit-Change-Number: 21004
Gerrit-PatchSet: 2
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [catalog manager] Tighten leader UUID fallback

2024-02-07 Thread Code Review
Ádám Bakai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21004 )

Change subject: [catalog_manager] Tighten leader UUID fallback
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21004/1//COMMIT_MSG
Commit Message:

PS1:
> Thank you for getting to the bottom of this!
I added an extension to the TestRecreateCMeta so that it is tested in single 
TServer configuration,too. That is a sure way to reproduce the error. Maybe 
it's enough to only run with a single TServer, but I thought multiple TServer 
test is nice to have, too.
If I understand correctly, the KUDU-2335 is still present. I tried to reproduce 
it without the patch and I had no luck. I think it happens because some very 
rare race condition between the raft consensus thread and of tablet report 
generation thread that is triggered by the voting process.


http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/21004/1/src/kudu/master/catalog_manager.cc@5456
PS1, Line 5456: // previous cstate, and the leader was known for that 
term.
> Could you please add an extra comment, explaining the essence of the extra
Done



--
To view, visit http://gerrit.cloudera.org:8080/21004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06a80a4a0a9fd422b50860e8cd8bf0e12973cd43
Gerrit-Change-Number: 21004
Gerrit-PatchSet: 1
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Wed, 07 Feb 2024 15:30:09 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.17.x) KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges

2024-02-07 Thread Yingchun Lai (Code Review)
Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21007 )

Change subject: KUDU-3549 fix WriteAsPrometheus() for non-arithmetic gauges
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7128f52729c7f984004811153a7eecc8ffe751b
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Wed, 07 Feb 2024 09:29:07 +
Gerrit-HasComments: No