[kudu-CR] [tools] Add test to generate heavy rowset compaction

2023-03-23 Thread Ashwani Raina (Code Review)
Hello Mahesh Reddy, Zoltan Chovan, Alexey Serbin, Attila Bukor, Kudu Jenkins, 
Abhishek Chennaka,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19278

to look at the new patch set (#10).

Change subject: [tools] Add test to generate heavy rowset compaction
..

[tools] Add test to generate heavy rowset compaction

This patch adds tests to reproduce a KUDU-3406-like scenario
where rowset compaction could end up consuming a large amount
of memory to load accumulated undo deltas.

TestLoadgenCompactRowSetsOpHighMemUsage:
Generates load to invoke CompactRowSetsOp without
enabling memory budgeting flags.
With memory budgeting disabled, memory usage is as follows:
Timing: real 10.446s, "peak_mem_usage":3323898718
Timing: real 9.816s, "peak_mem_usage":4199547144
Timing: real 9.038s, "peak_mem_usage":3779520834
Timing: real 10.610s, "peak_mem_usage":3757873687

TestLoadgenCompactRowSetsOpOptimalMemUsage:
Generates load to ensure CompactRowSetsOp stats computation
kicks in. With the memory budgeting flags enabled in such a way
that the budgeting criteria are met, CompactRowSetsOp is skipped.
With memory budgeting enabled, memory usage is as follows:
Timing: real 1.032s, "peak_mem_usage":106388163
Timing: real 1.341s, "peak_mem_usage":439757141
Timing: real 1.621s, "peak_mem_usage":420143817
Timing: real 1.187s, "peak_mem_usage":439080084
Timing: real 1.111s, "peak_mem_usage":390513965

Change-Id: Ief4ec03f1bf0f7936f8fb054948f87e71333f824
---
M src/kudu/benchmarks/CMakeLists.txt
A src/kudu/benchmarks/compaction-highmem.cc
M src/kudu/scripts/benchmarks.sh
3 files changed, 387 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/19278/10
--
To view, visit http://gerrit.cloudera.org:8080/19278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ief4ec03f1bf0f7936f8fb054948f87e71333f824
Gerrit-Change-Number: 19278
Gerrit-PatchSet: 10
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [tools] Add test to generate heavy rowset compaction

2023-03-23 Thread Ashwani Raina (Code Review)
Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19278 )

Change subject: [tools] Add test to generate heavy rowset compaction
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc
File src/kudu/benchmarks/compaction-highmem.cc:

http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc@165
PS9, Line 165: RunLoadgen
> Does it make sense to increase the number of maintenance manager threads an
We could do that if the goal of the test were to make compaction run more
frequently. Here, the main goal is to create enough deltas that compaction
ends up consuming a lot of memory to compact them.
The issue at hand is that compaction (when it starts working on a set of
deltas) didn't care about the memory budget and ended up consuming too much
memory by allocating too many data structures in the process, which sometimes
led to OOM on certain nodes.
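
A minimal sketch of the budgeting idea, with purely hypothetical names (the
actual flags and maintenance manager hooks are not shown in this thread):

  #include <cstdint>

  // Hypothetical helper: compare an estimate of the compaction's peak memory
  // use against a budget before scheduling the op, instead of discovering the
  // cost mid-compaction.
  bool ShouldRunRowSetCompaction(int64_t estimated_peak_mem_bytes,
                                 int64_t mem_budget_bytes) {
    // Skip the op for now rather than risk an OOM; it can be retried later
    // once more memory is available or the deltas shrink.
    return estimated_peak_mem_bytes <= mem_budget_bytes;
  }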


http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc@270
PS9, Line 270: RunLoadgen
> Same as above.
Same comment as above, with one difference: here I am enabling the memory
budgeting flags to ensure that when compaction knows there is going to be too
much memory demand, it skips the compaction. This helps avoid OOM situations.



--
To view, visit http://gerrit.cloudera.org:8080/19278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief4ec03f1bf0f7936f8fb054948f87e71333f824
Gerrit-Change-Number: 19278
Gerrit-PatchSet: 9
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Fri, 24 Mar 2023 06:33:13 +
Gerrit-HasComments: Yes


[kudu-CR] [clock] add sanity check to detect wall clock jumps

2023-03-23 Thread Alexey Serbin (Code Review)
Hello Tidy Bot, Zoltan Chovan, Yuqi Du, Ashwani Raina, Yingchun Lai, Attila 
Bukor, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19473

to look at the new patch set (#5).

Change subject: [clock] add sanity check to detect wall clock jumps
..

[clock] add sanity check to detect wall clock jumps

There was a case when a timestamp read from system/local clock using
the ntp_adjtime() call jumped 40+ years ahead when running kudu-tserver
on an Azure VM, while ntp_adjtime() didn't report an error on clock
being unsynchronized. That came along with a huge number of kernel
messages, and other software (such as the SASL library used by SSSD)
detected the strange jump in the local clock as well.  My multiple
attempts to reproduce the issue on a real hardware node, Dockerized
environment run at a real hardware server in a datacenter, and GCE & EC2
VMs were not successful.

This patch adds a sanity check to detect such strange jumps in wall
clock readings.  The idea is to rely on the readings from the
CLOCK_MONOTONIC clock captured along with the wall clock readings.
A jump should manifest itself in big difference between the wall clock
delta and the corresponding CLOCK_MONOTONIC delta.  If such a condition
is detected, then HybridClock::NowWithErrorUnlocked() dumps diagnostic
information about clock synchronisation status and aborts.
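
A minimal sketch of the delta comparison, assuming hypothetical types and
wiring (the actual check lives in HybridClock::NowWithErrorUnlocked()):

  #include <cstdint>
  #include <cstdlib>

  struct ClockSample {
    int64_t wall_us;  // wall clock reading, microseconds
    int64_t mono_us;  // CLOCK_MONOTONIC_RAW reading taken at the same moment
  };

  // Abort if the wall clock moved much further than the monotonic clock
  // between two consecutive samples; threshold_us corresponds to
  // --wall_clock_jump_threshold_s expressed in microseconds.
  void CheckForClockJump(const ClockSample& prev, const ClockSample& cur,
                         int64_t threshold_us) {
    const int64_t wall_delta = cur.wall_us - prev.wall_us;
    const int64_t mono_delta = cur.mono_us - prev.mono_us;
    if (std::llabs(wall_delta - mono_delta) > threshold_us) {
      // The real code dumps clock synchronisation diagnostics before aborting.
      std::abort();
    }
  }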

This patch also adds a unit test for the newly added functionality.

As a part of this changelist, the following new flags are introduced:
  * --wall_clock_jump_detection
  The flag to control the newly introduced sanity check for readings
  of the wall clock.  Acceptable values are 'auto', 'enabled',
  and 'disabled'.  Set to 'auto' by default, meaning the sanity
  check for timestamps is enabled if the process detects that it's
  running on an Azure VM.
  * --wall_clock_jump_threshold_s
  The threshold (in seconds) for the difference in corresponding
  deltas of the wall clock's and CLOCK_MONOTONIC_RAW clock's
  readings.  Set to 900 (15 minutes) by default.

The reasoning behind having --wall_clock_jump_detection=auto by
default is to skip an extra check at the majority of nodes out there
since NTP-synchronized system clock isn't supposed to jump that much
at all.  However, since a problem is observed at some inadequate VMs
such as ones in Azure cloud, it's now enabled automatically on such
nodes to detect such issues.  If the clock jump went unnoticed, the
timestamp would be persisted with an operation in the WAL and propagated
to other replicas.  That could lead to crashes during tablet
bootstrapping, and would require manual intervention to remove
the orphaned operations with out-of-whack timestamps from the WAL.
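
For illustration, a sketch of how the two flags described above might be
declared with gflags; the types are assumptions, only the names and defaults
come from the description above:

  #include <gflags/gflags.h>

  DEFINE_string(wall_clock_jump_detection, "auto",
                "Whether to run the wall clock jump sanity check: 'auto', "
                "'enabled', or 'disabled'. With 'auto', the check is enabled "
                "only if the process detects it is running on an Azure VM.");
  DEFINE_uint32(wall_clock_jump_threshold_s, 900,
                "Threshold (in seconds) for the difference between the wall "
                "clock delta and the corresponding CLOCK_MONOTONIC_RAW delta.");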

To assess the performance hit of the jump detection, I also added
a parameterized HybridClockJumpProtectionTest.BasicPerf scenario
to compare how fast HybridClock::NowWithError() method runs with
and without the clock jump detection.

I found the difference in runtimes to be non-material, especially for
the RELEASE build.  The details are below.

I used the command below, running it 10 times each for DEBUG and
RELEASE builds.  I recorded the timings output by the test scenarios
and computed the average time taken to invoke
HybridClock::NowWithError() 200 times.

  KUDU_ALLOW_SLOW_TESTS=1 ./bin/hybrid_clock-test --gtest_filter='*Perf*'

  DEBUG build:

  without clock jump detection:

200 iterations in 1299.592535 ms
200 iterations in 1232.211616 ms
200 iterations in 1247.237539 ms
200 iterations in 1242.360072 ms
200 iterations in 1368.855273 ms
200 iterations in 1334.929966 ms
200 iterations in 1325.736948 ms
200 iterations in 1332.034905 ms
200 iterations in 1291.398501 ms
200 iterations in 1308.596095 ms

average: 1298.2953450 ms

  with clock jump detection:

200 iterations in 1281.678966 ms
200 iterations in 1230.120822 ms
200 iterations in 1295.044968 ms
200 iterations in 1234.285532 ms
200 iterations in 1365.331334 ms
200 iterations in 1358.448343 ms
200 iterations in 1336.828018 ms
200 iterations in 1352.825380 ms
200 iterations in 1260.334167 ms
200 iterations in 1343.749751 ms

average: 1305.8647281 ms

  ---

  RELEASE build:

  without clock jump detection:

200 iterations in 288.288766 ms
200 iterations in 288.252244 ms
200 iterations in 289.027414 ms
200 iterations in 288.729931 ms
200 iterations in 294.876418 ms
200 iterations in 288.053229 ms
200 iterations in 291.324382 ms
200 iterations in 289.249022 ms
200 iterations in 288.665515 ms
200 iterations in 288.629355 m

[kudu-CR] [java] KUDU-3455 Reduce space complexity about pruning hash partitions for in-list predicate

2023-03-23 Thread Yuqi Du (Code Review)
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Yingchun Lai, Yifan Zhang, 
Kudu Jenkins, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19568

to look at the new patch set (#8).

Change subject: [java] KUDU-3455 Reduce space complexity about pruning hash 
partitions for in-list predicate
..

[java] KUDU-3455 Reduce space complexity about pruning hash partitions for 
in-list predicate

The logic of pruning hash partitions for an in-list predicate in the
java-client has high space complexity, and it may cause the java-client to run
out of memory. At the same time, there are many deep copies of 'PartialRow',
which make the current algorithm a little slow.

This patch fixes the problems by providing a recursive algorithm that uses a
depth-first-search-like traversal to pick one value per in-list column for
every combination and tries to release PartialRow objects as soon as possible.
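
A short sketch of the depth-first enumeration idea, written in C++ like the
rest of the Kudu codebase; names are illustrative only, the actual change is
in PartitionPruner.java:

  #include <functional>
  #include <string>
  #include <vector>

  // Each in-list column contributes a set of candidate values.
  using InListColumn = std::vector<std::string>;

  // Depth-first enumeration of one value per column.  Only O(#columns) state
  // is alive at any time, instead of materializing every combination up front.
  void ForEachCombination(
      const std::vector<InListColumn>& columns,
      std::vector<std::string>* current,  // reused buffer, initially empty
      size_t col_idx,
      const std::function<void(const std::vector<std::string>&)>& cb) {
    if (col_idx == columns.size()) {
      cb(*current);  // e.g. compute the hash bucket for this combination
      return;
    }
    for (const auto& value : columns[col_idx]) {
      current->push_back(value);
      ForEachCombination(columns, current, col_idx + 1, cb);
      current->pop_back();  // release this branch before trying the next value
    }
  }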

At the same time, the new algorithm gets a good speedup by avoiding lots of
heavy copies of 'PartialRow' objects. My stress experiments show a speedup of
roughly 100x in cases where the older algorithm does not run out of memory.

After Yifan Zhang's reminder, the same problem was found in the cpp-client by
reviewing the code. I'll fix it in another patch.

Change-Id: Icd6f213cb705e1b2a001562cc7cebe4164281723
---
M java/kudu-client/src/main/java/org/apache/kudu/client/PartitionPruner.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestPartitionPruner.java
2 files changed, 393 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/19568/8
--
To view, visit http://gerrit.cloudera.org:8080/19568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icd6f213cb705e1b2a001562cc7cebe4164281723
Gerrit-Change-Number: 19568
Gerrit-PatchSet: 8
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] [rowset metadata] add min/max encoded keys update during restart

2023-03-23 Thread KeDeng (Code Review)
Hello Alexey Serbin, Yuqi Du, Yingchun Lai, Kudu Jenkins, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19650

to look at the new patch set (#2).

Change subject: [rowset_metadata] add min/max encoded keys update during restart
..

[rowset_metadata] add min/max encoded keys update during restart

We can cache the encoded min/max keys in rowset metadata to help
bootstrap tablets without having to fully initialize the CFileReaders
for the key columns of each rowset. However, right now we only update
the min/max encoded keys during insert or update operations.
This means that if we didn't save the min/max encoded keys before,
we won't have the opportunity to use this feature to accelerate
bootstrap in the future.

Considering that we read the min/max key values during the bootstrap
process anyway, we can set the min/max encoded keys in the metadata at
the same time.

I added a small test case to ensure we get the same keys from the
bootstrap process at startup.
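
A minimal sketch of the idea with stand-in types (the real change works with
CFileSet and RowSetMetadata, whose interfaces may differ):

  #include <string>

  // Stand-in for the real rowset metadata; only what the sketch needs.
  struct RowSetMetadataSketch {
    bool has_key_bounds = false;
    std::string min_encoded_key;
    std::string max_encoded_key;
  };

  // During bootstrap the key bounds are read from the key-column CFiles
  // anyway; caching them into the metadata lets future restarts skip opening
  // the CFileReaders for this rowset.
  void MaybeCacheKeyBounds(RowSetMetadataSketch* meta,
                           const std::string& min_key,
                           const std::string& max_key) {
    if (!meta->has_key_bounds) {
      meta->min_encoded_key = min_key;
      meta->max_encoded_key = max_key;
      meta->has_key_bounds = true;
      // Persisting the updated metadata is left to the normal flush path.
    }
  }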

Change-Id: I1b7ed75bdf7a2ff0e16c2670f1a6f9819ee8e8d3
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 119 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/19650/2
--
To view, visit http://gerrit.cloudera.org:8080/19650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1b7ed75bdf7a2ff0e16c2670f1a6f9819ee8e8d3
Gerrit-Change-Number: 19650
Gerrit-PatchSet: 2
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] [master] exclude tservers in MAINTENANCE MODE when leader rebalancing

2023-03-23 Thread Yuqi Du (Code Review)
Hello Mahesh Reddy, Alexey Serbin, Yingchun Lai, Yifan Zhang, Attila Bukor, 
Kudu Jenkins, Abhishek Chennaka, KeDeng, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19608

to look at the new patch set (#8).

Change subject: [master] exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
..

[master] exclude tservers in MAINTENANCE_MODE when leader rebalancing

Currently, auto leader rebalancing may transfer new leaders to tservers
that have entered MAINTENANCE_MODE. A tserver in MAINTENANCE_MODE is being
maintained by administrators and may even be getting decommissioned. Leaders
serve many user requests, so it's not reasonable for the leader rebalancer to
transfer leaders to tservers in MAINTENANCE_MODE. We should exclude tservers
in MAINTENANCE_MODE from the destination tservers when the auto leader
rebalancer runs.

This patch fixes the problem and adds a unit test for it.
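
A minimal sketch of the destination filtering, using stand-in types (the real
code works with the tserver descriptors inside the auto leader rebalancer):

  #include <string>
  #include <vector>

  struct TServerSketch {
    std::string uuid;
    bool in_maintenance_mode;
  };

  // Keep only the tservers eligible as leader-transfer destinations,
  // i.e. drop everything currently in MAINTENANCE_MODE.
  std::vector<TServerSketch> EligibleLeaderDestinations(
      const std::vector<TServerSketch>& all_tservers) {
    std::vector<TServerSketch> eligible;
    for (const auto& ts : all_tservers) {
      if (!ts.in_maintenance_mode) {
        eligible.push_back(ts);
      }
    }
    return eligible;
  }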

Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
---
M src/kudu/master/auto_leader_rebalancer-test.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/auto_leader_rebalancer.h
3 files changed, 80 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/19608/8
--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 8
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] [rowset metadata] add min/max encoded keys update during restart

2023-03-23 Thread KeDeng (Code Review)
KeDeng has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19650 )


Change subject: [rowset_metadata] add min/max encoded keys update during restart
..

[rowset_metadata] add min/max encoded keys update during restart

We can cache the encoded min/max keys in rowset metadata to help
bootstrap tablets without having to fully initialize the CFileReaders
for the key columns of each rowset. However, right now we only update
the min/max encoded keys during insert or update operations.
This means that if we didn't save the min/max encoded keys before,
we won't have the opportunity to use this feature to accelerate
bootstrap in the future.

Considering that we read the min/max key values during the bootstrap
process anyway, we can set the min/max encoded keys in the metadata at
the same time.

I added a small test case to ensure we get the same keys from the
bootstrap process at startup.

Change-Id: I1b7ed75bdf7a2ff0e16c2670f1a6f9819ee8e8d3
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 118 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/19650/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1b7ed75bdf7a2ff0e16c2670f1a6f9819ee8e8d3
Gerrit-Change-Number: 19650
Gerrit-PatchSet: 1
Gerrit-Owner: KeDeng 


[kudu-CR] KUDU-3437 Set default block cache metrics policy to reset

2023-03-23 Thread Wang Xixu (Code Review)
Wang Xixu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19585 )

Change subject: KUDU-3437 Set default block_cache_metrics_policy to reset
..


Patch Set 3:

> Patch Set 3:
>
> (1 comment)

I got it. Thanks. Maybe you can destroy the singleton BlockCache before
initializing the master the second time, just like in
ClearLocalSystemCatalogAndCopy(leader_hp).


--
To view, visit http://gerrit.cloudera.org:8080/19585
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc32d7ab02201382debcbe36311579550353bf71
Gerrit-Change-Number: 19585
Gerrit-PatchSet: 3
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yuqi Du 
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Fri, 24 Mar 2023 02:42:28 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3371 [fs] make LogBlockContainer a base class

2023-03-23 Thread Abhishek Chennaka (Code Review)
Abhishek Chennaka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19610 )

Change subject: KUDU-3371 [fs] make LogBlockContainer a base class
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19610/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19610/1//COMMIT_MSG@16
PS1, Line 16: Most of the member functions are left in the base class
: LogBlockContainer, I'm planning to add a new derived class which
: stores containers' metadata into RocksDB in next patches to avoid
> It's OK to have a bigger change that would make this part at least logicall
+1


http://gerrit.cloudera.org:8080/#/c/19610/1/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19610/1/src/kudu/fs/log_block_manager.cc@2568
PS1, Line 2568: open
nit: opening



--
To view, visit http://gerrit.cloudera.org:8080/19610
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb199de95973aaeba76947ad095907272d84ca67
Gerrit-Change-Number: 19610
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Comment-Date: Thu, 23 Mar 2023 19:43:41 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] Add test to generate heavy rowset compaction

2023-03-23 Thread Abhishek Chennaka (Code Review)
Abhishek Chennaka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19278 )

Change subject: [tools] Add test to generate heavy rowset compaction
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc
File src/kudu/benchmarks/compaction-highmem.cc:

http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc@165
PS9, Line 165: RunLoadgen
Does it make sense to increase the number of maintenance manager threads and 
reduce the polling period to increase the chances of scheduling a compaction?


http://gerrit.cloudera.org:8080/#/c/19278/9/src/kudu/benchmarks/compaction-highmem.cc@270
PS9, Line 270: RunLoadgen
Same as above.



--
To view, visit http://gerrit.cloudera.org:8080/19278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief4ec03f1bf0f7936f8fb054948f87e71333f824
Gerrit-Change-Number: 19278
Gerrit-PatchSet: 9
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Thu, 23 Mar 2023 19:13:06 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1945 Update auto incrementing counter during bootstrap

2023-03-23 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19445 )

Change subject: KUDU-1945 Update auto incrementing counter during bootstrap
..


Patch Set 6: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc
File src/kudu/integration-tests/auto_incrementing-itest.cc:

http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@341
PS6, Line 341:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@419
PS6, Line 419: 3
kNumTabletServers


http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@420
PS6, Line 420: 2
kNumTabletServers - 1



--
To view, visit http://gerrit.cloudera.org:8080/19445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61b305efd7d5a065a2976327567163956c0c2184
Gerrit-Change-Number: 19445
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 23 Mar 2023 18:54:00 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1945 Update auto incrementing counter during bootstrap

2023-03-23 Thread Abhishek Chennaka (Code Review)
Hello Marton Greber, Tidy Bot, Alexey Serbin, Kudu Jenkins, Wenzhe Zhou,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19445

to look at the new patch set (#6).

Change subject: KUDU-1945 Update auto incrementing counter during bootstrap
..

KUDU-1945 Update auto incrementing counter during bootstrap

The auto incrementing counter would be reset to zero when the tablet
is being initialized, so it is essential to restore the correct value
of the counter afterwards. There are two cases:

1. WAL segments contain insert operations
In this scenario the WAL segments are replayed, and since each insert
operation entry carries the auto incrementing counter used for the
insert operations present in that entry, the counter is populated
correctly during bootstrap as long as the latest insert operation
entry is present in the WAL segments.

2. WAL segments do not contain insert operations
In this case, we need to fetch the highest auto incrementing counter
value present in the data that has already been flushed and update the
in-memory auto incrementing counter accordingly. This patch
accomplishes that.
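
A minimal sketch of case 2 with a stand-in for the rowset scan the patch
performs (names are illustrative, not the actual Tablet API):

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  // Rebuild the in-memory counter from the highest auto incrementing value
  // already flushed to disk; with no flushed data the counter stays at zero.
  int64_t RestoreAutoIncrementingCounter(
      const std::vector<int64_t>& flushed_auto_increment_values) {
    if (flushed_auto_increment_values.empty()) {
      return 0;
    }
    return *std::max_element(flushed_auto_increment_values.begin(),
                             flushed_auto_increment_values.end());
  }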

There are tests for the bootstrap scenarios where
1. We have no WAL segments with an INSERT OP containing the auto
incrementing column value. We rely on the auto incrementing counter
present in the data directories in this case.
2. We have no WAL segments with auto incrementing column value
and all the rows of the table are deleted. We reset the auto
incrementing counter in this case.
3. We have uncommitted replicate ops containing INSERT OPs with
the auto incrementing column values.

Manually tested the time taken to populate the auto incrementing
counter:
Columns - A Non Unique Primary Key column of type INT32
- 8 INT64 columns
- 5 STRING columns
Rows - 500k rows with data populated
Time taken in populating the counter during bootstrap:
Min - 235ms
Max - 466ms
Median - 312ms

The total time spent bootstrapping the tablet was between 18-25
seconds.

Change-Id: I61b305efd7d5a065a2976327567163956c0c2184
---
M src/kudu/integration-tests/auto_incrementing-itest.cc
M src/kudu/tablet/tablet.cc
2 files changed, 327 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/19445/6
--
To view, visit http://gerrit.cloudera.org:8080/19445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61b305efd7d5a065a2976327567163956c0c2184
Gerrit-Change-Number: 19445
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 


[kudu-CR] KUDU-1945 Update auto incrementing counter during bootstrap

2023-03-23 Thread Abhishek Chennaka (Code Review)
Abhishek Chennaka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19445 )

Change subject: KUDU-1945 Update auto incrementing counter during bootstrap
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/integration-tests/auto_incrementing-itest.cc
File src/kudu/integration-tests/auto_incrementing-itest.cc:

http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/integration-tests/auto_incrementing-itest.cc@415
PS5, Line 415: ERT_EQ(results[i].size(), resu
> nit: int i = 0; i < 3; i++
Done


http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/tablet/tablet.cc@423
PS5, Line 423: shared_ptr rowset;
> warning: static member accessed through instance [readability-static-access
Done


http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/tablet/tablet.cc@446
PS5, Line 446:  {
> you remove the initial size 32K for mem
Right, but 32K is the default value


http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/tablet/tablet.cc@466
PS5, Line 466:
> add a space before <<
Done


http://gerrit.cloudera.org:8080/#/c/19445/5/src/kudu/tablet/tablet.cc@466
PS5, Line 466: t_t
> AIC is not common abbreviation, not easy to understand.
Done



--
To view, visit http://gerrit.cloudera.org:8080/19445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61b305efd7d5a065a2976327567163956c0c2184
Gerrit-Change-Number: 19445
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 23 Mar 2023 18:27:09 +
Gerrit-HasComments: Yes


[kudu-CR] WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

2023-03-23 Thread Yingchun Lai (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18569

to look at the new patch set (#35).

Change subject: WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata
..

WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

Since LogBlockContainer stores block records sequentially in a
metadata file, the live blocks may end up at a very low ratio, which
wastes disk space and makes bootstrap take a long time.

This patch uses RocksDB to store the LBM metadata: a new item is
Put() into RocksDB when a new block is created in the LBM, and the
item is Delete()d from RocksDB when the block is removed from the
LBM. Data in RocksDB is maintained by RocksDB itself, i.e. deleted
items are GCed, so there is no need for the rewriting we do in the
current LBM.

The implementation also reuses most of the LBM logic; the main
difference is that block record metadata is stored in RocksDB.
A new class LogrBlockManager is introduced that stores metadata in
RocksDB. The main idea (a minimal sketch follows this list):
a. Create container
   The data file is created as before; metadata is stored under keys
   prefixed by the container's id with the block id appended,
   e.g. .. Make sure no such keys exist in RocksDB before the
   container is created.
b. Open container
   Make sure the data file is healthy.
c. Deconstruct container
   If the container is dead (full and no live blocks), remove the
   data file and clean up the keys prefixed by the container's id.
d. Load container (by ProcessRecords())
   Iterate over RocksDB in the key range [, ); only live block
   records are populated, and we can use them as before.
e. Create blocks in a container
   Put() serialized BlockRecordPB records into RocksDB in a batch,
   with keys in the form '.' as mentioned above.
f. Remove blocks from a container
   Construct the keys from the container's id and the blocks' ids,
   and Delete() them from RocksDB in a batch.
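
A minimal, self-contained sketch of steps d-f against the stock RocksDB C++
API, assuming a key layout of the container's id, a '.', then the block's id
as described above (the real LogrBlockManager code is more involved):

  #include <functional>
  #include <memory>
  #include <string>

  #include <rocksdb/db.h>
  #include <rocksdb/iterator.h>
  #include <rocksdb/options.h>
  #include <rocksdb/slice.h>
  #include <rocksdb/write_batch.h>

  // Key layout assumed by this sketch; the value is the serialized record.
  std::string BlockKey(const std::string& container_id,
                       const std::string& block_id) {
    return container_id + "." + block_id;
  }

  // e. Create a block: Put() the serialized BlockRecordPB, batched.
  rocksdb::Status AddBlockRecord(rocksdb::DB* db,
                                 const std::string& container_id,
                                 const std::string& block_id,
                                 const std::string& serialized_record) {
    rocksdb::WriteBatch batch;
    batch.Put(BlockKey(container_id, block_id), serialized_record);
    return db->Write(rocksdb::WriteOptions(), &batch);
  }

  // f. Remove a block: Delete() its key; RocksDB compaction GCs it later.
  rocksdb::Status RemoveBlockRecord(rocksdb::DB* db,
                                    const std::string& container_id,
                                    const std::string& block_id) {
    rocksdb::WriteBatch batch;
    batch.Delete(BlockKey(container_id, block_id));
    return db->Write(rocksdb::WriteOptions(), &batch);
  }

  // d. Load a container: iterate the keys prefixed by its id; only live block
  // records remain, so there is nothing to replay or rewrite.
  void ForEachLiveBlock(
      rocksdb::DB* db, const std::string& container_id,
      const std::function<void(const rocksdb::Slice& record)>& cb) {
    const std::string prefix = container_id + ".";
    std::unique_ptr<rocksdb::Iterator> it(
        db->NewIterator(rocksdb::ReadOptions()));
    for (it->Seek(prefix); it->Valid() && it->key().starts_with(prefix);
         it->Next()) {
      cb(it->value());
    }
  }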

This patch contains the following changes:
- Adds a new block manager type named 'logr' that uses RocksDB to
  store LBM metadata; like the other types, it is selected via the
  '--block_manager' flag.
- block_manager-test supports testing LogrBlockManager
- block_manager-stress-test supports testing LogrBlockManager
- log_block_manager-test supports testing LogrBlockManager
- tablet_server-test supports testing LogrBlockManager
- dense_node-itest supports testing LogrBlockManager
- kudu-tool-test supports testing LogrBlockManager

Using RocksDB is optional: we can keep using the former LBM as
before, and more tools can be introduced in the future to convert
data between the two implementations.

The optimization is obvious, as shown in JIRA KUDU-3371: the reopen
stage's time cost is reduced by up to 90%.

Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
---
M src/kudu/benchmarks/CMakeLists.txt
M src/kudu/client/CMakeLists.txt
M src/kudu/consensus/CMakeLists.txt
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_manager.h
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/server/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_test_util.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/CMakeLists.txt
31 files changed, 2,089 insertions(+), 572 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/18569/35
--
To view, visit http://gerrit.cloudera.org:8080/18569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
Gerrit-Change-Number: 18569
Gerrit-PatchSet: 35
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] revert

2023-03-23 Thread Yingchun Lai (Code Review)
Yingchun Lai has abandoned this change. ( http://gerrit.cloudera.org:8080/19648 
)

Change subject: revert
..


Abandoned

revert
--
To view, visit http://gerrit.cloudera.org:8080/19648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I2ba30261af8296f9b8821b0552e71439a8c837bf
Gerrit-Change-Number: 19648
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] revert

2023-03-23 Thread Yingchun Lai (Code Review)
Yingchun Lai has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19648 )


Change subject: revert
..

revert

Change-Id: I2ba30261af8296f9b8821b0552e71439a8c837bf
---
M src/kudu/fs/log_block_manager.cc
1 file changed, 271 insertions(+), 203 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/19648/1
--
To view, visit http://gerrit.cloudera.org:8080/19648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2ba30261af8296f9b8821b0552e71439a8c837bf
Gerrit-Change-Number: 19648
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai 


[kudu-CR] WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

2023-03-23 Thread Yingchun Lai (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18569

to look at the new patch set (#34).

Change subject: WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata
..

WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

Since LogBlockContainer stores block records sequentially in a
metadata file, the live blocks may end up at a very low ratio, which
wastes disk space and makes bootstrap take a long time.

This patch uses RocksDB to store the LBM metadata: a new item is
Put() into RocksDB when a new block is created in the LBM, and the
item is Delete()d from RocksDB when the block is removed from the
LBM. Data in RocksDB is maintained by RocksDB itself, i.e. deleted
items are GCed, so there is no need for the rewriting we do in the
current LBM.

The implementation also reuses most of the LBM logic; the main
difference is that block record metadata is stored in RocksDB.
A new class LogrBlockManager is introduced that stores metadata in
RocksDB. The main idea:
a. Create container
   The data file is created as before; metadata is stored under keys
   prefixed by the container's id with the block id appended,
   e.g. .. Make sure no such keys exist in RocksDB before the
   container is created.
b. Open container
   Make sure the data file is healthy.
c. Deconstruct container
   If the container is dead (full and no live blocks), remove the
   data file and clean up the keys prefixed by the container's id.
d. Load container (by ProcessRecords())
   Iterate over RocksDB in the key range [, ); only live block
   records are populated, and we can use them as before.
e. Create blocks in a container
   Put() serialized BlockRecordPB records into RocksDB in a batch,
   with keys in the form '.' as mentioned above.
f. Remove blocks from a container
   Construct the keys from the container's id and the blocks' ids,
   and Delete() them from RocksDB in a batch.

This patch contains the following changes:
- Adds a new block manager type named 'logr' that uses RocksDB to
  store LBM metadata; like the other types, it is selected via the
  '--block_manager' flag.
- block_manager-test supports testing LogrBlockManager
- block_manager-stress-test supports testing LogrBlockManager
- log_block_manager-test supports testing LogrBlockManager
- tablet_server-test supports testing LogrBlockManager
- dense_node-itest supports testing LogrBlockManager
- kudu-tool-test supports testing LogrBlockManager

Using RocksDB is optional: we can keep using the former LBM as
before, and more tools can be introduced in the future to convert
data between the two implementations.

The optimization is obvious, as shown in JIRA KUDU-3371: the reopen
stage's time cost is reduced by up to 90%.

Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
---
M src/kudu/benchmarks/CMakeLists.txt
M src/kudu/client/CMakeLists.txt
M src/kudu/consensus/CMakeLists.txt
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_manager.h
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/server/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_test_util.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/CMakeLists.txt
31 files changed, 2,117 insertions(+), 668 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/18569/34
--
To view, visit http://gerrit.cloudera.org:8080/18569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
Gerrit-Change-Number: 18569
Gerrit-PatchSet: 34
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] jwt: introduce MiniOidc

2023-03-23 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18473 )

Change subject: jwt: introduce MiniOidc
..


Patch Set 18: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/18473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26e9b3bcd0946adbe4642a19c5ef1124e39632c6
Gerrit-Change-Number: 18473
Gerrit-PatchSet: 18
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Thu, 23 Mar 2023 16:34:37 +
Gerrit-HasComments: No


[kudu-CR] [master] exclude tservers in MAINTENANCE MODE when leader rebalancing

2023-03-23 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19608 )

Change subject: [master] exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19608/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19608/7//COMMIT_MSG@7
PS7, Line 7: exclude
nit: Exclude


http://gerrit.cloudera.org:8080/#/c/19608/7//COMMIT_MSG@11
PS7, Line 11: maintainning
nit: being maintained


http://gerrit.cloudera.org:8080/#/c/19608/7//COMMIT_MSG@11
PS7, Line 11: Leader
: will serve many users' requests, so it's not reasonable that 
leader rebalancer
: transfer some leader to it. We should exclude the tservers in 
MAINTENANCE_MODE.
nit: Since it might be decommissioned, it doesn't make sense for the leader 
rebalancer to transfer a leader to a tserver in MAINTENANCE_MODE as leaders 
serve many users' requests. For this reason, we should exclude such tservers.


http://gerrit.cloudera.org:8080/#/c/19608/7//COMMIT_MSG@15
PS7, Line 15: add
nit: adds



--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 7
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Comment-Date: Thu, 23 Mar 2023 16:18:24 +
Gerrit-HasComments: Yes


[kudu-CR] WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

2023-03-23 Thread Yingchun Lai (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18569

to look at the new patch set (#33).

Change subject: WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata
..

WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

Since LogBlockContainer stores block records sequentially in a
metadata file, the live blocks may end up at a very low ratio, which
wastes disk space and makes bootstrap take a long time.

This patch uses RocksDB to store the LBM metadata: a new item is
Put() into RocksDB when a new block is created in the LBM, and the
item is Delete()d from RocksDB when the block is removed from the
LBM. Data in RocksDB is maintained by RocksDB itself, i.e. deleted
items are GCed, so there is no need for the rewriting we do in the
current LBM.

The implementation also reuses most of the LBM logic; the main
difference is that block record metadata is stored in RocksDB.
A new class LogrBlockManager is introduced that stores metadata in
RocksDB. The main idea:
a. Create container
   The data file is created as before; metadata is stored under keys
   prefixed by the container's id with the block id appended,
   e.g. .. Make sure no such keys exist in RocksDB before the
   container is created.
b. Open container
   Make sure the data file is healthy.
c. Deconstruct container
   If the container is dead (full and no live blocks), remove the
   data file and clean up the keys prefixed by the container's id.
d. Load container (by ProcessRecords())
   Iterate over RocksDB in the key range [, ); only live block
   records are populated, and we can use them as before.
e. Create blocks in a container
   Put() serialized BlockRecordPB records into RocksDB in a batch,
   with keys in the form '.' as mentioned above.
f. Remove blocks from a container
   Construct the keys from the container's id and the blocks' ids,
   and Delete() them from RocksDB in a batch.

This patch contains the following changes:
- Adds a new block manager type named 'logr' that uses RocksDB to
  store LBM metadata; like the other types, it is selected via the
  '--block_manager' flag.
- block_manager-test supports testing LogrBlockManager
- block_manager-stress-test supports testing LogrBlockManager
- log_block_manager-test supports testing LogrBlockManager
- tablet_server-test supports testing LogrBlockManager
- dense_node-itest supports testing LogrBlockManager
- kudu-tool-test supports testing LogrBlockManager

Using RocksDB is optional: we can keep using the former LBM as
before, and more tools can be introduced in the future to convert
data between the two implementations.

The optimization is obvious, as shown in JIRA KUDU-3371: the reopen
stage's time cost is reduced by up to 90%.

Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
---
M src/kudu/benchmarks/CMakeLists.txt
M src/kudu/client/CMakeLists.txt
M src/kudu/consensus/CMakeLists.txt
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_manager.h
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/server/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_test_util.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/CMakeLists.txt
31 files changed, 2,117 insertions(+), 668 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/18569/33
--
To view, visit http://gerrit.cloudera.org:8080/18569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
Gerrit-Change-Number: 18569
Gerrit-PatchSet: 33
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] Exit external mini cluster services when test is stopped

2023-03-23 Thread Code Review
Hello Zoltan Chovan, Alexey Serbin, Attila Bukor, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19613

to look at the new patch set (#3).

Change subject: Exit external_mini_cluster services when test is stopped
..

Exit external_mini_cluster services when test is stopped

In subprocess.cc this is explicitly handled via prctl to prevent orphans
when the main process dies; however, that is only supported on Linux.
Another possible solution was to use a signal handler and kill the
subprocesses, but glog is already using the signal handler. This solution
instead periodically checks whether the parent pid has become 1 (the pid of
the init process); if that happens, the main thread exits.
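
A minimal sketch of the two mechanisms mentioned above (flag wiring omitted;
the patch gates the polling behind --exit_if_orphaned):

  #if defined(__linux__)
  #include <sys/prctl.h>
  #endif
  #include <unistd.h>

  #include <chrono>
  #include <csignal>
  #include <thread>

  // Linux-only: ask the kernel to send a signal when the parent dies
  // (this is what subprocess.cc sets up via prctl).
  void RequestDeathSignal() {
  #if defined(__linux__)
    prctl(PR_SET_PDEATHSIG, SIGKILL);
  #endif
  }

  // Portable fallback used by this patch: poll the parent pid; once the
  // process has been re-parented to init (ppid == 1), return so the main
  // thread can exit.
  void WaitUntilOrphaned() {
    while (getppid() != 1) {
      std::this_thread::sleep_for(std::chrono::seconds(1));
    }
  }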

Change-Id: I7ca88b3b8924dc7dc5c05a0239dfa86db67af255
---
M src/kudu/master/master_runner.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tserver.cc
M src/kudu/tserver/tablet_server_runner.cc
7 files changed, 77 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/19613/3
--
To view, visit http://gerrit.cloudera.org:8080/19613
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ca88b3b8924dc7dc5c05a0239dfa86db67af255
Gerrit-Change-Number: 19613
Gerrit-PatchSet: 3
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] Exit external mini cluster services when test is stopped

2023-03-23 Thread Code Review
Ádám Bakai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19613 )

Change subject: Exit external_mini_cluster services when test is stopped
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/kserver/kserver.cc
File src/kudu/kserver/kserver.cc:

http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/kserver/kserver.cc@60
PS2, Line 60: TAG_FLAG(exit_if_orphaned, hidden);
> I guess this flag should be marked 'unsafe' because kudu servers started no
Done


http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/master/master_runner.cc
File src/kudu/master/master_runner.cc:

http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/master/master_runner.cc@29
PS2, Line 29: #include 
> nit: move this up to come before C++ headers, separating from C++ headers w
Done


http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/master/master_runner.cc@467
PS2, Line 467:   while (true) {
 : SleepFor(MonoDelta::FromSeconds(1));
 : if (FLAGS_exit_if_orphaned && getppid() == 1) return 
Status::OK();
 :   }
> Instead of introducing and dancing around globals (like FLAGS_exit_if_orpha
I don't see how creating RunMasterServerMiniCluster and
RunMasterServerStandalone would help. It's perfectly valid to start a kudu
master with the "kudu master run" command, and there is no way right now,
without introducing some mechanism, to decide whether the master is running
in a mini cluster or stand-alone (with "kudu master run"). I made a different
type of simplification, though, where both runners just call
ServerBase::WaitFinished() (which could be a static function right now, but
it isn't, because I don't see any reason it couldn't be changed to use some
member variables).


http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/mini-cluster/external_mini_cluster.cc
File src/kudu/mini-cluster/external_mini_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/19613/2/src/kudu/mini-cluster/external_mini_cluster.cc@1789
PS2, Line 1789: Status ExternalTabletServer::Restart() {
> What about Restart()?  Should the set of flags for "tserver run" be updated
Done



--
To view, visit http://gerrit.cloudera.org:8080/19613
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca88b3b8924dc7dc5c05a0239dfa86db67af255
Gerrit-Change-Number: 19613
Gerrit-PatchSet: 2
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Thu, 23 Mar 2023 14:17:22 +
Gerrit-HasComments: Yes


[kudu-CR] [Tool] Show the information of a tablet

2023-03-23 Thread Wang Xixu (Code Review)
Hello Tidy Bot, Alexey Serbin, Yuqi Du, Yingchun Lai, Kudu Jenkins, KeDeng,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19501

to look at the new patch set (#11).

Change subject: [Tool] Show the information of a tablet
..

[Tool] Show the information of a tablet

This tool is used to show the table name, table id and all
replicas on different tablet servers according to the tablet
id.

Sometimes the log prints only the tablet id when a query fails or
something goes wrong in the system, but it is difficult to find
useful information from the tablet id alone.

This tool will be helpful for debugging. The following is an
example:

Table id: 9112ec4b6eac48f5967da5a1f20610b0, Table name: test_table
        Tablet Server UUID        |      Host       | Port  | Is Leader
----------------------------------+-----------------+-------+-----------
 4cd055e0cf3b4165823747e3c349dc6f | 127.241.215.131 | 45963 | true
 65a671158cd74ac3995f5c68180621b8 | 127.241.215.130 | 42211 | false
 6b6ef15e7d4a471f92f4c1eca1a46638 | 127.241.215.129 | 36321 | false

Change-Id: Ib5ae5f61f50a44c4787843df76adaa61700ae9fe
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/tablet-internal.cc
M src/kudu/client/tablet-internal.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool.proto
M src/kudu/tools/tool_action_tablet.cc
13 files changed, 279 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/19501/11
--
To view, visit http://gerrit.cloudera.org:8080/19501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib5ae5f61f50a44c4787843df76adaa61700ae9fe
Gerrit-Change-Number: 19501
Gerrit-PatchSet: 11
Gerrit-Owner: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] [Tool] Show the information of a tablet

2023-03-23 Thread Wang Xixu (Code Review)
Wang Xixu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19501 )

Change subject: [Tool] Show the information of a tablet
..


Patch Set 9:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/client/client-internal.cc@298
PS9, Line 298: Status KuduClient::Data::GetTablet(KuduClient* client,
> I'm not clear why to move this function from KuduClient to KuduClient::Data
Just to keep the same style. Client.cc exposes the interface to the user, and 
client-internal.cc implements the interface. Please see the following function: 
GetTabletServer()


http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/client/client-internal.cc@301
PS9, Line 301: const
> At least, it can be static (in L307, use client->default_admin_operation_ti
Done L307.


http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/master/catalog_manager.h@797
PS9, Line 797: get_extra_info
> Could you plz add some comments to describe what dose 'get_extra_info' mean
Done


http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/master/catalog_manager.cc@6241
PS9, Line 6241: R
> nit: indent
Done


http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/tools/tool_action_tablet.cc
File src/kudu/tools/tool_action_tablet.cc:

http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/tools/tool_action_tablet.cc@109
PS9, Line 109:   if (FLAGS_json) {
> DataTable is support to output in JSON format(see DataTable::PrintTo), why
Done


http://gerrit.cloudera.org:8080/#/c/19501/9/src/kudu/tools/tool_action_tablet.cc@124
PS9, Line 124:   cout << Substitute("Table id: $0, Table name: $1",
> So if --json is enabled, table id and name will not be output?
To simplify the code, I will output the table id and table name for every
replica. The data has some duplication, but it doesn't matter much. The
following is an example:

pretty format:
             Table Id             | Table Name |        Tablet Server UUID        |     Host      | Port  | Is Leader
----------------------------------+------------+----------------------------------+---------------+-------+-----------
 883e5ffc9e2f41f780ddffd8a8744f13 | test_table | 5116514e00024a71926c722f5ccfaf44 | 127.42.53.193 | 41263 | false
 883e5ffc9e2f41f780ddffd8a8744f13 | test_table | 7041740e966d45d38be3cc6c79c1f67f | 127.42.53.194 | 35579 | true
 883e5ffc9e2f41f780ddffd8a8744f13 | test_table | 4c4d43e7e0bc4d41b23f072ad2dfe85d | 127.42.53.195 | 40701 | false

json format:
[{"Table Id":"53092288a6944385bcb78fdc6a2d5322","Table Name":"test_table","Tablet Server UUID":"909a581760ae40ceb8837d72cc989ede","Host":"127.45.86.65","Port":"44353","Is Leader":"true"},{"Table Id":"53092288a6944385bcb78fdc6a2d5322","Table Name":"test_table","Tablet Server UUID":"b509cc1057af4dda9cc596c7f87849a1","Host":"127.45.86.66","Port":"42719","Is Leader":"false"},{"Table Id":"53092288a6944385bcb78fdc6a2d5322","Table Name":"test_table","Tablet Server UUID":"9413e042a42549409367c722a50785e4","Host":"127.45.86.67","Port":"36483","Is Leader":"false"}]



--
To view, visit http://gerrit.cloudera.org:8080/19501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5ae5f61f50a44c4787843df76adaa61700ae9fe
Gerrit-Change-Number: 19501
Gerrit-PatchSet: 9
Gerrit-Owner: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Comment-Date: Thu, 23 Mar 2023 12:04:03 +
Gerrit-HasComments: Yes


[kudu-CR] [Tool] Show the information of a tablet

2023-03-23 Thread Wang Xixu (Code Review)
Hello Tidy Bot, Alexey Serbin, Yuqi Du, Yingchun Lai, Kudu Jenkins, KeDeng,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19501

to look at the new patch set (#10).

Change subject: [Tool] Show the information of a tablet
..

[Tool] Show the information of a tablet

This tool shows the table name, the table id and all replicas on
different tablet servers for a given tablet id.

Sometimes the log prints a tablet id when a query fails or something
goes wrong in the system, but it is difficult to find useful
information from the tablet id alone.

This tool will be helpful for debugging. The following is an
example:

Table id: 9112ec4b6eac48f5967da5a1f20610b0, Table name: test_table
 Tablet Server UUID               | Host            | Port  | Is Leader
----------------------------------+-----------------+-------+-----------
 4cd055e0cf3b4165823747e3c349dc6f | 127.241.215.131 | 45963 | true
 65a671158cd74ac3995f5c68180621b8 | 127.241.215.130 | 42211 | false
 6b6ef15e7d4a471f92f4c1eca1a46638 | 127.241.215.129 | 36321 | false

Change-Id: Ib5ae5f61f50a44c4787843df76adaa61700ae9fe
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/tablet-internal.cc
M src/kudu/client/tablet-internal.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool.proto
M src/kudu/tools/tool_action_tablet.cc
13 files changed, 283 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/19501/10
--
To view, visit http://gerrit.cloudera.org:8080/19501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib5ae5f61f50a44c4787843df76adaa61700ae9fe
Gerrit-Change-Number: 19501
Gerrit-PatchSet: 10
Gerrit-Owner: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

2023-03-23 Thread Yingchun Lai (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18569

to look at the new patch set (#32).

Change subject: WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata
..

WIP KUDU-3371 [fs] Use RocksDB to store LBM metadata

Since LogBlockContainer stores block records sequentially in its
metadata file, the ratio of live blocks may become very low, which
wastes disk space and slows down bootstrap.

This patch uses RocksDB to store LBM metadata: a new item is Put()
into RocksDB when a new block is created in the LBM, and the item is
Delete()d from RocksDB when the block is removed from the LBM. The
data is maintained by RocksDB itself, i.e. deleted items are GCed,
so the metadata does not need the rewriting that the current LBM
performs.

The implementation also reuses most of the LBM logic; the main
difference is that block record metadata is stored in RocksDB.
A new class LogrBlockManager is introduced that stores metadata
in RocksDB. The main idea:
a. Create container
   The data file is created as before; metadata is stored under
   keys prefixed by the container's id with the block id
   appended, e.g. .. Make sure there are no such keys in RocksDB
   before the container is created.
b. Open container
   Make sure the data file is healthy.
c. Deconstruct container
   If the container is dead (full and no live blocks), remove
   the data file and clean up the keys prefixed by the
   container's id.
d. Load container (by ProcessRecords())
   Iterate over RocksDB in the key range
   [, ); only live block records are populated, and we can use
   them as before.
e. Create blocks in a container
   Put() serialized BlockRecordPB records into RocksDB in a
   batch; keys are in the form of '.' as mentioned above.
f. Remove blocks from a container
   Construct the keys from the container's id and the block's
   id, then Delete() them from RocksDB in a batch.
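
As a rough illustration of steps d, e and f above, here is a minimal
sketch against the plain RocksDB C++ API; the key layout, the
container/block ids and the BlockKey() helper are hypothetical and
are not taken from the actual LogrBlockManager implementation:

#include <memory>
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/write_batch.h>

// Hypothetical key layout: "<container_id>.<block_id>" -> serialized BlockRecordPB.
std::string BlockKey(const std::string& container_id, const std::string& block_id) {
  return container_id + "." + block_id;
}

int main() {
  rocksdb::DB* raw_db = nullptr;
  rocksdb::Options opts;
  opts.create_if_missing = true;
  rocksdb::Status s = rocksdb::DB::Open(opts, "/tmp/lbm-meta-sketch", &raw_db);
  if (!s.ok()) return 1;
  std::unique_ptr<rocksdb::DB> db(raw_db);

  // e. Create blocks in a container: Put() serialized records in a batch.
  rocksdb::WriteBatch create_batch;
  create_batch.Put(BlockKey("container-0001", "block-aaa"), "<serialized BlockRecordPB>");
  create_batch.Put(BlockKey("container-0001", "block-bbb"), "<serialized BlockRecordPB>");
  db->Write(rocksdb::WriteOptions(), &create_batch);

  // d. Load a container: iterate the key range prefixed by the container's id;
  //    only live block records are present, so no DELETE records need replaying.
  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(rocksdb::ReadOptions()));
  const std::string prefix = "container-0001.";
  for (it->Seek(prefix); it->Valid() && it->key().starts_with(prefix); it->Next()) {
    // Deserialize it->value() into a BlockRecordPB and populate the container.
  }

  // f. Remove blocks from a container: Delete() the corresponding keys in a batch.
  rocksdb::WriteBatch delete_batch;
  delete_batch.Delete(BlockKey("container-0001", "block-aaa"));
  db->Write(rocksdb::WriteOptions(), &delete_batch);
  return 0;
}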

This patch contains the following changes:
- Adds a new block manager type named 'logr', which uses RocksDB
  to store LBM metadata and is selected via the '--block_manager'
  flag.
- block_manager-test supports testing LogrBlockManager
- block_manager-stress-test supports testing LogrBlockManager
- log_block_manager-test supports testing LogrBlockManager
- tablet_server-test supports testing LogrBlockManager
- dense_node-itest supports testing LogrBlockManager
- kudu-tool-test supports testing LogrBlockManager

Using RocksDB is optional: we can keep using the former LBM as
before, and more tools to convert data between the two
implementations can be introduced in the future.

The improvement is significant, as shown in JIRA KUDU-3371: the
time cost of the reopen stage is reduced by up to 90%.

Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
---
M src/kudu/benchmarks/CMakeLists.txt
M src/kudu/client/CMakeLists.txt
M src/kudu/consensus/CMakeLists.txt
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_manager.h
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/server/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_test_util.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/CMakeLists.txt
31 files changed, 2,038 insertions(+), 574 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/18569/32
--
To view, visit http://gerrit.cloudera.org:8080/18569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
Gerrit-Change-Number: 18569
Gerrit-PatchSet: 32
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] jwt: introduce MiniOidc

2023-03-23 Thread Zoltan Chovan (Code Review)
Zoltan Chovan has uploaded a new patch set (#18) to the change originally 
created by Andrew Wong. ( http://gerrit.cloudera.org:8080/18473 )

Change subject: jwt: introduce MiniOidc
..

jwt: introduce MiniOidc

This patch takes some existing test utilities and encapsulates them into
the new MiniOidc class. The MiniOidc serves as the OpenID Connect
Discovery endpoint, as the host for the JWKS of each account, and as
the dispenser of JWTs.

This encapsulation will be useful in testing JWTs from non-C++ tests
that typically rely on exposing Mini* components via control shell.

Co-authored-by: Andrew Wong 

Change-Id: I26e9b3bcd0946adbe4642a19c5ef1124e39632c6
---
M src/kudu/util/CMakeLists.txt
M src/kudu/util/jwt-util-test.cc
M src/kudu/util/jwt-util.cc
M src/kudu/util/jwt-util.h
A src/kudu/util/mini_oidc.cc
A src/kudu/util/mini_oidc.h
6 files changed, 323 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/18473/18
--
To view, visit http://gerrit.cloudera.org:8080/18473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26e9b3bcd0946adbe4642a19c5ef1124e39632c6
Gerrit-Change-Number: 18473
Gerrit-PatchSet: 18
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] jwt: introduce MiniOidc

2023-03-23 Thread Zoltan Chovan (Code Review)
Zoltan Chovan has uploaded a new patch set (#17) to the change originally 
created by Andrew Wong. ( http://gerrit.cloudera.org:8080/18473 )

Change subject: jwt: introduce MiniOidc
..

jwt: introduce MiniOidc

This patch takes some existing test utilities and encapsulates them into
the new MiniOidc class. The MiniOidc serves as the OpenID Connect
Discovery endpoint, as the host for the JWKS of each account, and as
the dispenser of JWTs.

This encapsulation will be useful in testing JWTs from non-C++ tests
that typically rely on exposing Mini* components via control shell.

Co-authored-by: Andrew Wong 

Change-Id: I26e9b3bcd0946adbe4642a19c5ef1124e39632c6
---
M src/kudu/util/CMakeLists.txt
M src/kudu/util/jwt-util-test.cc
M src/kudu/util/jwt-util.cc
M src/kudu/util/jwt-util.h
A src/kudu/util/mini_oidc.cc
A src/kudu/util/mini_oidc.h
6 files changed, 323 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/18473/17
--
To view, visit http://gerrit.cloudera.org:8080/18473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26e9b3bcd0946adbe4642a19c5ef1124e39632c6
Gerrit-Change-Number: 18473
Gerrit-PatchSet: 17
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] KUDU-3452 Create tablet without enough healthy tservers

2023-03-23 Thread Yifan Zhang (Code Review)
Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19571 )

Change subject: KUDU-3452 Create tablet without enough healthy tservers
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@10
PS3, Line 10: will retry continuously
> I'm not sure I understand what will retry continuously here?  What is compo
It seems that the catalog manager's 'bgtasks' thread continuously tries to
assign replicas for newly created tablets (in the PREPARING state), and keeps
failing to select as many alive tservers as needed.


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@417
PS3, Line 417: 
DEFINE_bool(enable_create_tablet_without_enough_healthy_tservers, false,
We should also tag this new flag 'hidden', just as the 
'catalog_manager_check_ts_count_for_create_table' flag.


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@5933
PS3, Line 5933:   int nreplicas = 0;
  :   // Get replica factor.
  :   {
  :     TableMetadataLock table_guard(tablet->table().get(), LockMode::READ);
  :     nreplicas = table_guard.data().pb.num_replicas();
  :   }
  :   // Try to create a table without enough alive tablet servers.
  :   if (alive_num_ts < nreplicas &&
  :       FLAGS_enable_create_tablet_without_enough_healthy_tservers) {
  :     // Raft requires at least (replica_factor + 1) / 2 alive tservers,
  :     // otherwise the tablet cannot make progress.
  :     if (alive_num_ts < (nreplicas + 1) / 2) {
  :       return Status::InvalidArgument(Substitute("RAFT protocol requires at least $0 "
  :                                                 "alive tablet servers, but only $1 "
  :                                                 "alive tablet servers are left",
  :                                                 (nreplicas + 1) / 2, alive_num_ts));
  :     }
  :     nreplicas = alive_num_ts;
  :   }
> How about encapsulate the logic into SelectNReplicasForTablet()? since it's
+1


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@6006
PS3, Line 6006: Status CatalogManager::SelectReplicasForTablet(const 
PlacementPolicy& policy,
> I didn't find any where else using SelectReplicasForTablet(), why not remov
+1


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9228
PS3, Line 9228:
> Is it able to alter the table in this case?
It would be great to test adding range partitions when there are only 2
tservers alive.



--
To view, visit http://gerrit.cloudera.org:8080/19571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I742ba1ff770f5c8b1be5800334c29bec96e195c6
Gerrit-Change-Number: 19571
Gerrit-PatchSet: 3
Gerrit-Owner: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Comment-Date: Thu, 23 Mar 2023 08:49:27 +
Gerrit-HasComments: Yes


[kudu-CR] [master] exclude tservers in MAINTENANCE MODE when leader rebalancing

2023-03-23 Thread Yuqi Du (Code Review)
Yuqi Du has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19608 )

Change subject: [master] exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
..


Patch Set 7:

(11 comments)

> Patch Set 5:
>
> (3 comments)

Thanks for your advice.

http://gerrit.cloudera.org:8080/#/c/19608/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19608/5//COMMIT_MSG@7
PS5, Line 7: in
> nit: in
Done


http://gerrit.cloudera.org:8080/#/c/19608/5//COMMIT_MSG@9
PS5, Line 9: Cur
> nit: Change to "Currently, "
Done


http://gerrit.cloudera.org:8080/#/c/19608/5//COMMIT_MSG@10
PS5, Line 10: Because a tserver in MAINTENANCE_MODE
: is maintainning by
> Could you explain these semantics in the commit message? E.g. why is it not
I rewrote it.


http://gerrit.cloudera.org:8080/#/c/19608/5//COMMIT_MSG@12
PS5, Line 12: st
> nit:whose in
in


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer-test.cc
File src/kudu/master/auto_leader_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer-test.cc@239
PS5, Line 239: tserve
> nit: tserver
Done


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer-test.cc@251
PS5, Line 251:   ASSERT_STR_CONTAINS(out, Substitute("$0,$1", 
mini_tserver->uuid(), "MAINTENANCE_MODE"));
> Shall we check the leader replicas' location in case the leader replicas wi
I am not sure I have understood what you mean.

It doesn't matter whether the 'enter_maintenance' command causes some leader
replicas to be transferred. Here we only check whether the leaders are
rebalanced: if all tservers are normal, the cluster will reach a rebalanced
state, and if one of them is in MAINTENANCE_MODE, the cluster will not.

If we care about the influence of 'enter_maintenance' itself, we can add unit
tests for that CLI command.


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer-test.cc@263
PS5, Line 263: // This cluster cannot reach a rebalanced state of leaders since 
1 of the 3 tservers
 :   // is in maintenance mode.
> nit: Rephrase to "This cluster cannot reach a rebalanced state of leaders s
Done


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer.cc
File src/kudu/master/auto_leader_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer.cc@239
PS5, Line 239: uuid_leaders
> +1
That's a problem.

The leader rebalancer should guarantee that it does not transfer a leader to a
tserver in MAINTENANCE_MODE.

It is possible for tservers in MAINTENANCE_MODE to already hold some leader
replicas, but transferring those away to other tservers is not the job of
leader rebalancing; that is a separate control flow. The leader rebalancer does
not do that transferring.
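
A minimal sketch of the kind of filtering described here, with made-up types
and field names (not the actual auto_leader_rebalancer.cc code): tservers
reported to be in maintenance mode are simply dropped from the set of
candidate targets before any leader transfer is considered.

#include <string>
#include <vector>

struct TServerInfo {
  std::string uuid;
  bool in_maintenance_mode;  // hypothetical field
};

// Return the UUIDs eligible to receive leadership during auto leader rebalancing.
std::vector<std::string> EligibleLeaderTargets(const std::vector<TServerInfo>& tservers) {
  std::vector<std::string> eligible;
  for (const auto& ts : tservers) {
    if (ts.in_maintenance_mode) {
      continue;  // never transfer a leader onto a tserver under maintenance
    }
    eligible.push_back(ts.uuid);
  }
  return eligible;
}

int main() {
  std::vector<TServerInfo> tservers = {
      {"ts-a", false}, {"ts-b", true}, {"ts-c", false}};
  // Only "ts-a" and "ts-c" remain candidates; "ts-b" is excluded.
  return EligibleLeaderTargets(tservers).size() == 2 ? 0 : 1;
}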


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer.cc@250
PS5, Line 250:  that ar
> nit: change to "that are"
Done


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer.cc@250
PS5, Line 250: transferrin
> nit: transferring
Done


http://gerrit.cloudera.org:8080/#/c/19608/5/src/kudu/master/auto_leader_rebalancer.cc@392
PS5, Line 392: set
> nit: set
Done



--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 7
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 
Gerrit-Comment-Date: Thu, 23 Mar 2023 08:36:15 +
Gerrit-HasComments: Yes


[kudu-CR] [master] exclude tservers in MAINTENANCE MODE when leader rebalancing

2023-03-23 Thread Yuqi Du (Code Review)
Hello Mahesh Reddy, Alexey Serbin, Yingchun Lai, Yifan Zhang, Attila Bukor, 
Kudu Jenkins, Abhishek Chennaka, KeDeng, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19608

to look at the new patch set (#7).

Change subject: [master] exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
..

[master] exclude tservers in MAINTENANCE_MODE when leader rebalancing

Currently, auto leader rebalancing may transfer new leaders to tservers
which have entered MAINTENANCE_MODE. A tserver in MAINTENANCE_MODE is
being maintained by administrators and may be decommissioning. A leader
serves many user requests, so it is not reasonable for the leader
rebalancer to transfer leaders to such a tserver. We should exclude
tservers in MAINTENANCE_MODE.

This patch fixes the problem and adds a unit test for it.

Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
---
M src/kudu/master/auto_leader_rebalancer-test.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/auto_leader_rebalancer.h
3 files changed, 81 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/19608/7
--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 7
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du 


[kudu-CR] [master] exclude tservers in MAINTENANCE MODE when leader rebalancing

2023-03-23 Thread Yuqi Du (Code Review)
Hello Mahesh Reddy, Alexey Serbin, Yingchun Lai, Yifan Zhang, Attila Bukor, 
Kudu Jenkins, Abhishek Chennaka, KeDeng, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19608

to look at the new patch set (#6).

Change subject: [master] exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
..

[master] exclude tservers in MAINTENANCE_MODE when leader rebalancing

Currently, auto leader rebalancing may transfer new leaders to tservers
which have entered MAINTENANCE_MODE. A tserver in MAINTENANCE_MODE is
being maintained by administrators and may be decommissioning, so it is
not reasonable for the leader rebalancer to transfer leaders to it,
since a leader serves many user requests. So we should exclude tservers
in MAINTENANCE_MODE.

This patch fixes the problem and adds a unit test for it.

Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
---
M src/kudu/master/auto_leader_rebalancer-test.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/auto_leader_rebalancer.h
3 files changed, 81 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/19608/6
--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 6
Gerrit-Owner: Yuqi Du 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Reviewer: Yuqi Du