[kudu-CR] [util] fix compilation breakage
Alexey Serbin has posted comments on this change. Change subject: [util] fix compilation breakage .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8065/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 20: #include > warning: #includes are not sorted properly [llvm-include-order] I'll ignore this for this changelist, and fix it in another one, where I'm planning to remove the IWYU pragmas for and -- To view, visit http://gerrit.cloudera.org:8080/8065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] [util] fix compilation breakage
Alexey Serbin has submitted this change and it was merged. Change subject: [util] fix compilation breakage .. [util] fix compilation breakage On some Linux systems (old ones), there isn't linux/sysinfo.h header file. IWYU might suggest to include linux/sysinfo.h instead of linux/kernel.h if compiling on a new system, so I added corresponding IWYU pragmas. I'll address the IWYU issue in a generic way in a separate change list (so no pragmas would be necessary). This patch is a quick fix for the compilation breakage. Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Reviewed-on: http://gerrit.cloudera.org:8080/8065 Reviewed-by: Adar DemboReviewed-by: Andrew Wong Tested-by: Kudu Jenkins --- M src/kudu/util/env_posix.cc 1 file changed, 5 insertions(+), 4 deletions(-) Approvals: Andrew Wong: Looks good to me, but someone else must approve Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] c++ client: try harder to pass table IDs into RPCs that can accept them
Hello Dan Burkert, Jean-Daniel Cryans, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8066 to review the following change. Change subject: c++ client: try harder to pass table IDs into RPCs that can accept them .. c++ client: try harder to pass table IDs into RPCs that can accept them Some RPCs (such as IsCreateTableDone or IsAlterTableDone) can identify a table by name or by ID. Using an ID is more robust since it's globally unique and immutable. This patch changes several RPC calls to use table IDs. Note: table IDs were only added to AlterTableResponsePB in Kudu 0.10. By using them here, the C++ client may crash when altering tables belonging to an older deployment. Change-Id: I0052d18f714aee1226b96bf1da9defa5738fdd37 --- M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/client.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/master_failover-itest.cc 5 files changed, 117 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/8066/1 -- To view, visit http://gerrit.cloudera.org:8080/8066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0052d18f714aee1226b96bf1da9defa5738fdd37 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients
Hello Dan Burkert, Jean-Daniel Cryans, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8026 to look at the new patch set (#3). Change subject: KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients .. KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients This patch modifies new Java clients to no longer use the create_table_done field in GetTableSchema RPCs. After the CreateTable RPC returns, the client will issue a GetTableSchema RPC to build a KuduTable, then enter an IsCreateTableDone RPC loop until all of the tablets have been created. There are a couple other changes worth noting: - openTable no longer waits for table creation before returning. Doing that now would require at least one IsCreateTableDone RPC, which basically defeats the purpose of these changes. Moreover, the C++ client doesn't do this, so I don't see why the Java client should. - I removed the 'tablesNotServed' logic which had no useful effect because, as I recently learned, there's retry logic to deal with TABLET_NOT_RUNNING master errors embedded more deeply in the Java client. - I removed the master permit acquisition from the "is create table done" loop, since these loops should no longer produce a thundering herds (as I suppose they could have due to 'tablesNotServed'). - I modified createTable and alterTable to make waiting optional. The default behavior is to wait, which was already the case for createTable, but not for alterTable. I also added an isCreateTableDone method, which is now somewhat useful as createTable's waiting is optional. - The IsCreateTableDone and IsAlterTableDone loops now use table IDs to ensure that they're robust in the face of concurrent operations that may change table names. This depends on AlterTableResponsePB containing table IDs, which was only true as of Kudu 0.10. If used against an older server, the client will throw an exception. Change-Id: I54fa07dc34a97f1c9da06ec9129d6d4590b7aac6 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AlterTableOptions.java M java/kudu-client/src/main/java/org/apache/kudu/client/AlterTableResponse.java M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/apache/kudu/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/apache/kudu/client/CreateTableRequest.java M java/kudu-client/src/main/java/org/apache/kudu/client/CreateTableResponse.java M java/kudu-client/src/main/java/org/apache/kudu/client/DeadlineTracker.java M java/kudu-client/src/main/java/org/apache/kudu/client/GetTableSchemaRequest.java M java/kudu-client/src/main/java/org/apache/kudu/client/GetTableSchemaResponse.java M java/kudu-client/src/main/java/org/apache/kudu/client/IsAlterTableDoneRequest.java M java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneRequest.java A java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneResponse.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestAlterTable.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestDeadlineTracker.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java 18 files changed, 687 insertions(+), 377 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/8026/3 -- To view, visit http://gerrit.cloudera.org:8080/8026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I54fa07dc34a97f1c9da06ec9129d6d4590b7aac6 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients
Adar Dembo has posted comments on this change. Change subject: KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8026/2//COMMIT_MSG Commit Message: PS2, Line 20: with TABLET_NOT_RUNNING : master errors > Mind pointing me where that is? When the master responds to GetTableLocations with a TABLET_NOT_RUNNING error, it's also got a ServiceUnavailable status. The Java client automatically retries these after backing off; the code that kicks off the retry is here: https://github.com/apache/kudu/blob/master/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java#L357 http://gerrit.cloudera.org:8080/#/c/8026/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: Line 641: // TODO(adar): how can we do getTableSchemaRpc.setParent(rpc)? > Have a getTableSchema that takes a parent rpc? Good idea. My first try was to modify getTableSchema() to return the RPC after sending it, and then this code called setParent() and added a callback to rpc.getDeferred(). But if the RPC response came in before rpc.getDeferred(), we'd end up adding the callback to some useless deferred and hang the client. http://gerrit.cloudera.org:8080/#/c/8026/1/java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneRequest.java File java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneRequest.java: Line 36: IsCreateTableDoneRequest(KuduTable table, String name) { > I believe it's wrong to use the table name in any context where the KuduTab Alright. I chased this rabbit hole down and wound up with a larger patch and a separate patch for the C++ client. Enjoy the review. -- To view, visit http://gerrit.cloudera.org:8080/8026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54fa07dc34a97f1c9da06ec9129d6d4590b7aac6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [util] fix compilation breakage
Andrew Wong has posted comments on this change. Change subject: [util] fix compilation breakage .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] [util] fix compilation breakage
Adar Dembo has posted comments on this change. Change subject: [util] fix compilation breakage .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] [util] fix compilation breakage
Alexey Serbin has uploaded a new change for review. http://gerrit.cloudera.org:8080/8065 Change subject: [util] fix compilation breakage .. [util] fix compilation breakage On some Linux systems (old ones), there isn't linux/sysinfo.h header file. IWYU might suggest to include linux/sysinfo.h instead of linux/kernel.h if compiling on a new system, so I added corresponding IWYU pragmas. I'll address the IWYU issue in a generic way in a separate change list (so no pragmas would be necessary). This patch is a quick fix for the compilation breakage. Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 --- M src/kudu/util/env_posix.cc 1 file changed, 5 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8065/1 -- To view, visit http://gerrit.cloudera.org:8080/8065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie9d13815052d8e4c1360db3f94e15ffd77768af5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin
[kudu-CR] docs: split disk failure from disk config changes
Adar Dembo has submitted this change and it was merged. Change subject: docs: split disk failure from disk config changes .. docs: split disk failure from disk config changes The administration notes commented on Kudu's handling of disk failures with instructions to rebuild a tserver with a new directory configuration. While related, these two are separate and should be documented as such. Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Reviewed-on: http://gerrit.cloudera.org:8080/7984 Tested-by: Kudu Jenkins Reviewed-by: Adar Dembo--- M docs/administration.adoc 1 file changed, 32 insertions(+), 19 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] docs: split disk failure from disk config changes
Adar Dembo has posted comments on this change. Change subject: docs: split disk failure from disk config changes .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] docs: split disk failure from disk config changes
Hello Dan Burkert, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7984 to look at the new patch set (#5). Change subject: docs: split disk failure from disk config changes .. docs: split disk failure from disk config changes The administration notes commented on Kudu's handling of disk failures with instructions to rebuild a tserver with a new directory configuration. While related, these two are separate and should be documented as such. Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e --- M docs/administration.adoc 1 file changed, 32 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/7984/5 -- To view, visit http://gerrit.cloudera.org:8080/7984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] docs: split disk failure from disk config changes
Andrew Wong has posted comments on this change. Change subject: docs: split disk failure from disk config changes .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7984/4//COMMIT_MSG Commit Message: PS4, Line 9: administartion > administration Done -- To view, visit http://gerrit.cloudera.org:8080/7984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2144. Add metrics for Reactor load
Hello Mike Percy, Michael Ho, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8064 to review the following change. Change subject: KUDU-2144. Add metrics for Reactor load .. KUDU-2144. Add metrics for Reactor load This adds two new metrics: 1) reactor_load_percent This measures the percentage of time that a reactor spends doing active work (i.e not blocked in epoll_wait()). As this approaches 100%, it indicates that the server may be reactor-bound (eg due to skew, a performance bug, or insufficient number of reactor threads) At first glance, it might seem like this should just be a simple counter of cycles spent in epoll_wait(), so that metrics systems would calculate a derived cycles/second rate. However, a simple rate metric would not well represent the fact that there are multiple reactor threads, and it's possible (and often the case) that only one of them is highly loaded due to skew. Additionally, there are some bursty workloads where the load average over minute granularity is relatively low, but when viewed on a short time scale the reactor thread approaches 100% load. So, to capture that, this new metric is a histogram of percentages. Values are contributed to the histogram in the periodic TimerHandler which runs every 100ms. So, if there is any 100ms period in which the reactor is fully loaded, it will contribute a "100%" sample to the histogram. We can then inspect the percentiles and the raw counts to see if there are even any short bursts where the reactors are the bottleneck. To test this out, I ran rpc-bench and modified the number of server reactors. As I added more server reactors, I could see that the load percentage (particularly the 95th percentile) went down as each reactor was able to spend more time idle. 2) reactor_active_latency_us This metric takes another approach to measure potential latency issues caused by reactors by measuring the histogram of time spent invoking watcher callbacks. If, for example, we see that callback execution frequently takes 100ms, we can assume that a similar amount of latency would be contributed to any inbound or outbound RPCs associated with the same reactor. To test this out, I simulated KUDU-1944 (OpenSSL lock contention slowing down socket IO) by adding a usleep(10ms) call in Socket::Recv and running rpc-bench. I could see that the new metric shot up accordingly. Change-Id: Ic530af2836b1c31b3d754a9e0068fc5d31aa6fbb --- M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/rpc-bench.cc M src/kudu/util/metrics.h 4 files changed, 125 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/8064/1 -- To view, visit http://gerrit.cloudera.org:8080/8064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic530af2836b1c31b3d754a9e0068fc5d31aa6fbb Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mike Percy
[kudu-CR] parse metrics log: merge metrics across entities of same type
Adar Dembo has posted comments on this change. Change subject: parse_metrics_log: merge metrics across entities of same type .. Patch Set 1: (4 comments) First time looking at this so mostly a scan, but since it's a standalone script that probably doesn't matter so much. http://gerrit.cloudera.org:8080/#/c/8062/1/src/kudu/scripts/parse_metrics_log.py File src/kudu/scripts/parse_metrics_log.py: PS1, Line 99: if k == 'name': : continue : if k == 'values' or k == 'counts' or k == 'min' or k == 'max' or k == 'mean': : continue Nit: combine? Line 103: else: Nit: no need for the else. Line 161: cur.get('counts', [])) Nit: indentation here. Line 237: if prev_data and ts < prev_data['ts'] + 30: Worth a comment explaining this check. -- To view, visit http://gerrit.cloudera.org:8080/8062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I442cddd4c5352dcfae30fa23888082d916069d5f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] docs: split disk failure from disk config changes
Adar Dembo has posted comments on this change. Change subject: docs: split disk failure from disk config changes .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7984/4//COMMIT_MSG Commit Message: PS4, Line 9: administartion administration -- To view, visit http://gerrit.cloudera.org:8080/7984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I732286d0f56f7a15705ad544fc7dfc426287714e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] rename suicide on eio flag to crash on eio
Adar Dembo has submitted this change and it was merged. Change subject: rename suicide_on_eio flag to crash_on_eio .. rename suicide_on_eio flag to crash_on_eio The name `suicide_on_eio` seems a bit controversial/insensitive. As we move forward in documenting its usage with disk failures, it's probably best rename it. Change-Id: I6c83156865b0f8ac80a20701558119ace6c05399 Reviewed-on: http://gerrit.cloudera.org:8080/7994 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M docs/release_notes.adoc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/log_block_manager-test.cc M src/kudu/integration-tests/disk_failure-itest.cc M src/kudu/util/env-test.cc M src/kudu/util/env_posix.cc 8 files changed, 18 insertions(+), 18 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6c83156865b0f8ac80a20701558119ace6c05399 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Add tablet state summary metrics
Will Berkeley has submitted this change and it was merged. Change subject: Add tablet state summary metrics .. Add tablet state summary metrics This patch adds metrics for the number of tablets in each state (i.e. the number of tablets for each value of TabletStatePB). To avoid contention in the tablet manager, the frequency with which the state numbers are computed is limited. Requests that come too soon for an update will receive cached counts. The maximum frequency of updates is controlled by a new flag. Change-Id: I9fda2061d341586f0aa343787af59984a627080a Reviewed-on: http://gerrit.cloudera.org:8080/7980 Tested-by: Kudu Jenkins Reviewed-by: Mike Percy--- M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/CMakeLists.txt M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h A src/kudu/tserver/ts_tablet_manager_metrics.cc A src/kudu/tserver/ts_tablet_manager_metrics.h M src/kudu/util/metrics.cc M src/kudu/util/metrics.h 8 files changed, 380 insertions(+), 7 deletions(-) Approvals: Mike Percy: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9fda2061d341586f0aa343787af59984a627080a Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2135 (part 1): add persistent disk states
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8048 to look at the new patch set (#4). Change subject: KUDU-2135 (part 1): add persistent disk states .. KUDU-2135 (part 1): add persistent disk states This patch introduces a new format for PathInstanceMetadataFiles that includes a health state and path for each data directory, and a timestamp that the path instance file was last updated. These additions will be necessary to ensure that failed disks will not be used and that data is where it is expected be (which will be increasingly important as we begin to supporting removal, addition, and replacement of disks). The expected usage of the new format would enforce startup restrictions on the server (e.g. to never use a previously failed disk, to not use data if it is on a different disk than expected, etc.) and tooling does not yet exist to work around these restrictions. As such, this patch will upgrade and maintain the new path sets as necessary, persisting disk health when disks fail, but will not apply the mentioned restrictions at startup. Tests are added to data_dirs-test to ensure the path instance files get updated and upgraded appropriately. Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 --- M src/kudu/fs/block_manager_util-test.cc M src/kudu/fs/block_manager_util.cc M src/kudu/fs/block_manager_util.h M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/fs.proto 7 files changed, 540 insertions(+), 129 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/8048/4 -- To view, visit http://gerrit.cloudera.org:8080/8048 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] parse metrics log: merge metrics across entities of same type
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8062 to review the following change. Change subject: parse_metrics_log: merge metrics across entities of same type .. parse_metrics_log: merge metrics across entities of same type This changes the tool so that all of the tablet metrics get summed together before reporting. Previously we just reported the info from an arbitrary tablet (probably the last one to be listed in the log). This also adds a short circuit such that we skip parsing and merging of metrics which have not been slated for output. This sped up processing a metrics file ~10x when reporting only one or two metrics from it. Lastly, this adds reporting of the max value from histograms. Change-Id: I442cddd4c5352dcfae30fa23888082d916069d5f --- M src/kudu/scripts/parse_metrics_log.py 1 file changed, 80 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/8062/1 -- To view, visit http://gerrit.cloudera.org:8080/8062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I442cddd4c5352dcfae30fa23888082d916069d5f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo
[kudu-CR] Update instructions for publishing changes to the live site
Dan Burkert has posted comments on this change. Change subject: Update instructions for publishing changes to the live site .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8059/1/README.adoc File README.adoc: Line 348: cd _publish && git push # Update the live web site. May want to add a note about having to push an empty commit if the change is large in order to work around a bug in the ASF gitpubsub -- To view, visit http://gerrit.cloudera.org:8080/8059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7fc1a40d6deedf8eb1a9d8e6d13f803f6109c06b Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: Add backdated announcement blog post for 1.5 .. Add backdated announcement blog post for 1.5 Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Reviewed-on: http://gerrit.cloudera.org:8080/8058 Reviewed-by: Alexey SerbinReviewed-by: Jean-Daniel Cryans Tested-by: Jean-Daniel Cryans --- A _posts/2017-09-08-apache-kudu-1-5-0-released.md 1 file changed, 39 insertions(+), 0 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved; Verified Alexey Serbin: Looks good to me, but someone else must approve -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Jean-Daniel Cryans has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Jean-Daniel Cryans has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8016 to look at the new patch set (#14). Change subject: KUDU-2125: Tablet copy client does not retry on failures .. KUDU-2125: Tablet copy client does not retry on failures The tablet copy client would fail the tablet copy after encountering an RPC error. This commit adds retry logic for tablet copy operations when encountering SERVER_TOO_BUSY or UNAVAILABLE errors, which are retriable. Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy_client_session-itest.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h 5 files changed, 139 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/8016/14 -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 14 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8016 to look at the new patch set (#13). Change subject: KUDU-2125: Tablet copy client does not retry on failures .. KUDU-2125: Tablet copy client does not retry on failures The tablet copy client would fail the tablet copy after encountering an RPC error. This commit adds retry logic for tablet copy operations when encountering SERVER_TOO_BUSY or UNAVAILABLE errors, which are retriable. Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy_client_session-itest.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h 5 files changed, 139 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/8016/13 -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Dan Burkert has posted comments on this change. Change subject: KUDU-2125: Tablet copy client does not retry on failures .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8016/10/src/kudu/tserver/tablet_copy_client.cc File src/kudu/tserver/tablet_copy_client.cc: PS10, Line 702: >= > It can happen if the server sends the wrong total_data_length, or sends a c I'll just fix it. -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Dan Burkert has posted comments on this change. Change subject: KUDU-2125: Tablet copy client does not retry on failures .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8016/10/src/kudu/tserver/tablet_copy_client.cc File src/kudu/tserver/tablet_copy_client.cc: PS10, Line 702: >= > Can that actually happen? I think the checksum would be invalid on the bloc It can happen if the server sends the wrong total_data_length, or sends a chunk that exceeds the total_data_length. We aren't currently defensive about checking this. -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Mike Percy has posted comments on this change. Change subject: KUDU-2125: Tablet copy client does not retry on failures .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8016/10/src/kudu/tserver/tablet_copy_client.cc File src/kudu/tserver/tablet_copy_client.cc: PS10, Line 702: >= > I thought about it, and mildly preferred >= so that the client doesn't cont Can that actually happen? I think the checksum would be invalid on the block. I'm nervous about adding fuzzy math to our data copying code. -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] rename suicide on eio flag to crash on eio
Adar Dembo has posted comments on this change. Change subject: rename suicide_on_eio flag to crash_on_eio .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c83156865b0f8ac80a20701558119ace6c05399 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Alexey Serbin has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] rename suicide on eio flag to crash on eio
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7994 to look at the new patch set (#3). Change subject: rename suicide_on_eio flag to crash_on_eio .. rename suicide_on_eio flag to crash_on_eio The name `suicide_on_eio` seems a bit controversial/insensitive. As we move forward in documenting its usage with disk failures, it's probably best rename it. Change-Id: I6c83156865b0f8ac80a20701558119ace6c05399 --- M docs/release_notes.adoc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/log_block_manager-test.cc M src/kudu/integration-tests/disk_failure-itest.cc M src/kudu/util/env-test.cc M src/kudu/util/env_posix.cc 8 files changed, 18 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/7994/3 -- To view, visit http://gerrit.cloudera.org:8080/7994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c83156865b0f8ac80a20701558119ace6c05399 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Hello Alexey Serbin, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8058 to look at the new patch set (#4). Change subject: Add backdated announcement blog post for 1.5 .. Add backdated announcement blog post for 1.5 Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 --- A _posts/2017-09-08-apache-kudu-1-5-0-released.md 1 file changed, 39 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/8058/4 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Dan Burkert has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8058/3/_posts/2017-09-08-apache-kudu-1-5-0-released.md File _posts/2017-09-08-apache-kudu-1-5-0-released.md: PS3, Line 16: - optimizations to improve write throughput and failover recovery times. : - the Raft consensus implementation has been made more resilient and flexible : through "tombstoned voting", which allows Kudu to self-heal in more edge-case : scenarios. : - the number of threads used by Kudu servers has been further reduced, with : additional reductions planned for the future. : - a new configuration dashboard on the web UI which provides a high-level : summary of important configuration values. : - a new `kudu tablet move` command which moves a tablet replica from one tablet : server to another. : - a new `kudu local_replica data_size` command which summarizes the space usage : of a local tablet. : - all on-disk data is now checksummed by default, which provides error detection : for improved confidence when running Kudu on unreliable hardware. > Style nit (didn't notice in the first revision; feel free to ignore): maybe Done -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Alexey Serbin has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/8058/3/_posts/2017-09-08-apache-kudu-1-5-0-released.md File _posts/2017-09-08-apache-kudu-1-5-0-released.md: PS3, Line 16: - optimizations to improve write throughput and failover recovery times. : - the Raft consensus implementation has been made more resilient and flexible : through "tombstoned voting", which allows Kudu to self-heal in more edge-case : scenarios. : - the number of threads used by Kudu servers has been further reduced, with : additional reductions planned for the future. : - a new configuration dashboard on the web UI which provides a high-level : summary of important configuration values. : - a new `kudu tablet move` command which moves a tablet replica from one tablet : server to another. : - a new `kudu local_replica data_size` command which summarizes the space usage : of a local tablet. : - all on-disk data is now checksummed by default, which provides error detection : for improved confidence when running Kudu on unreliable hardware. Style nit (didn't notice in the first revision; feel free to ignore): maybe, either capitalize the first letter of every item or remove the trailing period. -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Dan Burkert has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8058/2/_posts/2017-09-08-apache-kudu-1-5-0-released.md File _posts/2017-09-08-apache-kudu-1-5-0-released.md: PS2, Line 20: futher > seems like a typo; should it be 'further'? Done -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Dan Burkert has uploaded a new patch set (#3). Change subject: Add backdated announcement blog post for 1.5 .. Add backdated announcement blog post for 1.5 Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 --- A _posts/2017-09-08-apache-kudu-1-5-0-released.md 1 file changed, 39 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/8058/3 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Alexey Serbin has posted comments on this change. Change subject: Add backdated announcement blog post for 1.5 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8058/2/_posts/2017-09-08-apache-kudu-1-5-0-released.md File _posts/2017-09-08-apache-kudu-1-5-0-released.md: PS2, Line 20: futher seems like a typo; should it be 'further'? -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2055 [part 2]: Add util to construct sorted disjoint interval
Andrew Wong has posted comments on this change. Change subject: KUDU-2055 [part 2]: Add util to construct sorted disjoint interval .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/8041/1/src/kudu/util/sorted_disjoint_interval-test.cc File src/kudu/util/sorted_disjoint_interval-test.cc: PS1, Line 50: struct DisjointIntervalComparator { : bool operator() (const ClosedInterval& first, const ClosedInterval& second) const { : if (first.left == second.left) return first.right < second.right; : return first.left < second.left; : } : }; nit: why does this need to be its own struct? why not a lambda within sort()? PS1, Line 58: IntTraits nit: consider IntIntervalTraits or something? Or just IntInterval? http://gerrit.cloudera.org:8080/#/c/8041/1/src/kudu/util/sorted_disjoint_interval.h File src/kudu/util/sorted_disjoint_interval.h: PS1, Line 17: #ifndef KUDU_UTIL_SORTED_DISJOINT_INTERVAL_H : #define KUDU_UTIL_SORTED_DISJOINT_INTERVAL_H Use #pragma once for C++11 ! PS1, Line 27: A sorted disjoint interval is a data structure which stores a set of intervals that : // are sorted and disjoint. nit: How about, "A set of sorted disjoint intervals holds a list of non-overlapping ranges." PS1, Line 33: |--1---| |-2-| nit: to emphasize the sortedness of the sorted disjoint interval set, maybe flip 1 and 2? PS1, Line 46: The Traits class should have the following members: : // Traits::point_type : // a typedef for what a "point" in the range is : // : // Traits::interval_type : // a typedef for an interval : // : // static vector sort(const vector& interval) : // sort the given intervals. : // : // static bool valid(const interval_type& interval) : // return true if the interval is valid. : // : // static interval_type coalesce(const interval_type& a, const interval_type& b) : // coalesce two intervals. : // : // static bool overlap(const interval_type& a, const interval_type& b) : // return true if two intervals are overlapped. Is there a way to encapsulate this into some sort of templatized abstract base class? Not sure how easy/difficult that'd be with the templates. PS1, Line 65: template : class SortedDisjointInterval { Why does SortedDisjointInterval need its own class; could it just be a templatized utility method that takes in a vector of s and returns a vector of s? PS1, Line 66: SortedDisjointInterval nit: it's more of a SortedDisjointIntervalList -- To view, visit http://gerrit.cloudera.org:8080/8041 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I61a813c047be4882f246eaf404598e7e18fcac87 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Dan Burkert has uploaded a new patch set (#2). Change subject: Add backdated announcement blog post for 1.5 .. Add backdated announcement blog post for 1.5 Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 --- A _posts/2017-09-08-apache-kudu-1-5-0-released.md 1 file changed, 39 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/8058/2 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Dan Burkert has posted comments on this change. Change subject: KUDU-2125: Tablet copy client does not retry on failures .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/8016/4/src/kudu/integration-tests/tablet_copy_client_session-itest.cc File src/kudu/integration-tests/tablet_copy_client_session-itest.cc: PS4, Line 305: ASSERT_OK > I'm surprised at this. I've had weird issues trying to do this before and n Done -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Update instructions for publishing changes to the live site
Hello Dan Burkert, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8059 to review the following change. Change subject: Update instructions for publishing changes to the live site .. Update instructions for publishing changes to the live site Change-Id: I7fc1a40d6deedf8eb1a9d8e6d13f803f6109c06b --- M README.adoc 1 file changed, 21 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/8059/1 -- To view, visit http://gerrit.cloudera.org:8080/8059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I7fc1a40d6deedf8eb1a9d8e6d13f803f6109c06b Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Dan Burkert
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8016 to look at the new patch set (#12). Change subject: KUDU-2125: Tablet copy client does not retry on failures .. KUDU-2125: Tablet copy client does not retry on failures The tablet copy client would fail the tablet copy after encountering an RPC error. This commit adds retry logic for tablet copy operations when encountering SERVER_TOO_BUSY or UNAVAILABLE errors, which are retriable. Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy_client_session-itest.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h 5 files changed, 133 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/8016/12 -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Add backdated announcement blog post for 1.5
Hello Jean-Daniel Cryans, Mike Percy, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8058 to review the following change. Change subject: Add backdated announcement blog post for 1.5 .. Add backdated announcement blog post for 1.5 Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 --- A _posts/2017-09-08-apache-kudu-1-5-0-released.md 1 file changed, 39 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/8058/1 -- To view, visit http://gerrit.cloudera.org:8080/8058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I301db21a8c5e75b2cbe47b3992094d7b1eca7731 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Dan BurkertGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2055 [part 2]: Add util to construct sorted disjoint interval
Alexey Serbin has posted comments on this change. Change subject: KUDU-2055 [part 2]: Add util to construct sorted disjoint interval .. Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/8041/1/src/kudu/util/sorted_disjoint_interval-test.cc File src/kudu/util/sorted_disjoint_interval-test.cc: PS1, Line 36: struct ClosedInterval { Should this be a class/structure template as well (with type for the left and right point)? PS1, Line 37: int left, int right If not using a class template for this ClosedInterval, consider making the parameters for this constructor and the left and right members of the same type. Line 46: int64_t left; // Inclusive nit: const? Line 47: int64_t right; // Inclusive int: const? PS1, Line 85: vector expected; : expected.emplace_back(-23, 2); : expected.emplace_back(3, 7); nit: here and below consider replacing with something like const vector expected = { {-23, 2}, {3, 7} }; Line 132: I don't see any coverage for single element intervals, like { {0, 0}, {0, 1}, {1, 1} } etc. Consider adding tests for that as well. http://gerrit.cloudera.org:8080/#/c/8041/1/src/kudu/util/sorted_disjoint_interval.h File src/kudu/util/sorted_disjoint_interval.h: PS1, Line 89: inline Maybe, just move it inside the class definition above? PS1, Line 98: !Traits::valid(*cur) Two questions here: 1. Is it necessary to check for the end of the container here? 2. Why not to move this under the for () cycle below? PS1, Line 106: cur++ ++cur might be better since it doesn't need to store the prev iterator value. -- To view, visit http://gerrit.cloudera.org:8080/8041 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I61a813c047be4882f246eaf404598e7e18fcac87 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#9). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/9 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2125: Tablet copy client does not retry on failures
Dan Burkert has posted comments on this change. Change subject: KUDU-2125: Tablet copy client does not retry on failures .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/8016/10/src/kudu/integration-tests/cluster_itest_util.cc File src/kudu/integration-tests/cluster_itest_util.cc: PS10, Line 858: != > Neat: I didn't know you could check for equality between a thing and an opt Yah, it auto derefs it. Very dangerous if you don't do a check up front, though. I don't think it throws an exception or anything. http://gerrit.cloudera.org:8080/#/c/8016/10/src/kudu/tserver/tablet_copy_client.cc File src/kudu/tserver/tablet_copy_client.cc: PS10, Line 702: >= > why >= and not == ? I thought about it, and mildly preferred >= so that the client doesn't continue looping indefinitely if the server sends a block with a too-long length. I'm not sure what would happen in that scenario, but infinite loop doesn't seem good. Line 747: // Polynomial backoff with 50% jitter. > Have you used this before? I plotted y=10x^2 and it seems pretty good -- we I briefly tried to see if we had an abstraction for exponential backoff with jitter calculation, and it seems we don't. This was just gut feeling. I also graphed it initially, for anyone else following along: https://www.wolframcloud.com/objects/b532668a-64c0-45d4-8fad-dfce1718cda6 -- To view, visit http://gerrit.cloudera.org:8080/8016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7c8454fc600a841bd15306a2b3b06ddf53130be6 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients
Jean-Daniel Cryans has posted comments on this change. Change subject: KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8026/2//COMMIT_MSG Commit Message: PS2, Line 20: with TABLET_NOT_RUNNING : master errors Mind pointing me where that is? http://gerrit.cloudera.org:8080/#/c/8026/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: Line 641: // TODO(adar): how can we do getTableSchemaRpc.setParent(rpc)? Have a getTableSchema that takes a parent rpc? -- To view, visit http://gerrit.cloudera.org:8080/8026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54fa07dc34a97f1c9da06ec9129d6d4590b7aac6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2135 (part 1): add persistent disk states
Andrew Wong has posted comments on this change. Change subject: KUDU-2135 (part 1): add persistent disk states .. Patch Set 2: Ah, Adar mentioned it'd be a good idea to decompose DataDirManager::Open() since it's so long. I'll do that now (but for the most part, this is reviewable) -- To view, visit http://gerrit.cloudera.org:8080/8048 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients
Dan Burkert has posted comments on this change. Change subject: KUDU-1807 (part 2): ban GetTableSchema for table createdness in clients .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8026/1/java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneRequest.java File java/kudu-client/src/main/java/org/apache/kudu/client/IsCreateTableDoneRequest.java: Line 36: IsCreateTableDoneRequest(KuduTable table, String name) { > I suppose so, but IsAlterTableDoneRequest behaves this way, as do both IsAl I believe it's wrong to use the table name in any context where the KuduTable object is available, since it should have an attached Kudu table ID (and table name, incidentally, so the separate name isn't even necessary here). Anything else is a TOCTOU violation that the user can't work around. Now, if the API took _only_ a table name (and not the table object), I could understand sending the RPC without an ID. -- To view, visit http://gerrit.cloudera.org:8080/8026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54fa07dc34a97f1c9da06ec9129d6d4590b7aac6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2135 (part 2): check integrity of new on-disk format
Andrew Wong has posted comments on this change. Change subject: KUDU-2135 (part 2): check integrity of new on-disk format .. Patch Set 28: (5 comments) Note: another significant portion of this patch has been pushed into another (see https://gerrit.cloudera.org/#/c/8048/) This patch is reviewable, but should not be merged until tooling is written to change the path set instance files. http://gerrit.cloudera.org:8080/#/c/7270/26/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: Line 543: InsertOrDie(_by_uuid, uuid, idx); > This is confusing w.r.t. failed_data_dirs. Renamed this to unassigned_dirs; in this case they're equivalent (a failed dir will be unassigned), but this provides better context for what we're going to do with them. PS26, Line 633: : // Finally, initialize the 'fullness' status of the data directories. : for (const auto& dd : data_dirs_) { : uint16_t uuid_idx; : CHECK(FindUuidIndexByDataDir(dd.get(), _idx)); : if (ContainsKey(failed_data_dirs_, uuid_idx)) { : continue; : } : Status refresh_status = dd->RefreshIsFull(DataDir::RefreshMode::ALWAYS); : if (PREDICT_FALSE(!refresh_status.ok())) { : if (refresh_status.IsDiskFailure()) { : RETURN_NOT_OK(MarkDataDirFailed(uuid_idx, refresh_status.ToString())); : continue; : } : return refresh_status; : > Why was this moved all the way down here? You could use instance health to The latter. This is safeguarding if we fail during the fullness refresh. Line 744: } else { > Why crash and not return an IOError, letting a caller higher in the stack c Done Line 908: > Hmm, what if we just skipped WritePathHealthStates? Would that be incorrect I suppose it wouldn't be incorrect; the server could continue running without marking on-disk that the disk has failed. Line 953: if (read_only_) { > Can you add a comment explaining what this is serializing and why? Done -- To view, visit http://gerrit.cloudera.org:8080/7270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f Gerrit-PatchSet: 28 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add tablet state summary metrics
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7980 to look at the new patch set (#9). Change subject: Add tablet state summary metrics .. Add tablet state summary metrics This patch adds metrics for the number of tablets in each state (i.e. the number of tablets for each value of TabletStatePB). To avoid contention in the tablet manager, the frequency with which the state numbers are computed is limited. Requests that come too soon for an update will receive cached counts. The maximum frequency of updates is controlled by a new flag. Change-Id: I9fda2061d341586f0aa343787af59984a627080a --- M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/CMakeLists.txt M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h A src/kudu/tserver/ts_tablet_manager_metrics.cc A src/kudu/tserver/ts_tablet_manager_metrics.h M src/kudu/util/metrics.cc M src/kudu/util/metrics.h 8 files changed, 380 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/7980/9 -- To view, visit http://gerrit.cloudera.org:8080/7980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9fda2061d341586f0aa343787af59984a627080a Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add tablet state summary metrics
Mike Percy has posted comments on this change. Change subject: Add tablet state summary metrics .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/7980/8/src/kudu/tserver/ts_tablet_manager_metrics.h File src/kudu/tserver/ts_tablet_manager_metrics.h: PS8, Line 56: TSTabletManager* ts_tablet_manager; : : FunctionGaugeDetacher metric_detacher; Don't shoot me, but shouldn't these two members be private? Aren't they never meant to be accessed by other classes? -- To view, visit http://gerrit.cloudera.org:8080/7980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fda2061d341586f0aa343787af59984a627080a Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] KUDU-2135 (part 1): add persistent disk states
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8048 to look at the new patch set (#3). Change subject: KUDU-2135 (part 1): add persistent disk states .. KUDU-2135 (part 1): add persistent disk states This patch introduces a new format for PathInstanceMetadataFiles that includes a health state and path for each data directory, and a timestamp that the path instance file was last updated. These additions will be necessary to ensure that failed disks will not be used and that data is where it is expected be (which will be increasingly important as we begin to supporting removal, addition, and replacement of disks). The expected usage of the new format would enforce startup restrictions on the server (e.g. to never use a previously failed disk, to not use data if it is on a different disk than expected, etc.) and tooling does not yet exist to work around these restrictions. As such, this patch will upgrade and maintain the new path sets as necessary, persisting disk health when disks fail, but will not apply the mentioned restrictions at startup. Tests are added to data_dirs-test to ensure the path instance files get updated and upgraded appropriately. Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 --- M src/kudu/fs/block_manager_util-test.cc M src/kudu/fs/block_manager_util.cc M src/kudu/fs/block_manager_util.h M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/fs.proto 7 files changed, 473 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/8048/3 -- To view, visit http://gerrit.cloudera.org:8080/8048 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7ea6418f6190d7d0c6d5a1aaca4ca592c49ce41 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Kudu Jenkins