[kudu-CR] WIP: KUDU-1366: allow building against jemalloc

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6918

to look at the new patch set (#3).

Change subject: WIP: KUDU-1366: allow building against jemalloc
..

WIP: KUDU-1366: allow building against jemalloc

This adds jemalloc to thirdparty and hooks it up for process memory accounting.
I also did a couple little jemalloc-specific optimizations in faststring.

WIP because it needs benchmarking to see if this is truly a win.

Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
---
M CMakeLists.txt
M src/kudu/server/CMakeLists.txt
M src/kudu/util/CMakeLists.txt
M src/kudu/util/faststring-test.cc
M src/kudu/util/faststring.cc
M src/kudu/util/malloc.h
M src/kudu/util/process_memory.cc
M thirdparty/build-definitions.sh
M thirdparty/build-thirdparty.sh
M thirdparty/download-thirdparty.sh
M thirdparty/vars.sh
11 files changed, 168 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/6918/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [c++ client] less logging about authn token

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [c++ client] less logging about authn token
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [c++ client] less logging about authn token

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: [c++ client] less logging about authn token
..


[c++ client] less logging about authn token

LOG(INFO) --> VLOG(2) on receiving and adopting authn token by
the Kudu C++ client.

Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
Reviewed-on: http://gerrit.cloudera.org:8080/6914
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/client/client-internal.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] WIP: KUDU-1366: allow building against jemalloc

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new patch set (#2).

Change subject: WIP: KUDU-1366: allow building against jemalloc
..

WIP: KUDU-1366: allow building against jemalloc

This adds jemalloc to thirdparty and hooks it up for process memory accounting.
I also did a couple little jemalloc-specific optimizations in faststring.

WIP because it needs benchmarking to see if this is truly a win.

Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
---
M CMakeLists.txt
M src/kudu/server/CMakeLists.txt
M src/kudu/util/CMakeLists.txt
M src/kudu/util/faststring-test.cc
M src/kudu/util/faststring.cc
M src/kudu/util/malloc.h
M src/kudu/util/process_memory.cc
M thirdparty/build-definitions.sh
M thirdparty/build-thirdparty.sh
M thirdparty/download-thirdparty.sh
M thirdparty/vars.sh
11 files changed, 165 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/6918/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1998. Re-enable tcmalloc on OSX

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1998. Re-enable tcmalloc on OSX
..


KUDU-1998. Re-enable tcmalloc on OSX

We switched back away from using malloc hooks for memory tracking, so
this shouldn't be an issue anymore.

Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
Reviewed-on: http://gerrit.cloudera.org:8080/6917
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M CMakeLists.txt
1 file changed, 1 insertion(+), 3 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6917
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2017. Fix calculation of perf improvement for flushes

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-2017. Fix calculation of perf improvement for flushes
..


KUDU-2017. Fix calculation of perf improvement for flushes

The code to calculate the MM "perf improvement score" for a flush had two bugs:

  (1) we calculated threshold - current_usage instead of current_usage -
  threshold, which resulted in a negative score.

  (2) we had an unsigned integer overflow, which resulted in the above negative
  score becoming insanely large.

These two wrongs "made a right" in which we'd still trigger flushes at the
flush threshold, which is why this went unnoticed for quite some time. However,
the flushing behavior is more aggressive than originally intended, and we would
lose the correct prioritization of flushing tablets that are farther above the
threshold.

This patch addresses the issue.

I tested using tpch_real_world against a server configured with a 40G
memory limit. I verified the 'perf improvement' just prior to a flush
was the expected value (151 perf improvement listed for a tablet with
1.15G MRS and 1G flush threshold)

Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Reviewed-on: http://gerrit.cloudera.org:8080/6908
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
2 files changed, 7 insertions(+), 6 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1998. Re-enable tcmalloc on OSX

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1998. Re-enable tcmalloc on OSX
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6917
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-2017. Fix calculation of perf improvement for flushes

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-2017. Fix calculation of perf improvement for flushes
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1998. Re-enable tcmalloc on OSX

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6917

Change subject: KUDU-1998. Re-enable tcmalloc on OSX
..

KUDU-1998. Re-enable tcmalloc on OSX

We switched back away from using malloc hooks for memory tracking, so
this shouldn't be an issue anymore.

Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
---
M CMakeLists.txt
1 file changed, 1 insertion(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6917/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6917
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 


[kudu-CR] WIP: KUDU-1366: allow building against jemalloc

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6918

Change subject: WIP: KUDU-1366: allow building against jemalloc
..

WIP: KUDU-1366: allow building against jemalloc

This adds jemalloc to thirdparty and hooks it up for process memory accounting.
I also did a couple little jemalloc-specific optimizations in faststring.

WIP because it needs benchmarking to see if this is truly a win.

Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
---
M CMakeLists.txt
M src/kudu/server/CMakeLists.txt
M src/kudu/util/CMakeLists.txt
M src/kudu/util/faststring-test.cc
M src/kudu/util/faststring.cc
M src/kudu/util/malloc.h
M src/kudu/util/process_memory.cc
M thirdparty/build-definitions.sh
M thirdparty/build-thirdparty.sh
M thirdparty/download-thirdparty.sh
M thirdparty/vars.sh
11 files changed, 165 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/6918/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I59f77e9f1ebdf6f1451e0f723a91cda4790c434b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 


[kudu-CR] KUDU-2017. Fix calculation of perf improvement for flushes

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6908

to look at the new patch set (#2).

Change subject: KUDU-2017. Fix calculation of perf improvement for flushes
..

KUDU-2017. Fix calculation of perf improvement for flushes

The code to calculate the MM "perf improvement score" for a flush had two bugs:

  (1) we calculated threshold - current_usage instead of current_usage -
  threshold, which resulted in a negative score.

  (2) we had an unsigned integer overflow, which resulted in the above negative
  score becoming insanely large.

These two wrongs "made a right" in which we'd still trigger flushes at the
flush threshold, which is why this went unnoticed for quite some time. However,
the flushing behavior is more aggressive than originally intended, and we would
lose the correct prioritization of flushing tablets that are farther above the
threshold.

This patch addresses the issue.

I tested using tpch_real_world against a server configured with a 40G
memory limit. I verified the 'perf improvement' just prior to a flush
was the expected value (151 perf improvement listed for a tablet with
1.15G MRS and 1G flush threshold)

Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
---
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
2 files changed, 7 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/6908/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: process_memory: go back to non-incremental tracking
..


process_memory: go back to non-incremental tracking

The incremental tracking has more overhead than expected, even with our
fancy striped counters, etc. In tpch_real_world benchmarks, the
LongAdder::IncrementBy() and GetAllocatedSize() calls in the hooks
were taking a significant amount of CPU (in the top 10 consumers).

Here's yet another approach: back to non-incremental tracking, but with
an optimization that, if we've calculated the memory usage in the last
50ms, or if another thread is already in the process of calculating, we
don't re-calculate it.

This has some minor risk of "overshooting" the memory limit, but since
our limiting is already probabilistic and not 100% thorough anyway, I
think that's acceptable.

Benchmarked using process_memory-test:

Before:
I0517 16:46:09.596946  7419 process_memory-test.cc:68] Performed 2.18441e+06 
iters/sec

After:
I0517 16:45:50.497973  7051 process_memory-test.cc:68] Performed 1.22264e+08 
iters/sec

(55x speedup)

I also ran tpch_real_world and graphed CPU consumed per inserted row and
found it to be noticeably better with this optimized version
(approximately 7-8% by eye-balling it). I spot-checked perf results and
don't see any tcmalloc stat-related methods in the high consumers list
anymore. I also checked in this workload that the heap usage was
properly limited (indistinguishable from the before-patch
implementation)

Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Reviewed-on: http://gerrit.cloudera.org:8080/6915
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo 
---
M src/kudu/tserver/tablet_server-stress-test.cc
M src/kudu/util/process_memory.cc
2 files changed, 28 insertions(+), 41 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1125 (part 1) catalog manager: try to avoid unnecessarily rewriting tablet info

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1125 (part 1) catalog_manager: try to avoid unnecessarily 
rewriting tablet info
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6916/1//COMMIT_MSG
Commit Message:

PS1, Line 17: I manually verified that, if I restarted a master with a single TS
: reporting to it, when it came back up, it was able to skip writes 
to
: disk when receiving tablet reports redundant with existing 
information.
Any way to automate the testing? Perhaps using catalog write metrics?


-- 
To view, visit http://gerrit.cloudera.org:8080/6916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0de7189b8f1dbeea55172929396b73fd92fd62ba
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1826 add propagated timestamp to sync client

2017-05-17 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1826 add propagated timestamp to sync client
..


Patch Set 8:

Failure seems to be an unrelated spark flake.

-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

PS1, Line 198: GetMonoTimeMicros()
> Makes sense. Could you add a comment explaining this, so no one "optimizes 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1125 (part 1) catalog manager: try to avoid unnecessarily rewriting tablet info

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Adar Dembo,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6916

to review the following change.

Change subject: KUDU-1125 (part 1) catalog_manager: try to avoid unnecessarily 
rewriting tablet info
..

KUDU-1125 (part 1) catalog_manager: try to avoid unnecessarily rewriting tablet 
info

This adds a flag on CowObject locks to track whether the mutable data
field has ever been accessed. In the case that it's not accessed, we can
guarantee that the underlying object didn't change, in which case it's
safe to avoid writing it to disk during a tablet report.

This is a simple way of short-circuiting a common case of unnecessary
slowness in tablet report processing.

I manually verified that, if I restarted a master with a single TS
reporting to it, when it came back up, it was able to skip writes to
disk when receiving tablet reports redundant with existing information.

This doesn't achieve all of the goals of KUDU-1125 (we still do separate
sync writes for each reported tablet) but should still be a substantial
reduction. Perhaps most importantly, if a heartbeat from a tserver times
out due to long processing, the retry from the TS should hit this code
path and therefore be processed quickly.

Change-Id: I0de7189b8f1dbeea55172929396b73fd92fd62ba
---
M src/kudu/master/catalog_manager.cc
M src/kudu/util/cow_object.h
2 files changed, 36 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/6916/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0de7189b8f1dbeea55172929396b73fd92fd62ba
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6915

to look at the new patch set (#2).

Change subject: process_memory: go back to non-incremental tracking
..

process_memory: go back to non-incremental tracking

The incremental tracking has more overhead than expected, even with our
fancy striped counters, etc. In tpch_real_world benchmarks, the
LongAdder::IncrementBy() and GetAllocatedSize() calls in the hooks
were taking a significant amount of CPU (in the top 10 consumers).

Here's yet another approach: back to non-incremental tracking, but with
an optimization that, if we've calculated the memory usage in the last
50ms, or if another thread is already in the process of calculating, we
don't re-calculate it.

This has some minor risk of "overshooting" the memory limit, but since
our limiting is already probabilistic and not 100% thorough anyway, I
think that's acceptable.

Benchmarked using process_memory-test:

Before:
I0517 16:46:09.596946  7419 process_memory-test.cc:68] Performed 2.18441e+06 
iters/sec

After:
I0517 16:45:50.497973  7051 process_memory-test.cc:68] Performed 1.22264e+08 
iters/sec

(55x speedup)

I also ran tpch_real_world and graphed CPU consumed per inserted row and
found it to be noticeably better with this optimized version
(approximately 7-8% by eye-balling it). I spot-checked perf results and
don't see any tcmalloc stat-related methods in the high consumers list
anymore. I also checked in this workload that the heap usage was
properly limited (indistinguishable from the before-patch
implementation)

Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
---
M src/kudu/tserver/tablet_server-stress-test.cc
M src/kudu/util/process_memory.cc
2 files changed, 28 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/6915/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1826 add propagated timestamp to sync client

2017-05-17 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1826 add propagated timestamp to sync client
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java:

PS7, Line 237: @return a long indicating the last timestamp received from a 
server
> Can you mention that the timestamp can't be interpreted as absolute time (i
Done


http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java:

PS7, Line 44: NO_TIMESTAMP
> don't we have this constant defined somewhere else?
We do in the AsyncKuduClient. I defined it again here since it seems a bit 
strange that we'd otherwise force users to do something like:
(KuduClient.getLastPropagatedTimestamp() == AsyncKuduClient.NO_TIMESTAMP)

Would having an alternative for statements like ^that warrant the redundancy?

This is noted in the commit message.


http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

PS7, Line 92: long initial_ts = syncClient.getLastPropagatedTimestamp();
: 
: // Check that the initial timestamp is consistent with the 
asynchronous client.
: assertEquals(initial_ts, client.getLastPropagatedTimestamp());
: assertEquals(initial_ts, 
syncClient.getLastPropagatedTimestamp());
: 
: // Attempt to change the timestamp to a lower value. This 
should not change
: // the internal timestamp, as it must be monotonically 
increasing.
: syncClient.updateLastPropagatedTimestamp(initial_ts - 1);
: assertEquals(initial_ts, client.getLastPropagatedTimestamp());
: assertEquals(initial_ts, 
syncClient.getLastPropagatedTimestamp());
: 
: // Use the synchronous client to update the last propagated 
timestamp and
: // check with both clients that the timestamp was updated.
: syncClient.updateLastPropagatedTimestamp(initial_ts + 1);
: assertEquals(initial_ts + 1, 
client.getLastPropagatedTimestamp());
: assertEquals(initial_ts + 1, 
syncClient.getLastPropagatedTimestamp());
> can you actually perform some writes or something to test that the timestam
Fair point, I'd originally begun adding some tests elsewhere, but opted to move 
to a separate test. Will do.


-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1826 add propagated timestamp to sync client

2017-05-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6798

to look at the new patch set (#8).

Change subject: KUDU-1826 add propagated timestamp to sync client
..

KUDU-1826 add propagated timestamp to sync client

Updating and getting the propagated timestamp only exist as functions
of the AsyncKuduClient, but these functions are still useful to the
KuduClient.

This patch exposes AsyncKuduClient's updateLastPropagatedTimestamp()
and getLastPropagatedTimestamp() to the KuduClient. The static value
NO_TIMESTAMP is also added to the KuduClient so timestamps returned
by KuduClient.getLastPropagatedTimestamp() can be compared with
KuduClient.NO_TIMESTAMP.

Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M 
java/kudu-client/src/test/java/org/apache/kudu/client/TestScannerMultiTablet.java
4 files changed, 67 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/6798/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6915/1//COMMIT_MSG
Commit Message:

PS1, Line 34: noticeably better
> I think it's more significant -- eg, where the red line crosses 20 us/row, 
Ah yes, the Y axis is raw cpu micros, not percentages. Makes sense, thanks for 
the correction.


http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

Line 191:   const int64_t kReadIntervalMicros = 5;
> If we're doing 1M inserts/sec, then it must be with a smaller row size (1M 
I buy it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6915/1//COMMIT_MSG
Commit Message:

PS1, Line 34: noticeably better
> Thanks. It looks like the difference between the two implementations is abo
I think it's more significant -- eg, where the red line crosses 20 us/row, the 
blue line is around 18.5 us/row. So, that's 7.5% or so.


http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

Line 191:   const int64_t kReadIntervalMicros = 5;
> I thought we could consume more than that, even in such a short period of t
If we're doing 1M inserts/sec, then it must be with a smaller row size (1M * 1k 
row size = 1GB/sec, which is more than I've seen us ingest per node). And I 
think in that case, overshooting by 50M isn't so bad, either, considering to 
sustain 1GB ingest per second you must have a relatively high memory limit, 
lots of maintenance threads, etc, and thus 50M is only a tiny percentage of 
your overall heap, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6915/1//COMMIT_MSG
Commit Message:

PS1, Line 34: noticeably better
> takes a long time to run, so I didn't wait for the whole thing to finish :)
Thanks. It looks like the difference between the two implementations is about 
1-2% CPU time. Am I reading the graph correctly?


http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

Line 191:   const int64_t kReadIntervalMicros = 5;
> figured that we would have a hard time using more than a few MB in 50ms, an
I thought we could consume more than that, even in such a short period of time. 
Like, ballpark 1M inserts/s, so 50K for every 50ms. With a 1K row size, that's 
50 MB.

Do you think that's achievable?


PS1, Line 198: GetMonoTimeMicros()
> I wanted to re-fetch time, so that we guarantee the sleep _between_ calls i
Makes sense. Could you add a comment explaining this, so no one "optimizes 
away" the second call?


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6915/1//COMMIT_MSG
Commit Message:

PS1, Line 34: noticeably better
> Can you quantify that?
takes a long time to run, so I didn't wait for the whole thing to finish :) 
Here's the plot I looked at, though:
https://imagebin.ca/v/3Mq0VwvSGSJ1


http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

Line 191:   const int64_t kReadIntervalMicros = 5;
> I'm curious as to how you arrived at this number. My instinct would be to g
figured that we would have a hard time using more than a few MB in 50ms, and it 
would definitely avoid contention?


PS1, Line 198: GetMonoTimeMicros()
> Shouldn't this be 'time'?
I wanted to re-fetch time, so that we guarantee the sleep _between_ calls is at 
least 50ms. That way, if the call itself started being slow (eg because we have 
1 threads) we guarantee some progress


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: process_memory: go back to non-incremental tracking
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6915/1//COMMIT_MSG
Commit Message:

PS1, Line 34: noticeably better
Can you quantify that?


http://gerrit.cloudera.org:8080/#/c/6915/1/src/kudu/util/process_memory.cc
File src/kudu/util/process_memory.cc:

Line 191:   const int64_t kReadIntervalMicros = 5;
I'm curious as to how you arrived at this number. My instinct would be to go 
lower, like 10ms.


PS1, Line 198: GetMonoTimeMicros()
Shouldn't this be 'time'?


-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] WIP: upgrade to proto3

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has abandoned this change.

Change subject: WIP: upgrade to proto3
..


Abandoned

Grant did this elsewhere

-- 
To view, visit http://gerrit.cloudera.org:8080/6297
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I0171a00b339e4a108ad2e5612442b300b76b0656
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] process memory: go back to non-incremental tracking

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Adar Dembo,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6915

to review the following change.

Change subject: process_memory: go back to non-incremental tracking
..

process_memory: go back to non-incremental tracking

The incremental tracking has more overhead than expected, even with our
fancy striped counters, etc. In tpch_real_world benchmarks, the
LongAdder::IncrementBy() and GetAllocatedSize() calls in the hooks
were taking a significant amount of CPU (in the top 10 consumers).

Here's yet another approach: back to non-incremental tracking, but with
an optimization that, if we've calculated the memory usage in the last
50ms, or if another thread is already in the process of calculating, we
don't re-calculate it.

This has some minor risk of "overshooting" the memory limit, but since
our limiting is already probabilistic and not 100% thorough anyway, I
think that's acceptable.

Benchmarked using process_memory-test:

Before:
I0517 16:46:09.596946  7419 process_memory-test.cc:68] Performed 2.18441e+06 
iters/sec

After:
I0517 16:45:50.497973  7051 process_memory-test.cc:68] Performed 1.22264e+08 
iters/sec

(55x speedup)

I also ran tpch_real_world and graphed CPU consumed per inserted row and
found it to be noticeably better with this optimized version. I
spot-checked perf results and don't see any tcmalloc stat-related
methods in the high consumers list anymore. I also checked in this
workload that the heap usage was properly limited (indistinguishable
from the before-patch implementation)

Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
---
M src/kudu/tserver/tablet_server-stress-test.cc
M src/kudu/util/process_memory.cc
2 files changed, 23 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/6915/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


KUDU-1875: Refuse unauthenticated connections from publicly routable
IP addrs

This rejects unauthenticated connections from publicly routable IPs,
even if authentication and encryption are not configured.

An adavanced flag 'trusted_subnets' is provided to whitelist
trusted subnets. If this flag is set explicitly, all unauthenticated
or unencrypted connections are prohibited except the ones from the
specified address blocks. Otherwise, private network (127.0.0.0/8,
etc.) and local subnets of all local network interfaces will be used.
Set it to '0.0.0.0/0' allows unauthenticated/unencrypted connections
from all remote IP addresses. However, if network access is not
otherwise restricted by a firewall, malicious users may be able to
gain unauthorized access.

Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Reviewed-on: http://gerrit.cloudera.org:8080/6514
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert 
---
M src/kudu/rpc/negotiation-test.cc
M src/kudu/rpc/server_negotiation.cc
M src/kudu/rpc/server_negotiation.h
M src/kudu/util/net/net_util-test.cc
M src/kudu/util/net/net_util.cc
M src/kudu/util/net/net_util.h
M src/kudu/util/net/socket.h
7 files changed, 337 insertions(+), 7 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 38
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 37: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 37
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#35).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate the effects of single-disk failure. Given this, and given the
tradeoff between I/O and failure disk-failure tolerance, the default
behavior will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 1,022 insertions(+), 192 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/35
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 35
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Hao Hao (Code Review)
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6514

to look at the new patch set (#37).

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..

KUDU-1875: Refuse unauthenticated connections from publicly routable
IP addrs

This rejects unauthenticated connections from publicly routable IPs,
even if authentication and encryption are not configured.

An adavanced flag 'trusted_subnets' is provided to whitelist
trusted subnets. If this flag is set explicitly, all unauthenticated
or unencrypted connections are prohibited except the ones from the
specified address blocks. Otherwise, private network (127.0.0.0/8,
etc.) and local subnets of all local network interfaces will be used.
Set it to '0.0.0.0/0' allows unauthenticated/unencrypted connections
from all remote IP addresses. However, if network access is not
otherwise restricted by a firewall, malicious users may be able to
gain unauthorized access.

Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
---
M src/kudu/rpc/negotiation-test.cc
M src/kudu/rpc/server_negotiation.cc
M src/kudu/rpc/server_negotiation.h
M src/kudu/util/net/net_util-test.cc
M src/kudu/util/net/net_util.cc
M src/kudu/util/net/net_util.h
M src/kudu/util/net/socket.h
7 files changed, 337 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6514/37
-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 37
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [c++ client] less logging about authn token

2017-05-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6914

Change subject: [c++ client] less logging about authn token
..

[c++ client] less logging about authn token

LOG(INFO) --> VLOG(2) on receiving and adopting authn token by
the Kudu C++ client.

Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
---
M src/kudu/client/client-internal.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6914/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6911

to look at the new patch set (#4).

Change subject: log: reduce segment size from 64MB to 8MB
..

log: reduce segment size from 64MB to 8MB

Currently, we always retain a minimum of one segment of the WAL, even if
a tablet is cold and has not performed any writes in a long time. If a
server has hundreds or thousands of tablets, keeping a 64MB segment for
each tablet adds up to a lot of "wasted" disk space, especially if the
WAL is configured to an expensive disk such as an SSD.

In addition to wasting space, the current code always re-reads all live
WAL segments at startup. Solving this has been an open problem for quite
some time, but there are various subtleties (described in KUDU-38). So,
as a band-aid, reducing the size of the segment will linearly reduce the
amount of work during bootstrap of "cold" tablets.

In summary, the expectation is that, in a "cold" server, this will
reduce WAL disk space usage and startup time by approximately a factor
of 8.

I verified this by making these same changes in the configuration of a server
with ~6000 cold tablets. The disk usage of the WAL disk went from 381GB to
48GB, with a similar factor reduction in startup time.

Because the segment size is reduced by a factor of 8, the patch also
increases the max retention segment count by the same factor (from 10 to
80). This has one risk, in that we currently keep an open file
descriptor for every retained segment. However, in typical workloads,
the number of tablets with a high rate of active writes is not in the
hundreds or thousands, and thus the total number of file descriptors is
still likely to be manageable. Nevertheless, this patch adds a TODO that
we should consider a FileCache for these descriptors if we start to see
the usage be problematic.

So, if reducing to 8MB is good, why not go further, like 4MB or 1MB? The
assumption is that 8MB is still large enough to get good sequential IO
throughput on both read and write, and large enough to limit the FD
usage as described above. If we add an FD cache at some point, we could
consider reducing it further, particularly if running on SSDs where the
sequential throughput is less affected by size.

Although it's possible that a user had configured max_segments_to_retain
based on their own disk space requirements, the flag is marked
experimental, so I don't think we have to worry about compatibility in
that respect. We should consider changing the units here to be MB rather
than segment-count, so that the value is more robust.

Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
4 files changed, 48 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/6911/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#34).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 1,023 insertions(+), 192 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/34
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 64MB to 8MB
> it is possible, but I think based on the rolling code it should be fine -- 
With async preallocation enabled, it seems to do fine, because we check whether 
the batch is about to overflow the boundary, and if it is, we start an async 
preallocation but don't actually switch to the new file until the _next_ batch. 
With async preallocation off, it indeed alternates between large segments 
(containing a REPLICATE message with the big batch) and small segments 
(containing just a COMMIT message). But, it seems perfectly happy to restart 
from this case anyway.

Will add a new test which covers this case.

Separately, I wonder whether we should remove the 'non-async preallocation' 
code path entirely, so we have fewer combinations in the code to think about, 
but I think a bunch of tests rely on it for determinism, so won't try to do 
that here.


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6911

to look at the new patch set (#3).

Change subject: log: reduce segment size from 64MB to 8MB
..

log: reduce segment size from 64MB to 8MB

Currently, we always retain a minimum of one segment of the WAL, even if
a tablet is cold and has not performed any writes in a long time. If a
server has hundreds or thousands of tablets, keeping a 64MB segment for
each tablet adds up to a lot of "wasted" disk space, especially if the
WAL is configured to an expensive disk such as an SSD.

In addition to wasting space, the current code always re-reads all live
WAL segments at startup. Solving this has been an open problem for quite
some time, but there are various subtleties (described in KUDU-38). So,
as a band-aid, reducing the size of the segment will linearly reduce the
amount of work during bootstrap of "cold" tablets.

In summary, the expectation is that, in a "cold" server, this will
reduce WAL disk space usage and startup time by approximately a factor
of 8.

I verified this by making these same changes in the configuration of a server
with ~6000 cold tablets. The disk usage of the WAL disk went from 381GB to
48GB, with a similar factor reduction in startup time.

Because the segment size is reduced by a factor of 8, the patch also
increases the max retention segment count by the same factor (from 10 to
80). This has one risk, in that we currently keep an open file
descriptor for every retained segment. However, in typical workloads,
the number of tablets with a high rate of active writes is not in the
hundreds or thousands, and thus the total number of file descriptors is
still likely to be manageable. Nevertheless, this patch adds a TODO that
we should consider a FileCache for these descriptors if we start to see
the usage be problematic.

So, if reducing to 8MB is good, why not go further, like 4MB or 1MB? The
assumption is that 8MB is still large enough to get good sequential IO
throughput on both read and write, and large enough to limit the FD
usage as described above. If we add an FD cache at some point, we could
consider reducing it further, particularly if running on SSDs where the
sequential throughput is less affected by size.

Although it's possible that a user had configured max_segments_to_retain
based on their own disk space requirements, the flag is marked
experimental, so I don't think we have to worry about compatibility in
that respect. We should consider changing the units here to be MB rather
than segment-count, so that the value is more robust.

Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
4 files changed, 49 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/6911/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: change default retention to one segment

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: log: change default retention to one segment
..


log: change default retention to one segment

We've been testing this on some various clusters recently and it doesn't
seem to have any adverse effects. Additionally, it reduces the amount
of space used by the WAL for idle tablets, and it reduces startup time
since there are fewer WALs to replay. So, it's probably a good idea to
change the default.

Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Reviewed-on: http://gerrit.cloudera.org:8080/6907
Reviewed-by: David Ribeiro Alves 
Tested-by: Kudu Jenkins
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
3 files changed, 6 insertions(+), 5 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 64MB to 8MB
> Is it possible for a single Write RPC to exceed 8 MB? If so, does this chan
it is possible, but I think based on the rolling code it should be fine -- 
we'll trigger a roll when we see that the current batch doesn't fit, but it 
doesn't loop or anything. So, I think you should end up with one batch per 
segment, which is more or less what we want.

There's some possibility we end up with alternating empty segments and 
one-batch segments.

We do have a number of tests which were already setting to 1MB roll threshold, 
but let me double check that we have one that also sends large batches.


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] log: change default retention to one segment

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: log: change default retention to one segment
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6911

to look at the new patch set (#2).

Change subject: log: reduce segment size from 64MB to 8MB
..

log: reduce segment size from 64MB to 8MB

Currently, we always retain a minimum of one segment of the WAL, even if
a tablet is cold and has not performed any writes in a long time. If a
server has hundreds or thousands of tablets, keeping a 64MB segment for
each tablet adds up to a lot of "wasted" disk space, especially if the
WAL is configured to an expensive disk such as an SSD.

In addition to wasting space, the current code always re-reads all live
WAL segments at startup. Solving this has been an open problem for quite
some time, but there are various subtleties (described in KUDU-38). So,
as a band-aid, reducing the size of the segment will linearly reduce the
amount of work during bootstrap of "cold" tablets.

In summary, the expectation is that, in a "cold" server, this will
reduce WAL disk space usage and startup time by approximately a factor
of 8.

I verified this by making these same changes in the configuration of a server
with ~6000 cold tablets. The disk usage of the WAL disk went from 381GB to
48GB, with a similar factor reduction in startup time.

Because the segment size is reduced by a factor of 8, the patch also
increases the max retention segment count by the same factor (from 10 to
80). This has one risk, in that we currently keep an open file
descriptor for every retained segment. However, in typical workloads,
the number of tablets with a high rate of active writes is not in the
hundreds or thousands, and thus the total number of file descriptors is
still likely to be manageable. Nevertheless, this patch adds a TODO that
we should consider a FileCache for these descriptors if we start to see
the usage be problematic.

So, if reducing to 8MB is good, why not go further, like 4MB or 1MB? The
assumption is that 8MB is still large enough to get good sequential IO
throughput on both read and write, and large enough to limit the FD
usage as described above. If we add an FD cache at some point, we could
consider reducing it further, particularly if running on SSDs where the
sequential throughput is less affected by size.

Although it's possible that a user had configured max_segments_to_retain
based on their own disk space requirements, the flag is marked
experimental, so I don't think we have to worry about compatibility in
that respect. We should consider changing the units here to be MB rather
than segment-count, so that the value is more robust.

Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
---
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
2 files changed, 5 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/6911/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: change default retention to one segment

2017-05-17 Thread Todd Lipcon (Code Review)
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6907

to look at the new patch set (#3).

Change subject: log: change default retention to one segment
..

log: change default retention to one segment

We've been testing this on some various clusters recently and it doesn't
seem to have any adverse effects. Additionally, it reduces the amount
of space used by the WAL for idle tablets, and it reduces startup time
since there are fewer WALs to replay. So, it's probably a good idea to
change the default.

Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
3 files changed, 6 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/6907/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
..


Patch Set 33: Code-Review-1

Getting to the bottom of all the test failures.

-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 33
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 27: This has one risk, in that we currently keep an open file
: descriptor for every retained segment
> We also keep an open fd for every segment being written to, correct? But th
yea, still just one for the active writer, but N for the retained readers (even 
though they're usually not very actively read, so LRU caching should be 
effective)


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..


KUDU-1192 Periodically flush glog buffers from a thread

Added a wait timeout to AsyncLogger RunThread based on FLAGS_logbufsec
When the wait times out, it will flush even if there is nothing enqueued.
Also sets the logbufsecs default to 5 seconds, instead of the 30s.

Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Reviewed-on: http://gerrit.cloudera.org:8080/6853
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo 
Reviewed-by: Todd Lipcon 
---
M src/kudu/util/async_logger.cc
M src/kudu/util/flags.cc
M src/kudu/util/logging-test.cc
3 files changed, 35 insertions(+), 3 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, but someone else must approve
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..


Patch Set 11: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 
Gerrit-HasComments: No


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread William Li (Code Review)
William Li has posted comments on this change.

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..


Patch Set 11:

(14 comments)

all are related to the missing  header file. fixed.

http://gerrit.cloudera.org:8080/#/c/6853/10/src/kudu/util/logging-test.cc
File src/kudu/util/logging-test.cc:

Line 103:  int /*message_len*/) override {
> error: use of undeclared identifier 'message_count_' [clang-diagnostic-erro
Done


Line 112: SleepFor(MonoDelta::FromMilliseconds(5));
> error: use of undeclared identifier 'flush_count_' [clang-diagnostic-error]
Done


Line 119: 
> error: expected member name or ';' after declaration specifiers [clang-diag
Done


Line 119: 
> error: no type named 'atomic' in namespace 'std' [clang-diagnostic-error]
Done


Line 120:   std::atomic flush_count_ = {0};
> error: no type named 'atomic' in namespace 'std' [clang-diagnostic-error]
Done


Line 120:   std::atomic flush_count_ = {0};
> error: expected member name or ';' after declaration specifiers [clang-diag
Done


Line 156:   async.Stop();
> error: no member named 'message_count_' in 'kudu::CountingLogger' [clang-di
Done


Line 156:   async.Stop();
> error: no type named 'Compare' in the global namespace [clang-diagnostic-er
Done


Line 156:   async.Stop();
> error: expected ')' [clang-diagnostic-error]
Done


Line 160:   // 'flush' set to true.
> error: no member named 'flush_count_' in 'kudu::CountingLogger' [clang-diag
Done


Line 178:   ASSERT_EVENTUALLY([&]() {
> error: no type named 'Compare' in the global namespace [clang-diagnostic-er
Done


Line 178:   ASSERT_EVENTUALLY([&]() {
> error: expected ')' [clang-diagnostic-error]
Done


Line 178:   ASSERT_EVENTUALLY([&]() {
> error: no member named 'message_count_' in 'kudu::CountingLogger' [clang-di
Done


Line 181: // so there should be no more messages in the buffer.
> error: no member named 'flush_count_' in 'kudu::CountingLogger' [clang-diag
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..


Patch Set 11: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 
Gerrit-HasComments: No


[kudu-CR] Improve logging during server startup

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Mike Percy, Will Berkeley,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6913

to review the following change.

Change subject: Improve logging during server startup
..

Improve logging during server startup

This makes a few improvements to the log output during startup,
particularly helpful when there are lots of tablets:

- Only show progress reports of loading tablet metadata once a second,
  with an indication of progress, rather than logging one line per
  tablet.
- In the common case of reading a log which ends on a log entry
  boundary, with all following data being preallocated space (all 0
  bytes), don't log anything with the word "corruption" in it, to avoid
  scaring users.
- Only log timing of Tablet::Open() when slow (>100ms)
- Bump various fine-grained details about tablet startup to VLOG(1)
  level instead of LOG(INFO)

Change-Id: Ia5e597dfd08d5cf0cbbfe2c5ef532ffba9d113d7
---
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/log_util.cc
M src/kudu/consensus/log_util.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/ts_tablet_manager.cc
7 files changed, 126 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/6913/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6913
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5e597dfd08d5cf0cbbfe2c5ef532ffba9d113d7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 64MB to 8MB
> experimenting a bit on a server with 6000 tablets right now. I picked 8MB b
Is it possible for a single Write RPC to exceed 8 MB? If so, does this change 
cause any problems for such a Write?


PS1, Line 27: This has one risk, in that we currently keep an open file
: descriptor for every retained segment
We also keep an open fd for every segment being written to, correct? But that 
number isn't changing; it should still be one fd per tablet, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: Clean up tablet-copy error message
..


Clean up tablet-copy error message

The error message when tablet-copy is unauthorized was misleading.

before:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession: Invalid argument:
Unable to decode tablet copy RPC error message: message: "Not
authorized: unauthorized access to method: BeginTabletCopySession" code:
ERROR_APPLICATION

after:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession

Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Reviewed-on: http://gerrit.cloudera.org:8080/6912
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
2 files changed, 10 insertions(+), 20 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread William Li (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6853

to look at the new patch set (#11).

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..

KUDU-1192 Periodically flush glog buffers from a thread

Added a wait timeout to AsyncLogger RunThread based on FLAGS_logbufsec
When the wait times out, it will flush even if there is nothing enqueued.
Also sets the logbufsecs default to 5 seconds, instead of the 30s.

Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
---
M src/kudu/util/async_logger.cc
M src/kudu/util/flags.cc
M src/kudu/util/logging-test.cc
3 files changed, 35 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/6853/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 36:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6514/36/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

Line 961:   if 
(google::GetCommandLineFlagInfoOrDie("trusted_subnets").is_default) {
> We only flags marked with the 'runtime' tag to be set after startup.  Typic
I see. I do not see a reason we have to make it runtime updatable. That was my 
misunderstanding. Thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 36
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: Clean up tablet-copy error message
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-17 Thread William Li (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6853

to look at the new patch set (#10).

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..

KUDU-1192 Periodically flush glog buffers from a thread

Added a wait timeout to AsyncLogger RunThread based on FLAGS_logbufsec
When the wait times out, it will flush even if there is nothing enqueued.
Also sets the logbufsecs default to 5 seconds, instead of the 30s.

Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
---
M src/kudu/util/async_logger.cc
M src/kudu/util/flags.cc
M src/kudu/util/logging-test.cc
3 files changed, 34 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/6853/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has uploaded a new patch set (#2).

Change subject: Clean up tablet-copy error message
..

Clean up tablet-copy error message

The error message when tablet-copy is unauthorized was misleading.

before:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession: Invalid argument:
Unable to decode tablet copy RPC error message: message: "Not
authorized: unauthorized access to method: BeginTabletCopySession" code:
ERROR_APPLICATION

after:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession

Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
---
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
2 files changed, 10 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/6912/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

cool, add that to the commit message?

-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-2017. Fix calculation of perf improvement for flushes

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-2017. Fix calculation of perf improvement for flushes
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6908/1/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

PS1, Line 75: double anchored_mb = static_cast(stats->ram_anchored()) / 
(1024 * 1024);
nit: could we declare this above the if stmt and avoid doing the conversion to 
mb twice?


-- 
To view, visit http://gerrit.cloudera.org:8080/6908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Clean up tablet-copy error message
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6912/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS1, Line 330: 
> consider removing this method from the header file as well.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: Clean up tablet-copy error message
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6912/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS1, Line 330: 
consider removing this method from the header file as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 64MB to 8MB
> how did you get to this number. did you make some experiments? (looks good 
experimenting a bit on a server with 6000 tablets right now. I picked 8MB 
basically randomly, figuring that it's big enough for "good sequential IO" but 
small enough to get the benefit here.

Initial experiments do show indeed a 6-8x improvement in startup time and disk 
usage.


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] log: change default retention to one segment

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: log: change default retention to one segment
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] [java-client] clear non-covered entries from meta cache on table open (second try)

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: [java-client] clear non-covered entries from meta cache on 
table open (second try)
..


[java-client] clear non-covered entries from meta cache on table open (second 
try)

This is a followup to d07ecd6ded01201c9. I missed a place in the open
table flow where we retry opening if the table has non-running tablets.
This was revealed when the test accompanying that commit was flaky about
20% of the time.

Change-Id: I8dbd16b2179b1d72b635a4dfaf54928d62125607
Reviewed-on: http://gerrit.cloudera.org:8080/6909
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
1 file changed, 5 insertions(+), 0 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6909
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8dbd16b2179b1d72b635a4dfaf54928d62125607
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Clean up tablet-copy error message

2017-05-17 Thread Dan Burkert (Code Review)
Hello Mike Percy, Will Berkeley,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6912

to review the following change.

Change subject: Clean up tablet-copy error message
..

Clean up tablet-copy error message

The error message when tablet-copy is unauthorized was misleading.

before:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession: Invalid argument:
Unable to decode tablet copy RPC error message: message: "Not
authorized: unauthorized access to method: BeginTabletCopySession" code:
ERROR_APPLICATION

after:

Remote error: Unable to begin tablet copy session: Not authorized:
unauthorized access to method: BeginTabletCopySession

Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
---
M src/kudu/tserver/tablet_copy_client.cc
1 file changed, 10 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/6912/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I39292d55af96dd20b7a26f02972c2013b6e1f95e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1826 add propagated timestamp to sync client

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1826 add propagated timestamp to sync client
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java:

PS7, Line 237: @return a long indicating the last timestamp received from a 
server
Can you mention that the timestamp can't be interpreted as absolute time (i.e 
that it's encoded in some special way). please also add that to the other 
client to keep the comments consistent


http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java:

PS7, Line 44: NO_TIMESTAMP
don't we have this constant defined somewhere else?


http://gerrit.cloudera.org:8080/#/c/6798/7/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

PS7, Line 92: long initial_ts = syncClient.getLastPropagatedTimestamp();
: 
: // Check that the initial timestamp is consistent with the 
asynchronous client.
: assertEquals(initial_ts, client.getLastPropagatedTimestamp());
: assertEquals(initial_ts, 
syncClient.getLastPropagatedTimestamp());
: 
: // Attempt to change the timestamp to a lower value. This 
should not change
: // the internal timestamp, as it must be monotonically 
increasing.
: syncClient.updateLastPropagatedTimestamp(initial_ts - 1);
: assertEquals(initial_ts, client.getLastPropagatedTimestamp());
: assertEquals(initial_ts, 
syncClient.getLastPropagatedTimestamp());
: 
: // Use the synchronous client to update the last propagated 
timestamp and
: // check with both clients that the timestamp was updated.
: syncClient.updateLastPropagatedTimestamp(initial_ts + 1);
: assertEquals(initial_ts + 1, 
client.getLastPropagatedTimestamp());
: assertEquals(initial_ts + 1, 
syncClient.getLastPropagatedTimestamp());
can you actually perform some writes or something to test that the timestamps 
get updated by the client when messages are received from the server. May 
adding these tests to the already existing timestamp tests would be easier.


-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: log: reduce segment size from 64MB to 8MB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6911/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 64MB to 8MB
how did you get to this number. did you make some experiments? (looks good just 
curious)


-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#33).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 1,018 insertions(+), 192 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/33
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 33
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: reduce segment size from 64MB to 8MB

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Mike Percy, Adar Dembo,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6911

to review the following change.

Change subject: log: reduce segment size from 64MB to 8MB
..

log: reduce segment size from 64MB to 8MB

Currently, we always retain a minimum of one segment of the WAL, even if
a tablet is cold and has not performed any writes in a long time. If a
server has hundreds or thousands of tablets, keeping a 64MB segment for
each tablet adds up to a lot of "wasted" disk space, especially if the
WAL is configured to an expensive disk such as an SSD.

In addition to wasting space, the current code always re-reads all live
WAL segments at startup. Solving this has been an open problem for quite
some time, but there are various subtleties (described in KUDU-38). So,
as a band-aid, reducing the size of the segment will linearly reduce the
amount of work during bootstrap of "cold" tablets.

In summary, the expectation is that, in a "cold" server, this will
reduce WAL disk space usage and startup time by approximately a factor
of 8.

Because the segment size is reduced by a factor of 8, the patch also
increases the max retention segment count by the same factor (from 10 to
80). This has one risk, in that we currently keep an open file
descriptor for every retained segment. However, in typical workloads,
the number of tablets with a high rate of active writes is not in the
hundreds or thousands, and thus the total number of file descriptors is
still likely to be manageable. Nevertheless, this patch adds a TODO that
we should consider a FileCache for these descriptors if we start to see
the usage be problematic.

Although it's possible that a user had configured max_segments_to_retain
based on their own disk space requirements, the flag is marked
experimental, so I don't think we have to worry about compatibility in
that respect. We should consider changing the units here to be MB rather
than segment-count, so that the value is more robust.

Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
---
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
2 files changed, 5 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/6911/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iadcda6b085e69cae5a15d54bb4c945d7605d5f98
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

2017-05-17 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6772

to look at the new patch set (#7).

Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but 
still in config
..

KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

This patch enhances ksck to gather consensus info from every
tablet. It compares this info with master and outputs the
master's config and every conflicting config, if there are any
conflicts. To do this efficiently it implements a new
GetAllConsensusStates RPC that gathers info about every
replica's consensus state.

This will catch at least the two problems identified in
KUDU-1860: 1. The leader has a pending config to remove a
tablet, but it is not committed so the master does not see this
config. This can hide an unhealthy tablet if, e.g., one pending
config member is down and the pending-to-be-kicked-out member is
up, so 1/2 replicas are alive in the leader's active config but
the master thinks 2/3 are alive. 2. No replica is leader but the
master believes there is a leader because its cache is old and
hasn't been updated.

Sample output showing #1:
https://gist.github.com/wdberkeley/f964439bb0c407d3769c2dd9a0959ca0

Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
---
M src/kudu/consensus/consensus.proto
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
M src/kudu/tools/tool_action_cluster.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
9 files changed, 411 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 36:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6514/36/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

Line 961:   if 
(google::GetCommandLineFlagInfoOrDie("trusted_subnets").is_default) {
> Does not quite follow if we move this if condition to call_once, how do we 
We only flags marked with the 'runtime' tag to be set after startup.  Typically 
string flags can't be marked runtime due to thread safety issues.  Is there a 
reason we would want it to be runtime updatable?  We probably shouldn't be 
using call_once at all if that's the case.


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 36
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] log: change default retention to one segment

2017-05-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6907

to look at the new patch set (#2).

Change subject: log: change default retention to one segment
..

log: change default retention to one segment

We've been testing this on some various clusters recently and it doesn't
seem to have any adverse effects. Additionally, it reduces the amount
of space used by the WAL for idle tablets, and it reduces startup time
since there are fewer WALs to replay. So, it's probably a good idea to
change the default.

Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
3 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/6907/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3094f36949b3e518e872993b3026f8314e271a5e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 36:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6514/36/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

Line 961:   if 
(google::GetCommandLineFlagInfoOrDie("trusted_subnets").is_default) {
> Can this if condition be moved inside the call_once, and only evaluated it 
Does not quite follow if we move this if condition to call_once, how do we know 
if the flag has been set explicitly later?


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 36
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 36:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6514/36/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

Line 961:   if 
(google::GetCommandLineFlagInfoOrDie("trusted_subnets").is_default) {
Can this if condition be moved inside the call_once, and only evaluated it once 
per process?  As much as possible should be moved inside the call_once as 
possible.


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 36
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 36: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 36
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: Refactor ConsensusStatePB to hold committed and pending configs
..


Refactor ConsensusStatePB to hold committed and pending configs

This patch refactors ConsensusStatePB so it contains both the
committed config and, if there is a pending config, the pending
config. Previously, consumers of the consensus state would ask
for either the committed or active (= pending if there is a pending
config, else the committed config) config. With this change, the
committed and active configs are always included.

The addition of pending info to the ConsensusStatePB will be used
by ksck in a follow-up patch. Additionally, the master may use
this info in the future to be more aware of the health of tablets.

Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
Reviewed-on: http://gerrit.cloudera.org:8080/6809
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/metadata.proto
M src/kudu/consensus/quorum_util-test.cc
M src/kudu/consensus/quorum_util.cc
M src/kudu/consensus/quorum_util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_state.cc
M src/kudu/consensus/raft_consensus_state.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/master-path-handlers.cc
M src/kudu/master/master.proto
M src/kudu/master/sys_catalog.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tserver/tablet_copy.proto
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_copy_source_session.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver-path-handlers.cc
34 files changed, 334 insertions(+), 306 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Refactor ConsensusStatePB to hold committed and pending configs
..


Patch Set 15: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#32).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 1,002 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/32
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-17 Thread Andrew Wong (Code Review)
Hello Adar Dembo,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#31).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 1,002 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/31
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 31
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs

2017-05-17 Thread Will Berkeley (Code Review)
Hello Mike Percy, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6809

to look at the new patch set (#15).

Change subject: Refactor ConsensusStatePB to hold committed and pending configs
..

Refactor ConsensusStatePB to hold committed and pending configs

This patch refactors ConsensusStatePB so it contains both the
committed config and, if there is a pending config, the pending
config. Previously, consumers of the consensus state would ask
for either the committed or active (= pending if there is a pending
config, else the committed config) config. With this change, the
committed and active configs are always included.

The addition of pending info to the ConsensusStatePB will be used
by ksck in a follow-up patch. Additionally, the master may use
this info in the future to be more aware of the health of tablets.

Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/metadata.proto
M src/kudu/consensus/quorum_util-test.cc
M src/kudu/consensus/quorum_util.cc
M src/kudu/consensus/quorum_util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_state.cc
M src/kudu/consensus/raft_consensus_state.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/master-path-handlers.cc
M src/kudu/master/master.proto
M src/kudu/master/sys_catalog.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tserver/tablet_copy.proto
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_copy_source_session.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver-path-handlers.cc
34 files changed, 334 insertions(+), 306 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/6809/15
-- 
To view, visit http://gerrit.cloudera.org:8080/6809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley