[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-09-12 Thread Anonymous Coward (Code Review)
jinxing6...@126.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11333 )

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@73
PS6, Line 73: this.nBits = bitSet.size();
> Could you look into whether bitSet.size() is a constant time operation? If
it's:
```
public int size() {
return words.length * BITS_PER_WORD;
}
```


http://gerrit.cloudera.org:8080/#/c/11333/7/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/7/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@17
PS7, Line 17:
Thanks a lot for quick response ~Adar
I updated according to your comments.
Please take a look :)



--
To view, visit http://gerrit.cloudera.org:8080/11333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
Gerrit-Change-Number: 11333
Gerrit-PatchSet: 7
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: jinxing6...@126.com
Gerrit-Comment-Date: Thu, 13 Sep 2018 04:09:53 +
Gerrit-HasComments: Yes


[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-09-12 Thread Anonymous Coward (Code Review)
Hello Dan Burkert, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11333

to look at the new patch set (#7).

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..

[KUDU-2521] Java Implementation for BloomFilter

Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
---
A java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java
2 files changed, 590 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/11333/7
--
To view, visit http://gerrit.cloudera.org:8080/11333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
Gerrit-Change-Number: 11333
Gerrit-PatchSet: 7
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: jinxing6...@126.com


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-12 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 7:

(13 comments)

Many thanks for the comments. Please take a look.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS6, Line 39: table
> nit: tablet, here and elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS6, Line 45: (we will refer to it as
: "prefix column" and its specific value as "prefix key").
> nit: since we're not using these as a variable names, but rather as definit
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48
PS6, Line 48: Therefore, we can use the index to skip to the rows that have 
distinct prefix keys,
: and also satisfy the predicate on the `tstamp` column.
> nit: maybe drop the ** around "skip" here, since you do it down below anywa
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58
PS6, Line 58: `host = helium`
> nit: would be nice if the entire thing were in backticks, since it's a cond
You are right. Made the change.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60
PS6, Line 60: satisfy the predicate, and we
> nit: probably not needed
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59
PS6, Line 59: . At that
: point we would
> nit: reword "until the predicate no longer matches. At that point we would
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67
PS6, Line 67: tio
> nit: this is a little distracting, below too. Let's just keep it singular s
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS6, Line 73: o get worse with respect to the full tablet scan performance when 
the prefix column cardinality
> This and below are being rendered weirdly by github. Would like to confirm
Yes, it works fine with jekyll. (Link to this screen shot - 
https://raw.githubusercontent.com/AnupamaGupta01/kudu-1/gh-pages-staging/img/index-skip-scan/equation.png)


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS6, Line 74: C%20tablet%20%7D).
: Therefore, in order to use skip scan perf
> "consistent performance in cases of large prefix column cardinality"
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93
PS6, Line 93:
> nit: probably not needed, below too.
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94
PS6, Line 94: Range pr
> nit: In-list
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS6, Line 98: orki
> nit: team
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101
PS6, Line 101:
 : References
 : ==
 :
 : 
[[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf):
 Gupta, Ashish, et al. "Mesa:
 : Geo-replicated, near real-time, scalable data warehousing." 
Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.
 :
 : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): 
Index Skip Scanning - Oracle Database
 :
> It's really up to you, but WDYT about just linking these in-line? This is a
Thanks, I see your point. I think that the current section for references looks 
fine after incorporating Alexey's suggestions on the same (in Patch 4, L62).



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 13 Sep 2018 03:35:44 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-12 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#7).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 112 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/7
--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] [tests] minor cleanup on BadTabletCopyITest

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11434 )

Change subject: [tests] minor cleanup on BadTabletCopyITest
..

[tests] minor cleanup on BadTabletCopyITest

Added missing NO_FATALS() for StartCluster() and LoadTable().

Change-Id: If6db0c613db6e4ea4dec034089a1fc980317e776
Reviewed-on: http://gerrit.cloudera.org:8080/11434
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/tablet_copy-itest.cc
1 file changed, 5 insertions(+), 6 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If6db0c613db6e4ea4dec034089a1fc980317e776
Gerrit-Change-Number: 11434
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [tests] minor cleanup on BadTabletCopyITest

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11434 )

Change subject: [tests] minor cleanup on BadTabletCopyITest
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6db0c613db6e4ea4dec034089a1fc980317e776
Gerrit-Change-Number: 11434
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 13 Sep 2018 00:33:15 +
Gerrit-HasComments: No


[kudu-CR] [tests] minor cleanup on BadTabletCopyITest

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11434


Change subject: [tests] minor cleanup on BadTabletCopyITest
..

[tests] minor cleanup on BadTabletCopyITest

Added missing NO_FATALS() for StartCluster() and LoadTable().

Change-Id: If6db0c613db6e4ea4dec034089a1fc980317e776
---
M src/kudu/integration-tests/tablet_copy-itest.cc
1 file changed, 5 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/11434/1
--
To view, visit http://gerrit.cloudera.org:8080/11434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If6db0c613db6e4ea4dec034089a1fc980317e776
Gerrit-Change-Number: 11434
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..

[location_awareness] replica selection honors placement policy

This patch introduces placement policy into the catalog manager's
replica selection process. The replica selection logic is factored out
into the new PlacementPolicy class.

In essence (for details see [1]), the placement policy is about:
  * in case of N locations, N > 2, not placing the majority of replicas
in one location
  * spreading replicas evenly among available locations
  * within a location, spreading replicas evenly among tablet servers

This patch also adds a few test scenarios for the new functionality.

[1] https://s.apache.org/location-awareness-design

Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Reviewed-on: http://gerrit.cloudera.org:8080/11207
Reviewed-by: Adar Dembo 
Reviewed-by: Will Berkeley 
Tested-by: Kudu Jenkins
---
M src/kudu/gutil/map-util.h
M src/kudu/master/CMakeLists.txt
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/placement_policy-test.cc
A src/kudu/master/placement_policy.cc
A src/kudu/master/placement_policy.h
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
11 files changed, 1,228 insertions(+), 224 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, but someone else must approve
  Will Berkeley: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 13
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 12
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 22:51:07 +
Gerrit-HasComments: No


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 12: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 12
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 22:38:57 +
Gerrit-HasComments: No


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11207

to look at the new patch set (#12).

Change subject: [location_awareness] replica selection honors placement policy
..

[location_awareness] replica selection honors placement policy

This patch introduces placement policy into the catalog manager's
replica selection process. The replica selection logic is factored out
into the new PlacementPolicy class.

In essence (for details see [1]), the placement policy is about:
  * in case of N locations, N > 2, not placing the majority of replicas
in one location
  * spreading replicas evenly among available locations
  * within a location, spreading replicas evenly among tablet servers

This patch also adds a few test scenarios for the new functionality.

[1] https://s.apache.org/location-awareness-design

Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
---
M src/kudu/gutil/map-util.h
M src/kudu/master/CMakeLists.txt
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/placement_policy-test.cc
A src/kudu/master/placement_policy.cc
A src/kudu/master/placement_policy.h
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
11 files changed, 1,228 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/11207/12
--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 12
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc
File src/kudu/master/placement_policy.cc:

http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@134
PS11, Line 134: located
> Nit: "are located"
Done


http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@145
PS11, Line 145: existing_set.emplace(std::move(ts));
> EmplaceOrDie? Or do you expect duplicates?
Done


http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@178
PS11, Line 178: // Get the load of the location: a location with N tablet 
servers and R replicas
  : // has load R/N.
  : //
  : // Parameters:
  : //   'location'   The location in question.
  : //   'locations_info' Information on tablet replicas slated for 
placement,
  : //but not created yet. That's the placement 
information
  : //on to-be-replicas in the context of 
optimizing tablet
  : //replica distribution in the cluster.
> Move this to the header file.
Done



--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 22:38:09 +
Gerrit-HasComments: Yes


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 11:

LGTM besides Adar's comments.


--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 22:17:18 +
Gerrit-HasComments: No


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc
File src/kudu/master/placement_policy.cc:

http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@134
PS11, Line 134: located
Nit: "are located"


http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@145
PS11, Line 145: existing_set.emplace(std::move(ts));
EmplaceOrDie? Or do you expect duplicates?


http://gerrit.cloudera.org:8080/#/c/11207/11/src/kudu/master/placement_policy.cc@178
PS11, Line 178: // Get the load of the location: a location with N tablet 
servers and R replicas
  : // has load R/N.
  : //
  : // Parameters:
  : //   'location'   The location in question.
  : //   'locations_info' Information on tablet replicas slated for 
placement,
  : //but not created yet. That's the placement 
information
  : //on to-be-replicas in the context of 
optimizing tablet
  : //replica distribution in the cluster.
Move this to the header file.



--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 22:14:16 +
Gerrit-HasComments: Yes


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Alexey Serbin (Code Review)
Hello Will Berkeley, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11207

to look at the new patch set (#11).

Change subject: [location_awareness] replica selection honors placement policy
..

[location_awareness] replica selection honors placement policy

This patch introduces placement policy into the catalog manager's
replica selection process. The replica selection logic is factored out
into the new PlacementPolicy class.

In essence (for details see [1]), the placement policy is about:
  * in case of N locations, N > 2, not placing the majority of replicas
in one location
  * spreading replicas evenly among available locations
  * within a location, spreading replicas evenly among tablet servers

This patch also adds a few test scenarios for the new functionality.

[1] https://s.apache.org/location-awareness-design

Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
---
M src/kudu/gutil/map-util.h
M src/kudu/master/CMakeLists.txt
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/placement_policy-test.cc
A src/kudu/master/placement_policy.cc
A src/kudu/master/placement_policy.h
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
11 files changed, 1,229 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/11207/11
--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] HMS integration: provide Java API to override owner during table create

2018-09-12 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11413 )

Change subject: HMS integration: provide Java API to override owner during 
table create
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11413/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestHiveMetastoreIntegration.java
File 
java/kudu-client/src/test/java/org/apache/kudu/client/TestHiveMetastoreIntegration.java:

http://gerrit.cloudera.org:8080/#/c/11413/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestHiveMetastoreIntegration.java@27
PS3, Line 27:
> There are a few metadata fields in the HMS which we could expose, but aren'
Sure, I agree that should not be included in this patch.



--
To view, visit http://gerrit.cloudera.org:8080/11413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I21069d81a4b54e55c399933b69789bf83bd58de0
Gerrit-Change-Number: 11413
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 12 Sep 2018 21:56:00 +
Gerrit-HasComments: Yes


[kudu-CR] [docs] Add basic advice on setting block cache size

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11420 )

Change subject: [docs] Add basic advice on setting block cache size
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11420/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11420/1//COMMIT_MSG@9
PS1, Line 9: improving
Nit: "improving" is an odd choice here, since the subject is "the block cache 
size" rather than "the performance of the block cache".


http://gerrit.cloudera.org:8080/#/c/11420/1//COMMIT_MSG@11
PS1, Line 11: e.g.
: consider a workload doing full table scans vs. one mostly 
re-scanning
: a small range checking for updates)
Worth noting that the scanner API has a SetCacheBlocks() method to control 
whether an individual scan's resulting blocks are cached in the block cache or 
not.


http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc
File docs/troubleshooting.adoc:

http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@542
PS1, Line 542: the memory pressure percentage times the hard limit
Nit: clearer as "`--memory_pressure_percentage` of 
`--memory_limit_hard_bytes`", to better connect to the previous sentence?


http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@543
PS1, Line 543: With the defaults, this means the `--block_cache_capacity_mb` 
should not exceed
 : 30% of `--memory_limit_hard_bytes`
Nit: wouldn't it be clearer to use "block cache capacity" and "memory hard 
limit" respectively in this sentence, since you're referring to tangible 
concepts and not to gflags?


http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@596
PS1, Line 596: block_cache_usage`
Won't block_cache_usage be 100% of block_cache_capacity_mb given enough uptime? 
Meaning, the only case where it won't be the same value is a freshly started 
tserver with a cold cache.



--
To view, visit http://gerrit.cloudera.org:8080/11420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc7411c38b6fcc8694509ec89c32e2fe74e6c0db
Gerrit-Change-Number: 11420
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 12 Sep 2018 21:00:31 +
Gerrit-HasComments: Yes


[kudu-CR] common: add equality methods to ColumnBlock and SelectionVector

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11267 )

Change subject: common: add equality methods to ColumnBlock and SelectionVector
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I9712bd7748bb01af7b6f68897a453a0aa149cdcf
Gerrit-Change-Number: 11267
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] common: add equality methods to ColumnBlock and SelectionVector

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11267 )

Change subject: common: add equality methods to ColumnBlock and SelectionVector
..


Patch Set 5: Verified+1

Overriding Jenkins, unrelated failure in tls_socket-test.


--
To view, visit http://gerrit.cloudera.org:8080/11267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9712bd7748bb01af7b6f68897a453a0aa149cdcf
Gerrit-Change-Number: 11267
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 12 Sep 2018 20:33:41 +
Gerrit-HasComments: No


[kudu-CR] KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator

2018-09-12 Thread Adar Dembo (Code Review)
Hello Mike Percy, David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11395

to look at the new patch set (#3).

Change subject: KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator
..

KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator

This patch rewrites much of DeltaFileIterator to leverage DeltaPreparer for
delta preparation and service. Besides sharing code, the motivation is to
take advantage of the performance improvement inherent to DeltaPreparer:
decoding a batch of deltas just once in PrepareBatch() as opposed to on each
call to ApplyUpdates().

Seeing as DeltaPreparer was originally built for DMSIterator, here are the
various augmentations that were necessary to support DeltaFileIterator:
- REINSERT support, which meant more complicated deletion state tracking.
- FilterColumnIdsAndCollectDeltas support, cribbed from DeltaFileIterator.
- A templatized traits system to control which features were enabled. This
  also meant templatizing both DeltaPreparer and DeltaFileIterator.
- Early out from the "apply all deltas for a row" loop when the timestamp
  is no longer relevant. I modified DMSIterator to take advantage of this,
  which should result in a performance improvement.

I tried to centralize as much state tracking in DeltaPreparer, though there
were several aspects of this that were confusing (namely prepared_idx_,
last_added_idx_, and prepared_count_).

The improvement should be most noticeable on tables with wide schemas where
multiple columns are projected. In these situations, the column-by-column
ApplyUpdates() approach incurred a lot of unnecessary delta decoding. I
don't have a suitable microbenchmark to prove this, but I did run
diskrowset-test's TestDeltaApplicationPerformance under perf and the
resulting flame graphs showed the bulk of the iteration time as having moved
from ApplyUpdates() to PrepareBatch().

Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
---
M src/kudu/tablet/delta_key.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet-test-util.h
11 files changed, 473 insertions(+), 395 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/95/11395/3
--
To view, visit http://gerrit.cloudera.org:8080/11395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
Gerrit-Change-Number: 11395
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] deltafile-test: DeltaFileIterator fuzz test

2018-09-12 Thread Adar Dembo (Code Review)
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11140

to look at the new patch set (#9).

Change subject: deltafile-test: DeltaFileIterator fuzz test
..

deltafile-test: DeltaFileIterator fuzz test

Here's a new fuzz test for deltafile-test that generates random deltas and
iterates at random snapshots over it, testing the various delta iterator
functions.

The test introduces MirroredDeltas, a utility that's useful for tracking
deltas that are written into a delta file (or into a DMS, for that matter).

Change-Id: I539ff0f2055cf398a42efb824238188230e25516
---
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/delta_key.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
A src/kudu/tablet/tablet-test-util.cc
M src/kudu/tablet/tablet-test-util.h
6 files changed, 474 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/11140/9
--
To view, visit http://gerrit.cloudera.org:8080/11140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I539ff0f2055cf398a42efb824238188230e25516
Gerrit-Change-Number: 11140
Gerrit-PatchSet: 9
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11395 )

Change subject: KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@7
PS2, Line 7: use DeltaPreparer
> Can you talk more about how this changes the logic? Or do you view this as
It's a refactor at heart, though it (obviously) affects the division of labor 
between PrepareBatch() and ApplyUpdates(). I mentioned that below where I talk 
about flame graphs. At the end of the day the DeltaFileIterator still does the 
same thing it always did: read deltas from a file, decode them, and apply the 
appropriate ones during a scan. It just does it more efficiently.

I actually don't want to mention diff scans in these two patches because they 
stand on their own merits: they improve scan performance when projecting 
multiple columns by reducing CPU load incurred via unnecessary delta decoding. 
Bringing diff scans into this will just muddy the waters. Instead, I'll tie 
back to these patches in the diff scan patch series.


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@9
PS2, Line 9: leverage DeltaPreparer
> why? just code reuse? get rid of Todd's TODO in the code?
See the JIRA and below: this refactor improves performance by getting rid of 
the decoding-heavy approach taken by the "visitor" pattern previously used by 
DeltaFileIterator.


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@25
PS2, Line 25: entralize as
> Can you be more specific about what we think these improvements are?
The rest of the sentence explains: it does away with a bunch of unnecessary 
delta decoding. There's more information in the JIRA.

I'll add some more color here but the commit message is already quite detailed 
(by my standards) and in the interest of brevity I don't want to duplicate too 
much from the JIRA.


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h@258
PS2, Line 258: re irrelevan
> nit: doc this parameter
Done


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc@57
PS2, Line 57: // Returns whether a mutation at 'ts' is relevant under 'snap'.
> Mind documenting the interface to this helper? It's not obvious what the me
Done


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc
File src/kudu/tablet/deltamemstore.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc@248
PS2, Line 248:   // TODO(adar): is this correct?
> Mind adding a comment with the thought process behind this bookkeeping / cl
Not sure what you mean. This isn't bookkeeping or cleanup; it's an optimization 
to skip any additional deltas for this row once we've established that we're 
done with the row.

I added the TODO because this seemed like a brain-dead optimization that should 
have been done previously, so I was wondering if perhaps it's an incorrect 
thing to do. See 
https://gerrit.cloudera.org/c/11394/2/src/kudu/tablet/deltamemstore.cc#b268 for 
the old code here.



--
To view, visit http://gerrit.cloudera.org:8080/11395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
Gerrit-Change-Number: 11395
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 12 Sep 2018 19:40:18 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-686 (part 1/2): decompose guts of DMSIterator into DeltaPreparer

2018-09-12 Thread Adar Dembo (Code Review)
Hello Mike Percy, David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11394

to look at the new patch set (#3).

Change subject: KUDU-686 (part 1/2): decompose guts of DMSIterator into 
DeltaPreparer
..

KUDU-686 (part 1/2): decompose guts of DMSIterator into DeltaPreparer

To address KUDU-686, we're going to repurpose DMSIterator's PrepareBatch()
machinery and associated in-memory state for use in the DeltaFileIterator.
Doing so obviates the need for a "multi-pass" approach to ApplyUpdates()
because DeltaFileIterator will no longer be performing any decoding there,
having done all of it in PrepareBatch().

This patch lays the groundwork by refactoring the guts of DMSIterator into
the new DeltaPreparer class. DMSIterator will continue to concern itself
with CBTree iteration, but will delegate the delta preparation and service
to DeltaPreparer.

In performing this refactor, I tried to be as faithful as possible to the
original code, even when I didn't really understand it (the prepared_idx_
and prepared_count_ state tracking gave me a lot of grief).

No new tests; I figured there was enough test coverage of DMSIterator, and
testing DeltaPreparer directly seemed like it'd be low bang for the buck.

Change-Id: I26c92dfa69b5323e7fbb18d02010ed99cc29c3e5
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
8 files changed, 391 insertions(+), 276 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/11394/3
--
To view, visit http://gerrit.cloudera.org:8080/11394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26c92dfa69b5323e7fbb18d02010ed99cc29c3e5
Gerrit-Change-Number: 11394
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-09-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11333 )

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..


Patch Set 6:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@225
PS5, Line 225: return checkIfContains(byteBuffer);
> I'm not sure here.
Actually, the length of 'bytes' doesn't matter because it's hashed into a 
single long regardless.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@29
PS6, Line 29:  * An space-efficient filter and offers an approximate 
containment check.
Nit: "A space-efficient filter which offers an approximate containment check"?


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@37
PS6, Line 37: shrink the amount
Nit: "constrain the number"


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@39
PS6, Line 39: you are expecting TServer to return records have
:  * the same value in a scan.
Nit: "you expect the TServer to return records with the same value in a scan".


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@52
PS6, Line 52:  *   // TODO: implemnt the interface for serializaing and sending
"implement" and "serializing"


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@69
PS6, Line 69: if (bitSet.size() < 8) {
:   throw new IllegalArgumentException("Number of bits in 
bitset should be at least 8, but found "
: + bitSet.length());
: }
You can use Guava's Preconditions to simplify these assertions:

  Preconditions.checkArgument(bitset.size() < 8, "...");


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@73
PS6, Line 73: this.nBits = bitSet.size();
Could you look into whether bitSet.size() is a constant time operation? If it 
is, there's no need to store nBits separately; we could just query 
bitSet.size() whenever we need the number of bits.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@82
PS6, Line 82:* @param nBytes size of Bloom filter
Nit: "size of bloom filter in bytes"


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@83
PS6, Line 83:* @param fpRate false positive rate
Could you elaborate on what this means? How should a user know what value to 
provide here?

Below as well.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@89
PS6, Line 89:   public static BloomFilter BySizeAndFPRate(int nBytes, double 
fpRate, HashFunction hashFunction) {
Please doc this one too.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@97
PS6, Line 97: Bloom
Nit: lower case "bloom" (to be consistent with how you've written "bloom 
filter" in other Javadoc.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@112
PS6, Line 112:   public void put(byte[] data) {
Please Javadoc the various put() methods.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@163
PS6, Line 163:
 :   public byte[] getBitSet() {
 : return bitSet.toByteArray();
 :   }
 :
 :   public int getNHashes() {
 : return nHashes;
 :   }
 :
 :   public String getHashFunctionName() {
 : return hashFunction.toString();
 :   }
Are these testing only? If so, annotate them as such. If not, provide Javadoc.


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@200
PS6, Line 200: assert (byteBuffer.length >= length);
Nit: use Guava's Preconditions here.


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/tes

[kudu-CR] [docs] Add basic advice on setting block cache size

2018-09-12 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11420 )

Change subject: [docs] Add basic advice on setting block cache size
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc
File docs/troubleshooting.adoc:

http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@538
PS1, Line 538: as a percentage of `--memory_limit_hard_bytes`
Hrm, at first glance I read this as either:
- the block cache shouldn't be larger than 1% of --memory_limit_hard_bytes, or
- there is some percentage of --memory_limit_hard_bytes at which setting the 
block cache size would be bad

Since this is clarified below, maybe just, "(see below)"?


http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@557
PS1, Line 557: metricz
metrics?


http://gerrit.cloudera.org:8080/#/c/11420/1/docs/troubleshooting.adoc@602
PS1, Line 602: `block_cache_evictions` metric is significant compared to 
`block_cache_inserts`
I'm trying to think of a case where this isn't true, but the above is true. 
I.e. where there isn't a lot of churn in the block cache and it's near 
capacity, but there are a lot of cache misses. That shouldn't happen, right?



--
To view, visit http://gerrit.cloudera.org:8080/11420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc7411c38b6fcc8694509ec89c32e2fe74e6c0db
Gerrit-Change-Number: 11420
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 12 Sep 2018 18:33:56 +
Gerrit-HasComments: Yes


[kudu-CR] [location awareness] Add location info in ksck report

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11422 )

Change subject: [location_awareness] Add location info in ksck report
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11422/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11422/1//COMMIT_MSG@29
PS1, Line 29: Tablet Server UUID|   Location
: --+---
:  fc18b4bdb2ae4dd0a5717e8fe1f1de69 | 
:  0f25d1891fce48789e06fdec493fb90e | 
:  ad868d782dc743369d96a9e958811f81 | 
:  c541b6ebc3a14f0f9bbee26d52545683 | /foo-bar0/BAAZ.9-quux
:
:Location|  Count
: ---+-
:  /foo-bar0/BAAZ.9-quux |   1
:  |   3
> ksck output is already pretty lengthy so I don't want to introduce an extra
+1

Woops, indeed -- we have 'Tablet Server Summary' section.  I think the best fit 
for the tablet server location information is in that table.



--
To view, visit http://gerrit.cloudera.org:8080/11422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideff2dd4975c99a1135002624de2620fb95c
Gerrit-Change-Number: 11422
Gerrit-PatchSet: 1
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 17:35:47 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator

2018-09-12 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11395 )

Change subject: KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator
..


Patch Set 2:

(6 comments)

I'm working through this patch... as it's changing some core code I've never 
worked with before, I am just posting some initial thoughts.

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@7
PS2, Line 7: use DeltaPreparer
Can you talk more about how this changes the logic? Or do you view this as 
simply a refactor?

Can you also put this in context of diff scans and explain how this helps?


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@9
PS2, Line 9: leverage DeltaPreparer
why? just code reuse? get rid of Todd's TODO in the code?


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@25
PS2, Line 25: improvements
Can you be more specific about what we think these improvements are?


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h@258
PS2, Line 258: finished_row
nit: doc this parameter


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc@57
PS2, Line 57: template
Mind documenting the interface to this helper? It's not obvious what the 
meaning of 'finished_row' is in this context.


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc
File src/kudu/tablet/deltamemstore.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc@248
PS2, Line 248:   // TODO(adar): is this correct?
Mind adding a comment with the thought process behind this bookkeeping / 
cleanup code?



--
To view, visit http://gerrit.cloudera.org:8080/11395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
Gerrit-Change-Number: 11395
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 12 Sep 2018 17:26:28 +
Gerrit-HasComments: Yes


[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-09-12 Thread Anonymous Coward (Code Review)
jinxing6...@126.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11333 )

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@225
PS5, Line 225: return checkIfContains(byteBuffer);
> Likewise, precondition on bytes.length vs. bitmap size here?
I'm not sure here.
`bytes` is generated from the records.
We cannot guarantee bytes.length vs. bitmap.size I think


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@27
PS6, Line 27:
Thanks a lot for careful review ~ Adar

I think I resolved your comments, please take another look when you have time


http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/6/java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java@34
PS6, Line 34:   public void testNumberOfHashes() {
I think it's necessary to check correctness of the `nHashes` calculated, but 
didn't find a better way.



--
To view, visit http://gerrit.cloudera.org:8080/11333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
Gerrit-Change-Number: 11333
Gerrit-PatchSet: 6
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: jinxing6...@126.com
Gerrit-Comment-Date: Wed, 12 Sep 2018 17:13:04 +
Gerrit-HasComments: Yes


[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-09-12 Thread Anonymous Coward (Code Review)
Hello Dan Burkert, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11333

to look at the new patch set (#6).

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..

[KUDU-2521] Java Implementation for BloomFilter

Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
---
A java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java
2 files changed, 514 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/11333/6
--
To view, visit http://gerrit.cloudera.org:8080/11333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
Gerrit-Change-Number: 11333
Gerrit-PatchSet: 6
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: jinxing6...@126.com


[kudu-CR] [iwyu] add catalog manager.{cc,h} files

2018-09-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11430 )

Change subject: [iwyu] add catalog_manager.{cc,h} files
..

[iwyu] add catalog_manager.{cc,h} files

Removed catalog_manager.{cc,h} files from the IWYU's black list.
It seems at this point the IWYU tool no longer outputs conflicting
suggestions.

Change-Id: I3b65dc90a5e6378286e51fcfdcdefdf499fbd89c
Reviewed-on: http://gerrit.cloudera.org:8080/11430
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M build-support/iwyu.py
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
3 files changed, 3 insertions(+), 6 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3b65dc90a5e6378286e51fcfdcdefdf499fbd89c
Gerrit-Change-Number: 11430
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [location awareness] Add location info in ksck report

2018-09-12 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11422 )

Change subject: [location_awareness] Add location info in ksck report
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11422/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11422/1//COMMIT_MSG@29
PS1, Line 29: Tablet Server UUID|   Location
: --+---
:  fc18b4bdb2ae4dd0a5717e8fe1f1de69 | 
:  0f25d1891fce48789e06fdec493fb90e | 
:  ad868d782dc743369d96a9e958811f81 | 
:  c541b6ebc3a14f0f9bbee26d52545683 | /foo-bar0/BAAZ.9-quux
:
:Location|  Count
: ---+-
:  /foo-bar0/BAAZ.9-quux |   1
:  |   3
> Is this a part of master summary as well?
ksck output is already pretty lengthy so I don't want to introduce an extra 
table with #rows = #registered TS. Plus I think it's surprising to have TS 
locations not located with TS. So the locations should be another column in the 
TS table, even if it's not the easiest thing to do with ksck's data model.


http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck.h
File src/kudu/tools/ksck.h:

http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck.h@283
PS1, Line 283: std::unordered_map
> Consider introducing a typedef for this and document what key and value con
I don't think we want to store locations like this. See my comment in 
ksck_remote.cc.


http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck.cc@216
PS1, Line 216: sh.ts_locations = master->ts_locations();
The locations should come from the leader master, not from each master. They 
are a "cluster" property so a locations map belongs in KsckCluster rather than 
in each KsckMaster. The fact that each master is assigning a location to each 
TS is kind of an implementation quirk.


http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_remote.h
File src/kudu/tools/ksck_remote.h:

http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_remote.h@84
PS1, Line 84: std::shared_ptr master_proxy_;
I don't think we need this. We should use a KuduClient to get the location from 
ListTabletServers.


http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_remote.cc
File src/kudu/tools/ksck_remote.cc:

http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_remote.cc@467
PS1, Line 467: RemoteKsckCluster::RetrieveTabletServers
This is where we should retrieve location information from the leader master, I 
think. It'll be necessary to add the location to the TabletServer objects 
returned from ListTabletServers. Then it should be stored in the 
KsckTabletServer objects (can be const since we construct them here). That 
should make it easier to map the location information into the TS table instead 
of as a separate section in the master summary.


http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_results.h
File src/kudu/tools/ksck_results.h:

http://gerrit.cloudera.org:8080/#/c/11422/1/src/kudu/tools/ksck_results.h@145
PS1, Line 145: std::unordered_map
> I thought KsckServerHealthSummary is just for a single server instance, why
Right, we just need a single location, which will be populated only for TS 
summaries.



--
To view, visit http://gerrit.cloudera.org:8080/11422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideff2dd4975c99a1135002624de2620fb95c
Gerrit-Change-Number: 11422
Gerrit-PatchSet: 1
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 15:50:54 +
Gerrit-HasComments: Yes


[kudu-CR] [location awareness] replica selection honors placement policy

2018-09-12 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11207 )

Change subject: [location_awareness] replica selection honors placement policy
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 10
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 12 Sep 2018 15:32:13 +
Gerrit-HasComments: No


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-09-12 Thread Attila Piros (Code Review)
Attila Piros has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11199 )

Change subject: Supporting Spark streaming DataFrame in KuduContext.
..


Patch Set 5:

Gentle ping. Is there any new comments?


--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Attila Piros 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 12 Sep 2018 14:09:14 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2463 pt 2: adjust MVCC on Raft no-op

2018-09-12 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11427

to look at the new patch set (#2).

Change subject: KUDU-2463 pt 2: adjust MVCC on Raft no-op
..

KUDU-2463 pt 2: adjust MVCC on Raft no-op

Based on the same rationale as Part 1 of this patch series, this patch
updates MVCC's safe and clean time using the no-op timestamp provided by
the leader following a successful Raft election.

There isn't an obvious reference to the tablet (to get to the MVCC
module) in Raft consensus, but there is a ReplicaTransactionFactory,
that the TabletReplica implements. I've extended this to be a more
general ConsensusRoundHandler that can be used to create transactions or
finish transactions as needed.

An interesting thing to note is that in some cases (e.g. brand new
tablets), the first election would replicate the no-op with a timestamp
of 1 (the timestamp we're trying to avoid). I tracked the cause of this
to be the fact that sometimes the hybrid clock doesn't get updated
before sending out the first no-op, and this will result in the first
assigned timestamp being 1. To work around this, I updated the clock to
the initial clean time, which seems in line with other updates to the
hybrid clock in the time manager.

I tested this in the following ways:
- to ensure nothing terrible happens when there is a lot of election
  churn (and hence, a lot of timestamp advancement), I've tweaked
  exactly_once_writes-itest to actually churn elections. Previously it
  attempted this with just a low timeout; I injected some latency to
  make it churn a bit harder.
- I added a test that ensured that, on its own, a tablet would bump its
  MVCC timestamps, by virtue of its elections
- I tested the above with single-replica tablets as well
- a few other tests needed to be tweaked given the extra bump to the
  hybrid clock

Change-Id: Icbf812e2cb7c322fd980245cfe40c886a15a
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/time_manager.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/exactly_once_writes-itest.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/timestamp_advancement-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
10 files changed, 179 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/11427/2
--
To view, visit http://gerrit.cloudera.org:8080/11427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbf812e2cb7c322fd980245cfe40c886a15a
Gerrit-Change-Number: 11427
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2463 pt 3: don't scan if MVCC hasn't moved

2018-09-12 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11428

to look at the new patch set (#2).

Change subject: KUDU-2463 pt 3: don't scan if MVCC hasn't moved
..

KUDU-2463 pt 3: don't scan if MVCC hasn't moved

In cases when a tablet bootstrap yields an MvccManager whose clean time
has not been advanced (e.g. if there are no write ops in the WAL), and
the tablet has otherwise not bumped its MVCC timestamps (e.g. if it has
not yet elected a leader), a scan (whose correctness depends on the
MvccManager to determine what transactions have been applied) will
return incorrect results.

In the same way that we prevent compactions from occuring if MVCC's
timestamps have not been moved, this patch prevents incorrect results
from being returend from a scan, forcing it to be retried elsewhere.

Tests are added to attempt to scan in this state, verifying that we get
an error.

Change-Id: Idc0f77673e1f04a34ab1f5c1930bbaa2498b39bf
---
M src/kudu/integration-tests/timestamp_advancement-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/mini-cluster/internal_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_service.cc
10 files changed, 158 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/11428/2
--
To view, visit http://gerrit.cloudera.org:8080/11428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc0f77673e1f04a34ab1f5c1930bbaa2498b39bf
Gerrit-Change-Number: 11428
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot