[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13375 )

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 03 Jun 2019 03:24:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8503: allow the Hive Metastore to start with kudu-hive plugin

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13319 )

Change subject: IMPALA-8503: allow the Hive Metastore to start with kudu-hive 
plugin
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4bd1488ced51840ec986d29ed371e26168abcc76
Gerrit-Change-Number: 13319
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 03 Jun 2019 03:20:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-06-02 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..


Patch Set 9:

> Patch Set 9:
>
> (21 comments)
>
> > Patch Set 8:
> >
> > (1 comment)
>
Hi all, I have addressed your previous comments. Please review the updated 
patch set. Thank you very much!


--
To view, visit http://gerrit.cloudera.org:8080/12974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Gerrit-Change-Number: 12974
Gerrit-PatchSet: 9
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 03 Jun 2019 01:33:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-06-02 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..


Patch Set 9:

(21 comments)

> Patch Set 8:
>
> (1 comment)

Hi all, I have tried to addressed your previous comments. Please review the 
updated patch set. Thank you very much!

http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h@48
PS8, Line 48: #define QUERY_OPTS_TABLE\
> Remove the last line, no need to describe symptoms of a DCHECK here.
Done


http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift@391
PS8, Line 391: // scanning.
> EST->ESTIMATE. No real need to abbreviate in a safety valve argument that w
Done


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@285
PS8, Line 285:   private int numFilesNoDiskIds_ = 0;
> Can you rewrite the comment to just describe what the value is. No need to
Done


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@288
PS8, Line 288:   // List of conjuncts for min/max values of 
parquet::Statistics, that are used to skip
> Should be a constant, not a variable, i.e. private static double DEFAULT_RO
Done


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1148
PS8, Line 1148:   " sel=" + 
Double.toString(computeSelectivity()));
> I would prefer if you just passed in the query options to this methods. Oth
Done


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1190
PS8, Line 1190: }
> Do we have a test case for this?
It seems currently we do not have a test case for this. After some 
investigation, I found the following table could exercise this code path.

CREATE TABLE array_demo (
  pets ARRAY 
) STORED AS PARQUET;

Note that sumAvgRowSizes is equal to 0 not because there is no Column defined 
for this table. It is because the type of the current Column is not of 
ScalarType. In this specific case, the current Column is of ArrayType. 
Similarly, if the column is of MapType, sumAvgRowSizes would be equal to 0 as 
well.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1198
PS8, Line 1198: // the hdfs table.
> Should we estimate the compression factor differently depending on the file
Thanks. There are 8 supported file format defined in 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java.

After some discussions with Tim, I divided the files into 3 categories - 
uncompressed, legacy compressed (e.g., text, avro, rc, seq), and  columnar 
(e.g., parquet and orc). Depending on the category of a file, we multiply the 
size of the file by its corresponding compression factor to derive an estimated 
original size of the file before compression, based on which we could compute 
an estimate of the number of rows in the file according to the estimate of the 
row width.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

PS8:
Thanks! I will do as suggested.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@234
PS8, Line 234:
> Not true if there's a group by - we should test that too.
Thanks! Will add another test case for group by.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@262
PS8, Line 262: verifyCardinality("SELECT COUNT(a) FROM 
functional.tinytable", 1, true,
> Can we use something cleaner for constant lists like: Arrays.asList(0, 1, 0
Thanks! Will do as suggested.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@454
PS8, Line 454:  functional.tinytable.
> I think this new planner test option is awkward since the planner tests alr
Thanks for the suggestion! I will revise verifyCardinality as suggested.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@482
PS8, Line 482: List path = Arrays.asList(0);
Thanks for the suggestion! I have added a static variable called "tolerance" in 
this class that denotes the margin of error. The default value of toleranc

[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3479/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Gerrit-Change-Number: 12974
Gerrit-PatchSet: 9
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 03 Jun 2019 00:51:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-06-02 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..

IMPALA-7608: Estimate row count from file size when no stats available

Added the feature that computes an estimated number of rows in the current
hdfs table if the statistics for the cardinality of the current hdfs table is 
not
available.

Also added an additional query option to revert the change in case of 
regression.

Testing:
(1) In CardinalityTest.java, replaced the original statement
"verifyCardinality("SELECT a FROM functional.tinytable", -1);"
with "verifyCardinality("SELECT a FROM functional.tinytable", 10);".
(2) In CardinalityTest.java, added a new test to ensure that
the returned cardinality is still -1 when this feature is disabled.
(3) In CarginalityTest.java, added more tests to check the cardinality
of most PlanNode implementations. For each tested PlanNode, the behaviors
before and after we disable the feature are both tested.
(4) In set.test, modified three related test cases to make sure that
the added query option is included after executing "set all" in various
scenarios.
(5) There are 8 JUnit tests in PlannerTest.java that would produce different
distributed query plans when this feature is enabled. Added an additional
JUnit test for each of those 8 affected JUnit tests when this feature is
enabled. Specifically, each tested query in a newly added test files involves
at least one hdfs table without available statistics.

Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
R 
testdata/workloads/functional-planner/queries/PlannerTest/default-join-distr-mode-shuffle-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/default-join-distr-mode-shuffle-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/joins-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/joins-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing-hdfs-num-rows-est-enabled.test
R 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite-hdfs-num-rows-est-disabled.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
27 files changed, 2,877 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/12974/9
--
To view, visit http://gerrit.cloudera.org:8080/12974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Gerrit-Change-Number: 12974
Gerrit-PatchSet: 9
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13375 )

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4387/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Sun, 02 Jun 2019 21:52:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8503: allow the Hive Metastore to start with kudu-hive plugin

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13319 )

Change subject: IMPALA-8503: allow the Hive Metastore to start with kudu-hive 
plugin
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4386/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4bd1488ced51840ec986d29ed371e26168abcc76
Gerrit-Change-Number: 13319
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Sun, 02 Jun 2019 21:52:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13375 )

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3478/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Sun, 02 Jun 2019 21:36:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13375 )

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13375/5/tests/custom_cluster/test_kudu.py
File tests/custom_cluster/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/13375/5/tests/custom_cluster/test_kudu.py@191
PS5, Line 191:
flake8: W292 no newline at end of file



--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Sun, 02 Jun 2019 20:58:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Hao Hao (Code Review)
Hello Thomas Marshall, Alexey Serbin, Andrew Wong, Grant Henke, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13375

to look at the new patch set (#5).

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..

IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

This commit supports the actual handling of CREATE TABLE DDL for managed
Kudu tables when integration with Hive Metastore is enabled. When
Kudu/HMS integration is enabled, for CREATE TABLE statement, Impala can
rely on Kudu to create the table in the HMS.

Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
---
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M tests/common/custom_cluster_test_suite.py
M tests/common/kudu_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
10 files changed, 305 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/13375/5
--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration

2019-06-02 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13375 )

Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with 
Kudu/HMS integration
..


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@172
PS4, Line 172: org.apache.hadoop.hive.metastore.api.Table msTbl,
 :   boolean isHMSIntegrationEnabled
> these can fit on the same line
It exceeds the line length limit (91).


http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@231
PS4, Line 231: if (hmsHosts != null && kuduHmsHosts != null 
&&hmsHosts.equals(kuduHmsHosts)) {
> nit: separate by a space
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@234
PS4, Line 234:   throw new ImpalaRuntimeException(
 :   String.format("Kudu is integrated with a different 
Hive Metastore " +
 :   "than that used by Impala, Kudu is configured to 
use the HMS: " +
 :   "%s, while Impala is configured to use the HMS: 
%s",
 :   kuduHmsUris, hmsUris));
> readability nit: remove 'else' and move this out of the scope if/else scope
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1930
PS4, Line 1930: !KuduTable.isHMSIntegrationEnabledAndValidate(masterHosts,
  : 
  hmsUris
> formatting - put everything after the '=' on the next line, indented by 4
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@87
PS4, Line 87: public static final String HIVE_METASTORE_URIS_KEY =
:   "hive.metastore.uris";
> This can fit on 1 line
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@120
PS4, Line 120:   /**
 :* Return the value of thrift URI for the remote Hive 
Metastore.
 :*/
 :   public static String 
getHiveMetastoreUrisKeyValue(IMetaStoreClient client)
 :   throws ConfigValSecurityException, TException {
 : return client.getConfigValue(
 : HIVE_METASTORE_URIS_KEY, DEFAULT_HIVE_METASTORE_URIS);
 :   }
> Why to create this shortcut instead of just using getMetastoreConfigValue()
This is to ensure always use the correct key (HIVE_METASTORE_URIS_KEY) when 
calling this function.


http://gerrit.cloudera.org:8080/#/c/13375/4/tests/custom_cluster/test_kudu.py
File tests/custom_cluster/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/13375/4/tests/custom_cluster/test_kudu.py@129
PS4, Line 129: manged
> typo
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/13375/4/tests/query_test/test_kudu.py@93
PS4, Line 93:   @SkipIfKudu.hms_integration_enabled
> Why would a test like this, which is focused on scanning kudu tables, need
Done


http://gerrit.cloudera.org:8080/#/c/13375/4/tests/query_test/test_kudu.py@175
PS4, Line 175:   @SkipIfKudu.hms_integration_enabled
> I assume that for cases like these, where the table is modified outside of
I am not sure it is the case, as far as my understanding of IMPALA-4828 is due 
to the table schema is cached in impala which is different from the one in 
Kudu. While with HMS integration, Kudu only automatically update the schema in 
the HMS. Therefore, I don't see the behavior will change. Or I am missing 
something?



--
To view, visit http://gerrit.cloudera.org:8080/13375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502
Gerrit-Change-Number: 13375
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Sun, 02 Jun 2019 20:57:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8503: allow the Hive Metastore to start with kudu-hive plugin

2019-06-02 Thread Hao Hao (Code Review)
Hello Thomas Marshall, Alexey Serbin, Andrew Wong, Adar Dembo, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13319

to look at the new patch set (#5).

Change subject: IMPALA-8503: allow the Hive Metastore to start with kudu-hive 
plugin
..

IMPALA-8503: allow the Hive Metastore to start with kudu-hive plugin

This patch allows to start the Hive Metasotre with Kudu plugin which is
required for enabling Kudu's integration with the HMS. The Kudu plugin
is downloaded and extracted from native-toolchain S3 bucket.

Change-Id: I4bd1488ced51840ec986d29ed371e26168abcc76
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M fe/src/test/resources/hive-site.xml.py
M impala-parent/pom.xml
M testdata/bin/run-hive-server.sh
5 files changed, 22 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/13319/5
--
To view, visit http://gerrit.cloudera.org:8080/13319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4bd1488ced51840ec986d29ed371e26168abcc76
Gerrit-Change-Number: 13319
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall