[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads

2016-09-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#5).

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..

IMPALA-3823: Add timer to measure Parquet footer reads

It's been observed that Parquet footer reads perform poorly, especially
when reading from S3. This patch adds a timer, "FooterProcessingTimer",
which keeps track of the average time each split of each scan node
spends reading and processing the Parquet footer.

Added a new utility counter called SummaryStatsCounter which keeps
track of the min, max and average values seen so far from a set of
values. This counter is used to calculate the min, max and average
time taken to scan and process Parquet footers per query per node.
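
For illustration only, here is a minimal Python sketch of the min/max/average
bookkeeping such a counter performs; the real SummaryStatsCounter is C++ code
in be/src/util/runtime-profile-counters.h, and the class and method names
below are hypothetical, not Impala code:

  class SummaryStats(object):
      """Tracks the min, max and average of a stream of observed values."""
      def __init__(self):
          self.num_values = 0
          self.total = 0
          self.min_value = None
          self.max_value = None

      def update(self, value):
          # Each observation (e.g. one split's footer read-and-decode time)
          # updates the running statistics.
          self.num_values += 1
          self.total += value
          if self.min_value is None or value < self.min_value:
              self.min_value = value
          if self.max_value is None or value > self.max_value:
              self.max_value = value

      def average(self):
          return self.total / float(self.num_values) if self.num_values else 0.0

Each Parquet split would feed its footer read-and-decode time into update(),
and the profile would then report the min, max and average per node.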

The RuntimeProfile has also been updated to keep track of, display,
and serialize this new counter to Thrift.

BE tests have been added to verify that this counter works fine.

Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/query_test/test_scanners.py
8 files changed, 288 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4371/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-09-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..


Patch Set 11:

(1 comment)

I agree we should follow up offline to fix your env issues and update the wiki 
if necessary.

http://gerrit.cloudera.org:8080/#/c/3942/11/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java:

Line 80: if (tableRef instanceof CollectionTableRef) {
> Is it possible for an AlterTableStmt to get a CollectionTableRef?
I think you could have something like this:

use functional;
alter table allcomplextypes.int_array_col set fileformat parquet;

You can look at some existing analyzer tests to see how to create a test that 
has a "functional" default db.


-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-09-14 Thread Amos Bird (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3942

to look at the new patch set (#12).

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..

IMPALA-1654: General partition exprs in DDL operations.

This commit handles partition-related DDL in a more general way. We can
now use compound predicates to specify a list of partitions in
statements such as ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL
STATS. It also ensures that some statements, such as PARTITION SET
LOCATION and LOAD DATA, accept only one partition at a time. ALTER
TABLE ADD PARTITION continues to use the old PartitionKeyValue logic.

The changed partition-related DDL statements are as follows:

Table: p (i int) partitioned by (j int, k string)
Partitions:
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | a | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | d | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | e | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | f | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+

1. show files in p partition (j<2, k='a');
2. alter table p partition (j<2, k in ("b","c")) set cached in 'testPool';

// j can appear more than once,
3.1. alter table p partition (j<2, j>0, k<>"d") set uncached;
// it is the same as
3.2. alter table p partition (j<2 and j>0, not k="e") set uncached;
// we can also do 'or'
3.3. alter table p partition (j<2 or j>0, k like "%") set uncached;

// missing 'k' matches all values of k
4. alter table p partition (j<2) set fileformat textfile;
5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v");
6. alter table p partition (j is not null) set tblproperties ("k"="v");
7. alter table p drop partition (j<2);
8. compute incremental stats p partition(j<2);

The remaining old-style partition-related DDL statements are as follows:

1. load data inpath '/path/from' into table p partition (j=2, k="d");
2. alter table p add partition (j=2, k="g");
3. alter table p partition (j=2, k="g") set location '/path/to';
4. insert into p partition (j=2, k="g") values (1), (2), (3);

General partition expressions or partially specified partition specs
allow partition predicates to return an empty partition set, regardless of
whether 'IF EXISTS' is specified.

Examples:

[localhost.localdomain:21000] > alter table p drop partition (j=2, k="f");
Query: alter table p drop partition (j=2, k="f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.78s
[localhost.localdomain:21000] > alter table p drop partition (j=2, k<"f");
Query: alter table p drop partition (j=2, k<"f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 2 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.41s
[localhost.localdomain:21000] > alter table p drop partition (k="a");
Query: alter table p drop partition (k="a")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.25s
[localhost.localdomain:21000] > show partitions p;
Query: show partitions p
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+
Fetched 3 row(s) in 0.01s

Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
---
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetCachedStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java
M 

[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-09-14 Thread Amos Bird (Code Review)
Amos Bird has posted comments on this change.

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..


Patch Set 11:

(1 comment)

Thanks for helping me fix the auditing bugs. I hope this wiki will help me get
my local env working:
https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests

http://gerrit.cloudera.org:8080/#/c/3942/11/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java:

Line 80: if (tableRef instanceof CollectionTableRef) {
> Could you add a test for this?
Is it possible for an AlterTableStmt to get a CollectionTableRef?


-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump Kudu version to 1.0-RC1

2016-09-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: Bump Kudu version to 1.0-RC1
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbe554d6782212f91db07757f429c5571a7a44da
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-09-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#4).

Change subject: IMPALA-3739: Enable stress tests on Kudu
..

IMPALA-3739: Enable stress tests on Kudu

This commit modifies the stress test framework to run TPC-H and TPC-DS
workloads against Kudu. The following changes are included in this
commit:
1. Created template files with DDL and DML statements for loading TPC-H and
   TPC-DS data in Kudu
2. Created a script (load-tpc-kudu.py) to load data in Kudu. The
   script is invoked by the stress test runner to load test data in an
   existing Impala/Kudu cluster (both local and CM-managed clusters are
   supported).
3. Created SQL files with TPC-DS queries to be executed in Kudu. SQL
   files with TPC-H queries for Kudu were added in a previous patch.
4. Modified the stress test runner to take additional parameters
   specific to Kudu (e.g. kudu master addr)

The stress test runner for Kudu was tested on EC2 clusters for both TPC-H
and TPC-DS workloads.

Missing functionality:
* No CRUD operations in the existing TPC-H/TPC-DS workloads for Kudu.
* Not all supported TPC-DS queries are included. Currently, only the
  TPC-DS queries from the testdata/workloads/tpcds/queries directory
  were modified to run against Kudu.

Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
---
A testdata/bin/load-tpc-kudu.py
A testdata/datasets/tpcds/tpcds_kudu_template.sql
A testdata/datasets/tpch/tpch_kudu_template.sql
A testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q27.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q3.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q34.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q42.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q43.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q46.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q52.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q53.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q55.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q59.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q6.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q61.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q63.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q68.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q7.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q73.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q79.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q8.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q88.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q89.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q96.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q98.test
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
30 files changed, 2,473 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4327/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-09-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3739: Enable stress tests on Kudu
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/bin/load-tpc-kudu.py
File testdata/bin/load-tpc-kudu.py:

PS3, Line 51: tbls_to_clean = tpch_tables if workload.lower() == 'tpch' 
else tpcds_tables
> Maybe use the cursor to get the list of tables? That way you don't have to 
The change to enable drop db cascade for Kudu is in review. If you're strongly 
against simply doing a drop db when this is in, I'll implement your proposal.
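
For reference, the suggested alternative might look roughly like this; a
hedged sketch assuming a DB-API-style Impala cursor with execute() and
fetchall(), with hypothetical helper names, not the actual load-tpc-kudu.py
code:

  def tables_in_db(cursor, db_name):
      # Ask Impala for the table list instead of hardcoding tpch/tpcds names.
      cursor.execute('SHOW TABLES IN %s' % db_name)
      return [row[0] for row in cursor.fetchall()]

  def drop_tables(cursor, db_name):
      for table in tables_in_db(cursor, db_name):
          cursor.execute('DROP TABLE IF EXISTS %s.%s' % (db_name, table))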


PS3, Line 81:   sql_file_path = "%s/testdata/datasets/%s/%s_kudu_template.sql"
> Use os.path.join() here.
Done
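
As a small illustration of the suggested change, assuming the first "%s" is
IMPALA_HOME and the other two are the workload name (variable values here are
placeholders, not the actual script):

  import os

  impala_home = os.environ.get('IMPALA_HOME', '/tmp/Impala')  # placeholder
  workload = 'tpcds'  # placeholder

  # Before: manual string interpolation with embedded slashes.
  old_path = '%s/testdata/datasets/%s/%s_kudu_template.sql' % (
      impala_home, workload, workload)

  # After: os.path.join() assembles the same path from its components.
  new_path = os.path.join(impala_home, 'testdata', 'datasets', workload,
                          '%s_kudu_template.sql' % workload)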


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/datasets/tpcds/tpcds_kudu_template.sql
File testdata/datasets/tpcds/tpcds_kudu_template.sql:

PS3, Line 39: 'kudu.key_columns' = 'ss_sold_date_sk,ss_ticket_number, 
ss_item_sk'
> For my education, I looked at http://www.tpc.org/tpc_documents_current_vers
No intention here, I just got this wrong :) I actually found a few other 
inconsistencies.


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q19.test:

Line 39: 
> I noticed none of the TPC-DS Kudu queries have RESULTS. Why? (I searched fo
The TPC-DS workload is not currently enabled for Kudu in our regular 
(non-stress) testing framework. It requires some extra work which I left for 
another patch. Once this is done, these queries will also get proper RESULTS 
and TYPES sections. For now it seemed kind of pointless to add these sections. 
Besides, the stress test has its own result verification mechanism.


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q47.test:

PS3, Line 33:   ,round(v1_lead.sum_sales, 2) nsum
> Nit: tab character.
Done


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q65.test:

PS3, Line 55: order by
:   s_store_name,
:   i_item_desc,
:   sc.revenue,
:   i_current_price,
:   i_wholesale_cost,
:   i_brand
> The ORDER BY has more columns than the TPC-DS-for-HDFS counterpart. Any rea
I had trouble getting consistent results across multiple invocations of this 
query unless I included these additional columns here. Added a comment.


http://gerrit.cloudera.org:8080/#/c/4327/3/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS3, Line 1463: tpch_kudu_queries = load_tpc_queries("tpch", "kudu")
> Change "kudu" to load_in_kudu=True
Good catch, thanks. Done


PS3, Line 1468: tpcds_kudu_queries = load_tpc_queries("tpcds", "kudu")
> Change "kudu" to load_in_kudu=True
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4116: Remove 'cdh' from version string

2016-09-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4116: Remove 'cdh' from version string
..


Patch Set 2: Code-Review+2

I've rethought it and I'm convinced this has a very low probability of causing 
issues.

-- 
To view, visit http://gerrit.cloudera.org:8080/4421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7754538a23e73dcdebc6e3df509f357cbe03198c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contain equality functions

2016-09-14 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has uploaded a new patch set (#2).

Change subject: IMPALA-4101: qgen: Hive join predicates should only contain
equality functions
..

IMPALA-4101: qgen: Hive join predicates should only contain equality functions

Background:

Hive only supports equi-joins in its JOIN clause, while Postgres and Impala
support more complex predicates such as <, <=, >, and >=. This change modifies
the QueryGenerator._create_relational_join_condition and
QueryGenerator._create_boolean_func_tree methods so that, when required, they
construct only equality join conditions.

The _create_boolean_func_tree method is invoked via
QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause ->
_create_relational_join_condition -> _create_boolean_func_tree. This method is 
invoked
when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of 
functions
that would typically be found in any of these clauses.

Changes:

A parameter "signatures" is added to the method _create_boolean_func_tree; it
lists all the signatures the function is allowed to use. Previously, this list
of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and
if "signatures" is not specified the code still defaults back to the results of
that method. A new method in the DefaultProfile called
get_allowed_join_signatures is introduced; it returns the list of function
signatures that are allowed within a JOIN clause. The DefaultProfile allows all
given signatures, while the HiveProfile only allows the Equals and And
functions, as well as any function that operates over only one column. The
reason for these restrictions is that Hive only allows equality joins, does not
allow OR operators in the join clause, and does not allow any operator that
operates over multiple columns.
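
A rough Python sketch of the profile hook described above; the signature
representation and anything not named in this commit message are assumptions,
not the actual query_profile.py code:

  from collections import namedtuple

  # Hypothetical, simplified stand-in for the query generator's signatures.
  Signature = namedtuple('Signature', ['func_name', 'args'])

  class DefaultProfile(object):
      def get_allowed_join_signatures(self, signatures):
          # The default profile allows every signature it is given.
          return signatures

  class HiveProfile(DefaultProfile):
      def get_allowed_join_signatures(self, signatures):
          # Hive only supports equi-joins and no OR in the JOIN clause, so keep
          # Equals and And plus any function operating over a single column.
          return [s for s in signatures
                  if s.func_name in ('Equals', 'And') or len(s.args) == 1]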

Note that _create_boolean_func_tree still allows OR operators due to some logic
around its "and_or_fill_ratio" variable. The plan is to fix this in a future
patch that specifically focuses on removing OR operators from Hive JOIN clauses.

Minor change to discrepancy_searcher so that the logs print out "Hive" instead 
of
"Impala" when running against Hive.

Testing:

* Added a new unit test that ensures the HiveProfile only returns equality joins
* Tested against Hive locally
* Tested against Impala via Leopard
* Tested against Impala via the Discrepancy Checker inside an Impala Docker 
container

Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459
---
M tests/comparison/discrepancy_searcher.py
M tests/comparison/query_generator.py
M tests/comparison/query_profile.py
A tests/comparison/tests/hive/test_create_join_clause.py
4 files changed, 110 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-14 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 5:

> Uploaded patch set 5: Patch Set 4 was rebased.

@Michael, not sure what happened there, but I just pushed the rebased version.

-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4116: Remove 'cdh' from version string

2016-09-14 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4116: Remove 'cdh' from version string
..

IMPALA-4116: Remove 'cdh' from version string

Change-Id: I7754538a23e73dcdebc6e3df509f357cbe03198c
---
M bin/save-version.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/4421/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7754538a23e73dcdebc6e3df509f357cbe03198c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4411/3/bin/gen_build_version.py
File bin/gen_build_version.py:

Line 35: # Redirecting stdout and stderr to os.devnull
> nit: You could remove both parameters altogether, since None is the default
Didn't realize this earlier, but None actually doesn't redirect stdout or
stderr, so the output still ends up being printed. I've added a change to
redirect both to os.devnull.
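
For context, a minimal sketch of the resulting pattern using the standard
subprocess module; the exact code in gen_build_version.py may differ:

  import os
  from subprocess import call

  # Discard both streams so a failing 'git rev-parse' does not pollute the
  # build output.
  with open(os.devnull, 'w') as devnull:
      can_obtain_git_hash = call(['git', 'rev-parse', 'HEAD'],
                                 stdout=devnull, stderr=devnull) == 0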


-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4411/3/bin/gen_build_version.py
File bin/gen_build_version.py:

Line 35: can_obtain_git_hash = call(['git', 'rev-parse', 'HEAD'], stdout=None, 
stderr=None) == 0
nit: You could remove both parameters altogether, since None is the default for 
both.


-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4411/2/bin/gen_build_version.py
File bin/gen_build_version.py:

Line 35: can_obtain_git_hash = call(['git', 'rev-parse', 'HEAD'], stdout=PIPE, 
stderr=PIPE) == 0
Python docs discourage using PIPE here as it can block, depending on the output 
volume of the subprocess (https://docs.python.org/2/library/subprocess.html). 
However, in your case you're not reading the output, so you could just use None 
for both instead, which is also the default.


Line 43:   # SAVE_VERSION_SCRIPT will generate a dummy version.info file if we 
cannot obtain the
Maybe replace L54 with this?


http://gerrit.cloudera.org:8080/#/c/4411/2/bin/save-version.sh
File bin/save-version.sh:

Line 25: GIT_HASH=$(git rev-parse HEAD) > /dev/null
I think this should be 2> /dev/null, no?


-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-09-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..


Patch Set 11:

(6 comments)

Amos, sorry to meddle with your patch but it seemed easier/faster this way 
since I have no idea what's wrong with your env.

I fixed the AuthorizationTest and AuditingTest for you. Please look at the 
changes carefully and try to apply the same principles in other places in your 
patch.

I added comments to explain what the bug was.

I'm still looking at those other failing tests. Hopefully we can fix your env 
so you can repro yourself.

http://gerrit.cloudera.org:8080/#/c/3942/11/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java:

Line 70:   public void analyze(Analyzer analyzer) throws AnalysisException {
The changes here lead to a bunch of changes in error messages which I think are 
fine.

Can you fix the expected error messages in the FE tests, e.g., AnalyzeDDLTest?


Line 72: TableRef tableRef = new TableRef(tableName_.toPath(), null, 
Privilege.ALTER);
This version is better because:
* we only ask the catalog for this table once, avoiding potential inconsistency 
problems
* we register the appropriate audit and auth events exactly once in 
tableRef.analyze()
* more streamlined code


Line 80: if (tableRef instanceof CollectionTableRef) {
Could you add a test for this?


http://gerrit.cloudera.org:8080/#/c/3942/10/fe/src/main/java/com/cloudera/impala/analysis/ShowFilesStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/ShowFilesStmt.java:

Line 61: TableRef tableRef = new TableRef(tableName_.toPath(), null, 
Privilege.VIEW_METADATA);
I'm explaining the auth/audit bug in the following comments.


Line 65: if (!(table_ instanceof HdfsTable)) {
This registers audit and auth events for the VIEW_METADATA privilege.


Line 74:   partitionSet_.setPartitionShouldExist();
This registers audit and auth events for the SELECT privilege - which is wrong 
and caused tests to fail.


-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-09-14 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#11).

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..

IMPALA-1654: General partition exprs in DDL operations.

This commit handles partition-related DDL in a more general way. We can
now use compound predicates to specify a list of partitions in
statements such as ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL
STATS. It also ensures that some statements, such as PARTITION SET
LOCATION and LOAD DATA, accept only one partition at a time. ALTER
TABLE ADD PARTITION continues to use the old PartitionKeyValue logic.

The changed partition-related DDL statements are as follows:

Table: p (i int) partitioned by (j int, k string)
Partitions:
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | a | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | d | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | e | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | f | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+

1. show files in p partition (j<2, k='a');
2. alter table p partition (j<2, k in ("b","c")) set cached in 'testPool';

// j can appear more than once,
3.1. alter table p partition (j<2, j>0, k<>"d") set uncached;
// it is the same as
3.2. alter table p partition (j<2 and j>0, not k="e") set uncached;
// we can also do 'or'
3.3. alter table p partition (j<2 or j>0, k like "%") set uncached;

// missing 'k' matches all values of k
4. alter table p partition (j<2) set fileformat textfile;
5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v");
6. alter table p partition (j is not null) set tblproperties ("k"="v");
7. alter table p drop partition (j<2);
8. compute incremental stats p partition(j<2);

The remaining old-style partition-related DDL statements are as follows:

1. load data inpath '/path/from' into table p partition (j=2, k="d");
2. alter table p add partition (j=2, k="g");
3. alter table p partition (j=2, k="g") set location '/path/to';
4. insert into p partition (j=2, k="g") values (1), (2), (3);

General partition expressions or partially specified partition specs
allow partition predicates to return an empty partition set, regardless of
whether 'IF EXISTS' is specified.

Examples:

[localhost.localdomain:21000] > alter table p drop partition (j=2, k="f");
Query: alter table p drop partition (j=2, k="f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.78s
[localhost.localdomain:21000] > alter table p drop partition (j=2, k<"f");
Query: alter table p drop partition (j=2, k<"f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 2 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.41s
[localhost.localdomain:21000] > alter table p drop partition (k="a");
Query: alter table p drop partition (k="a")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.25s
[localhost.localdomain:21000] > show partitions p;
Query: show partitions p
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+
Fetched 3 row(s) in 0.01s

Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
---
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetCachedStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetTblProperties.java
M 

[Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder

2016-09-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
..


Patch Set 14:

(53 comments)

http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 207: state, Substitute(PREPARE_FAILED_ERROR_MSG, "write"));
> old code and agg uses state_->block_mgr()->MemLimitTooLowError().  Any reas
Historical reasons, it looks like: the PrepareForRead() version below diverged
from the PAGG/PHJ versions.

Switched to the block_mgr version for consistency.


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

PS14, Line 297: SCOPED_TIMER(build_timer_);
> We used to have two different timers in PHJ for measuring the time to hash 
It's not clear to me that the original timers were well thought out. It didn't 
look like there was a timer for how long it took to build the hash table - that 
was lumped into 'build_timer_'.

I added two new timers with clearer meanings that track the total hash table
build time and the time spent partitioning rows. Also moved the remaining
build-related timers into the Builder.

Now 'build_timer_' is the initial partitioning and hash table build, and
'repartition_timer_' is the time spent repartitioning.


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

Line 179: largest_fraction = max(largest_fraction,
> DCHECK_GT(num_build_rows, 0);
I don't think that's valid; it's casting to double to avoid crashing on the
divide-by-zero. I changed it to skip the calculation when num_build_rows is 0,
which avoids that problem.


PS14, Line 318: a hash table
> hash tables
Done. Both alternatives seem grammatically correct to me though.


Line 341:   // building hash tables to avoid building hash tables for 
partitions we'll spill anyway.
> That's an interesting comment. If I understand it correctly, it means that 
Done with some slight tweaks.

This is a lot simpler than the previous behaviour, where spilling can happen 
even during probing.

I didn't measure how often, but it's not at all improbable, since the probe 
buffers are 8MB each and we could allocate up to 16 of them.


Line 401:   RETURN_IF_ERROR(SpillPartition());
> May help to document that failure to find any partition to spill (e.g. all 
I added to the comment at the top of the loop to make the termination condition
more explicit.


PS14, Line 529: PhjBuilder::HashTableStoresNulls()
> This seems to belong to PartitionedHashJoinNode conceptually.
I feel like it makes sense here though since the builder owns the hash tables.

Plumbing-wise it's also easier since PhjBuilder needs this info and starts 
executing before PartitionedHashJoinNode.


Line 646:   do {
> Please see comments in BlockingJoinNode,  it would be great to retain timer
Done


PS14, Line 782: process_build_batch_fn_level0
> 'process_build_batch_fn_level0_'
I just copied this verbatim - it seems to be referring to the local variable 
though.


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

Line 425: 
> nit: unnecessary blank line ?
Done


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

PS14, Line 151: 
  : 
  : 
> This may be important information to retain.
Not sure why I removed it. May have had a good reason but I can't recall it.


PS14, Line 87: Expr::CreateExprTree(pool_, eq_join_conjunct.right
> Does this overlap with CreateExprTree() in PhjBuilder::Init() ? Same questi
They are redundant currently, but the idea is that PhjBuilder and PhjNode will 
exist more independently in different plan fragments, so I wanted to 
encapsulate the state where possible.

The expr contexts need to be independent to eventually support many probe sides
sharing a single build side for broadcast joins.

I think documenting this explicitly may be confusing because it is sort-of 
explaining the current state of things relative to the previous way of doing 
things. Whereas I think with the separate build side, the default assumption is 
that builder data structures are private and not shared with the exec node.

It may make sense to share Expr trees globally between fragment instances as 
part of the multithreading changes, but I don't think it's worth trying to 
share them here until we have the right infrastructure.


Line 213:   for (unique_ptr& partition : spilled_partitions_)
> missing { }
Ah, missed that clang-format wrapped it.


PS14, Line 494: to output
> outputting
Done


PS14, Line 585: hash_partitions_
> 'hash_partitions_'
Done


Line 594:   continue;
> Is there a reason why we cannot call OutputUnmatchedBuild() directly at thi
I 

[Impala-ASF-CR] IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables

2016-09-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
..


Patch Set 1:

(28 comments)

Ok, here's a bit to get started. Just made my way through most of analysis. 
Still have a lot to get through but figured I'll give you the feedback I have 
while I do a few other things.

http://gerrit.cloudera.org:8080/#/c/4414/1//COMMIT_MSG
Commit Message:

PS1, Line 21: When attempting to create an external
:   table "foo" in database "bar", Impala will search for a Kudu 
table
:   name "foo.bar" and "bar" (Kudu doesn't have database name spaces
:   yet.)
I think this sentence is duplicated by #10. Maybe remove this part. This item 
is about reading the columns, right? The issue of table name resolution is 
separate IMO, and covered by #10, though consider moving them next to each other :)


PS1, Line 25:  The Kudu table is now required to exist at the time of creation 
in
:   Impala.
for external tables? or if this is more general can you explain a bit more?


PS1, Line 31: default value
can it still be overridden?


PS1, Line 39: Eventually Hive should have the needed
:   functionality and the Kudu delegate (renamed in this patch to 
KuduCatalogOpExecutor)
:   can be removed.
maybe, it's not clear if impala would want to keep the ability to do this on 
its own... Remove?


PS1, Line 54:  If the database is "default" then
:   the Kudu table name will not include the database.
hm... I wonder if it'd be easier just to keep it standardized and always have 
the name, i.e. using default in this case. What do you think? I'll see if I 
change my mind after reading more code.


PS1, Line 56: 11) Several improvements in the grammar related to the family
:   of CREATE TABLE statements.
I don't think this bullet adds much value to the commit message. Everything 
else is more user visible, this is just some necessary code cleanup.


PS1, Line 58: 12) Added new tests and modified existing Kudu test to use the new
:   CREATE TABLE syntax.
I think this can be removed too, it should be implied there are tests and it 
takes away from the billion other "features" in this patch :)


PS1, Line 60: source of truth
Does this mean we always hit Kudu for metadata? In the Catalog? In the 
impalad's Catalog?


PS1, Line 60: insteads
instead


PS1, Line 64: Not included in this commit:
: - Additional column properties such as nullability and compression
: encodings.
I don't think you need to include this, we have a separate JIRA for this.


http://gerrit.cloudera.org:8080/#/c/4414/1/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

FYI I'm not looking at this very carefully because it was reviewed previously 
https://gerrit.cloudera.org/#/c/2865/

Let me know if this has changed much since then.


PS1, Line 405:   view_column_def_list, view_column_defs;
separate nonterminals if they dont fit?


PS1, Line 458: key_ident;
does this not need Boolean like those below?


http://gerrit.cloudera.org:8080/#/c/4414/1/fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java:

I'm not in love with the TableDef and TableDefOptions stuff because it feels 
like that's parser implementation concerns bleeding into analysis code.

Do you think we need both? Can we fold the latter into TableDef?

If we have to keep both, it'd be helpful to at least group the functions in 
this file based on their source (e.g. TableDef, TDOptions, or on this itself) 
with comments around the groups? Maybe within those groups they could also be 
sorted by visibility? There's a lot of stuff going on so I think it'd be 
helpful to keep it tidy.


PS1, Line 240: 
getColumnDefs().add(KuduCatalogOpExecutor.REQUIRED_THRIFT_COLUMN_DEF);
is this needed? I thought there was going to be code that fetches the columns 
from Kudu and stores them.


PS1, Line 242:   if 
(!getTblProperties().containsKey(KuduTable.KEY_TABLE_NAME)) {
 : getTblProperties().put(KuduTable.KEY_TABLE_NAME,
 : KuduUtil.getDefaultCreateKuduTableName(getDb(), 
getTbl()));
 :   }
hm... can we simplify this and only allow internal tables to be created with 
the name db_name.table_name? Obviously there could be conflicts and it should 
fail then. Then it'd be 1 less case to consider.

Also, I wonder if we should prefix the underlying kudu table names with 
something, e.g. IMPALA_db.table?


PS1, Line 275: // Kudu's data distribution rules are enforced here. A 
good reference is
 : // 
http://getkudu.io/docs/schema_design.html#data-distribution. As a summary:
 : // * Only primary key column can be used in distribution 
definitions
 :   

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a Python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase startup procedure.

HBase startup issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the Python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the returned string (see the
  sketch after this list)
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.
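
A hedged sketch of this kind of kazoo-based check; the znode paths come from
the description above, while the helper name, host/port and timeout are
illustrative, not necessarily check-hbase-nodes.py as written:

  from kazoo.client import KazooClient

  def hbase_nodes_exist(zk_hosts='localhost:2181',
                        nodes=('/hbase/master', '/hbase/rs')):
      """Return True if the given HBase znodes are present in Zookeeper."""
      zk = KazooClient(hosts=zk_hosts)
      zk.start(timeout=30)
      try:
          return all(zk.exists(node) is not None for node in nodes)
      finally:
          zk.stop()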

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Reviewed-on: http://gerrit.cloudera.org:8080/4348
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 176 insertions(+), 59 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 9: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4411/1/bin/gen_build_version.py
File bin/gen_build_version.py:

PS1, Line 34: can_obtain_git_hash = False if os.system('git rev-parse HEAD') 
else True
> This took me a second glance to understand, maybe replace with
Done


PS1, Line 39: if not version_file_exists or can_obtain_git_hash:
> what happens if we cannot obtain the git hash and don't have a version file
Done


http://gerrit.cloudera.org:8080/#/c/4411/1/bin/save-version.sh
File bin/save-version.sh:

Line 25: GIT_HASH=$(git rev-parse HEAD)
> Discard stderr?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#2).

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..

Make gen_build_version.py resilient to a failing git rev-parse

It was noticed that some build processes did not check out Impala and
instead built it from a tarball. This would cause our gen_build_version
script to write blank version info to the version.info file every time.

This patch handles the case where an existing version.info file is
present but we cannot get the git rev-parse output: in that case we use
the old file instead. Blank version info is written only when we don't
have an old version.info file and we cannot do a git rev-parse.
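
A simplified sketch of the decision logic described above; the file path and
function names are illustrative, not the actual gen_build_version.py code:

  import os
  from subprocess import call

  VERSION_FILE = 'version.info'  # illustrative path; the real location differs

  def can_obtain_git_hash():
      with open(os.devnull, 'w') as devnull:
          return call(['git', 'rev-parse', 'HEAD'],
                      stdout=devnull, stderr=devnull) == 0

  # Regenerate version.info only when we can get a git hash or when no old file
  # exists; otherwise keep the existing file instead of overwriting it with
  # blank version info.
  if can_obtain_git_hash() or not os.path.exists(VERSION_FILE):
      print('regenerating %s' % VERSION_FILE)
  else:
      print('reusing existing %s' % VERSION_FILE)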

Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
---
M bin/gen_build_version.py
M bin/save-version.sh
2 files changed, 27 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4411/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-09-14 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3739: Enable stress tests on Kudu
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/bin/load-tpc-kudu.py
File testdata/bin/load-tpc-kudu.py:

PS3, Line 51: tbls_to_clean = tpch_tables if workload.lower() == 'tpch' 
else tpcds_tables
Maybe use the cursor to get the list of tables? That way you don't have to 
hardcode the table names L39-46.


PS3, Line 81:   sql_file_path = "%s/testdata/datasets/%s/%s_kudu_template.sql"
Use os.path.join() here.


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/datasets/tpcds/tpcds_kudu_template.sql
File testdata/datasets/tpcds/tpcds_kudu_template.sql:

PS3, Line 39: 'kudu.key_columns' = 'ss_sold_date_sk,ss_ticket_number, 
ss_item_sk'
For my education, I looked at 
http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v2.3.0.pdf and saw 
that for this table, the PK is ss_item_sk,ss_ticket_number . Can you explain 
why ss_sold_date_sk is added as a key column?


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q19.test:

Line 39: 
I noticed none of the TPC-DS Kudu queries have RESULTS. Why? (I searched for a 
TODO and didn't see a reason that might explain it; maybe I missed it.)


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q47.test:

PS3, Line 33:   ,round(v1_lead.sum_sales, 2) nsum
Nit: tab character.


http://gerrit.cloudera.org:8080/#/c/4327/3/testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
File testdata/workloads/tpcds/queries/tpcds-kudu-q65.test:

PS3, Line 55: order by
:   s_store_name,
:   i_item_desc,
:   sc.revenue,
:   i_current_price,
:   i_wholesale_cost,
:   i_brand
The ORDER BY has more columns than the TPC-DS-for-HDFS counterpart. Any reason?


http://gerrit.cloudera.org:8080/#/c/4327/3/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS3, Line 1463: tpch_kudu_queries = load_tpc_queries("tpch", "kudu")
Change "kudu" to load_in_kudu=True


PS3, Line 1468: tpcds_kudu_queries = load_tpc_queries("tpcds", "kudu")
Change "kudu" to load_in_kudu=True


-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


IMPALA-3718: Support subset of functional-query for Kudu

Adds initial support for the functional-query test workload
for Kudu tables.

There are a few issues that make loading the functional
schema difficult on Kudu:
 1) Kudu tables must have one or more columns that together
constitute a unique primary key.
   a) Primary key columns must currently be the first columns
  in the table definition (KUDU-1271).
   b) Primary key columns cannot be nullable (KUDU-1570).
 2) Kudu tables must be specified with distribution
parameters.

(1) limits the tables that can be loaded without ugly
workarounds. This patch only includes important tables that
are used for relevant tests, most notably the alltypes*
family. In particular, alltypesagg is important but it does
not have a set of columns that are non-nullable and form a unique
primary key. As a result, that table is created in Kudu with
a different name and an additional BIGINT column for a PK
that is a unique index and is generated at data loading time
using the ROW_NUMBER analytic function. A view is then
wrapped around the underlying table that matches the
alltypesagg schema exactly. When KUDU-1570 is resolved, this
can be simplified.

(2) requires some additional considerations and custom
syntax. As a result, the DDL to create the tables is
explicitly specified in CREATE_KUDU sections in the
functional_schema_constraints.csv, and an additional
DEPENDENT_LOAD_KUDU section was added to specify custom data
loading DML that differs from the existing DEPENDENT_LOAD.

TODO: IMPALA-4005: generate_schema_statements.py needs refactoring

Tests that are not relevant or not yet supported have been
marked with xfail and a skip where appropriate.

TODO: Support remaining functional tables/tests when possible.

Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Reviewed-on: http://gerrit.cloudera.org:8080/4175
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M bin/impala-config.sh
M testdata/bin/compute-table-stats.sh
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M tests/common/skip.py
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_exprs.py
M tests/query_test/test_queries.py
M tests/query_test/test_runtime_filters.py
M tests/query_test/test_scanners.py
M tests/query_test/test_tpcds_queries.py
19 files changed, 343 insertions(+), 80 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..


IMPALA-4110: Clean up issues found by Apache RAT.

Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
Reviewed-on: http://gerrit.cloudera.org:8080/4361
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M LICENSE.txt
M be/.astylerc
M be/.impala.doxy
M be/src/benchmarks/expr-benchmark.cc
M be/src/bufferpool/CMakeLists.txt
M be/src/experiments/data-provider.cc
M be/src/experiments/hashing/cache-hash-test.cc
M be/src/experiments/hashing/growing-test.cc
M be/src/experiments/hashing/interface/cache-hash-test.cc
M be/src/experiments/hashing/interface/growing-test.cc
M be/src/experiments/hashing/multilevel/cache-hash-test.cc
M be/src/experiments/hashing/multilevel/growing-test.cc
M be/src/experiments/hashing/prefetch/cache-hash-test.cc
M be/src/experiments/hashing/prefetch/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc
M be/src/experiments/hashing/split-benchmarks/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc
M be/src/experiments/hashing/streaming/cache-hash-test.cc
M be/src/experiments/hashing/streaming/growing-test.cc
M be/src/testutil/certificates-info.txt
M be/src/util/asan.h
M bin/README-RUNNING-BENCHMARKS
M bin/check-rat-report.py
D bin/cpplint.py
M bin/impala-ipython
M bin/impala-py.test
M bin/impala-python
M bin/rat_exclude_files.txt
M cmake_modules/FindAvro.cmake
M cmake_modules/FindBreakpad.cmake
M cmake_modules/FindBzip2.cmake
M cmake_modules/FindGFlags.cmake
M cmake_modules/FindGLog.cmake
M cmake_modules/FindHDFS.cmake
M cmake_modules/FindJNI.cmake
M cmake_modules/FindLdap.cmake
M cmake_modules/FindLlvm.cmake
M cmake_modules/FindLlvmBinaries.cmake
M cmake_modules/FindLz4.cmake
M cmake_modules/FindOpenSSL.cmake
M cmake_modules/FindPProf.cmake
M cmake_modules/FindRapidJson.cmake
M cmake_modules/FindRe2.cmake
M cmake_modules/FindSnappy.cmake
M cmake_modules/FindThrift.cmake
M cmake_modules/FindZlib.cmake
M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java
M fe/src/main/java/com/cloudera/impala/catalog/MapType.java
M fe/src/main/java/com/cloudera/impala/catalog/StructType.java
M fe/src/main/java/com/cloudera/impala/planner/Planner.java
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hive-log4j.properties.template
M fe/src/test/resources/log4j.properties.template
M infra/python/deps/download_requirements
M infra/python/deps/requirements.txt
M testdata/avro_schema_resolution/create_table.sql
M testdata/bin/cache_tables.py
M testdata/bin/create-mini.sql
M testdata/bin/load-dependent-tables.sql
M testdata/bin/load-hive-builtins.sh
M testdata/bin/run-hive.sh
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver
M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode
M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode
M testdata/cluster/node_templates/common/etc/init.d/yarn-common
M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager
M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager
D testdata/data/mstr/eatwh1/EATWH1-DDL.sql
D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql
D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/FT1.TXT
D testdata/data/mstr/eatwh1/csv/FT10.TXT
D testdata/data/mstr/eatwh1/csv/FT11.TXT
D testdata/data/mstr/eatwh1/csv/FT12.TXT
D testdata/data/mstr/eatwh1/csv/FT13.TXT
D testdata/data/mstr/eatwh1/csv/FT14.TXT
D testdata/data/mstr/eatwh1/csv/FT15.TXT
D testdata/data/mstr/eatwh1/csv/FT17.TXT
D testdata/data/mstr/eatwh1/csv/FT2.TXT
D testdata/data/mstr/eatwh1/csv/FT3.TXT
D testdata/data/mstr/eatwh1/csv/FT4.TXT
D testdata/data/mstr/eatwh1/csv/FT5.TXT
D testdata/data/mstr/eatwh1/csv/FT6.TXT
D testdata/data/mstr/eatwh1/csv/FT7.TXT
D testdata/data/mstr/eatwh1/csv/FT8.TXT
D testdata/data/mstr/eatwh1/csv/FT9.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_CURR.TXT
D 

[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..


Patch Set 8: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-14 Thread Harrison Sheinblatt (Code Review)
Harrison Sheinblatt has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/9/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 135: zk_client.stop()
A problem is that the zk client wants you to stop() and then close(). In this
case, if an exception or something similar slips through, close() will be
called without stop(). That may still work without a 'graceful shutdown' of the
client and be OK, but I've done a 2-phase context manager in the past to ensure
stop() is called before close().
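
For illustration, a minimal sketch of that 2-phase approach (this assumes the
kazoo KazooClient API; the helper name and the hosts value are placeholders,
not this script's actual code):

  from contextlib import contextmanager
  from kazoo.client import KazooClient

  @contextmanager
  def zk_session(hosts, timeout=30):
      # Yields a started ZooKeeper client and guarantees stop() runs
      # before close(), even if the body raises.
      zk_client = KazooClient(hosts=hosts, timeout=timeout)
      zk_client.start()
      try:
          yield zk_client
      finally:
          try:
              zk_client.stop()   # graceful shutdown first
          finally:
              zk_client.close()  # then release the connection's resources

  # e.g. with zk_session("localhost:2181") as zk_client:
  #          ... run the existing znode checks ...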


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-09-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#3).

Change subject: IMPALA-3739: Enable stress tests on Kudu
..

IMPALA-3739: Enable stress tests on Kudu

This commit modifies the stress test framework to run TPC-H and TPC-DS
workloads against Kudu. The following changes are included in this
commit:
1. Created template files with DDL and DML statements for loading TPC-H and
   TPC-DS data in Kudu
2. Created a script (load-tpc-kudu.py) to load data in Kudu. The
   script is invoked by the stress test runner to load test data in an
   existing Impala/Kudu cluster (both local and CM-managed clusters are
   supported).
3. Created SQL files with TPC-DS queries to be executed in Kudu. SQL
   files with TPC-H queries for Kudu were added in a previous patch.
4. Modified the stress test runner to take additional parameters
   specific to Kudu (e.g. kudu master addr)

The stress test runner for Kudu was tested on EC2 clusters for both TPC-H
and TPC-DS workloads.

Missing functionality:
* No CRUD operations in the existing TPC-H/TPC-DS workloads for Kudu.
* Not all supported TPC-DS queries are included. Currently, only the
  TPC-DS queries from the testdata/workloads/tpcds/queries directory
  were modified to run against Kudu.

Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
---
A testdata/bin/load-tpc-kudu.py
A testdata/datasets/tpcds/tpcds_kudu_template.sql
A testdata/datasets/tpch/tpch_kudu_template.sql
A testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q27.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q3.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q34.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q42.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q43.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q46.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q52.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q53.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q55.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q59.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q6.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q61.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q63.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q68.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q7.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q73.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q79.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q8.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q88.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q89.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q96.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q98.test
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
30 files changed, 2,478 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4327/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.

2016-09-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4110: Apache RAT script on Impala tarballs.
..


Patch Set 1: Verified+1

No GVO implications, so self-verifying

-- 
To view, visit http://gerrit.cloudera.org:8080/4405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.

2016-09-14 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged.

Change subject: IMPALA-4110: Apache RAT script on Impala tarballs.
..


IMPALA-4110: Apache RAT script on Impala tarballs.

Apache RAT is a tool for license auditing. It will be used as part of
the Apache release process. This patch includes a script for parsing
its output and a file containing a list of filename globs that should
be ignored. The script takes two command line parameters as input -
the filename of the ignored file globs, and the filename of the RAT
xml output.

Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Reviewed-on: http://gerrit.cloudera.org:8080/4405
Reviewed-by: Alex Behm 
Tested-by: Jim Apple 
---
A bin/check-rat-report.py
A bin/rat_exclude_files.txt
2 files changed, 191 insertions(+), 0 deletions(-)

Approvals:
  Jim Apple: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon