[Impala-ASF-CR] IMPALA-4369: Avoid DCHECK in Parquet scanner with MT DOP > 0.

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4851

Change subject: IMPALA-4369: Avoid DCHECK in Parquet scanner with MT_DOP > 0.
..

IMPALA-4369: Avoid DCHECK in Parquet scanner with MT_DOP > 0.

When HdfsParquetScanner::Open() failed we used to hit a DCHECK
when trying to access HdfsParquetScanner::batch() which is
only valid to call for non-MT scan nodes.

Change-Id: Ifbfdde505dbbd2742e7ab79a2415ff317a9bfa2f
---
M be/src/exec/hdfs-scan-node-base.cc
A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
M tests/query_test/test_mt_dop.py
3 files changed, 26 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4851/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4851
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifbfdde505dbbd2742e7ab79a2415ff317a9bfa2f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

2016-10-25 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
..


Patch Set 1:

(8 comments)

Flushing some initial comments. Still need to wrap my head around all the 
moving parts.

http://gerrit.cloudera.org:8080/#/c/4815/1//COMMIT_MSG
Commit Message:

PS1, Line 15: Also clarifies the use of resultExprs vs. baseTblResultExprs
: in unions. We should consistently use resultExprs because the
: final expr substitution happens during planning.
: The only place where baseTblResultExprs should be used is in
: UnionStmt.materializeRequiredSlots() because that is called
: before plan generation and we need to mark the slots of resolved
: exprs as materialized.
I am not sure this belongs to the commit message. If you don't have it already, 
you may want to put it as a comment somewhere in the code.


http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

PS1, Line 202: Propagates DISTINCT from left to right, and checks that all 
union operands are
 :* are union compatible, adding implicit casts if necessary.
I think you need to update this comment. This function seems to be doing a lot 
more than simply propagating distinct and checking union operand compatibility.


PS1, Line 203: are
remove


PS1, Line 218: // Analyze all operands and make sure they return an equal 
number of exprs.
 : for (int i = 0; i < operands_.size(); ++i) {
 :   try {
 : operands_.get(i).analyze(analyzer);
 : QueryStmt firstQuery = operands_.get(0).getQueryStmt();
 : List firstExprs = 
operands_.get(0).getQueryStmt().getResultExprs();
 : QueryStmt query = operands_.get(i).getQueryStmt();
 : List exprs = query.getResultExprs();
 : if (firstExprs.size() != exprs.size()) {
 :   throw new AnalysisException("Operands have unequal 
number of columns:\n" +
 :   "'" + queryStmtToSql(firstQuery) + "' has " +
 :   firstExprs.size() + " column(s)\n" +
 :   "'" + queryStmtToSql(query) + "' has " + 
exprs.size() + " column(s)");
 : }
 :   } catch (AnalysisException e) {
 : if (analyzer.getMissingTbls().isEmpty()) throw e;
 :   }
 : }
This function has grown a bit too much. I would put this block is a separate 
function called analyzeOperands().


Line 241: // Remember SQL string before unnesting operands.
the


PS1, Line 258: // Cast all result exprs to a compatible type.
move above L263


PS1, Line 282: // Should never happen
 : throw new AnalysisException("Error creating agg info in 
UnionStmt.analyze()");
Maybe convert it to Preconditions.checkState(false, "Error")?


http://gerrit.cloudera.org:8080/#/c/4815/1/testdata/workloads/functional-query/queries/QueryTest/union.test
File testdata/workloads/functional-query/queries/QueryTest/union.test:

PS1, Line 1030: (select 80 union all select 90)
Maybe add a test case with nested union distinct, if not already there.


-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Reviewed-on: http://gerrit.cloudera.org:8080/4745
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 418 insertions(+), 44 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


IMPALA-4023: don't attach buffered tuple streams to batches

This simplifies the memory transfer model by eliminating one category of
resources that can be attached.

This patch also separates the concepts of attaching resources and
flushing resources. Previously RowBatch::AddTupleStream() implicitly
flushed resources from the ExecNode pipeline, which various ExecNodes
relied on to free up memory reservations for subsequent processing. In a
subsequent patch I want the FlushResources() API to become stronger: it
will force streaming ExecNodes to flush their batches or forces
blocking ExecNodes to acquire ownership of the memory resources.
We can't do this right now since we don't have a way to transfer
ownership of BufferedBlockMgr Blocks.

Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Reviewed-on: http://gerrit.cloudera.org:8080/4448
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/row-batch-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
16 files changed, 240 insertions(+), 159 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-10-25 Thread Amos Bird (Code Review)
Hello Marcel Kornacker, Bharath Vissapragada, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3942

to look at the new patch set (#19).

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..

IMPALA-1654: General partition exprs in DDL operations.

This commit handles partition related DDL in a more general way. We can
now use compound predicates to specify a list of partitions in
statements like ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL
STATS, etc. It will also make sure some statements only accept one
partition at a time, such as PARTITION SET LOCATION and LOAD DATA. ALTER
TABLE ADD PARTITION remains using the old PartitionKeyValue's logic.

The changed partition related DDLs are as follows,

Table: p (i int) partitioned by (j int, k string)
Partitions:
+---+---+---++--+--+---+
| j | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+---+---+---++--+--+---+
| 1 | a | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 1 | b | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 1 | c | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 2 | d | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 2 | e | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 2 | f | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| Total |   | -1| 0  | 0B   | 0B   |   |
+---+---+---++--+--+---+

1. show files in p partition (j<2, k='a');
2. alter table p partition (j<2, k in ("b","c") set cached in 'testPool';

// j can appear more than once,
3.1. alter table p partition (j<2, j>0, k<>"d") set uncached;
// it is the same as
3.2. alter table p partition (j<2 and j>0, not k="e") set uncached;
// we can also do 'or'
3.3. alter table p partition (j<2 or j>0, k like "%") set uncached;

// missing 'k' matches all values of k
4. alter table p partition (j<2) set fileformat textfile;
5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v");
6. alter table p partition (j is not null) set tblproperties ("k"="v");
7. alter table p drop partition (j<2);
8. compute incremental stats p partition(j<2);

The remaining old partition related DDLs are as follows,

1. load data inpath '/path/from' into table p partition (j=2, k="d");
2. alter table p add partition (j=2, k="g");
3. alter table p partition (j=2, k="g") set location '/path/to';
4. insert into p partition (j=2, k="g") values (1), (2), (3);

General partition expressions or partially specified partition specs
allows partition predicates to return empty partition set no matter
'IF EXISTS' is specified.

Examples:

[localhost.localdomain:21000] >
alter table p drop partition (j=2, k="f");
Query: alter table p drop partition (j=2, k="f")
+-+
| summary |
+-+
| Dropped 1 partition(s). |
+-+
Fetched 1 row(s) in 0.78s
[localhost.localdomain:21000] >
alter table p drop partition (j=2, k<"f");
Query: alter table p drop partition (j=2, k<"f")
+-+
| summary |
+-+
| Dropped 2 partition(s). |
+-+
Fetched 1 row(s) in 0.41s
[localhost.localdomain:21000] >
alter table p drop partition (k="a");
Query: alter table p drop partition (k="a")
+-+
| summary |
+-+
| Dropped 1 partition(s). |
+-+
Fetched 1 row(s) in 0.25s
[localhost.localdomain:21000] > show partitions p;
Query: show partitions p
+---+---+---++--+--+---+
| j | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+---+---+---++--+--+---+
| 1 | b | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| 1 | c | -1| 0  | 0B   | NOT CACHED   | NOT CACHED|
| Total |   | -1| 0  | 0B   | 0B   |   |
+---+---+---++--+--+---+
Fetched 3 row(s) in 0.01s

Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
---
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetCachedStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M 

[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after 
load
..


Patch Set 7: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/364/

-- 
To view, visit http://gerrit.cloudera.org:8080/4617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4362: Misc. fixes for PFE counters
..


IMPALA-4362: Misc. fixes for PFE counters

* ExecTime was always 0, because it wasn't updated before the profile
  was reported to the coordinator
* Made ExecTree* counters children of their respective parents
* Moved ExecTreePrepareTime into the right profile
* Renamed timings profile to something a bit more obvious.

Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Reviewed-on: http://gerrit.cloudera.org:8080/4829
Reviewed-by: Henry Robinson 
Tested-by: Internal Jenkins
---
M be/src/runtime/plan-fragment-executor.cc
1 file changed, 23 insertions(+), 11 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4362: Misc. fixes for PFE counters
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4277: remove unneeded LegacyTCLIService
..


IMPALA-4277: remove unneeded LegacyTCLIService

Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
Reviewed-on: http://gerrit.cloudera.org:8080/4844
Reviewed-by: Henry Robinson 
Tested-by: Internal Jenkins
---
M common/thrift/CMakeLists.txt
D common/thrift/cli_service.thrift
M testdata/bin/wait-for-hiveserver2.py
3 files changed, 8 insertions(+), 1,138 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4277: remove unneeded LegacyTCLIService
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3838: Codegen EvalRuntimeFilters().

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3838: Codegen EvalRuntimeFilters().
..


Patch Set 1:

(7 comments)

Mostly looks good - would be good to know whether any queries regressed with 
the change.

http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

PS1, Line 1415: ExternalLinkage
Why not private linkage?


http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

Line 90: // @expr_type_arg = constant %"struct.impala::ColumnType" { i32 4, i32 
-1, i32 -1,
What query produced this?


Line 155:   codegen->GetFunction(IRFunction::RUNTIME_FILTER_EVAL, true);
Do we need to clone this function? It doesn't look like we modify it.


Line 205: return Status("Codegen'ed FilterContext::Eval() fails 
verification, "
Weird wrapping.


http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/runtime/runtime-filter.h
File be/src/runtime/runtime-filter.h:

Line 64:   bool Eval(void* val, const ColumnType& col_type) const noexcept;
Document 'val' - not self-explanatory


Line 86:   static const char* LLVM_CLASS_NAME;
Missed this one


http://gerrit.cloudera.org:8080/#/c/4833/1/tests/query_test/test_tpch_queries.py
File tests/query_test/test_tpch_queries.py:

Line 39:   v.get_value('table_format').file_format in ['text', 
'parquet', 'kudu'])
Does this provide meaningful coverage? It doesn't seem like it's worth running 
on both text and parquet in core.


-- 
To view, visit http://gerrit.cloudera.org:8080/4833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 9:

(8 comments)

Rebased to pick up IMPALA-3884

http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

PS7, Line 459: 
> that
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

Line 184:   /// lowered type. This *Val should be non-null. The output variable 
is called 'name'.
> nit: space
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS7, Line 347: 
 :   /// TODO: Return Status inst
> Avoid cloning if possible to reduce compilation time.
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS7, Line 1660: ond, the
  : // value of the slot was initialized in the right way in 
InitAggSlots() (e.g. 0 for
  : // SUM) that we get the right result if UpdateSlot() 
pretends that the NULL bit of
  : // 'dst' is unse
> Please also see comments above.
I tried to clean this up a bit. My intention was to document the details of the 
initialisation in InitAggSlot() and provide a pointer here.


PS7, Line 1675: dst.SetIsNull(slot_desc->CodegenIsNull(codegen, , 
agg_tuple_arg));
> Why is this not in the old code or was it done somewhere else ?
It wasn't in the old code - the old code didn't handle arbitrary UDAs, just a 
subset of the builtins. E.g. the old code would have given wrong results if we 
changed SUM() so that it returned NULL on overflow.


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

PS7, Line 649: 'dst_val' should contain the previous value of the aggregate
 :   /// function, and 'updated_dst_val' is set to the new value 
after the Update or Merge
 :   /// operation is applied.
> Just curious why we separate 'dst_val' and 'result_val'. Doesn't 'result_va
We need two different CodegenAnyVal, since it really represents the value 
(which changes) rather than the memory location. We could use an in-out 
argument instead of separate input and output arguments but this seemed simpler 
to me.

I renamed 'result_val' to 'updated_dst_val' to hopefully make the connection 
clearer (the naming was confusing).


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS7, Line 496:  false
> Please feel free to ignore but you can consider setting this second argumen
It seems clearer to make it explicit, given it's not obvious what the "default" 
behaviour should be.


http://gerrit.cloudera.org:8080/#/c/4655/7/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

Line 141:   # Verify codegen was enabled for all stages of the aggregation.
> Can remove this restriction now that IMPALA-3884 is in master
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#9).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 842 insertions(+), 562 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Move vim-specific config file to top level directory

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Move vim-specific config file to top level directory
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 547: }
> I think this could go in UnregisterQuery too. I put it here because I thoug
I don't have a strong preference either way -- mostly wondering what the 
reasoning was.  The division between the two don't seem very clear (i.e. its 
not clear why some stuff in Unregister() isn't in Done()), but I agree 
Unregister() doesn't seem any better.


-- 
To view, visit http://gerrit.cloudera.org:8080/4779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#8).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 846 insertions(+), 564 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Move vim-specific config file to top level directory

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4850

Change subject: Move vim-specific config file to top level directory
..

Move vim-specific config file to top level directory

Files in be/ get wiped out by clean.sh if they're listed in .gitignore.
Moving this file up one directory seems promising to prevent it from
being wiped out.

Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8
---
M .gitignore
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/4850/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster

2016-10-25 Thread Harrison Sheinblatt (Code Review)
Harrison Sheinblatt has posted comments on this change.

Change subject: Enabling end-to-end tests on a remote cluster
..


Patch Set 1:

(3 comments)

Responded to comments.

http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/compute-table-stats.sh
File testdata/bin/compute-table-stats.sh:

PS1, Line 27: IMPALAD
> IMPALA-4346 has been filed.
Can you reference the Jira in a comment?


http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

PS1, Line 38: HS2_HOST_PORT
> I think the latter is preferable -- corral all the required configs in one 
Is it reasonable to add a comment referencing the Jira here?


http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/setup-hdfs-env.sh
File testdata/bin/setup-hdfs-env.sh:

PS1, Line 53: CACHEADMIN_ARGS
> Clarification: can you be more explicit about the check you want? Something
If the is_kerberized block is executed above, then the CACHADMIN_ARGS would 
include '-owner ${PREVIOUS_USER}'.  If HADOOP_USER_NAME is also true, then we 
add another '-owner ${USER}' to this, which probably breaks it.  I think there 
are probably 4 bugs: 1) The is_kerberized block above probably isn't supported 
and should be removed and 2) the CACHEADMIN_ARGS definition logic needs a clear 
conditional, ideally in a single location, that sets the user/group/owner 
information properly in a way that you can easily tell it's always 
well-defined.  Here it looks like the logic is intended to be that if it's 
kerberized it sets owner one way, if it's not kerberized and the hadoop user is 
defined it's set another way and if it's not kerberized and the hadoop user is 
not defined it stays undefined. If we want to keep the is_kerberized logic in 
one place, then we can have it set another parameter about owner fields and 
here only update it if it's set already.  3) CACHEADMIN_ARGS is prob!
 ably the wrong name as it is for the -addPool command and sets a subset of the 
args 4) We should explicitly set all arguments to cacheadmin, -addPool if 
possible (e.g. mode maxTtl)


-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster

2016-10-25 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: Enabling end-to-end tests on a remote cluster
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4769/1/bin/remote_data_load.py
File bin/remote_data_load.py:

PS1, Line 88: RemoteDataLoad
> I'd separate out the common functionality needed for dealing with remote cl
IMPALA-4367 has been filed.


PS1, Line 132: v10
> Hardcoding v10.  Is this necessary?  I think URL may be missing with later 
IMPALA-4367 has been filed.


PS1, Line 155: get_service_client_configurations
> A lot of this seems like it could be in comparisons/cluster.py, or at least
IMPALA-4367 has been filed.


PS1, Line 212: find_snapshot_file
> It would be good to start converting the snapshot file management into pyth
IMPALA-4367 has been filed.


-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-10-25 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4849

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..

IMPALA-3771: Expose kudu client timeout and set default

The Kudu client timeout was too low for Impala usage. This
sets the default timeout to 3 minutes and exposes it as a
gflag.

New timeout tests were added.

Note: Kudu Java client exceptions on timeouts are very
verbose and not useful, so the code that catches exceptions
around the client APIs that are likely to time out do not
set the inner exception otherwise the query errors will be
unusable. Instead, the Kudu exceptions are logged and a
simple Impala exception is thrown. KUDU-1734 tracks fixing
the Kudu errors. After the error messages are fixed, Kudu
exceptions can be added back to the Impala exceptions.

Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
---
M be/src/catalog/catalog.cc
M be/src/common/global-flags.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-table-sink.cc
M be/src/service/frontend.cc
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_kudu.py
16 files changed, 194 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4849/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-25 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 7:

> > I rearranged them all. Tim convinced me to roll some back.
 > 
 > That doesn't really answer my question - given the current state of
 > the patch, what are the criteria we should use, going forward, to
 > decide whether a class should be CacheAligned?
 > 
 > I suggest we reserve CacheAligned / grouping for classes that
 > really need it, given the reduction in comprehension. Or am I
 > underestimating how big the impact of properly aligning even
 > non-performance critical classes will be?

CacheAligned and grouping I view separately.

clang-tidy has a warning about cache alignment that is about correctness - If X 
must be cache aligned because we wanted it to be so for performance reasons, 
than clang will warn when NEWing classes that contain that class because we are 
breaking our own expectations about alignment. The cache alignments I added 
were only to quiet that warning. I did not set the BlockingQueue to be 
cache-aligned - that was there when I started.

For grouping, I think you and I are on the same page.

-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13:

Alex, can you +2 this again? The tests got renamed and I didn't catch the 
syntax error when running locally. :(

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#13).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 418 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/13
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-25 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 6:

> I rearranged them all. Tim convinced me to roll some back.

That doesn't really answer my question - given the current state of the patch, 
what are the criteria we should use, going forward, to decide whether a class 
should be CacheAligned?

I suggest we reserve CacheAligned / grouping for classes that really need it, 
given the reduction in comprehension. Or am I underestimating how big the 
impact of properly aligning even non-performance critical classes will be?

-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4153: Fix count(*) on all blank('') columns - test

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4153: Fix count(*) on all blank('') columns - test
..


Patch Set 1:

Laszlo, can you try merging this again or is there something wrong with the 
patch?

-- 
To view, visit http://gerrit.cloudera.org:8080/4755
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23923f95f43d67977ee1520a1fc09ce297548b3f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 13: Code-Review+2

(1 comment)

Carry +2

http://gerrit.cloudera.org:8080/#/c/4448/11/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS11, Line 385: copied out now or sent up the
  :   // plan and copied out by a blo
> copied out now or sent up the tree and copied out by a blocking ancestor.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4448

to look at the new patch set (#12).

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..

IMPALA-4023: don't attach buffered tuple streams to batches

This simplifies the memory transfer model by eliminating one category of
resources that can be attached.

This patch also separates the concepts of attaching resources and
flushing resources. Previously RowBatch::AddTupleStream() implicitly
flushed resources from the ExecNode pipeline, which various ExecNodes
relied on to free up memory reservations for subsequent processing. In a
subsequent patch I want the FlushResources() API to become stronger: it
will force streaming ExecNodes to flush their batches or forces
blocking ExecNodes to acquire ownership of the memory resources.
We can't do this right now since we don't have a way to transfer
ownership of BufferedBlockMgr Blocks.

Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
---
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/row-batch-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
16 files changed, 240 insertions(+), 159 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4448/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Tim Armstrong (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4770

to look at the new patch set (#4).

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..

IMPALA-3872: allow providing PyPi mirror for python packages

We still rely on the python.org json API, which doesn't seem to be
mirrored (instead there's a html-based index format implemented by
the mirrors).

Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
---
M infra/python/deps/pip_download.py
1 file changed, 8 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4770/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..


Patch Set 4: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4770/3/infra/python/deps/pip_download.py
File infra/python/deps/pip_download.py:

PS3, Line 31: from urlparse import urljoin
> Remove
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS1, Line 193: BufferOpts
> const& ?
Done


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 642: without the client providing a "
 :   "buffer"
> is this going to make sense to a user?
Probably not. The old message wasn't actionable either by a user (beyond 
reporting the bug). Reworded to make it clear that it's an internal error.


Line 708:   DCHECK_LE(buffer_size, max_buffer_size_);
> how do we ensure this is true? is it the check in DiskIoMgr::Read() that wi
DiskIoMgr::ReadRange() checks if there's a client buffer before calling 
TryAllocateNextBufferForRange().


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 147: these buffers are returned to the caller wrapped
 : /// in BufferDescriptors.
> took a few seconds to parse and understand. how about:
Yes that's better...


Line 150: /// another read. In error cases, the IoMgr will recycle the buffers 
more promptly but
> is this relevant for all cases, or only cases (a) and (b)?
Tried to clarify. Also removed the last sentence since I think it was confusing 
and not really useful to the reader.


PS1, Line 392: bool
> maybe make all of these members 'const' since they are "options" that don't
Done


PS1, Line 422: BufferOpts
> const BufferOpts& ?
Done


Line 506: int64_t client_buffer_len_;
> maybe move this right before cached_buffer_ since it's related in the sense
Switched it to a tagged union. We don't really have much precedent in the 
codebase for this pattern so I invented some conventions that made sense to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/360/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters

2016-10-25 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4362: Misc. fixes for PFE counters
..


Patch Set 2: Code-Review+2

Rebase.

-- 
To view, visit http://gerrit.cloudera.org:8080/4829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4362: Misc. fixes for PFE counters
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1163:   if (kudu_latest_observed_ts > 0) {
why not carry the > 0 convention and just always set the session state?  looks 
like scan node already checks > 0 anyway.


http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 547: }
why is it better to do here rather than, say, ImpalaServer::UnregisterQuery()? 
is it important that we're holding QES::lock_?


http://gerrit.cloudera.org:8080/#/c/4779/3/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

Line 322
nice to see this go


-- 
To view, visit http://gerrit.cloudera.org:8080/4779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node
..


IMPALA-4350: Crash with vlog level 2 in hash join node

Testing:
Reproduced by running test_mem_usage_scaling against an Impala with
-v=2. Confirmed the fix avoided the crash.

We don't have any routine testing of high log levels so there isn't
a clear way to get good coverage of all code paths where there might be
similar bugs.

Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
Reviewed-on: http://gerrit.cloudera.org:8080/4830
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/exec/partitioned-hash-join-node.cc
1 file changed, 13 insertions(+), 6 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..


Patch Set 3: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4770/2/infra/python/deps/pip_download.py
File infra/python/deps/pip_download.py:

PS2, Line 73: return True
> This doesn't work for me - if I have PYPI_MIRROR = "https://pypi.infra.clou
Too bad. I think the URL python libraries are needlessly complicated.


http://gerrit.cloudera.org:8080/#/c/4770/3/infra/python/deps/pip_download.py
File infra/python/deps/pip_download.py:

PS3, Line 31: from urlparse import urljoin
Remove


-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 11: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4448/11/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS11, Line 385: copied out or sent up the plan
  :   // tree via MarkNeedsDeepCopy()
copied out now or sent up the tree and copied out by a blocking ancestor.


-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4770/2/infra/python/deps/pip_download.py
File infra/python/deps/pip_download.py:

PS2, Line 34: pypi_mirror
> Use all-caps for the name?
Done


PS2, Line 73:   pkg_url = "{0}/packages/{1}".format(pypi_mirror, 
pkg['path'])
> What about:
This doesn't work for me - if I have PYPI_MIRROR = 
"https://pypi.infra.cloudera.com/api/pypi/pypi-public; then it strips off 
api/pypi/pypi-public automatically. I'm not sure why it's doing that.


-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](hadoop-next) Merge remote-tracking branch 'gerrit/master' into HEAD

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged.

Change subject: Merge remote-tracking branch 'gerrit/master' into HEAD
..


Merge remote-tracking branch 'gerrit/master' into HEAD

Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f
---
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
1 file changed, 0 insertions(+), 4 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4841
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-25 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4448

to look at the new patch set (#10).

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..

IMPALA-4023: don't attach buffered tuple streams to batches

This simplifies the memory transfer model by eliminating one category of
resources that can be attached.

This patch also separates the concepts of attaching resources and
flushing resources. Previously RowBatch::AddTupleStream() implicitly
flushed resources from the ExecNode pipeline, which various ExecNodes
relied on to free up memory reservations for subsequent processing. In a
subsequent patch I want the FlushResources() API to become stronger: it
will force streaming ExecNodes to flush their batches or forces
blocking ExecNodes to acquire ownership of the memory resources.
We can't do this right now since we don't have a way to transfer
ownership of BufferedBlockMgr Blocks.

Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
---
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/row-batch-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
16 files changed, 239 insertions(+), 158 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4448/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-25 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 6: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4260: Alter table add column drops all the column stats

2016-10-25 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4260: Alter table add column drops all the column stats
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4845/1/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

Line 121: 
Can you also add a test case for CHANGE and REPLACE columns as well?


-- 
To view, visit http://gerrit.cloudera.org:8080/4845
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4340: explain how to install postgresql-9.5 or higher

2016-10-25 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4846

Change subject: IMPALA-4340: explain how to install postgresql-9.5 or higher
..

IMPALA-4340: explain how to install postgresql-9.5 or higher

The random query generator needs to compare against PostgresQL 9.5 or
higher to take advantage of some of the more recent features, especially
as it pertains to Impala/Kudu INSERT and UPSERT queries.  Developers
will need assistance setting up their development environments if they
need to use the random query generator.

This patch provides instructions on how to do so. We provide
instructions, not automation, since this will have side-effects on
developers' workstations: we can't presume to know how a developer might
want to install or configure postgres, and we haven't tested on anything
except our own development environment.

Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9
---
A tests/comparison/POSTGRES.txt
M tests/comparison/README
2 files changed, 88 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4846/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService

2016-10-25 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4277: remove unneeded LegacyTCLIService
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-25 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS5, Line 801: o handle i
> I had suggested moving it to be closer to the register call if this is call
Done


http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 157: 
> Blank line before this.
Done


Line 159:   /// been created and had the pool set yet, or this StmtType doesn't 
go through admission
> missing /
Done


PS5, Line 161: ol()
> nullptr
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-25 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#6).

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..

IMPALA-1169: Admission control info on the queries debug webpage

This patch adds a new event, 'Queued', to the query event log to
indicate when a query is queued by the admission controller. This
means that queries on the '/queries' page that are currently
queued will display this as their 'Last Event', making it possible
to see which queries are current queued.

It also adds a column to show the resource pool associated with
the queries, and it updates the wording of the first event that
gets marked for each query from 'Start execution' to 'Query
submitted', since this is before planning and admission control
and therefore execution hasn't actually startd yet.

Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.h
M tests/common/impala_service.py
M tests/custom_cluster/test_admission_controller.py
M www/queries.tmpl
9 files changed, 72 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4756/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4844

Change subject: IMPALA-4277: remove unneeded LegacyTCLIService
..

IMPALA-4277: remove unneeded LegacyTCLIService

Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
---
M common/thrift/CMakeLists.txt
D common/thrift/cli_service.thrift
M testdata/bin/wait-for-hiveserver2.py
3 files changed, 8 insertions(+), 1,138 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4844/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..


Patch Set 17: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-25 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4047

to look at the new patch set (#13).

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..

IMPALA-3725 Support Kudu UPSERT in Impala

This patch introduces a new query statement, UPSERT, for Kudu
tables which operates like an INSERT and uses all of the analysis,
planning, and execution machinery as INSERT, except that if
there's a primary key collision instead of returning an error an
update is performed.

New syntax:
[with_clause] UPSERT INTO [TABLE] table_name [(column list)]
{
  query_stmt
 | VALUES (value [, value...]) [, (value [, (value...)]) ...]
}

where column list must contain all of the key columns in
table_name, if specified, and table_name must be a Kudu table.

This patch also improves the behavior of INSERTing into Kudu
tables without specifying all of the key columns - this now
results in an analysis exception, rather than attempting the
INSERT and receiving an error back from Kudu.

Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
---
M be/src/exec/kudu-table-sink.cc
M common/thrift/DataSinks.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
17 files changed, 656 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/4047/13
-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-4086: Add benchmark for simple scheduler
..

IMPALA-4086: Add benchmark for simple scheduler

Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/simple-scheduler-benchmark.cc
M be/src/util/benchmark.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
5 files changed, 176 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4554/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load

2016-10-25 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after 
load
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS1, Line 193: BufferOpts
const& ?


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 642: without the client providing a "
 :   "buffer"
is this going to make sense to a user?


Line 708:   DCHECK_LE(buffer_size, max_buffer_size_);
how do we ensure this is true? is it the check in DiskIoMgr::Read() that will 
always enforce?


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 147: these buffers are returned to the caller wrapped
 : /// in BufferDescriptors.
took a few seconds to parse and understand. how about:

... these buffers are wrapped in BufferDescriptors and return to the caller.


Line 150: /// another read. In error cases, the IoMgr will recycle the buffers 
more promptly but
is this relevant for all cases, or only cases (a) and (b)?


PS1, Line 392: bool
maybe make all of these members 'const' since they are "options" that don't 
change, right?


PS1, Line 422: BufferOpts
const BufferOpts& ?


Line 506: int64_t client_buffer_len_;
maybe move this right before cached_buffer_ since it's related in the sense 
that it's another possible source of the buffer.  Alternatively, consider even 
making a union to make it clear that they are mutually exclusive, and the union 
tag indicates the buffer "mode" (Managed, HdfsCached, Client - or something 
like that).  Though it could be that doesn't work out cleanly given that 
HdfsCached can fallback to Managed mode.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#19).

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..

IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE
ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_refresh_partition.py
16 files changed, 940 insertions(+), 215 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/19
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#19).

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..

IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE
ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_refresh_partition.py
16 files changed, 940 insertions(+), 215 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/19
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4086: Add benchmark for simple scheduler
..

IMPALA-4086: Add benchmark for simple scheduler

Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/simple-scheduler-benchmark.cc
M be/src/util/benchmark.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
5 files changed, 191 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4554/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4086: Add benchmark for simple scheduler
..


Patch Set 1:

(3 comments)

Thanks Tim for the review. I addressed your comments in PS2. I will rebase this 
next, then pick up work on it, so no need to review again as of now.

http://gerrit.cloudera.org:8080/#/c/4554/1/be/src/benchmarks/simple-scheduler-benchmark.cc
File be/src/benchmarks/simple-scheduler-benchmark.cc:

Line 130: /// according to the parameter 'replica_preference'.
> Mention that it uses the default # blocks?
Done


Line 147: /// Build and run a benchmark suite for various table sizes. 
Scheduling will be done
> Mention that it's done with the default cluster size?
Done


http://gerrit.cloudera.org:8080/#/c/4554/1/be/src/util/benchmark.cc
File be/src/util/benchmark.cc:

PS1, Line 51: by
> I don't think this "by" is needed
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR](hadoop-next) Merge remote-tracking branch 'gerrit/master' into HEAD

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Merge remote-tracking branch 'gerrit/master' into HEAD
..


Patch Set 1: Code-Review+2 Verified+1

Clean merge

-- 
To view, visit http://gerrit.cloudera.org:8080/4841
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12:

Rebase done in PS12

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#12).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 417 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-25 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 5:

Thanks! Do the exhaustive AC tests pass?

-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-25 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS5, Line 801: Registered
I don't think 'Registered' is very meaningful to users (and also it's being set 
*before* registration, so the time is wrong.

"Query submitted" or something might be clearer. The idea is an event that 
shows that Impala has started processing a query. I preferred it in its earlier 
place - what was the rationale for moving it?


http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 157:   /// Resource pool associated with this query, or an empty string if 
the schedule has not
Blank line before this.


Line 159:   // control.
missing /


PS5, Line 161: NULL
nullptr


-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1:

(1 comment)

> (1 comment)
 > 
 > Can we test this by having the test load metadata and then truncate
 > a cached file?

We can try to do this in a custom cluster test. It needs to follow the steps 
outlined here: 
https://issues.cloudera.org/browse/IMPALA-4223?focusedCommentId=212776=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-212776

They require changes to the system limits to allow for larger cached files, and 
to the data nodes to increase the file cache as well, so they might be rather 
disruptive. Once these are changed we can write a custom cluster test to 
download, truncate, and upload files and run queries over them, checking that 
the correct log messages appear.

Should we break this out into several Jiras / changes? The limits should be 
change in impala-setup, too. The datanode settings change will be required to 
integrate this into fuzz testing, too.

I will also have a look at the scanner fuzz test and see if it is easy to 
inject these error there.

http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS1, Line 448: disk
> how about "uncached" instead?
I followed the error from L433, assuming that since we already report a similar 
case it made sense to be consistent. Should I change both?


-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node
..


Patch Set 2: Code-Review+2

Carry  +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Remove seemingly incorrect DCHECK-s.

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Remove seemingly incorrect DCHECK-s.
..


Patch Set 1:

(1 comment)

I agree that this looks like a bug.

http://gerrit.cloudera.org:8080/#/c/4835/1//COMMIT_MSG
Commit Message:

Line 7: Remove seemingly incorrect DCHECK-s.
Can you file an Impala JIRA for this?


-- 
To view, visit http://gerrit.cloudera.org:8080/4835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#7).

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..

IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

Introduces a new phase for rewriting Exprs after analysis and
before subquery rewriting. The transformed Exprs replace the
original ones in analyzed statements. If Exprs were changed,
the whole statement is reset() and re-analyzed, similar to how
subqueries are rewritten. If both Exprs and subqueries are
rewritten there is only one re-analysis of the changed statement.

The following new classes work together to perform transformations:
1. ExprRewriteRule
- base class for Expr transformation rules
2. ExprRewriter
- drives the transformation of Exprs using a list of
  ExprRewriteRules

Statements that have exprs to be rewritten need to implement
a new method rewriteExprs() that accepts an ExprRewriter.

As an example, this patch adds a rule for converting
BetweenPredicates into their equivalent CompoundPredicates.
The BetweenPredicate has been notoriously buggy due to a lack
of such a separate rewrite phase and is now cleaned up.

Testing:
1. Added a new test for checking that the rewrite framework
   covers all relevant statements, clauses and can properly
   handle nested statements and subqueries.
2. There are many existing tests for BetweePredicates and
   they all exercise the new rewrite rule/phase.
3. Ran a private core/hdfs run and it passed.

Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
A fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
29 files changed, 551 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4746/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1:

(1 comment)

Can we test this by having the test load metadata and then truncate a cached 
file?

http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS1, Line 448: disk
how about "uncached" instead?


-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove seemingly incorrect DCHECK-s.

2016-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: Remove seemingly incorrect DCHECK-s.
..


Patch Set 1:

Is it reasonably possible to create a file that demonstrates this and can be 
added to the regression test suite?

-- 
To view, visit http://gerrit.cloudera.org:8080/4835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4655/7/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

Line 141: # (IMPALA-3884).
Can remove this restriction now that IMPALA-3884 is in master


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages

2016-10-25 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3872: allow providing PyPi mirror for python packages
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4770/2/infra/python/deps/pip_download.py
File infra/python/deps/pip_download.py:

PS2, Line 34: pypi_mirror
Use all-caps for the name?


PS2, Line 73:   pkg_url = "{0}/packages/{1}".format(pypi_mirror, 
pkg['path'])
What about:

  pkg_url = urlparse.urljoin(PYPI_MIRROR, 
'/packages/{path}'.format(path=pkg['path']))


-- 
To view, visit http://gerrit.cloudera.org:8080/4770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove seemingly incorrect DCHECK-s.

2016-10-25 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4835

Change subject: Remove seemingly incorrect DCHECK-s.
..

Remove seemingly incorrect DCHECK-s.

The first conditional DCHECK means that if a page's size is 0 then it's
compressed size is also 0. This, however, seems to be a false
assumption, as the compressed output may include metadata, such as
length or checksum.

The GZIP compressor, for example, states that an input of 0 bytes
requires 23 bytes when compressed. The Snappy compressor also mentions
storing length information in the compressed output. The compressed size
estimation in the LZ4 compressor also contains a constant part.

The "Last page might be empty" comment and the second DCHECK also seems
to be based on a false assumption. If a value doesn't fit on the current
page, AppendRow creates a possible bigger new page and tries writing the
data in the new page instead. This means that if the data is bigger than
the page size, then the current page is finalized and a new page is
added, even if the original page was empty. In other words, empty pages
can occur in the middle of the pages_ array as well, not only at the end
of it.

Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
---
M be/src/exec/hdfs-parquet-table-writer.cc
1 file changed, 1 insertion(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/4835/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#18).

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..

IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE
ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_refresh_partition.py
16 files changed, 949 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/18
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#18).

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..

IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE
ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_refresh_partition.py
16 files changed, 949 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/18
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3838: Codegen EvalRuntimeFilters().

2016-10-25 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3838: Codegen EvalRuntimeFilters().
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS1, Line 63: !(BATCHES_PER_FILTER_SELECTIVITY_CHECK & 
(BATCHES_PER_FILTER_SELECTIVITY_CHECK - 1)),
BitUtil::IsPowerOf2().


-- 
To view, visit http://gerrit.cloudera.org:8080/4833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#18).

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..

IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE
ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_refresh_partition.py
16 files changed, 952 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/18
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-10-25 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER 
TABLE ADD PARTITION
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4144/14/tests/metadata/test_hms_integration.py
File tests/metadata/test_hms_integration.py:

PS14, Line 714: # Sometimes metadata load is triggered here, so compare only 
the first three
  : # returned partitions.
> Thanks for looking into it. Yes, the explanation makes sense. Let's remove 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-25 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 7:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

PS7, Line 459: hat
that


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

Line 184:   /// lowered type. This *Val should be non-null.The output variable 
is called 'name'.
nit: space


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS7, Line 347: but we don't clone if we can avoid it to
 :   /// reduce compilation time.
Avoid cloning if possible to reduce compilation time.


http://gerrit.cloudera.org:8080/#/c/4655/6/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS6, Line 1063: if we initialize the destination value to 0 (with the NULL bit 
set)
> We could do that but it doesn't seem worth it (the interpreted path has eno
Sure. I agree that the interpreted path probably won't benefit much. My concern 
has more to do with the explanation of the optimization and the assumption made 
here:

1. Who is initializing the slot to the proper value ?
2. How are these initialized values different for different aggregate functions 
?
3. Will the NULL bits be changed after initialization ? If so, by whom ?
4. And why is it okay to not check the NULL bits after calling the UDA function 
for certain aggregate functions ? Is it because they cannot change or is it 
already handled somewhere else ?


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS7, Line 1660:  Second,
  : // the NULL bit of the dst slot can be ignored if the slot 
is initialised in the
  : // right way (see InitAggSlots()). Empirically this 
optimisation makes TPC-H Q1
  : // 5-10% faster.
Please also see comments above.

It's not very clear what it means by "the NULL bit of the dst slot" can be 
ignored ? Is it more precise to say their state cannot change after 
initialization or they will be updated appropriately by the update function ? 
It seems different aggregate function (e.g. min vs avg) has different default 
values with different conditions for the NULL bit to be set. May help to 
document a bit more details.


PS7, Line 1675: dst.SetIsNull(slot_desc->CodegenIsNull(codegen, , 
agg_tuple_arg));
Why is this not in the old code or was it done somewhere else ?


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

PS7, Line 649: 'dst_val' should contain the previous value of the aggregate
 :   /// function, and 'result_val' is set to the new value after 
the Update or Merge
 :   /// operation is applied.
Just curious why we separate 'dst_val' and 'result_val'. Doesn't 'result_val' 
subsume 'dst_val' ?


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS7, Line 496:  false
Please feel free to ignore but you can consider setting this second argument to 
default value of false. However, keeping it as is may be clearer (and you have 
gone through the trouble of updating all call sites - thanks !).


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 11:

Fixed AnalyzeStmtsTest, also needs another rebase due to more Kudu related 
changes making it into master before this one. Will push another PS.

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/355/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No