[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes

2017-04-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes
..


Patch Set 15:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

Line 109: # Aggregate doesn't affect pushing predicates
This test is not specific to constant propagation, so let's leave it out.


Line 124: # Again, the contradiction is derived in the ScanNode
Same here, this test seems to be more about predicate assignment than constant 
propagation.


Line 139: # Make sure predicates are not pushed into inline views with limits
Same here, this test is about predicate assignment (plenty of coverage 
elsewhere already).

We should really only be testing the optimizeConjuncts() functionality in 
isolation here,


Line 161: # We can optimize in some cases
Instead of testing various predicate assignment scenarios, we should focus more 
on interesting cases for optimizeConjuncts(). Some ideas;
- All conjuncts are optimized to a single "true"
- Conjuncts with NULLs
- Duplicate conjuncts
- Test propagation into non-trivial exprs, e.g. disjunctions, function call 
exprs
- Test errors during constant propagation (I'm still thinking whether there's 
an easy way to do this)

For HDFS scans, we should also optimize the collectionConjuncts_, and add 1-2 
tests for that case.

Have you tried an extreme example with lots of conjuncts to see how long this 
optimization takes? Should we consider having a limit on when not to apply 
constant propagation due to prohibitive optimization cost?


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 127:  SCANRANGELOCATIONS
Don't think we need the SCANRANGELOCATIONS and DISTRIBUTEDPLAN sections.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes

2017-04-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes
..


Patch Set 15:

(14 comments)

Code is looking good, will take another pass over the tests

http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 923:* Propagates constant expressions of the form  = 
 to
Explain candidate param


Line 924:* other uses of slot ref in all expressions in conjuncts; returns 
a BitSet with
in all expressions in conjuncts -> in the given conjnucts


Line 942: // Don't rewrite with our own substitution!  We may have 
already processed
Explanation seems more complicated than necessary. Clearly it would be wrong to 
propagate a constant to the originating binary predicate.

Your function comment is quite accurate "to other uses of slot ref" :)


Line 960:* Returns true if the conjuncts may be true and false if a 
contradiction has
Shorter: Returns false if a contradiction has been implied, true otherwise.


Line 968:   BitSet candidates = new BitSet(conjuncts.size());
> Alex, thanks for the BitSet suggest - this worked out far cleaner than any 
Yea, this looks pretty good to me!


Line 980:   // applied by the rewriter.  We must also re-analyze result 
exprs to
I think we can remove everything after "We must also re-analyze  ..." to keep 
the comment short.

Also, I think constant propagation should fall under the "enable_expr_rewrites" 
query option, i.e., if false we should only remove duplicates in here. 
Otherwise, it's hard to say what that query option means at a high level.


Line 985:   if (index < 0) continue;
> This can be Preconditions.checkState(index >= 0) now.  In the prior form of
Agree. Makes sense.


Line 988:   // Reset Expr to force updating cost
Re-analyze expr to add implicit casts, resolve function signatures, and compute 
cost


Line 998:   // because they can't be evaluated if expr rewriting is 
turned off.
> I can't see a good way to test this with the fe tests.  Should we write a c
You should be able to do this with a PlannerTest. For example, look at 
PlannerTest.testParquetFiltering() which sets a query option and then runs a 
.test file.

But the easier solution, imo, is to not propagate constants when expr rewriting 
is disabled.


http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1266: 
> Optimization should still be done in the form of eliminating duplicates, ev
Looks like the constant propagation is applied even when expr rewrites are 
disabled. Imo, it would be preferable not to propagate constants in that case. 
Otherwise, it's kind of hard to reason about the meaning of that query option.


http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1273: // Allow constant propagation within conjuncts.
This actually has two purposes:
1. constant propagation
2. expr optimization on fully-resolved conjuncts


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

Line 101: # XXX - this seems correct, but is this a legal transformation?
> I'm fairly convinced at this point that considering slotrefs through implic
I agree that implicit casts on both sides seem ok.

It does not seem fine if there is an explicit cast on the slotref.
In your example, the 'cast(1 as tinyint)' is constant and will be folded 
(truncated), so this case seems fine.


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/joins.test:

Line 381: # hbase-hdfs join with scan filtering (bogus)
Can you be more specific than "(bogus)"? What behavior are we testing here?


Line 394: 02:HASH JOIN [INNER JOIN]
> This is a very sad plan.
Agree that this can be improved.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-04-04 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4166: Add SORT BY sql clause
..


Patch Set 7:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS7, Line 984: Set
There is no KW_SET, right? Update the name?


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java:

PS7, Line 73: getTargetTable()
t?


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

Line 2484:* Add a warning that will be displayed to the user. Ignores null 
messages.
Comment that no warning can be issued if warnings have been retrieved?


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java:

PS7, Line 119: sortByColumns_.
I think this can throw a NPE. Looking at the parser I see that 
CreateTableListStmt can pass null for sortByColumns_ and couldn't find where 
this changes.


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

PS7, Line 296: propertyString
inline


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java
File fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java:

PS7, Line 41: /**
:* Analyzes the 'sort.by.columns' property of 'table'.
:*/
:   public static List analyzeSortByColumns(Table table) 
throws AnalysisException {
: return TablePropertyAnalyzer.analyzeSortByColumns(
: table.getMetaStoreTable().getParameters(), table);
:   }
: 
:   /**
:* Analyzes the 'sort.by.columns' property in 'tblProperties' 
against the columns of
:* 'table'.
:*/
:   public static List analyzeSortByColumns(Map tblProperties,
:   Table table) throws AnalysisException {
: return TablePropertyAnalyzer.analyzeSortByColumns(
: getSortByColumnsFromProperties(tblProperties), table);
:   }
These are used in only one place, right? Maybe inline there?


Line 76: }
Precondition check for HBase?


PS7, Line 127: private static List getSortByColumnsFromProperties(
 :   Map tblProperties) {
You may want to consider if it's better to have a getSortByColumn(Table table) 
function instead of this one. Just a though...


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS7, Line 539: // If the table has a 'sort.by.columns' property and the query 
has a 'noclustered'
 : // hint, we issue a warning during analysis and ignore the 
'noclustered' hint.
That comment doesn't seem relevant here. Remove?


PS7, Line 541: !insertStmt.hasNoClusteredHint()
If insertStmt.hasClusteredHint() is true doesn't that imply that 
hasNoClusteredHint() is false? I thought they are mutually exclusive.


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

PS7, Line 1793: params.sort_by_columns.size() > 0
use isEmpty()


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

Line 526: 
Test for HBase and Kudu?


Line 1054:   }
Same here (HBase + Kudu). You need to put all Kudu tests in a function and 
check for supported.


PS7, Line 1904: most
"must" typo


PS7, Line 1920: // Kudu table tests are in TestCreateManagedKuduTable().
No need for that comment :)


http://gerrit.cloudera.org:8080/#/c/6495/7/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

PS7, Line 2347: // Sort by clause
How about some tests with additional table options? Try changing the order in 
which the table options are specified, e.g. location bla sort by ()


http://gerrit.cloudera.org:8080/#/c/6495/7/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

PS7, Line 207: table
select query


http://gerrit.cloudera.org:8080/#/c/6495/7/testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
File 

[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


IMPALA-5156: Drop VLOG level passed into Kudu client

The Kudu logging is very verbose and all of the logging may
affect performance. This change reduces the Kudu client log
level to v-1, since Impala is typically run at v=1 and isn't
too noisy.

In some simple concurrency testing, Todd found that reducing
the vlog level resulted in an increase in throughput from
~17 qps to 60qps.

Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Reviewed-on: http://gerrit.cloudera.org:8080/6549
Reviewed-by: Matthew Jacobs 
Tested-by: Impala Public Jenkins
---
M be/src/exec/kudu-util.cc
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes

2017-04-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes
..


Patch Set 15:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 968:   BitSet candidates = new BitSet(conjuncts.size());
Alex, thanks for the BitSet suggest - this worked out far cleaner than any 
other solution.


Line 985:   if (index < 0) continue;
This can be Preconditions.checkState(index >= 0) now.  In the prior form of the 
code, changed was a set of exprs, so we could have multiple rewrites of the 
same index.  With a BitSet, this is no longer true.


Line 998:   // because they can't be evaluated if expr rewriting is 
turned off.
I can't see a good way to test this with the fe tests.  Should we write a 
custom cluster or ee test that validates this?


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

Line 101: # XXX - this seems correct, but is this a legal transformation?
I'm fairly convinced at this point that considering slotrefs through implicit 
casts is legal.  The only case I can think of where this could be a problem is 
if the cast is applied to the constant, which results in truncation, but I 
don't believe we can create those kind of casts implicitly.  Consider:

select * from functional.alltypes a
where a.tinyint_col = cast(1 as tinyint) and a.tinyint_col CONDITION 
cast(1 as tinyint)

With our current integer casting situation, I think we truncate instead of 
convert to NULL, so this results in tinyint_col having a value, and since the 
cast can be evaluated as a literal, we may perform propagation.  No matter how 
I try to break this expression, though, it seems to always evaluate exactly the 
same as our truncation rule, so I think this is a legal transformation.


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/joins.test:

Line 394: 02:HASH JOIN [INNER JOIN]
This is a very sad plan.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4293: query profile should include error log

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4293: query profile should include error log
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6553
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4293: query profile should include error log

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4293: query profile should include error log
..


IMPALA-4293: query profile should include error log

This puts the string-ified error log into an "Errors:"
field in the profile.

Testing:
Visually checked that the field showed up in the profiles in
the web UI. Ran core tests.

Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Reviewed-on: http://gerrit.cloudera.org:8080/6553
Reviewed-by: Henry Robinson 
Tested-by: Impala Public Jenkins
---
M be/src/service/impala-server.cc
1 file changed, 2 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Henry Robinson: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6553
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes

2017-04-04 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#15).

Change subject: IMPALA-5003: Constant propagation in scan nodes
..

IMPALA-5003: Constant propagation in scan nodes

When conjuncts are pushed into table refs from inline views, they can
be considered for constant progagation within that node.  In certain
cases, we might end up with a FALSE conditional and now we can
convert ScanNodes to EmptySet nodes when that occurs, and in other
cases, we might be able to prune partitions based on expressions which
have now become constant.

Testing: Expanded the test cases for the planner to achieve constant
propagation.  Added Kudu, datasource, Hdfs and HBase tests to validate
we can create EmptySetNodes.

Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
17 files changed, 471 insertions(+), 88 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/6389/15
-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-04-04 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#14).

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..

IMPALA-5003: Constant propagation in scan nodes and inline views

When conjuncts are pushed into table refs and inline views, they can
be considered for constant progagation within that node.  In certain
cases, we might end up with a FALSE conditional and now we can
convert ScanNodes to EmptySet nodes when that occurs.

Testing: Expanded the test cases for the planner to achieve constant
propagation.  Added Kudu, datasource, Hdfs and HBase tests to validate
we can create EmptySetNodes.

Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
17 files changed, 471 insertions(+), 88 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/6389/14
-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-04-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6389/13/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 36: import org.apache.impala.common.Pair;
Actually unused, will remove


http://gerrit.cloudera.org:8080/#/c/6389/13/fe/src/main/java/org/apache/impala/planner/ValueRange.java
File fe/src/main/java/org/apache/impala/planner/ValueRange.java:

Line 67:   public boolean incorporatePredicate(BinaryPredicate.Operator op, 
Expr bound) {
I supposed we should remove this code and add it back in a later diff (when it 
can actually be tested).


http://gerrit.cloudera.org:8080/#/c/6389/13/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

Line 116: 01:AGGREGATE [FINALIZE]
typo


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-04-04 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#13).

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..

IMPALA-5003: Constant propagation in scan nodes and inline views

When conjuncts are pushed into table refs and inline views, they can
be considered for constant progagation within that node.  In certain
cases, we might end up with a FALSE conditional and now we can
convert ScanNodes to EmptySet nodes when that occurs.

Testing: Expanded the test cases for the planner to achieve constant
propagation.  Added Kudu, datasource, Hdfs and HBase tests to validate
we can create EmptySetNodes.

Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/ValueRange.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
18 files changed, 535 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/6389/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/438/

-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 2: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/437/

-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME MICROS as Impala TIMESTAMP

2017-04-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..

IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

Adds Impala support for TIMESTAMP types stored in Kudu.

Impala stores TIMESTAMP values in 96-bits and has nanosecond
precision. Kudu's timestamp is a 64-bit microsecond delta
from the Unix epoch (called UNIXTIME_MICROS), so a conversion
is necessary.

TODO: As of now, this only supports writing TIMESTAMPs to Kudu.
Reading will require the Kudu client to return
UNIXTIME_MICROS in a padded slot for Impala. In the
meantime, an exception is thrown in the FE when attempting
to read from a TIMESTAMP from a Kudu table. Some tests were
added using the python Kudu client to read the results, but
more tests will be added when Impala can read as well.

Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
---
M be/src/exec/kudu-table-sink.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M tests/query_test/test_kudu.py
7 files changed, 125 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6526/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool
..

PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool

Allocate with mmap instead of TCMalloc to give more control over memory
usage. Also allocate huge pages when possible to reduce TLB pressure.

Adds additional memory metrics, since we previously relied on the
assumption that all memory was allocated through TCMalloc.
memory.total-used and memory.total-reserved track the total across
the buffer pool and TCMalloc. When the buffer pool is not present,
they just report the TCMalloc values.

ASAN still uses malloc() because it doesn't instrument mmap().

Testing:
Added some unit tests to test edge cases. Many pre-existing tests also
exercise the modified code.

Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/bufferpool/system-allocator.cc
M be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/exec-env.cc
M be/src/statestore/statestored-main.cc
M be/src/util/asan.h
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
18 files changed, 387 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/6474/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 3:

(14 comments)

Just looked at the thrift so far.

http://gerrit.cloudera.org:8080/#/c/6535/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS3, Line 381: the total number
 : // of fragment instances
what's that?


PS3, Line 386: i32
Types.TFragmentIdx ?


PS3, Line 412: 6: list destinations
isn't that common across all instances?


PS3, Line 414: fragment
instance?


PS3, Line 435: coord_state_idx
not clear why the backend will care what index it was represented by in a 
coordinator's datastructure. why is that?
oh, reading on I see this is really just used as to identify the backend when 
calling back via ReportExecStatus. it could use a little explanation.


Line 441:   4: list fragment_ctxs
are these in any particular order?


Line 453: // ReportExecStatus
other RPC names have a blank line after them, which makes them easier to find.


PS3, Line 552: overall
what does "overall" mean here?  this makes it seem like it's the merged status 
of all the instances, but from reading the code it's not.


PS3, Line 560: insert_exec_status
I guess this will hold the insert status for all instances, right?


Line 563:   7: optional map error_log;
and same here. this will be the aggregated error log across all instances?


PS3, Line 572: CancelPlanFragment
CancelQueryFInstances


Line 669: 
It would be helpful to add the RPC comment to show which RPC the following 
structures below to (like we have above):

// UpdateFilter


Line 699: 
// PublishFilter


PS3, Line 734: ReportExecStatus
Eventually this will report status for all the instances (at least in the 
periodic case), right? So it might be clearer to name it similar to the others: 
ReportFInstancesStatus() or similar.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4293: query profile should include error log

2017-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4293: query profile should include error log
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/436/

-- 
To view, visit http://gerrit.cloudera.org:8080/6553
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 18:

PS18 was just a rebase.

-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4293: query profile should include error log

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4293: query profile should include error log
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6553
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4293: query profile should include error log

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6553

Change subject: IMPALA-4293: query profile should include error log
..

IMPALA-4293: query profile should include error log

This puts the string-ified error log into an "Errors:"
field in the profile.

Testing:
Visually checked that the field showed up in the profiles in
the web UI. Ran core tests.

Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
---
M be/src/service/impala-server.cc
1 file changed, 2 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/6553/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6553
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib9b7298df6a6f05a8cdb486283ae4728b00a5ef1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#18).

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..

IMPALA-3203: Part 2: per-core free lists in buffer pool

Add per-core lists of clean pages and free pages to enable allocation
of buffers without contention on shared locks in the common case.

This is implemented with an additional layer of abstraction in
"BufferAllocator", which tracks all memory (free buffers and clean
pages) that is not in use but has not been released to the OS.
The old BufferAllocator is renamed to SystemAllocator.

See "Spilled Page Mgmt" and "MMap Allocator & Scalable Free Lists" in
https://goo.gl/0zuy97 for a high-level summary of how this fits into
the buffer pool design.

The guts of the new code is BufferAllocator::AllocateInternal(),
which progresses through several strategies for allocating memory.

Misc changes:
* Enforce upper limit on buffer size to reduce the number of free lists
  required.
* Add additional allocation counters.
* Slightly reorganise the MemTracker GC functions to use lambdas and
  clarify the order in which they should be called. Also adds a target
  memory value so that they don't need to free *all* of the memory in
  the system.
* Fix an accounting bug in the buffer pool where it didn't
  evict dirty pages before reclaiming a clean page.

Performance:
We will need to validate the performance of the system under high query
concurrency before this is used as part of query execution. The benchmark
in Part 1 provided some evidence that this approach of a list per core
should scale well to many cores.

Testing:
Added buffer-allocator-test to test the free list resizing algorithm
directly.

Added a test to buffer-pool-test to exercise the various new memory
reclamation code paths that are now possible. Also run buffer-pool-test
under two different faked-out NUMA setups - one with no NUMA and another
with three NUMA nodes.

buffer-pool-test, suballocator-test, and buffered-tuple-stream-v2-test
provide some further basic coverage. Future system and unit tests will
validate this further before it is used for query execution (see
IMPALA-3200).

Ran an initial version of IMPALA-4114, the ported BufferedBlockMgr
tests, against this. The randomised stress test revealed some accounting
bugs which are fixed. I'll post those tests as a follow-on patch.

Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/common/init.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/free-list-test.cc
M be/src/runtime/bufferpool/free-list.h
M be/src/runtime/bufferpool/suballocator-test.cc
M be/src/runtime/bufferpool/suballocator.h
A be/src/runtime/bufferpool/system-allocator.cc
A be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
A be/src/testutil/cpu-util.h
A be/src/testutil/rand-util.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
30 files changed, 1,558 insertions(+), 363 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/6414/18
-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-04-04 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4166: Add SORT BY sql clause
..


Patch Set 5:

(5 comments)

Responses to some comments.

http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

PS5, Line 455: the caller must make sure that the value matches any columns 
he/she adds to the
 :* table.
> We could throw one if the number is wrong, but eventually the caller will h
No that's fine. You already mention that this is used only for tests.


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS5, Line 546: orderingExprs.addAll(insertStmt.getPartitionKeyExprs()
> Insert hints are not allowed for HBase tables and the analysis will fail. F
If we enforce it at the analysis phase, then adding a Preconditions check here 
is not a bad idea.


http://gerrit.cloudera.org:8080/#/c/6495/5/testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test:

PS5, Line 10: ASC NULLS LAST
> No, we cannot change it. I thought about this, but couldn't see why we'd wa
I was thinking more of changing the ASC into DESC but that's fine if this is an 
acceptable limitation.


PS5, Line 39:  DISTRIBUTEDPLAN
: WRITE TO HDFS [test_sort_by.alltypes, OVERWRITE=false, 
PARTITION-KEYS=(year,month)]
: |  partitions=24
: |
: 01:SORT
: |  order by: year ASC NULLS LAST, month ASC NULLS LAST, int_col 
ASC NULLS LAST, bool_col ASC NULLS LAST
: |
: 00:SCAN HDFS [functional.alltypes]
:partitions=24/24 files=24 size=478.45KB
> They make sure that the shuffle/noshuffle hint has been observed, indicated
Good point. I think we can leave them.


http://gerrit.cloudera.org:8080/#/c/6495/5/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

PS5, Line 1066:  
> They are part of the output of "describe formatted". Without them the compa
It's probably because of the HMS schema used (maybe they use CHAR(N)). Anyways, 
no need to do something. I just wanted to make sure we don't do anything funky.


-- 
To view, visit http://gerrit.cloudera.org:8080/6495
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Matthew Jacobs (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6549

to look at the new patch set (#2).

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..

IMPALA-5156: Drop VLOG level passed into Kudu client

The Kudu logging is very verbose and all of the logging may
affect performance. This change reduces the Kudu client log
level to v-1, since Impala is typically run at v=1 and isn't
too noisy.

In some simple concurrency testing, Todd found that reducing
the vlog level resulted in an increase in throughput from
~17 qps to 60qps.

Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
---
M be/src/exec/kudu-util.cc
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/6549/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-1726: Treat parquet ENUMs as STRINGs when creating tables.

2017-04-04 Thread Jakub Kukul (Code Review)
Jakub Kukul has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6550

Change subject: IMPALA-1726: Treat parquet ENUMs as STRINGs when creating 
tables.
..

IMPALA-1726: Treat parquet ENUMs as STRINGs when creating tables.

Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
---
M docs/topics/impala_parquet.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
2 files changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/6550/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jakub Kukul 


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-04-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7:

> Do we want to maintain the
 > "make" command so users can simple 1) install the DITA-OT; 2)
 > navigate to the "docs" directory; and 3) execute "make" to build
 > HTML and PDF? I think this is the best way to go.

Thanks. That means I have another patch set to do: adding dita to PATH. I plan 
to remove all but 1 dita invocation example.

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports

2017-04-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA 
reports
..


Patch Set 2:

(5 comments)

I still think the anchor should just be removed, because it adds a broken link, 
instead of preserving a broken link. I also found a few other places where 
direct links are used, not the xrefs. If there's a good reason for not using 
xrefs, that's fine.

http://gerrit.cloudera.org:8080/#/c/6515/1/docs/topics/impala_fixed_issues.xml
File docs/topics/impala_fixed_issues.xml:

PS1, Line 573: 
> Done
I believe a user reading these docs will click the anchor and it will go 
nowhere. It seems like on LHS, there is no link to click. Thus this patch (RHS) 
introduces a dead link. I would suggest you revert this change and add a 
follow-on Jira to fix the link.


http://gerrit.cloudera.org:8080/#/c/6515/2/docs/topics/impala_fixed_issues.xml
File docs/topics/impala_fixed_issues.xml:

Line 2590: https://issues.apache.org/jira/issues/?jql=project%3Dimpala+and+fixVersion%3D%22Impala+2.1%22+and+resolution%3D%22Fixed%22;
 format="html" scope="external">
Why not jira_list_210?


Line 5036: https://issues.apache.org/jira/issues/?jql=project%3Dimpala+and+fixVersion%3D%22Impala+1.1%22+and+resolution%3D%22Fixed%22;
 format="html" scope="external">
Why not jira_list_110 (or some other good name)?


Line 5168: https://issues.apache.org/jira/issues/?jql=project%3Dimpala+and+fixVersion%3D%22Impala+1.0.1%22+and+resolution%3D%22Fixed%22;
 format="html" scope="external">
Why not jira_list_101?


Line 5403: https://issues.apache.org/jira/issues/?jql=project+%3D+impala+AND+resolution+%3D+Fixed+AND+fixVersion+%3D+%22Impala+1.0%22+ORDER+BY+key+ASC%2C+assignee+ASC%2C+priority+DESC;
 scope="external" format="html">this
Why not jira_list_100?


-- 
To view, visit http://gerrit.cloudera.org:8080/6515
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#17).

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..

IMPALA-3203: Part 2: per-core free lists in buffer pool

Add per-core lists of clean pages and free pages to enable allocation
of buffers without contention on shared locks in the common case.

This is implemented with an additional layer of abstraction in
"BufferAllocator", which tracks all memory (free buffers and clean
pages) that is not in use but has not been released to the OS.
The old BufferAllocator is renamed to SystemAllocator.

See "Spilled Page Mgmt" and "MMap Allocator & Scalable Free Lists" in
https://goo.gl/0zuy97 for a high-level summary of how this fits into
the buffer pool design.

The guts of the new code is BufferAllocator::AllocateInternal(),
which progresses through several strategies for allocating memory.

Misc changes:
* Enforce upper limit on buffer size to reduce the number of free lists
  required.
* Add additional allocation counters.
* Slightly reorganise the MemTracker GC functions to use lambdas and
  clarify the order in which they should be called. Also adds a target
  memory value so that they don't need to free *all* of the memory in
  the system.
* Fix an accounting bug in the buffer pool where it didn't
  evict dirty pages before reclaiming a clean page.

Performance:
We will need to validate the performance of the system under high query
concurrency before this is used as part of query execution. The benchmark
in Part 1 provided some evidence that this approach of a list per core
should scale well to many cores.

Testing:
Added buffer-allocator-test to test the free list resizing algorithm
directly.

Added a test to buffer-pool-test to exercise the various new memory
reclamation code paths that are now possible. Also run buffer-pool-test
under two different faked-out NUMA setups - one with no NUMA and another
with three NUMA nodes.

buffer-pool-test, suballocator-test, and buffered-tuple-stream-v2-test
provide some further basic coverage. Future system and unit tests will
validate this further before it is used for query execution (see
IMPALA-3200).

Ran an initial version of IMPALA-4114, the ported BufferedBlockMgr
tests, against this. The randomised stress test revealed some accounting
bugs which are fixed. I'll post those tests as a follow-on patch.

Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/common/init.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/free-list-test.cc
M be/src/runtime/bufferpool/free-list.h
M be/src/runtime/bufferpool/suballocator-test.cc
M be/src/runtime/bufferpool/suballocator.h
A be/src/runtime/bufferpool/system-allocator.cc
A be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
A be/src/testutil/cpu-util.h
A be/src/testutil/rand-util.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
30 files changed, 1,558 insertions(+), 364 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/6414/17
-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#16).

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..

IMPALA-3203: Part 2: per-core free lists in buffer pool

Add per-core lists of clean pages and free pages to enable allocation
of buffers without contention on shared locks in the common case.

This is implemented with an additional layer of abstraction in
"BufferAllocator", which tracks all memory (free buffers and clean
pages) that is not in use but has not been released to the OS.
The old BufferAllocator is renamed to SystemAllocator.

See "Spilled Page Mgmt" and "MMap Allocator & Scalable Free Lists" in
https://goo.gl/0zuy97 for a high-level summary of how this fits into
the buffer pool design.

The guts of the new code is BufferAllocator::AllocateInternal(),
which progresses through several strategies for allocating memory.

Misc changes:
* Enforce upper limit on buffer size to reduce the number of free lists
  required.
* Add additional allocation counters.
* Slightly reorganise the MemTracker GC functions to use lambdas and
  clarify the order in which they should be called. Also adds a target
  memory value so that they don't need to free *all* of the memory in
  the system.
* Fix an accounting bug in the buffer pool where it didn't
  evict dirty pages before reclaiming a clean page.

Performance:
We will need to validate the performance of the system under high query
concurrency before this is used as part of query execution. The benchmark
in Part 1 provided some evidence that this approach of a list per core
should scale well to many cores.

Testing:
Added buffer-allocator-test to test the free list resizing algorithm
directly.

Added a test to buffer-pool-test to exercise the various new memory
reclamation code paths that are now possible. Also run buffer-pool-test
under two different faked-out NUMA setups - one with no NUMA and another
with three NUMA nodes.

buffer-pool-test, suballocator-test, and buffered-tuple-stream-v2-test
provide some further basic coverage. Future system and unit tests will
validate this further before it is used for query execution (see
IMPALA-3200).

Ran an initial version of IMPALA-4114, the ported BufferedBlockMgr
tests, against this. The randomised stress test revealed some accounting
bugs which are fixed. I'll post those tests as a follow-on patch.

Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/common/init.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/free-list-test.cc
M be/src/runtime/bufferpool/free-list.h
M be/src/runtime/bufferpool/suballocator-test.cc
M be/src/runtime/bufferpool/suballocator.h
A be/src/runtime/bufferpool/system-allocator.cc
A be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
A be/src/testutil/cpu-util.h
A be/src/testutil/rand-util.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
30 files changed, 1,557 insertions(+), 364 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/6414/16
-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 15:

(41 comments)

http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/common/init.cc
File be/src/common/init.cc:

PS15, Line 101:  
> weird indentation and close parenthesis is missing.
Done


http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS15, Line 59: .
> should be more explicit that it might return less memory than 'target_bytes
Done


PS15, Line 64: returns the total amount freed 
> hmm, it's kinda confusing that the return value has this subtle difference 
I think it's doing the same operation, I just think I needed to get two 
different number out of it. I changed it to return a pair and changed the 
arguments so that the meaning was hopefully a bit clearer.


PS15, Line 70: FreeMemory
> do you think it makes sense to call this FreeSystemMemory() to make it clea
Works for me, I think it's better to be clear


PS15, Line 137:  Return the buffer size for a buffer of the given length.
> doesn't match
Done


PS15, Line 138: GetBuffersOfSize
> GetListsForSize?
Done


PS15, Line 254: ffinally
> typo
Done


Line 261:   if (delta == len) break;
> since we can't get rid of the other DecreaseSystemBytesRemaining() probably
Done


PS15, Line 263: lock_arenas
> thought this would become 'final_attempt'
Done


Line 304:   vector arena_locks;
> let's add a comment explaining all of this. Something like:
Added a lightly-edited version of this comment


PS15, Line 317:   // It's possible that 'system_bytes_remaining_' was 
incremented since we last checked
  :   // it.
> maybe this sentence would now be explained in the proposed comment above.
Done


PS15, Line 319: another thread between when that thread released memory to the 
system and
  :   // incremented 'system_bytes_remaining_'.
> I think it'd be good to explicitly say:
Tried to make this explicit.


PS15, Line 322: false
> why false instead of true? I guess it doesn't matter since it should succee
That's true, I hadn't thought of it like that.


PS15, Line 419: FreeBuffers
> this name is a bit confusing now that i see it in use since "free buffers" 
Done


PS15, Line 421: so that other threads
  : // can claim the memory.
> not sure what this comment is saying.
This code was removed.


PS15, Line 423: bytes_freed > 0
> when would this be false?
This code was removed.


PS15, Line 478: !al.owns_lock() && 
> why have this condition here? it's still correct to skip the whole block if
That's true, it doesn't really matter, removed it.


Line 488: // Figure out how many buffers we should free of this size.
> i was misled by this comment since at this point, buffers_to_free is only h
Tried to improve the comment.


PS15, Line 489: bytes_to_free
> this is so similarly named to buffers_to_free and buffer_bytes_to_free but 
Yeah I think I had a bug at one point that was a result of me confusing the 
two. Renamed it.


Line 496: // that they are freed based on memory address in the expected 
order.
> this is basically the same as evicting the pages, right? it would help to u
Done


PS15, Line 525:   int64_t max_to_claim = claim_memory ? bytes_to_free : 0;
  :   if (bytes_freed > max_to_claim) {
  : // Add back the extra for other threads before releasing 
the lock to avoid race
  : // where the other thread may not be able to find enough 
buffers.
  : parent_->system_bytes_remaining_.Add(bytes_freed - 
max_to_claim);
  : bytes_freed = max_to_claim;
  :   }
> as mentioned earlier in the function comment, this is pretty confusing.
I changed the interface a bit. Hopefully the new version will make it less 
confusing.


PS15, Line 551: BufferHandle buffer;
  : {
  :   lock_guard pl(page->buffer_lock);
  :   buffer = move(page->buffer);
  : }
  : lists->free_buffers.AddFreeBuffer(move(buffer));
  : lists->num_free_buffers.Add(1);
> why doesn't this need to do the other stuff that FreeBufferArena::AddFreeBu
AddFreeBuffer() got simplified as a consequence of other changes so these 
should align now.


PS15, Line 569: touched
> needed?
Done


Line 571:   // least one, we guarantee that an idle list will shrink to 
zero entries.
> can you add some motivation for why we need to care about the previous two 
After looking at this again I realised I was probably overthinking it. We only 
really need one low water mark. I think we may want to reduce the frequency of 
memory maintenance - every 1s may be too aggressive at cleaning up memory.


PS15, Line 576: bytes_freed > 0
> when would this be false?
Done



[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-04-04 Thread Laurel Hale (Code Review)
Laurel Hale has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7:

> > Laurel, doe we still need to tell people how to generate SQL
 > > reference when the entire doc can be generated just fine?
 > 
 > Laurel, can you comment on this question?

No, we do not need to explain how to generate just the SQL reference. However, 
a larger issue is: Do we want to maintain the "make" command so users can 
simple 1) install the DITA-OT; 2) navigate to the "docs" directory; and 3) 
execute "make" to build HTML and PDF? I think this is the best way to go.

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6549/1/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

PS1, Line 105: std::max
#include . Add a comment?


-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5156: Drop VLOG level passed into Kudu client

2017-04-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6549

Change subject: IMPALA-5156: Drop VLOG level passed into Kudu client
..

IMPALA-5156: Drop VLOG level passed into Kudu client

The Kudu logging is very verbose and all of the logging may
affect performance. This change reduces the Kudu client log
level to v-1, since Impala is typically run at v=1 and isn't
too noisy.

In some simple concurrency testing, Todd found that reducing
the vlog level resulted in an increase in throughput from
~17 qps to 60qps.

Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
---
M be/src/exec/kudu-util.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/6549/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6f160e20de37ac13887ccfebc5fb1d595fd05db1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports

2017-04-04 Thread John Russell (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6515

to look at the new patch set (#2).

Change subject: IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA 
reports
..

IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports

In addition to switching URLs for individual JIRAs
to point to issues.apache.org, also construct alternative
URLs for JIRA reports (usually 1 per release, summarizing
the issues fixed in that release).

There is some inconsistency in which releases have associated
JIRA reports. For releases where we never published a link to
a JIRA report, I added a  tag that could be used to
construct a report in future, but didn't fill in any URL.

Some releases do have a link to a JIRA report in the release
notes, however the report is empty. Possibly there was some
strangeness around the release process for those, or there
could be some missing info in the JIRA system. I just
transcribed the links in those cases and didn't try to
reconstruct the history to debug the empty reports.

Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525
---
M docs/impala_keydefs.ditamap
M docs/topics/impala_fixed_issues.xml
M docs/topics/impala_known_issues.xml
3 files changed, 216 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/6515/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6515
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports

2017-04-04 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA 
reports
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6515/1/docs/topics/impala_fixed_issues.xml
File docs/topics/impala_fixed_issues.xml:

PS1, Line 59: ="jira_list_280"/>.
> This works, but why not make it jira_list_280?
Done


PS1, Line 573: 
> This didn't work for me. It seems as if there is no fixed_issues_232 anchor
Done


PS1, Line 573: fixed_issues_232
> I looked in "impala_fixed_issues.xml" and there IS a concept id "fixed_issu
I believe this is a bug in the HTML transform. Why don't we classify that as 
out-of-scope for this particular gerrit.


PS1, Line 2461:  This points to empty search results.
As I said in the commit message, there are some releases where maybe nobody did 
the appropriate JIRA hookup for upstream issues. I'm reproducing the experience 
readers had in the Cloudera Impala docs, good or bad, just without any 
Cloudera-specific references.


PS1, Line 2688: 
> This points to empty search results:
As stated above, I'm proposing that that's out of scope for this gerrit.


PS1, Line 2706: 
> This points to empty search results:
As stated above, I'm proposing that that's out of scope for this gerrit.


PS1, Line 2770:  This points to empty search results:
As stated above, I'm proposing that that's out of scope for this gerrit.


PS1, Line 2820: 
> This points to empty search results:
As stated above, I'm proposing that that's out of scope for this gerrit.


http://gerrit.cloudera.org:8080/#/c/6515/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

PS1, Line 96: he JIRA report of blocker/critical issues:
> Doesn't work.
Gerrit did truncate it when I copied the URL. This is invisible comment, part 
of a comment only. Why don't I just remove this link from the comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/6515
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-04-04 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4166: Add SORT BY sql clause
..


Patch Set 7:

(20 comments)

Thank you for your review! Please see PS7.

http://gerrit.cloudera.org:8080/#/c/6495/5/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 383:   // any such columns of the source table. If unspecified, the 
destination table will
> Otherwise? Will the destination table inherit the sort by columns of the so
Yes, it'll inherit them. I expanded the comment. We have a test for this in 
QueryTest/create-table-like-table.test.


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java:

PS5, Line 64: super.analyze(analyzer);
> Comment not applicable. Remove?
Done


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

PS5, Line 127: hasNoClusteredHint_
> Similar to the comment on hasShuffleHint, explain the allowed values for ha
Done


PS5, Line 763:  
> "if any"
Done


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java
File fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java:

PS5, Line 77: return Lists.newArrayList();
> This is an immutable list whereas the next function returns a mutable one. 
Done


PS5, Line 83: 
:*/
> remove
Done


PS5, Line 91: sortByC
> nit: rename to sortByColName? since you also have tableCols it may be good 
Done


Line 92:   // Make sure it's not a primary key column or partition column.
> nit: remove empty line
Done


PS5, Line 102: 
 :   for (int j = 0; j < tableCols.size(); ++j) {
> Not sure I follow this part based on the code below.
Done, updated the comment.


Line 120: return colIdxs;
> nit: remove empty line
Done


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

PS5, Line 455: the caller must make sure that the value matches any columns 
he/she adds to the
 :* table.
> Can't we throw an exception instead?
We could throw one if the number is wrong, but eventually the caller will have 
to provide k columns and then set this to some value 0 <= n < k. Should we 
check that? The comment was meant to indicate that the caller will be on 
themselves to use this.


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS5, Line 546: orderingExprs.addAll(insertStmt.getPartitionKeyExprs()
> I assume this works for HBase tables...
Insert hints are not allowed for HBase tables and the analysis will fail. From 
looking at InsertStmt I think that this would return an empty list of exprs. 
However I think it could be more explicit. Should I add an "instanceof" check 
for HdfsTable here?


http://gerrit.cloudera.org:8080/#/c/6495/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

PS5, Line 521: 'sort.by.columns'='id
> How about 'ID' (check for case)?
Indeed it did not work, I fixed it.


PS5, Line 1642: "CREATE TABLE LIKE is not supported for Kudu tables");
> Also, add the negative case where sort by cols don't exist in the source ta
Done


PS5, Line 1913: // Partitioned HDFS table
  : AnalyzesOk("create table functional.new_table (i int) 
PARTITIONED BY (d decimal)" +
  : "SORT BY (i)");
  : // Column in sortby hint must not be a Hdfs partition 
column.
  : AnalysisError("create table functional.new_table (i int) 
PARTITIONED BY (d decimal)" +
  : "SORT BY (d)", "SORT BY column list must not contain 
partition column: 'd'");
  : 
  : // Kudu table tests are in TestCreateManagedKuduTable().
  :   }
  : 
  :   @Test
  :   public void TestAlterKuduTable() {
  : TestUtils.assumeKuduIsSupported();
  :
> For Kudu specific tests they need to be in a function that first calls Test
I move the tests to TestCreateManagedKuduTable().

We don't support setting sort by columns for HBase tables (I added code to 
AlterTableSortByStmt to enforce this). During analysis, we don't parse the 
table property, either.


http://gerrit.cloudera.org:8080/#/c/6495/5/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

PS5, Line 162: create table t sort by
> Add one with a partitioned table? Also, a CTAS with a more 

[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-04-04 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#7).

Change subject: IMPALA-4166: Add SORT BY sql clause
..

IMPALA-4166: Add SORT BY sql clause

This change adds support for adding SORT BY (...) clauses to CREATE
TABLE and ALTER TABLE statements. Examples are:

CREATE TABLE t (i INT, j INT, k INT) PARTITIONED BY (l INT) SORT BY (i, j);
CREATE TABLE t (i INT, x INT PRIMARY KEY) PARTITION BY HASH(x)
PARTITIONS 8 SORT BY(i) STORED AS KUDU;
CREATE TABLE t SORT BY (int_col,id) LIKE u;
CREATE TABLE t LIKE PARQUET '/foo' SORT BY (id,zip);

ALTER TABLE t SORT BY (int_col,id);
ALTER TABLE t SORT BY ();

SORT BY columns can be specified for all table types, but effectiveness
may vary based on table type, for example TEXT tables will not see
improved compression. The SORT BY clause must not contain clustering
columns for HDFS tables, and must not contain primary key columns for
Kudu tables. The columns in the SORT BY clause are stored in the
'sort.by.columns' table property and will result in an additional SORT
node being added to the plan before the final table sink. Specifying
SORT BY columns also enables clustering during inserts, so the SORT node
will contain all partitioning columns first, followed by the SORT BY
columns. We do this because SORT BY columns add a SORT node to the plan
and adding the clustering columns to the SORT node is cheap.

SORT BY columns supersede the sortby() hint, which we will remove in a
subsequent change (IMPALA-5144).

This change introduces a new TablePropertyAnalyzer class to collect
various methods used to analyze table properties.

Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
A fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
A testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-as-select.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test
M testdata/workloads/functional-query/queries/QueryTest/create-table.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
30 files changed, 1,062 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/6495/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6495
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5108: [DOCS] Explain 50% margin for idle * settings

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5108: [DOCS] Explain 50% margin for idle_* settings
..


Patch Set 1:

(3 comments)

Looks pretty good - just a couple of nits.

http://gerrit.cloudera.org:8080/#/c/6463/1/docs/topics/impala_timeouts.xml
File docs/topics/impala_timeouts.xml:

Line 116: query option. Any value specified for the 
--idle_query_timeout startup
non-zero? If it's zero, the default timeout is effectively infinite.


PS1, Line 141: and queries
I just looked - queries are checked every second.


PS1, Line 141: To avoid excessive polling, 
No need to add justification.


-- 
To view, visit http://gerrit.cloudera.org:8080/6463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b19912dac1df13bfcbcb67f0bd4ed0064ad8db9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-04-04 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#6).

Change subject: IMPALA-4166: Add SORT BY sql clause
..

IMPALA-4166: Add SORT BY sql clause

This change adds support for adding SORT BY (...) clauses to CREATE
TABLE and ALTER TABLE statements. Examples are:

CREATE TABLE t (i INT, j INT, k INT) PARTITIONED BY (l INT) SORT BY (i, j);
CREATE TABLE t (i INT, x INT PRIMARY KEY) PARTITION BY HASH(x)
PARTITIONS 8 SORT BY(i) STORED AS KUDU;
CREATE TABLE t SORT BY (int_col,id) LIKE u;
CREATE TABLE t LIKE PARQUET '/foo' SORT BY (id,zip);

ALTER TABLE t SORT BY (int_col,id);
ALTER TABLE t SORT BY ();

SORT BY columns can be specified for all table types, but effectiveness
may vary based on table type, for example TEXT tables will not see
improved compression. The SORT BY clause must not contain clustering
columns for HDFS tables, and must not contain primary key columns for
Kudu tables. The columns in the SORT BY clause are stored in the
'sort.by.columns' table property and will result in an additional SORT
node being added to the plan before the final table sink. Specifying
SORT BY columns also enables clustering during inserts, so the SORT node
will contain all partitioning columns first, followed by the SORT BY
columns. We do this because SORT BY columns add a SORT node to the plan
and adding the clustering columns to the SORT node is cheap.

SORT BY columns supersede the sortby() hint, which we will remove in a
subsequent change (IMPALA-5144).

This change introduces a new TablePropertyAnalyzer class to collect
various methods used to analyze table properties.

Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetSortByColumnsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
A fe/src/main/java/org/apache/impala/analysis/TablePropertyAnalyzer.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
A testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-as-select.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test
M testdata/workloads/functional-query/queries/QueryTest/create-table.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
30 files changed, 1,060 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/6495/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6495
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] Allow BlockingQueue and ThreadPool to accept rvalue args

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Allow BlockingQueue and ThreadPool to accept rvalue args
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6442/1//COMMIT_MSG
Commit Message:

Line 11: queue. Very often we create a thin wrapper for each work item we submit
> This CR uses it in only one place where it doesn't seem to make much sense 
The shared_ptr example is illustrative, but does also have some small effect: 
the shared_ptr is no longer copied so the ref count changes and synchronization 
that would have to happen with a copy are elided. 

I do have a change that uses this for a non-copyable object. But I also think 
that this is a natural improvement for the API in that copy c'tors won't be 
silently called where it's avoidable.

Note that some callsites of this API will have their behavior changed with no 
code changes. For example, Coordinator::UpdateFilter() uses ThreadPool::Offer() 
with a boost::function that is an rvalue. With this patch the function is moved 
into the queue and not copied. That could be significant when the function has 
large argument objects.


http://gerrit.cloudera.org:8080/#/c/6442/1/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

Line 198:   /// Overloads for inserting an item into the list, depending 
whether it should be moved
> depending on
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1791870576cb269e86495034f92555de48f92f10
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow BlockingQueue and ThreadPool to accept rvalue args

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: Allow BlockingQueue and ThreadPool to accept rvalue args
..

Allow BlockingQueue and ThreadPool to accept rvalue args

Previously the BlockingQueue and ThreadPool APIs only accepted const
lvalue references, so the argument was always copied into the
queue. Very often we create a thin wrapper for each work item we submit
to a thread pool, and will not want to use that object again, so moving
it into the pool rather than copying makes the most sense.

Note the introduction of an extra template parameter into Offer() and
BlockingPut*(). To enable perfect-forwarding (i.e. allow the methods to
accept rvalue or lvalues and pass them through), we need to use a)
rvalue references (V&&) and b) do so in a 'type-deducing
context' [1]. Having the enclosing class be template-parameterized does not
count as type-deducing, so we add the dummy V parameter and the compiler
will ensure that V is properly compatible with the original T type.

[1] 
http://eli.thegreenplace.net/2014/perfect-forwarding-and-universal-references-in-c/

Change-Id: I1791870576cb269e86495034f92555de48f92f10
---
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/util/blocking-queue.h
M be/src/util/thread-pool.h
3 files changed, 18 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/6442/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1791870576cb269e86495034f92555de48f92f10
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-3381: Support AM/PM marker in date and time format strings

2017-04-04 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change.

Change subject: IMPALA-3381: Support AM/PM marker in date and time format 
strings
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6523/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

Line 173:   case 'a': tok_type = AM_PM_MARKER; dt_ctx->has_am_pm_marker = 
true; break;
> I decided to use "a" for the AM/PM marker because this is what Hive does.
The mask can not hard-code any value -- it is how to interpret the date/time 
string.  If the input string contains no AM/PM string then the mask should not 
attempt to add it and it needs to be read as 24h time.   For example if the 
value is '2000-01-01 05:00' one should not be able to use a mask to make this 
be interpreted as  '2000-01-01 05:00 PM'.


-- 
To view, visit http://gerrit.cloudera.org:8080/6523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] save

2017-04-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: save
..


Patch Set 1:

I think this should have been squashed.

-- 
To view, visit http://gerrit.cloudera.org:8080/6540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d33f82ed6661e849b2a61f3b6dd2039813bca7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2522: Add doc for sortby() and clustered hints

2017-04-04 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-2522: Add doc for sortby() and clustered hints
..


Patch Set 4:

I heard from Lars (IMPALA-5157) that SORTBY should actually be removed. I guess 
abandoning this review is the wrong thing to do if CLUSTERED is still 
appropriate. I should do another patch set that keeps CLUSTERED and removes 
SORTBY.

-- 
To view, visit http://gerrit.cloudera.org:8080/5655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3c1da9a87ace361b096fa73d8504b2f54e75bed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME MICROS as Impala TIMESTAMP

2017-04-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..


Patch Set 1:

(2 comments)

> Did you mean to update the patch?

Yeah, but I need to iron out a few more things and add a few more tests. I was 
waiting to see if I'd get David's API changes for the read path soon, but it 
sounds like I wont so I'll get this patch cleaned up w/ just the write path for 
the time being.

http://gerrit.cloudera.org:8080/#/c/6526/1//COMMIT_MSG
Commit Message:

PS1, Line 11: Impala's TIMESTAMP type is
> Impala stores TIMESTAMP values internally as... ?
Done


PS1, Line 14: will is
> just "is"?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME MICROS as Impala TIMESTAMP

2017-04-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..


Patch Set 1:

Did you mean to update the patch?

-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] save

2017-04-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6540

Change subject: save
..

save

Change-Id: I9d33f82ed6661e849b2a61f3b6dd2039813bca7c
---
M be/src/runtime/data-stream-test.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/6540/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d33f82ed6661e849b2a61f3b6dd2039813bca7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-04-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7:

> Laurel, doe we still need to tell people how to generate SQL
 > reference when the entire doc can be generated just fine?

Laurel, can you comment on this question?

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-04-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7:

Jim, any more comments?

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4858: add more info to MemLimitExceeded errors

2017-04-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4858: add more info to MemLimitExceeded errors
..

IMPALA-4858: add more info to MemLimitExceeded errors

This improves the usefulness of MemLimitExceeded errors:
- The host name is included.
- The memory left in the query and process limits is included.
- The approach for deciding whether to print the query or process
  MemTracker hierarchy is more robust: we show the query hierarchy
  only when we're closer to the query limit than the process limit.
  Previously if we were near but not over the process limit it
  printed the query output, which was often useless.
- The detailed output is included in the main error message, rather
  than in the "details" or merged errors. This reduces the impact
  of not logging those details - IMPALA-4697.

Note that the output still does not exactly reflect the state of
the world when the memory limit was exceeded because other threads
will concurrently allocate and release memory.

This also refactors the various related methods slightly to remove
RuntimeState::LogMemLimitExceeded() and reduce the number of
different methods that can log memory limit errors.

Sample output:
--
  Memory limit exceeded: Cannot perform aggregation at node with id 1. Failed 
to allocate 252 bytes for intermediate tuple.
  Fragment ab4765ecacca4d01:23edbb080002 could not allocate 252.00 B 
without exceeding limit.
  Error occurred on backend tarmstrong-box:22001 by fragment 
ab4765ecacca4d01:23edbb080002
  Memory left in process limit: 7.94 GB
  Memory left in query limit: 1.02 MB
  Query(ab4765ecacca4d01:23edbb08): Limit=300.00 MB Total=298.98 MB 
Peak=299.98 MB
Fragment ab4765ecacca4d01:23edbb080002: Total=105.32 MB Peak=114.14 MB
  AGGREGATION_NODE (id=1): Total=11.46 MB Peak=11.84 MB
  HDFS_SCAN_NODE (id=0): Total=93.73 MB Peak=102.09 MB
  DataStreamSender (dst_id=2): Total=70.02 KB Peak=86.02 KB
  CodeGen: Total=24.83 KB Peak=1.22 MB
Block Manager: Limit=200.00 MB Total=185.00 MB Peak=186.00 MB
Fragment ab4765ecacca4d01:23edbb080005: Total=185.66 MB Peak=194.66 MB
  AGGREGATION_NODE (id=3): Total=185.48 MB Peak=194.48 MB
  EXCHANGE_NODE (id=2): Total=0 Peak=0
  DataStreamRecvr: Total=160.24 KB Peak=435.45 KB
  DataStreamSender (dst_id=4): Total=688.00 B Peak=688.00 B
  CodeGen: Total=24.28 KB Peak=963.50 KB

Change-Id: Id4ab9d162cf7cd4508bce1efbfccfb4ba97e7355
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
8 files changed, 58 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/6537/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id4ab9d162cf7cd4508bce1efbfccfb4ba97e7355
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[native-toolchain-CR] Ported native-toolchain to work on ppc64le

2017-04-04 Thread Valencia Edna Serrao (Code Review)
Valencia Edna Serrao has posted comments on this change.

Change subject: Ported native-toolchain to work on ppc64le
..


Patch Set 1:

> Valencia, before we continue the discussion of the work on this
 > toolchain patch much further, I think it would be good to know what
 > the overarching plan is for the ppc64le work across Impala,
 > including testing. For example, one major concern is that the
 > current jenkins test infrastructure does not have ppc64le worker
 > nodes, so I don't see how we'll be able to test this. Can you
 > submit a plan for your ppc64le work to the Apache dev@ list so we
 > can have a broader discussion with the community?

I understand your concern, Matthew. We are working on it. I'll initiate the 
thread at Apache dev@ about ppc64le work plan for Impala and its 
native-toolchain, after getting clarity from my IBM managers.

-- 
To view, visit http://gerrit.cloudera.org:8080/6468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7931752ca497bc7a5e3cc574bbb54637f382c72a
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Valencia Edna Serrao 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Valencia Edna Serrao 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler

2017-04-04 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4086: Add benchmark for simple scheduler
..


Patch Set 5:

I updated the TODO in scheduler.h and renamed the scheduler benchmark cc file.

-- 
To view, visit http://gerrit.cloudera.org:8080/4554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler

2017-04-04 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#5).

Change subject: IMPALA-4086: Add benchmark for simple scheduler
..

IMPALA-4086: Add benchmark for simple scheduler

Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/scheduler-benchmark.cc
M be/src/scheduling/scheduler.h
M be/src/util/benchmark.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
6 files changed, 177 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4554/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet

2017-04-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in 
Parquet
..


Patch Set 5:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/5939/5/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java:

Line 169:   private static void analyzeParquetMrWriteZone(Table table,
These cases need tests in AnalyzeDDL


http://gerrit.cloudera.org:8080/#/c/5939/5/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

Line 183: if 
(getTblProperties().containsKey(HdfsTable.TBL_PROP_PARQUET_MR_WRITE_ZONE)) {
These cases need tests in AnalyzeDDL


http://gerrit.cloudera.org:8080/#/c/5939/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 54: import org.apache.impala.common.InternalException;
not needed


http://gerrit.cloudera.org:8080/#/c/5939/5/tests/custom_cluster/test_parquet_timestamp_compatibility.py
File tests/custom_cluster/test_parquet_timestamp_compatibility.py:

Line 25: class TestParquetTimestampCompatibility(CustomClusterTestSuite):
Test needs a comment. In particular, what is covered here and what is covered 
in integration tests not part of Impala tests.


Line 26:   TEST_DB_NAME = 'timezone_test_db'
Why not use the unique_database fixture? That's the best practice.


Line 39: self.client = impalad.service.create_beeswax_client()
I think we already have a self.client from ImpalaTestSuite.


Line 49: 
"/test-warehouse/alltypesagg_hive_13_1_parquet/alltypesagg_hive_13_1.parquet"
We need the appropriate filesystem prefix here. This will break on local fs or 
S3 builds. Take a look at:

filesystem_utils.py get_fs_path()

and how that is used elsewhere


Line 56: self.client.execute('USE ' + self.TEST_DB_NAME)
Why? Better to avoid changing the session state


Line 59: self.client.execute('INVALIDATE METADATA')
why?


Line 61:   def _set_impala_table_timezone(self, timezone):
simplify to pass table name as param


Line 71:   @pytest.mark.execute_serially
How long does this test take? I'm thinking we should only have minimal tests 
for the impalad startup option in a custom cluster test, probably in the 
existing test for convert_legacy_hive_parquet_utc_timestamps. 

The other tests should go into a regular test (subclass of ImpalaTestSuite) and 
run in parallel.


Line 81: statement = '''ALTER TABLE hive_table
Should be an analyzer test in AnalyzeDDL


Line 104: select_from_impala_table = '''SELECT timestamp_col FROM 
impala_table WHERE id = 1'''
Can you think of a way to validate all values in the table in bulk e.g. using 
our existing timestamp conversion functions?
Having a few example values is still good, but we should also test that we are 
internally consistent with out conversion functions.


Line 113: self._set_impala_table_timezone('EST')
Add some comments about the behavior here. We are getting different values for 
the same timezone because the underlying Parquet files were written by 
Hive/Impala.


Line 141: statement = '''ALTER TABLE hive_table
analyzer test is more appropriate (also fix elsewhere)


Line 185:   def test_ddl(self, vector):
We need a test where Hive sets a garbage timezone and Impala throws an error 
when analyzing a query against that table.

We also need a test that shows what happens when a table property timezone is 
set and the convert_legacy_hive_parquet_utc_timestamps option is used. Probably 
good to consolidate with the existing custom cluster test.


-- 
To view, visit http://gerrit.cloudera.org:8080/5939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes