[Impala-ASF-CR] [DOCS] Explain Boost setting needed for 96-bit timestamps

2017-09-11 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: [DOCS] Explain Boost setting needed for 96-bit timestamps
..


Patch Set 2:

Tim and Michael, I added you as reviewers based on your frequent activity in 
be/src/exprs/. If someone else is more familiar with the compilation aspect 
w.r.t. TIMESTAMP, please suggest any other reviewer.

-- 
To view, visit http://gerrit.cloudera.org:8080/7983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4b67cd7762f682c3a054e0d9641080aa51801c83
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5894: [DOCS] Clarify placement of STRAIGHT JOIN hint

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5894: [DOCS] Clarify placement of STRAIGHT_JOIN hint
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/153/

-- 
To view, visit http://gerrit.cloudera.org:8080/8031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cac1afccc132f389b2017ad217fdf7e7b04513a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4670: Introduces RpcMgr class

2017-09-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4670: Introduces RpcMgr class
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7901/6/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

PS6, Line 65: FLAGS_hostname
Do we want this to be the loopback address since this is only a test?

The thrift-server-test listens on the loopback address instead of the 
externally visible hostname, but I'm fine either way.


-- 
To view, visit http://gerrit.cloudera.org:8080/7901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8adb10ae375d7bf945394c38a520f12d29cf7b46
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 17:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7955/17/fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java
File 
fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java:

Line 40:   || size >= 
NativeByteArrayOutputStream.BUFFER_MAX_SIZE_LIMIT) {
Our real allocator would not OOM if the size is big, at least not in this 
place, so let's remove the second check.


Line 60:   private void writeAndCheck(NativeByteArrayOutputStream nboas,
I'd prefer to split this up into writeOk() and writeNotOk(). Otherwise, tests 
may go into the "wrong" code path but still succeed, e.g., because the 
allocation succeeded and updated the bytesWritten correctly, although we 
expected the allocation to fail.


Line 78: NativeByteArrayOutputStream nboas =
The first allocation happens here in the c'tor. We need to test that as well.

Alternatively, you can change the nbaos implementation to not allocate in the 
c'tor (either way is fine with me)


Line 90: nboas = new NativeByteArrayOutputStream(testAllocator);
nit: nbaos


Line 108: 
remove extraneous newines


Line 114: testAllocator = new NativeTestAllocator();
Add a helper function for these smaller tests that creates a new allocator, 
nbaos, array, etc.


http://gerrit.cloudera.org:8080/#/c/7955/17/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java:

Line 81:   public void testBasicSerialization()
We usually call these TestBasicSerialization (first letter is uppercase)


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 6:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/8014/1/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS1, Line 2876: 
don't think you need that


PS1, Line 2878: 
same here


Line 2891:   | UNMATCHED_STRING_LITERAL:l expr:e
nit: extra spaces (see marked as red).


http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS2, Line 273:   KW_UNBOUNDED, KW_UNCACHED, KW_UNION, KW_UPDATE, KW_UPDATE_FN, 
KW_UPSERT, KW_USE,
nit: long line (see vertical separator, that's ok for .test files but for 
everything else, we try to stay < 90).


http://gerrit.cloudera.org:8080/#/c/8014/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

PS6, Line 337: rules.add(BoolTestToCompoundRule.INSTANCE);
Maybe add a brief comment about this rule as well?


PS6, Line 1093: it
nit: them


PS6, Line 1098: For all conjuncts other than
  : // the ones listed below, the method should be a no-op.
remove. Similar comment above (L1082).


http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java:

PS2, Line 26: predicate that tests a boolean value using IS.
"a boolean test predicate."? I think you can also add the grammar spec here 
besides the example.


PS2, Line 33:  it to be executable, i.e., it is illegal to call 
For literals we usually make them static as well; also, remove "_" from the 
name.


PS2, Line 38: ivate
No need for "this". We use "_" in the name to indicate private fields.


PS2, Line 41: super();
: this.isNegated_ =
You can use the addChild() function here, same thing.


PS2, Line 88: 
Will that work with something like "select 1 is true"? For instance, "select 1 
= true" returns true. Type is not boolean but it can be casted into one.


PS2, Line 94:  new AnalysisException("Ex
Similar comment as above. e.g. select 1 is 1;


http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

PS2, Line 29: expressions
nit re terminology: "compound predicate". Technically, it is an expression 
(subclass of Expr) but try to use the more specific term.


http://gerrit.cloudera.org:8080/#/c/8014/6/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

PS6, Line 29: to compound expressions
"a compound predicate."


Line 64: }
Preconditions.checkNotNull(result);


http://gerrit.cloudera.org:8080/#/c/8014/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java:

PS6, Line 578:   
nit: extra space


PS6, Line 580: TestBoolTestPredicate
A few more test cases to consider:
1. bool test predicate in the select list
2. nested bool test predicates, e.g. ((bool_col is true) is true)
3. other bool exprs than bool_col, e.g. function returning bool, case expr 
4. wrap the bool test in an istrue/isfalse function


PS6, Line 586: null
Also use "unknown"


PS6, Line 591: where 1 is true"
This doesn't seem to be consistent with how we handle for instance expr like 1 
= true (we allow that).


PS6, Line 591: AnalysisError
Add a few more negative cases. Example:
1. subquery
2. literals that can't be cast to true/false, e.g. 10 is true


http://gerrit.cloudera.org:8080/#/c/8014/6/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

PS6, Line 168: // TODO: figure out how/if parens are printed to reflect 
expression tree.
Did you figure that out? If so, remove the TODO.


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5250: Unify decompressor output length semantics

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5250: Unify decompressor output_length semantics
..


Patch Set 1:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/8030/1//COMMIT_MSG
Commit Message:

Line 9: This patch makes the semantics output_length parameter in
the semantics of the output_length


Line 10: Codec::ProcessBlock to be the same. In existing code different
Codec::ProcessBlock() to be the same across all codecs. In the existing code 
different decompressors treat output_length differently.


Line 13:to actual decompressed length, but it does not set it to actual
to the actual decompressed length


Line 16:be exactly the same as actual decompressed length, otherwise
as the actual decompressed length


Line 19:actual decompressed length and will set it to actual decompressed
the actual decompressed length


Line 21: This inconsistency leads to a bug where the error message is
This inconsistency lead to a bug where 


Line 24: decompressors can handle oversized output buffer correctly.
can handle an oversized output buffer correctly.


Line 25: Lz4Decompressor will use the "safe" version of decompression function
will use the "safe" instead of the "fast" decompression function


Line 26: instead of the "fast" version, for the latter is insecure with 
corrupted
We use LZ4 for compressing exchange data, so we check the perf impact of these 
changes. I can help you with the details.


Line 27: data and requires decompressed length to be known.
and requires the decompressed length to be known


http://gerrit.cloudera.org:8080/#/c/8030/1/be/src/util/decompress-test.cc
File be/src/util/decompress-test.cc:

Line 153:   // Test the behavior when the decompressor is given too little / 
too much space
"." at end of sentence


Line 155:   // correct output size when the space is enough, and does not write 
beyond the output
the correct output size


Line 157:   void DecompressOverUnderSizedOutputBuffer(Codec *compressor, Codec 
*decompressor,
Out style is "Codec* compressor"


Line 196: }
I think we should test more directly the scenario mentioned in the JIRA where a 
corrupt file can lead to the output_len not being set to 0 properly.


http://gerrit.cloudera.org:8080/#/c/8030/1/be/src/util/decompress.cc
File be/src/util/decompress.cc:

Line 426: // If size_only is false, output buffer size must be at least 
output_len. *output_len is
Mention what output_len is set to if a non-OK status is returned.


Line 427: // also updated with actual output size.
the actual output size


Line 575:   // LZ4_decompress_fast will cause a segmentation fault if passed a 
nullptr output.
Comment still relevant?


Line 579:   if (ret < 0) {
single line


-- 
To view, visit http://gerrit.cloudera.org:8080/8030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4987: Fix flaky test test row availability.py

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4987: Fix flaky test test_row_availability.py
..


Patch Set 1:

(13 comments)

Thanks for fixing this

http://gerrit.cloudera.org:8080/#/c/8036/1//COMMIT_MSG
Commit Message:

Line 9: This patch keeps test_row_availbility from random failure. In this test
Please take a pass to correct wording/grammar. For example, it should be:

"This patch keeps test_rows_availability from randomly failing."

More such examples below.


Line 10: the time interval between 'Rows available' event and the previous event
the 'Rows available' timeline event


Line 11: in runtime profile is measured in order to make sure that rows become
in the runtime profile


Line 12: available after a specific amount of time. This is not correct since
Instead of 'This' try to rephrase/repeat what 'This' refers to for clarity, 
e.g. This measurement is not correct ...


Line 13: the previous event is that the coordinator finishes sending query to
finished sending the query to the backends


Line 14: backends, which means the execution on backend might have already
on some backends might have already started


Line 15: started. This patch tracks another event "ready to start" as the
"Read to start"


Line 17: query to backends after this event so the time check should always 
pass.
the query to backends


http://gerrit.cloudera.org:8080/#/c/8036/1/tests/query_test/test_rows_availability.py
File tests/query_test/test_rows_availability.py:

Line 81: row_avail_time_ms = 
self.__parse_time_ms(self.__find_time(line))
rows_avail_time_ms (not just one row is available)


Line 92: "'ready to start' event.\nExpected the event to be marked no 
earlier than "\
'Ready to start'


Line 93: "%sms after the 'ready to start' event.\nQuery: %s"\
'Ready to start'


Line 99: """Find event time point in a line from runtime timeline."""
from the runtime profile timeline


Line 100: # Example line: "- Rows available: 3s311ms (2s300ms)"
Example should include what this function returns


-- 
To view, visit http://gerrit.cloudera.org:8080/8036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I96142f1868a26426cbc34aa9a0e0a56979df66c3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..


IMPALA-3642: Adding backend addresses to error statuses for some scratch 
failures.

Adds GetBackendAddress() (which is host:port) to error messages stemming
from SCRATCH_LIMIT_EXCEEDED, SCRATCH_READ_TRUNCATED, and
SCRATCH_ALLOCATION_FAILED messages.

Testing:

* Unit tests assert the string is updated for SCRATCH_LIMIT_EXCEEDED
  and SCRATCH_ALLOCATION_FAILED. SCRATCH_READ_TRUNCATED doesn't
  have an existing test, and I didn't add a new one.

* Manually testing a query that spills after "chmod 000 /tmp/impala-scratch":
$ chmod 000 /tmp/impala-scratch
$ impala-shell
[dev:21000] > set mem_limit=100m;
MEM_LIMIT set to 100m
[dev:21000] > select count(*) from tpch_parquet.lineitem join 
tpch_parquet.orders on l_orderkey = o_orderkey;
Query: select count(*) from tpch_parquet.lineitem join tpch_parquet.orders 
on l_orderkey = o_orderkey
Query submitted at: 2017-09-11 11:07:06 (Coordinator: http://dev:25000)
Query progress can be monitored at: 
http://dev:25000/query_plan?query_id=5c48ff8f4103c194:1b40a6c
WARNINGS: Could not create files in any configured scratch directories 
(--scratch_dirs=/tmp/impala-scratch) on backend 'dev:22002'. See logs for 
previous errors that may have prevented creating or writing scratch files.
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied

Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Reviewed-on: http://gerrit.cloudera.org:8080/7816
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/generate_error_codes.py
4 files changed, 13 insertions(+), 9 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..


IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

This change codegen outs a branch in ProcessBuildBatch(). This branch never gets
executed for most of the join types except NULL_AWARE_LEFT_ANTI_JOIN.
The branch itself is not expensive to execute, but it will reduce codegen
time by removing the dead code inside the branch for almost all join
modes.

Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Reviewed-on: http://gerrit.cloudera.org:8080/7849
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
4 files changed, 20 insertions(+), 10 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5894: [DOCS] Clarify placement of STRAIGHT JOIN hint

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5894: [DOCS] Clarify placement of STRAIGHT_JOIN hint
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/8031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cac1afccc132f389b2017ad217fdf7e7b04513a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-09-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#4).

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..

IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_krpc_kinit' that defaults to
true. When it's true, it uses the Kudu security library's kinit code
that programatically uses the krb5 library to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found, however, the
thrift-server-test will fail.

Updated bin/bootstap_development.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
---
M CMakeLists.txt
M be/src/kudu/security/CMakeLists.txt
M be/src/kudu/security/init.cc
M be/src/kudu/security/test/mini_kdc.cc
M be/src/rpc/CMakeLists.txt
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/testutil/gtest-util.h
M bin/bootstrap_development.sh
A cmake_modules/FindKerberosPrograms.cmake
11 files changed, 208 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7938/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4082: Remove todo item in getRegionsInRange

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4082: Remove todo item in getRegionsInRange
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1213/

-- 
To view, visit http://gerrit.cloudera.org:8080/8018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35f0a3cd9363b55b3cde237beaf3037a7967582a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4082: Remove todo item in getRegionsInRange

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4082: Remove todo item in getRegionsInRange
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/8018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35f0a3cd9363b55b3cde237beaf3037a7967582a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4082: Remove todo item in getRegionsInRange

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4082: Remove todo item in getRegionsInRange
..


Patch Set 1: Code-Review+1

Thank you for fixing this.

-- 
To view, visit http://gerrit.cloudera.org:8080/8018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35f0a3cd9363b55b3cde237beaf3037a7967582a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


IMPALA-5912: fix crash in trunc(..., "WW") in release build

The bug is with the pattern below:

  const date& d = TruncMonth(orig_date).date();

The problem is that 'd' is a reference into the temporary returned from
TruncMonth. But the temporary is only guaranteed to live until the
expression has been evaluated. Thus if the compiler re-uses the register
or stack slot holding the temporary, 'd' may end up pointing to a bogus
value, causing a crash or incorrect results.

The fix is to simply create a local date value with the required date,
which avoids use of references to expression temporary and makes the
logic more obviously correct.

Also remove other uses of references to temporary that were correct but
unnecessary given that the function returned a value and C++11
return value optimisation should kick in.

Testing:
Ran expr-test to completion with a release build. Before this fix it
reliably crashed for me.

Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Reviewed-on: http://gerrit.cloudera.org:8080/8015
Reviewed-by: Thomas Tauber-Marshall 
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/udf-builtins.cc
1 file changed, 6 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4987: Fix flaky test test row availability.py

2017-09-11 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8036

Change subject: IMPALA-4987: Fix flaky test test_row_availability.py
..

IMPALA-4987: Fix flaky test test_row_availability.py

This patch keeps test_row_availbility from random failure. In this test
the time interval between 'Rows available' event and the previous event
in runtime profile is measured in order to make sure that rows become
available after a specific amount of time. This is not correct since
the previous event is that the coordinator finishes sending query to
backends, which means the execution on backend might have already
started. This patch tracks another event "ready to start" as the
beginning of time interval instead. The coordinator begins to send
query to backends after this event so the time check should always pass.

Change-Id: I96142f1868a26426cbc34aa9a0e0a56979df66c3
---
M tests/query_test/test_rows_availability.py
1 file changed, 30 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/8036/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I96142f1868a26426cbc34aa9a0e0a56979df66c3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#7).

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..

IMPALA-5890: Abort queries if scanner hits IO errors

Prior to this fix, an error in ScannerContext::Stream::GetNextBuffer()
could leave the stream in an inconsistent state:

- The DiskIoMgr hits EOF unexpected, cancels the scan range and enqueues
a buffer with eosr set.
- The ScannerContext::Stream tries to read more bytes, but since it has
hit eosr, it tries to read beyond the end of the scan range using
DiskIoMgr::Read().
- The previous read error resulted in a new file handle being opened.
The now truncated, smaller file causes the seek to fail.
- Then during error handling, the BaseSequenceScanner calls SkipToSync()
and trips over the NULL pointer in in the IO buffer.

In my reproduction this only happens with the file handle cache enabled,
which causes Impala to see two different sized handles: the one from the
cache when the query starts, and the one after reopening the file.

To fix this, we abort a scan range if it encounters an I/O error.

This change also fixes GetBytesInternal() to maintain the invariant that
the output buffer points to the boundary buffer whenever the latter
contains some data.

I tested this by running the repro from the JIRA and impalad did not
crash but aborted the queries. I also ran the repro with
abort_on_error=1, and with the file handle cache disabled.

Text files are not affected by this problem, since the
text scanner doesn't try to recover from errors during ProcessRange()
but wraps it in RETURN_IF_ERROR instead. With this change queries abort
with the same error.

Parquet files are also not affected since they have the metadata at the
end. Truncated files immediately fail with this error:
WARNINGS: File 
'hdfs://localhost:20500/test-warehouse/tpch.partsupp_parquet/foo.0.parq'
has an invalid version number: 

Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/runtime-state.cc
M common/thrift/generate_error_codes.py
7 files changed, 84 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8011/7
-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8011/6/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 456: status = Status(GetHdfsErrorMsg("Error reading from HDFS 
file: ", file_));
> Missed this one.
Done


Line 468:   status = Status(ss.str());
> Missed this one.
Done


Line 476:   status = Status(GetHdfsErrorMsg("Error reading from 
HDFS file: ", file_));
> Missed this one.
Done


http://gerrit.cloudera.org:8080/#/c/8011/6/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 544: return Status(Substitute("Invalid scan range.  Bad disk id: $0", 
disk_id));
> We should treat these as non-recoverable too I think.
Should we re-use DISK_IO_ERROR for those, even though they're technically not 
IO errors? Or should we create a new error code for them?


Line 1167: ret_status = Status(ErrorMsg(TErrorCode::RUNTIME_ERROR,
> Any reason not to change these too?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW: IMPALA-3437: DECIMAL V2: avoid implicit decimal->double conversion

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: PREVIEW: IMPALA-3437: DECIMAL_V2: avoid implicit 
decimal->double conversion
..

PREVIEW: IMPALA-3437: DECIMAL_V2: avoid implicit decimal->double conversion

This changes the behaviour of applying an arithmetic operator to
constant DECIMAL and non-DECIMAL arguments. In DECIMAL_V1, this
caused an implicit conversion to floating point, which caused
users a lot of confusion in some cases. In DECIMAL_V2 the typing
rules are simplified: constant decimals are treated the same as any
other decimals.

Testing:
Added some expression tests for different arithmetic operators
and binary predicates (the two Expr subclasses that call
convertNumericLiteralsFromDecimal()).

Extended analyzer tests to test DECIMAL_V2 behaviour.

Ran core tests.

TODO:
* add test cases from the JIRA including explicit casts
* add tests for scientific notation to address coverage gap

Change-Id: Ie419a75784eec2294947103e6e1465dfadfc29da
---
M be/src/exprs/expr-test.cc
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
3 files changed, 130 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7916/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie419a75784eec2294947103e6e1465dfadfc29da
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] PREVIEW: IMPALA-3437: DECIMAL V2: avoid implicit decimal->double conversion

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: PREVIEW: IMPALA-3437: DECIMAL_V2: avoid implicit 
decimal->double conversion
..

PREVIEW: IMPALA-3437: DECIMAL_V2: avoid implicit decimal->double conversion

This changes the behaviour of applying an arithmetic operator to
constant DECIMAL and non-DECIMAL arguments. In DECIMAL_V1, this
caused an implicit conversion to floating point, which caused
users a lot of confusion in some cases. In DECIMAL_V2 the typing
rules are simplified: constant decimals are treated the same as any
other decimals.

Testing:
Added some expression tests for different arithmetic operators
and binary predicates (the two Expr subclasses that call
convertNumericLiteralsFromDecimal()).

Extended analyzer tests to test DECIMAL_V2 behaviour.

Ran core tests.

TODO: add tests for scientific notation to address coverage gap

Change-Id: Ie419a75784eec2294947103e6e1465dfadfc29da
---
M be/src/exprs/expr-test.cc
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
3 files changed, 130 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7916/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie419a75784eec2294947103e6e1465dfadfc29da
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2636: HS2 GetTables() returns TABLE TYPE as TABLE for VIEW

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2636: HS2 GetTables() returns TABLE_TYPE as TABLE for 
VIEW
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7353/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java
File fe/src/main/java/org/apache/impala/service/MetadataOp.java:

Line 62:   private static final String TABLE_TYPE_VIEW = "VIEW";
> Please check if this is consistent with what Hive returns. Does Hive return
In any case I think returning "VIEW" is better, but it would be good to 
document the Impala/Hive discrepancy in a comment somewhere if Hive returns 
something different


-- 
To view, visit http://gerrit.cloudera.org:8080/7353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I90616388e6181cf342b3de389af940214ed46428
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2636: HS2 GetTables() returns TABLE TYPE as TABLE for VIEW

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2636: HS2 GetTables() returns TABLE_TYPE as TABLE for 
VIEW
..


Patch Set 1:

(10 comments)

Nice, thanks for working on this one

http://gerrit.cloudera.org:8080/#/c/7353/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java
File fe/src/main/java/org/apache/impala/service/MetadataOp.java:

Line 62:   private static final String TABLE_TYPE_VIEW = "VIEW";
Please check if this is consistent with what Hive returns. Does Hive return 
"VIRTUAL_VIEW"?


Line 63:   private static final String TABLE_TYPE_INDEX_TABLE = "INDEX_TABLE";
Why not just "INDEX"?
(Not that it really matters too much since Impala cannot load these anyway)


Line 298: if (table.getMetaStoreTable() != null) {
Why this null check? Loaded tables should really have this set.


Line 321: if(typeStr == null) {
single line and space after "if"


Line 324: typeStr = typeStr.toUpperCase();
Is this really needed? If so, please Inline into L326.


Line 328:   case EXTERNAL_TABLE:
indent case by 2


Line 338: } catch (Exception e) {
move this up so it's clear the switch block cannot throw


Line 495: // Impala catalog only contains TABLE. Returns an empty set if 
the search does not
This should be fixed as well.


http://gerrit.cloudera.org:8080/#/c/7353/1/fe/src/test/java/org/apache/impala/service/FrontendTest.java
File fe/src/test/java/org/apache/impala/service/FrontendTest.java:

Line 221:   public void TestGetView() throws ImpalaException {
We should test two cases:
1. View that is not loaded is shown as a table
2. View that is loaded is shown as a view

For testing the first case you should be able to use addTestDb() and 
addTestView()

For testing the second case, add a dummy AnalyzesOk("select * from 
functional.alltypes_view") to make sure the view is really loaded at this point 
- otherwise the test relies on the view already being loaded and might be flaky


Line 226: req.get_tables_req.setSchemaName("functional");
No need to test tables here


-- 
To view, visit http://gerrit.cloudera.org:8080/7353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I90616388e6181cf342b3de389af940214ed46428
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5538: Use explicit catalog versions for deleted objects

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change.

Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

PS2, Line 233:  0, if they have then skip this step and use that data
nit: perhaps easier to read? "...version 0. If so, then skip ..."


PS2, Line 336: if (topic_deletions) {
 :   VLOG(1) << "Publishing deletion: " << entry_key << "@"
 :   << catalog_object.catalog_version;
 : } else {
 :   VLOG(1) << "Publishing update: " << entry_key << "@"
 :   << catalog_object.catalog_version;
 : }
perhaps remove some redundancy this way:
VLOG(1) << "Publishing " 
<< (topic_deletions ? "deletion" : "update") 
<< ": " << entry_key << "@" 
<< catalog_object.catalog_version;


http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS2, Line 1366: if (item.deleted) {
  : VLOG(3) << "Deleted item: " << item.key << " version: "
  : << catalog_object.catalog_version;
  :   } else {
  : VLOG(3) << "Added item: " << item.key << " version: "
  : << catalog_object.catalog_version;
  :   }
same suggestion as in catalog_server to reduce redundancy.


http://gerrit.cloudera.org:8080/#/c/7731/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS2, Line 152: .
the word "delta" implies that we're dealing with inserts, deletes, and possibly 
updates. any reason why this member is named more generally? why not call it 
deleteLog_?


-- 
To view, visit http://gerrit.cloudera.org:8080/7731
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8015/1//COMMIT_MSG
Commit Message:

PS1, Line 23: Also remove other uses of references to temporary that were 
correct but
: unnecessary given that the function returned a value and C++11
: return value optimisation should kick in.
> Yeah, i wasn't trying to express an opinion. just noting that it's a common
No worries, I didn't take it that way - I was just thinking that some people 
must like the pattern since it's been repeatedly used.


-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-09-11 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS1, Line 215:  MemTracker* decoder_mem_tracker() { return 
decoder_mem_tracker_.get(); }
The dictionary is only used for Parquet, so I think the MemPool should not be 
in the ExecNode, as that is used for a large number of other things. Look into 
moving this down to HdfsParquetScanner (or reuse dictionary_pool_).


http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS1, Line 213: dict_decoder_(new 
MemPool(parent->scan_node_->decoder_mem_tracker())),
It is important not to create objects on a per-column basis unless truly 
necessary. I think it makes more sense to have a single MemPool up at the 
HdfsParquetScanner level. Look at how dictionary_pool_ works. I think it might 
be better to reuse that pool.


http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

PS1, Line 234: *val_ptr = *dict_[index];
dict_ needs to return T's directly rather than T*'s. This is a performance 
critical path, and an extra dereference is too expensive.


-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-11 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#17).

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..

IMPALA-5881: Use native allocation while building catalog updates

This patch moves the allocation of thrift structures for serializing
the output of GetAllCatalogObjects() call into the native side. This
is done to prevent the Catalog from hitting JVM array limitations
(2GB maximum size) at scale. Additionally this patch also extends the
--compact_catalog_top=true to apply TCompactProtocol for the above
serialization to reduce the in-memory footprint.

This patch also caps the native allocations to not go beyond 4GB due
to Thrift library limitations. (Thrift internally uses uint32_t
datatype to represent the message size which limits the size to
~4.2GB).

Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive.

Deployed the patch on a 16 node cluster and tested it on a
Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed).

Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
---
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/rpc/jni-thrift-util.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/thrift/NativeAllocator.java
A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
A fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java
A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
12 files changed, 716 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/17
-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

2017-09-11 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
..

IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0.
The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
17 files changed, 1,804 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN resourcemanager

2017-09-11 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8035

Change subject: IMPALA-5920: Remove admission control dependency on YARN 
resourcemanager
..

IMPALA-5920: Remove admission control dependency on YARN resourcemanager

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0.
The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A 
common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
17 files changed, 1,804 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1212/

-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1211/

-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

2017-09-11 Thread anujphadke (Code Review)
anujphadke has uploaded a new patch set (#7).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
..

IMPALA-4848: Add WIDTH_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 169 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-09-11 Thread Pranay Singh (Code Review)
Pranay Singh has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8034

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..

IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

Currently DictDecoder class and DictEncoder class uses std::vector
to store the tables mapping codeword to value and vice-versa. It is
hard to detect the memory usage by these tables when they becomes
very large, since this memory is not accounted by Impala's memory
mangement infrastructure.

This patch introduces a new MemTracker to track the memory used
by std::vector in DictDecoder class, whereas an existing memory pool
for DictEncoder class is used when allocating memory for std::vector
in DictEncoder class.

Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
---
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
5 files changed, 52 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8011/6/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 456: status = Status(GetHdfsErrorMsg("Error reading from HDFS 
file: ", file_));
Missed this one.


Line 468:   status = Status(ss.str());
Missed this one.


Line 476:   status = Status(GetHdfsErrorMsg("Error reading from 
HDFS file: ", file_));
Missed this one.


http://gerrit.cloudera.org:8080/#/c/8011/6/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 544: return Status(Substitute("Invalid scan range.  Bad disk id: $0", 
disk_id));
We should treat these as non-recoverable too I think.


Line 1167: ret_status = Status(ErrorMsg(TErrorCode::RUNTIME_ERROR,
Any reason not to change these too?


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-11 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 16:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/7955/16/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 120:   // Now that we deserialize the thrift objects into 
TGetAllCatalogObjectsResponse,
> deserialized (past tense)
Done


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 84:   // flag.
> remove this line
Done


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 61:   " maximum limit of " + BUFFER_MAX_SIZE_LIMIT + " bytes";
> remove first space in string
Done


Line 99:* - bufferPtr_>=0
> use consistent spacing, i.e. bufferPtr_ >= 0
Done


Line 102: Preconditions.checkState(bufferPtr_ >= 0);
> remove (documenting in comment is better here)
Done


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java
File 
fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java:

Line 28:   // Custom NativeAllocator that accounts the allocated and freed 
bytes and randomly
> the allocated/freed bytes and simulates allocation failures
Done


Line 39:   // Throws an OOM on every alternate call to reallocate().
> on every 2nd call
Done


Line 51:   allocatedBytes_ = 0;
> single line
Done


Line 87: while (testAllocator_.getAllocationFailures() < 10) {
> I still don't think this is a great approach because it does not reflect re
Makes sense. Deterministic tests are always better and the test combinations 
are small too.


Line 91: writeAndCheck(b, 1, 0);
> These should be tested from a fresh Nbaos and not from the one that is alre
yea I added a new nboas for all these set of tests.


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java:

Line 58:   // Deserialize the object at result.buffer_ptr and we confirm it 
is the same
> remove "we"
Done


Line 119: Assert.assertTrue(nativeSerializer.getSerializedBytesSize() >= 
3584 * 1024 * 1024);
> Isn't there an assertGe() or something like that? Also print a message with
Yea, I didn't find it.


Line 137:   e.printStackTrace();
> remove
Done


Line 166:   testBasicSerialization(factory);
> Let's make these separate top-level tests, if's fine to repeat the loop ove
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5538: Use explicit catalog versions for deleted objects

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects
..


Patch Set 2:

(12 comments)

Change is looking pretty good to me, only minor comments left.

http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

Line 305:   BuildTopicUpdatesHelper(catalog_objects.updated_objects, false);
I like the previous version more where this is inlined into L292. A quick 
comment about the requirements/structure of a topic update would be nice, e.g. 
stating that the set of deletions and updates are disjoint and therefore the 
order in which updates/deletions are added is irrelevant.


Line 317: if (catalog_object.catalog_version <= last_sent_catalog_version_) 
continue;
Move this above L312?


http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1340: LOG(INFO) << "Received large catalog update(>100mb): "
Received a large catalog item?


http://gerrit.cloudera.org:8080/#/c/7731/2/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

Line 86:   3: required bool deleted = false;
Isn't it better to not have a default value to make sure this is always set?


http://gerrit.cloudera.org:8080/#/c/7731/2/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java:

Line 56:  *   determine which catalog objects were deleted since the last 
catalog update topic.
catalog topic update


Line 57:  *   Once the catalog update topic is constructed, the old deleted 
catalog objects are
catalog topic update


Line 74:   // Retrieve all the removed catalog objects with version > 
'fromVersion'.
/** style comment


http://gerrit.cloudera.org:8080/#/c/7731/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 347:   // Identify the catalog objects that were updated (added/dropped) 
in the catalog with
/** style comment


Line 351: Set addedCatalogObjectKeys = Sets.newHashSet();
updatedCatalogObjects?


Line 363: TCatalogObject catalogTbl = new 
TCatalogObject(TCatalogObjectType.TABLE,
Why not just iterate over db.getTables()?


Line 441: 
addedCatalogObjectKeys.add(CatalogDeltaLog.toCatalogObjectKey(privilege));
Instead of having a duplicate add() in all places, how about creating the 
addedCatalogObjets set right before L448 by looping over the 
addedCatalogObjects? Seems only marginally less efficient, but less prone to 
forgetting an add() somewhere.


Line 448: for (TCatalogObject removedObject: 
deltaLog_.retrieveObjects(fromVersion)) {
Nice! Much clearer.


-- 
To view, visit http://gerrit.cloudera.org:8080/7731
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG
Commit Message:

Line 75:   DECIMAL_V2 disabled: 4.25s
> Running perf top showed that most of the time is spent in ScaleDownAndRound
Yeah that seems likely - the faster everything else becomes the more this would 
be a bottleneck.

I don't see an obvious way to speed this up (except maybe avoiding the 
redundant divide/mod calls?) and I guess it only makes a difference if a lot of 
expressions are overflowing. Maybe someone with deeper knowledge of decimal 
might see a way to improve it.


http://gerrit.cloudera.org:8080/#/c/7438/4/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 204:   if (round) {
It's unfortunate that we're not computing the quotient and remainder at the 
same time. It looks like boost eventually calls divide_unsigned_helper in 
multiprecision/cpp_int/divide.hpp with the same arguments in both cases but 
throws away either the result or remainder.

If avoiding two divide calls would help performance we could consider calling 
divide_unsigned_helper() directly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7849/1/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

Line 280:   /// Free local allocations made from expr evaluators during hash 
table construction.
> Can you add a short description of the parameter?
Done


PS1, Line 477: 
> nit: parameter name doesn't match other functions
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

2017-09-11 Thread anujphadke (Code Review)
anujphadke has uploaded a new patch set (#2).

Change subject: IMPALA-3897 Codegen null-aware constant in 
PHJ::ProcessBuildBatch()
..

IMPALA-3897 Codegen null-aware constant in PHJ::ProcessBuildBatch()

This change codegen outs a branch in ProcessBuildBatch(). This branch never gets
executed for most of the join types except NULL_AWARE_LEFT_ANTI_JOIN.
The branch itself is not expensive to execute, but it will reduce codegen
time by removing the dead code inside the branch for almost all join
modes.

Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
4 files changed, 20 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/7849/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06acbebc9d2d23bef4734b480a5d3ce41680ea70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec

2017-09-11 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and 
compression codec
..


Patch Set 7:

I tested this locally on my machine with a cluster size = 1.
I was able to recreate this locally on starting the impala minicluster with 
size=3.
I have a fix for this which I am testing right now.
Spoke to Tim about it. Sorry forgot to update it here.

-- 
To view, visit http://gerrit.cloudera.org:8080/7245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8015/1//COMMIT_MSG
Commit Message:

PS1, Line 23: Also remove other uses of references to temporary that were 
correct but
: unnecessary given that the function returned a value and C++11
: return value optimisation should kick in.
> I guess there's an element of taste to it. It definitely gets confusing.
Yeah, i wasn't trying to express an opinion. just noting that it's a common 
pattern.


-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and 
compression codec
..


Patch Set 7:

Ping?

-- 
To view, visit http://gerrit.cloudera.org:8080/7245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..


Patch Set 2:

Yeah I think there are some fundamental problems with the current parallel 
startup logic - I don't think it's actually possible to make all cases work 
with the current approach where all state is receiver-side and there's no 
additional communication (point-to-point or via the coordinator) to 
disambiguate hangs and delays.

IMPALA-3990

-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

Line 129: /// In exceptional circumstances, the data stream manager will 
garbage-collect the closed
There's a pre-existing flaw in the reasoning here that we should call out. 
"Exceptional circumstances" is vague and I think hides a distinction between an 
unhealthy cluster with extreme delays and the expected behaviour of certain 
long-running queries. I think the problem is an invalid assumption that the the 
receiver sends batches on a regular cadence with a bounded delay before the 
first batch is sent and when each subsequent batch is sent. That assumption is 
incorrect. I think we should call it out in this comment so that readers 
understand the current flaw. Here's an example where it's wrong.

Consider a plan with three fragments.

  F1 (long-running)
   |
   V
  F2 (limit = 1 on exchange)
   |
   V
  F3 (long-running selective scan)

1. The fragments all start up.
2. Instance 1 of F3 immediately finds and returns a matching row, which is sent 
to F2.
3. This causes F2 to hit its limit, close its exchange and tear itself down.
4. Let's assume F1 also has a lot of work to do and won't finish for 20 minutes
5. Instance 2 of F3 is still churning away on the scan. After 10 minutes it 
finally find a matching row.
6. F3 tries to send the row, can't find the receiver after a timeout and 
returns an error to the coordinator
7. The coordinator cancels the query and returns an error

There are two problems here:
1. The query failed when it shouldn't have
2. F3 wasn't cancelled when it was no longer needed and used lots of resources 
unnecessarily.

The JIRA is IMPALA-3990. I believe that the main reason we haven't seen this in 
practice is that it can only occur when there's a limit without order in a 
subquery. Most queries with that property are non-deterministic and it doesn't 
really make a lot of sense to have a long-running query that returns 
non-deterministic results.

But this actually blocked me from implementing early-close for joins with empty 
build sides, which is a nice optimisations.

There may also be a slightly different invalid assumption that the time between 
the receiver closing the exchange and the sender sending its last batch is 
bounded. That seems possible to solve with sender-side state if the receiver 
notifies the sender that the receiver was not present and the sender can infer 
it was closed cleanly.


-- 
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#5).

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
an int256 intermediate value, which we then attempt to scale down and
round to int128.

Benchmarks:

In the following benchmarks, we are selecting from lineitem_big, which
has about 100 times as many rows as the normal lineitem.

Query:
  select
sum(l_quantity * l_tax) +
sum(l_extendedprice * l_discount)
  from lineitem_big;

Before:
  DECIMAL_V2 disabled: 2.24s
  DECIMAL_V2 enabled : 2.14

After:
  DECIMAL_V2 disabled: 2.66s
  DECIMAL_V2 enabled : 2.25s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.34s
  DECIMAL_V2 enabled : 2.36s

After:
  DECIMAL_V2 disabled: 4.25s
  DECIMAL_V2 enabled : 4.15s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.84s
  DECIMAL_V2 enabled : 8.26s

After:
  DECIMAL_V2 disabled: 69.16s
  DECIMAL_V2 enabled : 66.13s

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 282 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4978: Impala should set the kerberos principal to the FQDN

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4978: Impala should set the kerberos principal to the 
FQDN
..


Patch Set 1:

Do we want to move forward with this?

-- 
To view, visit http://gerrit.cloudera.org:8080/6260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1c94f36cb3493afdfcf84f8b31b9897404bd095f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5667: Race in DataStremSender could cause TransmitData sidecar corruption

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5667: Race in DataStremSender could cause TransmitData 
sidecar corruption
..


Patch Set 1:

Is this something we want to move forward?

-- 
To view, visit http://gerrit.cloudera.org:8080/7436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7742551236205c49393a7351291eae6aefb4fa09
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4939: Decimal V2 multiplication

2017-09-11 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
> The same JIRA is listed twice.
Done


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 301:   // The following is frought with apparent difficulty, as there 
is only 1 bit
> It seems most efficient development-wise if we give this a try now while we
Implemented checking the leading zeros optimization. It doesn't seem to have a 
big impact on the performance of the queries I tried.


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#5).

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
an int256 intermediate value, which we then attempt to scale down and
round to int128.

Benchmarks:

In the following benchmarks, we are selecting from lineitem_big, which
has about 100 times as many rows as the normal lineitem.

Query:
  select
sum(l_quantity * l_tax) +
sum(l_extendedprice * l_discount)
  from lineitem_big;

Before:
  DECIMAL_V2 disabled: 2.24s
  DECIMAL_V2 enabled : 2.14

After:
  DECIMAL_V2 disabled: 2.66s
  DECIMAL_V2 enabled : 2.25s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.34s
  DECIMAL_V2 enabled : 2.36s

After:
  DECIMAL_V2 disabled: 4.25s
  DECIMAL_V2 enabled : 4.15s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.84s
  DECIMAL_V2 enabled : 8.26s

After:
  DECIMAL_V2 disabled: 69.16s
  DECIMAL_V2 enabled : 66.13s

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 282 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8015/1//COMMIT_MSG
Commit Message:

PS1, Line 23: Also remove other uses of references to temporary that were 
correct but
: unnecessary given that the function returned a value and C++11
: return value optimisation should kick in.
> okay with me (if RVO is applying so no perf impact), but the const-ref-to-t
I guess there's an element of taste to it. It definitely gets confusing.


-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..


Patch Set 2: Code-Review+1

Making these cases consistent makes sense to me.  Really, I think we need to 
eventually rethink the whole parallel startup stuff, especially now that we 
have per-query execrpc, etc. It just seems too fragile.  But in the mean time, 
I see no reason the eos and non-eos case inconsistent.

-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS5, Line 177: if (status.IsCancelled() || status.IsMemLimitExceeded()) 
return status;
 : 
 : // Log error from file format parsing.
 : 
state_->LogError(ErrorMsg(TErrorCode::SEQUENCE_SCANNER_PARSE_ERROR,
 : stream_->filename(), stream_->file_offset(),
 : (stream_->eof() ? "(EOF)" : "")));
 : 
 : // Make sure errors specified in the status are logged as 
well
 : state_->LogError(status.msg());
 : 
 : // If abort on error then return, otherwise try to recover.
 : if (state_->abort_on_error()) return status;
 : 
 : // Abort scan range for I/O related errors
 : if (status.code() == 
TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR) {
 :   eos_ = true;
 :   return Status::OK();
 : }
> it does seem like this could be written in terms of RuntimeState::LogOrRetu
It looks to me as if the query will be aborted if return a non-OK status here. 
ProcessSplit() will return that status and it will be handled in 
hdfs-scan-node.cc:441. This will call SetDoneInternal() and the scanners will 
stop. Setting eos_ here will make stop ProcessSplit() but make it return 
Status::OK().

if (!status.ok()) {
  unique_lock l(lock_);
  // If there was already an error, the main thread will do the cleanup
  if (!status_.ok()) break;

  if (status.IsCancelled() && done_) {
// Scan node initiated scanner thread cancellation.  No need to do 
anything.
break;
  }
  // Set status_ before calling SetDone() (which shuts down the 
RowBatchQueue),
  // to ensure that GetNextInternal() notices the error status.
  status_ = status;
  SetDoneInternal();
  break;
}

That being said, I changed the code to abort the query.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

PS5, Line 194: Internal "
> would be best to keep "Internal error" on the same line so that it's more e
Done


Line 284:   io_buffer_bytes_left_ = 0;
> Maybe add a short comment explaining why this is necessary?
Done, thx for the suggestion.


Line 371:   Status wrapped_status(TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR, 
status.msg().msg());
> Maybe reference IMPALA-4697 for context on why we're doing this instead of 
Removed the code altogether.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 277: Status WrapIoError(const Status& status);
> Is there a simple thing we can start with for this patch (rather than intro
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1210/

-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#6).

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..

IMPALA-5890: Abort queries if scanner hits IO errors

Prior to this fix, an error in ScannerContext::Stream::GetNextBuffer()
could leave the stream in an inconsistent state:

- The DiskIoMgr hits EOF unexpected, cancels the scan range and enqueues
a buffer with eosr set.
- The ScannerContext::Stream tries to read more bytes, but since it has
hit eosr, it tries to read beyond the end of the scan range using
DiskIoMgr::Read().
- The previous read error resulted in a new file handle being opened.
The now truncated, smaller file causes the seek to fail.
- Then during error handling, the BaseSequenceScanner calls SkipToSync()
and trips over the NULL pointer in in the IO buffer.

In my reproduction this only happens with the file handle cache enabled,
which causes Impala to see two different sized handles: the one from the
cache when the query starts, and the one after reopening the file.

To fix this, we abort a scan range if it encounters an I/O error.

This change also fixes GetBytesInternal() to maintain the invariant that
the output buffer points to the boundary buffer whenever the latter
contains some data.

I tested this by running the repro from the JIRA and impalad did not
crash but aborted the queries. I also ran the repro with
abort_on_error=1, and with the file handle cache disabled.

Text files are not affected by this problem, since the
text scanner doesn't try to recover from errors during ProcessRange()
but wraps it in RETURN_IF_ERROR instead. With this change queries abort
with the same error.

Parquet files are also not affected since they have the metadata at the
end. Truncated files immediately fail with this error:
WARNINGS: File 
'hdfs://localhost:20500/test-warehouse/tpch.partsupp_parquet/foo.0.parq'
has an invalid version number: 

Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/runtime-state.cc
M common/thrift/generate_error_codes.py
7 files changed, 66 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8011/6
-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5597: Check predicate children types when building runtime filter plan

2017-09-11 Thread Tianyi Wang (Code Review)
Tianyi Wang has abandoned this change.

Change subject: IMPALA-5597: Check predicate children types when building 
runtime filter plan
..


Abandoned

Sorry. Not ready.

-- 
To view, visit http://gerrit.cloudera.org:8080/7949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I0d66673e280ce13cd3a514236c0c2ecbc50475a6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8015/1//COMMIT_MSG
Commit Message:

PS1, Line 23: Also remove other uses of references to temporary that were 
correct but
: unnecessary given that the function returned a value and C++11
: return value optimisation should kick in.
okay with me (if RVO is applying so no perf impact), but the 
const-ref-to-temporary is a pattern we use broadly especially for 
TimestampValue (for better or worse).


-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5597: Check predicate children types when building runtime filter plan

2017-09-11 Thread Tianyi Wang (Code Review)
Tianyi Wang has restored this change.

Change subject: IMPALA-5597: Check predicate children types when building 
runtime filter plan
..


Restored

Reopen since a new fix is implemented.

-- 
To view, visit http://gerrit.cloudera.org:8080/7949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: restore
Gerrit-Change-Id: I0d66673e280ce13cd3a514236c0c2ecbc50475a6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#5).

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..

IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

Adds a new expression to represent the following boolean predicate:
 IS [NOT] (TRUE | FALSE | UNKNOWN)
The expression is rewritten into compound expressions using
negation, is null, and equality. As a result, there are no BE
operators to eval this predicate.

Testing:
- Frontend: parser, analyzer, rewriter, tosql
- EndToEnd query expressions

Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
A fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 337 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8014/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 6:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 2933:   {: RESULT = p; :}
> It's a little weird that "IS NULL" is handled this way and "IS TRUE" is han
they're separate predicates in the standard (is null vs. is boolean)so I've 
kept them separate in the grammar as well. one option to make things more 
uniform here is to put the not handling for is null in a separate rule. open to 
other suggestions as well to make this clearer.

for null vs unknown, I've kept them separate since the rule for boolean places 
more restrictions on type of the lhs. afaik, unknown expects that the lhs type 
is null or boolean but when checking null, the lhs can be any type or null. for 
example, if you test a boolean column with unknown or null in postgres, it 
works. switch the column's type to int, and is null type checks but is unknown 
does not.


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS3, Line 1583: 
> whitespace
Done


PS3, Line 3034: 
> whitespace, here and below
Done


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java:

PS3, Line 26: Class describing a predicate that tests a boolean value using IS.
:  * The form of a bool test 
> You should explain more thoroughly what this expr actually does, eg. the ha
more detailed comment was in the rewrite.. makes more sense here so moved it.


PS3, Line 28: s a bool or null and it returns a bool
:  * (much like IS [NO
> weird line wrapping here
Done


PS3, Line 33: l
> We don't use trailing underscores on constant values.
Done


PS3, Line 94:   thro
> Preconditions.checkState
Done


http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

Line 29:  * Rewrites a bool test predicate to compound expressions.
> Please consider adding the rewrite rule (i.e., line 50 and 47) up here to t
Done


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

PS3, Line 39: 
: public class BoolT
> single line
Done


PS3, Line 64:   }
:
> single line
Done


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

PS3, Line 164: 
> whitespace
Done


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
File fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java:

Line 1257: // BoolTestPredicate.
> modify one of these tests to use "NOT"
added a test using not, but I'm not sure if I've captured what you're after.


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#6).

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..

IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

Adds a new expression to represent the following boolean predicate:
 IS [NOT] (TRUE | FALSE | UNKNOWN)
The expression is rewritten into compound expressions using
negation, is null, and equality. As a result, there are no BE
operators to eval this predicate.

Testing:
- Frontend: parser, analyzer, rewriter, tosql
- EndToEnd query expressions

Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
A fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 337 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8014/6
-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#4).

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..

IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

Adds a new expression to represent the following boolean predicate:
 IS [NOT] (TRUE | FALSE | UNKNOWN)
The expression is rewritten into compound expressions using
negation, is null, and equality. As a result, there are no BE
operators to eval this predicate.

Testing:
- Frontend: parser, analyzer, rewriter, tosql
- EndToEnd query expressions

Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
A fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 336 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8014/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS3, Line 1583:   
whitespace


PS3, Line 3034:  
whitespace, here and below


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java:

PS3, Line 26: Class describing a predicate that tests a boolean values using IS.
:  * Example: (1 < 1) IS TRUE
You should explain more thoroughly what this expr actually does, eg. the 
handling of NULLs.


PS3, Line 28: a
:  * CompoundPredicate
weird line wrapping here


PS3, Line 33: _
We don't use trailing underscores on constant values.


PS3, Line 94: assert
Preconditions.checkState


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

PS3, Line 39: if (!(expr instanceof BoolTestPredicate))
:   return expr;
single line


PS3, Line 64: private BoolTestToCompoundRule() {
:   }
single line


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

PS3, Line 164:   
whitespace


http://gerrit.cloudera.org:8080/#/c/8014/3/fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
File fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java:

Line 1257: // BoolTestPredicate.
modify one of these tests to use "NOT"


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 2933:   {: RESULT = p; :}
It's a little weird that "IS NULL" is handled this way and "IS TRUE" is handled 
in line 3034. (And even more weird that IS NULL and IS UNKOWN are handled 
differently.)


http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java:

Line 29:  * Rewrites a bool test predicate to compound expressions. The form of 
a bool
Please consider adding the rewrite rule (i.e., line 50 and 47) up here to the 
documentation.


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS2, Line 273: KW_UPDATE,
> Treating "unknown" as a keyword conflicts with kudu specification: encoding
Backed out the approach where "unknown" is a keyword. That lets us not require 
that "unknown" be reserved, and does not require any changes for the existing 
kudu spec.


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#3).

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..

IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

Adds a new expression to represent the following boolean predicate:
 IS [NOT] (TRUE | FALSE | UNKNOWN)
The expression is rewritten into compound expressions using
negation, is null, and equality. As a result, there are no BE
operators to eval this predicate.

Testing:
- Frontend: parser, analyzer, rewriter, tosql
- EndToEnd query expressions

Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/BoolTestPredicate.java
A fe/src/main/java/org/apache/impala/rewrite/BoolTestToCompoundRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 330 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8014/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5894: [DOCS] Clarify placement of STRAIGHT JOIN hint

2017-09-11 Thread John Russell (Code Review)
John Russell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8031

Change subject: IMPALA-5894: [DOCS] Clarify placement of STRAIGHT_JOIN hint
..

IMPALA-5894: [DOCS] Clarify placement of STRAIGHT_JOIN hint

"immediately after the SELECT keyword" was mentioned in a
few places for STRAIGHT_JOIN. I reworded all instances to
mention that [DISTINCT | ALL] can also come before the
hint name.

Change-Id: I3cac1afccc132f389b2017ad217fdf7e7b04513a
---
M docs/shared/impala_common.xml
M docs/topics/impala_hints.xml
M docs/topics/impala_perf_joins.xml
3 files changed, 12 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8031/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3cac1afccc132f389b2017ad217fdf7e7b04513a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 16:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/7955/16/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 120:   // Now that we deserialize the thrift objects into 
TGetAllCatalogObjectsResponse,
deserialized (past tense)


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 84:   // flag.
remove this line


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 61:   " maximum limit of " + BUFFER_MAX_SIZE_LIMIT + " bytes";
remove first space in string


Line 99:* - bufferPtr_>=0
use consistent spacing, i.e. bufferPtr_ >= 0


Line 102: Preconditions.checkState(bufferPtr_ >= 0);
remove (documenting in comment is better here)


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java
File 
fe/src/test/java/org/apache/impala/thrift/NativeByteArrayOutputStreamTest.java:

Line 28:   // Custom NativeAllocator that accounts the allocated and freed 
bytes and randomly
the allocated/freed bytes and simulates allocation failures


Line 39:   // Throws an OOM on every alternate call to reallocate().
on every 2nd call


Line 51:   allocatedBytes_ = 0;
single line


Line 87: while (testAllocator_.getAllocationFailures() < 10) {
I still don't think this is a great approach because it does not reflect 
real-world usage of the NBAOS.

I think it's worth exhaustively testing all states (there are not that many). 
You could do something like this instead:


// Test first allocation failure
testAllocator_.setFailRealloc(true);
try {
  Nbaos n = new Nbaos(testAllocator_);
} catch () {
  // Fill in
}


// Test allocating failure from all possible states, allocating 64M at a time.
int reallocs = 1;
byte[] b = new byte[64 * 1024 * 1024 * 1024];
while (true) {
  TestAllocator alloc = new TestAllocator();
  Nbaos n = new Nbaos(alloc);
  while (alloc.getReallocs() == reallocs) {
n.write(b, 0, len);
  }
  alloc.setFailRealloc(true);
  writeAndCheck();
  
  reallocs = alloc.getReallocs();
  // If nbaos is at max capacity break out of loop
}


Line 91: writeAndCheck(b, 1, 0);
These should be tested from a fresh Nbaos and not from the one that is already 
'polluted' from previous tests.


http://gerrit.cloudera.org:8080/#/c/7955/16/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java:

Line 58:   // Deserialize the object at result.buffer_ptr and we confirm it 
is the same
remove "we"


Line 119: Assert.assertTrue(nativeSerializer.getSerializedBytesSize() >= 
3584 * 1024 * 1024);
Isn't there an assertGe() or something like that? Also print a message with the 
actual size for debugging


Line 137:   e.printStackTrace();
remove


Line 166:   testBasicSerialization(factory);
Let's make these separate top-level tests, if's fine to repeat the loop over 
PROTOCOLS_TO_TEST.


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[native-toolchain-CR] Bump Kudu version to 3f49724

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: Bump Kudu version to 3f49724
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 3f49724

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged.

Change subject: Bump Kudu version to 3f49724
..


Bump Kudu version to 3f49724

Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Thomas Tauber-Marshall: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/8028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3642: Adding backend addresses to error statuses for some scratch failures.

2017-09-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded a new patch set (#2).

Change subject: IMPALA-3642: Adding backend addresses to error statuses for 
some scratch failures.
..

IMPALA-3642: Adding backend addresses to error statuses for some scratch 
failures.

Adds GetBackendAddress() (which is host:port) to error messages stemming
from SCRATCH_LIMIT_EXCEEDED, SCRATCH_READ_TRUNCATED, and
SCRATCH_ALLOCATION_FAILED messages.

Testing:

* Unit tests assert the string is updated for SCRATCH_LIMIT_EXCEEDED
  and SCRATCH_ALLOCATION_FAILED. SCRATCH_READ_TRUNCATED doesn't
  have an existing test, and I didn't add a new one.

* Manually testing a query that spills after "chmod 000 /tmp/impala-scratch":
$ chmod 000 /tmp/impala-scratch
$ impala-shell
[dev:21000] > set mem_limit=100m;
MEM_LIMIT set to 100m
[dev:21000] > select count(*) from tpch_parquet.lineitem join 
tpch_parquet.orders on l_orderkey = o_orderkey;
Query: select count(*) from tpch_parquet.lineitem join tpch_parquet.orders 
on l_orderkey = o_orderkey
Query submitted at: 2017-09-11 11:07:06 (Coordinator: http://dev:25000)
Query progress can be monitored at: 
http://dev:25000/query_plan?query_id=5c48ff8f4103c194:1b40a6c
WARNINGS: Could not create files in any configured scratch directories 
(--scratch_dirs=/tmp/impala-scratch) on backend 'dev:22002'. See logs for 
previous errors that may have prevented creating or writing scratch files.
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied
Opening 
'/tmp/impala-scratch/5c48ff8f4103c194:1b40a6c_08e8d63b-169d-4571-a0fe-c48fa08d73e6'
 for write failed with errno=13 description=Error(13): Permission denied

Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/generate_error_codes.py
4 files changed, 13 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7816/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If31a50fdf6031312d0348d48aeb8f9688274cac2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

> Yeah we could probably add an error code like DISK_IO_MGR_ERROR and
 > use it for the different I/O errors emanating from the DiskIoMgr
 > code.

I'll try that out and will push a PS shortly.

-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5211: Simplifying ifnull conditional.

2017-09-11 Thread Philip Zeyliger (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7829

to look at the new patch set (#6).

Change subject: IMPALA-5211: Simplifying ifnull conditional.
..

IMPALA-5211: Simplifying ifnull conditional.

Re-writes ifnull(x, y) into if(x IS DISTINCT FROM y, x, NULL),
and simplifies x IS DISTINCT FROM y -> FALSE if x.equals(y). Removes
backend implementation of ifnull.

It's important and subtle that the re-write uses "x IS DISTINCT FROM y"
rather than "x != y" so that the simplification can be made while
handling null values correctly. ("x != x" may be either false or null,
but x is distinct from x is always false.)

The error messages are slightly altered by this change:

Before:

  Query: select nullif("a", 1)
  ERROR: AnalysisException: No matching function with signature: nullif(STRING, 
TINYINT).

  Query: select nullif(1, 2, 3)
  ERROR: AnalysisException: No matching function with signature: 
nullif(TINYINT, TINYINT, TINYINT).

After:

  ERROR: AnalysisException: operands of type STRING and TINYINT are not 
comparable: 'a' IS DISTINCT FROM 1
  ERROR: AnalysisException: nullif() must be called with two arguments.

This kind of error message is consistent with what you get for other re-writes.

Testing:
* Added new tests to ExprRewriteRulesTests for nullif and the if(x
  distinct from y, ...) simplification.
* New test for the rewrite in ParserTest.
* Adds an nvl2() test, incidentally.
* Confirmed (using EclEmma, which uses jococo engine) that coverage is good.
* Ran the tests.

Change-Id: Id91ca968a0c0be44e1ec54ad8602f91a5cb2e0e5
---
M be/src/exprs/conditional-functions-ir.cc
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
A fe/src/main/java/org/apache/impala/rewrite/SimplifyDistinctFromRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
12 files changed, 195 insertions(+), 93 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7829/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id91ca968a0c0be44e1ec54ad8602f91a5cb2e0e5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

Yeah we could probably add an error code like DISK_IO_MGR_ERROR and use it for 
the different I/O errors emanating from the DiskIoMgr code.

-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 277: Status WrapIoError(const Status& status);
> I agree, Lars filed IMPALA-5915 for this. I think there were some questions
Is there a simple thing we can start with for this patch (rather than introduce 
the wrapping) and then refine it later as needed? Like use a single error code?


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

Line 284:   io_buffer_bytes_left_ = 0;
Maybe add a short comment explaining why this is necessary?

// Make state consistent in case we return early with an error below.


Line 371:   Status wrapped_status(TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR, 
status.msg().msg());
Maybe reference IMPALA-4697 for context on why we're doing this instead of 
MergeStatus().


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 277: Status WrapIoError(const Status& status);
> rather than wrapping, is it not possible to make our IO code return a singl
I agree, Lars filed IMPALA-5915 for this. I think there were some questions 
about how to design it - e.g. do we have a hierarchy of error codes, or some 
other way of categorising them to avoid having to enumerate all the possible 
errors in the error handling code.


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors

2017-09-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS5, Line 177: if (status.IsCancelled() || status.IsMemLimitExceeded()) 
return status;
 : 
 : // Log error from file format parsing.
 : 
state_->LogError(ErrorMsg(TErrorCode::SEQUENCE_SCANNER_PARSE_ERROR,
 : stream_->filename(), stream_->file_offset(),
 : (stream_->eof() ? "(EOF)" : "")));
 : 
 : // Make sure errors specified in the status are logged as 
well
 : state_->LogError(status.msg());
 : 
 : // If abort on error then return, otherwise try to recover.
 : if (state_->abort_on_error()) return status;
 : 
 : // Abort scan range for I/O related errors
 : if (status.code() == 
TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR) {
 :   eos_ = true;
 :   return Status::OK();
 : }
it does seem like this could be written in terms of 
RuntimeState::LogOrReturnErrror(), right? I guess the one complication is that 
we're trying not to fail entirely, but instead mark eos_ in that case. But are 
we sure that returning error from GetNextInternal() won't do the same thing? I 
thought the scanners have a check higher up the callstack that would go ahead 
to the next scan range if abort_on_error==false and GetNext() returns 
(recoverable) error, but I could be mistaken.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

PS5, Line 194: Internal "
would be best to keep "Internal error" on the same line so that it's more 
easily grep-able.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 277: Status WrapIoError(const Status& status);
rather than wrapping, is it not possible to make our IO code return a single 
(or some set of) unrecoverable IO error codes directly, that we can then 
identify as IO errors?


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5250: Unify decompressor output length semantics

2017-09-11 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8030

Change subject: IMPALA-5250: Unify decompressor output_length semantics
..

IMPALA-5250: Unify decompressor output_length semantics

This patch makes the semantics output_length parameter in
Codec::ProcessBlock to be the same. In existing code different
decompressor treats output_length differently:
1. SnappyDecompressor needs output_length to be greater than or equal
   to actual decompressed length, but it does not set it to actual
   decompressed length after decompression.
2. SnappyBlockDecompressor and Lz4Decompressor require output_length to
   be exactly the same as actual decompressed length, otherwise
   decompression fails.
3. Other decompressors need output_length to be greater than or equal to
   actual decompressed length and will set it to actual decompressed
   length if oversized.
This inconsistency leads to a bug where the error message is
undeterministic when the compressed block is corrupted. This patch makes
all decompressor behave as 3 and adds a testcase checking that
decompressors can handle oversized output buffer correctly.
Lz4Decompressor will use the "safe" version of decompression function
instead of the "fast" version, for the latter is insecure with corrupted
data and requires decompressed length to be known.

Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
---
M be/src/util/decompress-test.cc
M be/src/util/decompress.cc
2 files changed, 31 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/8030/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifd42942b169921a7eb53940c3762bc45bb82a993
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5912: fix crash in trunc(..., "WW") in release build

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5912: fix crash in trunc(..., "WW") in release build
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5017518188f5025daa5040ca1943581a0675726
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1767 Adds predicate to test boolean values true, false, unknown.

2017-09-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change.

Change subject: IMPALA-1767 Adds predicate to test boolean values true, false, 
unknown.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8014/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS2, Line 273: KW_UNKNOWN
Treating "unknown" as a keyword conflicts with kudu specification: encoding and 
compression may both specify "unknown". I can work-around this, but in general, 
adding "unknown" as a keyword may break existing queries. Any precedence on 
adding keywords (I saw the handling of "default", for example)?


-- 
To view, visit http://gerrit.cloudera.org:8080/8014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bd0aa82a0316ac236b7ecf8612ef4b30c016a00
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3360: Codegen inserting into runtime filters

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8029

Change subject: IMPALA-3360: Codegen inserting into runtime filters
..

IMPALA-3360: Codegen inserting into runtime filters

This patch codegens PhjBuilder::InsertRuntimeFilters() and
FilterContext::Insert().

This allows us to unroll the loop over all the filters in
PhjBuilder::ProcessBuildBatch(), including skipping consideration of
FilterContext's that don't have a local_bloom_filter, and eliminates
the branch on type that happens in RawValue::GetHashValue().

Testing:
- Ran existing runtime filter tests.
- Ran perf tests locally (all avg. over three runs):
  - Four way self join on tpch_parquet.lineitem. Should be a good case
for this as there's several large hash join build sides that will
benefit from the codegen. Total query running time improved ~7%
(from 16.07s to 14.91s).
  - Single join of tpch_parquet.lineitem against a selectively
filtered tpch_parquet.lineitem. Should be a bad case for this
patch, as the build side of the join is very small. Total query
running time regressed by about ~2% (from 0.73s to 0.75s) due to
an increase in codegen time (from 295ms to 309ms for the fragment
containing the hash join).

Change-Id: I79cf23ad92dadaab996a50a2ca07ef9ebe8639bb
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/util/CMakeLists.txt
A be/src/util/bloom-filter-ir.cc
M be/src/util/bloom-filter.h
10 files changed, 232 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/8029/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I79cf23ad92dadaab996a50a2ca07ef9ebe8639bb
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[native-toolchain-CR] Bump Kudu version to 3f49724

2017-09-11 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Bump Kudu version to 3f49724
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/8028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 3f49724

2017-09-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8028

Change subject: Bump Kudu version to 3f49724
..

Bump Kudu version to 3f49724

Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/28/8028/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I38334bdbaa1e7a6c6f5d595c26efb76efa062305
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall