[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Kim Jin Chul (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8614

to look at the new patch set (#4).

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..

IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
1 file changed, 4 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8614/4
--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Kim Jin Chul 


[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8614 )

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8614/3/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/8614/3/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@249
PS3, Line 249: if 
(distinctAggExprs.get(i).getFnName().getFunction().equalsIgnoreCase(
 : "count")) {
> In my example, it does not make sense to print an error message that uses N
Okay, let me reflect your proposal.



--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Wed, 22 Nov 2017 08:01:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6225: Query profile date-time strings should have ns precision.

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8611 )

Change subject: IMPALA-6225: Query profile date-time strings should have ns 
precision.
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1514/


--
To view, visit http://gerrit.cloudera.org:8080/8611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e124b9c7e0717b8dc2cdab46aea41d74c5f2fd0
Gerrit-Change-Number: 8611
Gerrit-PatchSet: 7
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 22 Nov 2017 08:20:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables

2017-11-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8621 )

Change subject: IMPALA-3703: Store query context in thread-local variables
..


Patch Set 1:

(2 comments)

Thanks for the review.
I added SetThreadName(), SetQueryId(), and SetInstanceId() to InfoTable. They 
have their separate buffer on the stack.

> "how hard will it be to visit a /threads debug page and see what all the 
> threads are up to?"
If you have a complete core file (one that is not generated from a minidump), 
one can traverse all the threads and inspect the global 
impala::thread_info_table object and retrieve the needed information.
If you have a core that is created from a minidump, it is only a little bit 
more complicated. During thread traversion one need to select the stack frame 
that has the InfoTable object (currently it is Thread::SuperviseThread()). 
After the needed information can be retrieved from the InfoTable object.

> "Have you tried pulling one of these out in a minidump?"
I used minidumps, but I converted them to core files because the tooling for 
minidumps are not that user friendly / feature rich.

> RegisterAppMemory
Yes, this can be used, but then I would need to call RegisterAppMemory on every 
thread creation, and UnregisterAppMemory on every thread destruction. It is not 
that big a deal, since I can put this logic into InfoTable's 
constructor/destructor pairs.
I would also need to make the breakpad ExceptionHandler object accessible for 
InfoTable, and synchronize the accesses to it between threads.
With stack allocation, everything is automatically included in the minidump. If 
you say we cannot afford having this object on the stack, I rewrite it to use 
the heap.

> testing
I created backend tests in thread-info-test.cc

http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h
File be/src/common/thread-info.h:

http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h@70
PS1, Line 70: /// Call this global function to add information about the 
current thread
> Should we document that you should avoid information that's sensitive here?
I removed this function, now only the member AddExtraInfo() can be used to add 
unstructured text info to the InfoTable.
I added a comment there that it is not for sensitive data.


http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h@73
PS1, Line 73: /// is a no-op.
> It's weird to me that this can be a no-op if there's no object, but it fail
I removed this function, now we cannot call InfoTable::AddExtraInfo() without 
an object.



--
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 16:14:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables

2017-11-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Philip Zeyliger, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8621

to look at the new patch set (#2).

Change subject: IMPALA-3703: Store query context in thread-local variables
..

IMPALA-3703: Store query context in thread-local variables

This commit introduces the InfoTable class which can hold
information about the current thread that can be useful
in debug sessions.
It needs to be allocated on the stack in order to include
it to minidumps.

Currently an InfoTable object is created in
Thread::SuperviseThread. This object is available in all
the child stack frames through the global function
GetThreadInfoTable().

InfoTable has members for the thread name, query id, and
instance id. These are fixed size char buffers.
InfoTable also has a member called text, which is a char buffer
that holds lines of strings. A line has a format of
 = . New data can be appended to the buffer by
calling AddExtraInfo(key, value).

If you have a fully-fledged core file, you can locate the
InfoTable for the current thread through the global pointer
impala::thread_info_table.

In a core file that has been created from a minidump,
we need to select the stack frame that allocated the InfoTable
object in order to inspect it. It is currently allocated in
Thread::SuperviseThread().

We can use printf in gdb to print the members, e.g.:
printf "%s" thread_info_table.text
printf "%s\n" thread_info_table.query_id

Currently the thread category, thread name, query id, and
instance id is stored.

I created some tests in thread-info-test.cc

Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
---
M be/src/common/CMakeLists.txt
A be/src/common/thread-info-test.cc
A be/src/common/thread-info.cc
A be/src/common/thread-info.h
M be/src/runtime/query-state.cc
M be/src/util/thread.cc
6 files changed, 285 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/8621/2
--
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals

2017-11-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Taras Bobrovytsky, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8574

to look at the new patch set (#3).

Change subject: IMPALA-5936: operator '%' overflows on large decimals
..

IMPALA-5936: operator '%' overflows on large decimals

Suppose we have a large decimal number, which is greater
than INT_MAX. We want to calculate the modulo of this
number by 3:
BIG_DECIMAL % 3

The result of this calculation can be 0, 1, or 2.
This can fit into a decimal with precision 1.

The in-memory representation of such small decimals are
stored in int32_t in the backend. Let's call this int32_t
the result type. The backend had the invalid assumption
that it can do the calculation as well using the result type.
This assumption is true for multiplying or adding numbers,
but not for modulo.

Now the backend selects the biggest type of ['return type',
'1st operand type', '2nd operand type'] to do the calculation.

Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4
---
M be/src/exprs/decimal-operators-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
2 files changed, 24 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/8574/3
--
To view, visit http://gerrit.cloudera.org:8080/8574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4
Gerrit-Change-Number: 8574
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals

2017-11-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8574 )

Change subject: IMPALA-5936: operator '%' overflows on large decimals
..


Patch Set 3:

(2 comments)

Thanks.

http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc@675
PS2, Line 675:   case 4: { \
> It's probably a good idea to DCHECK here that x_size <= 4 and y_size <= 4
Done


http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc@684
PS2, Line 684:   } \
> Maybe DCHECK that x_size <= 8 and y_size <= 8, to make sure where are not r
Done



--
To view, visit http://gerrit.cloudera.org:8080/8574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4
Gerrit-Change-Number: 8574
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 16:48:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8631


Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

I will try looping it a bit more under different build types
concurrently with the review.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 102 insertions(+), 38 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/1
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer

2017-11-22 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8612 )

Change subject: IMPALA-4132: Use -fno-omit-frame-pointer
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8612/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8612/1//COMMIT_MSG@10
PS1, Line 10: As a result we expect more useful stack
: traces to be produced.
Just curious how well this works, could you paste a couple of thread stacks 
before and after this change using a stripped binary. It'd be a great change 
from a supportability POV if we can get reliable stacks without having to 
attach symbols separately.



-- 
To view, visit http://gerrit.cloudera.org:8080/8612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
Gerrit-Change-Number: 8612
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 17:33:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables

2017-11-22 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8621 )

Change subject: IMPALA-3703: Store query context in thread-local variables
..


Patch Set 2:

(6 comments)

Thanks for the updates. Some more comments.

I'd advise you to get Tim/Lars to look at this before reacting too much to my 
comments, since there are a bunch of things here that I'm insufficiently 
familiar with to review wisely. But I like the direction it's going!

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc
File be/src/common/thread-info-test.cc:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc@55
PS2, Line 55:   Status status = info_table.AddExtraInfo("extra", "info");
please add a comment about what's going on:

// We've exhausted our allocated space by now, so AddExtraInfo should be 
failing.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h
File be/src/common/thread-info.h:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@37
PS2, Line 37: /// it in minidumps. Until this object lives, it is available 
through the global
"Until this object lives" -> "While this object is alive"


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@41
PS2, Line 41: class InfoTable : boost::noncopyable {
This is the only usage of boost:noncopyable I've seen in our code base. We 
should check that it's the style of this sort of thing we want.

I think DISALLOW_COPY_AND_ASSIGN does something very similar and is present in 
our code base. It's defined in be/src/gutil/macros.h.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@43
PS2, Line 43:   InfoTable() : text(), text_size(0), query_id(), instance_id(), 
thread_name() {
We should comment that you're only allowed to create one of these per thread at 
a time. I had to think through how the constructor/destructor update a thread 
local. (I had assumed it would work the other way, with the thread_local being 
maintained by the cpde that's creating these , but that may just be the Java in 
me speaking.)


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@61
PS2, Line 61: std::copy(line.begin(), line.end(), text + text_size);
Would line.copy(text + text_size, line.length()) be clearer here?

I don't think anything is null-terminating these, so printf in the debugger may 
have a bunch of junk in the remaining buffer. We should either null-terminate 
the strings here, or, perhaps better, zero out the bytes when initialized.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@108
PS2, Line 108:   char query_id[TUNIQUE_ID_STRING_SIZE];
You're choosing to store this as a string intentionally?



--
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 17:56:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Alex Behm, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8480

to look at the new patch set (#10).

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..

IMPALA-4985: use parquet stats of nested types for dynamic pruning

Currently, parquet row-groups can be pruned at run-time using
min/max stats when predicates (in, binary) are specified for
column scalar types. This patch extends pruning to nested types
for the same class of predicates. A nested value is an instance
of a nested type (struct, array, map). A nested value consists of
other nested and scalar values (as declared by its type).
Predicates that can be used for row-group pruning must be applied to
nested scalar values. In addition, the parent of the nested scalar
must also be required, that is, not empty. The latter requirement
is conservative: some filters that could be used for pruning are
not used for correctness reasons.

Testing:
- extended nested-types-parquet-stats e2e test cases.

Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
---
M be/src/exec/hdfs-parquet-scanner.h
M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
10 files changed, 649 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/10
--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..


Patch Set 10:

found the issue with the planner test: mismatch between local and test 
environment. change includes the fix.

carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:06:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 12:

Tianyi, did you have any more comments?

I'm not going to merge this regardless right now for the reasons stated in the 
commit message.


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:17:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-11-22 Thread Pranay Singh (Code Review)
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8034

to look at the new patch set (#17).

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..

IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

Currently DictDecoder class and DictEncoder class uses std::vector
to store the tables mapping codeword to value and vice-versa. It is
hard to detect the memory usage by these tables when they becomes
very large, since this memory is not accounted by Impala's memory
mangement infrastructure.

This patch uses the memory tracker of HdfsScanner to track the memory used
by dictionary in DictDecoder class. Similary it uses memory tracker of
HdfsTableSink to track the memory used by dictionary in DictEncoder class.

Memory for the dictionary, stored as std::vector is still allocated
from std:allocator but the amount allocated is accounted by
introducing a counter which is incremented and decremented as the
memory is consumed and released by vector.

Testing
---
Ran all the backend and end-end tests with no failures.

Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
5 files changed, 140 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/17
--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 17
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8549/6/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/8549/6/tests/shell/test_shell_commandline.py@352
PS6, Line 352: sleep(0.5)
Increased this to 1.5 as RELEASE and ASAN build failed with this value.
As a result RELEASE passes but ASAN still fails. Increasing this further to 
check where ASAN turns green.



--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:23:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:25:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1515/


--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:28:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-22 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 6: Code-Review+1

(1 comment)

Looks good to me at this point. I'll let you figure out the right timings for 
the test. Feel free to limit the environments in which we run it, too.

http://gerrit.cloudera.org:8080/#/c/8549/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8549/6//COMMIT_MSG@9
PS6, Line 9: Issue1: When query is cancelled via CTRL-C while being executed in 
Impala-shell
nit: "Issue 1" reads better to me.



--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:46:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests

2017-11-22 Thread Zach Amsden (Code Review)
Zach Amsden has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8580 )

Change subject: IMPALA-6206: Fix data load failure with -notests
..

IMPALA-6206: Fix data load failure with -notests

When tests are not built, specifically with -notests, instead of
just -skiptests, the be-test target is omitted by cmake, and since
nothing in impalad depends on uda/udf samples, they are not built.
This causes data load to fail on a clean build.

Just build them anyway under the target ImpalaUdf since they
build in a few seconds.

This shaves a few minutes off data load testing by avoiding
the time spent linking tests.  Note: only empirically, not
scientifically measured.

Testing: Run a clean build
  find . -name 'libud[af]sample.so'
  ./be/build/debug/udf_samples/libudasample.so
  ./be/build/debug/udf_samples/libudfsample.so

Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb
Reviewed-on: http://gerrit.cloudera.org:8080/8580
Tested-by: Impala Public Jenkins
Reviewed-by: Zach Amsden 
---
M bin/make_impala.sh
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Zach Amsden: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb
Gerrit-Change-Number: 8580
Gerrit-PatchSet: 5
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests

2017-11-22 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8580 )

Change subject: IMPALA-6206: Fix data load failure with -notests
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb
Gerrit-Change-Number: 8580
Gerrit-PatchSet: 4
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:49:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6225: Query profile date-time strings should have ns precision.

2017-11-22 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8611 )

Change subject: IMPALA-6225: Query profile date-time strings should have ns 
precision.
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8611/7/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/8611/7/tests/query_test/test_observability.py@133
PS7, Line 133: while (retries < 120)
GVO failed because 120s doesn't seem to be a good enough timeout.

08:20:17 ] FAIL 
query_test/test_observability.py::TestObservability::()::test_query_profile_timestamps
08:20:17 ] === FAILURES 
===
08:20:17 ] ___ TestObservability.test_query_profile_timestamps 

08:20:17 ] [gw13] linux2 -- Python 2.7.12 
/home/ubuntu/Impala/bin/../infra/python/env/bin/python
08:20:17 ] query_test/test_observability.py:152: in 
test_query_profile_timestamps
08:20:17 ] assert False, dbg_str
08:20:17 ] E   AssertionError: Debug profile for query not available in 120 
seconds, (, ).
08:20:17 ] E   assert False
08:20:17 ] - Captured stderr call 
-



--
To view, visit http://gerrit.cloudera.org:8080/8611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e124b9c7e0717b8dc2cdab46aea41d74c5f2fd0
Gerrit-Change-Number: 8611
Gerrit-PatchSet: 7
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 22 Nov 2017 18:58:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 14:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h@113
PS18, Line 113:
per discussion on the JIRA, I think this should be under DEVELOPMENT until it's 
documented.


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100
PS18, Line 100: _indent_level, output):
This check shouldn't be necessary, right? Unless you're doing what I did and 
running impala-shell against the wrong thrift output because you're too lazy to 
rebuild Impala :).


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@83
PS18, Line 83:
correspond


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@242
PS18, Line 242:   self._print_option_group(advanced_options)
This is subtle - can you add a comment? If I understand correctly, this is 
detecting that the impala server didn't send back any advanced query options, 
so we're assuming it's an old one.


http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py@53
PS14, Line 53: fetch_results_req.operationHandle = 
execute_statement_resp.operationHandle
> This function is actually a copy paste: I moved it out from test_session_op
Seems ok then.



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 14
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 19:06:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila 
Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8447

to look at the new patch set (#19).

Change subject: IMPALA-2181: Add query option levels for display
..

IMPALA-2181: Add query option levels for display

Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.

When the query options are displayed through HS2 interface then the result set
would contain an extra column indicating the level of each option. Similarly
to Impala shell on this interface the SET command only diplays the REGULAR and
ADVANCED options while SET ALL shows them all.

If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.

Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
---
M be/src/service/child-query.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/Frontend.thrift
M common/thrift/beeswax.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/SetStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_client.py
M shell/impala_shell.py
M testdata/workloads/functional-query/queries/QueryTest/set.test
M tests/custom_cluster/test_set_and_unset.py
M tests/hs2/test_hs2.py
M tests/shell/test_shell_interactive.py
18 files changed, 487 insertions(+), 226 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/19
--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 19
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 19:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h@113
PS18, Line 113:   QUERY_OPT_FN(strict_mode, STRICT_MODE, 
TQueryOptionLevel::DEVELOPMENT)\
> per discussion on the JIRA, I think this should be under DEVELOPMENT until
Done


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100
PS18, Line 100: hasattr(option, 'level') and
> This check shouldn't be necessary, right? Unless you're doing what I did an
in theory, if the shell with the feature comes from an Impala that is compiled 
than this is not necessary.
I would still keep it be make 100% sure, that nothing breaks if you don't mind.


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@83
PS18, Line 83: correspond
> correspond
Done


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@242
PS18, Line 242:   # If the shell is connected to an Impala that predates 
IMPALA-2181 then
> This is subtle - can you add a comment? If I understand correctly, this is
Done



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 19
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 19:45:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8629 )

Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME()
..


Patch Set 1:

(1 comment)

Thanks for the fix!

http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc@84
PS1, Line 84: ToUnixTime
i think it would be clearer to just call tv.HasDateAndTime() directly to show 
the intention of this check.

(also, generally for Impala style, we don't use extraneous parenthesis around 
the conditional, though I see that in this file the style has diverged).



--
To view, visit http://gerrit.cloudera.org:8080/8629
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2
Gerrit-Change-Number: 8629
Gerrit-PatchSet: 1
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Wed, 22 Nov 2017 19:50:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG@11
PS1, Line 11: All of the accounting in the test implicitly relies on queries 
not being
: dequeued until queries are later explicitly ended, so if this 
happened,
: the test broke in multiple subtle ways.
and this assumption is no longer true because of the final change for 
IMPALA-1575, right?  It'd be good to note that explicitly.


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@705
PS1, Line 705: b4
where is that used?


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@792
PS1, Line 792: amount of time
is this actually timing sensitive, or is it that as long as we don't fetch the 
results (and the rowbatches have exceeded the various buffering), the query 
will remain active?



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Wed, 22 Nov 2017 20:06:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila 
Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8447

to look at the new patch set (#20).

Change subject: IMPALA-2181: Add query option levels for display
..

IMPALA-2181: Add query option levels for display

Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.

When the query options are displayed through HS2 interface then the result set
would contain an extra column indicating the level of each option. Similarly
to Impala shell on this interface the SET command only diplays the REGULAR and
ADVANCED options while SET ALL shows them all.

If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.

Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
---
M be/src/service/child-query.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/Frontend.thrift
M common/thrift/beeswax.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/SetStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_client.py
M shell/impala_shell.py
M testdata/workloads/functional-query/queries/QueryTest/set.test
M tests/custom_cluster/test_set_and_unset.py
M tests/hs2/test_hs2.py
M tests/shell/test_shell_interactive.py
18 files changed, 487 insertions(+), 226 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/20
--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100
PS18, Line 100: option.level is not None:
> in theory, if the shell with the feature comes from an Impala that is compi
Done



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 20:08:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8614 )

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Wed, 22 Nov 2017 20:45:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8614 )

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1516/


--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Wed, 22 Nov 2017 20:46:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-22 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 3:

>[...]Please do answer the question about changing
 > the FE tests to just use s3a so that we can remove the hackery
 > altogether.[...]

In short: it depends on your end goal.
If the end goal is to get rid of the s3n: provider and its supporting menagerie 
(jets3t, etc.), then yes, it can be done with a small price: a single negative 
test in AnalyzeStmtsTest.java will have to go. The code there in 
https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L3000
 performs a test step that should fail for s3n -- but if I remove the s3n 
properties from core-site.xml, this line blows up complaining about missing 
credentials. The stack trace reveals that Impala's HdfsUri::analyze() calls 
into Hadoop's Path::getFileSystem() to find out what kind of filesystem the URI 
points to; this in turn calls down all the way to 
org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74) being 
called from Jets3t.

If your end goal is to remove dummy credentials from core-site.xml.tmpl, this 
will probably not be complete success: the s3a: provider behaves in a similar 
() though not identical) way. When the front-end tests are run without an S3 
container in sight (e.g. your local dev box with no S3 access set up), 
AnalyzeDDLTest's s3a: URIs still require similar dummy credentials, but with 
s3a: these can be supplied in environment variables. This means that 
core-site.xml[.tmpl] could be cleaned up, but dummy credentials in envireonment 
variables would still be there.


--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 21:04:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8329 )

Change subject: IMPALA-4964: Fix Decimal modulo overflow
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1517/


--
To view, visit http://gerrit.cloudera.org:8080/8329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Gerrit-Change-Number: 8329
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 22 Nov 2017 21:23:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-22 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 3:

> Patch Set 3:
>
> >[...]Please do answer the question about changing
>  > the FE tests to just use s3a so that we can remove the hackery
>  > altogether.[...]
>
> In short: it depends on your end goal.
> If the end goal is to get rid of the s3n: provider and its supporting 
> menagerie (jets3t, etc.), then yes, it can be done with a small price: a 
> single negative test in AnalyzeStmtsTest.java will have to go. The code there 
> in 
> https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L3000
>  performs a test step that should fail for s3n -- but if I remove the s3n 
> properties from core-site.xml, this line blows up complaining about missing 
> credentials. The stack trace reveals that Impala's HdfsUri::analyze() calls 
> into Hadoop's Path::getFileSystem() to find out what kind of filesystem the 
> URI points to; this in turn calls down all the way to 
> org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74) being 
> called from Jets3t.
>
> If your end goal is to remove dummy credentials from core-site.xml.tmpl, this 
> will probably not be complete success: the s3a: provider behaves in a similar 
> () though not identical) way. When the front-end tests are run without an S3 
> container in sight (e.g. your local dev box with no S3 access set up), 
> AnalyzeDDLTest's s3a: URIs still require similar dummy credentials, but with 
> s3a: these can be supplied in environment variables. This means that 
> core-site.xml[.tmpl] could be cleaned up, but dummy credentials in 
> envireonment variables would still be there.

Hadoop has essentially deprecated s3n and it's removed entirely in Hadoop-3.0. 
(See HADOOP-14738.) The distribution of Hadoop we're depending on also prefers 
s3a. I think we can do away with s3n entirely, but I'm happy to take that on as 
a separate change after yours here.


--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 21:25:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:00:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
..

IMPALA-4985: use parquet stats of nested types for dynamic pruning

Currently, parquet row-groups can be pruned at run-time using
min/max stats when predicates (in, binary) are specified for
column scalar types. This patch extends pruning to nested types
for the same class of predicates. A nested value is an instance
of a nested type (struct, array, map). A nested value consists of
other nested and scalar values (as declared by its type).
Predicates that can be used for row-group pruning must be applied to
nested scalar values. In addition, the parent of the nested scalar
must also be required, that is, not empty. The latter requirement
is conservative: some filters that could be used for pruning are
not used for correctness reasons.

Testing:
- extended nested-types-parquet-stats e2e test cases.

Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Reviewed-on: http://gerrit.cloudera.org:8080/8480
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.h
M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
10 files changed, 649 insertions(+), 38 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 11
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.

2017-11-22 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8569 )

Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function.
..


Patch Set 2:

(9 comments)

I'm still looking at the tests, just want to publish what I have so far. 
Overall the patch (and approach) looks good to me.

http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/exprs/aggregate-functions-ir.cc@1464
PS1, Line 1464: of a SampledNdvState with 'num_buckets' HLL bu
> Good idea done. I did some more similar cleanup in this class.
thanks, easier to follow.


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1443
PS2, Line 1443: count
IMO 'count' sounds a little generic and overlaps with total_count. May be 
rename them to "bucket_row_count" and "total_row_count" or something similar?


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1459
PS2, Line 1459:   double sample_perc;
static?


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1465
PS2, Line 1465:   static int64_t GetStateSize(int64_t num_buckets) {
Given num_buckets is always a const, may be make this a static const too like 
BUCKET size?

static const in64_t STATE_SIZE = NUM_HLL_BUCKETS * BUCKET_SIZE .


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1530
PS2, Line 1530: ndv
may be call ndv_estimate, just to be clear


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1552
PS2, Line 1552:  int bucket_idx = (i + j) % SampledNdvState::NUM_HLL_BUCKETS;
  :   merged_count += *state->GetCountPtr(bucket_idx);
  :   counts[pidx] = merged_count;
  :   StringVal hll = StringVal(state->GetHllPtr(bucket_idx), 
HLL_LEN);
  :   HllMerge(ctx, hll, &merged_hll);
  :   ndvs[pidx] = HllFinalEstimate(merged_hll.ptr, HLL_LEN);
Wondering if we can avoid duplicates in data points since hll_to_merge for 
(i,j) = (j,i). Do they add any additional value?

We could probably optimize the loop a little further to avoid that

for (int i=0, i < NUM_HLL_BUCKETS; ++i) {
  for (int j=i, j < NUM_HLL_BUCKETS, ++j) {

}
}


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1563
PS2, Line 1563:   int64_t max_count = counts[num_points - 1];
not sure I understand this, what invariant guarantees counts[num_points-1] is 
the max_count?


http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/util/mpfit-util.h
File be/src/util/mpfit-util.h:

http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/util/mpfit-util.h@50
PS1, Line 50:   /// Returns true if fitting was successful, false otherwise.
> Done here. I could not find another place, did I miss one?
nope, just here.


http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@506
PS2, Line 506: SAMPED_NDV
nit: typo



-- 
To view, visit http://gerrit.cloudera.org:8080/8569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820
Gerrit-Change-Number: 8569
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:36:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-22 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 12:

> Patch Set 12:
>
> Tianyi, did you have any more comments?
>
> I'm not going to merge this regardless right now for the reasons stated in 
> the commit message.

I don't have more comments


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:40:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer

2017-11-22 Thread Gabor Kaszab (Code Review)
Hello Bharath Vissapragada, Lars Volker, Tim Armstrong, Mostafa Mokhtar, Dan 
Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8612

to look at the new patch set (#2).

Change subject: IMPALA-4132: Use -fno-omit-frame-pointer
..

IMPALA-4132: Use -fno-omit-frame-pointer

Using -fno-omit-frame-pointer would keep the frame pointers for
functions in the register. As a result we expect more useful stack
traces to be produced.

For testing performance benchmark was executed on TPC-H and TPC-DS
without any significant discrepancy from the baseline results.
(For the specific measures check the attachments in the Jira.)

Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
---
M be/CMakeLists.txt
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/8612/2
--
To view, visit http://gerrit.cloudera.org:8080/8612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
Gerrit-Change-Number: 8612
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8612 )

Change subject: IMPALA-4132: Use -fno-omit-frame-pointer
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42
PS1, Line 42: #  -fno-omit-frame-pointers: Keep frame pointer for functions in 
register
> Have you considered using -fasynchronous-unwind-tables instead? This page r
>From my side no such consideration was made.
Tim, are you aware of any discussion related to the flag Lars mentions?


http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42
PS1, Line 42: #  -fno-omit-frame-pointers: Keep frame pointer for functions in 
register
: SET(CXX_COMMON_FL
> How did this affect the size of the resulting binaries?
Done

Binary sizes (output of ls, in bytes):
impalad:
-fno-omit-frame-pointers: 437163424
-fomit-frame-pointers:   437110928

size of the whole be/build/debug/ dir: (output of du, in kbytes)
-fno-omit-frame-pointers: 45456600
-fomit-frame-pointers:   45451244



--
To view, visit http://gerrit.cloudera.org:8080/8612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
Gerrit-Change-Number: 8612
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:45:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG@11
PS1, Line 11: All of the accounting in the test implicitly relies on queries 
not being
: dequeued until queries are later explicitly ended, so if this 
happened,
: the test broke in multiple subtle ways.
> and this assumption is no longer true because of the final change for IMPAL
I don't think it was ever true for the mem_limit tests. Even before IMPALA-1575 
it depended implicitly on the memory staying reserved on all backends.

>From what I can tell, the bug was always there, it was a tweak to the 
>statestore frequency that threw it out of balance.


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@705
PS1, Line 705: b4
> where is that used?
Leftover debugging code that I missed removing.


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@792
PS1, Line 792: amount of time
> is this actually timing sensitive, or is it that as long as we don't fetch
It shouldn't be timing sensitive. Updated the comment.



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:45:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#2).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

I will try looping it a bit more under different build types
concurrently with the review.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 89 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#3).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

I will try looping it a bit more under different build types
concurrently with the review.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 111 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/3
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py@706
PS3, Line 706: sleep(0.2)
what's that about? does the QUERY_TIMEOUT comment above need updating?



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:57:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#4).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 114 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/4
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 3:

(1 comment)

Just to summarise where this is at, it's a lot less flaky but I'm still seeing 
occasional flakes that I think are the result of the test seeing inconsistent 
metrics (i.e. admitted, queued, dequeued, etc are out of sync).

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py@706
PS3, Line 706: sleep(0.2)
> what's that about? does the QUERY_TIMEOUT comment above need updating?
I was experimenting with ways to reduce test runtime and reduce the chances of 
it hitting various timeouts.

Factored out a constant to make the relationship clearer.



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:03:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 4: Code-Review+2

(1 comment)

Seems like an improvement if you want to commit it here even if it doesn't 
address all flakiness.

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py@698
PS4, Line 698: every
 :   # second
shouldn't that say "at least every second" or something?



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:14:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#5).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 116 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/5
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py@698
PS4, Line 698: every
 :   # second
> shouldn't that say "at least every second" or something?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:16:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1518/


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:17:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 5: Code-Review+2

Carry


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:17:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests

2017-11-22 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8593 )

Change subject: IMPALA-6092: avoid drop/create function interactions in e2e 
tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py
File tests/query_test/test_udfs.py:

http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py@422
PS2, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, 
tgt_udf_path])
> it does get left around (as with the other tests). perhaps a better pattern
changed the paths so that these tmp jar get cleaned up. a new dir is created 
for each test's db. I put the jar in that db dir (as discussed) and verified 
that the dirs + files are removed (useful for dev env where we re-use the 
state).



--
To view, visit http://gerrit.cloudera.org:8080/8593
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65
Gerrit-Change-Number: 8593
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:18:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests

2017-11-22 Thread Vuk Ercegovac (Code Review)
Hello Bharath Vissapragada, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8593

to look at the new patch set (#3).

Change subject: IMPALA-6092: avoid drop/create function interactions in e2e 
tests
..

IMPALA-6092: avoid drop/create function interactions in e2e tests

The e2e unit tests for udfs can interact via the backend
lib_cache, causing test flakes. IMPALA-6215 explains a
race between the lib_cache and UdfExecutor in the frontend
which is the likely the root cause.
Two e2e tests use the same jar (test_java_udfs and
test_udf_invalid_symbol), test_udf_invalid_symbol drops a
function from that jar, which causes the use of that jar to
fail in the test_java_udfs test. Since the state of lib_cache
is per process, its state causes these interactions across
unit tests.
This change avoids the interactions by using separate jars for
the separate tests.

Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65
---
M be/src/runtime/lib-cache.h
M fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java
M tests/query_test/test_udfs.py
3 files changed, 16 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/8593/3
--
To view, visit http://gerrit.cloudera.org:8080/8593
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65
Gerrit-Change-Number: 8593
Gerrit-PatchSet: 3
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8593 )

Change subject: IMPALA-6092: avoid drop/create function interactions in e2e 
tests
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8593
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65
Gerrit-Change-Number: 8593
Gerrit-PatchSet: 3
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:20:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8593 )

Change subject: IMPALA-6092: avoid drop/create function interactions in e2e 
tests
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1519/


--
To view, visit http://gerrit.cloudera.org:8080/8593
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65
Gerrit-Change-Number: 8593
Gerrit-PatchSet: 3
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:21:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry

2017-11-22 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8588 )

Change subject: IMPALA-4927: Impala should handle invalid input from Sentry
..


Patch Set 10: Code-Review+2

I'm not entirely convinced that ImpalaException is the right thing to catch, 
but there are upstream Sentry changes coming anyway that are going to change 
this again, so arguing against it doesn't seem productive.


--
To view, visit http://gerrit.cloudera.org:8080/8588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b
Gerrit-Change-Number: 8588
Gerrit-PatchSet: 10
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:56:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8588 )

Change subject: IMPALA-4927: Impala should handle invalid input from Sentry
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1520/


--
To view, visit http://gerrit.cloudera.org:8080/8588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b
Gerrit-Change-Number: 8588
Gerrit-PatchSet: 10
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:58:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.

2017-11-22 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8569 )

Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function.
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2063
PS2, Line 2063: 0.0
Wondering if 0.0 is a valid sample %. What do we scan in that case? Do we 
assign some default ndvs?


http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2066
PS2, Line 2066:   for (Column col: allScalarTypes.getColumns()) {
-ve test cases on non scalar columns, just for completeness?



--
To view, visit http://gerrit.cloudera.org:8080/8569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820
Gerrit-Change-Number: 8569
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:02:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals

2017-11-22 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8574 )

Change subject: IMPALA-5936: operator '%' overflows on large decimals
..


Patch Set 3: Code-Review+2

Looks good!


--
To view, visit http://gerrit.cloudera.org:8080/8574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4
Gerrit-Change-Number: 8574
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:20:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8614 )

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:21:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8614 )

Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state 
workarounds
..

IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds

Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Reviewed-on: http://gerrit.cloudera.org:8080/8614
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
1 file changed, 4 insertions(+), 1 deletion(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89
Gerrit-Change-Number: 8614
Gerrit-PatchSet: 5
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kim Jin Chul 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 20: Code-Review+1

Looks good to me. Did anyone else on the long list of reviewers want to take 
another look?


--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:28:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 17:

This is still on my radar, need to get back to it.


--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 17
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:29:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 3:

I need to get back to this one after the thanksgiving break - it's on my radar.

@Attila does the latest iteration look good to you?


--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:31:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8329 )

Change subject: IMPALA-4964: Fix Decimal modulo overflow
..

IMPALA-4964: Fix Decimal modulo overflow

The modulo operation between two decimals should never overflow. Before
this patch, there would be an overflow if the scale difference between
the two decimals was large. We would try to scale up the one with the
smaller scale, so that the scales matched, which could result in an
overflow.

We fix the problem by first checking if the scaled up value would fit
into 128 bits by estimating the number of leading zeros in it. If we
detect that 128 bits is not enough, we convert both numbers to int256,
do the operation, then convert back to 128 bits.

Testing:
- Added some expr tests that excercise the new code path.

Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Reviewed-on: http://gerrit.cloudera.org:8080/8329
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/decimal-value.inline.h
3 files changed, 129 insertions(+), 29 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Gerrit-Change-Number: 8329
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8329 )

Change subject: IMPALA-4964: Fix Decimal modulo overflow
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Gerrit-Change-Number: 8329
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:55:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 20:

(6 comments)

Looks good to me, though I didn't look at the tests very carefully myself. Just 
some minor comments.

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@541
PS20, Line 541: his function
nit: delete (Obvious that the comment is talking about this function).


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@544
PS20, Line 544: beeswax::ConfigVariable& option
if that's modified, you should use ConfigVariable *.  We prefer to use pointers 
rather than references for things that get modified, so that it's clear at the 
callsite.  (i.e. only use const references).


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@545
PS20, Line 545: option_key
that's not explained. and is it even needed -- why not just use the key from 
'option'?


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc@1236
PS20, Line 1236: config.__set_level(TQueryOptionLevel::ADVANCED);
does query_option_levels_ not contain all keys? i.e. should line 1233 be a 
DCHECK?


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@34
PS20, Line 34: Introduced
comments shouldn't talk about current code vs. older code. They should just 
talk about the code. So, reword this:

Maps from query option name to option level used ...

or similar


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@175
PS20, Line 175: QueryOptionLevels&
use pointer



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:55:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 20:

(2 comments)

(Also didn't look too much at the python shell code.)

http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG@14
PS20, Line 14: HS2 interface
I think that should say "the SET SQL statement"  or similar, since SET can be 
executed via HS2 or beeswax. It's just that the shell doesn't use it today.


http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift
File common/thrift/beeswax.thrift:

http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift@98
PS20, Line 98:
maybe we should be super explicit here and say:

// Impala extension: ...

to point out that this isn't part of the "standard" beeswax interface. I don't 
think anything else in this file was customized for Impala besides this. (The 
customization usually happens in ImpalaService.thrift.  We could even add this 
functionality there instead, but I don't feel strongly given that this seems 
like a backwards compatible modification and beeswax will go away.



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:03:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.

2017-11-22 Thread Alex Behm (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8569

to look at the new patch set (#3).

Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function.
..

IMPALA-5310: Part 2: Add SAMPLED_NDV() function.

Adds a new SAMPLED_NDV() aggregate function that is
intended to be used in COMPUTE STATS TABLESAMPLE.
This patch only adds the function itself. Integration
with COMPUTE STATS will come in a separate patch.

SAMPLED_NDV() estimates the number of distinct values (NDV)
based on a sample of data and the corresponding sampling rate.
The main idea is to collect several x/y data points where x is
the number of rows and y is the corresponding NDV estimate.
These data points are used to fit an objective function to the
data such that the true NDV can be extrapolated.
The aggregate function maintains a fixed number of HyperLogLog
intermediates to compute the x/y points.
Several objective functions are fit and the best-fit one is
used for extrapolation.

Adds the MPFIT C library to perform curve fitting:
https://www.physics.wisc.edu/~craigm/idl/cmpfit.html

The library is a C port from Fortran. Scipy uses the
Fortran version of the library for curve fitting.

Testing:
- added functional tests
- core/hdfs run passed

Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
A be/src/thirdparty/mpfit/DISCLAIMER
A be/src/thirdparty/mpfit/README
A be/src/thirdparty/mpfit/mpfit.c
A be/src/thirdparty/mpfit/mpfit.h
M be/src/util/CMakeLists.txt
A be/src/util/mpfit-util.cc
A be/src/util/mpfit-util.h
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M tests/query_test/test_aggregation.py
15 files changed, 3,897 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/8569/3
--
To view, visit http://gerrit.cloudera.org:8080/8569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820
Gerrit-Change-Number: 8569
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.

2017-11-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8569 )

Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function.
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1443
PS2, Line 1443: count
> IMO 'count' sounds a little generic and overlaps with total_count. May be r
Done


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1459
PS2, Line 1459:   double sample_perc;
> static?
This is the local state of a single aggregation instance, so static does not 
make sense.


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1465
PS2, Line 1465:   static int64_t GetStateSize(int64_t num_buckets) {
> Given num_buckets is always a const, may be make this a static const too li
Can't do that because sizeof(SampledNdvState) fails since SampledNdvState is 
considered to be (size not known).

This is the error:
error: invalid application of ‘sizeof’ to incomplete type 
‘impala::SampledNdvState’
   sizeof(SampledNdvState) + NUM_HLL_BUCKETS * BUCKET_SIZE;


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1530
PS2, Line 1530: ndv
> may be call ndv_estimate, just to be clear
Done


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1552
PS2, Line 1552:  int bucket_idx = (i + j) % SampledNdvState::NUM_HLL_BUCKETS;
  :   merged_count += *state->GetCountPtr(bucket_idx);
  :   counts[pidx] = merged_count;
  :   StringVal hll = StringVal(state->GetHllPtr(bucket_idx), 
HLL_LEN);
  :   HllMerge(ctx, hll, &merged_hll);
  :   ndvs[pidx] = HllFinalEstimate(merged_hll.ptr, HLL_LEN);
> Wondering if we can avoid duplicates in data points since hll_to_merge for
I expanded the comment here to explain and justify.

Not sure I follow your point generation method. I think in my method only the 
last data point is guaranteed to be a duplicate (which is a good thing, now 
explained in comment). Others may or may not be duplicates (probably not 
duplicates though).

Consider this example input and my point generation method:

input counts:
1, 2, 3, 4

counts of 1st rolling window:
1, 3, 6, 10

counts of 2nd rolling window
2, 5, 9, 10

counts of 3rd rolling window
3, 7, 8, 10

counts of 4th rolling window
4, 5, 7, 10

For each "count" value above, we mere the corresponding HLL intermediates to 
get an NDV estimate for that subset of the data.

Note that the merged NDVs corresponding to the same "count" value are likely to 
be different because they represent different subsets of the data.


http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1563
PS2, Line 1563:   int64_t max_count = counts[num_points - 1];
> not sure I understand this, what invariant guarantees counts[num_points-1]
Added comment.


http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@506
PS2, Line 506: SAMPED_NDV
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2063
PS2, Line 2063: 0.0
> Wondering if 0.0 is a valid sample %. What do we scan in that case? Do we a
The value passed to this agg fn is independent of how much data is scanned. You 
can think of the sample percent as an "extrapolation factor".

We will eventually combine TABLESAMPLE and SAMPLED_NDV() in COMPUTE STATS and 
there the effective sampling rate will be plugged into the SAMPLED_NDV() 
function.

But for now, let's view SAMPLED_NDV() as a standalone aggregate function.

I think 0.0 is a valid value. It will give you the estimated NDV at x=0. Not 
sure how useful that is, but I don't think it's invalid (0 is a valid "sampling 
ratio").

I'm not sure it's worth special casing 0, what do you think?


http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2066
PS2, Line 2066:   for (Column col: allScalarTypes.getColumns()) {
> -ve test cases on non scalar columns, just for completeness?
Maybe I'm misunderstanding but we don't support Exprs on non-scalar types. Even 
if we have nested collections, we must explode them to scalars.



--
To view, visit http://gerrit.cloudera.org:8080/8569
To unsubscribe, visit http://gerrit.cloudera.org:808

[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8621 )

Change subject: IMPALA-3703: Store query context in thread-local variables
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc
File be/src/common/thread-info-test.cc:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc@55
PS2, Line 55:   Status status = info_table.AddExtraInfo("extra", "info");
> please add a comment about what's going on:
I agree it would be helpful to have some more comments. Maybe at least a 
comment on each test to explain what it is testing at a high-level, then 
possible comments for any tricky details of the tests themselves.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h
File be/src/common/thread-info.h:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@41
PS2, Line 41: class InfoTable : boost::noncopyable {
> This is the only usage of boost:noncopyable I've seen in our code base. We
Yeah DISALLOW_COPY_AND_ASSIGN is the canonical way to do it in Impala (for 
better or worse).


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@42
PS2, Line 42: public:
We've generally been moving to using the new inline initialisation syntax for 
member variables if they're constant, or we're just calling the default 
constructor. I.e.

text_size = 0, etc.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@101
PS2, Line 101: size_t
We generally use int64_t for byte sizes in the codebase. There are arguments 
for either int64_t or size_t, but we've settled on int64_t in Impala because 
it's large enough to hold any practical size and we avoid any 
hard-to-reason-about edge cases with mixed signed/unsigned expressions.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@105
PS2, Line 105:   char text[MAX_TEXT_SIZE];
The naming convention for member variables is to include a _ suffix. In this 
case we're following the google style guide: 
https://google.github.io/styleguide/cppguide.html#Variable_Names


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@115
PS2, Line 115: InfoTable& GetThreadInfoTable();
By convention we generally use pointers if arguments or return values are meant 
to be mutable.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc@378
PS2, Line 378:   thread_info_table.SetQueryId(fis->query_id());
We should also do this for other threads running in the context of the 
query/fragment.

The fragment thread can spawn off threads in a couple of places: 
BlockingJoinNode::ProcessBuildInputAndOpenProbe() and 
HdfsScanNode::ThreadTokenAvailableCb().

FragmentInstanceState::Prepare() also spawns off a per-fragment thread, 
although I believe at some point that will become a per-query thread.

I guess we could also automatically inherit query_id and fragment_id from the 
parent thread, if that works out sanely, or maybe we should have a special 
Thread::Create() for fragment/query threads that takes additional arguments. 
Mainly it would be nice to have a pattern that makes it harder to accidentally 
avoid adding the appropriate info.



--
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:20:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.h@441
PS9, Line 441:   friend class ClientRequestState;
This is required so that ClientRequestState can call UnregisterSessionTimeout() 
and RegisterSessionTimeout(), right?

Would it be possible to make Register- and UnregisterTimeout() public and get 
rid of the friend class?


http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc@1240
PS9, Line 1240:   std::to_string(this->session_timeout),
I have no knowledge about how threading works around this part of the code. Do 
you know if a race-condition is possible around this->session_timeout?


http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc@1816
PS9, Line 1816: session_timeout_cv_.NotifyOne();
In RegisterSessionTimeout() the notify_one call is also protected by the lock. 
For my benefit, could you explain me why it was not required here?


http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.h@99
PS9, Line 99:   QUERY_OPT_FN(idle_session_timeout, IDLE_SESSION_TIMEOUT)\
For the sake of IMPALA-2181, could you find out which of the 4 new option group 
this one should belong to? (regular, advanced, development, deprecated). Guess 
some of the regular of advanced groups, right?


http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.cc@578
PS9, Line 578: "Only positive numbers are allowed.", value));
I think the '4 spaces for indentation' rule applies here as well. I would 
either add 2 more spaces or indent the beginning of the string to match the one 
above (if that didn't make the line too long).


http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py
File tests/custom_cluster/test_session_expiration.py:

http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py@62
PS9, Line 62: sleep(3)
Could you please re-work these sleep times a little bit to reduce the runtimes 
of these tests?
These 2 tests spend 14 secs just on sleep.


http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py@70
PS9, Line 70: sleep(8)
I learnt this week that running the test suite on RELEASE and ASAN build might 
give you tricky results. Once re-worked these numbers, could you please verify 
that the tests pass also on those build?

(For me a query that took like 0,5 sec to start fetching results apparently 
succeeded in a DEBUG build but on RELEASE and ASAN failed even if modified the 
sleep from 0,5 to 1.5 sec)



-- 
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:51:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()

2017-11-22 Thread Kim Jin Chul (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8629

to look at the new patch set (#2).

Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME()
..

IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME()

The FROM_UNIXTIME(epoch) and FROM_UNIXTIME(epoch, format) produce
different results when epoch is out of range of TimestampValue.
The former produces an empty string, while the latter gives NULL.

The fix is to harmonize the results to NULL.

Testing:
Add unit tests to ExprTest.TimestampFunctions.

Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2
---
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
2 files changed, 5 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/8629/2
--
To view, visit http://gerrit.cloudera.org:8080/8629
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2
Gerrit-Change-Number: 8629
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()

2017-11-22 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8629 )

Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME()
..


Patch Set 1:

(1 comment)

Thanks for your comment

http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc@84
PS1, Line 84: ToUnixTime
> i think it would be clearer to just call tv.HasDateAndTime() directly to sh
Done



--
To view, visit http://gerrit.cloudera.org:8080/8629
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2
Gerrit-Change-Number: 8629
Gerrit-PatchSet: 1
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:58:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure

2017-11-22 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8637


Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation 
procedure
..

IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure

Change the wording to match up with the earlier instructions
saying to build from source.

Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c
---
M docs/topics/impala_install.xml
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/8637/1
--
To view, visit http://gerrit.cloudera.org:8080/8637
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c
Gerrit-Change-Number: 8637
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5053: [SECURITY] Make KRPC work with Kerberos

2017-11-22 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8270 )

Change subject: IMPALA-5053: [SECURITY] Make KRPC work with Kerberos
..


Patch Set 5:

(14 comments)

Sorry for the delay. Some more comments.

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/auth-provider.h@27
PS5, Line 27: #include "util/thread.h"
What is this for ?


http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/auth-provider.h@78
PS4, Line 78:   /// Wrap the client transport with a new TSaslClientTransport.  
This is only for
:   /// internal connections.  Since, as a daemon, we only do 
Kerberos and not LDAP,
:   /// we can go straight to Kerberos.
:   virtual Status WrapClientTransport(const std::string& hostname,
:   boost::shared_ptr 
raw_transport,
:   const std::string& service_name,
:   boost::shared_ptr* 
wrapped_transport);
Is this only used for thrift connections ? May be it helps to state it now that 
we have two different RPC mechanisms in our code.


http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/authentication.cc@643
PS4, Line 643: kudu::rpc::DisableSaslInitialization()
Can you please add a comment as to why this is necessary and how is SASL 
initialized when KRPC is enabled ?


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@643
PS5, Line 643: KUDU_RETURN_IF_ERROR(kudu::rpc::DisableSaslInitialization(),
 : "Unable to disable Kudu RPC SASL initialization.");
Please add a small comment that this overlaps with the initialization below.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@658
PS5, Line 658: // We assume all impala processes are both server and client.
 : sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, appname);
 : sasl::TSaslClient::SaslInit(GENERAL_CALLBACKS);
Once we completely move away from Thrift, do we rely on Kudu to initialize the 
Sasl library ?


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@1034
PS5, Line 1034:   
RETURN_IF_ERROR(GetInternalKerberosPrincipal(&kerberos_internal_principal));
  :   
RETURN_IF_ERROR(GetExternalKerberosPrincipal(&kerberos_external_principal));
The internal and external principals look like good candidates to be stored in 
ExecEnv and should be initialized once.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@1036
PS5, Line 1036:
nit: blank line not needed.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc@63
PS5, Line 63: !FLAGS_principal.empty()
IsKerberosEnabled()


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc@70
PS5, Line 70: bld.set_sasl_proto_name(service_name);
It may be better to set FLAGS_rpc_authentication (defined by Kudu) to 
"required" to make sure that Kerberos must be enabled if it's enabled via 
FLAGS_principal.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h
File be/src/testutil/mini-kdc-wrapper.h:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h@53
PS5, Line 53: //
nit: ///


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h@86
PS5, Line 86:   Status StopKdc();
Please add comments about what clean up it may do.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc
File be/src/testutil/mini-kdc-wrapper.cc:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc@67
PS5, Line 67: >
!= seems to make less assumption about the ordering of the ENUM values.


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc@88
PS5, Line 88: >
!=


http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/util/auth-util.cc
File be/src/util/auth-util.cc:

http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/util/auth-util.cc@86
PS5, Line 86:   const Status BAD_FORMAT_STATUS = Status(strings::Substitute(
:   "Kerberos principal should be of the form: 
/@ - got: $0",
:   principal))
Should this live in generate_error_codes.py ?

Also, our existing implementation of Status has the unfortunate effect of 
creating a backtrace and logging to log file when creating a Status object. May 
be it's not preferable to pre-fabricate one here or it may confuse users to see 
this error message in 

[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure

2017-11-22 Thread John Russell (Code Review)
John Russell has abandoned this change. ( http://gerrit.cloudera.org:8080/8637 )

Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation 
procedure
..


Abandoned

Made this gerrit based on master, rather than a private branch, by mistake.
--
To view, visit http://gerrit.cloudera.org:8080/8637
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c
Gerrit-Change-Number: 8637
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure

2017-11-22 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8638


Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation 
procedure
..

IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure

Change the wording to match up with the earlier instructions
saying to build from source.

Change-Id: I1ea14e50b2d9a952091a316b637d7ea02c2b6ffd
---
M docs/topics/impala_install.xml
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8638/1
--
To view, visit http://gerrit.cloudera.org:8080/8638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1ea14e50b2d9a952091a316b637d7ea02c2b6ffd
Gerrit-Change-Number: 8638
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-11-22 Thread Pranay Singh (Code Review)
Pranay Singh has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 17:

> This is still on my radar, need to get back to it.

Thanks Tim


--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 17
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 02:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1518/


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 02:48:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8588 )

Change subject: IMPALA-4927: Impala should handle invalid input from Sentry
..

IMPALA-4927: Impala should handle invalid input from Sentry

Impala requests a list of roles from Sentry and then asks for privileges
for each role. If Sentry returns a non existent role in the first step,
then there will be a Java exception in Impala in the second step and
the communication with Sentry is aborted.

The issue is fixed by handling the exception if an invalid role is
found and continue with getting permissions for the rest of the roles.

Testing:
---
Since invalid role could not be created through impala-shell/Hue
interface the code was instrumented to have an invalid Role " ",
and SHOW ROLES statement was executed from impala shell to see
how the condition is handled.

Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b
Reviewed-on: http://gerrit.cloudera.org:8080/8588
Reviewed-by: Alex Behm 
Reviewed-by: Zach Amsden 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
1 file changed, 10 insertions(+), 3 deletions(-)

Approvals:
  Alex Behm: Looks good to me, but someone else must approve
  Zach Amsden: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b
Gerrit-Change-Number: 8588
Gerrit-PatchSet: 11
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8588 )

Change subject: IMPALA-4927: Impala should handle invalid input from Sentry
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b
Gerrit-Change-Number: 8588
Gerrit-PatchSet: 10
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 23 Nov 2017 03:25:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#6).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
3 files changed, 117 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/6
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1521/


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 04:19:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 6: Code-Review+2

Trivial test fix


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 04:19:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila 
Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8447

to look at the new patch set (#21).

Change subject: IMPALA-2181: Add query option levels for display
..

IMPALA-2181: Add query option levels for display

Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.

When the query options are displayed through the SET SQL statement then the
result set would contain an extra column indicating the level of each option.
Similarly to Impala shell here the SET command only diplays the REGULAR and
ADVANCED options while SET ALL shows them all.

If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.

Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
---
M be/src/service/child-query.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/Frontend.thrift
M common/thrift/beeswax.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/SetStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_client.py
M shell/impala_shell.py
M testdata/workloads/functional-query/queries/QueryTest/set.test
M tests/custom_cluster/test_set_and_unset.py
M tests/hs2/test_hs2.py
M tests/shell/test_shell_interactive.py
18 files changed, 486 insertions(+), 226 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/21
--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 21
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 21:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG@14
PS20, Line 14: the SET SQL s
> I think that should say "the SET SQL statement"  or similar, since SET can
Done


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@541
PS20, Line 541: ets the optio
> nit: delete (Obvious that the comment is talking about this function).
Done


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@544
PS20, Line 544: query option.
> if that's modified, you should use ConfigVariable *.  We prefer to use poin
Done


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@545
PS20, Line 545: lToConfig(
> that's not explained. and is it even needed -- why not just use the key fro
I didn't want to make this function dependent on whether the key field of the 
ConfigVariable was set by the caller or not. I found it safer this way.
I'll mention this in the comment.


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc@1236
PS20, Line 1236:
> does query_option_levels_ not contain all keys? i.e. should line 1233 be a
In fact, a DCHECK is enough here. (Changing this to a DCHECK actually caught a 
bug: Where support_start_over was added to the options map the name should have 
been added with uppercase letters.)


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@34
PS20, Line 34: Maps query
> comments shouldn't talk about current code vs. older code. They should just
Done


http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@175
PS20, Line 175: QueryOptionLevels*
> use pointer
Done


http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift
File common/thrift/beeswax.thrift:

http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift@98
PS20, Line 98:
> maybe we should be super explicit here and say:
Done



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 21
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Nov 2017 05:56:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Reviewed-on: http://gerrit.cloudera.org:8080/8631
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
3 files changed, 117 insertions(+), 50 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 23 Nov 2017 07:48:17 +
Gerrit-HasComments: No