[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

2017-10-27 Thread Tianyi Wang (Code Review)
Tianyi Wang has abandoned this change. ( http://gerrit.cloudera.org:8080/8410 )

Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter 
updating
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/8410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
Gerrit-Change-Number: 8410
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

2017-10-27 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8410 )

Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter 
updating
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180
PS1, Line 1180:   bloom_filter_.directory.clear();
> shrink_to_fit in the next line should free the memory. It's not enforced by
It turns out string() has capacity 15 in GCC5.2 and 0 in GCC4.9. I think we 
should just track size() here since there is no guarantee on anything about 
capacity. I will abandon this.



--
To view, visit http://gerrit.cloudera.org:8080/8410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
Gerrit-Change-Number: 8410
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Sat, 28 Oct 2017 05:06:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

2017-10-27 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8410 )

Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter 
updating
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180
PS1, Line 1180:   bloom_filter_.directory.clear();
> I think the bigger problem here is actually that the memory of the string i
shrink_to_fit in the next line should free the memory. It's not enforced by the 
standard, but the standard didn't specify the capacity of a newly constructed 
string() as well.



--
To view, visit http://gerrit.cloudera.org:8080/8410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
Gerrit-Change-Number: 8410
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Sat, 28 Oct 2017 05:00:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

2017-10-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8410 )

Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter 
updating
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180
PS1, Line 1180:   bloom_filter_.directory.clear();
I think the bigger problem here is actually that the memory of the string is 
not freed (meaning we have a big chunk of untracked memory!). You could create 
a new string on the stack and move() the contents into that string, so the 
memory is freed.

Then I think the old assumption of 0 capacity is valid again.



--
To view, visit http://gerrit.cloudera.org:8080/8410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
Gerrit-Change-Number: 8410
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Sat, 28 Oct 2017 04:41:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

2017-10-27 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8410


Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter 
updating
..

IMPALA-6118: Fix assertion failure in coordinator bloom filter updating

Coordinator::UpdateFilter wrongly assumed std::string::capacity() to be
0 after std::string::clear(), resulting in an DCHECK failure in memory
tracking. This patch uses std::string::shrink_to_fit, calculates the
difference of the capacity and releases memory in the memory tracker
accordingly.

Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
---
M be/src/runtime/coordinator.cc
1 file changed, 7 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/8410/1
--
To view, visit http://gerrit.cloudera.org:8080/8410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed
Gerrit-Change-Number: 8410
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 2:

Any idea how this test succeeded in the pre-commit tests?


--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 04:01:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 2:

Actually nvm, I see it only fails i exhaustive. Makes sense.


--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 04:02:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

2017-10-27 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8405 )

Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py
File bin/load-data.py:

http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py@114
PS2, Line 114: # When HiveServer2 is configured to use "local" mode (i.e., MR 
jobs are run
> Let's mention the HADOOP JIRA in case it's ever fixed and we can remove the
Good idea. Done.



--
To view, visit http://gerrit.cloudera.org:8080/8405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Gerrit-Change-Number: 8405
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 03:48:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

2017-10-27 Thread Philip Zeyliger (Code Review)
Hello Joe McDonnell, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8405

to look at the new patch set (#3).

Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
..

IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

This is a revert of a revert, re-enabling parallel data load.  It avoid
the race condition by explicitly configuring the temporary directory in
question in load-data.py.

When the parallel data load change went in, we discovered
a race with a signature of:

  java.io.FileNotFoundException: File
  /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist

The number in this path is milliseconds since the epoch, and the race
occurs when two queries submitted to HiveServer2, running with the local
runner, hit the same millisecond time stamp.  The upstream bug is
https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which
is now marked as a dupe).

I've tested this by running data load 5 times on the same machines
where it failed before. I also ran data load manually and inspected
the system to make sure that the temporary directories are getting
created as expected in /tmp/impala-data-load-*.

Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
---
M bin/load-data.py
M testdata/bin/create-load-data.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-step.sh
4 files changed, 59 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/3
--
To view, visit http://gerrit.cloudera.org:8080/8405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Gerrit-Change-Number: 8405
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 02:51:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..

IMPALA-5243: Speed up code gen for wide Avro tables.

HdfsAvroScanner::CodegenMaterializeTuple generates a function linear in
size to the number of columns. On 1000 column tables, codegen time is
significant. This commit roughly halves it for wide columns.
(Note that this had been much worse in recent history (<= Impala 2.9).)

It does so by breaking up MaterializeTuple() into multiple smaller
functions, and then calls them in order. When breaking up into
200-column chunks, there is a noticeable speed-up.

I've made the helper code for generating LLVM function prototypes
have a mutable function name, so that the builder can be re-used
multiple times.

I've checked by inspecting optimized LLVM that in the case where there's
only 1 helper function, code gets inlined so that there doesn't seem to
be an extra function.

I measured codegen time for various "step sizes." The case where there
are no helper functions is about 2.7s. The best case was about a step
size of 200, with timings of 1.3s.

For the query "select count(int_col16) from 
functional_avro.widetable_1000_cols",
codegen times as a function of step size are roughly as follows. This is
averaged across 5 executions, and rounded to 0.1s.

   step time
 10 2.4
 50 2.5
 75 2.9
100 3.0
125 3.0
150 1.4
175 1.3
200 1.3 <-- chosen step size
225 1.5
250 1.4
300 1.6
400 1.6
500 1.8
   1000 2.7

The raw data was generated like so, with some code that let me change the step 
size at runtime:

  $(for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do for 
try in $(seq 5); do echo $step > /tmp/step_size.txt; echo -n "$step "; 
impala-shell.sh -q "select count(int_col16) from 
functional_avro.widetable_1000_cols; profile;" 2> /dev/null | grep -A9 
'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' |
  sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo; done; done) | tee 
out.txt
  ...
  200  CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) 
CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns 
ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K 
OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms
  ...
  1000  CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) 
CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns 
ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K 
OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms

I have run the core tests with this change.

Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Reviewed-on: http://gerrit.cloudera.org:8080/8211
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/codegen/llvm-codegen.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
3 files changed, 158 insertions(+), 100 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 9
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..

IMPALA-1575: Part 1: eagerly release query exec resources

Release of backend resources for query execution on the coordinator
daemon should occur eagerly as soon as query execution is finished
or cancelled. Before this patch some resources managed by QueryState,
like scratch files, were only released when the query was closed
and all of the query's control structures were torn down.

These resources are referred to as "ExecResources" in various places to
distinguish them from resources associated with the client request (like
the result cache) that are still required after the query finishes
executing.

This first change does not solve the admission control problem for two
reasons:
* We don't release the "admitted" memory on the coordinator until
  the query is unregistered.
* Admission control still considers the memory reserved until the
  query memtracker is unregistered, which happens only when the
  QueryState is destroyed: see MemTracker::GetPoolMemReserved().

The flow is mostly similar to initial_reservation_refcnt_, except the
coordinator also holds onto a reference count, which is released
when either the final row is returned or cancellation is initiated.
After the coordinator releases its refcount, the resources can be
freed as soon as local fragments also release their refcounts.

Also clean up Coordinator slightly by preventing runtime_state() from
leaking out to ClientRequestState - instead it's possible to log
the query MemTracker by following parent links in the MemTracker.

This patch is partially based on Joe McDonnell's IMPALA-1575 patch.

Testing:
Ran core tests.

Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Reviewed-on: http://gerrit.cloudera.org:8080/8303
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/test-env.cc
M be/src/service/client-request-state.cc
10 files changed, 140 insertions(+), 89 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 16: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 02:45:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1413/


--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 02:35:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 2: Code-Review+2

I'll +2 this to unblock our exhaustive tests. Please follow up with Alex on 
whether this is problem was expected b/c of IMPALA-886.


--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 02:35:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Bikramjeet Vig (Code Review)
Hello Lars Volker, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8409

to look at the new patch set (#2).

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..

IMPALA-6123: Fix column order of a query test in test_inline_view_limit

Currently a "select *" query in test_inline_view_limit fails during
exhaustive testing because Impala returns columns from HBase tables
in a different order (IMPALA-886) than the one expected. This fix
ensures the column order is consistent by specifying the output
columns in the right order in the select query.

Testing:
Tested locally, with and without exhaustive exploration strategy.

Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
---
M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8409/2
--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG@10
PS1, Line 10: because the output column order is different from
: the expected one for it corresponding hbase table
> how about: ... because Impala returns columns from HBase tables in a differ
yes this sounds better.



--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 02:29:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..

IMPALA-6093: diagnostics for flaky TestHashJoinTimer

We don't know the root cause yet but try to improve things:
* Eliminate one possible cause of flakiness - unfinished fragments left
  from previous queries.
* Print a profile if an assertion fails so we can see why it failed.

Testing:
Ran core tests.

Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Reviewed-on: http://gerrit.cloudera.org:8080/8403
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M tests/query_test/test_hash_join_timer.py
1 file changed, 18 insertions(+), 6 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 01:55:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8409 )

Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG@10
PS1, Line 10: because the output column order is different from
: the expected one for it corresponding hbase table
how about: ... because Impala returns columns from HBase tables in a different 
order (IMPALA-886).



--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 01:19:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit

2017-10-27 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8409


Change subject: IMPALA-6123: Fix column order of a query test in 
test_inline_view_limit
..

IMPALA-6123: Fix column order of a query test in test_inline_view_limit

Currently a "select *" query in test_inline_view_limit fails during
exhaustive testing because the output column order is different from
the expected one for it corresponding hbase table. This fix ensures the
column order is consistent by specifying the output columns in the
right order in the select query.

Testing:
Tested locally, with and without exhaustive exploration strategy.

Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
---
M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8409/1
--
To view, visit http://gerrit.cloudera.org:8080/8409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330
Gerrit-Change-Number: 8409
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8405 )

Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py
File bin/load-data.py:

http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py@114
PS2, Line 114: # When HiveServer2 is configured to use "local" mode (i.e., MR 
jobs are run
Let's mention the HADOOP JIRA in case it's ever fixed and we can remove the 
workaround.



--
To view, visit http://gerrit.cloudera.org:8080/8405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Gerrit-Change-Number: 8405
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 28 Oct 2017 00:14:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-27 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,38) |
++

DECIMAL V2:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,6)  |
++

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
4 files changed, 387 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/3
--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-27 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31
PS1, Line 31: to avoid this by first checking if the result would into int128.
> fit?
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166
PS2, Line 166: LeadingZeroAdjustment
> I think the name could make it clearer what it's computing. Maybe something
Done. I like the new name.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170
PS2, Line 170: floor(log2(b))
> The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc,
Yes, exactly. Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181
PS2, Line 181: same_sign
> We usually make numeric/bool template arguments upper case to make it clear
Separated this large function into 3 smaller ones, as you suggested.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185
PS2, Line 185:   // This is expected to be handled by the caller.
> What about the case when they're zero? That's also meant to be handled by t
It wasn't possible for this condition to fail because at least one must be 
greater than zero for the following reasons:
- If they are both zero, we will not be calling this function, because of the 
leading zero test.
- If one is negative and the other is zero, we will invert it on line 327

I modified this check in the new implementation.

Yes we do have cases, where we add zero.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272
PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale));
> A brief comment that this is guaranteed by the frontend migh help people ne
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277
PS2, Line 277:   DCHECK(!*overflow) << "Cannot overflow unless result is 
Decimal16Value";
> extra indent here
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287
PS2, Line 287:   if (this_scale < other_scale) {
> It might be slightly easier to follow if the branches were(delta_scale < 0)
Rewrote the code to make it more clear.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299
PS2, Line 299: into
> long line.
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, 
other_scale,
> It feels a bit weird that we're calculating delta_scale above and inside Ad
Delta scale is different in AdjustToSameScale(). There are two delta scales in 
this patch: between x and y and between max(x_scale, y_scale) and result_scale.

What we are doing here and in AddLarge() is first adjust to the same scale, do 
the addition, then scale down to result_scale.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310: if (delta_scale > 0) {
> This might need a brief comment to explain why it's necessary (or perhaps t
Added comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228
PS2, Line 228:* TODO: implement V2 rules for ADD/SUB.
> TODO not needed.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Sat, 28 Oct 2017 00:02:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-10-27 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 12:

(5 comments)

I think this is very close.

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h@352
PS12, Line 352:   MemTracker* dict_mem_tracker() { return 
scan_node_->mem_tracker(); }
See comment in HdfsParquetTableWriter. Also, see how we pass the mem_tracker to 
data_page_pool_.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h
File be/src/exec/hdfs-parquet-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h@81
PS12, Line 81:   /// Memory tracker to track the memory used by encoder.
 :   MemTracker* dict_mem_tracker();
When I'm reading the code, I'm thinking that this is hiding more than it is 
illuminating. We use this in exactly one place. I would rather have the exact 
code right there with a good comment explaining which mem_tracker we are using.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@179
PS12, Line 179:   dict_encoder_base_->ClearIndices();
  :   dict_encoder_base_->Close();
When Close() does a call to ClearIndices(), this can be simplified to just a 
Close() call.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@322
PS12, Line 322:plain_encoded_value_size_,
  :parent_->dict_mem_tracker()));
Nit: Indentation in this case should be even with the "D" in new DictEncoder. 
It goes against every editor's typical behavior, but it's the standard we have. 
Also, feel free to keep plain_encoded_value_size_ on the first line. (It's on 
the edge of our 90 char limit.)


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63
PS12, Line 63:   void Close() {
 : ReleaseBytes();
I think Close() should do ClearIndices().



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 12
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 23:59:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

2017-10-27 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8372 )

Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use 
of TRUNCATE
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG@18
PS3, Line 18: (unless done so explicitly).
Better to say, "unless the test does so explicitly."



--
To view, visit http://gerrit.cloudera.org:8080/8372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
Gerrit-Change-Number: 8372
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Fri, 27 Oct 2017 23:07:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-10-27 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 2:

(2 comments)

I'll punt to Sailesh for the answer to one of Henry's questions.

http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml
File docs/topics/impala_ssl.xml:

http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@171
PS2, Line 171: This value is used in some organizations to disallow TLS 1.0 and 
1.1.
> This seems redundant, as that's what "Allow any TLS version of 1.2 higher."
Hmm I was trying to come up a subtle way to indicate, "consider using this 
value if your organization is security-conscious". I'm not an expert on TLS/SSL 
vulns but I did turn up this one that suggests some problems are in both 1.0 
and 1.1 but not 1.2. 
https://nakedsecurity.sophos.com/2013/02/07/boffins-crack-https-encryptionin-lucky-thirteen-attack/


http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@177
PS2, Line 177: TLSv1.2 may not work
> How does it 'not work' - does the daemon fail to start, or does the daemon
Good question for Sailesh!



--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 23:04:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:55:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1412/


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1411/


--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 16: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Hello Sailesh Mukil, Joe McDonnell, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8303

to look at the new patch set (#15).

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..

IMPALA-1575: Part 1: eagerly release query exec resources

Release of backend resources for query execution on the coordinator
daemon should occur eagerly as soon as query execution is finished
or cancelled. Before this patch some resources managed by QueryState,
like scratch files, were only released when the query was closed
and all of the query's control structures were torn down.

These resources are referred to as "ExecResources" in various places to
distinguish them from resources associated with the client request (like
the result cache) that are still required after the query finishes
executing.

This first change does not solve the admission control problem for two
reasons:
* We don't release the "admitted" memory on the coordinator until
  the query is unregistered.
* Admission control still considers the memory reserved until the
  query memtracker is unregistered, which happens only when the
  QueryState is destroyed: see MemTracker::GetPoolMemReserved().

The flow is mostly similar to initial_reservation_refcnt_, except the
coordinator also holds onto a reference count, which is released
when either the final row is returned or cancellation is initiated.
After the coordinator releases its refcount, the resources can be
freed as soon as local fragments also release their refcounts.

Also clean up Coordinator slightly by preventing runtime_state() from
leaking out to ClientRequestState - instead it's possible to log
the query MemTracker by following parent links in the MemTracker.

This patch is partially based on Joe McDonnell's IMPALA-1575 patch.

Testing:
Ran core tests.

Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/test-env.cc
M be/src/service/client-request-state.cc
10 files changed, 140 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/15
--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 14: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87
PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) {
> maybe explain that this is the test constructor
Done



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:53:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-27 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@123
PS6, Line 123: until
 :   // any in-flight RPC completes
if the preceding RPC is still in-flight.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@135
PS6, Line 135: free
frees


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@258
PS6, Line 258: stack
code


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@261
PS6, Line 261: stack
code


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@264
PS6, Line 264: thread
threads


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@279
PS6, Line 279: thread
threads


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@104
PS6, Line 104: CachedProtobufRowBatch
OutboundRowBatch


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@144
PS6, Line 144:   /// Populate a row batch from a serialized protobuf 
input_batch by copying
 :   /// input_batch's tuple_data into the row batch's mempool and 
converting all
 :   /// offsets in the data back into pointers.
stale


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@455
PS6, Line 455: tuple_offsets
input_*


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@531
PS6, Line 531: CachedProtobufRowBatch
OutboundProtoRowBatch


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc
File be/src/util/network-util.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc@41
PS6, Line 41: using kudu::Sockaddr;
undo. Bad rebase.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc@120
PS6, Line 120: Sockaddr sock;
undo


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@38
PS6, Line 38: 'tuple_offsets'
tuple offsets' buffer


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@43
PS6, Line 43: 'tuple_data'
tuple data's buffer


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto
File common/protobuf/row_batch.proto:

http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto@25
PS6, Line 25: The indices of the sidecars are included in the header below.
delete



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:41:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

2017-10-27 Thread Tim Wood (Code Review)
Tim Wood has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/8372 )

Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use 
of TRUNCATE
..

IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

The .test files in this commit are ones held out from IMPALA-5376
because of observed anomalies with returned decimal precision that I
previously controlled with TRUNCATE().  This ticket was filed after team
members expressed a preference not to use TRUNCATE() and to use
actual results generated within numerical range of expected,
unless the results could not be controlled otherwise.  Rounds of testing
for this commit show that the earlier anomalies no longer occur,
clearing the way for their inclusion in our TPC-DS suite.  It has been
observed that these tests tend to fail with the DECIMAL_V2 option set
(unless done so explicitly).

Testing: The gerrit_verify_dryrun_external (build #24) job passes with this
change.  The fix (rebased hereon) to IMPALA-6106 (finally) cleared up a
source of flapping for these tests.

Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
---
A testdata/workloads/tpcds/queries/tpcds-q26.test
A testdata/workloads/tpcds/queries/tpcds-q28.test
A testdata/workloads/tpcds/queries/tpcds-q47.test
A testdata/workloads/tpcds/queries/tpcds-q57.test
A testdata/workloads/tpcds/queries/tpcds-q59.test
A testdata/workloads/tpcds/queries/tpcds-q63.test
M tests/query_test/test_tpcds_queries.py
7 files changed, 837 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8372/3
--
To view, visit http://gerrit.cloudera.org:8080/8372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
Gerrit-Change-Number: 8372
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 14: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87
PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) {
maybe explain that this is the test constructor



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:21:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..

IMPALA-2758: Remove BufferedTupleStream::GetRows

This patch removes BufferedTupleStream::GetRows. This function pins a
stream and reads all the rows into a single batch. It is not a good API
since it creates an arbitrarily large row batch. In this patch the call
sites pin the stream and then directly use GetNext to retrieve a single
batch at a time.

Testing: It passes existing tests. A test case for GetRows is removed.

Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Reviewed-on: http://gerrit.cloudera.org:8080/8226
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
5 files changed, 63 insertions(+), 99 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:15:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1410/


--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:12:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 3: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:11:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:04:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3436: Return a decimal when rounding a double

2017-10-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8398 )

Change subject: IMPALA-3436: Return a decimal when rounding a double
..


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/8398/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8398/1//COMMIT_MSG@16
PS1, Line 16: This is implemented by introducing a "hack" where we make the 
parser
How about we say rewrite or translation instead of hack? I don't think it's 
really a terrible hack - it's a minimal change.


http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc
File be/src/exprs/decimal-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc@111
PS1, Line 111:   DCHECK(!scale.is_null);
Add a FE test case in AnalyzeExprsTest.java that makes sure this is enforced. 
The existing round() functions that take a DECIMAL argument must have a 
non-NULL, constant second argument. New tests should be added to make sure that 
is also enforced for your new round() function.

There's also a subtle change in behavior we should consider. The existing 
round() function accepts non-constant arguments, should the new round 
function() also allow that or not?


http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc@116
PS1, Line 116:   bool ovf = false;
overflow


http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions.h
File be/src/exprs/decimal-functions.h:

http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions.h@58
PS1, Line 58:   FunctionContext*, const DoubleVal&, const IntVal&);
add arg names for consistency


http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@284
PS1, Line 284:   [['round_v1','dround_v1'],
Why keep the dround_v1 alias?


http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@286
PS1, Line 286:   [['round','dround'], 'DECIMAL', ['DOUBLE', 'INT'], 
'impala::DecimalFunctions::RoundTo'],
Can you organize these and comment on whether these are used in v1/v2 or both? 
Seems confusing now.


http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@287
PS1, Line 287:   [['round','dround','round_v1','dround_v1'],
Why keep the dround_v1 alias?


http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@403
PS1, Line 403:   [['round','dround','round_v1','dround_v1'],
Why keep the dround_v1 alias in these?


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/cup/sql-parser.cup@70
PS1, Line 70:   private TQueryOptions queryOptions;
add TODO to remove when decimal v1 is gone, maybe come up with a specific thing 
you can grep for later in the code, e.g. DECIMAL_V1 and use that consistently 
in the TODOs for removal


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@100
PS1, Line 100: FunctionCallExpr functionCallExpr = new 
FunctionCallExpr(fnName, params);
newline


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@111
PS1, Line 111: // nullif(x, y) -> if(x DISTINCT FROM y, x, NULL)
newline


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@385
PS1, Line 385:* This can only be called for functions that return wildcard 
decimals and the first
Is this comment still accurate?


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@407
PS1, Line 407:   // The situtation where none of the parameters are 
decimal, but the return type is
* Shrink comment to:
None of the parameters are decimal but the return type is decimal.

* It's not clear to me why we'd use (38,0) in this case, can you explain? For 
example, round(double, int) does not have decimal args. Shouldn't the return 
type be (38,X) where X is the value of the second arg if it is a constant?



--
To view, visit http://gerrit.cloudera.org:8080/8398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I240ec70da7996bbfefcf6438eba4dff8d33d7bd6
Gerrit-Change-Number: 8398
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Fri, 27 Oct 2017 

[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

2017-10-27 Thread Philip Zeyliger (Code Review)
Hello Joe McDonnell, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8405

to look at the new patch set (#2).

Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
..

IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

This is a revert of a revert, re-enabling parallel data load.  It avoid
the race condition by explicitly configuring the temporary directory in
question in load-data.py.

When the parallel data load change went in, we discovered
a race with a signature of:

  java.io.FileNotFoundException: File
  /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist

The number in this path is milliseconds since the epoch, and the race
occurs when two queries submitted to HiveServer2, running with the local
runner, hit the same millisecond time stamp.  The upstream bug is
https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which
is now marked as a dupe).

I've tested this by running data load 5 times on the same machines
where it failed before. I also ran data load manually and inspected
the system to make sure that the temporary directories are getting
created as expected in /tmp/impala-data-load-*.

Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
---
M bin/load-data.py
M testdata/bin/create-load-data.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-step.sh
4 files changed, 58 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/2
--
To view, visit http://gerrit.cloudera.org:8080/8405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Gerrit-Change-Number: 8405
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

2017-10-27 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8405


Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
..

IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

This is a revert of a revert, re-enabling parallel data load.  It avoid
the race condition by explicitly configuring the temporary directory in
question in load-data.py.

When the parallel data load change went in, we discovered
a race with a signature of:

  java.io.FileNotFoundException: File
  /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist

The number in this path is milliseconds since the epoch, and the race
occurs when two queries submitted to HiveServer2, running with the local
runner, hit the same millisecond time stamp.  The upstream bug is
https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which
is now marked as a dupe).

I've tested this by running data load 5 times on the same machines
where it failed before. I also ran data load manually and inspected
the system to make sure that the temporary directories are getting
created as expected in /tmp/impala-data-load-*.

Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
---
M bin/load-data.py
M testdata/bin/create-load-data.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-step.sh
4 files changed, 61 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/1
--
To view, visit http://gerrit.cloudera.org:8080/8405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Gerrit-Change-Number: 8405
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8:

> Patch Set 8:
>
> > Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is
>  > not in pre-review-test, so this was my first experience with that.
>
> You can also use 
> https://jenkins.impala.io/view/Utility/job/gerrit-verify-dryrun-external/ , 
> which does run clang-tidy.

Thanks for the pointer. I'm giving it a shot at 
https://jenkins.impala.io/job/gerrit-verify-dryrun-external/25/

My local clang-tidy build was clean:

$ bin/run_clang_tidy.sh | tee clang_tidy.txt
$ cat clang_tidy.txt | grep ']' 
   $


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 21:39:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8:

> Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is
 > not in pre-review-test, so this was my first experience with that.

You can also use 
https://jenkins.impala.io/view/Utility/job/gerrit-verify-dryrun-external/ , 
which does run clang-tidy.


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 21:32:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-27 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 8:

TPCDS results:

 
+-++++-++++-+---+
 | Workload| Query  | File Format| 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | 
Iters |
 
+-++++-++++-+---+
 | TPCDS(_300) | TPCDS-Q73  | kudu / none / none | 
12.01  | 7.99| R +50.42%  |   0.31%|   1.04%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q68  | kudu / none / none | 
17.02  | 13.00   | R +30.97%  |   0.82%|   0.26%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q79  | kudu / none / none | 
16.61  | 12.95   | R +28.31%  |   0.56%|   0.54%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q1   | kudu / none / none | 2.36 
  | 1.99|   +18.19%  |   2.86%|   1.38%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q53  | kudu / none / none | 2.18 
  | 2.08|   +4.96%   |   0.32%|   1.98%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q8   | kudu / none / none | 4.03 
  | 3.93|   +2.38%   |   0.43%|   0.38%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q98  | kudu / none / none | 7.80 
  | 7.67|   +1.72%   |   0.66%|   0.81%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q7   | kudu / none / none | 3.75 
  | 3.71|   +1.15%   |   1.14%|   1.17%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q4   | kudu / none / none | 
29.59  | 29.27   |   +1.11%   |   4.13%|   3.67%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q65  | kudu / none / none | 
13.97  | 14.01   |   -0.28%   |   2.70%|   1.43%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q28  | kudu / none / none | 4.06 
  | 4.08|   -0.35%   |   0.84%|   0.48%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q89  | kudu / none / none | 2.46 
  | 2.48|   -0.62%   |   2.60%|   1.81%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q55  | kudu / none / none | 2.44 
  | 2.46|   -0.83%   |   1.08%|   5.41%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q42  | kudu / none / none | 2.42 
  | 2.47|   -1.89%   |   0.26%|   4.50%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q23  | kudu / none / none | 
90.96  | 95.30   |   -4.55%   |   1.06%|   2.21%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q43  | kudu / none / none | 5.05 
  | 5.34|   -5.49%   |   0.89%|   4.17%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q3   | kudu / none / none | 4.11 
  | 4.36|   -5.84%   |   2.48%|   1.15%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q61  | kudu / none / none | 8.56 
  | 9.93|   -13.83%  |   0.78%|   1.84%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q2   | kudu / none / none | 
12.19  | 15.68   | I -22.28%  | * 27.57% * |   1.11%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q47  | kudu / none / none | 
16.82  | 21.82   | I -22.91%  |   1.16%|   1.25%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q19  | kudu / none / none | 4.78 
  | 6.32| I -24.29%  |   1.14%|   0.97%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q46  | kudu / none / none | 6.58 
  | 8.77| I -24.88%  |   0.86%|   1.01%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q88  | kudu / none / none | 
14.89  | 19.84   | I -24.95%  |   0.17%|   3.79%| 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q59  | kudu / none / none | 
24.05  | 35.61   | I -32.45%  | * 11.97% * | * 20.61% * | 1   | 
4 |
 | TPCDS(_300) | TPCDS-Q34  | kudu / none / none | 3.94 
  | 6.82| I -42.24%  |   2.37%|   4.12%| 1   | 4
 |
 | TPCDS(_300) | TPCDS-Q27  | kudu / none / none | 3.97 
  | 7.91| I -49.85%  |   

[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-10-27 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml
File docs/topics/impala_ssl.xml:

http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@171
PS2, Line 171: This value is used in some organizations to disallow TLS 1.0 and 
1.1.
This seems redundant, as that's what "Allow any TLS version of 1.2 higher." 
means.


http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@177
PS2, Line 177: TLSv1.2 may not work
How does it 'not work' - does the daemon fail to start, or does the daemon 
silently accept connections <1.2?



--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 21:03:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359
PS11, Line 359: was not available
> Okay, but my original question was whether the comment should be reworded f
Done


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95
PS11, Line 95:   if (!released_exec_resources_) ReleaseExecResources();
> What I proposed was not to handle it in Init(), but let the caller handle i
Should have read it more carefully, sorry. I try to avoid function APIs that 
require callers to undo side-effects of a failed function - that seems 
error-prone in general but I guess it is ok in this specific case.


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88
PS11, Line 88:   local_query_state_->AcquireExecResourceRefcount(); // 
Decremented in ReleaseResources().
> Why are exvaluating exprs query-wide resources. Aren't they per-thread reso
I'd consider per-thread execution resources to be a subset of query-wide 
execution resources. I.e. releasing a query's execution resources while threads 
are still consuming execution resources is not valid in my mind. The comments 
in QueryState weren't very clear about what holding a refcount meant so I tried 
to improve them.



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 20:57:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Hello Sailesh Mukil, Joe McDonnell, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8303

to look at the new patch set (#14).

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..

IMPALA-1575: Part 1: eagerly release query exec resources

Release of backend resources for query execution on the coordinator
daemon should occur eagerly as soon as query execution is finished
or cancelled. Before this patch some resources managed by QueryState,
like scratch files, were only released when the query was closed
and all of the query's control structures were torn down.

These resources are referred to as "ExecResources" in various places to
distinguish them from resources associated with the client request (like
the result cache) that are still required after the query finishes
executing.

This first change does not solve the admission control problem for two
reasons:
* We don't release the "admitted" memory on the coordinator until
  the query is unregistered.
* Admission control still considers the memory reserved until the
  query memtracker is unregistered, which happens only when the
  QueryState is destroyed: see MemTracker::GetPoolMemReserved().

The flow is mostly similar to initial_reservation_refcnt_, except the
coordinator also holds onto a reference count, which is released
when either the final row is returned or cancellation is initiated.
After the coordinator releases its refcount, the resources can be
freed as soon as local fragments also release their refcounts.

Also clean up Coordinator slightly by preventing runtime_state() from
leaking out to ClientRequestState - instead it's possible to log
the query MemTracker by following parent links in the MemTracker.

This patch is partially based on Joe McDonnell's IMPALA-1575 patch.

Testing:
Ran core tests.

Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/test-env.cc
M be/src/service/client-request-state.cc
10 files changed, 138 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/14
--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 8:

Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is not in 
pre-review-test, so this was my first experience with that.

I've made the trivial change (removed a const in llvm-codegen.h) and am running 
it through the relevant gauntlet locally.


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 8
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 20:56:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
..


Patch Set 14: Code-Review+1

Thanks. This approach looks okay to me, please see if Michael can do the +2 
review.


--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 20:33:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1407/


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 20:27:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-27 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@75
PS3, Line 75: g two
> see comment in row-batch.h about this terminology.
Done


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@178
PS3, Line 178:   // The two OutboundRowBatch which are re-used across multiple 
RPCs. Each entry contains
 :   // a RowBatchHeaderPB and the buffers for the serialized tuple 
offsets and data. When
 :   // one is being used for an in-flight RPC, the execution 
thread continues to run and
 :   // serializes another row batch into the other entry. 
'current_batch_idx_' is the index
 :   // of the entry being used by the in-flight or last completed 
RPC.
 :   //
 :   // TODO: replace this with an ac
> We need to access these fields from the callback (e.g. due to retry).
req_ removed in PS5.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@52
PS3, Line 52: class OutboundRowBatch {
> ProtoRowBatch is a conceptual representation of a serialized row batch in b
Removed in PS5.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@89
PS3, Line 89:
> Had hard time coming up with a good name to indicate the re-use of the vect
Renamed to OutboundRowBatch in PS5.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@431
PS3, Line 431:   ///
 :   /// 'uncompressed_size': Updated with the uncompressed size of 
'tuple_data'.
 :   ///
 :   /// 'is_compressed': Sets to true if compression is applied on 
'tuple_data'.
 :   /// False otherwise.
 :   ///
 :   /// Returns error status if serialization failed. Returns OK 
otherwise.
 :   /// TODO: clean this up once the thrift RPC implementation is 
removed.
 :   Status Serialize(bool full_dedup, vector* 
tuple_offsets, string* tuple_data,
 :   int64_t* uncompressed_size, bool* is_compressed);
 :
 :   /// Shared implementation between thrift and protobuf to 
deserialize a row batch.
 :   ///
 :   /// 'input_tuple_offsets': an int32_t array of tuples; offsets 
into 'input_tuple_data'.
 :   /// Used for populating the tuples in the row batch with 
actual pointers.
 :   ///
 :   /// 'input_tuple_data': contains pointer and size of tuples' 
data buffer.
 :   /// If 'is_compressed' is true, the data is compressed.
 :   ///
 :   /// 'uncompressed_size': the uncompressed size of 
'input_tuple_data' if it's compressed.
 :   ///
 :   /// 'is_compressed': True if 'input_tuple_data' is compressed.
 :   ///
 :   /// TODO: clean this up once the thrift RPC implementation is 
removed.
 :   void Deserialize(const kudu::Slice& tuple_offsets, const 
kudu::Slice& tuple_data,
 :   int64_t uncompressed_size, bool is_compressed);
 :
 :   typedef FixedSizeHashTable DedupMap;
 :
 :   /// The total size of all data represented in this row batch 
(tuples and referenced
 :   /// string and collection data). This is the size of the row 
batch after removing all
 :   /// gaps in the auxiliary and deduplicated tuples (i.e. the 
smallest footprint for the
 :   /// row batch). If the distinct_tuples argument is non-null, 
full deduplication is
 :   /// enabled. The distinct_tuples map must be empty.
 :   int64_t TotalByteSize(DedupMap* distinct_tuples);
 :
 :   void SerializeInternal(int64_t size, DedupMap* distinct_tuples,
 :   vector* tuple_offsets, string* tuple_data);
 :
 :   /// All members below need to be handled in R
> let's leave a TODO about cleaning this all up once we can remove the thrift
TODO added. Cannot file a JIRA as apache JIRA is being re-indexed.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/service/data-stream-service.cc@64
PS3, Line 64:
:
:
:
:
:
:
:
:
> see comment in row-batch.h.  I think we should do this later 

[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-27 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8023

to look at the new patch set (#6).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch implements a new data stream service which utilizes KRPC.
Similar to the thrift RPC implementation, there are 3 major components
to the data stream services: KrpcDataStreamSender serializes and sends
row batches materialized by a fragment instance to a KrpcDataStreamRecvr.
KrpcDataStreamMgr is responsible for routing an incoming row batch to
the appropriate receiver. The data stream service runs on the port
FLAGS_krpc_port which is 29000 by default.

Unlike the implementation with thrift RPC, KRPC provides an asynchronous
interface for invoking remote methods. As a result, KrpcDataStreamSender
doesn't need to create a thread per connection. There is one connection
between two Impalad nodes for each direction (i.e. client and server).
Multiple queries can multi-plex on the same connection for transmitting
row batches between two Impalad nodes. The asynchronous interface also
prevents some issues with thrift RPC in which a thread may be stuck in
an RPC call without being able to check for cancellation. A TransmitData()
call with KRPC is in essence a trio of RpcController, a serialized protobuf
request buffer and a protobuf response buffer. The call is invoked via a
DataStreamService proxy object. The serialized tuple offsets and row batches
are sent via "sidecars" in KRPC to avoid extra copy into the serialized
request buffer.

Each impalad node creates a singleton DataStreamService object at start-up
time. All incoming calls are served by a service thread pool created as part
of DataStreamService. By default, there are 64 service threads. The service
threads are shared across all queries so the RPC handler should avoid
blocking as much as possible. In thrift RPC implementation, we make a thrift
thread handling a TransmitData() RPC to block for extended period of time
when the receiver is not yet created when the call arrives. In KRPC
implementation, we store TransmitData() or EndDataStream() requests
which arrive before the receiver is ready in a per-receiver early sender list
stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to
when the receiver is created or when timeout occurs.

Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr.
If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit
to exceed, the request will be stashed in a blocked_sender_ queue to be 
processed
later. The stashed RPC request will not be responded to until it is processed
so as to exert back pressure to the client. An alternative would be to reply 
with
an error and the request / row batches need to be sent again. This may end up
consuming more network bandwidth than the thrift RPC implementation. This change
adopts the behavior of allowing one stashed request per sender.

All rpc requests and responses are serialized using protobuf. The equivalent of
TRowBatch would be ProtoRowBatch which contains a serialized header about the
meta-data of the row batch and two Kudu Slice objects which contain pointers to
the actual data (i.e. tuple offsets and tuple data).

This patch is based on an abandoned patch by Henry Robinson.

TESTING
---

* Build passed with FLAGS_use_krpc=true.

TO DO
-

* Port some BE tests to KRPC services.

Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/kudu-util.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/krpc-data-stream-sender.cc
A be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/CMakeLists.txt
A be/src/service/data-stream-service.cc
A be/src/service/data-stream-service.h
M be/src/service/impala-server.cc
M be/src/util/network-util.cc
M cmake_modules/FindProtobuf.cmake
M common/protobuf/CMakeLists.txt
A common/protobuf/common.proto
A common/protobuf/data_stream_service.proto
A common/protobuf/row_batch.proto
M common/thrift/generate_error_codes.py
34 files changed, 2,932 insertions(+), 175 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/6
--
To view, visit 

[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359
PS11, Line 359: was not available
> We do call it on the process MemTracker in QueryState::Init().
Okay, but my original question was whether the comment should be reworded for 
that case:

... if this tracker is part of the query hierarchy.

i.e. "not available" seems misleading now that we don't get this from 'state' 
(which is the thing that wasn't always available).


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95
PS11, Line 95:   if (!released_exec_resources_) ReleaseExecResources();
> My initial concern was that handling it in Init() would get messy in the ca
What I proposed was not to handle it in Init(), but let the caller handle it. 
i.e. Init() always gets a resource ref count and QueryExecMgr::StartQuery() has 
to handle the error. It already has to handle this and other errors and release 
ref counts if it won't be passing them along...
Is there a reason to not just do that?


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88
PS11, Line 88:   local_query_state_->AcquireExecResourceRefcount(); // 
Decremented in ReleaseResources().
> The resources used for evaluating exprs, etc - this is used in various test
Why are exvaluating exprs query-wide resources. Aren't they per-thread 
resources?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 19:37:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-10-27 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 2:

(4 comments)

I included the "long-form" description of the RHEL/CentOS issue from Sailesh's 
email.

http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml
File docs/topics/impala_ssl.xml:

http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@145
PS1, Line 145: . In , you can use startup
 : options for the impal
> I assume  you meant catalogd and statestored in 2 of these?
Done


http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@158
PS1, Line 158:   
> mention this the default may be?
Right, the default is stated on the next line.


http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@177
PS1, Line 177: s of , TLSv1.2 may not work for
> Elaborate on the default state a little bit: "Default is empty and then Imp
Done


http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@182
PS1, Line 182: 
> Could you also please add a note saying that TLSv1.2 may not work on CentOS
Done



--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 19:35:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-10-27 Thread John Russell (Code Review)
Hello Bharath Vissapragada, Henry Robinson, Michael Brown, Sailesh Mukil, Tim 
Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8401

to look at the new patch set (#2).

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..

IMPALA-5473: [DOCS] Document TLS min version & cipher options

Under the doc JIRA IMPALA-6065.

Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
---
M docs/topics/impala_ssl.xml
1 file changed, 71 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/8401/2
--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-10-27 Thread John Sherman (Code Review)
John Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
typo: beharior should be behavior



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 27 Oct 2017 19:16:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..


Patch Set 6: Code-Review+2

Glad we have automated checking for dropped status now, it's super-easy to miss.


--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 18:47:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1408/


--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 18:46:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..


Patch Set 6:

Sorry I forgot to handle some error.


--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 18:35:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-27 Thread Tianyi Wang (Code Review)
Hello Thomas Tauber-Marshall, Tim Armstrong, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8226

to look at the new patch set (#6).

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..

IMPALA-2758: Remove BufferedTupleStream::GetRows

This patch removes BufferedTupleStream::GetRows. This function pins a
stream and reads all the rows into a single batch. It is not a good API
since it creates an arbitrarily large row batch. In this patch the call
sites pin the stream and then directly use GetNext to retrieve a single
batch at a time.

Testing: It passes existing tests. A test case for GetRows is removed.

Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
---
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
5 files changed, 63 insertions(+), 99 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/8226/6
--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-27 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8023

to look at the new patch set (#5).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch implements a new data stream service which utilizes KRPC.
Similar to the thrift RPC implementation, there are 3 major components
to the data stream services: KrpcDataStreamSender serializes and sends
row batches materialized by a fragment instance to a KrpcDataStreamRecvr.
KrpcDataStreamMgr is responsible for routing an incoming row batch to
the appropriate receiver. The data stream service runs on the port
FLAGS_krpc_port which is 29000 by default.

Unlike the implementation with thrift RPC, KRPC provides an asynchronous
interface for invoking remote methods. As a result, KrpcDataStreamSender
doesn't need to create a thread per connection. There is one connection
between two Impalad nodes for each direction (i.e. client and server).
Multiple queries can multi-plex on the same connection for transmitting
row batches between two Impalad nodes. The asynchronous interface also
prevents some issues with thrift RPC in which a thread may be stuck in
an RPC call without being able to check for cancellation. A TransmitData()
call with KRPC is in essence a trio of RpcController, a serialized protobuf
request buffer and a protobuf response buffer. The call is invoked via a
DataStreamService proxy object. The serialized tuple offsets and row batches
are sent via "sidecars" in KRPC to avoid extra copy into the serialized
request buffer.

Each impalad node creates a singleton DataStreamService object at start-up
time. All incoming calls are served by a service thread pool created as part
of DataStreamService. By default, there are 64 service threads. The service
threads are shared across all queries so the RPC handler should avoid
blocking as much as possible. In thrift RPC implementation, we make a thrift
thread handling a TransmitData() RPC to block for extended period of time
when the receiver is not yet created when the call arrives. In KRPC
implementation, we store TransmitData() or EndDataStream() requests
which arrive before the receiver is ready in a per-receiver early sender list
stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to
when the receiver is created or when timeout occurs.

Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr.
If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit
to exceed, the request will be stashed in a blocked_sender_ queue to be 
processed
later. The stashed RPC request will not be responded to until it is processed
so as to exert back pressure to the client. An alternative would be to reply 
with
an error and the request / row batches need to be sent again. This may end up
consuming more network bandwidth than the thrift RPC implementation. This change
adopts the behavior of allowing one stashed request per sender.

All rpc requests and responses are serialized using protobuf. The equivalent of
TRowBatch would be ProtoRowBatch which contains a serialized header about the
meta-data of the row batch and two Kudu Slice objects which contain pointers to
the actual data (i.e. tuple offsets and tuple data).

This patch is based on an abandoned patch by Henry Robinson.

TESTING
---

* Build passed with FLAGS_use_krpc=true.

TO DO
-

* Port some BE tests to KRPC services.

Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/kudu-util.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/krpc-data-stream-sender.cc
A be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/CMakeLists.txt
A be/src/service/data-stream-service.cc
A be/src/service/data-stream-service.h
M be/src/service/impala-server.cc
M be/src/util/network-util.cc
M cmake_modules/FindProtobuf.cmake
M common/protobuf/CMakeLists.txt
A common/protobuf/common.proto
A common/protobuf/data_stream_service.proto
A common/protobuf/row_batch.proto
M common/thrift/generate_error_codes.py
34 files changed, 2,929 insertions(+), 175 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/5
--
To view, visit 

[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py
File tests/query_test/test_hash_join_timer.py:

http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py@103
PS2, Line 103:  for impalad in ImpalaCluster().impalads:
 :   verifier = MetricVerifier(impalad.service)
 :   
verifier.wait_for_metric("impala-server.num-fragments-in-flight", 0)
nice addition!



--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:46:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-10-27 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml
File docs/topics/impala_ssl.xml:

http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@182
PS1, Line 182:
Could you also please add a note saying that TLSv1.2 may not work on 
CentOS/RHEL 6, even if OpenSSL 1.0.1 is available?

It is due to this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1497859

We are still working on a fix.



--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:39:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 13: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:34:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
..


Patch Set 14: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:33:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 2:

I implemented Bikram's suggestion from the JIRA.


--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:09:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py
File tests/query_test/test_hash_join_timer.py:

http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py@150
PS2, Line 150: assert (asyn_build), "Join is not prepared asynchronously: 
{0}".format(profile)
To test that the profile was printed ok, I modified this to always fail.



--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 17:08:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8403 )

Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer
..

IMPALA-6093: diagnostics for flaky TestHashJoinTimer

We don't know the root cause yet but try to improve things:
* Eliminate one possible cause of flakiness - unfinished fragments left
  from previous queries.
* Print a profile if an assertion fails so we can see why it failed.

Testing:
Ran core tests.

Change-Id: Ic33296931db807abb960db43b99e5fd0f256
---
M tests/query_test/test_hash_join_timer.py
1 file changed, 18 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8403/2
--
To view, visit http://gerrit.cloudera.org:8080/8403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256
Gerrit-Change-Number: 8403
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1407/


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 16:38:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 16:38:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.

2017-10-27 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables.
..


Patch Set 7: Code-Review+1

Carrying the +1.

I've not made any changes here (except rebases). I've run the core tests 
successfully. I had run into https://issues.apache.org/jira/browse/IMPALA-6118 
and https://issues.apache.org/jira/browse/IMPALA-6106 in previous testing. I 
didn't see any evidence that this change breaks the build specifically.

The specific failed GVO build had the following assertion. That's IMPALA-6099, 
which is also reasonably unrelated to the Avro-only changes here.


F1023 19:06:33.677640 43548 filter-context.cc:49] Check failed: it != 
counters.end() Tried to increment unknown counter group
*** Check failure stack trace: ***
@  0x2f1e11d  google::LogMessage::Fail()
F1023 19:06:33.677934 43350 filter-context.cc:49] Check failed: it != 
counters.end() Tried to increment unknown counter group
*** Check failure stack trace: ***
@  0x2f1f9c2  google::LogMessage::SendToLog()
@  0x2f1daf7  google::LogMessage::Flush()
@  0x2f1e11d  google::LogMessage::Fail()
@  0x2f210be  google::LogMessageFatal::~LogMessageFatal()
@  0x2f1f9c2  google::LogMessage::SendToLog()
@  0x1ca4d99  impala::FilterStats::IncrCounters()   


@   
   0x1ca72a4  impala::FilterContext::CheckForAlwaysFalse()
@  0x2f1daf7  google::LogMessage::Flush()
@  0x1aa537d  impala::HdfsScanNodeBase::PartitionPassesFilters()
@  0x2f210be  google::LogMessageFatal::~LogMessageFatal()
@  0x1b0c9c6  impala::HdfsParquetScanner::GetNextInternal()
@  0x1ca4d99  impala::FilterStats::IncrCounters()
@  0x1b0acca  impala::HdfsParquetScanner::ProcessSplit()
@  0x1a99cc0  impala::HdfsScanNode::ProcessSplit()
@  0x1ca72a4  impala::FilterContext::CheckForAlwaysFalse()
@  0x1a99136  impala::HdfsScanNode::ScannerThread()
@  0x1a985d3  
_ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_17ThreadResourceMgr12ResourcePoolEENKUlvE_clEv
@  0x1aa537d  impala::HdfsScanNodeBase::PartitionPassesFilters()
@  0x1a9a501  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_17ThreadResourceMgr12ResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
@  0x171bdfc  boost::function0<>::operator()()
@  0x1b0c9c6  impala::HdfsParquetScanner::GetNextInternal()
@  0x19f3393  impala::Thread::SuperviseThread()
@  0x1b0acca  impala::HdfsParquetScanner::ProcessSplit()


@   
   0x19fbf26  boost::_bi::list4<>::operator()<>()
@  0x1a99cc0  impala::HdfsScanNode::ProcessSplit()
@  0x19fbe69  boost::_bi::bind_t<>::operator()()
@  0x19fbe2c  boost::detail::thread_data<>::run()
@  0x1a99136  impala::HdfsScanNode::ScannerThread()
@  0x1a985d3  
_ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_17ThreadResourceMgr12ResourcePoolEENKUlvE_clEv
@  0x20a7c9a  thread_proxy
@  0x1a9a501  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_17ThreadResourceMgr12ResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
@ 0x7f0cd8e3a6ba  start_thread
@ 0x7f0cd8b703dd  clone


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 16:17:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC

2017-10-27 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8311 )

Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc@5959
PS2, Line 5959: "cast(trunc(cast('2012-09-10 01:02:03' as timestamp), 
'DAY') as string)",
> I don't think the additional test cases add any value unless you add greate
Done


http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift@92
PS2, Line 92:   MILLISECONDS,
> We shouldn't have redundant field definitions here; instead we should be le
I think we need them even though they look like redundant. See my comment at 
line 194: https://gerrit.cloudera.org/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc



--
To view, visit http://gerrit.cloudera.org:8080/8311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
Gerrit-Change-Number: 8311
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 27 Oct 2017 09:43:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC

2017-10-27 Thread Kim Jin Chul (Code Review)
Hello Greg Rahn, Zach Amsden,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8311

to look at the new patch set (#3).

Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC
..

IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC

Return type of EXTRACT/DATE_PART is changed from INT to BIGINT
because INT data type cannot cover NANOSECOND's value.

* Add the following fields to EXTRACT and DATE_PART:

  WEEK
  DOW
  DOY
  SECOND/SECONDS
  MICROSECOND/MICROSECONDS
  NANOSECOND/NANOSECONDS

* Add the following field to TRUNC:

  SS

* Testing:
  Extend unit tests in expr-test

Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
---
M be/src/exprs/expr-test.cc
M be/src/exprs/udf-builtins-ir.cc
M be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
M common/function-registry/impala_functions.py
M common/thrift/Exprs.thrift
6 files changed, 210 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8311/3
--
To view, visit http://gerrit.cloudera.org:8080/8311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
Gerrit-Change-Number: 8311
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Zach Amsden