[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4813

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..

IMPALA-4300: Speed up BloomFilter::Or with SIMD

Manually vectorizing speeds up BloomFilter::Or by up to 184x.

The existing loop should auto-vectorize with our GCC setting of -O3,
but it does not for reasons that remain mysterious.

Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/testutil/mem-util.h
M be/src/util/bloom-filter.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
5 files changed, 177 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/4813/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/4813/1/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

Line 167
One reason auto-vectorisation must fail here is that the two directories may 
overlap (as far as the compiler knows). 

There's a pragma that asserts that this isn't the case.
https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Loop-Specific-Pragmas.html 

It may also generate better code if you partially unroll the loop so that it 
operates on 32 bytes at a time - this avoids the need for the compiler to 
unroll into two loops with strides of 32 bytes and 8 bytes.


PS1, Line 162: DCHECK
DCHECK_EQ


Line 163:   auto simd_in = reinterpret_cast(in);
I think double*/const double* is more readable than auto here - I'm used to 
looking at the left end of the line to find the type instead of having to parse 
the cast expression.


PS1, Line 166: int
Use size_t for consistency? I had to think about whether the implicit casts 
caused any problems.


PS1, Line 168: _mm256_loadu_pd
We should be able to use the aligned versions here, right, since we allocated 
the memory with posix_memalign? I.e. 
https://software.intel.com/en-us/node/524101


PS1, Line 181: should
It shouldn't actually with the code in it's current form since it has to assume 
the the directories may overlap.


Line 184:   // TODO: Tune gcc flags to auto-vectorize. This might not be 
possible.
See my other comment - should be possible.


Line 186:   // TODO: IMPALA-4312: Evaluate clang for builds other than ASAN.
Just remove this TODO? I don't think it's informative for people trying to 
understand this code.


Line 190: auto simd_in = reinterpret_cast(&in.directory[0]);
This code is simple enough that I'm fine with moving forward with it even if we 
*could* spend some time getting the auto-vectorisation to work. So it WFM if 
you want to clean up the comments and leave this code as-is, or if you want to 
play around with auto-vectorisation futher.


PS1, Line 191: auto
I think just repeating the types const__m128i* or __m128i* is more readable 
than auto since it takes slightly less effort to figure out the type of the 
variable.


PS1, Line 192: auto
I'd prefer the int type was explicit here.


PS1, Line 196: int
size_t for consistency?


PS1, Line 199: _mm_storeu_si128
We should be able to use the aligned versions here too.


-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.

2016-10-23 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4815

Change subject: IMPALA-4336: Cast exprs after unnesting union operands.
..

IMPALA-4336: Cast exprs after unnesting union operands.

The bug was that we cast the result exprs of operands before
unnesting them. If we unnested an operands, casts were missing
on those unnested operands' result exprs.

The fix is to first unnest operands and then cast result exprs.

Also clarifies the use of resultExprs vs. baseTblResultExprs
in unions. We should consistently use resultExprs because the
final expr substitution happens during planning.
The only place where baseTblResultExprs should be used is in
UnionStmt.materializeRequiredSlots() because that is called
before plan generation and we need to mark the slots of resolved
exprs as materialized.

Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M testdata/workloads/functional-query/queries/QueryTest/union.test
5 files changed, 111 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4815/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4758/2/be/CMakeLists.txt
File be/CMakeLists.txt:

Line 106: if ("${CMAKE_BUILD_TYPE}" STREQUAL "DEBUG")
> It sets up CMake to use clang as the C++ compiler (see bin/impala_impala.sh
Done


http://gerrit.cloudera.org:8080/#/c/4758/4/be/src/common/status.h
File be/src/common/status.h:

Line 212: return *msg_; // NOLINT: clang-tidy thinks this might deref a 
nullptr
> I still think we should reword this to make it clear that it's a limitation
Done


http://gerrit.cloudera.org:8080/#/c/4758/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

> Either if the multiple instances of the struct are touched on a particularl
WFM


http://gerrit.cloudera.org:8080/#/c/4758/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 432:   /// safe to concurrently read from filter_routing_table_.
> Can you revert these changes?
Done


http://gerrit.cloudera.org:8080/#/c/4758/1/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

Line 335
> I'm ok with this but we should revisit if there are a lot of spurious warni
SGTM


http://gerrit.cloudera.org:8080/#/c/4758/4/bin/run_clang_tidy.sh
File bin/run_clang_tidy.sh:

Line 46:   # This script has been tested when CORES is actually higher than the 
number of cores on
> 2 space indents are the (imperfectly followed) standard in the shell script
Done


http://gerrit.cloudera.org:8080/#/c/4758/4/buildall.sh
File buildall.sh:

Line 256: 
> Remove extra blank line.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#5).

Change subject: IMPALA-3676: Use clang as a static analysis tool
..

IMPALA-3676: Use clang as a static analysis tool

This patch adds a script to run clang-tidy over the whole code
base. It is a first step towards running clang-tidy over patches as a
tool to help users spot bugs before code review.

Because of the number of clang-tidy checks, this patch only addresses
some of them. In particular, only checks starting with 'clang' are
considered. Many of them which are flaky or not part of our style are
excluded from the analysis. This patch also exlcudes some checks which
are part of our current style but which would be too laborious to fix
over the entire codebase, like using nullptr rather than NULL.

This patch also fixes a number of small bugs found by clang-tidy.

Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
---
A .clang-tidy
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/common/init.cc
M be/src/common/logging.h
M be/src/common/status.h
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/exec-node.h
M be/src/exec/external-data-source-executor.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-table-writer.h
M be/src/exec/parquet-column-readers.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scanner-context.inline.h
M be/src/exec/write-stream.h
M be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/experiments/tuple-splitter-test.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/compound-predicates.h
M be/src/exprs/expr-test.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/string-functions-ir.cc
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/thrift-server.h
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/hbase-table.cc
M be/src/runtime/hdfs-fs-cache.h
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/scoped-buffer.h
M be/src/runtime/string-buffer.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/service/query-result-set.cc
M be/src/statestore/failure-detector.h
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udas.h
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.h
M be/src/udf/uda-test-harness.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-ir.cc
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/udf_samples/uda-sample.cc
A be/src/util/aligned-new.h
M be/src/util/benchmark-test.cc
M be/src/util/bit-stream-utils.inline.h
M be/src/util/bit-util-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/blocking-queue.h
M be/src/util/bloom-filter-test.cc
M be/src/util/buffer-builder.h
M be/src/util/compress.cc
M be/src/util/decompress.cc
M be/src/util/metrics.h
M be/src/util/minidump.cc
M be/src/util/network-util.cc
M be/src/util/parquet-reader.cc
M be/src/util/periodic-counter-updater.h
M be/src/util/pprof-path-handlers.cc
M be/src/util/progress-updater.cc
M be/src/util/rle-test.cc
M be/src/util/runtime-profile.h
M be/src/util/thread-pool.h
M be/src/util/webserver.cc
M bin/make_impala.sh
A bin/run_clang_tidy.sh
M buildall.sh
R cmake_modules/clang_toolchain.cmake
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
115 files changed, 569 insertions(+), 333 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/4758/5
-- 
To view, visit http:

[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-23 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4817

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..

IMPALA-3586: Clean up union-node.h/cc to enable improvements.

This patch does not address IMPALA-3586, but it cleans up the
code in union-node.h/cc to make it easier to implement those
perf improvements.

The major simplification is to remove conjunct evaluation since
the planner does not assigns conjuncts to a union-node anymore.
Conjuncts are always pushed to the union operands.

Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
---
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
2 files changed, 90 insertions(+), 128 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/4817/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..

IMPALA-4300: Speed up BloomFilter::Or with SIMD

Manually vectorizing speeds up BloomFilter::Or by up to 184x.

The existing loop should auto-vectorize with our GCC setting of -O3,
but it does not for reasons that remain mysterious.

Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/testutil/mem-util.h
M be/src/util/bloom-filter.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
5 files changed, 176 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/4813/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/4813/1/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

Line 167
> One reason auto-vectorisation must fail here is that the two directories ma
What's stringe is that without the pragma and without __restrict__, Matt 
Godbolt's compiler explorer still shows the loop being autovectorized.

I tried loop unrolling, but it still didn't vectorize.


PS1, Line 162: DCHECK
> DCHECK_EQ
Done


Line 163:   auto simd_in = reinterpret_cast(in);
> I think double*/const double* is more readable than auto here - I'm used to
Done


PS1, Line 166: int
> Use size_t for consistency? I had to think about whether the implicit casts
Done


PS1, Line 168: _mm256_loadu_pd
> We should be able to use the aligned versions here, right, since we allocat
BloomFilters do, but TBloomFilters do not, and from using gdb I see that they 
are not 32-byte aligned.


PS1, Line 181: should
> It shouldn't actually with the code in it's current form since it has to as
Let's continue conversation on overlapping at the other comment.


Line 184:   // TODO: Tune gcc flags to auto-vectorize. This might not be 
possible.
> See my other comment - should be possible.
Let's continue conversation on that comment.


Line 186:   // TODO: IMPALA-4312: Evaluate clang for builds other than ASAN.
> Just remove this TODO? I don't think it's informative for people trying to 
Done


Line 190: auto simd_in = reinterpret_cast(&in.directory[0]);
> This code is simple enough that I'm fine with moving forward with it even i
I will leave as-is.


PS1, Line 191: auto
> I think just repeating the types const__m128i* or __m128i* is more readable
Done


PS1, Line 192: auto
> I'd prefer the int type was explicit here.
Done


PS1, Line 196: int
> size_t for consistency?
Done


PS1, Line 199: _mm_storeu_si128
> We should be able to use the aligned versions here too.
Let's continue this conversation above.

FWIW, i also tried some very complex methods of using non-simd ops to move both 
pointers forward until at least one was aligned. The performance was sometimes 
worse, so I put that aside for another day.


-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Minor fixes to remove more "cloudera"s from the code.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4774/1/tests/benchmark/report_benchmark_results.py
File tests/benchmark/report_benchmark_results.py:

Line 249:   url = 
'https://api.github.com/repos/apache/incubator-impala/commits/' + commit_sha
> Yes, it seems that commits that land in both repos end up having different 
If by "both repos", you mean github.com/cloudera/Impala and 
github.com/apache/incubator-impala, then I agree. However, 

(a) commits that land in the apache-hosted Impala repo at 
https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=summary have 
the same hashes as those in the github mirror at 
github.com/apache/incubator-impala

and

(b) cloudera/Impaala is not receiving ANY new commits at the moment: 
https://issues.cloudera.org/browse/IMPALA-3691

As such, I doubt anyone is building from it anymore.


-- 
To view, visit http://gerrit.cloudera.org:8080/4774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.

2016-10-23 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Minor fixes to remove more "cloudera"s from the code.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4774/1/tests/benchmark/report_benchmark_results.py
File tests/benchmark/report_benchmark_results.py:

Line 249:   url = 
'https://api.github.com/repos/apache/incubator-impala/commits/' + commit_sha
> If by "both repos", you mean github.com/cloudera/Impala and github.com/apac
SGTM.


-- 
To view, visit http://gerrit.cloudera.org:8080/4774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-23 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 8:

(3 comments)

i won't look at the substitution logic in detail because it gives me a 
headache. :)

but please add that test.

http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 603:   // Store exprs for Kudu key columns.
add line breaks between if blocks, it's harder to navigate without those ('{' 
in vi let's you jump by a 'paragraph'). at least add one at l613


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 237:* the outputs of aggregation. Invoked for UnionStmt as well since
something missing there. (i know, not your fault.)


http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 247: # IMPALA-2521: clustered insert into table adds sort node, correctly 
substituting exprs.
you can't always see the correct substitution until you run the query (because 
the column labels can be the same even when you're pointing at the wrong tuple).

let's add query tests as well, what you have so far should run (even if you 
won't see any benefit in the hdfs case).


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..


Patch Set 1:

(4 comments)

LGTM, just want to make sure the comment is a little clearer.

http://gerrit.cloudera.org:8080/#/c/4813/1/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

Line 167
> What's stringe is that without the pragma and without __restrict__, Matt Go
I experimented with some variants: https://godbolt.org/g/rFww4g 
.BloomFilterOrPointersUnrolledIvDep() and BloomFilterOrIvDepInt() emit exactly 
the code you want.
There are a few things that make a difference:

* Avoiding the indirection via the in/out pointers helps a lot. If we extract 
vector.data()/vector.size() from the loop body we can get the same effect.
* Rather subtly, using a signed/unsigned loop counter makes a difference (I 
think with the signed type the compiler is allow to assume no overflow and this 
somehow helps).
* The unrolling and #pragma ivdep aren't always necessary to emit the code, but 
it's a lot cleaner with them.  I think gcc's vectoriser is smart enough to 
insert a check for whether the arrays overlap and two versions of the code for 
the overlapping/non-overlapping cases.

The explicit SIMD version has the virtue of not being sensitive to all these 
things.


PS1, Line 168: _mm256_loadu_pd
> BloomFilters do, but TBloomFilters do not, and from using gdb I see that th
Oh, duh.


PS1, Line 199: _mm_storeu_si128
> Let's continue this conversation above.
Yeah, I doubt there's much difference between the aligned/unaligned versions, 
just didn't want to use the unaligned ones if there was already an invariant 
that the memory was aligned.


http://gerrit.cloudera.org:8080/#/c/4813/2/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

Line 181:   // The trivial loop out[i] |= in[i] should auto-vectorize with gcc 
at -O3, but it is not
Let's fix this comment so it's clear that the issue is that the code is not 
written in a way that is friendly to auto-vectorization.

I've seen other cryptic comments about these kind of things ("compiler is 
confused by x")  and they're hard to know what to do with (if not outright 
misleading).


-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4820

Change subject: Add distcc infrastructure.
..

Add distcc infrastructure.

This has been working for several months, and it it was written mainly
by Casey Ching while he was at Cloudera working on Impala.

Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
---
M .gitignore
A bin/distcc/.gitignore
A bin/distcc/README.md
A bin/distcc/distcc.sh
A bin/distcc/distcc_env.sh
5 files changed, 325 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4820/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#3).

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..

IMPALA-4300: Speed up BloomFilter::Or with SIMD

Manually vectorizing speeds up BloomFilter::Or by up to 184x.

The existing loop should auto-vectorize with our GCC setting of -O3,
but it does not for reasons that remain mysterious.

Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/testutil/mem-util.h
M be/src/util/bloom-filter.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
5 files changed, 178 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/4813/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4813/2/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

Line 181:   // The trivial loop out[i] |= in[i] should auto-vectorize with gcc 
at -O3, but it is not
> Let's fix this comment so it's clear that the issue is that the code is not
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-23 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2).

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..

IMPALA-3884: Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

This change implements support for TYPE_TIMESTAMP for
HashTableCtx::CodegenAssignNullValue(). TimestampValue itself
is 16 bytes in size. To match RawValue::Write() in the
interpreted path, CodegenAssignNullValue() emits code to assign
HashUtil::FNV_SEED to both the upper and lower 64-bit of the
destination value. This change also fixes the handling of 128-bit
Decimal16Value in CodegenAssignNullValue() so the emitted code
matches the behavior of the interpreted path.

Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hash-table.cc
M be/src/exec/old-hash-table.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/joins.test
7 files changed, 72 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4794/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 586:   DCHECK_EQ(num_bytes & (num_bytes - 1), 0);
> Could you factor this into BitUtil::IsPowerOfTwo() or something along those
Done


Line 591: return ConstantInt::get(context(), APInt(128, vals));
> CodegenAnyVal::SetVal(int128_t) should use this approach. There is a TODO i
Done


http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

Line 387:   /// than 64-bit), an 128-bit value is formed by setting its upper 
and lower 64-bit set
> This behaviour of copying the value into the upper and lower bits is pretty
Thanks for the suggestion. Refactored GetIntConstant() to take two 64-bit 
values instead. APInt() will handle masking out appropriate bits and getting 
the right size.


http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

Line 597: // fall through to generate 128-bit 'fnv_seed'
> May as well just call codegen->GetIntConstant() here, or codegen->GetInt128
Done


PS1, Line 608: fnv_seed_float
> Thank you! This bugged me when reading the code before. Optional, but you c
Done


http://gerrit.cloudera.org:8080/#/c/4794/1/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

Line 723:  QUERY
> I have a test in https://gerrit.cloudera.org/#/c/4655/6/tests/query_test/te
Thanks for the heads-up.


-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4794/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS2, Line 386: repsectively
typo: respectively


-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-23 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#6).

Change subject: IMPALA-3676: Use clang as a static analysis tool
..

IMPALA-3676: Use clang as a static analysis tool

This patch adds a script to run clang-tidy over the whole code
base. It is a first step towards running clang-tidy over patches as a
tool to help users spot bugs before code review.

Because of the number of clang-tidy checks, this patch only addresses
some of them. In particular, only checks starting with 'clang' are
considered. Many of them which are flaky or not part of our style are
excluded from the analysis. This patch also exlcudes some checks which
are part of our current style but which would be too laborious to fix
over the entire codebase, like using nullptr rather than NULL.

This patch also fixes a number of small bugs found by clang-tidy.

Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
---
A .clang-tidy
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/common/init.cc
M be/src/common/logging.h
M be/src/common/status.h
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/exec-node.h
M be/src/exec/external-data-source-executor.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-table-writer.h
M be/src/exec/parquet-column-readers.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scanner-context.inline.h
M be/src/exec/write-stream.h
M be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/experiments/tuple-splitter-test.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/compound-predicates.h
M be/src/exprs/expr-test.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/string-functions-ir.cc
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/thrift-server.h
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/hbase-table.cc
M be/src/runtime/hdfs-fs-cache.h
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/scoped-buffer.h
M be/src/runtime/string-buffer.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/service/query-result-set.cc
M be/src/statestore/failure-detector.h
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udas.h
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.h
M be/src/udf/uda-test-harness.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-ir.cc
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/udf_samples/uda-sample.cc
A be/src/util/aligned-new.h
M be/src/util/benchmark-test.cc
M be/src/util/bit-stream-utils.inline.h
M be/src/util/bit-util-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/blocking-queue.h
M be/src/util/bloom-filter-test.cc
M be/src/util/buffer-builder.h
M be/src/util/compress.cc
M be/src/util/decompress.cc
M be/src/util/metrics.h
M be/src/util/minidump.cc
M be/src/util/network-util.cc
M be/src/util/parquet-reader.cc
M be/src/util/periodic-counter-updater.h
M be/src/util/pprof-path-handlers.cc
M be/src/util/progress-updater.cc
M be/src/util/rle-test.cc
M be/src/util/runtime-profile.h
M be/src/util/thread-pool.h
M be/src/util/webserver.cc
M bin/make_impala.sh
A bin/run_clang_tidy.sh
M buildall.sh
R cmake_modules/clang_toolchain.cmake
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
116 files changed, 570 insertions(+), 334 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/

[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4813/3//COMMIT_MSG
Commit Message:

Line 11: The existing loop should auto-vectorize with our GCC setting of -O3,
Can you fix this too? Maybe just say "The previous code was not written in a 
way that GCC could auto-vectorize it". Missed it first time over.


-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4794/2/be/src/util/bit-util.h
File be/src/util/bit-util.h:

PS2, Line 85: static
constexpr - that way we can use it in static_asserts() etc in future.


-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 1:

(6 comments)

Thanks for doing this

http://gerrit.cloudera.org:8080/#/c/4820/1/bin/distcc/distcc.sh
File bin/distcc/distcc.sh:

Line 1: #!/bin/bash
Needs a copyright header


Line 34:   clang) # Assume the compilation options were setup for gcc, which 
would happen using
Ugh, we really should just consolidate the GCC/Clang argument handling into one 
place in the CMake scripts instead of adding this layer of hacks. Anyway, don't 
need to fix now.


Line 46: PATH="$DISTCC_ENV_DIR:$PATH"
Should be able to get rid of this, we don't pick the linker up from PATH now 
anyway.


http://gerrit.cloudera.org:8080/#/c/4820/1/bin/distcc/distcc_env.sh
File bin/distcc/distcc_env.sh:

Line 1: # This file is intended to be sourced by a shell (zsh and bash have 
been tested).
Needs a copyright header


PS1, Line 44: BUILD_FARM
Can you check if BUILD_FARM is defined and print a message? Otherwise people 
may be confused if they run it and it breaks.


Line 131: function switch_linker {
All this linker stuff is irrelevant now that we're using the toolchain linker. 
Let's just delete it.


-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-10-23 Thread Amos Bird (Code Review)
Amos Bird has posted comments on this change.

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..


Patch Set 18:

Cool! I'll do the rebase this week.

-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] Remove unused Bitmap code.

2016-10-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Remove unused Bitmap code.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95fcaaa40243999800c2ec2ead5b3479d66a63e7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 1: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4817/1/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

PS1, Line 72: for (int i = 0; i < const_expr_lists_.size(); ++i) {
: RETURN_IF_ERROR(Expr::Prepare(
: const_expr_lists_[i], state, row_desc(), 
expr_mem_tracker()));
: AddExprCtxsToFree(const_expr_lists_[i]);
: DCHECK_EQ(const_expr_lists_[i].size(), 
tuple_desc_->slots().size());
:   }
: 
:   // Prepare result expr lists.
:   for (int i = 0; i < child_expr_lists_.size(); ++i) {
: RETURN_IF_ERROR(Expr::Prepare(
: child_expr_lists_[i], state, child(i)->row_desc(), 
expr_mem_tracker()));
: AddExprCtxsToFree(child_expr_lists_[i]);
: DCHECK_EQ(child_expr_lists_[i].size(), 
tuple_desc_->slots().size());
:   }
ranged-for loops as well?


PS1, Line 220: for (int i = 0; i < const_expr_lists_.size(); ++i) {
 : Expr::Close(const_expr_lists_[i], state);
 :   }
 :   for (int i = 0; i < child_expr_lists_.size(); ++i) {
 : Expr::Close(child_expr_lists_[i], state);
 :   }
While you're here, can you rewrite as two ranged-for loops?


http://gerrit.cloudera.org:8080/#/c/4817/1/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 58:   /// Const exprs materialized by this node. These exprs don't refer 
to any children.
Add that these are only materialized by the first fragment instance to avoid 
duplication.


-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 6: Code-Review+1

(1 comment)

Would give a +2 except I think it'd be good to have someone else look at the 
AlignedNew abstraction - it makes sense to me but I think is somewhat of a new 
pattern. The rest seems uncontroversial.

http://gerrit.cloudera.org:8080/#/c/4758/6/cmake_modules/clang_toolchain.cmake
File cmake_modules/clang_toolchain.cmake:

Thank you! This makes more sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes