[Impala-ASF-CR] IMPALA-3224: De-Cloudera non-docs JIRA URLs

2017-03-31 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3224: De-Cloudera non-docs JIRA URLs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6487/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

Line 230: //   https://issues.apache.org/jira/browse/IMPALA-3570
> In most places we seem to refer to JIRAs without the full URL:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3224: De-Cloudera non-docs JIRA URLs

2017-03-31 Thread Jim Apple (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6487

to look at the new patch set (#2).

Change subject: IMPALA-3224: De-Cloudera non-docs JIRA URLs
..

IMPALA-3224: De-Cloudera non-docs JIRA URLs

John Russell is planning to fix the URLs in docs in a separate commit.

Fixed using:

(git ls-files | xargs replace \
'https://issues.cloudera.org/browse/IMPALA' 'IMPALA' --) && \
git checkout HEAD docs
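
For readers without a `replace` utility on hand, the same substitution can be sketched in Python. This is only an illustration of the command above, not what was run for this change: the function names and the `docs` skip rule are assumptions, and the list of tracked files (the output of `git ls-files`) is passed in by the caller.

```python
from pathlib import Path

OLD_PREFIX = b"https://issues.cloudera.org/browse/IMPALA"
NEW_PREFIX = b"IMPALA"


def de_cloudera(data: bytes) -> bytes:
    """Turn full Cloudera JIRA URLs into bare IMPALA-NNNN references."""
    return data.replace(OLD_PREFIX, NEW_PREFIX)


def rewrite_files(root, rel_paths, skip_dir="docs"):
    """Rewrite files under root in place; return the relative paths changed.

    Anything whose first path component is skip_dir is left alone, which
    mirrors the 'git checkout HEAD docs' step in the original command.
    """
    changed = []
    for rel in rel_paths:
        rel_path = Path(rel)
        if rel_path.parts and rel_path.parts[0] == skip_dir:
            continue  # docs are fixed in a separate commit
        target = Path(root) / rel_path
        original = target.read_bytes()  # bytes avoid any encoding guesses
        updated = de_cloudera(original)
        if updated != original:
            target.write_bytes(updated)
            changed.append(str(rel_path))
    return changed
```

Working on bytes rather than text keeps line endings and non-UTF-8 files untouched, so the resulting diff contains only the URL changes.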

Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182
---
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M shell/shell_output.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/setup-hdfs-env.sh
M tests/comparison/db_connection.py
M tests/comparison/discrepancy_searcher.py
M tests/comparison/query_generator.py
M tests/custom_cluster/test_kudu_not_available.py
M tests/stress/concurrent_select.py
11 files changed, 22 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/6487/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

2017-03-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6526/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 180: Unix time (seconds since the Unix epoch) representation in UTC
Unix time isn't really "in UTC".  It's just the number of seconds since the Unix
epoch (which is specified as Jan 1 1970 0:00 UTC). That is, Unix time is
timezone-independent.

The implied timezone (which in this case you want to be UTC) applies to the 
timestamp, not the resulting unix time. So, I think the comment should be 
something like:

/// Interpret 'this' as a timestamp in UTC and convert to unix time.

and rename the method accordingly.  Or am I missing something?
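
The distinction can be illustrated with a small Python sketch. It is not Impala code: `utc_to_unix_time` is a hypothetical helper, and the fixed UTC-7 offset is just an example zone. The timezone is an input to interpreting the wall-clock timestamp, while the resulting Unix time is the same number for a given instant regardless of zone.

```python
from datetime import datetime, timedelta, timezone


def utc_to_unix_time(naive: datetime) -> int:
    """Interpret a timezone-less timestamp as UTC and convert to Unix time."""
    return int(naive.replace(tzinfo=timezone.utc).timestamp())


# The Unix epoch itself, read as a UTC wall-clock time, maps to 0.
print(utc_to_unix_time(datetime(1970, 1, 1, 0, 0, 0)))  # -> 0

# One instant expressed in two zones: the wall-clock fields differ,
# but the Unix time does not.
instant_utc = datetime(2017, 3, 31, 12, 0, tzinfo=timezone.utc)
instant_minus7 = instant_utc.astimezone(timezone(timedelta(hours=-7)))
same_unix_time = instant_utc.timestamp() == instant_minus7.timestamp()
different_wall_clock = instant_utc.hour != instant_minus7.hour
```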


-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-03-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 164: };
> We should chat about the design a bit. I added a couple of basic counters a
Sure, let's chat about it. This can be done in a follow-on patch, so we can 
continue with the current patch without it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-03-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 15:

> Also, what do you think about removing the per-list limits? They're
 > not necessary for correctness and they add an additional thing to
 > tune. I think after the maintenance and scavenging they don't add
 > much.

Fine with me to remove.

-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Laurel Hale (Code Review)
Laurel Hale has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7: Code-Review+1

everything looks ok

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports

2017-03-31 Thread Laurel Hale (Code Review)
Laurel Hale has posted comments on this change.

Change subject: IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports
..


Patch Set 1:

(6 comments)

There are some issues that I've listed in my comments.

http://gerrit.cloudera.org:8080/#/c/6515/1/docs/topics/impala_fixed_issues.xml
File docs/topics/impala_fixed_issues.xml:

PS1, Line 573: fixed_issues_232
I looked in "impala_fixed_issues.xml" and there IS a concept id
"fixed_issues_232", so I'm not sure why, but the link is not working.


PS1, Line 2461: 
This points to empty search results:

https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.5%22%20and%20resolution%3D%22Fixed%22


PS1, Line 2706: 
This points to empty search results:

https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.4%22%20and%20resolution%3D%22Fixed%22


PS1, Line 2770: https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.3%22%20and%20resolution%3D%22Fixed%22


PS1, Line 2820: 
This points to empty search results:

https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.2%22%20and%20resolution%3D%22Fixed%22
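
When checking links like the ones above, it can help to decode the percent-encoded `jql` parameter and see exactly which fixVersion is being queried. The following is a minimal Python sketch using only the standard library; the helper names are made up for illustration, and whether a given fixVersion name exists in the ASF JIRA is not something this code can tell you.

```python
from urllib.parse import parse_qs, quote, urlsplit

BASE = "https://issues.apache.org/jira/issues/?jql="


def jql_from_url(url: str) -> str:
    """Return the decoded JQL query carried in the URL's jql parameter."""
    return parse_qs(urlsplit(url).query)["jql"][0]


def fixed_issues_url(fix_version: str) -> str:
    """Build a search URL of the same shape as the ones quoted above."""
    jql = f'project=impala and fixVersion="{fix_version}" and resolution="Fixed"'
    return BASE + quote(jql, safe="")


url = fixed_issues_url("Impala 2.0.5")
print(jql_from_url(url))
```

If the decoded query looks right but the search is still empty, the likely next thing to check is whether the version string matches a fixVersion name in the target JIRA project.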


-- 
To view, visit http://gerrit.cloudera.org:8080/6515
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.

2017-03-31 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6527

Change subject: IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
..

IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.

Implements HdfsScanner::GetNext() for the Avro, RC File, and
Sequence File scanners. Changes ProcessSplit() to repeatedly call
GetNext() to share the core scanning code between the legacy
ProcessSplit() interface and the new GetNext() interface.

Summary of changes:
- Slightly change code flow for initial scan range that
  only parses the file header. The new code sets
  'only_parsing_header_' in Open() and then honors
  that flag in GetNextInternal(). Before, all the logic
  was inside ProcessSplit().
- Replace 'finished_' with 'eos_'.
- Add a RowBatch parameter to various functions.
- Change Close() to free all resources when a nullptr
  RowBatch is passed.
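
The control flow described above can be sketched roughly as follows. This is a Python toy, not the C++ in the patch: the class, its fields, and the list-based "row batch" are all hypothetical stand-ins chosen to show the legacy entry point draining the new one until end-of-stream.

```python
class SketchScanner:
    """Toy scanner: get_next() is the core, process_split() just loops over it."""

    def __init__(self, rows, batch_size=2, only_parsing_header=False):
        self._rows = list(rows)
        self._pos = 0
        self._batch_size = batch_size
        # Decided once up front (in Open(), per the summary above).
        self.only_parsing_header = only_parsing_header
        self.eos = False  # plays the role of 'eos_', formerly 'finished_'

    def get_next(self):
        """Return the next batch of rows; set eos when nothing is left."""
        if self.only_parsing_header:
            self.eos = True  # a header-only scan range produces no rows
            return []
        batch = self._rows[self._pos:self._pos + self._batch_size]
        self._pos += len(batch)
        self.eos = self._pos >= len(self._rows)
        return batch

    def process_split(self, sink):
        """Legacy interface: repeatedly call get_next() until eos."""
        while not self.eos:
            sink.extend(self.get_next())
```

Keeping all scanning logic in `get_next()` means both interfaces exercise the same code path, which is the sharing the commit message describes.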

Testing:
- Exhaustive tests passed on debug
- Core tests passed on asan
- TODO: Perf testing on cluster

Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
17 files changed, 575 insertions(+), 493 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/6527/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4883: Union Codegen

2017-03-31 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#6).

Change subject: IMPALA-4883: Union Codegen
..

IMPALA-4883: Union Codegen

For each non-passthrough child of the Union node, codegen the loop that
does per row tuple materialization.

Testing:
Ran test_queries.py test locally in exhaustive mode.

Benchmark:
Ran a local benchmark on a local 10 GB TPCDS dataset on an unpartitioned
store_sales table.

SELECT
  COUNT(c),
  COUNT(ss_customer_sk),
  COUNT(ss_cdemo_sk),
  COUNT(ss_hdemo_sk),
  COUNT(ss_addr_sk),
  COUNT(ss_store_sk),
  COUNT(ss_promo_sk),
  COUNT(ss_ticket_number),
  COUNT(ss_quantity),
  COUNT(ss_wholesale_cost),
  COUNT(ss_list_price),
  COUNT(ss_sales_price),
  COUNT(ss_ext_discount_amt),
  COUNT(ss_ext_sales_price),
  COUNT(ss_ext_wholesale_cost),
  COUNT(ss_ext_list_price),
  COUNT(ss_ext_tax),
  COUNT(ss_coupon_amt),
  COUNT(ss_net_paid),
  COUNT(ss_net_paid_inc_tax),
  COUNT(ss_net_profit),
  COUNT(ss_sold_date_sk)
FROM (
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
) t

Before: 39s704ms
Operator         #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
--------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE          1  194.504us  194.504us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE           1   17.284us   17.284us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE          3    2s202ms    2s934ms        3           1  115.00 KB       10.00 MB
00:UNION              3   32s514ms   34s926ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS       3  158.373ms  216.085ms   28.80M      28.80M  489.71 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS       3  167.002ms  171.738ms   28.80M      28.80M  489.74 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS       3  125.331ms  145.496ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS       3  148.478ms  194.311ms   28.80M      28.80M  489.69 MB        1.88 GB  tpcds_10_parquet.store_sales
|--06:SCAN HDFS       3  143.995ms  162.781ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--07:SCAN HDFS       3  169.731ms  250.201ms   28.80M      28.80M  489.58 MB        1.88 GB  tpcds_10_parquet.store_sales
|--08:SCAN HDFS       3  164.110ms  254.374ms   28.80M      28.80M  489.61 MB        1.88 GB  tpcds_10_parquet.store_sales
|--09:SCAN HDFS       3  135.631ms  162.117ms   28.80M      28.80M  489.63 MB        1.88 GB  tpcds_10_parquet.store_sales
|--10:SCAN HDFS       3  138.736ms  167.778ms   28.80M      28.80M  489.67 MB        1.88 GB  tpcds_10_parquet.store_sales
01:SCAN HDFS          3  202.015ms  248.728ms   28.80M      28.80M  489.68 MB        1.88 GB  tpcds_10_parquet.store_sales

After: 20s664ms
Operator         #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
--------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE          1  167.757us  167.757us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE           1   16.592us   16.592us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE          3    2s924ms    3s715ms        3           1  115.00 KB       10.00 MB
00:UNION              3    4s971ms    6s082ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS       3    1s189ms    1s588ms   28.80M      28.80M  483.82 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS       3    1s117ms    1s157ms   28.80M      28.80M  484.85 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS       3    1s226ms    1s454ms   28.80M      28.80M  483.00 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS       3    1s141ms    1s3

[Impala-ASF-CR] IMPALA-4883: Union Codegen

2017-03-31 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen
..


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node-ir.cc
File be/src/exec/union-node-ir.cc:

Line 19: #include "runtime/tuple.h"
> Do we need tuple.h? I don't think I see any references to Tuple* in here.
Done


Line 21: #include "util/runtime-profile-counters.h"
> Is this needed still?
Done


Line 35:   while (!dst_batch->AtCapacity() && child_row_idx < child_batch->num_rows()) {
> Nice! We can maybe avoid a few more loads and stores via the child_batch an
Great suggestions. Done.


Line 46:   if (limit_ != -1 && num_rows_returned_ + dst_batch->num_rows() > limit_) {
> We don't need to cross-compile this logic. Let's move it into the caller an
Good point. Moved it out of here.


http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

Line 168:   if (limit_ != -1 && num_rows_returned_ + row_batch->num_rows() > limit_) {
> How about we move this logic around num_rows_returned_ and limit_ into GetN
Done


http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 28: #include "runtime/tuple.h"
> Do we need the tuple.h and tuple-row.h imports? Oh I guess for the inline M
Done


Line 72:   /// each GetNext() call.
> We should add a TODO to remove this. Maybe Michael knows if there's a JIRA 
Added a todo, couldn't find the relevant JIRA.


PS5, Line 99: Null
> NULL here and below, just for consistency with other comments.
Done


PS5, Line 128: row_batch
> dst_batch.
Done


Line 136:   void IR_ALWAYS_INLINE MaterializeExprs(const std::vector& exprs,
> Move this to the -ir.cc file? I don't think there's a reason we need to def
Ok, moved it.


Line 148:   bool inline IsChildPassthrough(int child_idx) const {
> I don't think any of the "inline" specifiers here and below do anything - i
Makes sense, removed all of them.


-- 
To view, visit http://gerrit.cloudera.org:8080/6459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-03-31 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
File fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java:

Line 36:  * id1 > id2 -> id2 < id1
I am going to abandon changes here.  Although this would make it easier to 
extend to analysis chains, e.g. A <= B <= C <= A -> A = B = C, the 
complications introduced by the non-inclusive relation make implementing this 
quite a bit of work.  This change is already large enough and I'd rather keep 
the scope confined to the two simple steps.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-03-31 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#11).

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..

IMPALA-5003: Constant propagation in scan nodes and inline views

When conjuncts are pushed into table refs and inline views, they can
be considered for constant propagation within that node.  In certain
cases, we might end up with a FALSE conditional and now we can
convert ScanNodes to EmptySet nodes when that occurs.

I also added an inequality collation phase which is now partially
tested and will combine conjuncts such as a < k1, a < k2  into
a < min(k1, k2), as well as detect equivalence from a >= k, a <= k,
and determine conflicting bounds requirements to be false.

This could be expanded to do analysis against other slotrefs in the
future, but this should probably be saved for another diff.

Testing: Expanded the test cases for the planner to achieve constant
propagation.  Added Kudu, datasource, Hdfs and HBase tests to validate
we can create EmptySetNodes.  Some manual testing for inequality
conjuncts but nothing formal yet.

Query: explain select * from functional_hbase.widetable_250_cols a
where a.int_col1 > 1 and a.int_col1 <= 20 and a.int_col1 < 50 and
a.int_col1 > 2
+---
| Explain String
+---
| Estimated Per-Host Requirements: Memory=1.00GB VCores=1
| PLAN-ROOT SINK
| |
| 01:EXCHANGE [UNPARTITIONED]
| |
| 00:SCAN HBASE [functional_hbase.widetable_250_cols a]
|predicates: a.int_col1 <= 20, a.int_col1 > 2
+---
Fetched 10 row(s) in 0.08s
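
The collation step described above can be sketched for a single column compared against constants. This Python toy is not the planner code in this change: `collate`, the `(op, constant)` tuples, and the `ALWAYS_FALSE` sentinel are hypothetical, and it ignores type-specific reasoning (for example, that `a > 2 and a < 3` is empty for integers).

```python
ALWAYS_FALSE = "FALSE"  # sentinel for contradictory bounds


def collate(preds):
    """Combine (op, constant) bounds on one column into the tightest set."""
    lower = None  # (value, inclusive)
    upper = None  # (value, inclusive)
    for op, k in preds:
        if op in (">", ">="):
            cand = (k, op == ">=")
            # A larger value is tighter; on a tie the exclusive bound wins.
            if lower is None or k > lower[0] or (k == lower[0] and not cand[1]):
                lower = cand
        elif op in ("<", "<="):
            cand = (k, op == "<=")
            # A smaller value is tighter; on a tie the exclusive bound wins.
            if upper is None or k < upper[0] or (k == upper[0] and not cand[1]):
                upper = cand
        else:
            raise ValueError(f"unsupported operator: {op}")
    if lower is not None and upper is not None:
        if lower[0] > upper[0]:
            return ALWAYS_FALSE
        if lower[0] == upper[0]:
            # a >= k and a <= k collapse to a = k; any exclusive side is empty.
            return [("=", lower[0])] if lower[1] and upper[1] else ALWAYS_FALSE
    result = []
    if upper is not None:
        result.append(("<=" if upper[1] else "<", upper[0]))
    if lower is not None:
        result.append((">=" if lower[1] else ">", lower[0]))
    return result


# The four predicates from the explain query above reduce to two.
print(collate([(">", 1), ("<=", 20), ("<", 50), (">", 2)]))
# -> [('<=', 20), ('>', 2)]
```

A result of `ALWAYS_FALSE` corresponds to the case where the scan node could be replaced by an empty-set node.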

Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/ValueRange.java
M fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
19 files changed, 644 insertions(+), 117 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/6389/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4883: Union Codegen

2017-03-31 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#6).

Change subject: IMPALA-4883: Union Codegen
..

IMPALA-4883: Union Codegen

For each non-passthrough child of the Union node, codegen the loop that
does per row tuple materialization.

Testing:
Ran test_queries.py test locally in exchaustive mode.

Benchmark:
Ran a local benchmark on a local 10 GB TPCDS dataset on an unpartitioned
store_sales table.

SELECT
  COUNT(c),
  COUNT(ss_customer_sk),
  COUNT(ss_cdemo_sk),
  COUNT(ss_hdemo_sk),
  COUNT(ss_addr_sk),
  COUNT(ss_store_sk),
  COUNT(ss_promo_sk),
  COUNT(ss_ticket_number),
  COUNT(ss_quantity),
  COUNT(ss_wholesale_cost),
  COUNT(ss_list_price),
  COUNT(ss_sales_price),
  COUNT(ss_ext_discount_amt),
  COUNT(ss_ext_sales_price),
  COUNT(ss_ext_wholesale_cost),
  COUNT(ss_ext_list_price),
  COUNT(ss_ext_tax),
  COUNT(ss_coupon_amt),
  COUNT(ss_net_paid),
  COUNT(ss_net_paid_inc_tax),
  COUNT(ss_net_profit),
  COUNT(ss_sold_date_sk)
FROM (
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
) t

Before: 39s704ms
Operator        #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------
13:AGGREGATE         1  194.504us  194.504us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE          1   17.284us   17.284us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE         3    2s202ms    2s934ms        3           1  115.00 KB       10.00 MB
00:UNION             3   32s514ms   34s926ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS      3  158.373ms  216.085ms   28.80M      28.80M  489.71 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS      3  167.002ms  171.738ms   28.80M      28.80M  489.74 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS      3  125.331ms  145.496ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS      3  148.478ms  194.311ms   28.80M      28.80M  489.69 MB        1.88 GB  tpcds_10_parquet.store_sales
|--06:SCAN HDFS      3  143.995ms  162.781ms   28.80M      28.80M  489.57 MB        1.88 GB  tpcds_10_parquet.store_sales
|--07:SCAN HDFS      3  169.731ms  250.201ms   28.80M      28.80M  489.58 MB        1.88 GB  tpcds_10_parquet.store_sales
|--08:SCAN HDFS      3  164.110ms  254.374ms   28.80M      28.80M  489.61 MB        1.88 GB  tpcds_10_parquet.store_sales
|--09:SCAN HDFS      3  135.631ms  162.117ms   28.80M      28.80M  489.63 MB        1.88 GB  tpcds_10_parquet.store_sales
|--10:SCAN HDFS      3  138.736ms  167.778ms   28.80M      28.80M  489.67 MB        1.88 GB  tpcds_10_parquet.store_sales
01:SCAN HDFS         3  202.015ms  248.728ms   28.80M      28.80M  489.68 MB        1.88 GB  tpcds_10_parquet.store_sales

After: 20s664ms
Operator        #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------
13:AGGREGATE         1  167.757us  167.757us        1           1   28.00 KB        -1.00 B  FINALIZE
12:EXCHANGE          1   16.592us   16.592us        3           1          0        -1.00 B  UNPARTITIONED
11:AGGREGATE         3    2s924ms    3s715ms        3           1  115.00 KB       10.00 MB
00:UNION             3    4s971ms    6s082ms  288.01M     288.01M    3.08 MB              0
|--02:SCAN HDFS      3    1s189ms    1s588ms   28.80M      28.80M  483.82 MB        1.88 GB  tpcds_10_parquet.store_sales
|--03:SCAN HDFS      3    1s117ms    1s157ms   28.80M      28.80M  484.85 MB        1.88 GB  tpcds_10_parquet.store_sales
|--04:SCAN HDFS      3    1s226ms    1s454ms   28.80M      28.80M  483.00 MB        1.88 GB  tpcds_10_parquet.store_sales
|--05:SCAN HDFS      3    1s141ms    1s3
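
As a quick check on the figures in this message, the two end-to-end times (39s704ms before, 20s664ms after) can be turned into a speedup ratio. The sketch below is not part of the patch. `parse_duration` is a hypothetical helper, and it assumes durations only use the `s`, `ms` and `us` units seen in the profile.

```python
import re

# Hypothetical helper, not part of Impala: converts profile durations such as
# "39s704ms", "158.373ms" or "194.504us" into seconds.
_UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}
_PART = re.compile(r"(\d+(?:\.\d+)?)(us|ms|s)")


def parse_duration(text):
    """Return the duration in seconds, or raise ValueError if it is malformed."""
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            raise ValueError("unexpected characters in %r" % text)
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError("cannot parse duration %r" % text)
    return total


before = parse_duration("39s704ms")
after = parse_duration("20s664ms")
speedup = before / after
print("speedup: %.2fx" % speedup)
```

Under these assumptions the query runs a little under twice as fast with the codegen'd union.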

[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

2017-03-31 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..


Patch Set 1:

Draft review

-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

2017-03-31 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6526

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..

IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

Adds Impala support for TIMESTAMP types stored in Kudu.

Impala's TIMESTAMP type is a 96-bit type with nanosecond
precision and Kudu's timestamp is a 64-bit microsecond delta
from the Unix epoch (called UNIXTIME_MICROS), so a conversion
is necessary.

TODO: As of now, this only supports writing TIMESTAMPs to Kudu.
Reading will require the Kudu client to return
UNIXTIME_MICROS in a padded slot for Impala.
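
To make the conversion concrete, here is a minimal sketch of going from a nanosecond-precision timestamp to a 64-bit microsecond delta from the Unix epoch. It is not the code in this patch. The function name is hypothetical, the input is assumed to be already split into whole seconds plus a nanosecond remainder, and truncating (rather than rounding) the sub-microsecond part is an assumption.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def to_unixtime_micros(seconds_since_epoch, nanos):
    """Hypothetical helper: microseconds since the Unix epoch as a signed 64-bit value.

    seconds_since_epoch: whole seconds relative to 1970-01-01 00:00:00 UTC
                         (may be negative).
    nanos:               nanoseconds past that second, in [0, 999_999_999].
    Sub-microsecond precision is dropped (assumed truncation).
    """
    if not 0 <= nanos < 1_000_000_000:
        raise ValueError("nanos out of range: %d" % nanos)
    micros = seconds_since_epoch * 1_000_000 + nanos // 1_000
    if not INT64_MIN <= micros <= INT64_MAX:
        raise OverflowError("value does not fit in a 64-bit integer")
    return micros


print(to_unixtime_micros(1, 1_500))  # 1 s + 1500 ns
```

The range check matters because the wider nanosecond type can represent values that a 64-bit microsecond count cannot.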

Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M tests/query_test/test_kudu.py
7 files changed, 93 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6526/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..


Patch Set 3:

> Can you report how you tested this, and if it works on RHEL 7
 > consistently enough that IMPALA-4733 has gone away?

I tested this locally by starting the minicluster on my dev machine, and I'm 
running a private exhaustive build on Cloudera's internal Jenkins. I couldn't 
find a way to make sure this fixes the RHEL7 issues, but it seems reasonable to 
assume that they were caused by HBase trying to bind ports in the ephemeral 
port range. If this passes the internal exhaustive build, I'd be willing to 
give it a try and see if the RHEL7 tests improve.

-- 
To view, visit http://gerrit.cloudera.org:8080/6524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Lars Volker (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6524

to look at the new patch set (#3).

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..

IMPALA-4733: Change HBase ports to non-ephemeral

We've seen repeated test failures because HBase tries to bind to ports
in the ephemeral port range, which are sometimes already occupied
by outgoing connections of other processes.

This change moves the ports to the new default HBase ports
(HBASE-10123):

HBase Master Port: 60000 -> 16000
HBase Master Web UI Port: 60010 -> 16010
HBase RegionServer Port: 60020 -> 16020
HBase RegionServer Web UI Port: 60030 -> 16030
HBase Status Multicast Port: 60100 -> 16100

This made it necessary to change the default KMS port, too
(HADOOP-12811):

KMS HTTP port: 16000 -> 9600
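
The reasoning behind the port choices can be checked with a short sketch. It is illustrative only and not part of the patch. The ephemeral range used here is an assumption (a common Linux default for net.ipv4.ip_local_port_range), and the helper names are hypothetical.

```python
# Assumption: a common Linux default for net.ipv4.ip_local_port_range.
# The actual range on a given test host may differ.
ASSUMED_EPHEMERAL_RANGE = (32768, 60999)

# (old port, new port) pairs from the commit message.
# 60000 is HBase's former default master port.
HBASE_PORT_CHANGES = {
    "HBase Master": (60000, 16000),
    "HBase Master Web UI": (60010, 16010),
    "HBase RegionServer": (60020, 16020),
    "HBase RegionServer Web UI": (60030, 16030),
    "HBase Status Multicast": (60100, 16100),
}
KMS_HTTP_PORT_CHANGE = (16000, 9600)


def in_ephemeral_range(port, port_range=ASSUMED_EPHEMERAL_RANGE):
    """True if the port could be handed out by the OS for outgoing connections."""
    low, high = port_range
    return low <= port <= high


def duplicate_ports(ports):
    """Return the set of ports that appear more than once."""
    seen, dupes = set(), set()
    for port in ports:
        if port in seen:
            dupes.add(port)
        seen.add(port)
    return dupes


for name, (old, new) in HBASE_PORT_CHANGES.items():
    print(name, old, in_ephemeral_range(old), new, in_ephemeral_range(new))
```

With the assumed range, every old HBase port is ephemeral and none of the new ones is. Checking for duplicates also shows why the KMS port had to move: the old KMS port 16000 would collide with the new HBase Master port.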

Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
---
M fe/src/test/resources/hbase-site.xml.template
M testdata/cluster/admin
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
5 files changed, 32 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..


Patch Set 3:

(1 comment)

Thank you for the review. Please see my comment and the new PS3.

http://gerrit.cloudera.org:8080/#/c/6524/2/testdata/cluster/node_templates/cdh5/etc/init.d/kms
File testdata/cluster/node_templates/cdh5/etc/init.d/kms:

Line 25: export KMS_HTTP_PORT=$KMS_WEBUI_PORT
> How did it work before? Was it picking the default port or something?
Yes, this was leaving the default port unchanged, which was 16000. I updated 
the commit message to highlight the changes to the ports.


-- 
To view, visit http://gerrit.cloudera.org:8080/6524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors

2017-03-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors
..


Patch Set 2: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/425/

-- 
To view, visit http://gerrit.cloudera.org:8080/6510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..


Patch Set 2:

Can you report how you tested this, and if it works on RHEL 7 consistently 
enough that IMPALA-4733 has gone away?

-- 
To view, visit http://gerrit.cloudera.org:8080/6524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6524/2/testdata/cluster/node_templates/cdh5/etc/init.d/kms
File testdata/cluster/node_templates/cdh5/etc/init.d/kms:

Line 25: export KMS_HTTP_PORT=$KMS_WEBUI_PORT
How did it work before? Was it picking the default port or something?


-- 
To view, visit http://gerrit.cloudera.org:8080/6524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views

2017-03-31 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6389/6/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

Line 72: customRewriter_ = null;
This was an unexpected wrinkle that made things awkward.


http://gerrit.cloudera.org:8080/#/c/6389/10/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 996:   info = new SlotInfo(it.nextIndex() - 1);
lol I forgot to put it in the map.  Worked out of the box after that.  Still 
needs some minor changes (this step only runs after successful constant 
propagation), but also should run even if no propagation is done.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW: IMPALA-4678: port backend exec to use buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7).

Change subject: PREVIEW: IMPALA-4678: port backend exec to use buffer pool
..

PREVIEW: IMPALA-4678: port backend exec to use buffer pool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).

Each ExecNode has to declare its memory requirements at Prepare() time.

Convert HashTable to use the new BufferPool via a Suballocator.

Make PAGG memory consumption more efficient (avoid wasting buffers):
* Allow preaggs to execute with 0 reservation - if streams and hash tables
  cannot be allocated, it will pass through rows.
* Halve the buffer requirement for spilling aggs - avoid allocating
  buffers for aggregated and unaggregated streams simultaneously.
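
The pass-through idea for preaggregations can be illustrated with a toy sketch. This is not Impala's implementation. `max_groups` is a hypothetical stand-in for the memory reservation, and the aggregate is a simple count per key.

```python
def streaming_preagg(keys, max_groups):
    """Yield (key, partial_count) pairs for a downstream merge aggregation.

    Keys that fit in the in-memory table are aggregated. Once the table is
    full, rows for unseen keys are passed through as single-row partial
    results instead of failing or spilling. With max_groups == 0 every row
    is passed through.
    """
    groups = {}
    for key in keys:
        if key in groups:
            groups[key] += 1
        elif len(groups) < max_groups:
            groups[key] = 1
        else:
            # No room for a new group: pass the row through unaggregated.
            yield (key, 1)
    yield from groups.items()


def merge(partials):
    """Final aggregation: sum the partial counts per key."""
    totals = {}
    for key, count in partials:
        totals[key] = totals.get(key, 0) + count
    return totals


rows = ["a", "b", "a", "c", "b", "a"]
print(merge(streaming_preagg(rows, max_groups=0)))
print(merge(streaming_preagg(rows, max_groups=2)))
```

The final result is the same for any budget, because the merge step sums whatever partial counts arrive. A smaller budget only means more rows reach the merge step.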

Convert Sorter to use BufferPool.

TODO in this patch:
 * some of the DCHECKS may be too aggressive. With the current memory
   transfer model, operators that accumulate batches, i.e. NLJ, can
   "steal" reservation. We need a test to reproduce this problem. We
   can probably fix by having NLJ copy if it sees an attached buffer.
 * Consider renaming buffer_pool_page_size, e.g. to spillable_page_size

TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Remove the old hash join and aggregation nodes

Testing:
* Updated tests to reflect new memory requirements
* TODO: recalibrate limits in test_mem_usage_scaling
* TODO: more tests to exercise new code paths

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/partitioned-hash-join-node.inline.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/buffered-block-mgr-test.cc
D be/src/runtime/buffered-block-mgr.cc
D be/src/runtime/buffered-block-mgr.h
D be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
D be/src/runtime/buffered-tuple-stream.cc
D be/src/runtime/buffered-tuple-stream.h
D be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.h
M be/src/util/static-asserts.cc
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_sort.py
R tests/query_test/test_spilling.py
M tests/stress/concurrent_select.py
60 files changed, 1,541 insertions(+), 7,538 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/5801/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] PREVIEW: IMPALA-4678: port backend exec to use buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: PREVIEW: IMPALA-4678: port backend exec to use buffer pool
..


Patch Set 7:

Refreshed to my latest development version.

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Pass build type to Impala LZO.

2017-03-31 Thread Alex Behm (Code Review)
Alex Behm has submitted this change and it was merged.

Change subject: Pass build type to Impala LZO.
..


Pass build type to Impala LZO.

Before, the build type used for Impala LZO was always debug.
Now, the build type is passed from the Impala CMakeLists.txt.

This patch needs corresponding changes to Impala LZO.

Testing: I tested locally with these build types:
DEBUG, RELEASE, and ADDRESS_SANITIZER.

Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0
Reviewed-on: http://gerrit.cloudera.org:8080/6446
Reviewed-by: Alex Behm 
Tested-by: Alex Behm 
---
M CMakeLists.txt
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Alex Behm: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Pass build type to Impala LZO.

2017-03-31 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Pass build type to Impala LZO.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Pass build type to Impala LZO.

2017-03-31 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Pass build type to Impala LZO.
..


Patch Set 2: Code-Review+2

Merging this change directly after manual validation. It needs a coordinated 
change with Impala-Lzo. GVO and private testing of such coordinated changes is 
currently not supported on jenkins.impala.io. I filed 
https://issues.apache.org/jira/browse/IMPALA-5148 to improve this.

Jim, I could not find an easy way for me to test whether this fixes IMPALA-4699 
as well. I don't think my change has anything to do with that JIRA.

-- 
To view, visit http://gerrit.cloudera.org:8080/6446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 15:

Also, what do you think about removing the per-list limits? They're not 
necessary for correctness and they add an additional thing to tune. I think 
after the maintenance and scavenging they don't add much.

-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 164: };
> I was thinking it might be useful to have information similar to what we ge
We should chat about the design a bit. I added a couple of basic counters as 
part of the follow-up mmap patch: https://gerrit.cloudera.org/#/c/6474

We probably don't want to have global counters shared between all threads, but 
we could probably have per-arena counters aggregated on demand via a SumGauge.
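
One possible shape for that idea, sketched in Python purely for discussion: each arena owns its own counter, and a gauge sums them only when it is read. The class names below are hypothetical stand-ins and are not Impala's metrics classes.

```python
import threading


class ArenaCounter:
    """Counter owned by one arena, so updates do not contend across arenas."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, delta):
        with self._lock:
            self._value += delta

    def value(self):
        with self._lock:
            return self._value


class SumGauge:
    """Hypothetical stand-in: computes the total only when someone reads it."""

    def __init__(self, counters):
        self._counters = list(counters)

    def value(self):
        return sum(counter.value() for counter in self._counters)


arenas = [ArenaCounter() for _ in range(4)]
total_free_bytes = SumGauge(arenas)
arenas[0].add(8192)
arenas[3].add(4096)
print(total_free_bytes.value())
```

The total is not an atomic snapshot across arenas, which is usually acceptable for monitoring.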


-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool
..

PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool

Allocate with mmap instead of TCMalloc to give more control over memory
usage. Also allocate huge pages when possible to reduce TLB pressure.

Adds additional memory metrics, since we previously relied on the
assumption that all memory was allocated through TCMalloc.
memory.total-used and memory.total-reserved track the total across
the buffer pool and TCMalloc. When the buffer pool is not present,
they just report the TCMalloc values.

ASAN still uses malloc() because it doesn't instrument mmap().
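
For readers unfamiliar with the technique, here is a minimal Python sketch of allocating a buffer with an anonymous mmap and asking for huge pages when possible. It is not the C++ code in this patch. The 2 MB huge page size is an assumption (typical for x86-64 Linux), and whether the patch uses madvise() or another mechanism is not stated here.

```python
import mmap

# Assumption: 2 MB huge pages, typical for x86-64 Linux.
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def allocate_buffer(length):
    """Allocate `length` bytes with an anonymous mapping instead of malloc.

    If the length is a multiple of the assumed huge page size and the
    platform exposes MADV_HUGEPAGE, hint that the kernel may back the
    mapping with transparent huge pages. The hint is best effort.
    """
    buf = mmap.mmap(-1, length)  # fileno -1 requests an anonymous mapping
    if length % HUGE_PAGE_SIZE == 0 and hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            buf.madvise(mmap.MADV_HUGEPAGE)
        except OSError:
            pass  # huge pages unavailable; fall back to regular pages
    return buf


buf = allocate_buffer(HUGE_PAGE_SIZE)
buf[:5] = b"hello"
print(len(buf), bytes(buf[:5]))
buf.close()  # unmaps the memory and returns it to the OS
```

Because the mapping is released explicitly, memory goes back to the OS when the buffer is closed, which is part of the extra control over memory usage that mmap offers.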

Testing:
Added some unit tests to test edge cases. Many pre-existing tests also
exercise the modified code.

Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/bufferpool/system-allocator.cc
M be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/exec-env.cc
M be/src/statestore/statestored-main.cc
M be/src/util/asan.h
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
18 files changed, 355 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/6474/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5073: Use mmap instead of malloc for buffer pool

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-5073: Use mmap instead of malloc for buffer pool
..

IMPALA-5073: Use mmap instead of malloc for buffer pool

Allocate with mmap instead of TCMalloc to give more control over memory
usage. Also allocate huge pages when possible to reduce TLB pressure.

Adds additional memory metrics, since we previously relied on the
assumption that all memory was allocated through TCMalloc.
memory.total-used and memory.total-reserved track the total across
the buffer pool and TCMalloc. When the buffer pool is not present,
they just report the TCMalloc values.

ASAN still uses malloc() because it doesn't instrument mmap().

Testing:
Added some unit tests to test edge cases. Many pre-existing tests also
exercise the modified code.

Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/bufferpool/system-allocator.cc
M be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/exec-env.cc
M be/src/statestore/statestored-main.cc
M be/src/util/asan.h
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
18 files changed, 355 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/6474/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors

2017-03-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/425/

-- 
To view, visit http://gerrit.cloudera.org:8080/6510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool

2017-03-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 164: };
> Most of the perf counters are client-centric, so we will get a lot of info 
I was thinking it might be useful to have information similar to what we get 
from tc-malloc. Like breakdown of application used memory between free-list, 
clean-pages, and allocated-buffers/pinned-pages/dirty-pages.  To help verify 
the system behaves as we expect, and debug issues when we hit unexpected memory 
pressure.  Also to debug issues if memory skew occurs between cores, etc (i.e. 
visibility in arenas).


-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors

2017-03-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6510/2/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

Line 81:   TimestampValue::FromUnixTime(intp.val).ToString());
not your change and don't have to address it, but it looks like this has weird 
behavior when the input unix time is out of range of a TimestampValue. Looks 
like the result is an empty string, whereas StringValFromTimestamp() used below 
gives null. Filed IMPALA-5146 for that.


-- 
To view, visit http://gerrit.cloudera.org:8080/6510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 7:

Patch set 7 just updates the commit message to reflect all the line wrapping.

-- 
To view, visit http://gerrit.cloudera.org:8080/6512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#7).

Change subject: IMPALA-5140: improve docs building guidelines
..

IMPALA-5140: improve docs building guidelines

Move docs/generatingImpalaDoc.md to docs/README.md. This will
automatically render the document inline at places like:

https://github.com/apache/incubator-impala/tree/master/docs

under the directory listing.

Fix existing markdown which wasn't always rendering properly. Remove
unneeded HTML and backslashes. Add a mention of make, and add one
troubleshooting tip. Wrap most lines at 90 chars. This does not change
how Github renders the markdown, and it makes reading the source easier
as well.

Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
---
A docs/README.md
D docs/generatingImpalaDoc.md
2 files changed, 163 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md
File docs/README.md:

PS5, Line 111: add the
 :following lines to the end of the file:
> I don't understand this comment. Does patch set 6 not address your concern?
Oh, sorry, misread it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md
File docs/README.md:

PS5, Line 111: add the
 :following lines to the end of the file:
> It will be fine if you open a new terminal. I'd suggest adding "source" to 
I don't understand this comment. Does patch set 6 not address your concern?


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 6:

Laurel, do we still need to tell people how to generate the SQL reference when 
the entire doc can be generated just fine?

-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md
File docs/README.md:

PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
html5 \
:   -output  \
:   -filter 
: ```
: 
: * **To generate PDF output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
pdf \
:   -output  \
:   -filter 
: ```
> You'd have to tell me, since you are the initial committer of this file. On
Laurel was the original author. I'll add her to the review.


PS5, Line 111: y
 :`/Users//.bash_profile`. Edit
> Done
It will be fine if you open a new terminal. I'd suggest adding "source" to 
these instructions in step 1 so that people who don't open a new terminal will 
get the right result.


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#6).

Change subject: IMPALA-5140: improve docs building guidelines
..

IMPALA-5140: improve docs building guidelines

Move docs/generatingImpalaDoc.md to docs/README.md. This will
automatically render the document inline at places like:

https://github.com/apache/incubator-impala/tree/master/docs

under the directory listing.

Fix existing markdown which wasn't always rendering properly.  Remove
unneeded HTML and backslashes. Add a mention of make, and add one
troubleshooting tip.

Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
---
A docs/README.md
D docs/generatingImpalaDoc.md
2 files changed, 163 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md
File docs/README.md:

PS5, Line 14: doc_prototype
> master
Done


PS5, Line 17: doc_prototype
> master
Done


PS5, Line 59: ./bin/dita
> depends on where you put it
It works if you followed step 3 above.


PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
html5 \
:   -output  \
:   -filter 
: ```
: 
: * **To generate PDF output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
pdf \
:   -output  \
:   -filter 
: ```
> Why are these needed?
You'd have to tell me, since you are the initial committer of this file. One 
guess is that it exhibits using a different ditamap.


PS5, Line 111: add the
 :following lines to the end of the file:
> Then 'source' it.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3381: Support AM/PM marker in date and time format strings

2017-03-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3381: Support AM/PM marker in date and time format 
strings
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6523/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 5641:   // AM/PM marker can be repeated and placed anywhere in the format 
string
Do we have a check that prevents having multiple separate am/pm markers?


http://gerrit.cloudera.org:8080/#/c/6523/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

Line 173:   case 'a': tok_type = AM_PM_MARKER; dt_ctx->has_am_pm_marker = 
true; break;
Have you considered supporting 'am' and 'pm' as tokens, too, like Greg 
suggested in the JIRA?


Line 201:   if (tok_len != 2) {
You could remove this if statement (and add 0 below).


PS1, Line 467: strncmp
You can use strncasecmp() and simplify the code.
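
A minimal sketch of the strncasecmp() suggestion, assuming a POSIX system. `Meridiem` and `ParseMeridiem` are hypothetical names used only for illustration and are not taken from timestamp-parse-util.cc. One case-insensitive comparison per marker replaces separate checks for "AM", "am", "Am" and so on.

```cpp
#include <strings.h>  // strncasecmp (POSIX)

#include <cstddef>

enum class Meridiem { kNone, kAm, kPm };

// Hypothetical helper: classifies the first two characters of 'str' as an
// AM/PM marker, ignoring case. 'len' is the number of readable bytes.
Meridiem ParseMeridiem(const char* str, size_t len) {
  if (len < 2) return Meridiem::kNone;
  if (strncasecmp(str, "AM", 2) == 0) return Meridiem::kAm;
  if (strncasecmp(str, "PM", 2) == 0) return Meridiem::kPm;
  return Meridiem::kNone;
}
```

Note that strncasecmp() follows the current locale's case rules, which is usually acceptable for the ASCII letters involved here.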


-- 
To view, visit http://gerrit.cloudera.org:8080/6523

Gerrit-MessageType: comment
Gerrit-Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4893: Efficiently update the rows read counter for sequence file

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4893: Efficiently update the rows read counter for 
sequence file
..


Patch Set 1:

(1 comment)

Can you add a test for the RowsRead counter? It would be nice extra coverage 
that I think we're currently missing.

E.g. I think in scanners.test we do some full table scans of some functional 
tables, and that is run for all file formats.

It looks like runtime_filters.test has some verification of RowsRead.

http://gerrit.cloudera.org:8080/#/c/6522/1/be/src/exec/hdfs-sequence-scanner.cc
File be/src/exec/hdfs-sequence-scanner.cc:

Line 346:   COUNTER_ADD(scan_node_->rows_read_counter(), num_rows_read);
I think we can avoid the duplicated logic if we break here instead of 
returning. I.e.

if (stream->eof()) break;
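
The idea of breaking out of the loop instead of returning can be sketched as follows. `FakeStream` and `ReadBatch` are hypothetical stand-ins, not the scanner's real types; the plain `int64_t` counter stands in for the profile counter updated via COUNTER_ADD. Because the end-of-file path uses `break`, it falls through to the same single counter update as the normal exit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the scanner's input stream.
struct FakeStream {
  std::vector<int> records;
  size_t pos = 0;
  bool eof() const { return pos >= records.size(); }
  int Next() { return records[pos++]; }
};

// Reads up to 'max_rows' records. Both exit paths (eof and full batch)
// reach the single counter update after the loop.
int64_t ReadBatch(FakeStream* stream, int max_rows, int64_t* rows_read_counter) {
  int64_t num_rows_read = 0;
  while (num_rows_read < max_rows) {
    if (stream->eof()) break;  // no counter update or early return needed here
    stream->Next();
    ++num_rows_read;
  }
  *rows_read_counter += num_rows_read;  // the only place the counter changes
  return num_rows_read;
}
```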


-- 
To view, visit http://gerrit.cloudera.org:8080/6522

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie42c97a36e46172884cc497aa645036c2c11f541
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3079: Fix sequence file writer
..


Patch Set 6:

Forgive the drive-by comment, but I'm curious about whether we plan to make 
sequence files a supported format for writing. It seems strange to put all this 
effort into it and keep it hidden behind the flag with other file writers like 
Avro that are totally broken.

-- 
To view, visit http://gerrit.cloudera.org:8080/6107

Gerrit-MessageType: comment
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md
File docs/README.md:

PS5, Line 14: doc_prototype
master


PS5, Line 17: doc_prototype
master


PS5, Line 59: ./bin/dita
depends on where you put it


PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
html5 \
:   -output  \
:   -filter 
: ```
: 
: * **To generate PDF output of the Impala SQL Reference, run 
the following command:**
: 
: ```
: ./bin/dita -input  -format 
pdf \
:   -output  \
:   -filter 
: ```
Why are these needed?


PS5, Line 111: add the
 :following lines to the end of the file:
Then 'source' it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4883: Union Codegen

2017-03-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen
..


Patch Set 5:

(13 comments)

Getting pretty close, just minor cleanup at this point.

I also just wanted to check with Michael that the tuple_pool_ approach made the 
most sense for now - we'll need to clean that up as part of his codegen work 
but I don't think it makes sense to fix in this patch.

http://gerrit.cloudera.org:8080/#/c/6459/4/be/src/exec/union-node-ir.cc
File be/src/exec/union-node-ir.cc:

Line 34: 
> We can avoid checking limits for each row if we check it at the end and tru
It looks like we already do this for the passthrough case so we might as well 
do it here.


http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node-ir.cc
File be/src/exec/union-node-ir.cc:

Line 19: #include "runtime/tuple.h"
Do we need tuple.h? I don't think I see any references to Tuple* in here.


Line 21: #include "util/runtime-profile-counters.h"
Is this needed still?


Line 35:   while (!dst_batch->AtCapacity() && child_row_idx < 
child_batch->num_rows()) {
Nice! We can maybe avoid a few more loads and stores via the child_batch and 
tuple_buf pointers. I.e.

int child_batch_rows = child_batch->num_rows();
uint8_t* curr_tuple = *tuple_buf;
...
*tuple_buf = curr_tuple;
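
A self-contained sketch of the same idea, under simplifying assumptions: `FakeBatch` and `CopyTuples` are hypothetical, and tuples are fixed-size byte blobs rather than Impala's real row format. The row count and the tuple pointer are read into locals before the loop and written back once afterwards, so the loop body does not reload them through the pointers on every iteration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical simplified batch of fixed-size tuples.
struct FakeBatch {
  const uint8_t* data;
  int num_rows_;
  int num_rows() const { return num_rows_; }
};

// Copies up to 'capacity' tuples of 'tuple_size' bytes, starting at
// *child_row_idx, into the buffer at *tuple_buf. Returns the number copied.
int CopyTuples(const FakeBatch* child_batch, int* child_row_idx, int tuple_size,
               int capacity, uint8_t** tuple_buf) {
  const int child_batch_rows = child_batch->num_rows();  // loaded once
  uint8_t* curr_tuple = *tuple_buf;                      // loaded once
  int row = *child_row_idx;
  int copied = 0;
  while (copied < capacity && row < child_batch_rows) {
    const uint8_t* src = child_batch->data + static_cast<size_t>(row) * tuple_size;
    std::memcpy(curr_tuple, src, tuple_size);
    curr_tuple += tuple_size;
    ++row;
    ++copied;
  }
  *tuple_buf = curr_tuple;  // stored once
  *child_row_idx = row;
  return copied;
}
```

Whether the compiler would have hoisted these loads on its own depends on what it can prove about aliasing, which is why doing it by hand can help in a hot loop.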


Line 46:   if (limit_ != -1 && num_rows_returned_ + dst_batch->num_rows() > 
limit_) {
We don't need to cross-compile this logic. Let's move it into the caller and 
save LLVM some work.

Although, see my comment about moving this logic to GetNext() and sharing it 
for all three codepaths.


http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

Line 168:   if (limit_ != -1 && num_rows_returned_ + row_batch->num_rows() > 
limit_) {
How about we move this logic around num_rows_returned_ and limit_ into 
GetNext()? I believe the same logic can work for all three cases if we slightly 
extend it so that it handles the case when GetNext() is called with a non-empty 
batch, which can happen in a subplan.
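
One way the shared limit handling could look, as a sketch only. `FakeRowBatch` and `ApplyLimit` are hypothetical, and the real GetNext() has more state than this. The caller records how many rows the batch held on entry (`rows_before`), so a batch that was already non-empty is truncated correctly: only rows added by this call count against the limit.

```cpp
#include <cstdint>

// Hypothetical stand-in for a row batch; only the row count matters here.
struct FakeRowBatch {
  int num_rows_ = 0;
  int num_rows() const { return num_rows_; }
  void set_num_rows(int n) { num_rows_ = n; }
};

// Applies 'limit' (-1 means no limit) after rows were appended to 'batch'.
// 'rows_before' is the batch's row count when the call started. Returns the
// updated number of rows returned so far and sets *eos if the limit is hit.
int64_t ApplyLimit(int64_t limit, int64_t num_rows_returned, int rows_before,
                   FakeRowBatch* batch, bool* eos) {
  int64_t added = batch->num_rows() - rows_before;
  if (limit != -1 && num_rows_returned + added >= limit) {
    added = limit - num_rows_returned;  // keep only the rows still allowed
    batch->set_num_rows(rows_before + static_cast<int>(added));
    *eos = true;
  }
  return num_rows_returned + added;
}
```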


http://gerrit.cloudera.org:8080/#/c/6459/1/be/src/exec/union-node.h
File be/src/exec/union-node.h:

PS1, Line 71:  in this poo
> Ok, this can be removed in the future.
Michael, do you think it makes sense to use this approach for now? I'm not that 
familiar with CodegenMaterializeExprs() so not sure if there is a better way to 
do this.


http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 28: #include "runtime/tuple.h"
Do we need the tuple.h and tuple-row.h imports? Oh I guess for the inline 
MaterializeExprs function, but we can move that to the -ir.cc file anyway.


Line 72:   /// each GetNext() call.
We should add a TODO to remove this. Maybe Michael knows if there's a JIRA that 
will allow us to remove it.


PS5, Line 99: Null
NULL here and below, just for consistency with other comments.


PS5, Line 128: row_batch
dst_batch.


Line 136:   void IR_ALWAYS_INLINE MaterializeExprs(const 
std::vector& exprs,
Move this to the -ir.cc file? I don't think there's a reason we need to define 
it in the .h


Line 148:   bool inline IsChildPassthrough(int child_idx) const {
I don't think any of the "inline" specifiers here and below do anything - if 
the function is defined in the class body it implicitly has an "inline" hint.

https://isocpp.org/wiki/faq/inline-functions#inline-member-fns-more
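
A small illustration of the point about in-class definitions. `UnionNodeSketch` and its members are hypothetical and only mimic the shape of the code under review. The first member function is implicitly inline because it is defined in the class body; the second is defined outside the class, which is where the keyword matters if the definition sits in a header.

```cpp
class UnionNodeSketch {
 public:
  explicit UnionNodeSketch(int first_materialized_child_idx)
    : first_materialized_child_idx_(first_materialized_child_idx) {}

  // Defined inside the class body: implicitly inline, no keyword needed.
  bool IsChildPassthrough(int child_idx) const {
    return child_idx < first_materialized_child_idx_;
  }

  // Declared here, defined below, outside the class body.
  bool IsChildMaterialized(int child_idx) const;

 private:
  int first_materialized_child_idx_;  // hypothetical member for illustration
};

// Outside the class body the 'inline' specifier is required if this
// definition is placed in a header included by several translation units.
inline bool UnionNodeSketch::IsChildMaterialized(int child_idx) const {
  return !IsChildPassthrough(child_idx);
}
```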


-- 
To view, visit http://gerrit.cloudera.org:8080/6459

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5140: improve docs building guidelines
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6512/4/docs/README.md
File docs/README.md:

Line 6: * Open a terminal window and run the following commands to get the 
Impala documentation source files from Git:
> Could you wrap long lines to help with the gerrit display to help ease revie
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: comment
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines

2017-03-31 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#5).

Change subject: IMPALA-5140: improve docs building guidelines
..

IMPALA-5140: improve docs building guidelines

Move docs/generatingImpalaDoc.md to docs/README.md. This will
automatically render the document inline at places like:

https://github.com/apache/incubator-impala/tree/master/docs

under the directory listing.

Fix existing markdown which wasn't always rendering properly.  Remove
unneeded HTML and backslashes. Add a mention of make, and add one
troubleshooting tip.

Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
---
A docs/README.md
D docs/generatingImpalaDoc.md
2 files changed, 161 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6512

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..

IMPALA-4733: Change HBase ports to non-ephemeral

Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
---
M fe/src/test/resources/hbase-site.xml.template
M testdata/cluster/admin
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
5 files changed, 26 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6524

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer

2017-03-31 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change.

Change subject: IMPALA-3079: Fix sequence file writer
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6107/4/be/src/exec/hdfs-sequence-table-writer.cc
File be/src/exec/hdfs-sequence-table-writer.cc:

PS4, Line 179: 
 : 
> Thanks for listing them out. Please also put this list in the commit messag
Done


http://gerrit.cloudera.org:8080/#/c/6107/5/be/src/exec/read-write-util.h
File be/src/exec/read-write-util.h:

Line 214: // Returns size of the encoded long value, including the 1 byte for 
length for val < -112
> for val < -112 or val > 127.
Done
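
For context, the size rule behind this comment can be sketched as below. It follows the variable-length long encoding used by Hadoop's WritableUtils as I understand it; `VLongRequiredBytesSketch` is a hypothetical name and this is not the code in read-write-util.h. Values in [-112, 127] fit in a single byte; anything else needs one length byte plus 1 to 8 payload bytes.

```cpp
#include <cstdint>

// Returns the number of bytes the encoding needs for 'val', including the
// leading length byte used when val < -112 or val > 127.
int VLongRequiredBytesSketch(int64_t val) {
  if (val >= -112 && val <= 127) return 1;
  // Negative values are stored as their one's complement, so measure that.
  uint64_t magnitude =
      val < 0 ? ~static_cast<uint64_t>(val) : static_cast<uint64_t>(val);
  int payload_bytes = 0;
  while (magnitude != 0) {
    magnitude >>= 8;
    ++payload_bytes;
  }
  return payload_bytes + 1;  // +1 for the length byte
}
```

The boundary values (-113, -112, 127, 128) and the extremes of int64_t are the cases most worth covering in a unit test for this kind of function.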


PS5, Line 228: ollow
> nit: long line
Done


PS5, Line 245: um_bytes, 9);
> nit: help to comment why it's 119 here (which is different from 120 in the 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6107

Gerrit-MessageType: comment
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer

2017-03-31 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#6).

Change subject: IMPALA-3079: Fix sequence file writer
..

IMPALA-3079: Fix sequence file writer

This change fixes the following issues in the Sequence File Writer:
1. ReadWriteUtil::VLongRequiredBytes() and ReadWriteUtil::PutVLong()
   were broken. As a result, Impala could not read back uncompressed
   sequence files created by Impala.

2. KEY_CLASS_NAME was missing from the sequence file header. As a
   result, Hive could not read back uncompressed sequence files
   created by Impala.

3. Impala created record-compressed sequence files with empty keys
   block. As a result, Hive could not read back record-compressed
   sequence files created by Impala.

4. Impala created block-compressed files with:
   - empty key-lengths block
   - empty keys block
   - empty value-lengths block
   This resulted in invalid block-compressed sequence files that Hive could
   not read back.

5. In some cases the wrong Record-compression flag was written to the
   sequence file header. As a result, Hive could not read back record-
   compressed sequence files created by Impala.

6. Impala added 'sync_marker' instead of 'neg1_sync_marker' to the
   beginning of blocks in block-compressed sequence files. Hive could
   not read these files back.

7. The calculation of block sizes in SnappyBlockCompressor class was
   incorrect for odd-length buffers.

Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
---
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-sequence-table-writer.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/util/compress.cc
M be/src/util/decompress-test.cc
M testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
M tests/query_test/test_compressed_formats.py
8 files changed, 494 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6107/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6107

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer

2017-03-31 Thread Attila Jeges (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6107

to look at the new patch set (#6).

Change subject: IMPALA-3079: Fix sequence file writer
..

-- 
To view, visit http://gerrit.cloudera.org:8080/6107

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral

2017-03-31 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6524

Change subject: IMPALA-4733: Change HBase ports to non-ephemeral
..

IMPALA-4733: Change HBase ports to non-ephemeral

Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
---
M fe/src/test/resources/hbase-site.xml.template
M testdata/cluster/admin
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
3 files changed, 24 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6524

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-3381: Support AM/PM marker in date and time format strings

2017-03-31 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6523

Change subject: IMPALA-3381: Support AM/PM marker in date and time format 
strings
..

IMPALA-3381: Support AM/PM marker in date and time format strings

This change adds an AM/PM marker to the format strings used in the
'to_timestamp', 'unix_timestamp' and 'from_unixtime' functions.

It uses 'a' for the AM/PM marker, following the Hive implementation
(which follows Java 'SimpleDateFormat' patterns). As in Hive,
the 'a' pattern letter can be repeated any number of times in the
format string without affecting the corresponding presentation.

For example:
> select from_unixtime(
>   unix_timestamp('2017-03-31 11:19:23 PM', 'yyyy-MM-dd HH:mm:ss a'),
>   'yyyy-MM-dd HH:mm:ss aaa');
2017-03-31 11:19:23 PM

Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
3 files changed, 137 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6523/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6523

Gerrit-MessageType: newchange
Gerrit-Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges