[Impala-ASF-CR] IMPALA-3224: De-Cloudera non-docs JIRA URLs
Jim Apple has posted comments on this change. Change subject: IMPALA-3224: De-Cloudera non-docs JIRA URLs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6487/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: Line 230: // https://issues.apache.org/jira/browse/IMPALA-3570 > In most places we seem to refer to JIRAs without the full URL: Done -- To view, visit http://gerrit.cloudera.org:8080/6487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3224: De-Cloudera non-docs JIRA URLs
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6487 to look at the new patch set (#2). Change subject: IMPALA-3224: De-Cloudera non-docs JIRA URLs .. IMPALA-3224: De-Cloudera non-docs JIRA URLs John Russell is planning to fix the URLs in docs in a separate commit. Fixed using: (git ls-files | xargs replace \ 'https://issues.cloudera.org/browse/IMPALA' 'IMPALA' --) && \ git checkout HEAD docs Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182 --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M shell/shell_output.py M testdata/bin/compute-table-stats.sh M testdata/bin/create-load-data.sh M testdata/bin/load-test-warehouse-snapshot.sh M testdata/bin/setup-hdfs-env.sh M tests/comparison/db_connection.py M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/custom_cluster/test_kudu_not_available.py M tests/stress/concurrent_select.py 11 files changed, 22 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/6487/2 -- To view, visit http://gerrit.cloudera.org:8080/6487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker
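The rewrite described in the commit message above can be sketched in Python. This is only an illustration of the same substitution, not the command that was run; the helper names `decloudera` and `should_rewrite` are hypothetical, and the `docs/` check stands in for the `git checkout HEAD docs` step that restores the docs tree.

```python
# Hypothetical helpers illustrating the URL rewrite; not part of the Impala repo.
OLD_PREFIX = "https://issues.cloudera.org/browse/IMPALA"
NEW_PREFIX = "IMPALA"


def decloudera(text: str) -> str:
    """Replace full Cloudera JIRA URLs with the bare issue key."""
    return text.replace(OLD_PREFIX, NEW_PREFIX)


def should_rewrite(path: str) -> bool:
    """Skip the docs tree, which is left for a separate commit."""
    return not path.startswith("docs/")


if __name__ == "__main__":
    line = "# See https://issues.cloudera.org/browse/IMPALA-1234 for details"
    print(decloudera(line))  # the URL collapses to the bare key IMPALA-1234
```

The issue number in the usage line is made up for the example.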
[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
Dan Hecht has posted comments on this change. Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6526/1/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: PS1, Line 180: Unix time (seconds since the Unix epoch) representation in UTC unix time isn't really "in UTC". It's just the number of seconds since unix epoch (which is specified as Jan 1 1970 0:00 UTC). That is, unix time is timezone independent. The implied timezone (which in this case you want to be UTC) applies to the timestamp, not the resulting unix time. So, I think the comment should be something like: /// Interpret 'this' as a timestamp in UTC and convert to unix time. and rename the method accordingly. Or am I missing something? -- To view, visit http://gerrit.cloudera.org:8080/6526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
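The point in the review comment above, that the timezone belongs to the timestamp being interpreted and not to the resulting Unix time, can be shown with a small Python sketch. The function name `to_unix_time_utc` is hypothetical and this is not the Impala implementation; it assumes whole-second, timezone-naive inputs.

```python
from datetime import datetime, timedelta, timezone


def to_unix_time_utc(wall_clock: datetime) -> int:
    """Interpret a naive timestamp as UTC and convert it to Unix time."""
    return int(wall_clock.replace(tzinfo=timezone.utc).timestamp())


# The same wall-clock reading, interpreted in two different timezones.
reading = datetime(1970, 1, 1, 0, 0, 0)
as_utc = to_unix_time_utc(reading)
as_utc_minus_8 = int(
    reading.replace(tzinfo=timezone(timedelta(hours=-8))).timestamp()
)

if __name__ == "__main__":
    # The two integers differ only because the interpretation of the
    # wall-clock reading differs; neither integer carries a timezone.
    print(as_utc, as_utc_minus_8)
```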
[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Dan Hecht has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 164: }; > We should chat about the design a bit. I added a couple of basic counters a Sure, let's chat about it. This can be done in a follow on patch, so we can continue with the current patch without it. -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Dan Hecht has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool .. Patch Set 15: > Also, what do you think about removing the per-list limits? They're > not necessary for correctness and they add an additional thing to > tune. I think after the maintenance and scavenging they don't add > much. Fine with me to remove. -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Laurel Hale has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 7: Code-Review+1 everything looks ok -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports
Laurel Hale has posted comments on this change. Change subject: IMPALA-4643: [DOCS] Change URLs / set up keydefs for JIRA reports .. Patch Set 1: (6 comments) There are some issues that I've listed in my comments. http://gerrit.cloudera.org:8080/#/c/6515/1/docs/topics/impala_fixed_issues.xml File docs/topics/impala_fixed_issues.xml: PS1, Line 573: fixed_issues_232 I looked in "impala_fixed_issues.xml" and there IS a concept id "fixed_issues_232". Not sure why this isn't working, but the link is not working. PS1, Line 2461: This points to empty search results: https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.5%22%20and%20resolution%3D%22Fixed%22 PS1, Line 2706: This points to empty search results: https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.4%22%20and%20resolution%3D%22Fixed%22 PS1, Line 2770: https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.3%22%20and%20resolution%3D%22Fixed%22 PS1, Line 2820: This points to empty search results: https://issues.apache.org/jira/issues/?jql=project%3Dimpala%20and%20fixVersion%3D%22Impala%202.0.2%22%20and%20resolution%3D%22Fixed%22 -- To view, visit http://gerrit.cloudera.org:8080/6515 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I007e634f9da57289674683dd5bf64e3e3ca8f525 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
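For checking links like the ones quoted in the comments above, the search URL can be rebuilt from a fix version with the standard library. This is a sketch only: the helper name `fixed_issues_url` is hypothetical, and the JQL template is inferred from the URLs quoted in this message, not taken from the docs build.

```python
from urllib.parse import quote

JIRA_SEARCH = "https://issues.apache.org/jira/issues/?jql="


def fixed_issues_url(fix_version: str) -> str:
    """Build the percent-encoded JIRA search URL for one fix version."""
    jql = (
        f'project=impala and fixVersion="{fix_version}" '
        'and resolution="Fixed"'
    )
    # safe="" forces '=' and '"' to be percent-encoded, as in the quoted URLs.
    return JIRA_SEARCH + quote(jql, safe="")


if __name__ == "__main__":
    for version in ("Impala 2.0.5", "Impala 2.0.4"):
        print(fixed_issues_url(version))
```

An empty result page for such a URL would be consistent with a fixVersion string that does not match how the release is named in the tracker, though the message above does not say what the cause is.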
[Impala-ASF-CR] IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
Alex Behm has uploaded a new change for review. http://gerrit.cloudera.org:8080/6527 Change subject: IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans. .. IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans. Implements HdfsScanner::GetNext() for the Avro, RC File, and Sequence File scanners. Changes ProcessSplit() to repeatedly call GetNext() to share the core scanning code between the legacy ProcessSplit() interface and the new GetNext() interface. Summary of changes: - Slightly change code flow for initial scan range that only parses the file header. The new code sets 'only_parsing_header_' in Open() and then honors that flag in GetNextInternal(). Before, all the logic was inside ProcessSplit(). - Replace 'finished_' with 'eos_'. - Add a RowBatch parameter to various functions. - Change Close() to free all resources when a nullptr RowBatch is passed. Testing: - Exhaustive tests passed on debug - Core tests passed on asan - TODO: Perf testing on cluster Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669 --- M be/src/exec/base-sequence-scanner.cc M be/src/exec/base-sequence-scanner.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-rcfile-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test 17 files changed, 575 insertions(+), 493 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/6527/1 -- To view, visit http://gerrit.cloudera.org:8080/6527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange 
Gerrit-Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm
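The control flow described in the change above, a legacy entry point that loops over the batch-at-a-time call until end of stream, might look roughly like the following Python sketch. `SketchScanner` and its members are hypothetical stand-ins for the C++ scanner classes; only the names `eos` and `only_parsing_header` echo the commit message.

```python
class SketchScanner:
    """Toy scanner: get_next() is the core, process_split() wraps it."""

    def __init__(self, rows, batch_size, only_parsing_header=False):
        self.rows = rows
        self.batch_size = batch_size
        # Set once at open time, then honored inside get_next().
        self.only_parsing_header = only_parsing_header
        self.pos = 0
        self.eos = False

    def get_next(self, batch):
        """Append at most batch_size rows to batch; set eos when done."""
        if self.only_parsing_header:
            # A header-only range produces no rows.
            self.eos = True
            return
        taken = self.rows[self.pos:self.pos + self.batch_size]
        batch.extend(taken)
        self.pos += len(taken)
        if self.pos >= len(self.rows):
            self.eos = True

    def process_split(self):
        """Legacy interface: drain the scanner by calling get_next()."""
        batches = []
        while not self.eos:
            batch = []
            self.get_next(batch)
            if batch:
                batches.append(batch)
        return batches
```

Keeping all scanning logic in one method means both interfaces exercise the same code path, which is the sharing the commit message describes.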
[Impala-ASF-CR] IMPALA-4883: Union Codegen
Taras Bobrovytsky has uploaded a new patch set (#6).

Change subject: IMPALA-4883: Union Codegen
..

IMPALA-4883: Union Codegen

For each non-passthrough child of the Union node, codegen the loop that does per row tuple materialization.

Testing: Ran test_queries.py test locally in exhaustive mode.

Benchmark: Ran a local benchmark on a local 10 GB TPCDS dataset on an unpartitioned store_sales table.

SELECT
  COUNT(c), COUNT(ss_customer_sk), COUNT(ss_cdemo_sk), COUNT(ss_hdemo_sk),
  COUNT(ss_addr_sk), COUNT(ss_store_sk), COUNT(ss_promo_sk), COUNT(ss_ticket_number),
  COUNT(ss_quantity), COUNT(ss_wholesale_cost), COUNT(ss_list_price), COUNT(ss_sales_price),
  COUNT(ss_ext_discount_amt), COUNT(ss_ext_sales_price), COUNT(ss_ext_wholesale_cost),
  COUNT(ss_ext_list_price), COUNT(ss_ext_tax), COUNT(ss_coupon_amt), COUNT(ss_net_paid),
  COUNT(ss_net_paid_inc_tax), COUNT(ss_net_profit), COUNT(ss_sold_date_sk)
FROM (
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
  union all
  select fnv_hash(ss_sold_time_sk) c, * from tpcds_10_parquet.store_sales_unpartitioned
) t

Before: 39s704ms
Operator          #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem   Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE      1       194.504us  194.504us  1        1           28.00 KB   -1.00 B        FINALIZE
12:EXCHANGE       1       17.284us   17.284us   3        1           0          -1.00 B        UNPARTITIONED
11:AGGREGATE      3       2s202ms    2s934ms    3        1           115.00 KB  10.00 MB
00:UNION          3       32s514ms   34s926ms   288.01M  288.01M     3.08 MB    0
|--02:SCAN HDFS   3       158.373ms  216.085ms  28.80M   28.80M      489.71 MB  1.88 GB        tpcds_10_parquet.store_sales
|--03:SCAN HDFS   3       167.002ms  171.738ms  28.80M   28.80M      489.74 MB  1.88 GB        tpcds_10_parquet.store_sales
|--04:SCAN HDFS   3       125.331ms  145.496ms  28.80M   28.80M      489.57 MB  1.88 GB        tpcds_10_parquet.store_sales
|--05:SCAN HDFS   3       148.478ms  194.311ms  28.80M   28.80M      489.69 MB  1.88 GB        tpcds_10_parquet.store_sales
|--06:SCAN HDFS   3       143.995ms  162.781ms  28.80M   28.80M      489.57 MB  1.88 GB        tpcds_10_parquet.store_sales
|--07:SCAN HDFS   3       169.731ms  250.201ms  28.80M   28.80M      489.58 MB  1.88 GB        tpcds_10_parquet.store_sales
|--08:SCAN HDFS   3       164.110ms  254.374ms  28.80M   28.80M      489.61 MB  1.88 GB        tpcds_10_parquet.store_sales
|--09:SCAN HDFS   3       135.631ms  162.117ms  28.80M   28.80M      489.63 MB  1.88 GB        tpcds_10_parquet.store_sales
|--10:SCAN HDFS   3       138.736ms  167.778ms  28.80M   28.80M      489.67 MB  1.88 GB        tpcds_10_parquet.store_sales
01:SCAN HDFS      3       202.015ms  248.728ms  28.80M   28.80M      489.68 MB  1.88 GB        tpcds_10_parquet.store_sales

After: 20s664ms
Operator          #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem   Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE      1       167.757us  167.757us  1        1           28.00 KB   -1.00 B        FINALIZE
12:EXCHANGE       1       16.592us   16.592us   3        1           0          -1.00 B        UNPARTITIONED
11:AGGREGATE      3       2s924ms    3s715ms    3        1           115.00 KB  10.00 MB
00:UNION          3       4s971ms    6s082ms    288.01M  288.01M     3.08 MB    0
|--02:SCAN HDFS   3       1s189ms    1s588ms    28.80M   28.80M      483.82 MB  1.88 GB        tpcds_10_parquet.store_sales
|--03:SCAN HDFS   3       1s117ms    1s157ms    28.80M   28.80M      484.85 MB  1.88 GB        tpcds_10_parquet.store_sales
|--04:SCAN HDFS   3       1s226ms    1s454ms    28.80M   28.80M      483.00 MB  1.88 GB        tpcds_10_parquet.store_sales
|--05:SCAN HDFS   3       1s141ms    1s3
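As a quick worked check of the benchmark above, the two end-to-end times can be turned into a speedup ratio. The parser below is a hypothetical helper that only understands the `<seconds>s<millis>ms` form used for the totals; the figures come from a single local run reported in the message, so the ratio is indicative only.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a string such as '39s704ms' into seconds."""
    match = re.fullmatch(r"(\d+)s(\d+)ms", text)
    if match is None:
        raise ValueError(f"unsupported duration format: {text!r}")
    seconds, millis = match.groups()
    return int(seconds) + int(millis) / 1000.0


before = parse_duration("39s704ms")
after = parse_duration("20s664ms")
speedup = before / after

if __name__ == "__main__":
    print(f"end-to-end speedup: {speedup:.2f}x")
```

Most of the difference shows up in the 00:UNION row, whose average time drops from 32s514ms to 4s971ms, while the scan rows take longer in the "After" profile.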
[Impala-ASF-CR] IMPALA-4883: Union Codegen
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4883: Union Codegen .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node-ir.cc File be/src/exec/union-node-ir.cc: Line 19: #include "runtime/tuple.h" > Do we need tuple.h? I don't think I see any references to Tuple* in here. Done Line 21: #include "util/runtime-profile-counters.h" > Is this needed still? Done Line 35: while (!dst_batch->AtCapacity() && child_row_idx < child_batch->num_rows()) { > Nice! We can maybe avoid a few more loads and stores via the child_batch an Great suggestions. Done. Line 46: if (limit_ != -1 && num_rows_returned_ + dst_batch->num_rows() > limit_) { > We don't need to cross-compile this logic. Let's move it into the caller an Good point. Moved it out of here. http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.cc File be/src/exec/union-node.cc: Line 168: if (limit_ != -1 && num_rows_returned_ + row_batch->num_rows() > limit_) { > How about we move this logic around num_rows_returned_ and limit_ into GetN Done http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.h File be/src/exec/union-node.h: Line 28: #include "runtime/tuple.h" > Do we need the tuple.h and tuple-row.h imports? Oh I guess for the inline M Done Line 72: /// each GetNext() call. > We should add a TODO to remove this. Maybe Michael knows if there's a JIRA Added a todo, couldn't find the relevant JIRA. PS5, Line 99: Null > NULL here and below, just for consistency with other comments. Done PS5, Line 128: row_batch > dst_batch. Done Line 136: void IR_ALWAYS_INLINE MaterializeExprs(const std::vector& exprs, > Move this to the -ir.cc file? I don't think there's a reason we need to def Ok, moved it. Line 148: bool inline IsChildPassthrough(int child_idx) const { > I don't think any of the "inline" specifiers here and below do anything - i Makes sense, removed all of them. 
-- To view, visit http://gerrit.cloudera.org:8080/6459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
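The shape the review converges on, a tight per-row loop with the limit check hoisted into the caller, can be sketched in Python. The function names and the list-based batches are hypothetical simplifications of the C++ code under review, and the sketch assumes the rows already returned never exceed the limit.

```python
def materialize_batch(child_batch, child_row_idx, dst_batch, capacity, materialize):
    """Hot loop: copy rows until dst_batch is full or the child is drained.

    Returns the index of the next unread child row.
    """
    while len(dst_batch) < capacity and child_row_idx < len(child_batch):
        dst_batch.append(materialize(child_batch[child_row_idx]))
        child_row_idx += 1
    return child_row_idx


def apply_limit(dst_batch, num_rows_returned, limit):
    """Caller-side limit handling, kept out of the hot loop.

    A limit of -1 means no limit. Returns the updated row count.
    """
    if limit != -1 and num_rows_returned + len(dst_batch) > limit:
        del dst_batch[limit - num_rows_returned:]
    return num_rows_returned + len(dst_batch)


if __name__ == "__main__":
    dst = []
    next_idx = materialize_batch([1, 2, 3, 4, 5], 0, dst, 3, lambda row: row * 10)
    total = apply_limit(dst, 0, 2)
    print(next_idx, dst, total)
```

Keeping the limit logic outside the loop leaves only the row copy in the code that would be cross-compiled, which is the motivation given in the comment on Line 46.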
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views
Zach Amsden has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java File fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java: Line 36: * id1 > id2 -> id2 < id1 I am going to abandon changes here. Although this would make it easier to extend to analysis chains, e.g. A <= B <= C <= A -> A = B = C, the complications introduced by the non-inclusive relation make implementing this quite a bit of work. This change is already large enough and I'd rather keep the scope confined to the two simple steps. -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views
Zach Amsden has uploaded a new patch set (#11). Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views .. IMPALA-5003: Constant propagation in scan nodes and inline views When conjuncts are pushed into table refs and inline views, they can be considered for constant propagation within that node. In certain cases, we might end up with a FALSE conditional and now we can convert ScanNodes to EmptySet nodes when that occurs. I also added an inequality collation phase which is now partially tested and will combine conjuncts such as a < k1, a < k2 into a < min(k1, k2), as well as detect equivalence from a >= k, a <= k, and determine conflicting bounds requirements to be false. This could be expanded to do analysis against other slotrefs in the future, but this should probably be saved for another diff. Testing: Expanded the test cases for the planner to achieve constant propagation. Added Kudu, datasource, Hdfs and HBase tests to validate we can create EmptySetNodes. Some manual testing for inequality conjuncts but nothing formal yet. 
Query: explain select * from functional_hbase.widetable_250_cols a where a.int_col1 > 1 and a.int_col1 <= 20 and a.int_col1 < 50 and a.int_col1 > 2 +--- | Explain String +--- | Estimated Per-Host Requirements: Memory=1.00GB VCores=1 | PLAN-ROOT SINK | | | 01:EXCHANGE [UNPARTITIONED] | | | 00:SCAN HBASE [functional_hbase.widetable_250_cols a] |predicates: a.int_col1 <= 20, a.int_col1 > 2 +--- Fetched 10 row(s) in 0.08s Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectList.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/ValueRange.java M fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test A testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test 19 files changed, 644 insertions(+), 117 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/6389/11 -- To view, visit http://gerrit.cloudera.org:8080/6389 
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden
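The inequality collation described in this commit message can be illustrated with a short Python sketch. It is not the Java planner code: predicates are modeled as hypothetical `(operator, constant)` pairs on a single column, and the sketch does no type-specific reasoning (for example, it will not notice that `a > 2 and a < 3` is unsatisfiable for integers).

```python
def collate(predicates):
    """Merge bounds on one column into the tightest equivalent set.

    Returns a list of (operator, constant) pairs, or [("FALSE", None)]
    when the bounds conflict.
    """
    lower = None  # (value, inclusive)
    upper = None  # (value, inclusive)
    for op, k in predicates:
        inclusive = op in (">=", "<=")
        if op in (">", ">="):
            # A larger bound is tighter; on a tie, the exclusive one wins.
            if lower is None or k > lower[0] or (k == lower[0] and not inclusive):
                lower = (k, inclusive)
        elif op in ("<", "<="):
            # A smaller bound is tighter; on a tie, the exclusive one wins.
            if upper is None or k < upper[0] or (k == upper[0] and not inclusive):
                upper = (k, inclusive)
        else:
            raise ValueError(f"unsupported operator: {op}")

    if lower is not None and upper is not None:
        if lower[0] > upper[0]:
            return [("FALSE", None)]
        if lower[0] == upper[0]:
            # a >= k and a <= k collapse to a = k; anything stricter is empty.
            if lower[1] and upper[1]:
                return [("=", lower[0])]
            return [("FALSE", None)]

    result = []
    if upper is not None:
        result.append(("<=" if upper[1] else "<", upper[0]))
    if lower is not None:
        result.append((">=" if lower[1] else ">", lower[0]))
    return result


if __name__ == "__main__":
    # The four conjuncts from the explain example in the message.
    print(collate([(">", 1), ("<=", 20), ("<", 50), (">", 2)]))
```

On the conjuncts from the explain output, the sketch keeps the same two bounds that appear in the plan's predicates line.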
[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP .. Patch Set 1: Draft review -- To view, visit http://gerrit.cloudera.org:8080/6526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/6526 Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP .. IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP Adds Impala support for TIMESTAMP types stored in Kudu. Impala's TIMESTAMP type is a 96-bit type with nanosecond precision and Kudu's timestamp is a 64-bit microsecond delta from the Unix epoch (called UNIXTIME_MICROS), so a conversion is necessary. TODO: As of now, this only supports writing TIMESTAMPs to Kudu. Reading will require the Kudu client to return UNIXTIME_MICROS in a padded slot for Impala. Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.h M fe/src/main/java/org/apache/impala/util/KuduUtil.java M tests/query_test/test_kudu.py 7 files changed, 93 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6526/1 -- To view, visit http://gerrit.cloudera.org:8080/6526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
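The precision mismatch described in this commit message can be illustrated with a short sketch. It assumes the nanosecond-precision value is available as whole seconds since the Unix epoch plus a nanosecond remainder, and it truncates sub-microsecond digits; both the representation and the truncation policy are assumptions of this example, and the function names are hypothetical rather than taken from the patch.

```python
MICROS_PER_SECOND = 1_000_000
NANOS_PER_MICRO = 1_000


def to_unixtime_micros(seconds_since_epoch, nanos_of_second):
    """Return a microsecond delta from the Unix epoch that fits in 64 bits.

    Sub-microsecond digits are dropped (an assumption of this sketch).
    """
    micros = (seconds_since_epoch * MICROS_PER_SECOND
              + nanos_of_second // NANOS_PER_MICRO)
    if not -(2 ** 63) <= micros < 2 ** 63:
        raise OverflowError("value does not fit in a signed 64-bit integer")
    return micros


def from_unixtime_micros(micros):
    """Inverse direction: split micros into (seconds, nanos_of_second)."""
    seconds, remainder = divmod(micros, MICROS_PER_SECOND)
    return seconds, remainder * NANOS_PER_MICRO


# 1.500000999 s after the epoch loses its last three digits on the way in.
print(to_unixtime_micros(1, 500_000_999))
print(from_unixtime_micros(1_500_000))
```

The round trip is lossy in one direction only: any value that came from microseconds converts back exactly, while a nanosecond-precision value may not.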
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Lars Volker has posted comments on this change. Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. Patch Set 3: > Can you report how you tested this, and if it works on RHEL 7 > consistently enough that IMPALA-4733 has gone away? I tested this locally by starting the minicluster on my dev machine, and I'm running a private exhaustive build on Cloudera's internal Jenkins. I couldn't find a way to make sure this fixes the RHEL7 issues, but it seems reasonable to assume that they were caused by HBase trying to bind ports in the ephemeral port range. If this passes the internal exhaustive build, I'd be willing to give it a try and see if the RHEL7 tests improve. -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Hello Bharath Vissapragada, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6524 to look at the new patch set (#3). Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. IMPALA-4733: Change HBase ports to non-ephemeral We've seen repeated test failures because HBase tries to bind to ports in the ephemeral port range, which sometimes would already be occupied by outgoing connections of other processes. This change switches the ports to the new default HBase ports (HBASE-10123): HBase Master Port: 60000 -> 16000 HBase Master Web UI Port: 60010 -> 16010 HBase RegionServer Port: 60020 -> 16020 HBase RegionServer Web UI Port: 60030 -> 16030 HBase Status Multicast Port: 60100 -> 16100 This made it necessary to change the default KMS port, too (HADOOP-12811): KMS HTTP port: 16000 -> 9600 Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 --- M fe/src/test/resources/hbase-site.xml.template M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/init.d/kms M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl 5 files changed, 32 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/3 -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Michael Brown
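The reasoning in this commit message can be checked with a small sketch. It assumes the common Linux default for `net.ipv4.ip_local_port_range` (32768 to 60999); the real range is configurable per host, so treat the constant as an assumption. The old master port is taken to be 60000, HBase's former default, and the dictionary is only an illustration of the port list above, not a file from the patch.

```python
# Assumed default Linux ephemeral port range; check
# /proc/sys/net/ipv4/ip_local_port_range on the actual host.
EPHEMERAL_RANGE = (32768, 60999)

# (old port, new port) pairs as listed in the commit message.
HBASE_PORT_CHANGES = {
    "HBase Master": (60000, 16000),
    "HBase Master Web UI": (60010, 16010),
    "HBase RegionServer": (60020, 16020),
    "HBase RegionServer Web UI": (60030, 16030),
    "HBase Status Multicast": (60100, 16100),
}


def in_ephemeral_range(port, port_range=EPHEMERAL_RANGE):
    """True if the kernel may hand this port to an outgoing connection."""
    low, high = port_range
    return low <= port <= high


for name, (old, new) in HBASE_PORT_CHANGES.items():
    print("%-26s old=%d ephemeral=%s  new=%d ephemeral=%s" % (
        name, old, in_ephemeral_range(old), new, in_ephemeral_range(new)))
```

Under that assumed range every old port can collide with a client socket and none of the new ones can. The new master port 16000 is the same number as the old KMS HTTP port, which is why the KMS port moves to 9600 in the same change.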
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Lars Volker has posted comments on this change. Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. Patch Set 3: (1 comment) Thank you for the review. Please see my comment and the new PS3. http://gerrit.cloudera.org:8080/#/c/6524/2/testdata/cluster/node_templates/cdh5/etc/init.d/kms File testdata/cluster/node_templates/cdh5/etc/init.d/kms: Line 25: export KMS_HTTP_PORT=$KMS_WEBUI_PORT > How did it work before? Was it picking the default port or something? Yes, this was leaving the default port unchanged, which was 16000. I updated the commit message to highlight the changes to the ports. -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors .. Patch Set 2: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/425/ -- To view, visit http://gerrit.cloudera.org:8080/6510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Michael Brown has posted comments on this change. Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. Patch Set 2: Can you report how you tested this, and if it works on RHEL 7 consistently enough that IMPALA-4733 has gone away? -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/6524/2/testdata/cluster/node_templates/cdh5/etc/init.d/kms File testdata/cluster/node_templates/cdh5/etc/init.d/kms: Line 25: export KMS_HTTP_PORT=$KMS_WEBUI_PORT How did it work before? Was it picking the default port or something? -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bharath Vissapragada Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes and inline views
Zach Amsden has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan nodes and inline views .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/6389/6/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: Line 72: customRewriter_ = null; This was an unexpected wrinkle that made things awkward. http://gerrit.cloudera.org:8080/#/c/6389/10/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 996: info = new SlotInfo(it.nextIndex() - 1); lol I forgot to put it in the map. Worked out of the box after that. Still needs some minor changes (this step only runs after successful constant propagation), but also should run even if no propagation is done. -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] PREVIEW: IMPALA-4678: port backend exec to use buffer pool
Tim Armstrong has uploaded a new patch set (#7). Change subject: PREVIEW: IMPALA-4678: port backend exec to use buffer pool .. PREVIEW: IMPALA-4678: port backend exec to use buffer pool Always create global BufferPool at startup using 80% of memory and limit reservations to 80% of query memory (same as BufferedBlockMgr). Each ExecNode has to declare its memory requirements at Prepare() time. Convert HashTable to use the new BufferPool via a Suballocator. Make PAGG memory consumption more efficient (avoid wasting buffers): * Allow preaggs to execute with 0 reservation - if streams and hash tables cannot be allocated, it will pass through rows. * Halve the buffer requirement for spilling aggs - avoid allocating buffers for aggregated and unaggregated streams simultaneously. Convert Sorter to use BufferPool. TODO in this patch: * some of the DCHECKS may be too aggressive. With the current memory transfer model, operators that accumulate batches, i.e. NLJ, can "steal" reservation. We need a test to reproduce this problem. We can probably fix by having NLJ copy if it sees an attached buffer. * Consider renaming buffer_pool_page_size, e.g. 
to spillable_page_size TODO in follow-up patches: * Rename BufferedTupleStreamV2 to BufferedTupleStream * Remove the old hash join and aggregation nodes Testing: * Updated tests to reflect new memory requirements * TODO: recalibrate limits in test_mem_usage_scaling * TODO: more tests to exercise new code paths Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/partitioned-hash-join-node.inline.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/runtime/CMakeLists.txt D be/src/runtime/buffered-block-mgr-test.cc D be/src/runtime/buffered-block-mgr.cc D be/src/runtime/buffered-block-mgr.h D be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream-v2.cc M be/src/runtime/buffered-tuple-stream-v2.h D be/src/runtime/buffered-tuple-stream.cc D be/src/runtime/buffered-tuple-stream.h D be/src/runtime/buffered-tuple-stream.inline.h M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/sorter.cc M 
be/src/runtime/sorter.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/bloom-filter.h M be/src/util/static-asserts.cc M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/generate_error_codes.py M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test M testdata/workloads/functional-query/queries/QueryTest/spilling.test M tests/query_test/test_mem_usage_scaling.py M tests/query_test/test_sort.py R tests/query_test/test_spilling.py M tests/stress/concurrent_select.py 60 files changed, 1,541 insertions(+), 7,538 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/5801/7 -- To view, visit http://gerrit.cloudera.org:8080/5801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] PREVIEW: IMPALA-4678: port backend exec to use buffer pool
Tim Armstrong has posted comments on this change. Change subject: PREVIEW: IMPALA-4678: port backend exec to use buffer pool .. Patch Set 7: Refreshed to my latest development version. -- To view, visit http://gerrit.cloudera.org:8080/5801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Pass build type to Impala LZO.
Alex Behm has submitted this change and it was merged. Change subject: Pass build type to Impala LZO. .. Pass build type to Impala LZO. Before, the build type used for Impala LZO was always debug. Now, the build type is passed from the Impala CMakeLists.txt. This patch needs corresponding changes to Impala LZO. Testing: I tested locally with these build types: DEBUG, RELEASE, and ADDRESS_SANITIZER. Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0 Reviewed-on: http://gerrit.cloudera.org:8080/6446 Reviewed-by: Alex Behm Tested-by: Alex Behm --- M CMakeLists.txt 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Alex Behm: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/6446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Pass build type to Impala LZO.
Alex Behm has posted comments on this change. Change subject: Pass build type to Impala LZO. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Pass build type to Impala LZO.
Alex Behm has posted comments on this change. Change subject: Pass build type to Impala LZO. .. Patch Set 2: Code-Review+2 Merging this change directly after manual validation. It needs a coordinated change with Impala-Lzo. GVO and private testing of such coordinated changes is currently not supported on jenkins.impala.io. I filed https://issues.apache.org/jira/browse/IMPALA-5148 to improve this. Jim, I could not find an easy way for me to test whether this fixes IMPALA-4699 as well. I don't think my change has anything to do with that JIRA. -- To view, visit http://gerrit.cloudera.org:8080/6446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia83e594409ad5938662ca210c810d5d31b8637b0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool .. Patch Set 15: Also, what do you think about removing the per-list limits? They're not necessary for correctness and they add an additional thing to tune. I think after the maintenance and scavenging they don't add much. -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 164: }; > I was thinking it might be useful to have information similar to what we ge We should chat about the design a bit. I added a couple of basic counters as part of the follow-up mmap patch: https://gerrit.cloudera.org/#/c/6474 We probably don't want to have global counters shared between all threads, but we could probably have per-arena counters aggregated on demand via a SumGauge. -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
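The idea of per-arena counters that are only summed when someone asks for the total can be sketched as follows. The class names `ArenaCounter` and `SumOnDemand` are hypothetical and are not Impala's `SumGauge` or any other class from the patch; the sketch only shows the shape of the design being discussed.

```python
import threading


class ArenaCounter:
    """A counter owned by one arena, so updates contend only within it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, delta):
        with self._lock:
            self._value += delta

    def value(self):
        with self._lock:
            return self._value


class SumOnDemand:
    """Reports the sum of several counters, computed only when read."""

    def __init__(self, counters):
        self._counters = list(counters)

    def value(self):
        # Each counter is read under its own lock; the total is a
        # best-effort snapshot, not an atomic view across all arenas.
        return sum(counter.value() for counter in self._counters)


arenas = [ArenaCounter() for _ in range(4)]
total_free_bytes = SumOnDemand(arenas)

arenas[0].add(8192)
arenas[3].add(4096)
arenas[0].add(-2048)
print(total_free_bytes.value())
```

The trade-off is that writers never touch shared state, while readers pay a cost proportional to the number of arenas and get a slightly stale total, which is usually acceptable for a metric.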
[Impala-ASF-CR] PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool
Tim Armstrong has uploaded a new patch set (#3). Change subject: PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool .. PREVIEW IMPALA-5073: Use mmap instead of malloc for buffer pool Allocate with mmap instead of TCMalloc to give more control over memory usage. Also allocate huge pages when possible to reduce TLB pressure. Adds additional memory metrics, since we previously relied on the assumption that all memory was allocated through TCMalloc. memory.total-used and memory.total-reserved track the total across the buffer pool and TCMalloc. When the buffer pool is not present, they just report the TCMalloc values. ASAN still uses malloc() because it doesn't instrument mmap(). Testing: Added some unit tests to test edge cases. Many pre-existing tests also exercise the modified code. Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6 --- M be/src/catalog/catalogd-main.cc M be/src/runtime/bufferpool/buffer-allocator-test.cc M be/src/runtime/bufferpool/buffer-allocator.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/bufferpool/system-allocator.cc M be/src/runtime/bufferpool/system-allocator.h M be/src/runtime/exec-env.cc M be/src/statestore/statestored-main.cc M be/src/util/asan.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/metrics.h M common/thrift/generate_error_codes.py M common/thrift/metrics.json 18 files changed, 355 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/6474/3 -- To view, visit http://gerrit.cloudera.org:8080/6474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
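As a rough illustration of allocating a buffer through mmap rather than the heap allocator, here is a sketch using Python's standard `mmap` module. It is not the patch's C++ implementation: the function name `allocate_buffer` is hypothetical, and the huge-page hint is attempted only where the platform exposes `MADV_HUGEPAGE` (Linux, Python 3.8 or later), with failures ignored.

```python
import mmap


def allocate_buffer(num_bytes):
    """Return a zero-filled anonymous memory mapping of num_bytes."""
    # fileno -1 requests an anonymous mapping that is not backed by a file.
    buf = mmap.mmap(-1, num_bytes)
    # Hint that the kernel may back the mapping with huge pages. Guarded
    # because the constant and method are not available everywhere.
    if hasattr(mmap, "MADV_HUGEPAGE") and hasattr(buf, "madvise"):
        try:
            buf.madvise(mmap.MADV_HUGEPAGE)
        except OSError:
            pass  # Transparent huge pages disabled or unsupported.
    return buf


buf = allocate_buffer(2 * 1024 * 1024)
buf[:4] = b"abcd"
print(len(buf), bytes(buf[:4]))
buf.close()  # Unmaps the region and returns it to the OS immediately.
```

Closing the mapping releases the memory to the operating system right away, which is the kind of control over memory usage the commit message refers to; a general-purpose allocator may instead keep freed memory cached.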
[Impala-ASF-CR] IMPALA-5073: Use mmap instead of malloc for buffer pool
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-5073: Use mmap instead of malloc for buffer pool .. IMPALA-5073: Use mmap instead of malloc for buffer pool Allocate with mmap instead of TCMalloc to give more control over memory usage. Also allocate huge pages when possible to reduce TLB pressure. Adds additional memory metrics, since we previously relied on the assumption that all memory was allocated through TCMalloc. memory.total-used and memory.total-reserved track the total across the buffer pool and TCMalloc. When the buffer pool is not present, they just report the TCMalloc values. ASAN still uses malloc() because it doesn't instrument mmap(). Testing: Added some unit tests to test edge cases. Many pre-existing tests also exercise the modified code. Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6 --- M be/src/catalog/catalogd-main.cc M be/src/runtime/bufferpool/buffer-allocator-test.cc M be/src/runtime/bufferpool/buffer-allocator.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/bufferpool/system-allocator.cc M be/src/runtime/bufferpool/system-allocator.h M be/src/runtime/exec-env.cc M be/src/statestore/statestored-main.cc M be/src/util/asan.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/metrics.h M common/thrift/generate_error_codes.py M common/thrift/metrics.json 18 files changed, 355 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/6474/2 -- To view, visit http://gerrit.cloudera.org:8080/6474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifbc748f74adcbbdcfa45f3ec7df98284925acbd6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/425/ -- To view, visit http://gerrit.cloudera.org:8080/6510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Dan Hecht has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/6414/13/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 164: }; > Most of the perf counters are client-centric, so we will get a lot of info I was thinking it might be useful to have information similar to what we get from tc-malloc. Like breakdown of application used memory between free-list, clean-pages, and allocated-buffers/pinned-pages/dirty-pages. To help verify the system behaves as we expect, and debug issues when we hit unexpected memory pressure. Also to debug issues if memory skew occurs between cores, etc (i.e. visibility in arenas). -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors
Dan Hecht has posted comments on this change. Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6510/2/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: Line 81: TimestampValue::FromUnixTime(intp.val).ToString()); not your change and don't have to address it, but it looks like this has weird behavior when the input unix time is out of range of a TimestampValue. Looks like the result is an empty string, whereas StringValFromTimestamp() used below gives null. Filed IMPALA-5146 for that. -- To view, visit http://gerrit.cloudera.org:8080/6510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 7: Patch set 7 just updates the commit message to reflect all the line wrapping. -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has uploaded a new patch set (#7). Change subject: IMPALA-5140: improve docs building guidelines .. IMPALA-5140: improve docs building guidelines Move docs/generatingImpalaDoc.md to docs/README.md. This will automatically render the document inline at places like: https://github.com/apache/incubator-impala/tree/master/docs under the directory listing. Fix existing markdown which wasn't always rendering properly. Remove unneeded HTML and backslashes. Add a mention of make, and add one troubleshooting tip. Wrap most lines at 90 chars. This does not change how Github renders the markdown, and it makes reading the source easier as well. Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f --- A docs/README.md D docs/generatingImpalaDoc.md 2 files changed, 163 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/7 -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Jim Apple has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md File docs/README.md: PS5, Line 111: add the :following lines to the end of the file: > I don't understand this comment. Does patch set 6 not address your concern? Oh, sorry, misread it. -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md File docs/README.md: PS5, Line 111: add the :following lines to the end of the file: > It will be fine if you open a new terminal. I'd suggest adding "source" to I don't understand this comment. Does patch set 6 not address your concern? -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Jim Apple has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 6: Laurel, do we still need to tell people how to generate SQL reference when the entire doc can be generated just fine? -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Jim Apple has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md File docs/README.md: PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format html5 \ : -output \ : -filter : ``` : : * **To generate PDF output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format pdf \ : -output \ : -filter : ``` > You'd have to tell me, since you are the initial committer of this file. On Laurel was the original author. I'll add her to the review. PS5, Line 111: y :`/Users//.bash_profile`. Edit > Done It will be fine if you open a new terminal. I'd suggest adding "source" to these instructions in step 1 so people who don't open a new terminal will get the right result. -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has uploaded a new patch set (#6). Change subject: IMPALA-5140: improve docs building guidelines .. IMPALA-5140: improve docs building guidelines Move docs/generatingImpalaDoc.md to docs/README.md. This will automatically render the document inline at places like: https://github.com/apache/incubator-impala/tree/master/docs under the directory listing. Fix existing markdown which wasn't always rendering properly. Remove unneeded HTML and backslashes. Add a mention of make, and add one troubleshooting tip. Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f --- A docs/README.md D docs/generatingImpalaDoc.md 2 files changed, 163 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/6 -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md File docs/README.md: PS5, Line 14: doc_prototype > master Done PS5, Line 17: doc_prototype > master Done PS5, Line 59: ./bin/dita > depends on where you put it It works if you followed step 3 above. PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format html5 \ : -output \ : -filter : ``` : : * **To generate PDF output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format pdf \ : -output \ : -filter : ``` > Why are these needed? You'd have to tell me, since you are the initial committer of this file. One guess is that it exhibits using a different ditamap. PS5, Line 111: add the :following lines to the end of the file: > Then 'source' it. Done -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3381: Support AM/PM marker in date and time format strings
Lars Volker has posted comments on this change. Change subject: IMPALA-3381: Support AM/PM marker in date and time format strings .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/6523/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 5641: // AM/PM marker can be repeated and placed anywhere in the format string Do we have a check that prevents having multiple separate am/pm markers? http://gerrit.cloudera.org:8080/#/c/6523/1/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: Line 173: case 'a': tok_type = AM_PM_MARKER; dt_ctx->has_am_pm_marker = true; break; Have you considered supporting 'am' and 'pm' as tokens, too, like Greg suggested in the JIRA? Line 201: if (tok_len != 2) { You could remove this if statement (and add 0 below). PS1, Line 467: strncmp You can use strncasecmp() and simplify the code. -- To view, visit http://gerrit.cloudera.org:8080/6523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
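To illustrate the strncasecmp() suggestion in the last comment above, here is a minimal sketch of a case-insensitive AM/PM check. The helper name ParseAmPmMarker and its return convention are hypothetical and are not taken from timestamp-parse-util.cc; strncasecmp() itself is the POSIX function from <strings.h>.

```cpp
#include <strings.h>  // POSIX strncasecmp()

#include <cstddef>

// Hypothetical helper for illustration only; not Impala's parser code.
// Returns 0 for "AM", 1 for "PM" (in any letter case), and -1 if the first
// two characters of 'str' are neither or if fewer than two are available.
inline int ParseAmPmMarker(const char* str, std::size_t len) {
  if (len < 2) return -1;
  // One case-insensitive comparison per marker replaces separate checks
  // for "AM", "am", "Am", and "aM".
  if (strncasecmp(str, "AM", 2) == 0) return 0;
  if (strncasecmp(str, "PM", 2) == 0) return 1;
  return -1;
}
```

A single case-insensitive comparison per marker is the simplification the comment appears to have in mind; whether mixed-case input such as "Pm" should be accepted is a decision for the patch author.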
[Impala-ASF-CR] IMPALA-4893: Efficiently update the rows read counter for sequence file
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4893: Efficiently update the rows read counter for sequence file .. Patch Set 1: (1 comment) Can you add a test for the RowsRead counter? It would be nice extra coverage that I think we're currently missing. E.g. I think in scanners.test we do some full table scans of some functional tables, and that is run for all file formats. It looks like runtime_filters.test has some verification of RowsRead. http://gerrit.cloudera.org:8080/#/c/6522/1/be/src/exec/hdfs-sequence-scanner.cc File be/src/exec/hdfs-sequence-scanner.cc: Line 346: COUNTER_ADD(scan_node_->rows_read_counter(), num_rows_read); I think we can avoid the duplicated logic if we break here instead of returning. I.e. if (stream->eof()) break; -- To view, visit http://gerrit.cloudera.org:8080/6522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie42c97a36e46172884cc497aa645036c2c11f541 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
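The "break instead of return" suggestion above can be sketched as follows. Counter, Stream, and ScanAll are simplified, hypothetical stand-ins rather than the classes in hdfs-sequence-scanner.cc; the point is only that breaking out of the loop leaves a single counter update after it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a profile counter.
struct Counter {
  int64_t value = 0;
};

// Hypothetical stand-in for a scan stream: each entry is the number of rows
// in one block.
struct Stream {
  std::vector<int> block_row_counts;
  std::size_t pos = 0;
  bool eof() const { return pos >= block_row_counts.size(); }
  int ReadBlock() { return block_row_counts[pos++]; }
};

// Accumulates rows locally and updates the shared counter exactly once.
// Breaking on eof (rather than returning) means every exit path goes through
// the single update below, so the update is not duplicated.
inline int64_t ScanAll(Stream* stream, Counter* rows_read_counter) {
  int64_t num_rows_read = 0;
  while (true) {
    if (stream->eof()) break;
    num_rows_read += stream->ReadBlock();
  }
  rows_read_counter->value += num_rows_read;
  return num_rows_read;
}
```

Error paths in the real scanner would still need their own handling; this sketch covers only the normal end-of-stream exit.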
[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3079: Fix sequence file writer .. Patch Set 6: Forgive the drive-by comment, but I'm curious about whether we plan to make sequence files a supported format for writing. It seems strange to put all this effort into it and keep it hidden behind the flag with other file writers like Avro that are totally broken. -- To view, visit http://gerrit.cloudera.org:8080/6107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Jim Apple has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/6512/5/docs/README.md File docs/README.md: PS5, Line 14: doc_prototype master PS5, Line 17: doc_prototype master PS5, Line 59: ./bin/dita depends on where you put it PS5, Line 74: * **To generate HTML output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format html5 \ : -output \ : -filter : ``` : : * **To generate PDF output of the Impala SQL Reference, run the following command:** : : ``` : ./bin/dita -input -format pdf \ : -output \ : -filter : ``` Why are these needed? PS5, Line 111: add the :following lines to the end of the file: Then 'source' it. -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4883: Union Codegen
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4883: Union Codegen .. Patch Set 5: (13 comments) Getting pretty close, just minor cleanup at this point. I also just wanted to check with Michael that the tuple_pool_ approach made the most sense for now - we'll need to clean that up as part of his codegen work but I don't think it makes sense to fix in this patch. http://gerrit.cloudera.org:8080/#/c/6459/4/be/src/exec/union-node-ir.cc File be/src/exec/union-node-ir.cc: Line 34: > We can avoid checking limits for each row if we check it at the end and tru It looks like we already do this for the passthrough case so we might as well do it here. http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node-ir.cc File be/src/exec/union-node-ir.cc: Line 19: #include "runtime/tuple.h" Do we need tuple.h? I don't think I see any references to Tuple* in here. Line 21: #include "util/runtime-profile-counters.h" Is this needed still? Line 35: while (!dst_batch->AtCapacity() && child_row_idx < child_batch->num_rows()) { Nice! We can maybe avoid a few more loads and stores via the child_batch and tuple_buf pointers. I.e. int child_batch_rows = child_batch->num_rows(). uint8_t* curr_tuple = *tuple_buf; ... *tuple_buf = curr_tuple. Line 46: if (limit_ != -1 && num_rows_returned_ + dst_batch->num_rows() > limit_) { We don't need to cross-compile this logic. Let's move it into the caller and save LLVM some work. Although, see my comment about moving this logic to GetNext() and sharing it for all three codepaths. http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.cc File be/src/exec/union-node.cc: Line 168: if (limit_ != -1 && num_rows_returned_ + row_batch->num_rows() > limit_) { How about we move this logic around num_rows_returned_ and limit_ into GetNext()? 
I believe the same logic can work for all three cases if we slightly extend it so that it handles the case when GetNext() is called with a non-empty batch, which can happen in a subplan. http://gerrit.cloudera.org:8080/#/c/6459/1/be/src/exec/union-node.h File be/src/exec/union-node.h: PS1, Line 71: in this poo > Ok, this can be removed in the future. Michael, do you think it makes sense to use this approach for now? I'm not that familiar with CodegenMaterializeExprs() so not sure if there is a better way to do this. http://gerrit.cloudera.org:8080/#/c/6459/5/be/src/exec/union-node.h File be/src/exec/union-node.h: Line 28: #include "runtime/tuple.h" Do we need the tuple.h and tuple-row.h imports? Oh I guess for the inline MaterializeExprs function, but we can move that to the -ir.cc file anyway. Line 72: /// each GetNext() call. We should add a TODO to remove this. Maybe Michael knows if there's a JIRA that will allow us to remove it. PS5, Line 99: Null NULL here and below, just for consistency with other comments. PS5, Line 128: row_batch dst_batch. Line 136: void IR_ALWAYS_INLINE MaterializeExprs(const std::vector& exprs, Move this to the -ir.cc file? I don't think there's a reason we need to define it in the .h Line 148: bool inline IsChildPassthrough(int child_idx) const { I don't think any of the "inline" specifiers here and below do anything - if the function is defined in the class body it implicitly has an "inline" hint. https://isocpp.org/wiki/faq/inline-functions#inline-member-fns-more -- To view, visit http://gerrit.cloudera.org:8080/6459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
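As a rough sketch of the idea of checking the limit once per batch in GetNext() instead of per row, the helper below computes how many rows of a finished batch to keep. RowsToKeep is a hypothetical name, and the sketch assumes the convention from the quoted condition that a limit of -1 means "no limit"; it does not cover the subplan case of a non-empty incoming batch mentioned above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper for illustration; not code from union-node.cc.
// 'limit' is the node's row limit (-1 means no limit), 'num_rows_returned'
// is the count returned by earlier calls, and 'batch_rows' is the number of
// rows materialized into the current batch.
inline int64_t RowsToKeep(int64_t limit, int64_t num_rows_returned,
                          int64_t batch_rows) {
  if (limit == -1) return batch_rows;
  if (num_rows_returned + batch_rows <= limit) return batch_rows;
  // The batch overshoots the limit: keep only what is still allowed.
  return std::max<int64_t>(0, limit - num_rows_returned);
}
```

Because the same arithmetic applies no matter how the rows got into the batch, one such check after the batch is filled could in principle serve the passthrough, materialized, and constant-expression paths alike.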
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has posted comments on this change. Change subject: IMPALA-5140: improve docs building guidelines .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6512/4/docs/README.md File docs/README.md: Line 6: * Open a terminal window and run the following commands to get the Impala documentation source files from Git: > Could yo wrap long lines to help with the gerrit display to help ease revie Done -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5140: improve docs building guidelines
Michael Brown has uploaded a new patch set (#5). Change subject: IMPALA-5140: improve docs building guidelines .. IMPALA-5140: improve docs building guidelines Move docs/generatingImpalaDoc.md to docs/README.md. This will automatically render the document inline at places like: https://github.com/apache/incubator-impala/tree/master/docs under the directory listing. Fix existing markdown which wasn't always rendering properly. Remove unneeded HTML and backslashes. Add a mention of make, and add one troubleshooting tip. Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f --- A docs/README.md D docs/generatingImpalaDoc.md 2 files changed, 161 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6512/5 -- To view, visit http://gerrit.cloudera.org:8080/6512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I71ae79ecd346045697fe225140ee9a317c5a337f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Lars Volker has uploaded a new patch set (#2). Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. IMPALA-4733: Change HBase ports to non-ephemeral Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 --- M fe/src/test/resources/hbase-site.xml.template M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/init.d/kms M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl 5 files changed, 26 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/2 -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Attila Jeges has posted comments on this change. Change subject: IMPALA-3079: Fix sequence file writer .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/6107/4/be/src/exec/hdfs-sequence-table-writer.cc File be/src/exec/hdfs-sequence-table-writer.cc: PS4, Line 179: : > Thanks for listing them out. Please also put this list in the commit messag Done http://gerrit.cloudera.org:8080/#/c/6107/5/be/src/exec/read-write-util.h File be/src/exec/read-write-util.h: Line 214: // Returns size of the encoded long value, including the 1 byte for length for val < -112 > for val < -112 or val > 127. Done PS5, Line 228: ollow > nit: long line Done PS5, Line 245: um_bytes, 9); > nit: help to comment why it's 119 here (which is different from 120 in the Done -- To view, visit http://gerrit.cloudera.org:8080/6107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
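For readers unfamiliar with the variable-length long encoding discussed in the read-write-util.h comments, here is a minimal sketch of the sizing rule as I understand it from Hadoop's WritableUtils: values in [-112, 127] take one byte, and anything else takes one length byte plus the minimum number of data bytes. VLongSizeSketch is a hypothetical name and this is not the code from the patch.

```cpp
#include <cstdint>

// Illustrative sketch only; assumes the Hadoop-style vlong layout.
// Returns the encoded size in bytes, including the 1 length byte that is
// needed when val < -112 or val > 127.
inline int VLongSizeSketch(int64_t val) {
  if (val >= -112 && val <= 127) return 1;
  // Negative values are stored as their one's complement, so size them on
  // the complemented bit pattern.
  uint64_t bits = static_cast<uint64_t>(val < 0 ? ~val : val);
  int data_bits = 0;
  while (bits != 0) {
    ++data_bits;
    bits >>= 1;
  }
  // Round the data bits up to whole bytes, then add the length byte.
  return (data_bits + 7) / 8 + 1;
}
```

Under this rule the largest encoding is 9 bytes (8 data bytes plus the length byte), which is consistent with the bound of 9 visible in the quoted line 245.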
[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Attila Jeges has uploaded a new patch set (#6). Change subject: IMPALA-3079: Fix sequence file writer .. IMPALA-3079: Fix sequence file writer This change fixes the following issues in the Sequence File Writer: 1. ReadWriteUtil::VLongRequiredBytes() and ReadWriteUtil::PutVLong() were broken. As a result, Impala could not read back uncompressed sequence files created by Impala. 2. KEY_CLASS_NAME was missing from the sequence file header. As a result, Hive could not read back uncompressed sequence files created by Impala. 3. Impala created record-compressed sequence files with empty keys block. As a result, Hive could not read back record-compressed sequence files created by Impala. 4. Impala created block-compressed files with: - empty key-lengths block - empty keys block - empty value-lengths block This resulted in invalid block-compressed sequence files that Hive could not read back. 5. In some cases the wrong Record-compression flag was written to the sequence file header. As a result, Hive could not read back record- compressed sequence files created by Impala. 6. Impala added 'sync_marker' instead of 'neg1_sync_marker' to the beginning of blocks in block-compressed sequence files. Hive could not read these files back. 7. The calculation of block sizes in SnappyBlockCompressor class was incorrect for odd-length buffers. 
Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 --- M be/src/exec/hdfs-sequence-table-writer.cc M be/src/exec/hdfs-sequence-table-writer.h M be/src/exec/read-write-util-test.cc M be/src/exec/read-write-util.h M be/src/util/compress.cc M be/src/util/decompress-test.cc M testdata/workloads/functional-query/queries/QueryTest/seq-writer.test M tests/query_test/test_compressed_formats.py 8 files changed, 494 insertions(+), 80 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6107/6 -- To view, visit http://gerrit.cloudera.org:8080/6107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6107 to look at the new patch set (#6). Change subject: IMPALA-3079: Fix sequence file writer .. IMPALA-3079: Fix sequence file writer This change fixes the following issues in the Sequence File Writer: 1. ReadWriteUtil::VLongRequiredBytes() and ReadWriteUtil::PutVLong() were broken. As a result, Impala could not read back uncompressed sequence files created by Impala. 2. KEY_CLASS_NAME was missing from the sequence file header. As a result, Hive could not read back uncompressed sequence files created by Impala. 3. Impala created record-compressed sequence files with empty keys block. As a result, Hive could not read back record-compressed sequence files created by Impala. 4. Impala created block-compressed files with: - empty key-lengths block - empty keys block - empty value-lengths block This resulted in invalid block-compressed sequence files that Hive could not read back. 5. In some cases the wrong Record-compression flag was written to the sequence file header. As a result, Hive could not read back record- compressed sequence files created by Impala. 6. Impala added 'sync_marker' instead of 'neg1_sync_marker' to the beginning of blocks in block-compressed sequence files. Hive could not read these files back. 7. The calculation of block sizes in SnappyBlockCompressor class was incorrect for odd-length buffers. 
Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 --- M be/src/exec/hdfs-sequence-table-writer.cc M be/src/exec/hdfs-sequence-table-writer.h M be/src/exec/read-write-util-test.cc M be/src/exec/read-write-util.h M be/src/util/compress.cc M be/src/util/decompress-test.cc M testdata/workloads/functional-query/queries/QueryTest/seq-writer.test M tests/query_test/test_compressed_formats.py 8 files changed, 494 insertions(+), 80 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6107/6 -- To view, visit http://gerrit.cloudera.org:8080/6107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-4733: Change HBase ports to non-ephemeral
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/6524 Change subject: IMPALA-4733: Change HBase ports to non-ephemeral .. IMPALA-4733: Change HBase ports to non-ephemeral Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 --- M fe/src/test/resources/hbase-site.xml.template M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/init.d/kms 3 files changed, 24 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/6524/1 -- To view, visit http://gerrit.cloudera.org:8080/6524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6f8af325e34b6e352afd75ce5ddd2446ce73d857 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-3381: Support AM/PM marker in date and time format strings
Attila Jeges has uploaded a new change for review. http://gerrit.cloudera.org:8080/6523 Change subject: IMPALA-3381: Support AM/PM marker in date and time format strings .. IMPALA-3381: Support AM/PM marker in date and time format strings This change adds AM/PM marker to format strings used in 'to_timestamp', 'unix_timestamp' and 'from_unixtime' functions. It uses 'a' for the AM/PM marker following the Hive implementation (which follows Java 'SimpleDateFormat' patterns). Similarly to Hive, the 'a' pattern letter can be repeated any number of times in the format string without affecting the corresponding presentation. For example: > select from_unixtime( > unix_timestamp('2017-03-31 11:19:23 PM', '-MM-dd HH:mm:ss a'), > '-MM-dd HH:mm:ss aaa'); 2017-03-31 11:19:23 PM Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34 --- M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 3 files changed, 137 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6523/1 -- To view, visit http://gerrit.cloudera.org:8080/6523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I99794a3e152f1712c6c469bb266d23a81d19ca34 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges
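As background for the AM/PM marker, the conventional 12-hour to 24-hour conversion can be sketched as below. To24Hour is a hypothetical helper and assumes the hour field is in the range 1 to 12; how the patch itself combines the marker with the hour tokens of the format string is not shown in this message, so treat this as the textbook rule rather than Impala's behavior.

```cpp
// Hypothetical helper for illustration; not code from the patch.
// Converts a 12-hour clock value plus an AM/PM flag to an hour in 0..23.
// Returns -1 if 'hour12' is outside 1..12.
inline int To24Hour(int hour12, bool is_pm) {
  if (hour12 < 1 || hour12 > 12) return -1;
  // 12 AM is midnight (0) and 12 PM is noon (12), so fold 12 to 0 before
  // applying the PM offset.
  int hour = hour12 % 12;
  return is_pm ? hour + 12 : hour;
}
```

The two boundary cases, 12 AM and 12 PM, are the ones most worth covering in tests for this kind of logic.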