[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition pruning .. Patch Set 9: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/525/ -- To view, visit http://gerrit.cloudera.org:8080/6575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning
Alex Behm has posted comments on this change. Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition pruning .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert
Alex Behm has posted comments on this change. Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py File tests/query_test/test_sort.py: Line 189 > Fair point. Let me think about this a little more. How about running this query and asserting that the result is sorted? select last_value(random(2)) over (order by random(2)) lv from functional.alltypessmall order by lv; It's not perfect, but better than nothing imo. -- To view, visit http://gerrit.cloudera.org:8080/6775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
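The "assert that the result is sorted" idea from the comment above could look roughly like this in Python. This is only a sketch, not the test that was committed: `is_sorted` is a hypothetical helper, and the hard-coded list stands in for the `lv` column that a real test would fetch by running the query against Impala.

```python
def is_sorted(values):
    """Return True if values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Stand-in for the 'lv' column of the query result (assumed data, not real output).
lv_column = [0.02, 0.17, 0.17, 0.56, 0.93]
assert is_sorted(lv_column)
assert not is_sorted([0.5, 0.1])
```

Because the values themselves are random, the check only looks at their relative order and never at specific numbers.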
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. IMPALA-5162,IMPALA-5163: stress test support on secure clusters

This patch adds support for running the stress test (concurrent_select.py) and loading nested data (load_nested.py) into a Kerberized, SSL-enabled Impala cluster. It assumes the calling user already has a valid Kerberos ticket. One way to do that is:

1. Get access to a keytab and krb5.config
2. Set KRB5_CONFIG and KRB5CCNAME appropriately
3. Run kinit(1)
4. Run load_nested.py and/or concurrent_select.py within this environment.

Because our Python clients already support Kerberos and SSL, we simply need to make sure to use the correct options when calling the entry points and initializing the clients:

Impala: Impyla
Hive: Impyla
HDFS: hdfs.ext.kerberos.KerberosClient

With this patch, I was able to manually do a short concurrent_select.py run against a secure cluster without connection or auth errors, and I was able to do the same with load_nested.py for a cluster that already had TPC-H loaded.
Follow-ons for future cleanup work: IMPALA-5263: support CA bundles when running stress test against SSL'd Impala IMPALA-5264: fix InsecurePlatformWarning under stress test with SSL Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Reviewed-on: http://gerrit.cloudera.org:8080/6763 Reviewed-by: Matthew Mulder Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M testdata/bin/load_nested.py M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/db_connection.py M tests/stress/concurrent_select.py 5 files changed, 61 insertions(+), 18 deletions(-) Approvals: Matthew Mulder: Looks good to me, but someone else must approve Impala Public Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4866: Hash join node does not apply limits correctly
anujphadke has uploaded a new change for review. http://gerrit.cloudera.org:8080/6778 Change subject: IMPALA-4866: Hash join node does not apply limits correctly .. IMPALA-4866: Hash join node does not apply limits correctly Hash join node currently does not apply limits correctly. This issue is masked most of the time since the planner places an exchange node on top of most joins. The issue is exposed when NUM_NODES=1. Change-Id: I414124f8bb6f8b2af2df468e1c23418d05a0e29f --- M be/src/exec/partitioned-hash-join-node.cc M tests/common/test_dimensions.py 2 files changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6778/1 -- To view, visit http://gerrit.cloudera.org:8080/6778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I414124f8bb6f8b2af2df468e1c23418d05a0e29f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke
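To illustrate what applying a limit correctly means for an operator that returns rows in batches, here is a small Python sketch. It is a generic model and not the code in partitioned-hash-join-node.cc: `apply_limit` and the list-of-lists batch representation are assumptions made for the example.

```python
def apply_limit(batches, limit):
    """Yield row batches, truncating so that at most `limit` rows are returned in total."""
    returned = 0
    for batch in batches:
        if returned >= limit:
            break  # limit already reached; stop pulling further batches
        remaining = limit - returned
        out = batch[:remaining]  # trim the batch that crosses the limit
        returned += len(out)
        yield out


# Three batches holding five rows in total, limited to three rows.
assert list(apply_limit([[1, 2], [3, 4], [5]], 3)) == [[1, 2], [3]]
```

The point of the sketch is that the operator itself has to trim the batch that crosses the limit; it cannot rely on a node above it to do the trimming.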
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. IMPALA-3742: Partitions and sort INSERTs for Kudu tables Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu are currently painful because we just send rows randomly, which creates a lot of work for Kudu since it partitions and sorts data before writing, causing writes to be slow and leading to timeouts. We can alleviate this by sending the rows to Kudu already partitioned and sorted. This patch partitions and sorts rows according to Kudu's partitioning scheme for INSERTs and UPSERTs. A followup patch will handle UPDATE and DELETE. It accomplishes this by inserting an exchange node and a sort node into the plan before the operation. Both the exchange and the sort are given a KuduPartitionExpr which takes a row and calls into the Kudu client to return its partition number. It also disallows INSERT hints for Kudu tables, since the hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer make sense. Testing: - Updated planner tests. - Ran the Kudu functional tests. - Ran performance tests demonstrating that we can now handle much larger inserts without having timeouts. 
Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Reviewed-on: http://gerrit.cloudera.org:8080/6559 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exprs/CMakeLists.txt M be/src/exprs/expr-context.h M be/src/exprs/expr.cc A be/src/exprs/kudu-partition-expr.cc A be/src/exprs/kudu-partition-expr.h M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/scheduling/scheduler.cc M common/thrift/Exprs.thrift M common/thrift/Partitions.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 26 files changed, 616 insertions(+), 170 deletions(-) Approvals: Impala Public Jenkins: Verified Thomas Tauber-Marshall: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall
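The effect of partitioning and sorting rows before they are sent can be modelled in a few lines of Python. This is a toy sketch under stated assumptions: in the patch the partition number comes from KuduPartitionExpr calling into the Kudu client, whereas here `partition_of` is a hypothetical stand-in (id modulo 2), and ordering by key within a partition is an assumption made for the illustration.

```python
def partition_and_sort(rows, partition_of, key_of):
    """Order rows by partition number first, then by key within each partition."""
    return sorted(rows, key=lambda row: (partition_of(row), key_of(row)))


# Toy stand-ins: rows are (id, name) tuples and the partition is id modulo 2.
rows = [(4, "d"), (1, "a"), (3, "c"), (2, "b")]
ordered = partition_and_sort(
    rows,
    partition_of=lambda r: r[0] % 2,
    key_of=lambda r: r[0],
)
assert ordered == [(2, "b"), (4, "d"), (1, "a"), (3, "c")]
```

Rows that belong to the same partition end up adjacent and in key order, which is the property the commit message relies on to reduce work on the Kudu side.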
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc
Michael Ho has posted comments on this change. Change subject: IMPALA-2550: Switch to per-query exec rpc .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: PS10, Line 287: /// TODO: Move these into the new query-wide state, indexed by partition id. Remove. -- To view, visit http://gerrit.cloudera.org:8080/6535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert
Alex Behm has posted comments on this change. Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py File tests/query_test/test_sort.py: Line 189 > Do you have any suggestions? Fair point. Let me think about this a little more. -- To view, visit http://gerrit.cloudera.org:8080/6775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts
Alex Behm has submitted this change and it was merged. Change subject: IMPALA-5003: Constant propagation in scan conjuncts .. IMPALA-5003: Constant propagation in scan conjuncts Implements constant propagation within conjuncts and applies the optimization to scan conjuncts and collection conjuncts within Hdfs scan nodes. The optimization is applied during planning. At scan nodes in particular, we want to optimize to enable partition pruning. In certain cases, we might end up with a FALSE conditional, which now will convert to an EmptySet node. Testing: Expanded the test cases for the planner to achieve constant propagation. Added Kudu, datasource, Hdfs and HBase tests to validate we can create EmptySetNodes. Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Reviewed-on: http://gerrit.cloudera.org:8080/6389 Tested-by: Impala Public Jenkins Reviewed-by: Alex Behm --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectList.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test A testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test 19 files changed, 636 insertions(+), 93 deletions(-) Approvals: Impala Public Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 26 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke
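As a rough illustration of constant propagation within a list of conjuncts, the Python sketch below substitutes `column = literal` bindings into the other conjuncts and reports when a conjunct becomes always false (the case the commit message says converts to an EmptySet node). It is a simplified single-pass model, not the Java implementation in the patch: the tuple representation, `OPS`, and `propagate_constants` are all assumptions made for the example.

```python
import operator

# Supported comparison operators for this toy model.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def propagate_constants(conjuncts):
    """Rewrite (lhs, op, rhs) conjuncts; str operands are columns, ints are literals.

    Returns the rewritten list, or None if some conjunct is always false.
    """
    # Collect 'column = literal' bindings, rejecting contradictory ones.
    bindings = {}
    for lhs, op, rhs in conjuncts:
        if op == "=" and isinstance(lhs, str) and isinstance(rhs, int):
            if lhs in bindings and bindings[lhs] != rhs:
                return None
            bindings[lhs] = rhs

    result = []
    for lhs, op, rhs in conjuncts:
        is_binding = op == "=" and isinstance(lhs, str) and isinstance(rhs, int)
        if not is_binding:
            # Substitute known constants for column references.
            if isinstance(lhs, str):
                lhs = bindings.get(lhs, lhs)
            if isinstance(rhs, str):
                rhs = bindings.get(rhs, rhs)
        if isinstance(lhs, int) and isinstance(rhs, int):
            if not OPS[op](lhs, rhs):
                return None  # always false: the scan can return no rows
            continue  # always true: drop the conjunct
        result.append((lhs, op, rhs))
    return result


assert propagate_constants([("a", "=", 5), ("b", "=", "a")]) == [("a", "=", 5), ("b", "=", 5)]
assert propagate_constants([("a", "=", 5), ("a", ">", 10)]) is None
```

In the first call the rewritten conjunct `b = 5` is the kind of predicate that could then be used for partition pruning if `b` were a partition column.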
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts
Alex Behm has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan conjuncts .. Patch Set 25: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 25 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/524/ -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan conjuncts .. Patch Set 25: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 25 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py File tests/query_test/test_sort.py: Line 189 > Do we have any end-to-end tests that exercise non-deterministic exprs in an Do you have any suggestions? It's hard to check the output in a reliable way unless we're able to have the random() values returned so that we can check that their order is correct, but I don't know of any way to do that (other than use an inline view, which doesn't work, see IMPALA-5270). We can run the query without checking its output just to ensure that it doesn't crash, though the crash that was happening here was rare, so that doesn't give us much coverage. We could fix that by adding it to the query generator. I also haven't actually repro-ed this, so I could also investigate why it's actually failing, and it may be something related to local filesystem that could get fixed, but the entire premise of the test is faulty and it's likely to just continue to be flaky (e.g. if IMPALA-660 gets addressed). -- To view, visit http://gerrit.cloudera.org:8080/6775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet .. Patch Set 10: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/521/ -- To view, visit http://gerrit.cloudera.org:8080/5939 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert
Alex Behm has posted comments on this change. Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py File tests/query_test/test_sort.py: Line 189 Do we have any end-to-end tests that exercise non-deterministic exprs in an analytic sort? If not, I suggest we try to fix this test instead. I know there are PlannerTests, but having at least one end-to-end test seems prudent. -- To view, visit http://gerrit.cloudera.org:8080/6775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert
Thomas Tauber-Marshall has uploaded a new change for review. http://gerrit.cloudera.org:8080/6775 Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert .. IMPALA-5262: test_analytic_order_by_random fails with assert This was a poorly written test that relies on assumptions about the behavior of 'rand' and the order that rows get processed in a table that Impala doesn't actually guarantee. It's also unnecessary, as test_order_by_random verifies that sort expr materialization behaves as expected and the PlannerTest sort-expr-materialization verifies that exprs are materialized for analytic functions as appropriate. Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 --- M tests/query_test/test_sort.py 1 file changed, 0 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/6775/1 -- To view, visit http://gerrit.cloudera.org:8080/6775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors .. Patch Set 7: Code-Review+2 sorry for all the rebasing noise, I'm holding off on committing this until the TIMESTAMP patch gets reviewed in case this needs to change: https://gerrit.cloudera.org/#/c/6526/5 -- To view, visit http://gerrit.cloudera.org:8080/6510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. Patch Set 9: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/523/ -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6559 to look at the new patch set (#9). Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. IMPALA-3742: Partitions and sort INSERTs for Kudu tables Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu are currently painful because we just send rows randomly, which creates a lot of work for Kudu since it partitions and sorts data before writing, causing writes to be slow and leading to timeouts. We can alleviate this by sending the rows to Kudu already partitioned and sorted. This patch partitions and sorts rows according to Kudu's partitioning scheme for INSERTs and UPSERTs. A followup patch will handle UPDATE and DELETE. It accomplishes this by inserting an exchange node and a sort node into the plan before the operation. Both the exchange and the sort are given a KuduPartitionExpr which takes a row and calls into the Kudu client to return its partition number. It also disallows INSERT hints for Kudu tables, since the hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer make sense. Testing: - Updated planner tests. - Ran the Kudu functional tests. - Ran performance tests demonstrating that we can now handle much larger inserts without having timeouts. 
Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exprs/CMakeLists.txt M be/src/exprs/expr-context.h M be/src/exprs/expr.cc A be/src/exprs/kudu-partition-expr.cc A be/src/exprs/kudu-partition-expr.h M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/scheduling/scheduler.cc M common/thrift/Exprs.thrift M common/thrift/Partitions.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 26 files changed, 616 insertions(+), 170 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/6559/9 -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5266 Impala ABM / LZCNT support
Jim Apple has posted comments on this change. Change subject: IMPALA-5266 Impala ABM / LZCNT support .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/5821/5/be/src/util/bit-util.h File be/src/util/bit-util.h: Line 88: return (value >> bits) | (value << (64 - bits)); This is undefined behavior when bits is 0 or 64: "The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand." -- To view, visit http://gerrit.cloudera.org:8080/5821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
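One common way to avoid the undefined shift described above is to reduce both shift counts modulo 64, so that neither can ever reach 64. The sketch below models that idea in Python, which has no fixed-width integers, hence the explicit 64-bit mask. `rotate_right64` and `MASK64` are illustrative names rather than identifiers from bit-util.h, and this is not necessarily the fix the patch ended up using.

```python
MASK64 = (1 << 64) - 1  # keeps results within 64 bits, as a uint64_t would


def rotate_right64(value, bits):
    """Rotate a 64-bit unsigned value right by `bits` positions."""
    value &= MASK64
    right = bits & 63     # right-shift count, always in [0, 63]
    left = (-bits) & 63   # (64 - bits) mod 64, also always in [0, 63]
    return ((value >> right) | (value << left)) & MASK64


# Rotating by 0 or by 64 leaves the value unchanged instead of shifting by 64.
assert rotate_right64(0x0123456789ABCDEF, 0) == 0x0123456789ABCDEF
assert rotate_right64(0x0123456789ABCDEF, 64) == 0x0123456789ABCDEF
assert rotate_right64(0x0123456789ABCDEF, 4) == 0xF0123456789ABCDE
```

When `bits` is 0 or 64, both shift counts become 0 and the expression reduces to `value | value`, so no shift by the full width is ever performed.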
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Alex Behm has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py File tests/comparison/cluster.py: Line 364: "0.0.0.0:50070") > I can only speculate: My guess it has to do with supporting Mini vs. real c Let's not make the changes in this patch to avoid breaking functionality. Just wanted to get your take on this pattern. Line 412: local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0", > I can only speculate: I've seen this pattern in a few places and it's likel Thanks. If you agree there is questionable/little benefit to this lazy install+import, we should consider simplifying it - but not in this patch. -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Michael Brown has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py File tests/comparison/cluster.py: Line 364: "0.0.0.0:50070") > Independent question: I can only speculate: My guess is it has to do with supporting Mini vs. real clusters, where the port numbers differ, and dev environments that are half-set up or whatever. Do you want me to alter get_hadoop_config() in this patch and remove the uses of default values? Line 412: local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0", > Not your change, but this flow strikes me as odd. We have packages+versions I can only speculate: I've seen this pattern in a few places and it's likely an attempt at micro-optimization to prevent the Impala Python virtual environment from having unnecessary packages. Pre-commit tests won't go through this path, for example, and thus don't need the packages. -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Matthew Mulder has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Alex Behm has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: (2 comments) Changes look reasonable, but I'm not super familiar with this code. http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py File tests/comparison/cluster.py: Line 364: "0.0.0.0:50070") Independent question: Does it even make sense to plug in default values here? Seems like a misconfiguration might be hard to debug if we plug in default values, instead of throwing an error. Line 412: local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0", Not your change, but this flow strikes me as odd. We have packages+versions baked into the code here. What's the benefit of doing this lazy install+import as opposed to requiring these to be installed up-front? -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
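The question on Line 364 (plugging in a default versus throwing an error) can be illustrated with a small sketch. This is not the code in tests/comparison/cluster.py: the helper names, the dict-based config, and the example key and host are all assumptions made up for illustration.

```python
class MissingConfigError(KeyError):
    """Raised when a required config key is absent (hypothetical helper)."""


def get_required_config(config, key):
    # Fail fast: a missing key surfaces immediately with a pointed message.
    # `config` is a plain dict standing in for parsed site-file values.
    try:
        return config[key]
    except KeyError:
        raise MissingConfigError(
            "required config key %r is not set; check the cluster's site files" % key)


def get_config_with_default(config, key, default):
    # Lenient: a missing key silently becomes `default`, which can hide
    # a misconfiguration until something fails much later.
    return config.get(key, default)


# Hypothetical parsed config; the host name is made up.
site = {"dfs.namenode.http-address": "nn.example.com:20101"}
```

With the lenient helper, a typo in the key name quietly yields "0.0.0.0:50070"; with the strict helper, the same typo raises at the call site.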
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. Patch Set 8: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6559/7/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java: Line 37: * a given row. Returns -1 for rows that do not correspond to a partition. The children of > is it documented in some class header that values outside the legal range r Done Line 74: for (int i = 0; i < children_.size(); ++i) { > single line Done -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6559 to look at the new patch set (#8). Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables .. IMPALA-3742: Partitions and sort INSERTs for Kudu tables Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu are currently painful because we just send rows randomly, which creates a lot of work for Kudu since it partitions and sorts data before writing, causing writes to be slow and leading to timeouts. We can alleviate this by sending the rows to Kudu already partitioned and sorted. This patch partitions and sorts rows according to Kudu's partitioning scheme for INSERTs and UPSERTs. A followup patch will handle UPDATE and DELETE. It accomplishes this by inserting an exchange node and a sort node into the plan before the operation. Both the exchange and the sort are given a KuduPartitionExpr which takes a row and calls into the Kudu client to return its partition number. It also disallows INSERT hints for Kudu tables, since the hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer make sense. Testing: - Updated planner tests. - Ran the Kudu functional tests. - Ran performance tests demonstrating that we can now handle much larger inserts without having timeouts. 
Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exprs/CMakeLists.txt M be/src/exprs/expr-context.h M be/src/exprs/expr.cc A be/src/exprs/kudu-partition-expr.cc A be/src/exprs/kudu-partition-expr.h M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/scheduling/scheduler.cc M bin/impala-config.sh M common/thrift/Exprs.thrift M common/thrift/Partitions.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 27 files changed, 617 insertions(+), 171 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/6559/8 -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall
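The exchange-then-sort idea in the commit message above can be sketched in a few lines. This is only an illustration: toy_partition is a made-up stand-in for the Kudu client partitioner (it is not Kudu's real scheme), and sorting by the first column within each partition is an assumption.

```python
from collections import defaultdict


def toy_partition(key, num_partitions):
    # Hypothetical stand-in for the partition number the Kudu client
    # would return for a row; NOT Kudu's actual partitioning logic.
    return key % num_partitions


def partition_and_sort(rows, num_partitions, key_of=lambda row: row[0]):
    # "Exchange" step: route each row to the bucket for its partition.
    buckets = defaultdict(list)
    for row in rows:
        buckets[toy_partition(key_of(row), num_partitions)].append(row)
    # "Sort" step: order rows by key within each partition, and emit
    # the partitions themselves in ascending order.
    return {p: sorted(b, key=key_of) for p, b in sorted(buckets.items())}


rows = [(7, "g"), (2, "b"), (5, "e"), (4, "d"), (1, "a")]
batches = partition_and_sort(rows, 2)
```

Each batch then holds rows for a single partition in key order, which is the shape of input the message says saves Kudu from doing the partitioning and sorting itself.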
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Michael Brown has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py File tests/comparison/cluster.py: PS1, Line 404: > Why is this removed? It's an unsupported parameter. http://hdfscli.readthedocs.io/en/latest/api.html#hdfs.ext.kerberos.KerberosClient -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters
Matthew Mulder has posted comments on this change. Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py File tests/comparison/cluster.py: PS1, Line 404: Why is this removed? -- To view, visit http://gerrit.cloudera.org:8080/6763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan conjuncts .. Patch Set 25: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/522/ -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 25 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet .. Patch Set 10: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/521/ -- To view, visit http://gerrit.cloudera.org:8080/5939 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5266 Impala ABM / LZCNT support
Zach Amsden has uploaded a new patch set (#5). Change subject: IMPALA-5266 Impala ABM / LZCNT support .. IMPALA-5266 Impala ABM / LZCNT support I recently added some code that wants to do upwards power of 2 calculation. Turns out this can be done much more quickly in hardware. It isn't on a perf critical code path yet but still seems like a decent idea. PopcountNoHw was absolutely atrocious as it contains a totally unpredictable loop that can be computed much more efficiently, so I fixed that as well. Testing: Added a perf test to verify this is faster (it is) and updated the bit-util-test to add better test coverage. Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/bit-intrinsics-benchmark.cc M be/src/util/bit-util-test.cc M be/src/util/bit-util.h M be/src/util/cpu-info.cc M be/src/util/cpu-info.h M be/src/util/fixed-size-hash-table.h M be/src/util/sse-util.h 8 files changed, 287 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/5821/5 -- To view, visit http://gerrit.cloudera.org:8080/5821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
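For readers unfamiliar with the two bit tricks mentioned above, here is a rough sketch in Python rather than the patch's C++. It is not the code in be/src/util/bit-util.h: int.bit_length() merely plays the role that a leading-zero count (LZCNT) plays in hardware, and the SWAR popcount is one well-known loop-free technique chosen for illustration, since the message does not say which replacement the patch uses.

```python
def next_power_of_two(v):
    # Smallest power of two >= v, for v >= 1.
    # (v - 1).bit_length() corresponds to 64 - lzcnt(v - 1) on a 64-bit value.
    if v < 1:
        raise ValueError("v must be >= 1")
    return 1 << (v - 1).bit_length()


def popcount64_loop(x):
    # Clears the lowest set bit per iteration; the trip count depends on
    # the data, which is what makes such a loop hard to predict.
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def popcount64_swar(x):
    # Loop-free SWAR popcount over a 64-bit value: sum bits in 2-bit,
    # 4-bit, then 8-bit lanes, and add the byte lanes with one multiply.
    mask = 0xFFFFFFFFFFFFFFFF
    x &= mask
    x = x - ((x >> 1) & 0x5555555555555555)
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    return ((x * 0x0101010101010101) & mask) >> 56
```

The two popcount variants return the same counts; the second does a fixed amount of work regardless of the input.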
[Impala-ASF-CR] Bump Kudu version to 238249c
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 238249c .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6718 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 238249c
Impala Public Jenkins has submitted this change and it was merged. Change subject: Bump Kudu version to 238249c .. Bump Kudu version to 238249c This will pull in the Kudu client partitioner API, which is needed for IMPALA-3742. Change-Id: I92587a8061ce70ecd9dac4889bda550636982767 Reviewed-on: http://gerrit.cloudera.org:8080/6718 Reviewed-by: Matthew Jacobs Tested-by: Impala Public Jenkins --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6718 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] Impala ABM / LZCNT support
Zach Amsden has posted comments on this change. Change subject: Impala ABM / LZCNT support .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/5821/3//COMMIT_MSG Commit Message: Line 7: Impala ABM / LZCNT support > Can you file a tracking JIRA for this? We mostly have standardised on alway Done - IMPALA-5266 -- To view, visit http://gerrit.cloudera.org:8080/5821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME MICROS as Impala TIMESTAMP
Matthew Jacobs has uploaded a new patch set (#5). Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP .. IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP Adds Impala support for TIMESTAMP types stored in Kudu. Impala stores TIMESTAMP values in 96 bits and has nanosecond precision. Kudu's timestamp is a 64-bit microsecond delta from the Unix epoch (called UNIXTIME_MICROS), so a conversion is necessary. When writing to Kudu, TIMESTAMP values in nanoseconds are rounded to the nearest microsecond. When reading from Kudu, the KuduScanner returns UNIXTIME_MICROS with 8 bytes of padding so Impala can convert the value to a TimestampValue in-line and copy the entire row. TODO: Kudu still needs to provide a knob to enable this: https://gerrit.cloudera.org/#/c/6624/ Testing: Updated the functional_kudu schema to use TIMESTAMPs instead of converting to STRING, so this provides some decent coverage. Some BE tests were added, and some EE tests as well. TODO: More testing of boundary values, and some basic perf. 
TODO: Support pushing down TIMESTAMP predicates TODO: Support TIMESTAMPs in range partitioning expressions Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d --- M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-table-sink.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.h M bin/impala-config.sh M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/util/KuduUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts-abort-on-error.test A testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts.test M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test M tests/query_test/test_kudu.py M tests/query_test/test_queries.py M tests/query_test/test_scanners.py 16 files changed, 312 insertions(+), 138 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6526/5 -- To view, visit http://gerrit.cloudera.org:8080/6526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
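The nanosecond-to-microsecond conversion described in the commit message above can be sketched as plain integer arithmetic. This is an illustration only, not the code in be/src/runtime/timestamp-value.h: the function names are hypothetical, the input is assumed to already be a nanosecond offset from the Unix epoch, and the round-half-up tie-breaking rule is an assumption.

```python
NANOS_PER_MICRO = 1000


def nanos_to_unixtime_micros(nanos_since_epoch):
    # Reduce to the nearest microsecond (ties round up, an assumption).
    # Python's floor division keeps the result consistent for negative
    # (pre-epoch) offsets as well.
    return (nanos_since_epoch + NANOS_PER_MICRO // 2) // NANOS_PER_MICRO


def unixtime_micros_to_nanos(micros):
    # Reading back: widen a 64-bit microsecond delta to nanoseconds.
    # Sub-microsecond detail lost on write cannot be recovered.
    return micros * NANOS_PER_MICRO
```

A value written and read back is therefore within half a microsecond of the original, but not necessarily equal to it.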
[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts
Zach Amsden has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan conjuncts .. Patch Set 21: Got a green light test run for this: http://sandbox.jenkins.cloudera.com/view/Impala/view/Private-Utility/job/impala-private-build-and-test/5527/console -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 21 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 238249c
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 238249c .. Patch Set 1: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/520/ -- To view, visit http://gerrit.cloudera.org:8080/6718 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] Experiment: glibc strncmp/memcmp appears much faster than SSE4.2
Jim Apple has posted comments on this change. Change subject: Experiment: glibc strncmp/memcmp appears much faster than SSE4.2 .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6768/1//COMMIT_MSG Commit Message: PS1, Line 17: memcmp is sse4.1-based > does it fall back to non-SSE4.1, if the CPU doesn't have SSE4.1? (maybe usi Yes, it looks like it: https://github.com/bminor/glibc/blob/ee19f1de0d0da24114be554fdf94243c0ec6b86c/sysdeps/x86_64/multiarch/memcmp.S Any suggestions on how to check the codegen case? Do you think the benchmark showing a 5x improvement when running a particular query in the shell covers that? -- To view, visit http://gerrit.cloudera.org:8080/6768 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie4786a4a75fdaffedd6e17cf076b5368ba4b4e3e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] Experiment: glibc strncmp/memcmp appears much faster than SSE4.2
Dan Hecht has posted comments on this change. Change subject: Experiment: glibc strncmp/memcmp appears much faster than SSE4.2 .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6768/1//COMMIT_MSG Commit Message: PS1, Line 17: memcmp is sse4.1-based does it fall back to non-SSE4.1, if the CPU doesn't have SSE4.1? (maybe using IFUNC)? If so, then I think it makes sense to switch to gcc's version if it's always faster. You'll probably want to check codegen case too (make sure the cpu dependent dispatch all works with clang too). -- To view, visit http://gerrit.cloudera.org:8080/6768 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie4786a4a75fdaffedd6e17cf076b5368ba4b4e3e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes