[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 18 Sep 2018 02:47:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. IMPALA-7580: Increase timeout int test_automatic_invalidation The query in test_automatic_invalidation might finish after the table expires and renders the test flaky. This patch increases the timeout to 10 seconds on regular builds and 20 seconds on slow builds. Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Reviewed-on: http://gerrit.cloudera.org:8080/11453 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/custom_cluster/test_automatic_invalidation.py 1 file changed, 9 insertions(+), 4 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" This patch has been causing a large number of build failures. Revert it until we figure out why. Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Reviewed-on: http://gerrit.cloudera.org:8080/11456 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template D fe/src/test/resources/sentry-site_no_oo.xml.template D fe/src/test/resources/sentry-site_oo.xml.template D fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh M tests/authorization/test_grant_revoke.py D tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 34 files changed, 152 insertions(+), 1,363 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 18 Sep 2018 02:11:47 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11456/1/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/11456/1/tests/common/custom_cluster_test_suite.py@92 PS1, Line 92: o flake8: E501 line too long (95 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 18 Sep 2018 01:48:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 18 Sep 2018 00:37:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ GroupingAggregator::Reset() doesn't call Close() on output_partition_, which can lead to hitting a DCHECK(is_closed) in the Partition destructor. Testing: - Added an e2e regression test. Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Reviewed-on: http://gerrit.cloudera.org:8080/11446 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test 3 files changed, 26 insertions(+), 12 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11337 ) Change subject: IMPALA-7335: Fix a race in HdfsScanNode .. Patch Set 7: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/700/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11337 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 Gerrit-Change-Number: 11337 Gerrit-PatchSet: 7 Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 18 Sep 2018 00:04:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc File be/src/runtime/date-value.cc: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc@89 PS2, Line 89: } Do you want to set an informative string if the value is not valid? -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 23:59:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7353: Improve memory estimates for Hbase Scan Nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11306 ) Change subject: IMPALA-7353: Improve memory estimates for Hbase Scan Nodes .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/699/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff Gerrit-Change-Number: 11306 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 23:40:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode
Pooja Nilangekar has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/11337 ) Change subject: IMPALA-7335: Fix a race in HdfsScanNode .. IMPALA-7335: Fix a race in HdfsScanNode This change fixes the race between the done_ flag and the status_ variable in the HdfsScanNode. Previously, a scanner thread would mark its scan range as complete even if it ran into an error. Another scanner thread could notice that all scan ranges have completed and set the done_ flag before the status_ variable is updated with the non-ok status. This caused the main thread to return an ok status despite the scanner thread running into an error. This change ensures that when a scanner thread runs into an error, it updates the status_ before marking its scan range as complete. Testing: Ran core tests. Since there is no deterministic method to reproduce the race, this change does not include any regression tests. Additionally, this change also fixes IMPALA-7430 by removing the logs added to debug this race. Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h 2 files changed, 40 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/11337/7 -- To view, visit http://gerrit.cloudera.org:8080/11337 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 Gerrit-Change-Number: 11337 Gerrit-PatchSet: 7 Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode
Pooja Nilangekar has posted comments on this change. ( http://gerrit.cloudera.org:8080/11337 ) Change subject: IMPALA-7335: Fix a race in HdfsScanNode .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.h@181 PS6, Line 181: Calling it repeatedly ignores subsequent : /// calls. > those two sentences seem to contradict each other. If calling it repeatedly Done http://gerrit.cloudera.org:8080/#/c/11337/4/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11337/4/be/src/exec/hdfs-scan-node.cc@527 PS4, Line 527: DoneInternal(const St > yes you are right, if we get a cancelled status and done is not set then we Done http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc@512 PS6, Line 512: marking a range as complete > I don't think it's obvious where that happens. Is that in the Close() call Done http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc@538 PS6, Line 538: unique_lock l(lock_); > does status_ still get set outside of SetDoneInternal()? If so, why? No, status_ is only set in SetDoneInternal(). http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc@538 PS6, Line 538: unique_lock l(lock_); > if this is only called when the execution is complete, worth adding that to Done. However completion could be due to reading all the scan ranges, reaching the limit specified in the query or if the fragment is closed by the coordinator. So I added a comment stating " This function shouldn't be called if the scan node hits an error." Or should I rephrase it? -- To view, visit http://gerrit.cloudera.org:8080/11337 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 Gerrit-Change-Number: 11337 Gerrit-PatchSet: 7 Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 23:34:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11459 Change subject: IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes. .. IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes. In every execution of an Impala query, one of the impalad daemons acts as the coordinator node. In some cases, such as when using a proxy, a user cannot predict which host will act as the coordinator. To aid in diagnosis, we provide a sql function which returns the name of the host on which the coordinator is running. EXTERNAL DESCRIPTION: Add a builtin function called coordinator(), which returns the name of the host which is running the impalad that is acting as the coordinator for the current query. IMPLEMENTATION: All functions are passed a FunctionContext object which is the interface to the rest of the system for a UDF. From the FunctionContext we get the TQueryCtx (query context) which contains a TNetworkAddress for the coordinator. The hostname of the coordinator is copied from the TNetworkAddress. Can the TNetworkAddress for the coordinator be unitialized? In the thrift source the coord_address is marked optional, but my reading of the code says this is always set in a real impalad. To future-proof the code a null StringVal is returned if coord_address is unset. TESTING: - Added a basic unit test for the new function. - Added a unit test which simulates the case when coord_address is unset. - Hand tested in a development deployment. - Ran regression tests and got a clean run. LIMITATIONS: This change only adds the coordinator() function. It does not add the debug_url() function that was mentioned in the comments on the original Jira. Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 --- M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/exprs/utility-functions.h M common/function-registry/impala_functions.py M docs/topics/impala_misc_functions.xml 5 files changed, 61 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/1 -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/698/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 23:14:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 23:07:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7353: Improve memory estimates for Hbase Scan Nodes
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11306 ) Change subject: IMPALA-7353: Improve memory estimates for Hbase Scan Nodes .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@548 PS2, Line 548: isStatsMissing > whoops, I was only trying to follow the convention that all boolean vars ha Yeah I think the intention was good, my brain just doesn't like ungrammatical-sounding variable names. isMissingStats seems fine. -- To view, visit http://gerrit.cloudera.org:8080/11306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff Gerrit-Change-Number: 11306 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 23:10:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3168/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 23:08:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 23:08:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7353: Improve memory estimates for Hbase Scan Nodes
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11306 to look at the new patch set (#3). Change subject: IMPALA-7353: Improve memory estimates for Hbase Scan Nodes .. IMPALA-7353: Improve memory estimates for Hbase Scan Nodes Currently for hbase scan nodes we use a constant estimate of 1GB which is generally a gross over-estimation. This patch improves upon those estimates by using huerestics based on how hbase rows are stored and fetched and how the scanners interact with the internal memory pool. Testing: Added/Modified resource requirements planner test. Added a junit test for the estimation logic. Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff --- M be/src/exec/hbase-table-scanner.cc M be/src/runtime/mem-pool.cc M be/src/runtime/mem-pool.h M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test 8 files changed, 472 insertions(+), 266 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/11306/3 -- To view, visit http://gerrit.cloudera.org:8080/11306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff Gerrit-Change-Number: 11306 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7351: Improve memory estimates for Hbase Scan Nodes
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/11306 ) Change subject: IMPALA-7351: Improve memory estimates for Hbase Scan Nodes .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@534 PS2, Line 534: public > can this be package-private? Let's document that it's only exposed for test Done http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@548 PS2, Line 548: isStatsMissing > This variable name sounds a bit ungrammatical to me. Maybe "statsAreMissing whoops, I was only trying to follow the convention that all boolean vars have names starting with "is..". How about "isMissingStats"? http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@630 PS2, Line 630: assertTrue > assertEquals() here and below. We should also consider doing a static impor Done http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@631 PS2, Line 631: BitUtil.roundUpToPowerOf2(Type.INT.getSlotSize()) * 2 > I think I'd prefer just hardcoding the expected value instead of having sli Done http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@636 PS2, Line 636: columnList.clear(); > Consider just constructing a new column list with https://google.github.io/ Done http://gerrit.cloudera.org:8080/#/c/11306/2/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@678 PS2, Line 678: } > Also consider adding a test case with many columns, just for completeness. Done -- To view, visit http://gerrit.cloudera.org:8080/11306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff Gerrit-Change-Number: 11306 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 23:07:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3167/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 22:45:56 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11456 ) Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 22:44:56 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER"
Thomas Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11456 Change subject: Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" .. Revert "IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER" This patch has been causing a large number of build failures. Revert it until we figure out why. Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template D fe/src/test/resources/sentry-site_no_oo.xml.template D fe/src/test/resources/sentry-site_oo.xml.template D fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh M tests/authorization/test_grant_revoke.py D tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 34 files changed, 152 insertions(+), 1,363 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/11456/1 -- To view, visit http://gerrit.cloudera.org:8080/11456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7f4fc028962d4c6a630456a12a65884a62f01442 Gerrit-Change-Number: 11456 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11453 ) Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/697/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 22:24:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7525: [DOCS] SHOW GRANT USER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11454 ) Change subject: IMPALA-7525: [DOCS] SHOW GRANT USER .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/76/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/11454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa39275c595d73b2fba293f77bb21aa3985dfa38 Gerrit-Change-Number: 11454 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 17 Sep 2018 22:15:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7351: Improve memory estimates for Kudu Scan Nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11440 ) Change subject: IMPALA-7351: Improve memory estimates for Kudu Scan Nodes .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/696/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229 Gerrit-Change-Number: 11440 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 22:05:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7525: [DOCS] SHOW GRANT USER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11454 ) Change subject: IMPALA-7525: [DOCS] SHOW GRANT USER .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/76/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/11454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa39275c595d73b2fba293f77bb21aa3985dfa38 Gerrit-Change-Number: 11454 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 17 Sep 2018 22:04:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7525: [DOCS] SHOW GRANT USER
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11454 Change subject: IMPALA-7525: [DOCS] SHOW GRANT USER .. IMPALA-7525: [DOCS] SHOW GRANT USER Documented the new SHOW GRANT USER statements: SHOW GRANT USER user_name SHOW GRANT USER user_name ON SERVER SHOW GRANT USER user_name ON DATABASE database_name SHOW GRANT USER user_name ON TABLE table_name SHOW GRANT USER user_name ON URI uri Change-Id: Iaa39275c595d73b2fba293f77bb21aa3985dfa38 --- M docs/shared/impala_common.xml M docs/topics/impala_show.xml 2 files changed, 40 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/11454/1 -- To view, visit http://gerrit.cloudera.org:8080/11454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iaa39275c595d73b2fba293f77bb21aa3985dfa38 Gerrit-Change-Number: 11454 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-7351: Improve memory estimates for Kudu Scan Nodes
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11440 ) Change subject: IMPALA-7351: Improve memory estimates for Kudu Scan Nodes .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/11440/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11440/1//COMMIT_MSG@14 PS1, Line 14: Modified resource requirements planner test. I don't see any changes to this test (it seems like I didn't include a Kudu scan tests case, which is an oversight). I think we should add some tests there. http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@91 PS1, Line 91: // Factor capturing the worst-case deviation from a uniform distribution of scan ranges Put this in the ScanNode base class? http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@284 PS1, Line 284: // The non-MT scan node requires at least one scanner thread. : int maxScannerThreads; : if (queryOptions.getMt_dop() >= 1) { : maxScannerThreads = 1; : } else { : // TODO: what about heterogeneous hardware, i.e., different num of cores? : maxScannerThreads = Math.min(perHostScanRanges, RuntimeEnv.INSTANCE.getNumCores()); : // Account for the max scanner threads query option. : if (queryOptions.isSetNum_scanner_threads() && : queryOptions.getNum_scanner_threads() > 0) { : maxScannerThreads = : Math.min(maxScannerThreads, queryOptions.getNum_scanner_threads()); : } : } I think this is shared with HdfsScanNode. Can we factor out to a helper function in ScanNode? http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@298 PS1, Line 298: int num_cols = desc_.getSlots().size(); Can you leave a comment briefly describing the heuristic and mentioning any known limitations? E.g. doesn't model the row batch queue. Otherwise it will be mysterious in a year's time when we've forgotten the context. http://gerrit.cloudera.org:8080/#/c/11440/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test: PS1: You might know already, but we don't validate resource requirements for this test because PlannerTestOption.VALIDATE_RESOURCES isn't set for it. I don't think we need to validate resources here, but didn't want us to think we had coverage that we don't. -- To view, visit http://gerrit.cloudera.org:8080/11440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229 Gerrit-Change-Number: 11440 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 22:01:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata Given the granular caching in CatalogdMetaProvider, it's possible to have a situation like the following: - impalad has cached the table list including some table "t" - some other daemon has issued deletion of "t" - we try to access "t" before the invalidation has reached us via the statestore In this case, we'll get an error back when we try to fetch the table "t". This can confuse the planning process since it previously determined that the table exists. Note that this may occur either when "t" is first referenced or later during planning (eg when fetching a specific partition post-pruning), so it wouldn't be possible to simply convert it into a 'table does not exist' result. The solution here is to throw InconsistentMetadataFetchException after invalidating the table list and associated object. This then triggers a re-plan of the query. This patch implements such re-planning up to 10 times before eventually giving up. Given that each attempt invalidates the inconsistent cache we would normally expect it to succeed on the first such retry. The limit of 10 retries is just to avoid infinite loops in the case of a bug. Change-Id: I17389954780fa22d7866224c17e128458fffa545 Reviewed-on: http://gerrit.cloudera.org:8080/11403 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_local_catalog.py 5 files changed, 192 insertions(+), 68 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 22:00:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 21:49:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. IMPALA-7576: Add a timeout for all E2E tests We've been seeing a lot of hangs in tests lately. This can waste test resources by keeping machines busy, cause loss of coverage when subsequent tests don't run, and can be difficult to diagnose if its not clear which test hung. This patch introduces a timeout of 2 hours for normal builds and 4 hours for slow builds for all tests run under pytest by using the pytest-timeout plugin. The timeouts were chosen to be generous to avoid false positives. In recent runs I examined, the longest running test is test_decimal_fuzz, which took 63 minutes in a DEBUG build and 162 minutes in an ASAN build. Testing: - Ran locally with a reduced timeout and confirmed the test is timed out when expected. Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Reviewed-on: http://gerrit.cloudera.org:8080/11447 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M infra/python/deps/stage2-requirements.txt M tests/conftest.py 2 files changed, 7 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 7 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7580: Increase timeout int test automatic invalidation
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11453 Change subject: IMPALA-7580: Increase timeout int test_automatic_invalidation .. IMPALA-7580: Increase timeout int test_automatic_invalidation The query in test_automatic_invalidation might finish after the table expires and renders the test flaky. This patch increases the timeout to 10 seconds on regular builds and 20 seconds on slow builds. Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 --- M tests/custom_cluster/test_automatic_invalidation.py 1 file changed, 9 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/11453/1 -- To view, visit http://gerrit.cloudera.org:8080/11453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifcf8fabdbfedf310b452bbc9b913a0a85c4b18f1 Gerrit-Change-Number: 11453 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7351: Improve memory estimates for Kudu Scan Nodes
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11440 Change subject: IMPALA-7351: Improve memory estimates for Kudu Scan Nodes .. IMPALA-7351: Improve memory estimates for Kudu Scan Nodes This patch adds memory estimates for kudu scan nodes based on empirically derived estimates for the scan's memory consumption that were added in IMPALA-7096. Testing: Modified resource requirements planner test. Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229 --- M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test 6 files changed, 100 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/11440/1 -- To view, visit http://gerrit.cloudera.org:8080/11440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229 Gerrit-Change-Number: 11440 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/695/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 21:24:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3166/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 20:52:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. Patch Set 2: Code-Review+2 (2 comments) carrying forward http://gerrit.cloudera.org:8080/#/c/11446/1/be/src/exec/grouping-aggregator.h File be/src/exec/grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/11446/1/be/src/exec/grouping-aggregator.h@583 PS1, Line 583: /// Calls Close() on 'output_partition_' and every Partition in > Comment needs update to mention output_partition_. Done http://gerrit.cloudera.org:8080/#/c/11446/1/be/src/exec/grouping-aggregator.cc File be/src/exec/grouping-aggregator.cc: http://gerrit.cloudera.org:8080/#/c/11446/1/be/src/exec/grouping-aggregator.cc@260 PS1, Line 260: DCHECK(output_partition_ != nullptr); > Can you document this invariant (!output_iterator_.AtEnd() implies output_p Done -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 20:52:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11446 ) Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 20:52:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7573: Fix GroupingAggregator::Reset with output partition
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11446 to look at the new patch set (#2). Change subject: IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ .. IMPALA-7573: Fix GroupingAggregator::Reset with output_partition_ GroupingAggregator::Reset() doesn't call Close() on output_partition_, which can lead to hitting a DCHECK(is_closed) in the Partition destructor. Testing: - Added an e2e regression test. Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba --- M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test 3 files changed, 26 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/11446/2 -- To view, visit http://gerrit.cloudera.org:8080/11446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6db8ec8479b18b5ed681d7ac438480711ab7a1ba Gerrit-Change-Number: 11446 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/11337 ) Change subject: IMPALA-7335: Fix a race in HdfsScanNode .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/11337/4/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11337/4/be/src/exec/hdfs-scan-node.cc@527 PS4, Line 527: is already in the don > @Bikram, I think you missed this comment. Can you please take a look at th yes you are right, if we get a cancelled status and done is not set then we should set the status here. On that note, before your change the behavior seemed like that if done_ is set and the scanner thread encounters an error, then the status_ is set to that error. Eventually the error may or may not be propagated up the scan node depending on who gets hold of the lock_ first (GetNext or the error-ed ScannerThread). With your patch it seems like once done is set, the status does not change, which sounds like the right behaviour, but it'd be good to confirm on that with others. http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11337/6/be/src/exec/hdfs-scan-node.cc@538 PS6, Line 538: // The scan node's execution completed, preserve the status_ currently set if this is only called when the execution is complete, worth adding that to the method comment. -- To view, visit http://gerrit.cloudera.org:8080/11337 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793 Gerrit-Change-Number: 11337 Gerrit-PatchSet: 6 Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 20:32:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/694/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 19:04:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/693/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 18:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7570: [DOCS] Added a table of all built-in impala functions
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/11441 ) Change subject: IMPALA-7570: [DOCS] Added a table of all built-in impala_functions .. Patch Set 1: > How did you create the list? Maybe add that as a comment in the doc > so someone else can use the same method later. Added a description. -- To view, visit http://gerrit.cloudera.org:8080/11441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f6b024bc218a9158249f161fd16be10f16d19db Gerrit-Change-Number: 11441 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Mon, 17 Sep 2018 18:17:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3165/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 18:16:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7570: [DOCS] Added a table of all built-in impala functions
Hello Jim Apple, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11441 to look at the new patch set (#2). Change subject: IMPALA-7570: [DOCS] Added a table of all built-in impala_functions .. IMPALA-7570: [DOCS] Added a table of all built-in impala_functions - Cleaned up no-value added texts. - Added a table of built-in functions that users can use to get a link to functions. Because the functions are listed as , the above list of functions has to be manually maintained. When there is a new function or a removed function, update the above list. The link format is: FUNCTION NAME For example: WEEKOFYEAR Change-Id: I2f6b024bc218a9158249f161fd16be10f16d19db --- M docs/topics/impala_functions.xml 1 file changed, 1,368 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11441/2 -- To view, visit http://gerrit.cloudera.org:8080/11441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f6b024bc218a9158249f161fd16be10f16d19db Gerrit-Change-Number: 11441 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 18:16:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Vuk Ercegovac has uploaded a new patch set (#5) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata Given the granular caching in CatalogdMetaProvider, it's possible to have a situation like the following: - impalad has cached the table list including some table "t" - some other daemon has issued deletion of "t" - we try to access "t" before the invalidation has reached us via the statestore In this case, we'll get an error back when we try to fetch the table "t". This can confuse the planning process since it previously determined that the table exists. Note that this may occur either when "t" is first referenced or later during planning (eg when fetching a specific partition post-pruning), so it wouldn't be possible to simply convert it into a 'table does not exist' result. The solution here is to throw InconsistentMetadataFetchException after invalidating the table list and associated object. This then triggers a re-plan of the query. This patch implements such re-planning up to 10 times before eventually giving up. Given that each attempt invalidates the inconsistent cache we would normally expect it to succeed on the first such retry. The limit of 10 retries is just to avoid infinite loops in the case of a bug. Change-Id: I17389954780fa22d7866224c17e128458fffa545 --- M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_local_catalog.py 5 files changed, 192 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/11403/5 -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 5: Code-Review+2 carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 18:15:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/11403/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11403/4//COMMIT_MSG@9 PS4, Line 9: catching > typo Done http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2099 PS4, Line 2099: the missing objects > nit: We don't actually set the missing objects? Maybe reword? Done http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@334 PS4, Line 334: info > warn? Done -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 18:14:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix caps and grammar of backend startup profile key
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11451 ) Change subject: Fix caps and grammar of backend startup profile key .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/692/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7d38a81364a57de90a01f4f65a375dc2e403e18 Gerrit-Change-Number: 11451 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 17 Sep 2018 18:11:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3164/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 18:05:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 18:05:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 5: Code-Review+2 (1 comment) carrying forward http://gerrit.cloudera.org:8080/#/c/11447/4/tests/conftest.py File tests/conftest.py: http://gerrit.cloudera.org:8080/#/c/11447/4/tests/conftest.py@58 PS4, Line 58: """ Hook startup of pytest. Sets up log format and per-test timeout. """ > Moving to conftest.py was a great suggestion. Done -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 18:04:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
Hello Philip Zeyliger, David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11447 to look at the new patch set (#5). Change subject: IMPALA-7576: Add a timeout for all E2E tests .. IMPALA-7576: Add a timeout for all E2E tests We've been seeing a lot of hangs in tests lately. This can waste test resources by keeping machines busy, cause loss of coverage when subsequent tests don't run, and can be difficult to diagnose if its not clear which test hung. This patch introduces a timeout of 2 hours for normal builds and 4 hours for slow builds for all tests run under pytest by using the pytest-timeout plugin. The timeouts were chosen to be generous to avoid false positives. In recent runs I examined, the longest running test is test_decimal_fuzz, which took 63 minutes in a DEBUG build and 162 minutes in an ASAN build. Testing: - Ran locally with a reduced timeout and confirmed the test is timed out when expected. Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd --- M infra/python/deps/stage2-requirements.txt M tests/conftest.py 2 files changed, 7 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/11447/5 -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7576: Add a timeout for all E2E tests
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/11447 ) Change subject: IMPALA-7576: Add a timeout for all E2E tests .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11447/4/tests/conftest.py File tests/conftest.py: http://gerrit.cloudera.org:8080/#/c/11447/4/tests/conftest.py@58 PS4, Line 58: """ Hook startup of pytest to set up log format. """ Moving to conftest.py was a great suggestion. Nit: this is docstring should be updated now. -- To view, visit http://gerrit.cloudera.org:8080/11447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I301dd27a9767bfaef2756282014ef457a31956bd Gerrit-Change-Number: 11447 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 17 Sep 2018 17:45:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix caps and grammar of backend startup profile key
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11451 Change subject: Fix caps and grammar of backend startup profile key .. Fix caps and grammar of backend startup profile key * Make caps consistent with other keys * Use "start up" (verb) instead of "startup" (noun) Change-Id: Ie7d38a81364a57de90a01f4f65a375dc2e403e18 --- M be/src/runtime/coordinator.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/11451/1 -- To view, visit http://gerrit.cloudera.org:8080/11451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie7d38a81364a57de90a01f4f65a375dc2e403e18 Gerrit-Change-Number: 11451 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11403 ) Change subject: IMPALA-7530, IMPALA-7509. Re-plan queries if they fetch inconsistent metadata .. Patch Set 4: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/11403/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11403/4//COMMIT_MSG@9 PS4, Line 9: catching typo http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2099 PS4, Line 2099: the missing objects nit: We don't actually set the missing objects? Maybe reword? http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11403/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@334 PS4, Line 334: info warn? -- To view, visit http://gerrit.cloudera.org:8080/11403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17389954780fa22d7866224c17e128458fffa545 Gerrit-Change-Number: 11403 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 17 Sep 2018 17:27:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc File be/src/runtime/date-test.cc: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@120 PS2, Line 120: ValidateDate I think that it's worth mentioning both here and in the commit message that Impala's parsing logic is checked with against CCTZ's logic. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@418 PS2, Line 418: { I would prefer to move this block to a member function of DateTc. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@424 PS2, Line 424: "TC: " << i; Instead of printing the index it would be more informative print test_case.str or adding a << function for DateTc. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@472 PS2, Line 472: { Similarly to line 418, I would prefer to move this block to a member function + replace '<< "TC: " << i' with more detailed info. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@490 PS2, Line 490: // Test min supported date. Please mention here that MIN_DATE_DAYS_SINCE_EPOCH is calculated with Proleptic Gregorian calendar and is expected to be different than 0001-01-01 Hive. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h File be/src/runtime/date-value.h: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@44 PS2, Line 44: MIN_DAYS_SINCE_EPOCH - 1 I would prefer to choose another value for invalid dates - my concern is that this may be valid if written by Hive or other systems that do not use Proleptic Gregorian Calendar. This probably wouldn't cause an issue, but it would clearer to use a more extreme value. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@46 PS2, Line 46: year >= MIN_YEAR && year <= MAX_YEAR LIKELY could be added here. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@67 PS2, Line 67: ToCivilDay I would prefer to avoid including CCTZ in the interface (but not at all costs) and hide it behind the implementation of TimestampValue/DateValue/TimezoneDb. Maybe some utility functions could be added somewhere that do these conversions. CCTZ will be useful for functions like addDay/addYear/dayOfWeek, but these could be also added to the interface of DateValue, so the CCTZ can do its work in the implementation. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@101 PS2, Line 101: int32_t days_since_epoch_; Please add more explanation about the representation here or at the start of the class e.g.: - the epoch is 1970.01.01 - this representation was chosen to be the same (bit-by-bit) as Parquet's date: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date - boost::gregorian::date could not be used as representation due to its limited range - Proleptic Gregorian calendar which can lead to different historical representation compared to Hive (https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar) -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 17:22:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7556: part 1: handle different file systems via polymorphism
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11444 ) Change subject: IMPALA-7556: part 1: handle different file systems via polymorphism .. Patch Set 2: (9 comments) I think this makes sense. I think the documentation of invariants around some of the variables could be better and make it easier to work on the code in future. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h File be/src/runtime/io/file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@41 PS2, Line 41: void SetRequestContext(RequestContext* request_context) { I think we always call SetRequestContext() and SetDiskIoMgr() at the same time as ResetState() so I think we can just combine them. Would make the intended use pattern of the API clearer. Or we could just expose request_context_ and io_mgr_ in ScanRange and use those directly. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@50 PS2, Line 50: BytesRead bytes_read() since it's just a trivial accessor http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@81 PS2, Line 81: GetLock this is a plain accessor so prefer 'lock()' http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@93 PS2, Line 93: ScanRange* scan_range_; const? I don't think this changes after initialization. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.h File be/src/runtime/io/hdfs-file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.h@47 PS2, Line 47: hdfsFS hdfs_fs_; const, since I don't think this is reassigned http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@108 PS2, Line 108: int64_t max_chunk_size = scan_range_->MaxReadChunkSize(); This part was just a mechanical move, right? Would be helpful to flag the parts that were just moved from one file to another (we can verify this ourselves but it's helpful to have a starting people). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.h File be/src/runtime/io/local-file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.h@40 PS2, Line 40: FILE* file_ = nullptr; Maybe briefly document how this fits into the lifecycle, e.g. set in Open() and set back to nullptr in Close(). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h File be/src/runtime/io/request-ranges.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h@274 PS2, Line 274: Status CancelStatus() const { return cancel_status_; } This needs a little more thought - we needed to hold lock_ or scan_range_lock_ to read this member so don't want to expose this publicly here. Maybe could move it to private: and document that the FileReader classes access it directly while holding their lock. trivial accessor, so cancel_status() if we keep it exposed. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h@482 PS2, Line 482: hdfs_lock_ Comments referencing hdfs_lock_ need updating. -- To view, visit http://gerrit.cloudera.org:8080/11444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d3d2d774075008285230606b992603d5be1a82 Gerrit-Change-Number: 11444 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 17:03:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7570: [DOCS] Added a table of all built-in impala functions
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11441 ) Change subject: IMPALA-7570: [DOCS] Added a table of all built-in impala_functions .. Patch Set 1: How did you create the list? Maybe add that as a comment in the doc so someone else can use the same method later. -- To view, visit http://gerrit.cloudera.org:8080/11441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f6b024bc218a9158249f161fd16be10f16d19db Gerrit-Change-Number: 11441 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Mon, 17 Sep 2018 16:50:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5031: undefined behavior: codegen signed overflow
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11406 ) Change subject: IMPALA-5031: undefined behavior: codegen signed overflow .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11406/1/be/src/exprs/operators-ir.cc File be/src/exprs/operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/11406/1/be/src/exprs/operators-ir.cc@65 PS1, Line 65: docs p > If we #include , I agree, but I think cstring puts it in std::, n I'm pretty sure all real implementations put the standard libc functions into the global namespace regardless of which header variant you import. Honestly, this makes sense to me - it would be insane for a C++ program to redefine memcpy. But yeah, this is ok, just slightly inconsistent with the rest of the code base. http://gerrit.cloudera.org:8080/#/c/11406/2/be/src/exprs/operators-ir.cc File be/src/exprs/operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/11406/2/be/src/exprs/operators-ir.cc@68 PS2, Line 68: // range of the type". However, Clang does not document its implementation-defined I'm ok with relying on clang's undocumented behaviour so long as we have a test case. I don't see a reason why they would change this going forward, given it's the simplest to implement and most compatible with GCC. -- To view, visit http://gerrit.cloudera.org:8080/11406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79ec3a5ed974709e5e47be6b074d39ee89461f7f Gerrit-Change-Number: 11406 Gerrit-PatchSet: 2 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 16:32:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/benchmarks/convert-timestamp-benchmark.cc@152 PS1, Line 152: 19 nit: shouldn't it be 20? Maybe you could add a Reset function that only takes a const char*, and uses strlen() inside. http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h File be/src/runtime/date-parse-util.h: http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h@26 PS1, Line 26: namespace dtp = datetime_parse_util; nit: It is against the google style guide: "Do not use Namespace aliases at namespace scope in header files except in explicitly marked internal-only namespaces" https://google.github.io/styleguide/cppguide.html#Namespaces http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h@59 PS1, Line 59: /// Returns the realigned year value for dt_result.year. This function should be called : /// only if dt_result.realign_year is true. Please explain what is a realigned year. You can re-use the original comment from timestamp-parse-util.h http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc File be/src/runtime/date-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@61 PS1, Line 61: NULL nit: nullptr http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@71 PS1, Line 71: if (!date->IsValid()) return false; : : return true; nit: return date->IsValid(); http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@132 PS1, Line 132: DateTimeFormatContext lazy_ctx; : lazy_ctx.Reset(str, trimmed_len); nit: could be a one-liner with a constructor call. http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@147 PS1, Line 147: NULL nit: nullptr http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@155 PS1, Line 155: NULL nit: nullptr http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-test.cc File be/src/runtime/date-test.cc: http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-test.cc@140 PS1, Line 140: for (int i = 0; i < toks_len; ++i) { : fmt.append((*toks)[i].fmt); : if ((*toks)[i].str != nullptr) { : val.append(string((*toks)[i].str)); : } else { : val.append(lexical_cast((*toks)[i].val)); : } : } : string fmt_val = "Format: " + fmt + ", Val: " + val; : DateTimeFormatContext dt_ctx(fmt.c_str(), fmt.length()); : ASSERT_TRUE(ParseFormatTokens(_ctx, false)) << fmt_val; : DateValue dv = DateValue::Parse(val.c_str(), val.length(), dt_ctx); : ValidateDate(dv, year, month, day, fmt_val); : int buff_len = dt_ctx.fmt_out_len + 1; : char buff[buff_len]; : int actual_len = dv.Format(dt_ctx, buff_len, buff); : EXPECT_GT(actual_len, 0) << fmt_val; : EXPECT_LE(actual_len, dt_ctx.fmt_len) << fmt_val; : string buff_str(buff); : EXPECT_EQ(buff_str, val) << fmt_val << " " << buff_str; : fmt.clear(); : val.clear(); : } : // Validate we can parse date with separators : { : for (const char* separator = SEPARATORS; *separator != 0; ++separator) { : for (int i = 0; i < toks_len; ++i) { : fmt.append((*toks)[i].fmt); : if (i + 1 < toks_len) fmt.push_back(*separator); : if ((*toks)[i].str != nullptr) { : val.append(string((*toks)[i].str)); : } else { : val.append(lexical_cast((*toks)[i].val)); : } : if (i + 1 < toks_len) val.push_back(*separator); : } : string fmt_val = "Format: " + fmt + ", Val: " + val; : DateTimeFormatContext dt_ctx(fmt.c_str(), fmt.length()); : ASSERT_TRUE(ParseFormatTokens(_ctx, false)) << fmt_val; : DateValue dv = DateValue::Parse(val.c_str(), val.length(), dt_ctx); : ValidateDate(dv, year, month, day, fmt_val); : int
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 1: (5 comments) I have went through the change quickly, I will dig into details deeper in the future. http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG@13 PS2, Line 13: The parser supports parsing both default and custom formatted DATE : values. The formatter supports default and custom formatting of DATE : values. It took me some time to realize that these two sentences are not duplicates. Please add "also" or merge them somehow. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc File be/src/runtime/date-test.cc: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@195 PS2, Line 195: TEST I would prefer to split this into more than one tests, e.g. DefaultParse, CustomParse, EdgeCases, Format. timestamp-test.cc has a very long Basic test, but I do not think that it is a good idea - I found it quite hard to get an overview of the tests and checking whether something is already tested or not. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@491 PS2, Line 491: const int64_t MIN_DATE_DAYS_SINCE_EPOCH = -719528; Does this have to be int64? Int32 should be generally always enough to represent day since epoch. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@510 PS2, Line 510: const int64_t MAX_DATE_DAYS_SINCE_EPOCH = 2932896; Same as line 491. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h File be/src/runtime/date-value.h: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@35 PS2, Line 35: -01-01 Isn't it -12-31? That date can be represented with TimestampValue. -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 16:19:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 2: (3 comments) I was curious so I took a quick look over the changes to understand the high-level changes, I think this makes a lot of sense to me. http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG@29 PS2, Line 29: - Added BE-tests for DateParser and DateValue classes. I didn't see any reason in the code to expect that performance would change, but it would be good to do a sanity-test of timestamp parsing performance, e.g. measure the time taken to convert a largish table of timestamp strings. I'd suggest doing this with mt_dop=1 so that scanner threads don't affect measurements. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h File be/src/runtime/date-value.h: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@54 PS2, Line 54: bool IsValid() const { We should document the invariant about validity. I.e. that it's possible for an "invalid" DateValue to exist. http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/datetime-parse-util.h File be/src/runtime/datetime-parse-util.h: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/datetime-parse-util.h@178 PS2, Line 178: : year(-1), We'v generally been directly initializing the members when they're constants. -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 16:18:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/691/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 15:31:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. IMPALA-7492: Add support for DATE text parser/formatter This change is the first step in implementing support for DATE type (IMPALA-6169). The DATE parser/formatter is implemented by the new DateParser class. The parser supports parsing both default and custom formatted DATE values. The formatter supports default and custom formatting of DATE values. In the future, DateParser will be used in the text scanner/writer and in the DATE <-> STRING cast functions. The DateParser class reuses some of the functionality already implemented in the TimestampParser class to minimize redundancy. To make code reuse easier, a new namespace (datetime_parse_util) was created and the common functionality was moved there. This change also adds a new class (DateValue) to represent a DATE value in-memory. The DateParser and DateValue classes are used only in tests at the moment, therefore this patch doesn't change user facing behavior. Testing: - Added BE-tests for DateParser and DateValue classes. Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/benchmarks/parse-timestamp-benchmark.cc M be/src/common/init.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timestamp-functions.h M be/src/runtime/CMakeLists.txt A be/src/runtime/date-parse-util.cc A be/src/runtime/date-parse-util.h A be/src/runtime/date-test.cc A be/src/runtime/date-value.cc A be/src/runtime/date-value.h A be/src/runtime/datetime-parse-util.cc A be/src/runtime/datetime-parse-util.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h 19 files changed, 1,830 insertions(+), 729 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/2 -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7556: part 1: handle different file systems via polymorphism
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11444 ) Change subject: IMPALA-7556: part 1: handle different file systems via polymorphism .. Patch Set 2: (8 comments) I have minimal experience with IO manager so I may not understand the intent of the refactor in some places. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@122 PS2, Line 122: while (true) { optional: If this part is already touched, it would be nice to extract some parts to functions to reduce function length/nestedness. At the first glance I would try to extract the part that is "retried" to a function like Status ReadFromPosInternal(...lot of args). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@201 PS2, Line 201: Close In the .h Close() is after ReadFromCache(). I would prefer this order in .cc to be consistent and also to group read functions together. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@203 PS2, Line 203: closed_file Variable closed_file looks unnecessary. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@207 PS2, Line 207: if optional: as this block mainly uses private ScanRange variables, I would extract it to a function in ScanRange like FreeCachedExternalBuffer(). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@249 PS2, Line 249: ReadFromCache I would consider moving (most of) this back to ScanRange::ReadFromCache(). The local file version doesn't seem meaningful + this uses scan_range's private variables much more often than HDFSFileReader's. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@334 PS2, Line 334: nit: other members have no space after =. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.cc File be/src/runtime/io/local-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.cc@100 PS2, Line 100: Status LocalFileReader::ReadFromCache(const unique_lock& reader_lock, : bool* read_succeeded) { : DCHECK(reader_lock.mutex() == _context_->lock_ && reader_lock.owns_lock()); : DCHECK_EQ(bytes_read_, 0); : return Status::OK(); : } This looks weird to me - is this ever called? I checked the callsites of ReadFromCache() and read_succeeded is not initialized before the call, so function will also leave it in an uninitialized state. If this should never be called, then I would replace add DCHECK(false). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.cc@109 PS2, Line 109: closed_file Variable closed_file looks unnecessary. -- To view, visit http://gerrit.cloudera.org:8080/11444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d3d2d774075008285230606b992603d5be1a82 Gerrit-Change-Number: 11444 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 17 Sep 2018 14:32:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions Impala used to convert from sub-second unix time to TimestampValue (which is split to date_ and time_ similarly to boost::posix_time::ptime) by first splitting the input into seconds and sub-seconds part, converting the seconds part wit boost::posix_time::from_time_t(), and then adding the sub-seconds part to this timestamp. Different tricks are used to speed up different functions: - UTC functions that expect a single integer as input can split it into date_ and time_ directly. - Non-UTC functions need seconds for timezone conversion, because CCTZ expects time points as seconds. These were optimized by adding the subsecond part to time_ instead of adding it to a ptime. This can be done safely because the sub-second part is between [0, 1 sec), so it cannot overflow into a different day or timezone. Benchmarks show 2x - 6x speedup for the measured functions. The main motivation is IMPALA-5050: "Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner" - reading these types will run micro/milli->TimestampValue conversion for every row. Other changes: - TimestampValue::UtcFromUnixTimeMillis was added - currently this is only used in tests but it will be useful for IMPALA-5050 - Some functions were moved from .h to .inline.h. - FromUnixTimeMicros was changed to do the utc->local conversion depending on flag use_local_tz_for_unix_timestamp_conversions to be consistent with other similar functions. This function was only used in tests until now but it will be useful for IMPALA-5050. - When a result mismatch is detected in convert-timestamp-benchmark.cc it now prints non-equal values. - Benchmarks were added for micro + nano conversions. Note that only single threaded benchmarks were added because I do not expect any difference in the multi threaded case. - DCHECKs were added to TimeStampValue::Validate to ensure that time_ is between [0, 24 hour). Testing: - timestamp-test.cc was extended to give better coverage for sub-second conversions. Edge cases were already covered pretty well. Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Reviewed-on: http://gerrit.cloudera.org:8080/11183 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/hdfs-orc-scanner.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h 9 files changed, 378 insertions(+), 108 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 24 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 23: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 23 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 14:18:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/690/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 17 Sep 2018 13:05:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11450 Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. IMPALA-7492: Add support for DATE text parser/formatter This change is the first step in implementing support for DATE type (IMPALA-6169). The DATE parser/formatter is implemented by the new DateParser class. The parser supports parsing both default and custom formatted DATE values. The formatter supports default and custom formatting of DATE values. In the future, DateParser will be used in the text scanner/writer and in the DATE <-> STRING cast functions. The DateParser class reuses some of the functionality already implemented in the TimestampParser class to minimize redundancy. To make code reuse easier, a new namespace (datetime_parse_util) was created and the common functionality was moved there. This change also adds a new class (DateValue) to represent a DATE value in-memory. The DateParser and DateValue classes are used only in tests at the moment, therefore this patch doesn't change user facing behavior. Testing: - Added BE-tests for DateParser and DateValue classes. Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/benchmarks/parse-timestamp-benchmark.cc M be/src/common/init.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timestamp-functions.h M be/src/runtime/CMakeLists.txt A be/src/runtime/date-parse-util.cc A be/src/runtime/date-parse-util.h A be/src/runtime/date-test.cc A be/src/runtime/date-value.cc A be/src/runtime/date-value.h A be/src/runtime/datetime-parse-util.cc A be/src/runtime/datetime-parse-util.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h 19 files changed, 1,830 insertions(+), 729 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/1 -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/689/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 22 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 11:05:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 23: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 23 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 10:34:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 23: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3163/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 23 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 10:34:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 22: Code-Review+2 (2 comments) Thanks for the comments, LL looks much better than ll! Carry +2. http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/benchmarks/convert-timestamp-benchmark.cc@581 PS21, Line 581: 24LL > nitL: if you want to use unsigned, you can use 'ull', also this way it woul Done http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/runtime/timestamp-value.h@188 PS21, Line 188: LL > nit: you could use capital letters, 'LL'. Done -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 22 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 10:34:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Hello Zoltan Borok-Nagy, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11183 to look at the new patch set (#22). Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions Impala used to convert from sub-second unix time to TimestampValue (which is split to date_ and time_ similarly to boost::posix_time::ptime) by first splitting the input into seconds and sub-seconds part, converting the seconds part wit boost::posix_time::from_time_t(), and then adding the sub-seconds part to this timestamp. Different tricks are used to speed up different functions: - UTC functions that expect a single integer as input can split it into date_ and time_ directly. - Non-UTC functions need seconds for timezone conversion, because CCTZ expects time points as seconds. These were optimized by adding the subsecond part to time_ instead of adding it to a ptime. This can be done safely because the sub-second part is between [0, 1 sec), so it cannot overflow into a different day or timezone. Benchmarks show 2x - 6x speedup for the measured functions. The main motivation is IMPALA-5050: "Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner" - reading these types will run micro/milli->TimestampValue conversion for every row. Other changes: - TimestampValue::UtcFromUnixTimeMillis was added - currently this is only used in tests but it will be useful for IMPALA-5050 - Some functions were moved from .h to .inline.h. - FromUnixTimeMicros was changed to do the utc->local conversion depending on flag use_local_tz_for_unix_timestamp_conversions to be consistent with other similar functions. This function was only used in tests until now but it will be useful for IMPALA-5050. - When a result mismatch is detected in convert-timestamp-benchmark.cc it now prints non-equal values. - Benchmarks were added for micro + nano conversions. Note that only single threaded benchmarks were added because I do not expect any difference in the multi threaded case. - DCHECKs were added to TimeStampValue::Validate to ensure that time_ is between [0, 24 hour). Testing: - timestamp-test.cc was extended to give better coverage for sub-second conversions. Edge cases were already covered pretty well. Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/hdfs-orc-scanner.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h 9 files changed, 378 insertions(+), 108 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/11183/22 -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 22 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 21: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/benchmarks/convert-timestamp-benchmark.cc@581 PS21, Line 581: 24ll nitL: if you want to use unsigned, you can use 'ull', also this way it wouldn't look like 2411. You can also use capital letters. http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/11183/21/be/src/runtime/timestamp-value.h@188 PS21, Line 188: ll nit: you could use capital letters, 'LL'. -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 21 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 10:11:49 + Gerrit-HasComments: Yes