[Impala-ASF-CR] IMPALA-7200: Fix missing FILESYSTEM PREFIX hitting local dataload
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10803 ) Change subject: IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload .. IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload As part of IMPALA-3307, we copy a time-zone database into HDFS. This command is failing on local filesystem due to a missing FILESYSTEM_PREFIX. This adds FILESYSTEM_PREFIX for this command. Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Reviewed-on: http://gerrit.cloudera.org:8080/10803 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M testdata/bin/create-load-data.sh 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Gerrit-Change-Number: 10803 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion()
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10705 ) Change subject: IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion() .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/scheduling/admission-controller.cc@245 PS3, Line 245: StatestoreSubscriber::UpdateCallback cb = [this]( > Nit: auto should be used as the type for lambda. Declaring it as std::funct Done http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc@735 PS3, Line 735: if (deltas_needing_min_version.size() > 0) { > nit: !deltas_needing_min_version.empty() Done http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc@737 PS3, Line 737: typedef map TopicDeltaMap; > No longer used Done http://gerrit.cloudera.org:8080/#/c/10705/3/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/10705/3/common/thrift/StatestoreService.thrift@147 PS3, Line 147: 3: optional bool populate_min_subscriber_topic_version = false; > Why optional? It's treated as required throughout this patch. Good point, I wanted a default value for convenience in test-statestore but that didn't imply a need to make it optional. http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py File tests/statestore/test_statestore.py: http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py@614 PS3, Line 614: update_counts > index into last_to_versions Done http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py@615 PS3, Line 615: update_count > unused Done -- To view, visit http://gerrit.cloudera.org:8080/10705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ee7cb2355ba1049b9081e0df344ac41aa4ebeb1 Gerrit-Change-Number: 10705 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 23 Jun 2018 04:14:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion()
Hello Tianyi Wang, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10705 to look at the new patch set (#4). Change subject: IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion() .. IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion() min_subscriber_topic_version is expensive to compute (requires iterating over all subscribers to compute) but is only used by one subscriber/topic pair: Impalads receiving catalog topic updates. This patch implements a simple fix - only compute it if a subscriber asks for it. A more complex alternative would be to maintain a priority queue of subscriber versions, but that didn't seem worth the the complexity and risk of bugs. Testing: Add a statestore test to validate the versions. It looks like we had a pre-existing test gap for validating min_subscriber_topic_version so the test is mainly focused on adding that coverage. Ran core tests with DEBUG and ASAN. Change-Id: I8ee7cb2355ba1049b9081e0df344ac41aa4ebeb1 --- M be/src/catalog/catalog-server.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler.cc M be/src/service/impala-server.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M common/thrift/StatestoreService.thrift M tests/statestore/test_statestore.py 10 files changed, 155 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/10705/4 -- To view, visit http://gerrit.cloudera.org:8080/10705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ee7cb2355ba1049b9081e0df344ac41aa4ebeb1 Gerrit-Change-Number: 10705 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10749 ) Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 23 Jun 2018 03:21:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion()
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10705 ) Change subject: IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion() .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/scheduling/admission-controller.cc@245 PS3, Line 245: StatestoreSubscriber::UpdateCallback cb = [this]( Nit: auto should be used as the type for lambda. Declaring it as std::function would invoke std::function's copy-constructor once. (Fun fact: It's not move-constructed). http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc@735 PS3, Line 735: if (deltas_needing_min_version.size() > 0) { nit: !deltas_needing_min_version.empty() http://gerrit.cloudera.org:8080/#/c/10705/3/be/src/statestore/statestore.cc@737 PS3, Line 737: typedef map TopicDeltaMap; No longer used http://gerrit.cloudera.org:8080/#/c/10705/3/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/10705/3/common/thrift/StatestoreService.thrift@147 PS3, Line 147: 3: optional bool populate_min_subscriber_topic_version = false; Why optional? It's treated as required throughout this patch. http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py File tests/statestore/test_statestore.py: http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py@614 PS3, Line 614: update_counts index into last_to_versions http://gerrit.cloudera.org:8080/#/c/10705/3/tests/statestore/test_statestore.py@615 PS3, Line 615: update_count unused -- To view, visit http://gerrit.cloudera.org:8080/10705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ee7cb2355ba1049b9081e0df344ac41aa4ebeb1 Gerrit-Change-Number: 10705 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 23 Jun 2018 02:03:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7200: Fix missing FILESYSTEM PREFIX hitting local dataload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10803 ) Change subject: IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Gerrit-Change-Number: 10803 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Sat, 23 Jun 2018 01:55:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 10 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Sat, 23 Jun 2018 01:46:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. IMPALA-7180: Pin Impala CDH dependencies For IMPALA_MINICLUSTER_PROFILE=3 (Hadoop 3.x components), pin the CDH dependencies by storing the CDH tarballs and Maven repository in S3. This solves the issue of build coherency between the the CDH tarballs and Maven dependencies. For IMPALA_MINICLUSTER_PROFILE=2 (Hadoop 2.x components), pin the CDH dependencies by storing only the CDH tarballs in S3. The Maven repository will still use https://repository.cloudera.com, so there is still a possibility of a build coherency issue. For each CDH dependency, there is a unique build number in each repository URL to indicate the build number that created those CDH dependencies. This informaton can be useful for debugging issues related to CDH dependencies. This patch introduces CDH_DOWNLOAD_HOST and CDH_BUILD_NUMBER environment variables that can be overriden, which can be useful for running an integration job. This patch also fixes dependency issues in Hadoop that transitively depend on snapshot versions of dependencies that no longer exist, i.e. - net.minidev:json-smart:2.3-SNAPSHOT (HADOOP-14903) - org.glassfish:javax.el:3.0.1-b06-SNAPSHOT The fix is to force the dependencies by using the released versions of those dependencies. Testing: - Ran all core tests on IMPALA_MINICLUSTER_PROFILE=2 and IMPALA_MINICLUSTER_PROFILE=3 Cherry-picks: not for 2.x Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Reviewed-on: http://gerrit.cloudera.org:8080/10748 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M common/yarn-extras/pom.xml M fe/pom.xml M impala-parent/pom.xml M testdata/pom.xml M tests/test-hive-udfs/pom.xml 7 files changed, 297 insertions(+), 33 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 11 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-7191: don't call srand() at random times
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10778 ) Change subject: IMPALA-7191: don't call srand() at random times .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf0ef5a0e842ebdcdb6d7355302e68fb0bc7ef5f Gerrit-Change-Number: 10778 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 23 Jun 2018 01:14:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7201. Support DDL with LocalCatalog enabled
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10806 to review the following change. Change subject: IMPALA-7201. Support DDL with LocalCatalog enabled .. IMPALA-7201. Support DDL with LocalCatalog enabled This fixes a couple issues with DDL commands when LocalCatalog is enabled: - updateCatalogCache() gets called after any DDL. Instead of throwing an exception, we can just no-op this by returning some fake result. - In order to support 'drop database' we need to properly implement the various function-related calls such that they don't throw exceptions. This changes them to be stubbed out as having no functions. - Fixes for 'alter view' and 'drop view' so that the underlying target table gets loaded by the catalogd before attempting the operation. Without this, in the LocalCatalog case, the catalogd would only have an IncompleteTable and these operations would fail with "unexpected table type" errors. With this patch I was able to run 'run-tests.py -k views' and 3/4 passed. The one that failed depends on HBase tables, not yet implemented. Change-Id: Ic39c97a5f5ad145e03b96d1a470dc2dfa6ec71a5 --- M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java 3 files changed, 33 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/10806/1 -- To view, visit http://gerrit.cloudera.org:8080/10806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic39c97a5f5ad145e03b96d1a470dc2dfa6ec71a5 Gerrit-Change-Number: 10806 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10798 to look at the new patch set (#2). Change subject: IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables .. IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables This is a grab-bag of small fixes necessary to get most queries on HDFS tables passing with the correct plans: * Change the loading of tables to check for other table types before checking for FS tables, since without specific checks against various properties, other tables may look like FS tables and try to instantiate LocalFsTable incorrectly. * Return -1 for extrapolated row count -- the previous 0 value was convincing the planner that it had a valid value. * Fix up the handling of BuiltinsDb so that we don't depend on ImpaladCatalog to have been loaded in order to instantiate it. * Properly handle the case where all partitions are pruned by a predicate. With this change, about half of the tests in PlannerTest pass. The tests that don't pass all rely on views, HBase tables, etc. Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5 --- M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionName.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 13 files changed, 66 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10798/2 -- To view, visit http://gerrit.cloudera.org:8080/10798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5 Gerrit-Change-Number: 10798 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 8): support views in LocalCatalog
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10805 to review the following change. Change subject: IMPALA-7140 (part 8): support views in LocalCatalog .. IMPALA-7140 (part 8): support views in LocalCatalog This adds basic support for loading views in LocalCatalog. Tested with a small unit test and also verified from the shell that I can select from a view. Change-Id: Ib3516b9ceff6dce12ded68d93afde09728627e08 --- M fe/src/main/java/org/apache/impala/catalog/View.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java A fe/src/main/java/org/apache/impala/catalog/local/LocalView.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 4 files changed, 115 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/10805/1 -- To view, visit http://gerrit.cloudera.org:8080/10805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib3516b9ceff6dce12ded68d93afde09728627e08 Gerrit-Change-Number: 10805 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7149: Disable some tests in the EC build
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10804 Change subject: IMPALA-7149: Disable some tests in the EC build .. IMPALA-7149: Disable some tests in the EC build We temporarily disable the resource limits tests in the EC build to make it pass. We also disable the tests marked with "tuned_for_minicluster" in the EC build. Cherry-picks: not for 2.x. Change-Id: I0975b1a28b318625f853b612bdfea3a8adcd776e --- M tests/common/skip.py M tests/query_test/test_resource_limits.py 2 files changed, 4 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/10804/1 -- To view, visit http://gerrit.cloudera.org:8080/10804 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0975b1a28b318625f853b612bdfea3a8adcd776e Gerrit-Change-Number: 10804 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10749 ) Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2734/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 23 Jun 2018 00:07:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10749 ) Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 23 Jun 2018 00:03:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 ) Change subject: IMPALA-4848: Add WIDTH_BUCKET() function .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-Change-Number: 6023 Gerrit-PatchSet: 16 Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 22 Jun 2018 23:46:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6642 (Part 2): clean up start-impala-cluster.py
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10780 ) Change subject: IMPALA-6642 (Part 2): clean up start-impala-cluster.py .. IMPALA-6642 (Part 2): clean up start-impala-cluster.py We clean up start-impala-cluster.py in general in this patch by using logging instead of "print" and formatting strings using the format() function. We make sure to include a timestamp in each log message in order to make it easier to debug failures in custom cluster tests that happen when starting the cluster. Change-Id: I60169203c61ae6bc0a3ccd3dea355799b603efe5 --- M bin/start-impala-cluster.py M tests/common/impala_service.py 2 files changed, 147 insertions(+), 83 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/10780/2 -- To view, visit http://gerrit.cloudera.org:8080/10780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60169203c61ae6bc0a3ccd3dea355799b603efe5 Gerrit-Change-Number: 10780 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 ) Change subject: IMPALA-4848: Add WIDTH_BUCKET() function .. Patch Set 16: (3 comments) http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@582 PS15, Line 582: > Can you document briefly what the function does? Done http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@587 PS15, Line 587: void TestError(const string& expr) { > ASSERT_FALSE(status.ok()) Done http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@588 PS15, Line 588: GetValue(expr, INVALID_TYPE, /* expect_error */ true); > This is basically EndsWith(), right? Can you factor out into a utility in S Done -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-Change-Number: 6023 Gerrit-PatchSet: 16 Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 22 Jun 2018 23:39:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
Hello Taras Bobrovytsky, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6023 to look at the new patch set (#16). Change subject: IMPALA-4848: Add WIDTH_BUCKET() function .. IMPALA-4848: Add WIDTH_BUCKET() function Syntax : width_bucket(expr decimal, min_val decimal, max_val decimal, num_buckets int) This function creates equiwidth histograms , where the histogram range is divided into num_buckets buckets having identical sizes. This function returns the bucket in which the expr value would fall. min_val and max_val are the minimum and maximum value of the histogram range respectively. -> This function returns NULL if expr is a NULL. -> This function returns 0 if expr < min_val -> This function returns num_buckets + 1 if expr > max_val E.g. [localhost:21000] > select width_bucket(8, 1, 20, 3); +---+ | width_bucket(8, 1, 20, 3) | +---+ | 2 | +---+ Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/exprs/math-functions.h M be/src/util/string-util.cc M be/src/util/string-util.h M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/Expr.java 7 files changed, 246 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/16 -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-Change-Number: 6023 Gerrit-PatchSet: 16 Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-110: Planner support for multiple distinct aggregations.
Thomas Marshall has abandoned this change. ( http://gerrit.cloudera.org:8080/10444 ) Change subject: IMPALA-110: Planner support for multiple distinct aggregations. .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/10444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I4c5cb348f9431350d2e5bf2c84325dcc44d38d2f Gerrit-Change-Number: 10444 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-110: Planner support for multiple distinct aggregations.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/10444 ) Change subject: IMPALA-110: Planner support for multiple distinct aggregations. .. Patch Set 1: This can be abandoned, all of its content is in https://gerrit.cloudera.org/#/c/10771/ -- To view, visit http://gerrit.cloudera.org:8080/10444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c5cb348f9431350d2e5bf2c84325dcc44d38d2f Gerrit-Change-Number: 10444 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 22 Jun 2018 23:20:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6642 (Part 2): Add timestamps to start-impala-cluster.py
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10780 ) Change subject: IMPALA-6642 (Part 2): Add timestamps to start-impala-cluster.py .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10780/1/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/10780/1/bin/start-impala-cluster.py@460 PS1, Line 460: print 'Error starting cluster: {0}'.format(e) > If you switch to logging, this becomes just "logging.exception("Error start Done -- To view, visit http://gerrit.cloudera.org:8080/10780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60169203c61ae6bc0a3ccd3dea355799b603efe5 Gerrit-Change-Number: 10780 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 22 Jun 2018 23:02:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7199: Add scripts to create code coverage reports
Hello Philip Zeyliger, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10791 to look at the new patch set (#2). Change subject: IMPALA-7199: Add scripts to create code coverage reports .. IMPALA-7199: Add scripts to create code coverage reports gcovr is a python library that uses gcov to generate code coverage reports. This adds gcovr to the python dependencies and adds bin/impala-gcovr to provide easy access to gcovr's command line. gcovr 3.4 supports python 2.6+. This also adds bin/coverage_helper.sh to provide a simplified interface to generate reports and zero coverage counters. Code coverage data is written out when a program exits, so it is important to avoid hard kills to shut down the impalads when generating coverage. This modifies testdata/bin/kill-all.sh to call start-impala-cluster.py --kill when shutting down the minicluster to try to avoid doing a hard kill. It will still do a hard kill if impala is still running after the softer kill. Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd --- A bin/coverage_helper.sh A bin/impala-gcovr M infra/python/deps/requirements.txt M testdata/bin/kill-all.sh 4 files changed, 111 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10791/2 -- To view, visit http://gerrit.cloudera.org:8080/10791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd Gerrit-Change-Number: 10791 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] Add scripts to create code coverage reports
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10791 ) Change subject: Add scripts to create code coverage reports .. Patch Set 1: (3 comments) Made this into a JIRA IMPALA-7199. There is example output in a tarball there. http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh File bin/coverage_helper.sh: http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@1 PS1, Line 1: #!/bin/bash > I'm increasingly of the opinion that we should just start with python for t I'll take a look at how that would be. http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@24 PS1, Line 24: REPORT_DIRECTORY=${IMPALA_HOME}/logs/coverage > Does something mkdir -p this? I do this down in the REPORT_ACTION = 1 block. http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@30 PS1, Line 30: echo "[-reportdirectory] : Output directory for coverage report files" > Indicate that this takes an argument? Done -- To view, visit http://gerrit.cloudera.org:8080/10791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd Gerrit-Change-Number: 10791 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 22:47:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7198: [DOCS] Corrected misspelled hints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10796 ) Change subject: IMPALA-7198: [DOCS] Corrected misspelled hints .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Gerrit-Change-Number: 10796 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 22 Jun 2018 22:44:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7198: [DOCS] Corrected misspelled hints
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10796 ) Change subject: IMPALA-7198: [DOCS] Corrected misspelled hints .. IMPALA-7198: [DOCS] Corrected misspelled hints Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Reviewed-on: http://gerrit.cloudera.org:8080/10796 Reviewed-by: Alex Rodoni Tested-by: Impala Public Jenkins --- M docs/topics/impala_hints.xml 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Alex Rodoni: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Gerrit-Change-Number: 10796 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-7140 (part 3): load partitions for FS tables
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10713 ) Change subject: IMPALA-7140 (part 3): load partitions for FS tables .. IMPALA-7140 (part 3): load partitions for FS tables This implements the loading of partitions from the HMS for FS-based tables. The LocalPartitionSpec class implements PrunablePartition based on parsing the partition names returned from the HMS. After partitions are pruned, the remaining partitions can be loaded by name. With this patch, I am able to connect to an impalad running with --use_local_catalog and issue 'show partitions functional.alltypes' with the expected results. A new unit test is added which shares some code with CatalogTest to ensure that partitions can be loaded and parsed properly. Testing partition pruning via end-to-end planning is not quite supported yet: we need to implement 'toThriftDescriptor()' before we can run those end-to-end tests. Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Reviewed-on: http://gerrit.cloudera.org:8080/10713 Reviewed-by: Vuk Ercegovac Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java A fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java A fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 14 files changed, 707 insertions(+), 108 deletions(-) Approvals: Vuk Ercegovac: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Gerrit-Change-Number: 10713 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. IMPALA-7140 (part 4): support creating descriptors for FS tables This adds the relevant methods to convert LocalFsTable and LocalFsPartition to thrift descriptors for consumption by the backend. Unfortunately we cannot yet enable the planner tests, since they are checking file counts and sizes as part of the explain output, and we haven't yet implemented file info fetching in the LocalCatalog. However, I was able to manually test this change by starting an impalad with --use_local_catalog, connecting to it from the shell, and running various EXPLAIN SELECT queries against tpch and functional tables. The explain output is more or less as expected with the exception of missing file info. Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Reviewed-on: http://gerrit.cloudera.org:8080/10735 Reviewed-by: Vuk Ercegovac Tested-by: Impala Public Jenkins --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 227 insertions(+), 105 deletions(-) Approvals: Vuk Ercegovac: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10696 ) Change subject: [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls .. Patch Set 4: Forgot to respond to the top-level comment. > The big question for me is what our policy is here. Should we be doing this > for all blocking system calls? All blocking RPCs? How do we maintain that we > have decent coverage? I think we should start this with HDFS calls since they are notorious in this regarding especially around not timing out. If this is committed and the macro is available, we can find out other problematic blocking calls and cover them too. > For the Java stuff, an alternative approach is to call out via our regular > Thrift-y/JNI-y route to ask Java to get the stack trace using management > beans. I'm pretty sure you can match native thread ids to Java ones, though > based on reading > https://gist.github.com/rednaxelafx/843622/eb0b0877ff4aac77c76e5a50f7621dc32ea451eb > it looks like it's hard. (In jstack, it's the nid=... but you need to do a > hex to decimal conversion for pids. But it looks like it's not readily > available out of Java.) Tried this, but I couldn't find a way to map the nid <-> tid using the API. Thread API in Java does not expose the nid of the thread (which is weird). Of course we could do it the hacky way by parsing the jstack output and extracting the nid field but I don't think that is a reasonable approach. Not sure if you have any other ideas to achieve the same. > I think it may also make sense to expose a counter on how often this happens. > A monitoring tool would want to alert if this is happening a lot, and it > won't want to grep the logs. Even more interesting would be to write down > when it happens on behalf of a query in the profile, but that's not always > possible. Yea, I thought about this. Let me think a bit more and get back to you, flushing out the comments meanwhile. -- To view, visit http://gerrit.cloudera.org:8080/10696 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28e918761c120043d332b034450208eaf34e3e2b Gerrit-Change-Number: 10696 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 22 Jun 2018 22:26:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7198: [DOCS] Corrected misspelled hints
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10796 ) Change subject: IMPALA-7198: [DOCS] Corrected misspelled hints .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Gerrit-Change-Number: 10796 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 22 Jun 2018 22:21:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7198: [DOCS] Corrected misspelled hints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10796 ) Change subject: IMPALA-7198: [DOCS] Corrected misspelled hints .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/331/ -- To view, visit http://gerrit.cloudera.org:8080/10796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Gerrit-Change-Number: 10796 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 22 Jun 2018 22:21:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7200: Fix missing FILESYSTEM PREFIX hitting local dataload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10803 ) Change subject: IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2733/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/10803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Gerrit-Change-Number: 10803 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 22 Jun 2018 22:14:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7200: Fix missing FILESYSTEM PREFIX hitting local dataload
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10803 ) Change subject: IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Gerrit-Change-Number: 10803 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 22 Jun 2018 22:10:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7200: Fix missing FILESYSTEM PREFIX hitting local dataload
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10803 Change subject: IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload .. IMPALA-7200: Fix missing FILESYSTEM_PREFIX hitting local dataload As part of IMPALA-3307, we copy a time-zone database into HDFS. This command is failing on local filesystem due to a missing FILESYSTEM_PREFIX. This adds FILESYSTEM_PREFIX for this command. Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d --- M testdata/bin/create-load-data.sh 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/10803/1 -- To view, visit http://gerrit.cloudera.org:8080/10803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I972192f22943baef6043a4c9db54d5d48089ea9d Gerrit-Change-Number: 10803 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2732/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 10 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 22:00:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 10 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 22:00:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-110 (part 2): Refactor PartitionedAggregationNode
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10394 ) Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode .. Patch Set 4: (5 comments) I still need to look at grouping-aggregator-* one more time but here's the comments for the rest. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc@310 PS4, Line 310: if (FLAGS_use_legacy_aggregation) { : *node = pool->Add(new PartitionedAggregationNode(pool, tnode, descs)); do we need to keep this code around? it seems like the new stuff should be functionally, algorithmically, and performance equivalent, so how about we just delete the old code? it's hard to test both code paths and the risk here seems low enough that we can just delete the old one. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h File be/src/exec/non-grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@103 PS4, Line 103: 'process_batch_fn_' or : /// 'process_batch_no_grouping_fn_' will be updated with the codegened function : /// depending on whether this is a grouping or non-grouping aggregation. is that still accurate? http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@107 PS4, Line 107: CodegenProcessBatch also the naming seems inconsistent with ProcessBatchNoGrouping(). The NoGrouping part seems redundant with the class name now, but I don't mind either way if we can make it consistent with the CodegenFn naming. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc File be/src/exec/non-grouping-aggregator.cc: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc@123 PS4, Line 123: ProcessBatchNoGrouping given that this is no 1:1 with AddBatch(), maybe just call it AddBatchWork() to make that obvious (and do the renaming for codegen fn/vars)? http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h File be/src/exec/streaming-aggregation-node.h: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h@35 PS4, Line 35: f there is not enough memory available I think it also stops aggregating if it's not seeing a good reduction? if so, maybe clarify with that. -- To view, visit http://gerrit.cloudera.org:8080/10394 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e Gerrit-Change-Number: 10394 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 21:56:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-110 (part 2): Refactor PartitionedAggregationNode
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10394 ) Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode .. Patch Set 4: (2 comments) Had a couple of high-level questions first http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/common/global-flags.cc@123 PS4, Line 123: DEFINE_bool(use_legacy_aggregation, false, "(Advanced) If true, fall back to using the " What's the motivation for keeping the old implementation? Are we just concerned about having a fallback in case of bugs, or are there specific things to address before we feel comfortable removing it? I'm open to doing this, mainly want to have a clear plan for eventual removal so we avoid carrying it around for a long time like the old Aggs and Joins. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/aggregator.h File be/src/exec/aggregator.h: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/aggregator.h@95 PS4, Line 95: /// Account for peak memory used by this aggregator. Is the idea here to account for memory used by each aggregator separately? That's probably a useful property but wanted to make sure I understood the motivation (the alternative would be to have more of the MemPools, etc in the ExecNode). -- To view, visit http://gerrit.cloudera.org:8080/10394 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e Gerrit-Change-Number: 10394 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 21:40:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7191: don't call srand() at random times
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10778 ) Change subject: IMPALA-7191: don't call srand() at random times .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2731/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/10778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf0ef5a0e842ebdcdb6d7355302e68fb0bc7ef5f Gerrit-Change-Number: 10778 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 21:39:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7191: don't call srand() at random times
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10778 ) Change subject: IMPALA-7191: don't call srand() at random times .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf0ef5a0e842ebdcdb6d7355302e68fb0bc7ef5f Gerrit-Change-Number: 10778 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 21:39:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-110 (part 2): Refactor PartitionedAggregationNode
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10394 ) Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode .. Patch Set 4: I can take over the review once it looks good to Dan. Will probably manually diff the moved code or look at the branch you posted in a diff viewer. -- To view, visit http://gerrit.cloudera.org:8080/10394 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e Gerrit-Change-Number: 10394 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 21:30:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7191: don't call srand() at random times
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10778 ) Change subject: IMPALA-7191: don't call srand() at random times .. Patch Set 2: Seems good to merge -- To view, visit http://gerrit.cloudera.org:8080/10778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf0ef5a0e842ebdcdb6d7355302e68fb0bc7ef5f Gerrit-Change-Number: 10778 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 21:28:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 21:11:00 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Update to the workaround for IMPALA-3316
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10800 Change subject: [DOCS] Update to the workaround for IMPALA-3316 .. [DOCS] Update to the workaround for IMPALA-3316 Both the date and time strings can be stored and converted to TIMESTAMP Change-Id: If45da5d24dd3bc5f649d95b5bc104047420dbea1 --- M docs/topics/impala_known_issues.xml 1 file changed, 17 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/10800/1 -- To view, visit http://gerrit.cloudera.org:8080/10800 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If45da5d24dd3bc5f649d95b5bc104047420dbea1 Gerrit-Change-Number: 10800 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-7140 (part 3): load partitions for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10713 ) Change subject: IMPALA-7140 (part 3): load partitions for FS tables .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Gerrit-Change-Number: 10713 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 20:49:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 20:19:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. IMPALA-5900: [DOCS] Doc fe_service_threads startup option Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Reviewed-on: http://gerrit.cloudera.org:8080/10795 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M docs/topics/impala_config_options.xml 1 file changed, 30 insertions(+), 219 deletions(-) Approvals: Michael Brown: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/330/ -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 20:09:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6802 (part 5): Clean up authorization tests
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10799 Change subject: IMPALA-6802 (part 5): Clean up authorization tests .. IMPALA-6802 (part 5): Clean up authorization tests The fifth part of this patch is to rewrite the following authorization tests: - compute/drop stats - create database/table/view - drop database/table/view Cherry-picks: not for 2.x Change-Id: I9c5a90978892ab552846e7da8c17aeeb91e57526 --- R fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java 1 file changed, 538 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/10799/2 -- To view, visit http://gerrit.cloudera.org:8080/10799 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9c5a90978892ab552846e7da8c17aeeb91e57526 Gerrit-Change-Number: 10799 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya
[native-toolchain-CR] Patch llvm to fix run-clang-tidy.py output
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10788 ) Change subject: Patch llvm to fix run-clang-tidy.py output .. Patch llvm to fix run-clang-tidy.py output Clang's run-clang-tidy.py has some behaviors that make it hard for Impala to get good output. Specifically: 1. It does not synchronize output to stdout. Output for different files can be interleaved with each other and with the script's own output. 2. It does not suppress its own output when the -quiet argument is specified. This output is hard to separate from the actual output of clang-tidy. 3. It can silently skips files when clang-tidy fails with an error code. This does not happen with the existing set of clang-tidy checks, but it could cause loss of coverage if a new check caused clang-tidy to encounter an error. This adds a patch to llvm that fixes these issues in run-clang-tidy.py by synchronizing output, respecting -quiet, and allowing an error from clang-tidy to abort the run. Since run-clang-tidy.py is not in the main llvm source tree (it is in the extra package), our existing method could not patch it. This changes llvm to extract all of the archives into the appropriate places before applying a unified patch that can touch any of the source files from any of the archives. Testing: - Ran run-clang-tidy.py on normal Impala checkout without issues. - Added clang tidy checks that Impala currently doesn't pass and verified that the output with -quiet is sensible. Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Reviewed-on: http://gerrit.cloudera.org:8080/10788 Reviewed-by: Joe McDonnell Tested-by: Joe McDonnell --- M buildall.sh M functions.sh M source/llvm/build-source-tarball.sh A source/llvm/llvm-5.0.1-patches/0001-PATCH-Fix-run-clang-tidy.py-s-output.patch 4 files changed, 110 insertions(+), 20 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/10788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Gerrit-Change-Number: 10788 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Patch llvm to fix run-clang-tidy.py output
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10788 ) Change subject: Patch llvm to fix run-clang-tidy.py output .. Patch Set 5: Verified+1 Toolchain package build succeeded on all platforms, +1 verified -- To view, visit http://gerrit.cloudera.org:8080/10788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Gerrit-Change-Number: 10788 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 19:43:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3956 - Impala shell should ignore variables in comments
Adam Holley has removed Taras Bobrovytsky from this change. ( http://gerrit.cloudera.org:8080/10784 ) Change subject: IMPALA-3956 - Impala shell should ignore variables in comments .. Removed reviewer Taras Bobrovytsky. -- To view, visit http://gerrit.cloudera.org:8080/10784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I9e442e169d512daf80c86606528300ce4e6f371c Gerrit-Change-Number: 10784 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] IMPALA-3956 - Impala shell should ignore variables in comments
Adam Holley has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/10784 ) Change subject: IMPALA-3956 - Impala shell should ignore variables in comments .. IMPALA-3956 - Impala shell should ignore variables in comments This patches fixes an issue where variables that appeared in comments were being parsed. > /* ${abc} */ help; Before - Unknown sustitution syntax (ABC) error After - Shows help correctly Testing: - Added shell tests - Ran end-to-end shell tests. Change-Id: I9e442e169d512daf80c86606528300ce4e6f371c --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 70 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/10784/3 -- To view, visit http://gerrit.cloudera.org:8080/10784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9e442e169d512daf80c86606528300ce4e6f371c Gerrit-Change-Number: 10784 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls
Hello Michael Ho, Lars Volker, Philip Zeyliger, Balazs Jeszenszky, Zoram Thanga, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10696 to look at the new patch set (#4). Change subject: [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls .. [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls This commit adds scoped thread watchdogs around blocking HDFS IO calls. These dump a user (native + JVM) and kernel stack trace to the logs if the blocking function call runs for greater than 'threshold' milliseconds. Having such traces will help debug issues with query hangs due to hangs in external components. The threshold can be configured using --thread_watchdog_threshold_ms If stack unwinding tends to be CPU heavy and causing performance problems, the watchdogs can be disabled by setting --thread_watchdog_threshold_ms=0 (default = 5000ms) Usage: - This patch adds a macro called SCOPED_WATCHDOG(). Following is a sample usage: { SCOPED_WATCHDOG(); hdfsOpenFile(fs, fname, O_RDONLY, 0, 0, 0) } The above code logs kernel and user level stacks for the current thread if hdfsOpenFile() takes more than 5000 milliseconds. Sample output: - Native Stacks: == Injected a manual sleep using SleepForMs() before hdfsOpenFile() to simulate a hang and the following is logged in the impalad logs. W0611 16:21:01.602687 347 kernel_stack_watchdog.cc:146] Thread 383 stuck at /home/bharath/Impala/be/src/runtime/io/handle-cache.inline.h:34 for 5036ms: Kernel stack: [<>] hrtimer_nanosleep+0xbb/0x180 [<>] SyS_nanosleep+0x66/0x80 [<>] system_call_fastpath+0x1a/0x1f [<>] 0x User stack: @ 0x7f808e7fed40 (unknown) @ 0x7f808eb9db9d __libc_nanosleep @ 0x1f460e6 std::this_thread::sleep_for<>() @ 0x1f44edd impala::SleepForMs() @ 0x2fef2ca impala::io::HdfsFileHandle::HdfsFileHandle() @ 0x2fef5f7 impala::io::CachedHdfsFileHandle::CachedHdfsFileHandle() @ 0x2fefe78 impala::io::FileHandleCache::GetFileHandle() @ 0x2ff3e3f impala::io::DiskIoMgr::GetCachedHdfsFileHandle() @ 0x300fad7 impala::io::ScanRange::Read() @ 0x300bca4 impala::io::ScanRange::DoRead() @ 0x2ff3207 impala::io::DiskQueue::DiskThreadLoop() @ 0x2ffac32 boost::_mfi::mf1<>::operator()() @ 0x2ffa79f boost::_bi::list2<>::operator()<>() @ 0x2ff9fbd boost::_bi::bind_t<>::operator()() @ 0x2ff970a boost::detail::function::void_function_obj_invoker0<>::invoke() @ 0x1b4e388 boost::function0<>::operator()() JVM Stack: Thread not attached to the JVM. JVM Stacks: === If a JVM Thread is attached for a given hung native thread, the the stack trace output looks like follows. W0618 20:55:45.93 1220 kernel_stack_watchdog.cc:147] Thread 1381 stuck at be/src/service/impala-server.cc:882 for 5083ms: Kernel stack: [<>] futex_wait_queue_me+0xde/0x140 [<>] futex_wait+0x182/0x290 [<>] do_futex+0xde/0x6a0 [<>] SyS_futex+0x71/0x150 [<>] system_call_fastpath+0x1a/0x1f [<>] 0x User stack: @ 0x7fee18315d40 (unknown) @ 0x7fee186b17ce __pthread_cond_timedwait @ 0x7fee1b1d6a9f os::PlatformEvent::park() @ 0x7fee1b1c2c3f ObjectMonitor::wait() @ 0x7fee1afd5ed2 JVM_MonitorWait @ 0x7fee03cb6ca4 (unknown) JVM Stack: Ljava/lang/Object;wait() Lorg/apache/impala/catalog/ImpaladCatalog;waitForCatalogUpdate(Line:239) Lorg/apache/impala/analysis/StmtMetadataLoader;loadTables(Line:136) Lorg/apache/impala/analysis/StmtMetadataLoader;loadTables(Line:115) Lorg/apache/impala/service/Frontend;createExecRequest(Line:996) Lorg/apache/impala/service/JniFrontend;createExecRequest(Line:152) Change-Id: I28e918761c120043d332b034450208eaf34e3e2b --- M be/src/common/global-flags.cc M be/src/exec/hdfs-table-sink.cc M be/src/kudu/util/debug-util.cc M be/src/kudu/util/kernel_stack_watchdog.cc M be/src/runtime/io/handle-cache.inline.h M be/src/runtime/io/scan-range.cc M be/src/service/CMakeLists.txt A be/src/service/jvmti-agent.cc A be/src/service/jvmti-agent.h M be/src/util/CMakeLists.txt A be/src/util/jvm-stack-trace.cc A be/src/util/jvm-stack-trace.h A be/src/util/scoped-watchdog.h M bin/impala-config.sh 14 files changed, 509 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/10696/4 -- To view, visit http://gerrit.cloudera.org:8080/10696 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id:
[Impala-ASF-CR] [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10696 ) Change subject: [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls .. Patch Set 3: (9 comments) http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/exec/hdfs-table-sink.cc@410 PS3, Line 410: hdfsFileInfo* info = hdfsGetPathInfo(output_partition->hdfs_connection, > This is an RPC too. Should we be watching it? Done http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc File be/src/kudu/util/debug-util.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@62 PS3, Line 62: DEFINE_int32(stack_trace_sig_num, SIGRTMIN, "Signal used by the thread watchdog to " > Why is this configurable? Firstly, these changes in Kudu's util code are expected to be pushed to the Kudu repo and then I plan to cherry-pick them back to Impala to avoid divergence. Now, coming to your question, Kudu uses SIGUSR2 for the signal handler invocation which unfortunately does not work for Impala since JVM already uses it for something else[1]. The idea is to make it configurable in the Kudu code so that Impala can override it. If Kudu project also agrees to use a different signal handler, I suppose we don't need to make it configurable then. [1] https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/signals006.html http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@121 PS3, Line 121: // Stack trace collected from the JVM thread (if one exists). : // TODO: Implement this as a callback rather than polluting Kudu's util code? : impala::JVMStackTrace jstack; > Please apply those changes to Kudu code base if possible. This avoids diver Yep, I need to discuss this with Lars and will try to push as much as I can into Kudu's code base. http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@212 PS3, Line 212:<< "Kudu will not produce thread stack traces."; > s/Kudu/Impala? Will try to make this more generic when I push a Kudu patch for review. Thanks for pointing it out. http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@288 PS3, Line 288: ret << g_comm.jstack.Stringify(); > This is new code. I think it may make sense to feature flag it (--kernel_wa We do have a kill switch for this whole watchdog feature. Do we need to have one specially for the JVM part? http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc File be/src/runtime/io/scan-range.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc@632 PS3, Line 632: fs_, hdfs_file, position_in_file > as mentioned before, we'll want to capture the argument values since that w I plan to fix this in the same patch. Kudu's logging assumes we are only passing filename:line_number (snippet below). lock_guard l(log_lock_); LOG_STRING(WARNING, log_collector_.get()) << "Thread " << p << " stuck at " << frame->status_ << " for " << paused_ms << "ms" << ":\n" << "Kernel stack:\n" << kernel_stack << "\n" << "User stack:\n" << user_stack; } I will send out a consolidated Kudu patch (with all the required Kudu changes from the current patch) and then include this contextual information during rebase (after cherry-picking it onto Impala). http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc@642 PS3, Line 642: current_bytes_read = hdfsRead(fs_, hdfs_file, buffer + *bytes_read, > Isn't this the main thing we're interested in? Done http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/service/CMakeLists.txt File be/src/service/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/service/CMakeLists.txt@53 PS3, Line 53: # Jvmti agent library that registers with the JVM on load. > As I mentioned elsewhere (perhaps later), I'm interested in whether this ad I don't think there is much overhead in the binary size. I'm not sure I understand the rest of your comment. May be can you point some docs to the workaround you are talking about? Without patch: -rwxrwxr-x 1 bharath bharath 380M Jun 22 11:35 libfesupport.so -rwxrwxr-x 1 bharath bharath 376M Jun 22 11:35 impalad -rwxrwxr-x 1 bharath bharath 375M Jun 22 11:36 session-expiry-test -rwxrwxr-x 1 bharath bharath 373M Jun 22 11:36 query-options-test -rwxrwxr-x 1 bharath bharath 372M Jun 22 11:36 hs2-util-test -rw-rw-r-- 1 bharath bharath 71M Jun 22 11:32 libService.a With Patch: -rwxrwxr-x 1 bharath bharath 380M Jun 22 11:24 libfesupport.so -rwxrwxr-x 1 bharath bharath 377M Jun 22 11:24 impalad -rwxrwxr-x 1 bharath bharath 376M Jun 17 09:57 session-expiry-test -rwxrwxr-x 1 bharath bharath 374M
[native-toolchain-CR] Patch llvm to fix run-clang-tidy.py output
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10788 ) Change subject: Patch llvm to fix run-clang-tidy.py output .. Patch Set 5: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/10788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Gerrit-Change-Number: 10788 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 18:49:27 + Gerrit-HasComments: No
[native-toolchain-CR] Patch llvm to fix run-clang-tidy.py output
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10788 ) Change subject: Patch llvm to fix run-clang-tidy.py output .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10788/4/functions.sh File functions.sh: http://gerrit.cloudera.org:8080/#/c/10788/4/functions.sh@218 PS4, Line 218: setup_package_build > setup_extracted_package_build Fixed -- To view, visit http://gerrit.cloudera.org:8080/10788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Gerrit-Change-Number: 10788 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 18:48:52 + Gerrit-HasComments: Yes
[native-toolchain-CR] Patch llvm to fix run-clang-tidy.py output
Hello Lars Volker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10788 to look at the new patch set (#5). Change subject: Patch llvm to fix run-clang-tidy.py output .. Patch llvm to fix run-clang-tidy.py output Clang's run-clang-tidy.py has some behaviors that make it hard for Impala to get good output. Specifically: 1. It does not synchronize output to stdout. Output for different files can be interleaved with each other and with the script's own output. 2. It does not suppress its own output when the -quiet argument is specified. This output is hard to separate from the actual output of clang-tidy. 3. It can silently skips files when clang-tidy fails with an error code. This does not happen with the existing set of clang-tidy checks, but it could cause loss of coverage if a new check caused clang-tidy to encounter an error. This adds a patch to llvm that fixes these issues in run-clang-tidy.py by synchronizing output, respecting -quiet, and allowing an error from clang-tidy to abort the run. Since run-clang-tidy.py is not in the main llvm source tree (it is in the extra package), our existing method could not patch it. This changes llvm to extract all of the archives into the appropriate places before applying a unified patch that can touch any of the source files from any of the archives. Testing: - Ran run-clang-tidy.py on normal Impala checkout without issues. - Added clang tidy checks that Impala currently doesn't pass and verified that the output with -quiet is sensible. Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 --- M buildall.sh M functions.sh M source/llvm/build-source-tarball.sh A source/llvm/llvm-5.0.1-patches/0001-PATCH-Fix-run-clang-tidy.py-s-output.patch 4 files changed, 110 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/88/10788/5 -- To view, visit http://gerrit.cloudera.org:8080/10788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9125cb0a908fd7005ee68aafb41f3afe93522632 Gerrit-Change-Number: 10788 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-docs-submit/329/ -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 18:21:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3956 - Impala shell should ignore variables in comments
Adam Holley has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10784 ) Change subject: IMPALA-3956 - Impala shell should ignore variables in comments .. IMPALA-3956 - Impala shell should ignore variables in comments This patches fixes an issue where variables that appeared in comments were being parsed. > /* ${abc} */ help; Before - Unknown sustitution syntax (ABC) error After - Shows help correctly Testing: - Added shell tests - Ran end-to-end shell tests. Change-Id: I9e442e169d512daf80c86606528300ce4e6f371c --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 65 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/10784/2 -- To view, visit http://gerrit.cloudera.org:8080/10784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9e442e169d512daf80c86606528300ce4e6f371c Gerrit-Change-Number: 10784 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2730/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 18:02:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 18:02:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 18:01:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10798 ) Change subject: IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10798/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10798/1//COMMIT_MSG@20 PS1, Line 20: * Fix the "builtins DB" code path. This is currently quite messy since actually it seems my fix made unit tests pass but now the actual daemon doesn't start anymore in local_catalog mode, so I think I need to re-think this a bit. -- To view, visit http://gerrit.cloudera.org:8080/10798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5 Gerrit-Change-Number: 10798 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 18:00:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/329/ -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 17:49:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2729/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:47:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 17:43:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Hello Juan Yu, Michael Brown, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10795 to look at the new patch set (#2). Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. IMPALA-5900: [DOCS] Doc fe_service_threads startup option Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 --- M docs/topics/impala_config_options.xml 1 file changed, 30 insertions(+), 219 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/10795/2 -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4669: [KUTIL] Add kudu util library to the build.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10758 ) Change subject: IMPALA-4669: [KUTIL] Add kudu_util library to the build. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10758/1/be/src/kudu/util/CMakeLists.txt File be/src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/10758/1/be/src/kudu/util/CMakeLists.txt@213 PS1, Line 213: if(NOT NO_NVM_SUPPORT) > After this change we should probably start continuously cherry-picking chan It seems that the original code is conflating NO_NVM_SUPPORT with NO_APPLE. -- To view, visit http://gerrit.cloudera.org:8080/10758 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2 Gerrit-Change-Number: 10758 Gerrit-PatchSet: 1 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Fri, 22 Jun 2018 17:39:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml File docs/topics/impala_config_options.xml: http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@a305 PS1, Line 305: > What's the plan for this? Why remove all of it? This is duplicated above - around L#132 http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@320 PS1, Line 320: reads and writes > "reads from and writes to"? Done http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@330 PS1, Line 330: clients : has > subject-verb disagreement Done http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@332 PS1, Line 332: connection > connections Done http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@333 PS1, Line 333: most of connections > either "most of the" or "most", not "most of? Done -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 17:39:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:33:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10798 to review the following change. Change subject: IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables .. IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables This is a grab-bag of small fixes necessary to get most queries on HDFS tables passing with the correct plans: * Change the loading of tables to check for other table types before checking for FS tables, since without specific checks against various properties, other tables may look like FS tables and try to instantiate LocalFsTable incorrectly. * Return -1 for extrapolated row count -- the previous 0 value was convincing the planner that it had a valid value. * Fix the "builtins DB" code path. This is currently quite messy since some planner classes refer to ImpaladCatalog directly to get a BuiltinsDb, and so we need to use that instance rather than creating our own for now. * Properly handle the case where all partitions are pruned by a predicate. With this change, about half of the tests in PlannerTest pass. The tests that don't pass all rely on views, HBase tables, etc. Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5 --- M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 4 files changed, 33 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10798/1 -- To view, visit http://gerrit.cloudera.org:8080/10798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5 Gerrit-Change-Number: 10798 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10795 ) Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml File docs/topics/impala_config_options.xml: http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@a305 PS1, Line 305: What's the plan for this? Why remove all of it? http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@320 PS1, Line 320: reads and writes "reads from and writes to"? http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@330 PS1, Line 330: clients : has subject-verb disagreement http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@332 PS1, Line 332: connection connections http://gerrit.cloudera.org:8080/#/c/10795/1/docs/topics/impala_config_options.xml@333 PS1, Line 333: most of connections either "most of the" or "most", not "most of? -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 22 Jun 2018 17:25:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@175 PS3, Line 175: /*includeFileDesc=*/false, > I inlined this call into the above function since it was the only call site actually was wrong about this, thrift *descriptors* never include file info. But I think that's already implied by the name "descriptor". Let me know if you want me to add a comment though. -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:24:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 3): load partitions for FS tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10713 ) Change subject: IMPALA-7140 (part 3): load partitions for FS tables .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2728/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/10713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Gerrit-Change-Number: 10713 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:23:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 3): load partitions for FS tables
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10713 ) Change subject: IMPALA-7140 (part 3): load partitions for FS tables .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Gerrit-Change-Number: 10713 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:22:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7140 (part 6): fetch column stats for LocalTable
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10797 to review the following change. Change subject: IMPALA-7140 (part 6): fetch column stats for LocalTable .. IMPALA-7140 (part 6): fetch column stats for LocalTable This adds fetching of column statistics for LocalTable. Currently, all column stats are fetched when the table is loaded, even for simple statements like 'DESCRIBE' where they aren't necessary. This is because I couldn't find a convenient spot during analysis at which time the set of necessary columns are known. I left a TODO for this potential improvement. With this change I can see that 'SHOW COLUMN STATS' shows the expected results for functional.alltypes. A new simple unit test verifies this. Planner tests still don't pass due to some NullPointerExceptions related to loading functions from the builtins DB -- most of the tests seem to rely on simple built-ins like COUNT and CAST. Change-Id: Ib6403c2bedf4ee29c5e6f90e947382cb44f46e0c --- M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 7 files changed, 114 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/10797/1 -- To view, visit http://gerrit.cloudera.org:8080/10797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib6403c2bedf4ee29c5e6f90e947382cb44f46e0c Gerrit-Change-Number: 10797 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10749 to look at the new patch set (#3). Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. IMPALA-7140 (part 5): support fetching file info for FS tables This adds support for fetching file information and creating file descriptors. With this patch, I'm able to connect and run queries. Most planner tests still fail because of missing column stats resulting in different join orders compared to the existing implementation. Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 --- M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java 8 files changed, 188 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10749/3 -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Hello Tianyi Wang, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10735 to look at the new patch set (#4). Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. IMPALA-7140 (part 4): support creating descriptors for FS tables This adds the relevant methods to convert LocalFsTable and LocalFsPartition to thrift descriptors for consumption by the backend. Unfortunately we cannot yet enable the planner tests, since they are checking file counts and sizes as part of the explain output, and we haven't yet implemented file info fetching in the LocalCatalog. However, I was able to manually test this change by starting an impalad with --use_local_catalog, connecting to it from the shell, and running various EXPLAIN SELECT queries against tpch and functional tables. The explain output is more or less as expected with the exception of missing file info. Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 227 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10735/4 -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10749 ) Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@132 PS2, Line 132: FileStatus fileStatus) { > nit: pull up to prev. line. thx for cleaning that signature. Done http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@167 PS2, Line 167: must path > Path must be oops, I typed with a lisp there :) http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@172 PS2, Line 172: b.add(it.next()); > nit: one line Done http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@81 PS2, Line 81: hostIndex_ > pls add a comment and todo to maintain it (afaict, it remains empty). I think this gets modified as a side effect of loadFileDescriptors, which calls HdfsTable.createFileDescriptors, which updates the host index to include all referenced hosts. I'll add some comment. http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@73 PS2, Line 73: Load files and locations > This is more like inode info (file metadata), which is clearer than "Load f Done -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 17:10:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7198: [DOCS] Corrected misspelled hints
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10796 Change subject: IMPALA-7198: [DOCS] Corrected misspelled hints .. IMPALA-7198: [DOCS] Corrected misspelled hints Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 --- M docs/topics/impala_hints.xml 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/10796/1 -- To view, visit http://gerrit.cloudera.org:8080/10796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3e3c4ed050131f811311d5fdb8634a1ea3385fb3 Gerrit-Change-Number: 10796 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-7140 (part 4): support creating descriptors for FS tables
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10735 ) Change subject: IMPALA-7140 (part 4): support creating descriptors for FS tables .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@295 PS3, Line 295: THdfsPartition > naming does not align with the switch from 'hdfs' to 'fs'. pls add a todo t added a TODO on the Thrift file http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java File fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java: http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java@160 PS3, Line 160: getFilteredHmsParameters > not for this change, but wondering if hms details need to be exposed this h I separated this API rather than filtering at the call site because I'd like to be able to push down the filtering all the way to what we fetch from HMS after it implements something like https://issues.apache.org/jira/browse/HIVE-19715 . We might want to _always_ remove the incremental stats from the parameters and add an entirely separate getter which returns them in decoded format for memory usage - I think Parna is working on something like that. But didn't want to expand this change. I'll add a TODO about this. http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@87 PS3, Line 87: 0; > interesting that this one can be 0, L73 returns empty but flipping L81 to t yea, I was surprised too. It's temporary at least until the next patch in this series. http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@98 PS3, Line 98: -1 > what's the -1? Done http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@175 PS3, Line 175: > add a comment that a slim hdfs table is being constructed (no hdfs info, no I inlined this call into the above function since it was the only call site, and the above function name makes it clearer it's just a descriptor. The "no HDFS info" bit is just temporary until the next patch, not something inherent about the design here (it will include the descriptors) http://gerrit.cloudera.org:8080/#/c/10735/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@176 PS3, Line 176: get > nit: build instead of get? obviated by above -- To view, visit http://gerrit.cloudera.org:8080/10735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4550612eb6d1e3a324f49a9c4d24b048e45d3738 Gerrit-Change-Number: 10735 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 16:59:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 5): support fetching file info for FS tables
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10749 ) Change subject: IMPALA-7140 (part 5): support fetching file info for FS tables .. Patch Set 2: Code-Review+2 (6 comments) http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@132 PS2, Line 132: FileStatus fileStatus) { nit: pull up to prev. line. thx for cleaning that signature. http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@167 PS2, Line 167: must path Path must be http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@172 PS2, Line 172: b.add(it.next()); nit: one line http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@242 PS2, Line 242: new FakeRemoteIterator<>(stats) nice. http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@81 PS2, Line 81: hostIndex_ pls add a comment and todo to maintain it (afaict, it remains empty). http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10749/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@73 PS2, Line 73: Load files and locations This is more like inode info (file metadata), which is clearer than "Load files". -- To view, visit http://gerrit.cloudera.org:8080/10749 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42d67ab754872fad094c7dacdd2e1182de1bf3e8 Gerrit-Change-Number: 10749 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 16:53:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5900: [DOCS] Doc fe service threads startup option
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10795 Change subject: IMPALA-5900: [DOCS] Doc fe_service_threads startup option .. IMPALA-5900: [DOCS] Doc fe_service_threads startup option Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 --- M docs/topics/impala_config_options.xml 1 file changed, 30 insertions(+), 219 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/10795/1 -- To view, visit http://gerrit.cloudera.org:8080/10795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7f0a417a4aa07b8082037fc6ff355e62ce1493e5 Gerrit-Change-Number: 10795 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] Add scripts to create code coverage reports
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10791 ) Change subject: Add scripts to create code coverage reports .. Patch Set 1: Code-Review+1 (3 comments) Looks fine to me. I'd love to see some demo output at some point. http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh File bin/coverage_helper.sh: http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@1 PS1, Line 1: #!/bin/bash I'm increasingly of the opinion that we should just start with python for this kind of thing. You'll save on lines 26-52 by using more obvious argparse, for example. http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@24 PS1, Line 24: REPORT_DIRECTORY=${IMPALA_HOME}/logs/coverage Does something mkdir -p this? http://gerrit.cloudera.org:8080/#/c/10791/1/bin/coverage_helper.sh@30 PS1, Line 30: echo "[-reportdirectory] : Output directory for coverage report files" Indicate that this takes an argument? -- To view, visit http://gerrit.cloudera.org:8080/10791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b2e0b794c64f9343ec976de7a3f235e54d2badd Gerrit-Change-Number: 10791 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 16:48:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7121: Clean up partitionIds from HdfsTable
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10654 ) Change subject: IMPALA-7121: Clean up partitionIds_ from HdfsTable .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b5a480e570aeae565fafd4f3e2b279e7a98c7da Gerrit-Change-Number: 10654 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 22 Jun 2018 16:43:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/10748/7/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/10748/7/bin/bootstrap_toolchain.py@436 PS7, Line 436: if not os.getenv("DOWNLOAD_CDH_COMPONENTS", "false") == "true": sys.exit(0) > nit: you can invert the check (not os.get... == "true") and exit early. Tha Done -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 16:41:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Fredy Wijaya has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. IMPALA-7180: Pin Impala CDH dependencies For IMPALA_MINICLUSTER_PROFILE=3 (Hadoop 3.x components), pin the CDH dependencies by storing the CDH tarballs and Maven repository in S3. This solves the issue of build coherency between the the CDH tarballs and Maven dependencies. For IMPALA_MINICLUSTER_PROFILE=2 (Hadoop 2.x components), pin the CDH dependencies by storing only the CDH tarballs in S3. The Maven repository will still use https://repository.cloudera.com, so there is still a possibility of a build coherency issue. For each CDH dependency, there is a unique build number in each repository URL to indicate the build number that created those CDH dependencies. This informaton can be useful for debugging issues related to CDH dependencies. This patch introduces CDH_DOWNLOAD_HOST and CDH_BUILD_NUMBER environment variables that can be overriden, which can be useful for running an integration job. This patch also fixes dependency issues in Hadoop that transitively depend on snapshot versions of dependencies that no longer exist, i.e. - net.minidev:json-smart:2.3-SNAPSHOT (HADOOP-14903) - org.glassfish:javax.el:3.0.1-b06-SNAPSHOT The fix is to force the dependencies by using the released versions of those dependencies. Testing: - Ran all core tests on IMPALA_MINICLUSTER_PROFILE=2 and IMPALA_MINICLUSTER_PROFILE=3 Cherry-picks: not for 2.x Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M common/yarn-extras/pom.xml M fe/pom.xml M impala-parent/pom.xml M testdata/pom.xml M tests/test-hive-udfs/pom.xml 7 files changed, 297 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/10748/8 -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10696 ) Change subject: [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc File be/src/runtime/io/scan-range.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc@632 PS3, Line 632: fs_, hdfs_file, position_in_file as mentioned before, we'll want to capture the argument values since that will likely be the first question some asks after seeing the callstack. if you don't want to do it in this patch, could you please file a follow on JIRA so we don't forget? -- To view, visit http://gerrit.cloudera.org:8080/10696 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28e918761c120043d332b034450208eaf34e3e2b Gerrit-Change-Number: 10696 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 22 Jun 2018 16:37:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10790 ) Change subject: IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0 .. IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0 Change-Id: I45d14e2ba601574cf76a4b739617b097f3e551aa Reviewed-on: http://gerrit.cloudera.org:8080/10790 Reviewed-by: Tim Armstrong Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins --- M docs/topics/impala_logging.xml 1 file changed, 38 insertions(+), 19 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Lars Volker: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I45d14e2ba601574cf76a4b739617b097f3e551aa Gerrit-Change-Number: 10790 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10790 ) Change subject: IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0 .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45d14e2ba601574cf76a4b739617b097f3e551aa Gerrit-Change-Number: 10790 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 16:35:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 ) Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. Patch Set 18: (2 comments) http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1435 PS18, Line 1435: HdfsPartition partition: partitionMap_.values() This loop constructs partitionsToUpdateFileMdByPath for the case where partitions are not explicitly specified (among other things). That map is a subset of locationToPartMap_ (loosely speaking). So we're already looping over all partitions here. If the block on L1446 can be modified to also build up such a map for explicitly specified partitions (e.g., when partitionsToUpdate is != null), that map can be passed into getPartitionsByPath (L1484) to be used instead of the table-wide locationToPartMap_. Unless I've missed something, this looks like it adds little to current complexity for this code path and uses a small amount of additional memory when needed. I see no other use of getPartitionsByPath. http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2252 PS18, Line 2252: for (HdfsPartition part : parts) { if the path -> partition map is dropped, an additional loop would need to consider all partitions and check vs. the specified partitions for other partitions that share the same location. this can be sped up via small indexes (index the specified partitions by location). so we'd increase the cost here from scanning the specified partitions to scanning all partitions-- how much of a performance hit is this for 100k partitions and how performance critical is drop partitions? -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 18 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 22 Jun 2018 16:30:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7180: Pin Impala CDH dependencies
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/10748 ) Change subject: IMPALA-7180: Pin Impala CDH dependencies .. Patch Set 7: Code-Review+2 (1 comment) Had one nit, otherwise LGTM. http://gerrit.cloudera.org:8080/#/c/10748/7/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/10748/7/bin/bootstrap_toolchain.py@436 PS7, Line 436: if os.getenv("DOWNLOAD_CDH_COMPONENTS", "false") == "true": nit: you can invert the check (not os.get... == "true") and exit early. That'll reduce the nesting for the rest -- To view, visit http://gerrit.cloudera.org:8080/10748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66c0dcb8abdd0d187490a761f129cda3b3500990 Gerrit-Change-Number: 10748 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 22 Jun 2018 16:29:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10790 ) Change subject: IMPALA-5763: [DOCS] Warn against setting --logbufsecs to 0 .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/328/ -- To view, visit http://gerrit.cloudera.org:8080/10790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45d14e2ba601574cf76a4b739617b097f3e551aa Gerrit-Change-Number: 10790 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 22 Jun 2018 16:28:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 ) Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. Patch Set 18: (9 comments) http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@124 PS18, Line 124: # Create three partitions where two of them point to the same location. : self.filesystem_client.make_dir(TBL_NAME + '/p1') : self.filesystem_client.make_dir(TBL_NAME + '/p2') This looks wrong, because the directories made here will only be "same_loc_test/{p1,p2}" and not rooted at TBL_LOCATION. Are these lines needed? And if they are needed, then they need to have a uniqueness to them, because as it stands, parallel tests will overwrite each other's directories in this spot. Include unique_database as part of the directory name to solve this problem. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@127 PS18, Line 127: execute Here and elsewhere, there's a execute_query_expect_success() method you should consider instead for statements you expect always to succeed. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@163 PS18, Line 163: it's Are we (Impala) in control of this message? It should be "its". http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@168 PS18, Line 168: assert false If we are here, then we never raised an exception. It would be polite to show the partition information as part of diagnostics when the test fails. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@177 PS18, Line 177: pytest.skip() You can insert a reason here. It's mildly helpful to someone that happens to see this go by on a console. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@184 PS18, Line 184: # Cleanup any existing data in the table directory. : self.filesystem_client.delete_file_dir(TBL_NAME, recursive=True) As before, I can't tell that this is needed. As before, if needed, this can collide with tests that are using a directory of TBL_NAME in L125-126, assuming the pwd is the same. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@187 PS18, Line 187: execute As before, execute_query_expect_success() in places where you expect success. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@191 PS18, Line 191: # Create two partitions pointing to the same location. : self.filesystem_client.make_dir(TBL_NAME + '/p1') Same comment for L185 applies here. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@202 PS18, Line 202: hive_cmd = ("ALTER TABLE %s DROP PARTITION (j=1)" % FQ_TBL_NAME) : call(["hive", "-e", hive_cmd]) You can use self.hive_client() instead. -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 18 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 22 Jun 2018 16:00:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10696 ) Change subject: [DRAFT] IMPALA-6189: Add thread watchdogs for HDFS IO calls .. Patch Set 3: (7 comments) Hi! Thanks for doing this. The big question for me is what our policy is here. Should we be doing this for all blocking system calls? All blocking RPCs? How do we maintain that we have decent coverage? For the Java stuff, an alternative approach is to call out via our regular Thrift-y/JNI-y route to ask Java to get the stack trace using management beans. I'm pretty sure you can match native thread ids to Java ones, though based on reading https://gist.github.com/rednaxelafx/843622/eb0b0877ff4aac77c76e5a50f7621dc32ea451eb it looks like it's hard. (In jstack, it's the nid=... but you need to do a hex to decimal conversion for pids. But it looks like it's not readily available out of Java.) I think it may also make sense to expose a counter on how often this happens. A monitoring tool would want to alert if this is happening a lot, and it won't want to grep the logs. Even more interesting would be to write down when it happens on behalf of a query in the profile, but that's not always possible. http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/exec/hdfs-table-sink.cc@410 PS3, Line 410: hdfsFileInfo* info = hdfsGetPathInfo(output_partition->hdfs_connection, This is an RPC too. Should we be watching it? http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc File be/src/kudu/util/debug-util.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@62 PS3, Line 62: DEFINE_int32(stack_trace_sig_num, SIGRTMIN, "Signal used by the thread watchdog to " Why is this configurable? http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@212 PS3, Line 212:<< "Kudu will not produce thread stack traces."; s/Kudu/Impala? http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/kudu/util/debug-util.cc@288 PS3, Line 288: ret << g_comm.jstack.Stringify(); This is new code. I think it may make sense to feature flag it (--kernel_watchdog_jstack_enabled) so that we can turn it off if we find that it's wrong. http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc File be/src/runtime/io/scan-range.cc: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/runtime/io/scan-range.cc@642 PS3, Line 642: current_bytes_read = hdfsRead(fs_, hdfs_file, buffer + *bytes_read, Isn't this the main thing we're interested in? http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/service/CMakeLists.txt File be/src/service/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/10696/3/be/src/service/CMakeLists.txt@53 PS3, Line 53: # Jvmti agent library that registers with the JVM on load. As I mentioned elsewhere (perhaps later), I'm interested in whether this adds a lot to our binary. We're already launching the JVM from a statically linked binary; do we need to point it to another file to get methods? (There's a way around this for normal JNI libraries in JDK1.8+ though I'd have to look it up.) http://gerrit.cloudera.org:8080/#/c/10696/3/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/10696/3/bin/impala-config.sh@619 PS3, Line 619: LIBHDFS_OPTS+=" -agentpath:$IMPALA_HOME/be/build/latest/service/libjvmtiagent.so" It surprises me that we have to do this with options, rather than being able to do it directly. What's going on is that we're doubling down here on our reliance on hadoop-hdfs/src/main/native/libhdfs/jni_helper.c. It's definitely not a public API that we should be depending on. Joe ran into https://issues.apache.org/jira/browse/HDFS-12628?jql=reporter%3Djoemcdonnell%20and%20project%3Dhdfs which is somewhat related. It's also problematic that we're introducing a new .so. We shouldn't have to since we statically link things. What's the size of the new .SO? Are we adding 100MB to everyone's builds? impala-config.sh is read in our build process. How do these settings work when we're running on clusters? -- To view, visit http://gerrit.cloudera.org:8080/10696 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28e918761c120043d332b034450208eaf34e3e2b Gerrit-Change-Number: 10696 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 22 Jun 2018 15:23:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7175: deflake check for failed impalad
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10745 ) Change subject: IMPALA-7175: deflake check for failed impalad .. IMPALA-7175: deflake check for failed impalad test_native_functions_race checks for an impalad crash by testing whether the number of impalads at the start and end of the test is the same. A recent run was flaky since the number of impalads at the start was incorrectly found to be 2. This fix tries to make the test most robust by determining the number of impalads based on how many can evaluate a trivial test query. For these tests, its assumed that the number of coordinators is the same as the number of impalads in the cluster. Change-Id: I97c6b398e43c6abb1df2b1783c26159137f14fa4 Reviewed-on: http://gerrit.cloudera.org:8080/10745 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/common/impala_cluster.py M tests/query_test/test_udfs.py 2 files changed, 17 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/10745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I97c6b398e43c6abb1df2b1783c26159137f14fa4 Gerrit-Change-Number: 10745 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7175: deflake check for failed impalad
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10745 ) Change subject: IMPALA-7175: deflake check for failed impalad .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I97c6b398e43c6abb1df2b1783c26159137f14fa4 Gerrit-Change-Number: 10745 Gerrit-PatchSet: 4 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 22 Jun 2018 14:42:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 ) Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. Patch Set 18: (10 comments) http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@187 PS17, Line 187: The second parameter is a set of : // the partitions pointing to this location. > we're paying memory for speed. what operations will be slowed down to addre This has been discussed somewhere back in a previous comment. Basically after each insert we have to check that are there any partitions on this same location where we have to add internally the new file descriptors. The same goes for dropping a partition: since the whole directory is dropped we have to get rid of the rest partitions on this location. http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189 PS17, Line 189: locationToPartMap_ > can you quantify the additional memory needed per partition? perhaps measur This has also been covered on this review. See my comment on Patch Set 3 around 1st June. http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235 PS17, Line 1235: Mapping > use consistent naming. so far, there's: LocationMapping and PartMap Picked LocationMapping Done http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1492 PS17, Line 1492: received > what does "received" mean? received : got as parameter The original comment doesn't reflect what the function does after the change. I wanted to emphasise that it's not just simply 'convert' the partition Names to HdfsPartitions and group them by location. But additionally it searches for the ones sharing their locations and include those to the result as well. http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1494 PS17, Line 1494: private HashMap> getPartitionsByPath( > Is the modification to this method the one that fixes the issue with refres We can populate such a map in that loop for sure but I feel that this already existing function getPartitionsByPath() is a more natural fit to do this mapping for us. Anyway, we still need to keep the locationToPartMap_ as an HdfsTable member as we need that for altering and dropping partitions. And we add no extra time complexity to getPartitionsByPath() with this change. http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2255 PS17, Line 2255: bl_hdfs.isL > lift the cast. perhaps put it right after the precondition on L2239 Done http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2258 PS17, Line 2258: Preconditions.checkState(partitionNameList.size() > 1); > at this point, size of partitionNameList must be > 1. perhaps add a precond Done http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2262 PS17, Line 2262: ionNameList.subList(0, > is there a solution in mind? for example, should we suggest that duplicate The solution depends on what the original intent of the user was: He he just wanted to drop the partition with keeping the data, then it should set this partition's location to something unique (don't have to be valid) and then drop this partition. If the user wants to drop the data and all the partitions using that location, that it has to guarantee that all the partitions point to some not-necessarily valid location, and one points to the valid location meant to be dropped. Then it can delete the partitions one by one. I'm not sure how to include this info to the error text briefly, though :) http://gerrit.cloudera.org:8080/#/c/10543/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2464 PS17, Line 2464: fsPartiti > make this a precondition for this method? Won't work for the whole method, but can add that to the else branch. Done http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@177 PS15, Line 177: pytest.skip() > The problem with skipping though is that no one will remember to un-skip th Still I feel that a Hive upgrade shouldn't break our builds. Another thing is
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Hello Bharath Vissapragada, Michael Brown, Zoltan Borok-Nagy, Sailesh Mukil, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10543 to look at the new patch set (#18). Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. IMPALA-6119: Fix issue with multiple partitions sharing same location When multiple partitions point to the same location and a new data file is added to any of them then the expected behaviour is that this new file is added to the other partitions pointing to the same location as well. Apparently, this is not the case and right after the insertion the new file is only visible in the partition where it was inserted to and an invalidate metadata is needed to resolve this inconsistency. This fix addresses this issue with keeping track of a mapping between locations and the HdfsPartitions pointing to it. When new files are inserted into a partition then all the other partition's metadata are reloaded that point to the same location as the one where the files are inserted. For managed tables it's no longer allowed to drop a partition that shares it's location with other partitions. In this case an error is displayed to the user. Testing: There was an existing test that covered partitions pointing to the same location. However, after each insert it executed a refresh to reload the metadata for the entire table. This reload was removed to cover the changes of this fix. Another test is introduced to cover the case when the location of a partition is altered or a partition is removed. One more test is created to cover when Impala reloads some of it's partitions after Hive had dropped a partition that shares it's location with other partitions. Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/metadata/test_partition_metadata.py 3 files changed, 200 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/10543/18 -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 18 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy