[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 08 Feb 2020 05:22:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. IMPALA-9243: Add info about blacklisting decisions to the webui This patch adds information about blacklisting decisions to the /backends webui endpoint. For the JSON, it adds an 'is_blacklisted' field to all backends, and for and backends where 'is_blacklisted' is true it adds a 'blacklist_cause' field indicating the error status that led to the backend getting blacklisted and an 'blacklist_time_remaining' field indiciating how much longer the backend will remain on the blacklist. It also adds counts for the number of blacklisted and quiescing backends, if any, and the number of active (i.e. all other) backends. For display, in order to prevent the table of backend information from having too many columns (prior to this patch it already had 12), it separates blacklisted, quiescing, and active backends into three separate table, with the blacklisted and quiescing tables only getting displayed if there are any such backends. Additionally, tooltips are added next to the headers for the blacklisted and quiescing tables that provide a brief explanation of what it means for a backend to appear on there lists. Using separate tables also facilitates having state-specific columns - the blacklisted table displays columns for the blacklist cause and time remaining. Future work could consider adding columns to the quiescing table, such as time until the grace period and deadline expires. Testing: - Manually ran various quiescing/blacklisting scenarios and confirmed the /backends page displays as expected. - Added cases to test_web_pages (to verify the new fields when nothing is blacklisted) and test_blacklist. Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Reviewed-on: http://gerrit.cloudera.org:8080/15178 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/runtime/coordinator.cc M be/src/scheduling/cluster-membership-mgr-test.cc M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/scheduling/executor-blacklist.cc M be/src/scheduling/executor-blacklist.h M be/src/service/impala-http-handler.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_blacklist.py M tests/webserver/test_web_pages.py M www/backends.tmpl A www/blacklisted_tooltip.txt A www/quiescing_tooltip.txt 13 files changed, 208 insertions(+), 28 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. IMPALA-9361: manually configured kerberized minicluster The kerberized minicluster is enabled by setting IMPALA_KERBERIZE=true in impala-config-*.sh. After setting it you must run ./bin/create-test-configuration.sh then restart minicluster. This adds a script to partially automate setup of a local KDC, in lieu of the unmaintained minikdc support (which has been ripped out). Testing: I was able to run some queries against pre-created HDFS tables with kerberos enabled. Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Reviewed-on: http://gerrit.cloudera.org:8080/15159 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/bootstrap_toolchain.py M bin/create-test-configuration.sh M bin/impala-config.sh A bin/kerberos/README-kerberos.md A bin/kerberos/experimental-kerberos-setup.sh M bin/rat_exclude_files.txt M bin/start-impala-cluster.py M buildall.sh M fe/src/test/resources/hive-site.xml.py D testdata/bin/minikdc.sh M testdata/bin/minikdc_env.sh M testdata/bin/run-all.sh M testdata/bin/run-hbase.sh M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py M testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py 16 files changed, 188 insertions(+), 301 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 05:16:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. IMPALA-9358: Query slowdown with inline views and hundreds of columns IMPALA-8386 introduced an expensive precondition check using the function ExprSubstitutionMap.checkComposedFrom(). This check has significant performance impact on statements that contain inline views with hundreds of columns. Most of the cost is in the get() calls used to find expressions in the local substitution map. The fix is to add a getWithHint() call that uses the current loop index as a starting point to search for expressions. This leverages the fact that expressions have identical positions in both substitution maps in most common cases. A more generic approach would be to accelerate expression equality search using hash functions but that would be a much riskier fix and Impala currently lacks the infrasturucture to so. Testing: Performance testing with a query with 1000 expressions of the following form: with a as (select c1 c1, c1 c2, c1 c3, ... from t) select c1, c2, c3, ... from a; repro query went from 12 sec to 1 sec. There was no noticeable time spent in the precondition now. Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Reviewed-on: http://gerrit.cloudera.org:8080/15157 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java 1 file changed, 16 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 5 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 4 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 08 Feb 2020 02:39:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8852: Skip short-circuit config check for coordinator-only mode
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/15173 ) Change subject: IMPALA-8852: Skip short-circuit config check for coordinator-only mode .. Patch Set 3: Code-Review+1 (1 comment) LGTM. Okay for Andrew to +2 after code style nit is addressed. http://gerrit.cloudera.org:8080/#/c/15173/3/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java: http://gerrit.cloudera.org:8080/#/c/15173/3/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@79 PS3, Line 79: Line 79 and 80 should have 4 space indent. I know these can be difficult to catch but the best way is to teach the IDE . You can use something like https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml and modify it to comply with our guidelines. (I only had to change line wrap at 90.) If I'm unsure, I usually look at the surrounding code and follow that style. -- To view, visit http://gerrit.cloudera.org:8080/15173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 Gerrit-Change-Number: 15173 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Sat, 08 Feb 2020 02:30:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 6: > Patch Set 6: Code-Review+1 > > (1 comment) > > Thanks @Fang-Yu, please feel free to +2 Thanks Bikram! I can give a +2 but may need your help with approving the patch. :-) -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 08 Feb 2020 02:22:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15170 ) Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5176/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 02:16:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5175/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 02:16:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15179 ) Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5177/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 02:15:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5174/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 02:14:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 6: I just saw this commit in Kudu, and I'm wondering if it helps with the libcurl situation: https://gerrit.cloudera.org/#/c/15180/ If they link libcurl, then we might not need to install it. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Sat, 08 Feb 2020 01:42:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8852: Skip short-circuit config check for coordinator-only mode
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/15173 ) Change subject: IMPALA-8852: Skip short-circuit config check for coordinator-only mode .. Patch Set 3: Code-Review+1 LGTM Giving +1 for now, I can +2/commit once Anurag is happy -- To view, visit http://gerrit.cloudera.org:8080/15173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 Gerrit-Change-Number: 15173 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Sat, 08 Feb 2020 01:36:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5173/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 01:29:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache-bench.cc File be/src/util/cache/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache-bench.cc@176 PS5, Line 176: pair hits_lookups = RunQueryThreads(FLAGS_num_threads, FLAGS_run_seconds); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache-bench.cc@183 PS5, Line 183: LOG(INFO) << test_case << ": " << HumanReadableNum::ToString(l_per_sec) << " lookups/sec"; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache-bench.cc@184 PS5, Line 184: LOG(INFO) << test_case << ": " << StringPrintf("%.1f", hit_rate * 100.0) << "% hit rate"; line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache.h@127 PS5, Line 127: // Passing EXPECT_IN_CACHE will increment the hit/miss metrics that track the number of times line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache.h@128 PS5, Line 128: // blocks were requested that the users were hoping to get the block from the cache, along with line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/cache.h@131 PS5, Line 131: // This helps in determining if we are effectively caching the blocks that matter the most. line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/nvm_cache.cc File be/src/util/cache/nvm_cache.cc: http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/nvm_cache.cc@8 PS5, Line 8: // This file implements a cache based on the MEMKIND library (http://memkind.github.io/memkind/) line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/nvm_cache.cc@9 PS5, Line 9: // This library makes it easy to program against persistent memory hardware by exposing an API line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/5/be/src/util/cache/nvm_cache.cc@706 PS5, Line 706: int err = CALL_MEMKIND(memkind_create_pmem, FLAGS_nvm_cache_path.c_str(), capacity, ); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 01:18:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15169 to look at the new patch set (#5). Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache The data cache uses Kudu's cache code in be/src/kudu/util. To make it easy to make Impala specific changes (such as introducing new cache eviction algorithms), this copies the code into Impala's utilities directory. To make the review easier, this is a simple copy of the files with a minimal CMakeLists.txt and RAT changes. This does not link the resulting library into anything. A subsequent change will modify the code to be usable by Impala. To test, I verified that this compiles. Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 --- A be/src/util/cache/CMakeLists.txt A be/src/util/cache/cache-bench.cc A be/src/util/cache/cache-test.cc A be/src/util/cache/cache.cc A be/src/util/cache/cache.h A be/src/util/cache/nvm_cache.cc A be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 8 files changed, 2,595 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/15169/5 -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15170 to look at the new patch set (#5). Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation In a previous change, the Kudu caching code from be/src/kudu/util was copied to be/src/util/cache. This gets that code workable for Impala and uses it for the DataCache. This involves the following changes: 1. Fix up includes to point to Impala's cache implementation 2. Remove NVM Cache support and code (not used by Impala) 3. Remove CacheMetrics code (not used by Impala) 4. Move the cache code to the impala namespace 5. Modify cache-bench and cache-test to use Impala's backend test infrastructure 6. Switch DataCache to use be/src/util/cache These are only boilerplate changes. The cache implementation has not meaningfully changed. Testing: - Ran core tests - The modified version of Kudu's cache-test passes - The modified version of Kudu's cache-bench runs successfully Change-Id: I917f8352c9276373dd2761af986bf3487855271c --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/CMakeLists.txt M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-test.cc M be/src/util/cache/cache.cc M be/src/util/cache/cache.h D be/src/util/cache/nvm_cache.cc D be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 13 files changed, 74 insertions(+), 949 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15170/5 -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15179 to look at the new patch set (#4). Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. IMPALA-8690 (prep 3): Factor out common code for cache implementations The existing cache implementation handles LRU/FIFO eviction algorithms, but it also implements several components that are useful for other cache implementations. Specifically, there is a simple hash table (HandleTable) and a simple sharding implementation (ShardedCache). These can be reused across other eviction algorithms by making them generic. This pulls them out of be/src/util/cache/cache.cc and into a new be/src/util/cache/cache-internal.h file. To make the HandleTable generic, this introduces a HandleBase class that contains common code between the implementations (such as the key, the value, the hash, etc). The HandleTable works on this base class, and the RLHandle now derives from HandleBase. To support this, Allocate/Free need to treat the handle as an object (calling constructors/destructors) rather than a treating it like a chunk of memory (or simple struct). ShardedCache is made generic by having cache implementations derive from a base CacheShard class that defines the appropriate methods needed by the sharding class. This is purely an interface, and the base class defines no functions. The existing CacheShard is renamed to RLCachedShard and derives from this class. Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e --- A be/src/util/cache/cache-internal.h M be/src/util/cache/cache.cc M bin/rat_exclude_files.txt 3 files changed, 365 insertions(+), 282 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/15179/4 -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache-bench.cc File be/src/util/cache/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache-bench.cc@176 PS4, Line 176: pair hits_lookups = RunQueryThreads(FLAGS_num_threads, FLAGS_run_seconds); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache-bench.cc@183 PS4, Line 183: LOG(INFO) << test_case << ": " << HumanReadableNum::ToString(l_per_sec) << " lookups/sec"; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache-bench.cc@184 PS4, Line 184: LOG(INFO) << test_case << ": " << StringPrintf("%.1f", hit_rate * 100.0) << "% hit rate"; line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache.h@127 PS4, Line 127: // Passing EXPECT_IN_CACHE will increment the hit/miss metrics that track the number of times line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache.h@128 PS4, Line 128: // blocks were requested that the users were hoping to get the block from the cache, along with line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/cache.h@131 PS4, Line 131: // This helps in determining if we are effectively caching the blocks that matter the most. line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/nvm_cache.cc File be/src/util/cache/nvm_cache.cc: http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/nvm_cache.cc@8 PS4, Line 8: // This file implements a cache based on the MEMKIND library (http://memkind.github.io/memkind/) line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/nvm_cache.cc@9 PS4, Line 9: // This library makes it easy to program against persistent memory hardware by exposing an API line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/4/be/src/util/cache/nvm_cache.cc@706 PS4, Line 706: int err = CALL_MEMKIND(memkind_create_pmem, FLAGS_nvm_cache_path.c_str(), capacity, ); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 01:14:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15169 to look at the new patch set (#4). Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache The data cache uses Kudu's cache code in be/src/kudu/util. To make it easy to make Impala specific changes (such as introducing new cache eviction algorithms), this copies the code into Impala's utilities directory. To make the review easier, this is a simple copy of the files with a minimal CMakeLists.txt and RAT changes. This does not link the resulting library into anything. A subsequent change will modify the code to be usable by Impala. To test, I verified that this compiles. Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 --- A be/src/util/cache/CMakeLists.txt A be/src/util/cache/cache-bench.cc A be/src/util/cache/cache-test.cc A be/src/util/cache/cache.cc A be/src/util/cache/cache.h A be/src/util/cache/nvm_cache.cc A be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 8 files changed, 2,595 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/15169/4 -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 6: Code-Review+1 (1 comment) Thanks @Fang-Yu, please feel free to +2 http://gerrit.cloudera.org:8080/#/c/15123/5/fe/src/main/java/org/apache/impala/analysis/WithClause.java File fe/src/main/java/org/apache/impala/analysis/WithClause.java: http://gerrit.cloudera.org:8080/#/c/15123/5/fe/src/main/java/org/apache/impala/analysis/WithClause.java@95 PS5, Line 95: // Don't need catch block since the exception will be handled by the caller. nit: no need for this comment -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 08 Feb 2020 00:44:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5307/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 00:42:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 00:42:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15159 to look at the new patch set (#8). Change subject: IMPALA-9361: manually configured kerberized minicluster .. IMPALA-9361: manually configured kerberized minicluster The kerberized minicluster is enabled by setting IMPALA_KERBERIZE=true in impala-config-*.sh. After setting it you must run ./bin/create-test-configuration.sh then restart minicluster. This adds a script to partially automate setup of a local KDC, in lieu of the unmaintained minikdc support (which has been ripped out). Testing: I was able to run some queries against pre-created HDFS tables with kerberos enabled. Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee --- M bin/bootstrap_toolchain.py M bin/create-test-configuration.sh M bin/impala-config.sh A bin/kerberos/README-kerberos.md A bin/kerberos/experimental-kerberos-setup.sh M bin/rat_exclude_files.txt M bin/start-impala-cluster.py M buildall.sh M fe/src/test/resources/hive-site.xml.py D testdata/bin/minikdc.sh M testdata/bin/minikdc_env.sh M testdata/bin/run-all.sh M testdata/bin/run-hbase.sh M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py M testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py 16 files changed, 188 insertions(+), 301 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/15159/8 -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 00:42:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 7: Also hit IMPALA-9368 -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 08 Feb 2020 00:39:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15179 ) Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5172/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 00:33:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 08 Feb 2020 00:32:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5306/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 08 Feb 2020 00:32:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5170/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 00:30:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 6: (1 comment) > Patch Set 6: > > (1 comment) Hi Bikram and Wenzhe, I have verified Wenzhe's explanation and think what Wenzhe described is correct. Let me know if you have any other comments. Thanks! http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java File fe/src/main/java/org/apache/impala/analysis/WithClause.java: http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java@96 PS6, Line 96: finally { : // Register all privilege requests made from the root analyzer to the input : // analyzer so that caller could do authorization for all the requests collected : // during analysis and report an authorization error over an analysis error. : // We should not accidentally reveal the non-existence of a database/table if : // the user is not authorized. : for (PrivilegeRequest req : withClauseAnalyzer.getPrivilegeReqs()) { : analyzer.registerPrivReq(req); : } : } > Look into the code. The local views will be used for further analysis, audi Thanks to Wenzhe for the detailed explanation! I have checked the code path again and think what Wenzhe described is correct. That is, even we moved the code that registers the local views (that for-loop) and the code that adds access events to the finally block. These 2 pieces of information will not be processed afterwards when there is an AnalysisException thrown. Specifically, once there is an AnalysisException thrown, this exception will be caught at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L416. If later on there is an AuthorizationException thrown, that exception will be caught at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L432. In either case, any statement after analysisCtx.analyzeAndAuthorize() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/Frontend.java#L1536 will not be executed, including the code that would processing the audit events at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/Frontend.java#L1540. On the other hand, I am answering my previous question myself. If on the Ranger server we grant the user the SELECT privilege of a non-existing resource, then we will see an AuthorizationException, which is not what I had expected before. I expected to see an AnalysisException. But this issue/phenomenon is already there in Impala and thus is independent of Wenzhe's patch. -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 08 Feb 2020 00:16:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5169/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 08 Feb 2020 00:11:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15170 ) Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5171/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 08 Feb 2020 00:00:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15154 ) Change subject: IMPALA-8712: Make ExecQueryFInstances async .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5168/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15154 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32 Gerrit-Change-Number: 15154 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 07 Feb 2020 23:48:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 6: Code-Review+1 > Patch Set 6: > > (1 comment) Thanks to Wenzhe for the detailed explanation! I checked the code path again myself. I think what Wenzhe described is correct. Specifically, even though we moved the code that registers the local views (that for-loop) and the statement that adds the audit events to the finally block, if there is an AnalysisException thrown, in the end the information about those previously registered local views and added audit events will not be processed. To be precise, the AnalysisException will be caught at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L416. If later on there is an AuthorizationException thrown, it will be caught at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L433. As long as there is an exception thrown, anything after https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/Frontend.java#L1536 will not be executed, including the code that processes the access events at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/Frontend.java#L1540. -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 07 Feb 2020 23:48:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Joe McDonnell has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15179 ) Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. IMPALA-8690 (prep 3): Factor out common code for cache implementations The existing cache implementation handles LRU/FIFO eviction algorithms, but it also implements several components that are useful for other cache implementations. Specifically, there is a simple hash table (HandleTable) and a simple sharding implementation (ShardedCache). These can be reused across other eviction algorithms by making them generic. This pulls them out of be/src/util/cache/cache.cc and into a new be/src/util/cache/cache-internal.h file. To make the HandleTable generic, this introduces a HandleBase class that contains common code between the implementations (such as the key, the value, the hash, etc). The HandleTable works on this base class, and the RLHandle now derives from HandleBase. To support this, Allocate/Free need to treat the handle as an object (calling constructors/destructors) rather than a treating it like a chunk of memory (or simple struct). ShardedCache is made generic by having cache implementations derive from a base CacheShard class that defines the appropriate methods needed by the sharding class. This is purely an interface, and the base class defines no functions. The existing CacheShard is renamed to RLCachedShard and derives from this class. Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e --- A be/src/util/cache/cache-internal.h M be/src/util/cache/cache.cc M bin/rat_exclude_files.txt 3 files changed, 365 insertions(+), 282 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/15179/3 -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Joe McDonnell has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15170 ) Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation In a previous change, the Kudu caching code from be/src/kudu/util was copied to be/src/util/cache. This gets that code workable for Impala and uses it for the DataCache. This involves the following changes: 1. Fix up includes to point to Impala's cache implementation 2. Remove NVM Cache support and code (not used by Impala) 3. Remove CacheMetrics code (not used by Impala) 4. Move the cache code to the impala namespace 5. Modify cache-bench and cache-test to use Impala's backend test infrastructure 6. Switch DataCache to use be/src/util/cache These are only boilerplate changes. The cache implementation has not meaningfully changed. Testing: - Ran core tests - The modified version of Kudu's cache-test passes - The modified version of Kudu's cache-bench runs successfully Change-Id: I917f8352c9276373dd2761af986bf3487855271c --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/CMakeLists.txt M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-test.cc M be/src/util/cache/cache.cc M be/src/util/cache/cache.h D be/src/util/cache/nvm_cache.cc D be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 13 files changed, 74 insertions(+), 949 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15170/4 -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 3: (9 comments) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache-bench.cc File be/src/util/cache/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache-bench.cc@176 PS3, Line 176: pair hits_lookups = RunQueryThreads(FLAGS_num_threads, FLAGS_run_seconds); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache-bench.cc@183 PS3, Line 183: LOG(INFO) << test_case << ": " << HumanReadableNum::ToString(l_per_sec) << " lookups/sec"; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache-bench.cc@184 PS3, Line 184: LOG(INFO) << test_case << ": " << StringPrintf("%.1f", hit_rate * 100.0) << "% hit rate"; line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache.h@127 PS3, Line 127: // Passing EXPECT_IN_CACHE will increment the hit/miss metrics that track the number of times line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache.h@128 PS3, Line 128: // blocks were requested that the users were hoping to get the block from the cache, along with line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/cache.h@131 PS3, Line 131: // This helps in determining if we are effectively caching the blocks that matter the most. line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/nvm_cache.cc File be/src/util/cache/nvm_cache.cc: http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/nvm_cache.cc@8 PS3, Line 8: // This file implements a cache based on the MEMKIND library (http://memkind.github.io/memkind/) line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/nvm_cache.cc@9 PS3, Line 9: // This library makes it easy to program against persistent memory hardware by exposing an API line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/3/be/src/util/cache/nvm_cache.cc@706 PS3, Line 706: int err = CALL_MEMKIND(memkind_create_pmem, FLAGS_nvm_cache_path.c_str(), capacity, ); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Feb 2020 23:46:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Joe McDonnell has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache The data cache uses Kudu's cache code in be/src/kudu/util. To make it easy to make Impala specific changes (such as introducing new cache eviction algorithms), this copies the code into Impala's utilities directory. To make the review easier, this is a simple copy of the files with a minimal CMakeLists.txt and RAT changes. This does not link the resulting library into anything. A subsequent change will modify the code to be usable by Impala. To test, I verified that this compiles. Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 --- A be/src/util/cache/CMakeLists.txt A be/src/util/cache/cache-bench.cc A be/src/util/cache/cache-test.cc A be/src/util/cache/cache.cc A be/src/util/cache/cache.h A be/src/util/cache/nvm_cache.cc A be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 8 files changed, 2,595 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/15169/3 -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 3: Code-Review+2 This looks good to me -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 07 Feb 2020 23:37:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15178/2/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/15178/2/be/src/service/impala-http-handler.cc@1001 PS2, Line 1001: // Don't add the following fields if they're 0 so that we won't display the : // corresponding tables if they would be empty. : if (num_quiescing_backends > 0) { : document->AddMember( : "num_quiescing_backends", num_quiescing_backends, document->GetAllocator()); : } : if (num_blacklisted_backends > 0) { : > For my own understanding, if num_quiescing_backends is zero and we don't ad Right, I figured it would be better not to clutter up the page with the extra empty tables in the common case where the cluster is healthy. Added a comment here explaining. -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 07 Feb 2020 23:24:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15178 to look at the new patch set (#3). Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. IMPALA-9243: Add info about blacklisting decisions to the webui This patch adds information about blacklisting decisions to the /backends webui endpoint. For the JSON, it adds an 'is_blacklisted' field to all backends, and for and backends where 'is_blacklisted' is true it adds a 'blacklist_cause' field indicating the error status that led to the backend getting blacklisted and an 'blacklist_time_remaining' field indiciating how much longer the backend will remain on the blacklist. It also adds counts for the number of blacklisted and quiescing backends, if any, and the number of active (i.e. all other) backends. For display, in order to prevent the table of backend information from having too many columns (prior to this patch it already had 12), it separates blacklisted, quiescing, and active backends into three separate table, with the blacklisted and quiescing tables only getting displayed if there are any such backends. Additionally, tooltips are added next to the headers for the blacklisted and quiescing tables that provide a brief explanation of what it means for a backend to appear on there lists. Using separate tables also facilitates having state-specific columns - the blacklisted table displays columns for the blacklist cause and time remaining. Future work could consider adding columns to the quiescing table, such as time until the grace period and deadline expires. Testing: - Manually ran various quiescing/blacklisting scenarios and confirmed the /backends page displays as expected. - Added cases to test_web_pages (to verify the new fields when nothing is blacklisted) and test_blacklist. Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 --- M be/src/runtime/coordinator.cc M be/src/scheduling/cluster-membership-mgr-test.cc M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/scheduling/executor-blacklist.cc M be/src/scheduling/executor-blacklist.h M be/src/service/impala-http-handler.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_blacklist.py M tests/webserver/test_web_pages.py M www/backends.tmpl A www/blacklisted_tooltip.txt A www/quiescing_tooltip.txt 13 files changed, 208 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15178/3 -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15170 ) Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5166/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Feb 2020 23:21:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5165/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Feb 2020 23:20:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15179 ) Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5167/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Feb 2020 23:20:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15154 ) Change subject: IMPALA-8712: Make ExecQueryFInstances async .. Patch Set 3: (15 comments) http://gerrit.cloudera.org:8080/#/c/15154/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15154/2//COMMIT_MSG@10 PS2, Line 10: Previously, Impala would issue all the Exec()s, wait for all of them : to complete, and then check if any of them resulted in an error. We : now stop issuing Exec()s and cancel any that are still in flight as : soon as an error occurs. > so this covers IMPALA-6788 as well, right? Not completely - there's still the case where we get an error report from a running fragment while still doing startup. With this patch, we'll still wait for all Exec()s to complete before processing the error. I added a TODO with this JIRA to call this out. http://gerrit.cloudera.org:8080/#/c/15154/2//COMMIT_MSG@23 PS2, Line 23: Removing this thread pool has potential performance implications, as : it means that the Exec() parameters are serialized in serialize rather : than in parallel (with the level of parallelism determined by the size : of the thread pool, which was configurable by an Advanced flag and : defaulted to 12). > do we have a runtime counter to measure this overhead? Yes, we record the amount of time query startup takes, the time between when we mark the event "Ready to start on X backends" to when we mark "All X execution backends started" http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@82 PS2, Line 82: Return true > nit: "Returns true" Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@96 PS2, Line 96: after WaitOnExecRpc( > nit: needs to be updated Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@102 PS2, Line 102: Exec > nit: should the docs of this method be updated to indicate that this method Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@108 PS2, Line 108: imes and f > might want to mention some more docs about thread safety - e.g. is a client Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@369 PS2, Line 369: : > same as below Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.h@375 PS2, Line 375: : int64_t last_report_time_ms_ = 0; > i believe this isn't thread safe. even if it is no longer set after Exec() Went ahead and protected this with 'lock_' always. It does mean I removed some DCHECKs on it in places where we're not taking 'lock_' otherwise, but I don't think that's a big deal. http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.cc@a452 PS2, Line 452: : : : : : : : : : : : : > looks like dead code, but are any of the TODOs still relevant? Sort of. I basically replaced this with the large comment I added earlier in this function, which addresses the same basic issue but better reflects the actual current state of things. http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.cc@173 PS2, Line 173: > i believe we generally try to notify any waiting threads after the lock has Yes, there can potentially be multiple calls to WaitOnExec(), eg. if there are multiple calls to Coordinator::WaitOnExecRpcs(), as discussed in another comment. http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.cc@193 PS2, Line 193: > nit: won't this be called twice, once in SetExecError and again when method Done http://gerrit.cloudera.org:8080/#/c/15154/2/be/src/runtime/coordinator-backend-state.cc@228 PS2, Line 228: if (IsEmptyBackend > is this thread safe? this gets set without holding onto the lock, and its p So I went ahead and fixed this by holding 'lock_' for all of the Exec() function, rather than just taking it right before sending the rpc. Its nice, because since we also hold lock_ for all of Cancel(), it significantly simplifies reasoning about possible interleavings between those functions. The downside is that we can't stop ourselves from sending an exec rpc on to backend if we've already started executing BackendState::Exec(), eg. if we realize we hit an error from a different backend while
[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15154 to look at the new patch set (#3). Change subject: IMPALA-8712: Make ExecQueryFInstances async .. IMPALA-8712: Make ExecQueryFInstances async This patch refactors the ExecQueryFInstances rpc to be asychronous. Previously, Impala would issue all the Exec()s, wait for all of them to complete, and then check if any of them resulted in an error. We now stop issuing Exec()s and cancel any that are still in flight as soon as an error occurs. It also performs some cleanup around the thread safety of Coordinator::BackendState, including adding comments and DCHECKS. === Exec RPC Thread Pool === This patch also removes the 'exec_rpc_thread_pool_' from ExecEnv. This thread pool was used to partially simulate async Exec() prior to the switch to KRPC, which provides built-in async rpc capabilities. Removing this thread pool has potential performance implications, as it means that the Exec() parameters are serialized in serialize rather than in parallel (with the level of parallelism determined by the size of the thread pool, which was configurable by an Advanced flag and defaulted to 12). To ensure we don't regress query startup times, I did some performance testing. All tests were done on a 10 node cluster. The baseline used for the tests did not include IMPALA-9181, a perf optimization for query startup done to facilitate this work. I ran TPCH 100 at concurrency levels of 1, 4, and 8 and extracted the query startup times from the profiles. For each concurrency level, the average regression in query startup time was < 2ms. Because query e2e running time was much longer than this, there was no noticable change in total query time. I also ran a 'worst case scenario' with a table with 10,000 pertitions to create a very large Exec() payload to serialize (~1.21MB vs. ~10KB-30KB for TPCH 100). Again, change in query startup time was neglible. TODO: once IMPALA-9335 (krpc rebase) goes in, the change in be/src/kudu/rpc/connection.cc can be removed from this patch. Testing: - Added a e2e test that verifies that a query where an Exec() fails doesn't wait for all Exec()s to complete before cancelling and returning the error to the client. Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32 --- M be/src/common/global-flags.cc M be/src/kudu/rpc/connection.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/util/counting-barrier.h M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 12 files changed, 442 insertions(+), 267 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15154/3 -- To view, visit http://gerrit.cloudera.org:8080/15154 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32 Gerrit-Change-Number: 15154 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#12). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass python e2e tests irrepsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the test themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - thrift_sasl.py was updated to match the current public alpha, 0.4a1 - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.) What ultimately seemed like a better approach
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15169 ) Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache-bench.cc File be/src/util/cache/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache-bench.cc@176 PS2, Line 176: pair hits_lookups = RunQueryThreads(FLAGS_num_threads, FLAGS_run_seconds); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache-bench.cc@183 PS2, Line 183: LOG(INFO) << test_case << ": " << HumanReadableNum::ToString(l_per_sec) << " lookups/sec"; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache-bench.cc@184 PS2, Line 184: LOG(INFO) << test_case << ": " << StringPrintf("%.1f", hit_rate * 100.0) << "% hit rate"; line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache.h@127 PS2, Line 127: // Passing EXPECT_IN_CACHE will increment the hit/miss metrics that track the number of times line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache.h@128 PS2, Line 128: // blocks were requested that the users were hoping to get the block from the cache, along with line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/cache.h@131 PS2, Line 131: // This helps in determining if we are effectively caching the blocks that matter the most. line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/nvm_cache.cc File be/src/util/cache/nvm_cache.cc: http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/nvm_cache.cc@8 PS2, Line 8: // This file implements a cache based on the MEMKIND library (http://memkind.github.io/memkind/) line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/nvm_cache.cc@9 PS2, Line 9: // This library makes it easy to program against persistent memory hardware by exposing an API line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/15169/2/be/src/util/cache/nvm_cache.cc@706 PS2, Line 706: int err = CALL_MEMKIND(memkind_create_pmem, FLAGS_nvm_cache_path.c_str(), capacity, ); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 07 Feb 2020 22:35:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15169 Change subject: IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache .. IMPALA-8960 (prep 1): Copy Kudu cache code to be/src/util/cache The data cache uses Kudu's cache code in be/src/kudu/util. To make it easy to make Impala specific changes (such as introducing new cache eviction algorithms), this copies the code into Impala's utilities directory. To make the review easier, this is a simple copy of the files with a minimal CMakeLists.txt and RAT changes. This does not link the resulting library into anything. A subsequent change will modify the code to be usable by Impala. To test, I verified that this compiles. Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 --- A be/src/util/cache/CMakeLists.txt A be/src/util/cache/cache-bench.cc A be/src/util/cache/cache-test.cc A be/src/util/cache/cache.cc A be/src/util/cache/cache.h A be/src/util/cache/nvm_cache.cc A be/src/util/cache/nvm_cache.h M bin/rat_exclude_files.txt 8 files changed, 2,594 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/15169/2 -- To view, visit http://gerrit.cloudera.org:8080/15169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0e7b8b852b56613803dea125456fa9e8b3736d76 Gerrit-Change-Number: 15169 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15170 Change subject: IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation .. IMPALA-8690 (prep 2): Switch DataCache to Impala's cache implementation In a previous change, the Kudu caching code from be/src/kudu/util was copied to be/src/util/cache. This gets that code workable for Impala and uses it for the DataCache. This involves the following changes: 1. Fix up includes to point to Impala's cache implementation 2. Remove NVM Cache support and code (not used by Impala) 3. Remove CacheMetrics code (not used by Impala) 4. Move the cache code to the impala namespace 5. Modify cache-bench and cache-test to use Impala's backend test infrastructure 6. Switch DataCache to use be/src/util/cache These are only boilerplate changes. The cache implementation has not meaningfully changed. Testing: - Ran core tests - The modified version of Kudu's cache-test passes - The modified version of Kudu's cache-bench runs successfully Change-Id: I917f8352c9276373dd2761af986bf3487855271c --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/CMakeLists.txt M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-test.cc M be/src/util/cache/cache.cc M be/src/util/cache/cache.h D be/src/util/cache/nvm_cache.cc D be/src/util/cache/nvm_cache.h 12 files changed, 74 insertions(+), 948 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15170/3 -- To view, visit http://gerrit.cloudera.org:8080/15170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I917f8352c9276373dd2761af986bf3487855271c Gerrit-Change-Number: 15170 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8690 (prep 3): Factor out common code for cache implementations
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15179 Change subject: IMPALA-8690 (prep 3): Factor out common code for cache implementations .. IMPALA-8690 (prep 3): Factor out common code for cache implementations The existing cache implementation handles LRU/FIFO eviction algorithms, but it also implements several components that are useful for other cache implementations. Specifically, there is a simple hash table (HandleTable) and a simple sharding implementation (ShardedCache). These can be reused across other eviction algorithms by making them generic. This pulls them out of be/src/util/cache/cache.cc and into a new be/src/util/cache/cache-internal.h file. To make the HandleTable generic, this introduces a HandleBase class that contains common code between the implementations (such as the key, the value, the hash, etc). The HandleTable works on this base class, and the RLHandle now derives from HandleBase. To support this, Allocate/Free need to treat the handle as an object (calling constructors/destructors) rather than a treating it like a chunk of memory (or simple struct). ShardedCache is made generic by having cache implementations derive from a base CacheShard class that defines the appropriate methods needed by the sharding class. This is purely an interface, and the base class defines no functions. The existing CacheShard is renamed to RLCachedShard and derives from this class. Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e --- A be/src/util/cache/cache-internal.h M be/src/util/cache/cache.cc M bin/rat_exclude_files.txt 3 files changed, 365 insertions(+), 282 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/15179/2 -- To view, visit http://gerrit.cloudera.org:8080/15179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I67294244a3e8a2812f1482fe786bf7f8e6ce054e Gerrit-Change-Number: 15179 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#11). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass python e2e tests irrepsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the test themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - thrift_sasl.py was updated to match the current public alpha, 0.4a1 - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.)
[Impala-ASF-CR] IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15186 ) Change subject: IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters .. Patch Set 1: (1 comment) I checked the other callers of CastPtrToLlvmPtr() to make sure they were all pointers to ScalarExpr instead of per-finstance objects. CastPtrToLlvmPtr() is still an anti-pattern and an obstacle to things like codegen caching, but we don't need to fix it right now. Maybe file a follow-on JIRA to remove it entirely? Also I'd often ask for perf benchmarking for changes like this, but we've done this transformation often enough that I'm confident it won't make a difference perf-wise. http://gerrit.cloudera.org:8080/#/c/15186/1/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/15186/1/be/src/exec/partitioned-hash-join-builder.cc@638 PS1, Line 638: FilterContext filter_ctxs[], TupleRow* build_row) noexcept { DCHECK that filter_ctx is referring to the same array as filter_ctx_? -- To view, visit http://gerrit.cloudera.org:8080/15186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09037b5832b037536bde9324f7a2a1077854 Gerrit-Change-Number: 15186 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 21:55:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table
Yanjia Gary Li has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 ) Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table .. Patch Set 22: (1 comment) http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README@489 PS22, Line 489: `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0_1-5-10_20200112194517.parquet` : `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0` is the bloom index hash of this file : `20200112194517` is the timestamp of this version > I agree with Csaba, and it seems we can easily make the file sizes smaller. Thanks for pointing this out. I definitely agree here. Those parquet files are generated by a test in Hudi and the bloom.num_entries was set as default 6. I am not familiar with the indexing part of Hudi's code so I am not sure if this is using any built-in bloom filter feature of PARQUET. But reducing this number to 100 will makes each parquet file to ~10KB. If this size is acceptable then I will update those files. -- To view, visit http://gerrit.cloudera.org:8080/14711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf Gerrit-Change-Number: 14711 Gerrit-PatchSet: 22 Gerrit-Owner: Yanjia Gary Li Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yanjia Gary Li Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 21:48:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5305/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 4 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 4 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 3: Code-Review+2 Carrying forward the +1 votes. -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 3 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java File fe/src/main/java/org/apache/impala/analysis/WithClause.java: http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java@96 PS6, Line 96: finally { : // Register all privilege requests made from the root analyzer to the input : // analyzer so that caller could do authorization for all the requests collected : // during analysis and report an authorization error over an analysis error. : // We should not accidentally reveal the non-existence of a database/table if : // the user is not authorized. : for (PrivilegeRequest req : withClauseAnalyzer.getPrivilegeReqs()) { : analyzer.registerPrivReq(req); : } : } > After taking a closer look at this patch, I have a question about what we s Look into the code. The local views will be used for further analysis, audit events will be used to create base exec request after analyzeAndAuthorize() is successfully return. If the user has been granted to access the whole server, authzChecker.authorize() will not throw exception for non existing database/table. Then analyzeAndAuthorize() will throw AnalysisException. This cause frontend to return analysis error without any further analysis, nor executing the request. The local views and audit events will not be referenced after analyze() throw AnalysisException. So it's not necessary to register local view and add events in the finally block in WithClause.analyze(). -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 07 Feb 2020 21:46:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15186 ) Change subject: IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5164/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09037b5832b037536bde9324f7a2a1077854 Gerrit-Change-Number: 15186 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 21:43:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15111 ) Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 6 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 07 Feb 2020 21:37:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15111 ) Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. IMPALA-8587: Show inherited privileges with Ranger show grant Previously when executing a SHOW GRANT statement on a resource with Ranger authorization enabled, Impala would not show inherited privileges. For example, consider a user 'foo' with database-level privileges granted by: GRANT SELECT ON DATABASE db TO USER foo; If later on we would like to retrieve the table-level privileges associated with the user 'foo' by: SHOW GRANT USER foo ON TABLE db.table; We would not see any result before this change. After this change, the related privileges including the inherited privileges with regard to the specified resource will be shown. In our example described above, we will see the following result and therefore the result returned by SHOW GRANT statement is more informative than the case in which only the privileges on 'db'.'table' were shown. Notice that in the following returned result, we are also able to know the specified user's privileges on any other table under the database 'db'. +++--+---++-+-+---+--+---+ | principal_type | principal_name | database | table | column | uri | udf | privilege | grant_option | create_time | +++--+---++-+-+---+--+---+ | USER | foo| db | * | * | | | select| false| 1580174954746 | +++--+---++-+-+---+--+---+ Testing - Ran all FE tests - Ran all authorization E2E tests - Added E2E tests in test_ranger verifying functionality Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Reviewed-on: http://gerrit.cloudera.org:8080/15111 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java M tests/authorization/test_ranger.py 2 files changed, 235 insertions(+), 70 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 7 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5163/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 10 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 21:18:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5162/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 9 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 21:11:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15186 Change subject: IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters .. IMPALA-9366: Remove embedded pointer reference in PhjBuilder::CodegenInsertRuntimeFilters The handcrafted codegen method in PhjBuilder::CodegenInsertRuntimeFilters embeds direct pointer references to filter contexts. This patch replaces those with references to the input parameters instead. Testing: Successfully ran core tests for joins and runtime filters. Change-Id: I09037b5832b037536bde9324f7a2a1077854 --- M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h 3 files changed, 28 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/15186/1 -- To view, visit http://gerrit.cloudera.org:8080/15186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I09037b5832b037536bde9324f7a2a1077854 Gerrit-Change-Number: 15186 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 10: (3 comments) OK, this is really the last revision to get it ready for final review. I was so focused on getting the minicluster tests to pass that I neglected to check kerberized connections. minicluster tests are passing, and I've confirmed that I can connect to both a kerberized cluster, and LDAP-protected cluster over HTTP. http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py File shell/thrift_sasl.py: http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@27 PS9, Line 27: > flake8: E501 line too long (97 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@62 PS9, Line 62: > flake8: E261 at least two spaces before inline comment Done http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@64 PS9, Line 64: > flake8: E261 at least two spaces before inline comment Done -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 10 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 20:51:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#10). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This is patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass python e2e tests repsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the test themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - thrift_sasl.py was updated to match the current public alpha, 0.4a1 - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.)
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py File shell/thrift_sasl.py: http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@27 PS9, Line 27: n flake8: E501 line too long (97 > 90 characters) http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@62 PS9, Line 62: flake8: E261 at least two spaces before inline comment http://gerrit.cloudera.org:8080/#/c/15132/9/shell/thrift_sasl.py@64 PS9, Line 64: flake8: E261 at least two spaces before inline comment -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 9 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 20:26:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#9). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This is patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass python e2e tests repsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the test themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - thrift_sasl.py was updated to match the current public alpha, 0.4a1 - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.)
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5161/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 8 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 20:01:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15159 ) Change subject: IMPALA-9361: manually configured kerberized minicluster .. Patch Set 7: This hit IMPALA-9152 again and another isuse. -- To view, visit http://gerrit.cloudera.org:8080/15159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee Gerrit-Change-Number: 15159 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 19:57:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 2: Code-Review+1 (1 comment) This looks good to me. One question for my own understanding. If no one else wants to review, I will bump to +2. http://gerrit.cloudera.org:8080/#/c/15178/2/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/15178/2/be/src/service/impala-http-handler.cc@1001 PS2, Line 1001: if (num_quiescing_backends > 0) { : document->AddMember( : "num_quiescing_backends", num_quiescing_backends, document->GetAllocator()); : } : if (num_blacklisted_backends > 0) { : document->AddMember( : "num_blacklisted_backends", num_blacklisted_backends, document->GetAllocator()); : } For my own understanding, if num_quiescing_backends is zero and we don't add the member, what does this code in www/backends.tmpl do? {{#num_quiescing_backends}} ... content ... {{/num_quiescing_backends}} I assume this would skip the whole chunk. Is that right? The context is I'm wondering whether it matters that we don't add these members vs adding them with a count of zero. -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 07 Feb 2020 19:37:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Fri, 07 Feb 2020 19:15:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/7/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/15132/7/tests/shell/test_shell_commandline.py@1012 PS7, Line 1012: > flake8: W391 blank line at end of file Done -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 8 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 19:12:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#8). Change subject: IMPALA-3343: Make impala-shell compatible with python 3. .. IMPALA-3343: Make impala-shell compatible with python 3. This is patch makes the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass python e2e tests repsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. There are a few isolated tests that weren't able to pass under both versions, and the reasons have been documented in comments in the test themselves. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage: USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.) What ultimately seemed like a better approach was to try to weed out
[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15087 ) Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5160/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 Gerrit-Change-Number: 15087 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 07 Feb 2020 19:10:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15178 ) Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5159/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 07 Feb 2020 19:06:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5158/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 3 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 07 Feb 2020 19:02:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8852: Skip short-circuit config check for coordinator-only mode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15173 ) Change subject: IMPALA-8852: Skip short-circuit config check for coordinator-only mode .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5157/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 Gerrit-Change-Number: 15173 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 07 Feb 2020 18:51:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types
Wenzhe Zhou has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15087 ) Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types .. IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types This patch adds validation for the paired stats values of tinyint and smallint column data type when reading min/max column stats value from Parquet file. Testing: - Did Manual tests: create table with column as int type, intert some values, then alter table to change the column data type as tinyint (int8), insert more values, verify that the query return correct number of rows when PARQUET_READ_STATISTICS is set as 1. Did similar tests to change column data type from int to smallint, and from smallint to tinyint. - Added automatic test cases in parquet-stats.test for column data type been changed from int to tinyint, from smallint to tinyint and from int to smallint. - Passed EE tests. - Passed all core tests. Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test 4 files changed, 121 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/15087/8 -- To view, visit http://gerrit.cloudera.org:8080/15087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 Gerrit-Change-Number: 15087 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/15087 ) Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc File be/src/exec/parquet/parquet-column-stats.cc: http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc@100 PS7, Line 100: return false; > here + for SMALLINT: Fits to a one line if without braces. will remove braces and make the code to one line. http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc@102 PS7, Line 102: ret = ColumnStats::DecodePlainValue(*paired_stats_value, : _stats_val, parquet::Type::INT32); > here + for SMALLINT: doesn't need to be in an else block, as the "if" block will remove else. http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test: http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@690 PS7, Line 690: i > An idea to improve readability: i / j / k could be renamed to contain their Will change column names as suggested. http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@717 PS7, Line 717: 200 > Here and in other = predicates: it seems more interesting to test for the o will change code to check overlown value. -- To view, visit http://gerrit.cloudera.org:8080/15087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 Gerrit-Change-Number: 15087 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 07 Feb 2020 18:44:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/15157 ) Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 3 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 07 Feb 2020 18:41:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15178 to look at the new patch set (#2). Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui .. IMPALA-9243: Add info about blacklisting decisions to the webui This patch adds information about blacklisting decisions to the /backends webui endpoint. For the JSON, it adds an 'is_blacklisted' field to all backends, and for and backends where 'is_blacklisted' is true it adds a 'blacklist_cause' field indicating the error status that led to the backend getting blacklisted and an 'blacklist_time_remaining' field indiciating how much longer the backend will remain on the blacklist. It also adds counts for the number of blacklisted and quiescing backends, if any, and the number of active (i.e. all other) backends. For display, in order to prevent the table of backend information from having too many columns (prior to this patch it already had 12), it separates blacklisted, quiescing, and active backends into three separate table, with the blacklisted and quiescing tables only getting displayed if there are any such backends. Additionally, tooltips are added next to the headers for the blacklisted and quiescing tables that provide a brief explanation of what it means for a backend to appear on there lists. Using separate tables also facilitates having state-specific columns - the blacklisted table displays columns for the blacklist cause and time remaining. Future work could consider adding columns to the quiescing table, such as time until the grace period and deadline expires. Testing: - Manually ran various quiescing/blacklisting scenarios and confirmed the /backends page displays as expected. - Added cases to test_web_pages (to verify the new fields when nothing is blacklisted) and test_blacklist. Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 --- M be/src/runtime/coordinator.cc M be/src/scheduling/cluster-membership-mgr-test.cc M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/scheduling/executor-blacklist.cc M be/src/scheduling/executor-blacklist.h M be/src/service/impala-http-handler.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_blacklist.py M tests/webserver/test_web_pages.py M www/backends.tmpl A www/blacklisted_tooltip.txt A www/quiescing_tooltip.txt 13 files changed, 206 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15178/2 -- To view, visit http://gerrit.cloudera.org:8080/15178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47 Gerrit-Change-Number: 15178 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns
Hello Quanlong Huang, Anurag Mantripragada, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15157 to look at the new patch set (#3). Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of columns .. IMPALA-9358: Query slowdown with inline views and hundreds of columns IMPALA-8386 introduced an expensive precondition check using the function ExprSubstitutionMap.checkComposedFrom(). This check has significant performance impact on statements that contain inline views with hundreds of columns. Most of the cost is in the get() calls used to find expressions in the local substitution map. The fix is to add a getWithHint() call that uses the current loop index as a starting point to search for expressions. This leverages the fact that expressions have identical positions in both substitution maps in most common cases. A more generic approach would be to accelerate expression equality search using hash functions but that would be a much riskier fix and Impala currently lacks the infrasturucture to so. Testing: Performance testing with a query with 1000 expressions of the following form: with a as (select c1 c1, c1 c2, c1 c3, ... from t) select c1, c2, c3, ... from a; repro query went from 12 sec to 1 sec. There was no noticeable time spent in the precondition now. Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc --- M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java 1 file changed, 16 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15157/3 -- To view, visit http://gerrit.cloudera.org:8080/15157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc Gerrit-Change-Number: 15157 Gerrit-PatchSet: 3 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8852: Skip short-circuit config check for coordinator-only mode
Tamas Mate has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15173 ) Change subject: IMPALA-8852: Skip short-circuit config check for coordinator-only mode .. IMPALA-8852: Skip short-circuit config check for coordinator-only mode ImpalaD should not abort when running in coordinator-only mode and DataNode is not available on the host. This change adds a condition to skip the short- circuit socket path directory checks when ImpalaD is started with 'is_executor=false' flag. Testing: Added test to JniFrontendTest.java to verify the short-circuit directory check is skipped if ImpalaD is started in coordinator-only mode. Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 --- M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 2 files changed, 33 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/15173/3 -- To view, visit http://gerrit.cloudera.org:8080/15173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 Gerrit-Change-Number: 15173 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-8852: Skip short-circuit config check for coordinator-only mode
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/15173 ) Change subject: IMPALA-8852: Skip short-circuit config check for coordinator-only mode .. Patch Set 2: (11 comments) Thank you for the review and pointers, Anurag, Andrew. It looks better now. Skipped the whole check earlier because it seemed reasonable, noticed that keeping the earlier checks is useful, incorrect clusterwide hdfs configuration will prevent only coordinator starts. http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@7 PS2, Line 7: Skipping > Nit: Change to Skip? Done http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@10 PS2, Line 10: DataNode is not avaiable on the hosts. This change adds a condition to > Nit: typo: available Done http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@13 PS2, Line 13: > Usually, our commit messages also have a section "Testing" to mention the t Thank you for the pointers, added the Testing section. http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@719 PS2, Line 719: , > Could you also mention here that this will skip the check if it is coordina Done http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@723 PS2, Line 723: if (!BackendConfig.INSTANCE.getBackendCfg().is_executor) { > Should this check move after the test for DFS_CLIENT_READ_SHORTCIRCUIT_KEY? Refactored the check to only affect the directory checks. http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@724 PS2, Line 724: LOG.info("Coordinator only instance will not read local data via " + > Nit "Coordinator-only" is clearer I think Done http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@725 PS2, Line 725: "short-circuit reads."); > We use clang-format to format Java code (as well as C++) Thanks for the pointers, installed and ran on the changed parts. http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@806 PS2, Line 806: > Nit: Extra space. Done http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java: http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@66 PS2, Line 66: testCheckShortCircuitReadShouldReturnErrorForInvalidConfig() > Would it be possible to consolidate this and the next into a single test? S Done http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@68 PS2, Line 68: // GIVEN > I wasn't familiar with this given-when-then style but I googled it and lear It can improve the readability of the tests. Agree, it is not in harmony with the codebase, removed them when consolidated the tests. http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@90 PS2, Line 90: conf.setStrings("dfs.domain.socket.path", ""); > Don't we need to set dfs.client.read.shortcircuit = true here? Is the defau It was left empty to have some result from 'checkShortCircuitRead' that can be checked. In the new test I had to fill it with some value to test the new code path. -- To view, visit http://gerrit.cloudera.org:8080/15173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5 Gerrit-Change-Number: 15173 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 07 Feb 2020 18:04:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15111 ) Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 6 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 07 Feb 2020 16:47:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15111 ) Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5303/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 6 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 07 Feb 2020 16:47:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 ) Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table .. Patch Set 22: (3 comments) The verify job failed. The upside is that we have the opportunity to lower the file sizes as Csaba suggested. I also added comments of the test failures. http://gerrit.cloudera.org:8080/#/c/14711/22/be/src/service/query-options-test.cc File be/src/service/query-options-test.cc: http://gerrit.cloudera.org:8080/#/c/14711/22/be/src/service/query-options-test.cc@223 PS22, Line 223: ) you need to add HUDI_PARQUET here http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README@489 PS22, Line 489: `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0_1-5-10_20200112194517.parquet` : `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0` is the bloom index hash of this file : `20200112194517` is the timestamp of this version > I would prefer not to add such large (~0.5MB) files to the .git repo if pos I agree with Csaba, and it seems we can easily make the file sizes smaller. I'm not sure if we can disable bloom filter writing based on Hudi's source, but the following options should definitely lower the size of the bloom filter: >From >https://github.com/apache/incubator-hudi/blob/c1516df8ac55757ebd07d8aa459a0ceedeccab7b/hudi-client/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java#L41-L44 hoodie.index.bloom.num_entries hoodie.index.bloom.fpp http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/workloads/functional-query/queries/QueryTest/set.test File testdata/workloads/functional-query/queries/QueryTest/set.test: http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/workloads/functional-query/queries/QueryTest/set.test@145 PS22, Line 145: . Need to add HUDI_PARQUET(7) -- To view, visit http://gerrit.cloudera.org:8080/14711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf Gerrit-Change-Number: 14711 Gerrit-PatchSet: 22 Gerrit-Owner: Yanjia Gary Li Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yanjia Gary Li Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 16:35:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 ) Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table .. Patch Set 22: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5301/ -- To view, visit http://gerrit.cloudera.org:8080/14711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf Gerrit-Change-Number: 14711 Gerrit-PatchSet: 22 Gerrit-Owner: Yanjia Gary Li Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yanjia Gary Li Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 16:15:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5156/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 13 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 16:01:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 ) Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table .. Patch Set 22: (1 comment) Sorry for joining late and thanks for the great work! I have some questions / concerns about the example parquet files. http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/14711/22/testdata/data/README@489 PS22, Line 489: `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0_1-5-10_20200112194517.parquet` : `ca51fa17-681b-4497-85b7-4f68e7a63ee7-0` is the bloom index hash of this file : `20200112194517` is the timestamp of this version I would prefer not to add such large (~0.5MB) files to the .git repo if possible. There are already some > 1MB files there, and there are some tests that really need large files, but it would be good to think about possibilities to avoid it in this case. We looked at the files with Zoltan - it only has a few rows, and to whole file is compressible to ~7K, so we are were not sure about the reason behind the large size. We have noticed a large Bloom filter in the metadata - maybe the uncompressed bloom filter is the reason behind the large size? It would be also nice to add some more information here about the way these files for created - Parquet bloom filters are not yet finished according to https://issues.apache.org/jira/browse/PARQUET-41 Did you use a released parquet-mr to create the files, or Hoodie has its own fork of parquet-mt or its own writer? -- To view, visit http://gerrit.cloudera.org:8080/14711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf Gerrit-Change-Number: 14711 Gerrit-PatchSet: 22 Gerrit-Owner: Yanjia Gary Li Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yanjia Gary Li Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 15:44:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5155/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 15:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5154/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 12 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 15:32:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5153/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 11 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 15:23:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Tamas Mate has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. IMPALA-6663: Expose current DDL metrics on WebUI This change adds a new '/operations' tab to the Catalog WebUI. On this page the users can see two summary tables, the first report is based on how many catalog operations are running and which tables are involved. The second report shows the number of catalog operations on table level. A new 'monitor' package is created to collect the monitoring classes. Each DDL/FinalizeDML/ResetMetada request get a class with common base class. The output of these classes are summarized and can be accessed with a new JNI call. A screenshot of the new '/operations' page can be found in the Jira. Testing: Unit tests were added to test the counters and a web server test to verify the availability of the page. Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogDdlCounter.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogFinalizeDmlCounter.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationCounter.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationMetrics.java A fe/src/main/java/org/apache/impala/catalog/monitor/CatalogResetMetadataCounter.java R fe/src/main/java/org/apache/impala/catalog/monitor/CatalogTableMetrics.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/test/java/org/apache/impala/catalog/monitor/CatalogDdlCounterTest.java A fe/src/test/java/org/apache/impala/catalog/monitor/CatalogFinalizeDmlCounterTest.java A fe/src/test/java/org/apache/impala/catalog/monitor/CatalogResetMetadataCounterTest.java M tests/webserver/test_web_pages.py A www/catalog_operations.tmpl 22 files changed, 1,260 insertions(+), 157 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/13806/13 -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 13 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 484 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/6 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15087 ) Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types .. Patch Set 7: Code-Review+1 (4 comments) Thanks for the fixes so far! A few more nits, mainly about readability and compactness. http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc File be/src/exec/parquet/parquet-column-stats.cc: http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc@100 PS7, Line 100: return false; here + for SMALLINT: Fits to a one line if without braces. http://gerrit.cloudera.org:8080/#/c/15087/7/be/src/exec/parquet/parquet-column-stats.cc@102 PS7, Line 102: ret = ColumnStats::DecodePlainValue(*paired_stats_value, : _stats_val, parquet::Type::INT32); here + for SMALLINT: doesn't need to be in an else block, as the "if" block returns. http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test: http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@690 PS7, Line 690: i An idea to improve readability: i / j / k could be renamed to contain their type, e.g. i32_to_8, i16_to_8, i32_to_16. http://gerrit.cloudera.org:8080/#/c/15087/7/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@717 PS7, Line 717: 200 Here and in other = predicates: it seems more interesting to test for the overflown value instead of the out of range positive value (where the result of the count should be 1), as we are mainly hunting for false negatives in the tests. -- To view, visit http://gerrit.cloudera.org:8080/15087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 Gerrit-Change-Number: 15087 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 07 Feb 2020 14:48:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6663: Expose current DDL metrics on WebUI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13806 ) Change subject: IMPALA-6663: Expose current DDL metrics on WebUI .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/13806/12/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/13806/12/tests/webserver/test_web_pages.py@766 PS12, Line 766: flake8: W391 blank line at end of file -- To view, visit http://gerrit.cloudera.org:8080/13806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a Gerrit-Change-Number: 13806 Gerrit-PatchSet: 12 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 07 Feb 2020 14:46:18 + Gerrit-HasComments: Yes