[Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 ) Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION. .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f Gerrit-Change-Number: 8238 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 19 Oct 2017 04:27:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6073: Fail on misconfigured CLASSPATH.
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8327 ) Change subject: IMPALA-6073: Fail on misconfigured CLASSPATH. .. Patch Set 1: Code-Review-1 I think this check seems not to be enough. Let me leave a detailed comment on the JIRA ticket. -- To view, visit http://gerrit.cloudera.org:8080/8327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icb92fa5cad8bbc29b620bec5904e45ed7aeff13d Gerrit-Change-Number: 8327 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Thu, 19 Oct 2017 04:13:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6073: Fail on misconfigured CLASSPATH.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8327 ) Change subject: IMPALA-6073: Fail on misconfigured CLASSPATH. .. Patch Set 1: (3 comments) Thanks for fixing, lgtm, only minor nits http://gerrit.cloudera.org:8080/#/c/8327/1/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/8327/1/be/src/common/init.cc@178 PS1, Line 178: char *classpath = getenv("CLASSPATH"); style: char* http://gerrit.cloudera.org:8080/#/c/8327/1/be/src/common/init.cc@179 PS1, Line 179: assert(classpath != NULL); Why not DCHECK and add a message http://gerrit.cloudera.org:8080/#/c/8327/1/be/src/common/init.cc@182 PS1, Line 182: assert(std::string(classpath).find("*") == std::string::npos); DCHECK and include an error message -- To view, visit http://gerrit.cloudera.org:8080/8327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icb92fa5cad8bbc29b620bec5904e45ed7aeff13d Gerrit-Change-Number: 8327 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Thu, 19 Oct 2017 03:45:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 ) Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION. .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/8238/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/8238/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1958 PS7, Line 1958: Lists.partition(allHmsPartitionsToAdd, MAX_PARTITION_UPDATES_PER_RPC)) { While you are here, can you also change bulkAlterPartitions() to use Lists.partition()? http://gerrit.cloudera.org:8080/#/c/8238/7/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/8238/7/tests/metadata/test_ddl.py@514 PS7, Line 514: def test_alter_table_create_many_partitions(self, vector, unique_database): This test doesn't belong in the TestLibCache class. Don't we have existing tests that test the multi-partition add alter? Seems like this test belongs there. http://gerrit.cloudera.org:8080/#/c/8238/7/tests/metadata/test_ddl.py@519 PS7, Line 519: self.client.execute('use {0}'.format(unique_database)) "use" can generally cause problems in tests because we sometimes run a test against different impalads for example to exercise sync_ddl. While I think it is ok in this specific instance, it's generally better practice not so issue "use" statements unless absolutely necessary. http://gerrit.cloudera.org:8080/#/c/8238/7/tests/metadata/test_ddl.py@527 PS7, Line 527: # 473 -1 0 0B NOT CACHED NOT CACHED TEXTfalse hdfs://localhost:20500/test-warehouse/test_alter_table_create_many_partitions_5b4888f5.db/t/p=473 This comment can become stale quickly. I suggest not adding it an output row verbatim but mentioning that show partitions contains the partition HDFS paths which we expect to contain p=val sub-directories that you are going to search for with the regex. -- To view, visit http://gerrit.cloudera.org:8080/8238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f Gerrit-Change-Number: 8238 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 19 Oct 2017 03:40:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Hello Michael Ho, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8305 to look at the new patch set (#3). Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. IMPALA-5599: Clean up references to TimestampValue in be/src. This is a follow-on commit to d53f43b4. In this patch we replace all inappropriate usage of TimestampValue in be/src/service and be/src/statetore with simpler data types for Unix timestamps, and where conversions to strings are needed, we use the interfaces introduced by the previous commit. BE, FE, EE, JDBC, and Cluster Tests were run to check for regressions. All the tests passed. Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/statestore/statestore-subscriber.cc 6 files changed, 29 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8305/3 -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 3 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Zoram Thanga has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 2: > (1 comment) > > Code change looks good. > > How did you test it? Please add a Testing section to commit message > indicating that. Please at least manually inspect each affected > field by e.g. looking at the profile. Thanks for the suggestion to look at the profiles. Here's an example, and it looks fine to me: Query (id=394caf0e5f233559:8d02422c): DEBUG MODE WARNING: Query profile created while running a DEBUG build of Impala. Use RELEASE builds to measure query performance. Summary: Session ID: 864fbccb3e7bee09:7fa78fb245fa3688 Session Type: BEESWAX Start Time: 2017-10-18 20:21:59.972644 End Time: 2017-10-18 20:22:00.097281 Query Type: N/A Query State: EXCEPTION Query Status: AnalysisException: Could not resolve table reference: 'tpch_avro.lineitem' Impala Version: impalad version 2.11.0-SNAPSHOT DEBUG (build 979c3dfd6fc40dbfb413cf45a9560f81495d1f5a) User: zoram Connected User: zoram Delegated User: Network Address: Default Db: default Sql Statement: select l_orderkey from tpch_avro.lineitem having count(l_orderkey) > 1 Coordinator: impala-1:22000 Query Timeline: 124.796ms - Query submitted: 45.687us (45.687us) - Unregister query: 124.627ms (124.582ms) -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 19 Oct 2017 03:32:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8329 Change subject: IMPALA-4964: Fix Decimal modulo overflow .. IMPALA-4964: Fix Decimal modulo overflow The modulo operation between two decimals should never overflow. Before this patch, there would be an overflow if the scale difference between the two decimals was large. We would try to scale up the one with the smaller scale, so that the scales matched, which could result in an overflow. We fix the problem by first checking if the scaled up value would fit into 128 bits by estimating the number of leading zeros in it. If we detect that 128 bits is not enough, we convert both numbers to int256, do the operation, then convert back to 128 bits. Testing: - Added some expr tests that excercise the new code path. Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-test.cc M be/src/runtime/decimal-value.inline.h 3 files changed, 116 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/8329/1 -- To view, visit http://gerrit.cloudera.org:8080/8329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Gerrit-Change-Number: 8329 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-6073: Fail on misconfigured CLASSPATH.
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8327 Change subject: IMPALA-6073: Fail on misconfigured CLASSPATH. .. IMPALA-6073: Fail on misconfigured CLASSPATH. Asserts, early, that $CLASSPATH is set and has no wildcards. Several developers have been tripped up running C++ tests without first running 'bin/set-classpath.sh'. This causes them to run into HDFS-11851 (getGlobalJNIEnv() may deadlock on exception), wherein, instead of a relatively clear "Class not found" error, we hang forever. I considered limiting this to tests, but we clearly need a $CLASSPATH for the daemons themselves. Testing: * Ran a backend test without $CLASSPATH set. * Started Impala cluster with this change. Change-Id: Icb92fa5cad8bbc29b620bec5904e45ed7aeff13d --- M be/src/common/init.cc 1 file changed, 7 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/8327/1 -- To view, visit http://gerrit.cloudera.org:8080/8327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Icb92fa5cad8bbc29b620bec5904e45ed7aeff13d Gerrit-Change-Number: 8327 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 4 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 19 Oct 2017 01:45:33 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Allow the SASL protocol service name to be configurable Previously the SASL service name was always set to a constant "kudu" which was tracked by kSaslProtoName in rpc/constants.h. However, for applications that use the KRPC library that would prefer to do their own SASL initialization, they would requre to set their own SASL service name to be passed into sasl_server_new()/sasl_client_new(). This patch allows for this configuration by adding a configurable parameter to the MessengerBuilder which is ultimately passed down to the negotiation layer. Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Reviewed-on: http://gerrit.cloudera.org:8080/8218 Reviewed-by: Dan BurkertTested-by: Dan Burkert Reviewed-on: http://gerrit.cloudera.org:8080/8230 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M be/src/kudu/rpc/client_negotiation.cc M be/src/kudu/rpc/client_negotiation.h M be/src/kudu/rpc/constants.cc M be/src/kudu/rpc/constants.h M be/src/kudu/rpc/messenger.cc M be/src/kudu/rpc/messenger.h M be/src/kudu/rpc/negotiation-test.cc M be/src/kudu/rpc/negotiation.cc M be/src/kudu/rpc/server_negotiation.cc M be/src/kudu/rpc/server_negotiation.h 10 files changed, 59 insertions(+), 28 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 19 Oct 2017 01:23:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 4 files changed, 297 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/2 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 2 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Hello Philip Zeyliger, Balazs Jeszenszky, Dimitris Tsirogiannis, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8202 to look at the new patch set (#9). Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. IMPALA-4704: Turns on client connections when local catalog initialized. Currently, impalad starts beeswax and hs2 servers even if the catalog has not yet been initialized. As a result, client connections see an error message stating that the impalad is not yet ready. This patch changes the impalad startup sequence to wait until the catalog is received before opening beeswax and hs2 ports and starting their servers. Testing: - python e2e tests that start a cluster without a catalog and check that client connections are rejected as expected. Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/testutil/in-process-servers.cc M bin/start-impala-cluster.py M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_wait.py M tests/custom_cluster/test_coordinators.py 13 files changed, 166 insertions(+), 55 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/9 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 9 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh@75 PS1, Line 75: HADOOP_HEAPSIZE="1024" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 & > People like Alex are those whom I was most concerned about, as I know he us :). I'm still using that good-old machine, mem should be fine (fingers crossed). -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 19 Oct 2017 00:47:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 ) Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION. .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1350/ -- To view, visit http://gerrit.cloudera.org:8080/8238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f Gerrit-Change-Number: 8238 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 19 Oct 2017 00:27:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh@75 PS1, Line 75: HADOOP_HEAPSIZE="1024" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 & > I'd prefer to keep this change. Our Hive server tends to OOM pretty easily People like Alex are those whom I was most concerned about, as I know he used to develop Impala on a machine without much memory. If Alex is OK with this, I am, too. -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 19 Oct 2017 00:09:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (2 comments) Changes like these tend to be slow and painful to test, so I'm in favor of not parallelizing additional things in this patch. Additional steps can be improved later. http://gerrit.cloudera.org:8080/#/c/8320/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8320/1//COMMIT_MSG@33 PS1, Line 33: What testing did you do? Does the data load still run on a non-beefy local machine? http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh@75 PS1, Line 75: HADOOP_HEAPSIZE="1024" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 & > This looks like it will also increase HADOOP_HEAPSIZE when not doing a para I'd prefer to keep this change. Our Hive server tends to OOM pretty easily when doing anything non-trivial with Hive on our mini cluster. -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 19 Oct 2017 00:07:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5976: Remove equivalent class computation in FE
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8317 ) Change subject: IMPALA-5976: Remove equivalent class computation in FE .. Patch Set 1: (47 comments) Very nice. First wave of comments. I have not exhaustively gone through everything yet since this is a very substantial patch. http://gerrit.cloudera.org:8080/#/c/8317/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8317/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-5976: Remove equivalent class computation in FE Remove equivalence classes from FE http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@a2017 PS1, Line 2017: Very satisfying to see this code deleted! http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@a2890 PS1, Line 2890: This function was nice for debugging. Does your patch add a similar function that nicely prints the two graphs? http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1253 PS1, Line 1253: return !tids.isEmpty() && (tids.size() > 1 || isOjConjunct(e) || isFullOuterJoined(e) This new formatting is extremely hard to read, please reformat for readability. The old formatting was much better. http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1496 PS1, Line 1496: for (SlotId dst : getEquivSlots(srcSid)) { Unless we are absolutely convinced that the change is correct, I think we should try to preserve the semantics of the original code. My understanding is that the new getEquivSlots() returns slots that have a mutual value transfer with srcSid - which is different than what the original code did. http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1608 PS1, Line 1608:* For each equivalence class, adds/removes predicates from conjuncts such that it Refers to "equivalence class" again which is now an undefined term. If you want to use that term we should define what it means. Let's think about how to evolve the terminology. Based on my understanding of your patch it looks like you want to redefine "equivalence class" to be a set of slots where each pair of slots has a mutual value transfer. http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1626 PS1, Line 1626: // Equivalence classes only containing slots belonging to lhsTids. What does this mean? Is the Integer key the SCC id? http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1802 PS1, Line 1802: if (slotDesc.getId().asInt() >= g.numVertices()) continue; I've seen this check in a couple of places, would be nice if that could be handled by 'g' instead of all callsites. http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1952 PS1, Line 1952: public void computeValueTransfer() { computeValueTransferGraph http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1972 PS1, Line 1972: private void constructValueTransferFromEqPredicates(Graph g) { valueTransfers (plural) http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2047 PS1, Line 2047:* Returns the ids of slots that are in the same equivalence class as slotId Comment need updating. The concept of "equivalence class" does not exist anymore and we should not use that term. http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2055 PS1, Line 2055: for (int dst: g.getSccMembers(g.getSccId(slotId.asInt( { In terms of the g API I think callers should have to care about the existence of SCCs. Can we hide those details? http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2070 PS1, Line 2070: Collections.sort(result); Shouldn't the destinations already be sorted? http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/8317/1/fe/src/main/java/org/apache/impala/analysis/Expr.java@702 PS1, Line 702: if (!localCmp.apply(Pair.create(this, that))) return false; It's very unfortunate that we have to create a new pair object to do a SlotRef comparison. We should try to avoid that generate an excessive number of objects. Expr.equals() really is called quite frequently and SlotRef is the most common Expr. Using a Java Comparator is not ideal because we don't provide
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resourcesa
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8323 to look at the new patch set (#2). Change subject: IMPALA-1575: part 2: yield admission control resourcesa .. IMPALA-1575: part 2: yield admission control resourcesa This change releases admission control resources more eagerly, once the query has finished actively executing. Some resources (tracked and untracked) are still consumed by the client request as long as it remains open, e.g. memory for control structures and the result cache. However, these resources are relatively small and should not block admission of new queries. The same as in part 1, query execution is considered to be finished under any of the following conditions: 1. The query encounters an error and fails 2. The query is cancelled due to the idle query timeout 3. The query reaches eos (or the DML completes) 4. The client cancels the query without closing the query Admission control resources are released in two ways: 1. by calling AdmissionController::ReleaseQuery() on the coordinator promptly after query execution finishes, instead of waiting for UnregisterQuery(). This means that the query and its memory is no longer considered "admitted". 2. by changing the behaviour of MemTracker::GetPoolMemReserved() so that it is aware of when a query has finished executing and does not consider its entire memory limit to be "reserved". The preconditions for releasing an admitted query are subtle because the queries are being admitted to a distributed system, not just the coordinator. The comment for ReleaseAdmissionControlResources() documents the preconditions and rationale. Note that the preconditions are not weaker than the preconditions of calling UnregisterQuery() before this patch. Testing: TestAdmissionController is extended to end queries in four ways: cancellation by client, idle timeout, the last row being fetched, and the client closing the query. The test uses a mix of all four. After the query ends, all clients wait for the test to complete before closing the query or closing the connection. This ensures that the admission control decisions are based entirely on the query end behavior. This test works for both query admission control and mem_limit admission control and can detect both kinds of admission control resources ("admitted" and "reserved") not being released promptly. This is based on an earlier patch by Joe McDonnell. Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-state.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/client-request-state.cc M tests/custom_cluster/test_admission_controller.py 9 files changed, 185 insertions(+), 81 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8323/2 -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh@492 PS1, Line 492: run-step "Loading auxiliary workloads" load-aux-workloads.log load-aux-workloads I don't see any reason this also couldn't run in parallel. -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 23:44:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Vuk Ercegovac has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8322 ) Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. IMPALA-1422: support a constant on LHS of IN predicates. Currently, constant expressions for the LHS of the IN predicate are not supported. This patch adds this support as a rewrite in StmtRewriter (where subqueries are rewritten to joins). Since there is a nested-loop variant of left semijoin, support for IN is handled by not erring out. NOT IN is handled by a rewrite to corresponding NOT EXISTS predicate. Re-organized the frontend subquery analysis tests to expand coverage. Testing: - added frontend subquery analysis tests - added e2e tests Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb --- M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-query/queries/QueryTest/subquery.test 5 files changed, 479 insertions(+), 197 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8322/2 -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resourcesa
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8323 to look at the new patch set (#3). Change subject: IMPALA-1575: part 2: yield admission control resourcesa .. IMPALA-1575: part 2: yield admission control resourcesa This change releases admission control resources more eagerly, once the query has finished actively executing. Some resources (tracked and untracked) are still consumed by the client request as long as it remains open, e.g. memory for control structures and the result cache. However, these resources are relatively small and should not block admission of new queries. The same as in part 1, query execution is considered to be finished under any of the following conditions: 1. The query encounters an error and fails 2. The query is cancelled due to the idle query timeout 3. The query reaches eos (or the DML completes) 4. The client cancels the query without closing the query Admission control resources are released in two ways: 1. by calling AdmissionController::ReleaseQuery() on the coordinator promptly after query execution finishes, instead of waiting for UnregisterQuery(). This means that the query and its memory is no longer considered "admitted". 2. by changing the behaviour of MemTracker::GetPoolMemReserved() so that it is aware of when a query has finished executing and does not consider its entire memory limit to be "reserved". The preconditions for releasing an admitted query are subtle because the queries are being admitted to a distributed system, not just the coordinator. The comment for ReleaseAdmissionControlResources() documents the preconditions and rationale. Note that the preconditions are not weaker than the preconditions of calling UnregisterQuery() before this patch. Testing: TestAdmissionController is extended to end queries in four ways: cancellation by client, idle timeout, the last row being fetched, and the client closing the query. The test uses a mix of all four. After the query ends, all clients wait for the test to complete before closing the query or closing the connection. This ensures that the admission control decisions are based entirely on the query end behavior. This test works for both query admission control and mem_limit admission control and can detect both kinds of admission control resources ("admitted" and "reserved") not being released promptly. This is based on an earlier patch by Joe McDonnell. Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-state.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/client-request-state.cc M tests/custom_cluster/test_admission_controller.py 9 files changed, 192 insertions(+), 87 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8323/3 -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors Currently if llvm encounters an error it calls exit() and this kills the impalad process. This patch adds a diagnostic handler that llvm can use to pass the error up to impala instead of crashing it. Testing: Added a test which induces an error in llvm and makes sure that its passed up to impala code and handled correctly. Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Reviewed-on: http://gerrit.cloudera.org:8080/8233 Reviewed-by: Bikramjeet VigTested-by: Impala Public Jenkins --- M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/testutil/gtest-util.h M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h 6 files changed, 93 insertions(+), 0 deletions(-) Approvals: Bikramjeet Vig: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h File be/src/runtime/buffered-tuple-stream.h: http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h@332 PS1, Line 332: Status GetNext( > My intent was to remove the GetRows() API altogether and switch callers to Do you mean to keep only one rowbatch and assemble the row batches multiple times but yet let the stream stay pinned? -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 23:36:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 2: Code-Review+1 (1 comment) Code change looks good. How did you test it? Please add a Testing section to commit message indicating that. Please at least manually inspect each affected field by e.g. looking at the profile. http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; > Please see my response to DanH's comment... Using monotonic clock for durations is generally the right thing to do, but given we also print the start and end times in this case, let's leave it alone since the Print() function handles it in a reasonable way, -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 18 Oct 2017 23:35:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resourcesa
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8323 Change subject: IMPALA-1575: part 2: yield admission control resourcesa .. IMPALA-1575: part 2: yield admission control resourcesa This change releases admission control resources more eagerly, once the query has finished actively executing. Some resources (tracked and untracked) are still consumed by the client request as long as it remains open, e.g. memory for control structures and the result cache. However, these resources are relatively small and should not block admission of new queries. The same as in part 1, query execution is considered to be finished under any of the following conditions: 1. The query encounters an error and fails 2. The query is cancelled due to the idle query timeout 3. The query reaches eos (or the DML completes) 4. The client cancels the query without closing the query Admission control resources are released in two ways: 1. by calling AdmissionController::ReleaseQuery() on the coordinator promptly after query execution finishes, instead of waiting for UnregisterQuery(). This means that the query and its memory is no longer considered "admitted". 2. by changing the behaviour of MemTracker::GetPoolMemReserved() so that it is aware of when a query has finished executing and does not consider its entire memory limit to be "reserved". The preconditions for releasing an admitted query are subtle because the queries are being admitted a distributed system, not just the coordinator. The comment for ReleaseAdmissionControlResources() documents the preconditions and rationale. Note that the preconditions are not weaker than the preconditions of calling UnregisterQuery() before this patch. Testing: TestAdmissionController is extended to end queries in four ways: cancellation by client, idle timeout, the last row being fetched, and the client closing the query. The test uses a mix of all four. After the query ends, all clients wait for the test to complete before closing the query or closing the connection. This ensures that the admission control decisions are based entirely on the query end behavior. This test works for both query admission control and mem_limit admission control and can detect both kinds of admission control resources ("admitted" and "reserved") not being released promptly. This is based on an earlier patch by Joe McDonnell. Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-state.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/client-request-state.cc M tests/custom_cluster/test_admission_controller.py 9 files changed, 191 insertions(+), 81 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8323/1 -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 18 Oct 2017 23:30:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/8320/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8320/1//COMMIT_MSG@9 PS1, Line 9: This commit loads functional-query, TPC-H data, and TPC-DS data in parallel. In nit: Can you wrap this at the red line provided by gerrit? I think it is 72 characters. Emacs will wrap it for you at the right space with ctrl-q, if you choose. http://gerrit.cloudera.org:8080/#/c/8320/1//COMMIT_MSG@12 PS1, Line 12: minuites nit: minutes http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh@480 PS1, Line 480: run-step-backgroundable "Loading functional-query data" load-functional-query.log \ Could add a comment about what you decided to background and what you decided not to, and why? http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-hive-server.sh@75 PS1, Line 75: HADOOP_HEAPSIZE="1024" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 & > I'm currently testing to see if 512 is enough. This looks like it will also increase HADOOP_HEAPSIZE when not doing a parallel load, which is a shame. Do you see a way around that? http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-step.sh File testdata/bin/run-step.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-step.sh@53 PS1, Line 53: nit: only one empty line, to match context http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/run-step.sh@84 PS1, Line 84: RUN_STEP_PIDS=() Do you want to reset MSGS, too? -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 23:17:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8320 ) Change subject: IMPALA-6070: Parallel data load. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh@505 PS1, Line 505: # Tests depend on the kudu data being clean, so load the data from scratch. : run-step "Loading Kudu functional" load-kudu.log \ : load-data "functional-query" "core" "kudu/none/none" force : run-step "Loading Kudu TPCH" load-kudu-tpch.log \ : load-data "tpch" "core" "kudu/none/none" force It should be possible to do the same thing for these. That will only save about 4 minutes, but this runs even when loading from a snapshot. -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 23:16:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6045: Make build scripts more friendly to Ubuntu 16.04
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8262 ) Change subject: IMPALA-6045: Make build scripts more friendly to Ubuntu 16.04 .. Patch Set 3: > Patch Set 3: > > > Sorry to get to this a bit late. > > No apology required. > > I agree with all of your suggestions. > > If you want to file a ticket and assign it to me or if you want to send me a > patch to review, I'd be amenable. Filed https://issues.apache.org/jira/browse/IMPALA-6079. I've left it unassigned. It's not clear to me yet whether some other work will lead me right back to this :) -- To view, visit http://gerrit.cloudera.org:8080/8262 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317 Gerrit-Change-Number: 8262 Gerrit-PatchSet: 3 Gerrit-Owner: Jim AppleGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 18 Oct 2017 22:38:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Hello Michael Ho, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8305 to look at the new patch set (#2). Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. IMPALA-5599: Clean up references to TimestampValue in be/src. This is a follow-on commit to d53f43b4. In this patch we replace all inappropriate usage of TimestampValue in be/src/service and be/src/statetore with simpler data types for Unix timestamps, and where conversions to strings are needed, we use the interfaces introduced by the previous commit. Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/statestore/statestore-subscriber.cc 6 files changed, 29 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8305/2 -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src/service.
Zoram Thanga has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src/service. .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h@341 PS1, Line 341: time > indicate the clock. e.g. monotonic vs unix time. Done. This uses Unix time. http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.cc@a91 PS1, Line 91: > What do you think about going a step further and replacing all callers of T Changed the caller in be/src/statetore. The other calls in be/src/exprs are in test code, and seem legit http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; > That's true but that also seems to be a pre-existing bug. I agree that we s Please see my response to DanH's comment... http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; > that can be negative if settimeofday() occurred in the mean time. What shou The existing code does not handle this either. Looking at PrettyPrinter::Print(), it will basically print a negative elapsed time in nanoseconds. The only way to reliably handle intervening settimeofday(), or something like that would be to use a monotonic clock, perhaps MonotonicStopWatch. Are you suggesting we go down that route? http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@622 PS1, Line 622: /// Start and end time of the query > same comment about clock Done http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@640 PS1, Line 640: Unix milliseconds > like that Done http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc@1009 PS1, Line 1009: int64_t duration_ms = duration_us / MICROS_PER_MILLI; > same question (though old code could go negative). Please see my earlier response. -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 1 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 18 Oct 2017 22:24:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 4: (2 comments) Thanks for fixing this. Just had a couple of minor requests then I'll +2. http://gerrit.cloudera.org:8080/#/c/8051/4/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8051/4/be/src/exprs/decimal-operators-ir.cc@604 PS4, Line 604: DCHECK(nanoseconds >= numeric_limits::min() We generally prefer DCHECK_GE() and DCHECK_LE() so that the out-of-range value is printed if this gets hit. http://gerrit.cloudera.org:8080/#/c/8051/4/be/src/exprs/decimal-operators-ir.cc@622 PS4, Line 622: FromUnixTimeNanos Can you add a note to this function's comment that it's expected to work for negative nanoseconds (I don't think it's obvious when looking at the function definition). -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 22:00:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6070: Parallel data load.
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8320 Change subject: IMPALA-6070: Parallel data load. .. IMPALA-6070: Parallel data load. This commit loads functional-query, TPC-H data, and TPC-DS data in parallel. In parallel, these take about 37 minutes, dominated by functional-query. Serially, these take about 30 minutes more, namely the 13 minutes of tpcds and 16 minuites of tpcds. This works out nicely because CPU usage during data load is very low in aggregate. (We don't sustain more than 1 CPU of load, whereas build machines are likely to have many CPUs.) To do this, I added support to run-step.sh to have a notion of a backgroundable task, and support waiting for all tasks. I also increased the heapsize of our HiveServer2 server. When datasets were being loaded in parallel, we ran out of memory at 256MB of heap. The resulting log output is currently like so (but without the timestamps): 15:58:04 Started Loading functional-query data in background; pid 8105. 15:58:04 Started Loading TPC-H data in background; pid 8106. 15:58:04 Loading functional-query data (logging to /home/impdev/Impala/logs/data_loading/load-functional-query.log)... 15:58:04 Started Loading TPC-DS data in background; pid 8107. 15:58:04 Loading TPC-H data (logging to /home/impdev/Impala/logs/data_loading/load-tpch.log)... 15:58:04 Loading TPC-DS data (logging to /home/impdev/Impala/logs/data_loading/load-tpcds.log)... 16:11:31Loading workload 'tpch' using exploration strategy 'core' OK (Took: 13 min 27 sec) 16:14:33Loading workload 'tpcds' using exploration strategy 'core' OK (Took: 16 min 29 sec) 16:35:08Loading workload 'functional-query' using exploration strategy 'exhaustive' OK (Took: 37 min 4 sec) Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac --- M testdata/bin/create-load-data.sh M testdata/bin/run-hive-server.sh M testdata/bin/run-step.sh 3 files changed, 40 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/8320/1 -- To view, visit http://gerrit.cloudera.org:8080/8320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Gerrit-Change-Number: 8320 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1349/ -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 4 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 21:52:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 4: Anyone know what's going on with this change not going through verification? I got a pass on private build and test, and the failures here appear Kerberos related. Trying a manual rebase... -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 4 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 21:51:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8211 to look at the new patch set (#3). Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. IMPALA-5243: Speed up code gen for wide Avro tables. HdfsAvroScanner::CodegenMaterializeTuple generates a function linear in size to the number of columns. On 1000 column tables, codegen time is significant. This commit roughly halves it for wide columns. (Note that this had been much worse in recent history (<= Impala 2.9).) It does so by breaking up MaterializeTuple() into multiple smaller functions, and then calls them in order. When breaking up into 200-column chunks, there is a noticeable speed-up. I've made the helper code for generating LLVM function prototypes have a mutable function name, so that the builder can be re-used multiple times. I've checked by inspecting optimized LLVM that in the case where there's only 1 helper function, code gets inlined so that there doesn't seem to be an extra function. I measured codegen time for various "step sizes." The case where there are no helper functions is about 2.7s. The best case was about a step size of 200, with timings of 1.3s. For the query "select count(int_col16) from functional_avro.widetable_1000_cols", codegen times as a function of step size are roughly as follows. This is averaged across 5 executions, and rounded to 0.1s. step time 10 2.4 50 2.5 75 2.9 100 3.0 125 3.0 150 1.4 175 1.3 200 1.3 <-- chosen step size 225 1.5 250 1.4 300 1.6 400 1.6 500 1.8 1000 2.7 The raw data was generated like so, with some code that let me change the step size at runtime: $(for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do for try in $(seq 5); do echo $step > /tmp/step_size.txt; echo -n "$step "; impala-shell.sh -q "select count(int_col16) from functional_avro.widetable_1000_cols; profile;" 2> /dev/null | grep -A9 'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' | sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo; done; done) | tee out.txt ... 200 CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms ... 1000 CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms I have run the core tests with this change. Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 --- M be/src/codegen/llvm-codegen.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h 3 files changed, 148 insertions(+), 93 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8211/3 -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 3 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8211 to look at the new patch set (#2). Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. IMPALA-5243: Speed up code gen for wide Avro tables. HdfsAvroScanner::CodegenMaterializeTuple generates a function linear in size to the number of columns. On 1000 column tables, codegen time is significant. This commit roughly halves it for wide columns. (Note that this had been much worse in recent history (<= Impala 2.9).) It does so by breaking up MaterializeTuple() into multiple smaller functions, and then calls them in order. When breaking up into 200-column chunks, there is a noticeable speed-up. I've made the helper code for generating LLVM function prototypes have a mutable function name, so that the builder can be re-used multiple times. I've checked by inspecting optimized LLVM that in the case where there's only 1 helper function, code gets inlined so that there doesn't seem to be an extra function. I measured codegen time for various "step sizes." The case where there are no helper functions is about 2.7s. The best case was about a step size of 200, with timings of 1.3s. For the query "select count(int_col16) from functional_avro.widetable_1000_cols", codegen times as a function of step size are roughly as follows. This is averaged across 5 executions, and rounded to 0.1s. step time 10 2.4 50 2.5 75 2.9 100 3.0 125 3.0 150 1.4 175 1.3 200 1.3 <-- chosen step size 225 1.5 250 1.4 300 1.6 400 1.6 500 1.8 1000 2.7 The raw data was generated like so, with some code that let me change the step size at runtime: $(for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do for try in $(seq 5); do echo $step > /tmp/step_size.txt; echo -n "$step "; impala-shell.sh -q "select count(int_col16) from functional_avro.widetable_1000_cols; profile;" 2> /dev/null | grep -A9 'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' | sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo; done; done) | tee out.txt ... 200 CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms ... 1000 CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms I have run the core tests with this change. Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 --- M be/src/codegen/llvm-codegen.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h 3 files changed, 148 insertions(+), 93 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8211/2 -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 ) Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION. .. Patch Set 7: Code-Review+1 Carry post rebase. -- To view, visit http://gerrit.cloudera.org:8080/8238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f Gerrit-Change-Number: 8238 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 21:32:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 ) Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION. .. Patch Set 6: Code-Review+1 (1 comment) Carrying Dimitris' +1. http://gerrit.cloudera.org:8080/#/c/8238/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/8238/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1974 PS3, Line 1974: // HMS because some of them may already exist there. In that case, we load in the : // catalog the partitions that already exist in HMS but aren't in the catalog yet. : if (allHmsPartitionsToAdd.size() != addedHmsPartitions.size()) { : List difference = computeDifference(allHmsPartitionsToAdd, : addedHmsPartitions); : addedHmsPartitions.addAll( : getPartitionsFromHms(msTbl, msClient, tableName, difference)); : } : : for (Partition partition: addedHmsPartitions) { : // Create and add the HdfsPartition to catalog. Return the table object with an : > Well, you're only calling it for the diff, right? I would expect in most ca Yep, I agree. -- To view, visit http://gerrit.cloudera.org:8080/8238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f Gerrit-Change-Number: 8238 Gerrit-PatchSet: 6 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 21:32:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8267 to look at the new patch set (#8). Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding Switch the decoders to using more batch-oriented interfaces. As an intermediate step this doesn't make the interfaces of LevelDecoder or DictDecoder batch-oriented, only the lower-level utility classes. The next step would be to change those interfaces to be batch-oriented and make according optimisations in parquet. This could deliver much larger perf improvements than the current patch. The high-level changes are. * BitReader -> BatchedBitReader, which is built to unpack runs of 32 bit-packed values efficiently. * RleDecoder -> RleBatchDecoder, which exposes the repeated and literal runs to the caller and uses BatchedBitReader to unpack literal runs efficiently. * Dict decoding uses RleBatchDecoder to decode repeated runs efficiently and uses the BitPacking utilities to unpack and encode in a single step. Also removes an older benchmark that isn't too interesting (since the batch-oriented approach to encoding and decoding is so much faster than the value-by-value approach). Testing: * Ran core tests. * Updated unit tests to exercise new code. * Added test coverage for the deprecated bit-packed level encoding to that it still works (there was no coverage previously). Perf: Single-node benchmarks showed a few % performance gain. 16 node cluster benchmarks only showed a gain for TPC-H nested. Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 --- M be/src/benchmarks/CMakeLists.txt M be/src/benchmarks/bit-packing-benchmark.cc D be/src/benchmarks/rle-benchmark.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h D be/src/experiments/bit-stream-utils.8byte.h D be/src/experiments/bit-stream-utils.8byte.inline.h M be/src/util/bit-packing.h M be/src/util/bit-packing.inline.h M be/src/util/bit-stream-utils.h M be/src/util/bit-stream-utils.inline.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc M be/src/util/parquet-reader.cc M be/src/util/rle-encoding.h M be/src/util/rle-test.cc M testdata/data/README A testdata/data/alltypes_agg_bitpacked_def_levels.parquet M tests/query_test/test_scanners.py 19 files changed, 1,149 insertions(+), 946 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/8 -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1348/ -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 21:21:22 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 3: Code-Review+2 Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 21:20:58 + Gerrit-HasComments: No
[Impala-ASF-CR] Add 'psmisc' to bootstrap system.sh.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8306 ) Change subject: Add 'psmisc' to bootstrap_system.sh. .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Gerrit-Change-Number: 8306 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 21:09:15 + Gerrit-HasComments: No
[Impala-ASF-CR] Add 'psmisc' to bootstrap system.sh.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8306 ) Change subject: Add 'psmisc' to bootstrap_system.sh. .. Add 'psmisc' to bootstrap_system.sh. testdata/bin/kill-all.sh uses /usr/bin/killall, which is provided by the 'psmisc' package on Ubuntu. This commit adds the package to the list of packages we install. The Docker base image for Ubuntu 16.04 doesn't contain this package, which is how this came up. Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Reviewed-on: http://gerrit.cloudera.org:8080/8306 Reviewed-by: Jim AppleTested-by: Impala Public Jenkins --- M bin/bootstrap_system.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Jim Apple: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Gerrit-Change-Number: 8306 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: 169.254.169.254 > What is this address? Can we use a domain name and https? http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html This is EC2 magic. If you're not on EC2, this will tell you to set the environment variables (if you're testing S3). If you're on EC2, this will return something. We don't actually check that those credentials can access the relevant bucket, but at least it lets you through. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo GaalGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 21:06:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: 169.254.169.254 What is this address? Can we use a domain name and https? What would a new Impala developer get as a result of this? -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo GaalGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 20:55:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5736: Add impala-shell argument to set default query options
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8038 ) Change subject: IMPALA-5736: Add impala-shell argument to set default query options .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/8038/7/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/8038/7/shell/option_parser.py@42 PS7, Line 42: if default_options[option] in [True, False]: > How about "if type(...) is bool:" ? The recommended[0] way is to use isinstance() [1] [0] https://www.python.org/dev/peps/pep-0008/ [1] https://docs.python.org/2/library/functions.html#isinstance http://gerrit.cloudera.org:8080/#/c/8038/7/shell/option_parser.py@91 PS7, Line 91: upper > Is this needed? If so, can you add a comment explaining why? Dictionary comprehensions were only added in Python 2.7 [0]. Impala has users using older OSs that won't have Python 2.7. If this is needed, please re-write it not to use the comprehension. [0] https://docs.python.org/2.7/whatsnew/2.7.html#python-3-1-features http://gerrit.cloudera.org:8080/#/c/8038/7/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/8038/7/tests/shell/test_shell_interactive.py@300 PS7, Line 300: What about testing for: 1. query options that don't exist or don't parse? 2. query option with an invalid value -- To view, visit http://gerrit.cloudera.org:8080/8038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986 Gerrit-Change-Number: 8038 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 18 Oct 2017 20:34:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow configuration of values passed into kerberos env vars
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8308 ) Change subject: Allow configuration of values passed into kerberos env vars .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4ce72c04ec4056dc89fb4c1c540a6fdaca4404 Gerrit-Change-Number: 8308 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 18 Oct 2017 20:26:40 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow configuration of values passed into kerberos env vars
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8308 ) Change subject: Allow configuration of values passed into kerberos env vars .. Allow configuration of values passed into kerberos env vars We always used hardcoded constants for the following kerberos environment variables: KRB5CCNAME and KRB5RCACHETYPE. This patch allows for the configuration of these variables by taking arguments to InitKerberosForServer(). Callsites within Kudu have not been changed as all the parameters have default values. The motivation for this patch is that, Impala as a user of the KuduRPC and Kudu security libraries, needs to have a file based credential cache since the kinit happens on the C++ side and this cache needs to be read by the Java side too. Hence, we cannot have it in memory. Also, Impala still requires replay protection, since some Impala services use Thrift which lacks the nonce mechanism that KRPC uses for replay protection. Change-Id: Iab4ce72c04ec4056dc89fb4c1c540a6fdaca4404 Reviewed-on: http://gerrit.cloudera.org:8080/8247 Reviewed-by: Todd LipconTested-by: Todd Lipcon Reviewed-on: http://gerrit.cloudera.org:8080/8308 Reviewed-by: Michael Ho Tested-by: Impala Public Jenkins --- M be/src/kudu/security/init.cc M be/src/kudu/security/init.h 2 files changed, 19 insertions(+), 12 deletions(-) Approvals: Michael Ho: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iab4ce72c04ec4056dc89fb4c1c540a6fdaca4404 Gerrit-Change-Number: 8308 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1344/ -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 3 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 20:10:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src/service.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src/service. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.cc@a91 PS1, Line 91: What do you think about going a step further and replacing all callers of TimestampValue::LocalTime() with UnixMicros() or the like ? There doesn't seem to be many callers left after this patch. http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; > that can be negative if settimeofday() occurred in the mean time. What shou That's true but that also seems to be a pre-existing bug. I agree that we should handle it though. -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 1 Gerrit-Owner: Zoram ThangaGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 18 Oct 2017 19:22:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 10: Code-Review+2 carry Dan's +2 -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 10 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 19:12:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 10: Mostafa, we spoke offline about this but just wanted to confirm it looked ok to you. The summary is that the CPU cost of value materialisation for non-dict-encoded strings in uncompressed parquet scans regresses to roughly the same CPU cost as snappy-compressed parquet. I didn't see any regressions on TPC-H, only narrowly targeted benchmarks. -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 10 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 19:12:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1347/ -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 18 Oct 2017 19:10:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 7: Code-Review+2 Rebasing. Carrying over +2. -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 18 Oct 2017 19:10:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983 PS8, Line 983: data > buffers? Done -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 19:09:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Hello Lars Volker, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8085 to look at the new patch set (#9). Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet This change only affects uncompressed plain-encoded Parquet where RowBatches may directly reference strings stored in the I/O buffers. The proposed fix is to simply copy the data pages if needed then use the same logic that we use for decompressed data pages. This copy inevitably adds some CPU overhead, but I believe this is acceptable because: * We generally recommend using compression, and optimize for that case. * Copying memory is cheaper than decompressing data. * Scans of uncompressed data are very likely to be I/O bound. This allows several major simplifications: * The resource management for compressed and uncompressed scans is much more similar. * We don't need to attach Disk I/O buffers to RowBatches. * We don't need to deal with attaching I/O buffers in ScannerContext. * Column readers can release each I/O buffer *before* advancing to the next one, making it easier to reason about resource consumption. E.g. each Parquet column only needs one I/O buffer at a time to make progress. Future changes will apply to Avro, Sequence Files and Text. Once all scanners are converted, ScannerContext::contains_tuple_data_ will always be false and we can remove some dead code. Testing === Ran core ASAN and exhaustive debug builds. Perf No difference in most cases when scanning uncompressed parquet. There is a significant regression (50% increase in runtime) in targeted perf tests scanning non-dictionary-encoded strings (see benchmark output below). After the regression performance is comparable to Snappy compression. I also did a TPC-H run but ran into some issues with the report generator. I manually compared times and there were no regressions. ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_61) | parquet / none / none | 23.02 | +0.60% | 4.23 | +5.97% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_61) | PERF_STRING-Q2 | parquet / none / none | 3.00 | 1.98| R +52.10% | 0.97%| 1.25%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q1 | parquet / none / none | 2.86 | 1.92| R +49.11% | 0.34%| 2.34%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q3 | parquet / none / none | 3.16 | 2.15| R +47.04% | 1.03%| 0.72%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q4 | parquet / none / none | 3.16 | 2.17| R +45.60% | 0.14%| 1.11%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q5 | parquet / none / none | 3.51 | 2.55| R +37.88% | 0.83%| 0.49%| 1 | 5 | | TARGETED-PERF(_61) | PERF_AGG-Q5| parquet / none / none | 0.79 | 0.61| R +30.86% | 1.54%| 4.10%| 1 | 5 | | TARGETED-PERF(_61) | primitive_top-n_al | parquet / none / none | 39.45 | 35.07 | +12.51% | 0.29%| 0.29%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q7 | parquet / none / none | 6.78 | 6.10| +11.13% | 0.99%| 0.74%| 1 | 5 | | TARGETED-PERF(_61) | PERF_STRING-Q6 | parquet / none / none | 8.83 | 8.14| +8.52% | 0.15%| 0.32%| 1 | 5 | ... Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/exec/scanner-context.h 4 files changed, 66 insertions(+), 53 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8085/9 -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Hello Michael Ho, Lars Volker, Matthew Jacobs, Anonymous Coward #345, Tim Armstrong, Mostafa Mokhtar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7793 to look at the new patch set (#7). Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. IMPALA-4252: Min-max runtime filters for Kudu This patch implements min-max filters for runtime filters. Each runtime filter generates a bloom filter or a min-max filter, depending on if it has HDFS or Kudu targets, respectively. Min-max filters are generated by the PartitionedHashJoinBuilder. For now, min-max filters are only applied at the KuduScanner, which passes them into the Kudu client. Because the Kudu client doesn't provide a way to specify generic filter exprs, min-max filters are only generated when the target expr is a bare Kudu column ref. Future work will address applying min-max filters at HDFS scan nodes and applying bloom filters at Kudu scan nodes. Codegen is used to eliminate branching on the type of the min-max filter. Functional Testing: - Added new planner tests and updated the old ones. - Updated existing runtime filter tests to work with Kudu. - Added e2e tests for min-max filter specific functionality. Perf Testing: - All tests run on Kudu stress cluster (10 nodes) and tpch_100_kudu, timings are averages of 3 runs. - Ran a contrived query with a filter that does eliminate any rows (full self join of lineitem). The difference in running time was negligible - 24.46s with filters on, 24.15s with filters off for a ~1% slowdown. - Ran a contrived query with a filter that elimiates all rows (self join on lineitem with a join condition that never matches). The filters resulted in a significant speedup - 0.26s with filters on, 1.46s with filters off for a ~5.6x speedup. This query is added to targeted-perf. Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hdfs-parquet-scanner-ir.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node-mt.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-scanner.h M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-filter-ir.cc M be/src/runtime/runtime-filter.cc M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-filter.inline.h M be/src/service/impala-internal-service.cc M be/src/util/CMakeLists.txt A be/src/util/min-max-filter-ir.cc A be/src/util/min-max-filter.cc A be/src/util/min-max-filter.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test A testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test M
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 6: (18 comments) http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG@15 PS6, Line 15: them into the Kudu client. Because the Kudu client doesn't provide a > Is there any documentation for Kudu about what ordering Kudu uses for compa I asked the Kudu team, and its not documented anywhere, but they confirmed they use the same ordering as Impala does. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h File be/src/exec/filter-context.h: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@106 PS6, Line 106: bloom_filter > has_bloom_filter() ? At callsites it reads like this should return a filter Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@109 PS6, Line 109: min_max_filter > has_min_max_filter? Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc@1220 PS6, Line 1220: MinMaxFilter::Or(src_type(), params.min_max_filter, min_max_filter_.get()); > I mentioned this in a later comment, but we should consider what happens if Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc File be/src/runtime/runtime-filter-bank.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc@235 PS6, Line 235: MinMaxFilter* min_max_filter = MinMaxFilter::Create(type, _pool_); > (nit) - could combine into one line. Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc File be/src/runtime/runtime-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc@27 PS6, Line 27: // to a) the atomicity of / pointer assignments and b) the x86 TSO memory model. > Comment not relevant any more since we're using actual atomics? Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc File be/src/util/min-max-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc@53 PS6, Line 53: min_str = string(value->ptr, value->len); > We should think through the behaviour with very large strings. It may make Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h@83 PS6, Line 83: #define NUMERIC_MIN_MAX_FILTER(NAME, TYPE) \ > Yeah, the macro seems like it may be the least of the evils. Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@24 PS6, Line 24: " > #include "runtime/timestamp-value.inline.h" Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@77 PS6, Line 77: std::string NAME##MinMaxFilter::DebugString() const { \ > nit: don't need std:: prefix in .cc files. Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@139 PS6, Line 139: out->min.__set_string_val(std::min(in.min.string_val, out->min.string_val)); > I feel like we should use our StringValue::Compare() function instead of re Done http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@291 PS6, Line 291: BigIntMinMaxFilter::Or(in, out); > Shouldn't this be calling the timestamp method? TColumnData doesn't have a timestamp field, so I convert timestamps with UtcToUnixTimeMicros and pass them around in thrift as longs. (noted in a comment now) Unless I'm missing a reason that comparing timestamps with way won't lead to the same results as comparing the unconverted TimestampValues. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@327 PS6, Line 327: } > Timestamp? Same as above. http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@379 PS6, Line 379: if (slotRef == null || slotRef.getDesc().getColumn() == null > Are there planner tests for all these cases? Done http://gerrit.cloudera.org:8080/#/c/7793/6/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test:
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 2: Code-Review+2 Okay, thats fine. We can revisit if these conflicts cause future work. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 18:42:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 8: Code-Review+2 (1 comment) Yes, this is good with me. Could you get Mostafa to sign off on perf implications, perhaps recording that in the JIRA if not already (issues.apache.org not loading for me)? http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983 PS8, Line 983: data buffers? -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 18:16:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5976: Remove equivalent class computation in FE
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8317 Change subject: IMPALA-5976: Remove equivalent class computation in FE .. IMPALA-5976: Remove equivalent class computation in FE Equivalent class is used to get the equivalencies between slots. It is ill-defined and the current implementation is inefficient. This patch removes it and directly uses the information from value the transfer graph instead. Value transfer graph is reimplemented using Tarjan's strongly connected component algorithm and adjacency list based DFS to speed up on both condensed and sparse graphs. Testing: A few test cases are modified because the change in equivalence definition enables some existing optimizations on these cases. It passes other existing tests. On a query with 1800 union operations, the equivalence class computation time is reduced from 7m57s to 104ms and the planning time is reduced from 8m5s to 13s. Change-Id: If4cb1d8be46efa8fd61a97048cc79dabe2ffa51a --- M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/util/Graph.java M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test 30 files changed, 805 insertions(+), 921 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/8317/1 -- To view, visit http://gerrit.cloudera.org:8080/8317 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If4cb1d8be46efa8fd61a97048cc79dabe2ffa51a Gerrit-Change-Number: 8317 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 8: Dan, does this look ok to you at a high level? I think the code change itself is relatively straightforward but it would be good to confirm the direction and perf trade-off. -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 17:54:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256 PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then > We may want to include a check here that "set -x" has not been enabled? (Bash hackery) If you want, you can use a subshell: $(set -x; (set +x; [ $USER == philip ])) && echo yes || echo no + set +x yes 10:46:02 mystery ~ $(set -x; (set +x; [ $USER == phili ])) && echo yes || echo no + set +x no Return values survive, and you can just do if (set +x; [[ -z ${AWS_SECRET_ACCESS_KEY_ID-} ]]); then ... fi I tested something similar above. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo GaalGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 17:47:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256 PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then We may want to include a check here that "set -x" has not been enabled? # Prevent leaking the AWS keys to the log set_x=0 if set +o | grep -q "set -o xtrace"; then set_x=1 set +x fi DO STUFF # Restore xtrace flag if [[ $set_x -eq 1 ]]; then set -x fi -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo GaalGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 17:32:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add 'psmisc' to bootstrap system.sh.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8306 ) Change subject: Add 'psmisc' to bootstrap_system.sh. .. Patch Set 1: > Would you like me to submit this for you, Phil? Yes, please. I don't have permission to do it. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/8306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Gerrit-Change-Number: 8306 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 17:10:17 + Gerrit-HasComments: No
[Impala-ASF-CR] Add 'psmisc' to bootstrap system.sh.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8306 ) Change subject: Add 'psmisc' to bootstrap_system.sh. .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1346/ -- To view, visit http://gerrit.cloudera.org:8080/8306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Gerrit-Change-Number: 8306 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 18 Oct 2017 17:11:31 + Gerrit-HasComments: No
[Impala-ASF-CR] Add 'psmisc' to bootstrap system.sh.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8306 ) Change subject: Add 'psmisc' to bootstrap_system.sh. .. Patch Set 1: Would you like me to submit this for you, Phil? -- To view, visit http://gerrit.cloudera.org:8080/8306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I378b6fbf7a7b3f0b42ecf064e98cfe6e29b3 Gerrit-Change-Number: 8306 Gerrit-PatchSet: 1 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Wed, 18 Oct 2017 17:09:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4918: Support getting column comments via HS2
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8315 ) Change subject: IMPALA-4918: Support getting column comments via HS2 .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8315/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8315/2//COMMIT_MSG@9 PS2, Line 9: Fill the 'comments'/'remarks' field during HS2 column metadata requests. Please include in the commit message how you tested this. I think this also should include tests for other tables formats (Kudu, HBase). Those could be end to end tests. It would also be good to re-visit the existing "ALTER TABLE" tests and make sure they test adding comments. http://gerrit.cloudera.org:8080/#/c/8315/2/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java File fe/src/test/java/org/apache/impala/common/FrontendTestBase.java: http://gerrit.cloudera.org:8080/#/c/8315/2/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java@189 PS2, Line 189: dummyTable.addColumn(new Column(colDef.getColName(), colDef.getType(), colDef.getComment(), i)); Long line. http://gerrit.cloudera.org:8080/#/c/8315/2/fe/src/test/java/org/apache/impala/service/JdbcTest.java File fe/src/test/java/org/apache/impala/service/JdbcTest.java: http://gerrit.cloudera.org:8080/#/c/8315/2/fe/src/test/java/org/apache/impala/service/JdbcTest.java@460 PS2, Line 460: assertEquals("Incorrect table comment", "", rs.getString("REMARKS")); This looks like a bug in getTables() to me, but Alex and Dimitris would know better. -- To view, visit http://gerrit.cloudera.org:8080/8315 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1d33dfd031b5344d7136695b623cec76143ada5c Gerrit-Change-Number: 8315 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 18 Oct 2017 16:44:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow configuration of values passed into kerberos env vars
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8308 ) Change subject: Allow configuration of values passed into kerberos env vars .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1345/ -- To view, visit http://gerrit.cloudera.org:8080/8308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4ce72c04ec4056dc89fb4c1c540a6fdaca4404 Gerrit-Change-Number: 8308 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 18 Oct 2017 16:34:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8303 to look at the new patch set (#9). Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. IMPALA-1575: Part 1: eagerly release query exec resources Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/test-env.cc M be/src/service/client-request-state.cc 10 files changed, 122 insertions(+), 85 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/9 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 9 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Tim Armstrong has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. IMPALA-1575: Part 1: eagerly release query exec resources Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/test-env.cc M be/src/service/client-request-state.cc 10 files changed, 122 insertions(+), 85 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/8 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1344/ -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 3 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 16:06:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 2: Looks like a transient failure, retrying -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 2 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 16:05:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8274 ) Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+ .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1343/ -- To view, visit http://gerrit.cloudera.org:8080/8274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24 Gerrit-Change-Number: 8274 Gerrit-PatchSet: 2 Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 16:04:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 ) Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 15: > Csaba, are those failing tests specifically require a timestamp > that issues such a warning? If not, I suggest we just change the > timestamps of those tests in a way that preserves test coverage and > avoids the warnings problem. Yes, those tests work with "non timestamp" strings by design. -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 15 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 18 Oct 2017 14:28:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4918: Support getting column comments via HS2
Csaba Ringhofer has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8315 ) Change subject: IMPALA-4918: Support getting column comments via HS2 .. IMPALA-4918: Support getting column comments via HS2 Fill the 'comments'/'remarks' field during HS2 column metadata requests. Change-Id: I1d33dfd031b5344d7136695b623cec76143ada5c --- M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M fe/src/test/java/org/apache/impala/service/JdbcTest.java 4 files changed, 54 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8315/2 -- To view, visit http://gerrit.cloudera.org:8080/8315 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1d33dfd031b5344d7136695b623cec76143ada5c Gerrit-Change-Number: 8315 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer
[Impala-ASF-CR] IMPALA-4918: Support getting column comments via HS2
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8315 Change subject: IMPALA-4918: Support getting column comments via HS2 .. IMPALA-4918: Support getting column comments via HS2 Fill the 'comments'/'remarks' field during HS2 column metadata requests. Change-Id: I1d33dfd031b5344d7136695b623cec76143ada5c --- M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M fe/src/test/java/org/apache/impala/service/JdbcTest.java 4 files changed, 55 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8315/1 -- To view, visit http://gerrit.cloudera.org:8080/8315 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1d33dfd031b5344d7136695b623cec76143ada5c Gerrit-Change-Number: 8315 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer
[Impala-ASF-CR] Allow configuration of values passed into kerberos env vars
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8308 ) Change subject: Allow configuration of values passed into kerberos env vars .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1342/ -- To view, visit http://gerrit.cloudera.org:8080/8308 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4ce72c04ec4056dc89fb4c1c540a6fdaca4404 Gerrit-Change-Number: 8308 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 18 Oct 2017 07:46:14 + Gerrit-HasComments: No