[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating
Tianyi Wang has abandoned this change. ( http://gerrit.cloudera.org:8080/8410 ) Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter updating .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/8410 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Gerrit-Change-Number: 8410 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8410 ) Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter updating .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180 PS1, Line 1180: bloom_filter_.directory.clear(); > shrink_to_fit in the next line should free the memory. It's not enforced by It turns out string() has capacity 15 in GCC5.2 and 0 in GCC4.9. I think we should just track size() here since there is no guarantee on anything about capacity. I will abandon this. -- To view, visit http://gerrit.cloudera.org:8080/8410 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Gerrit-Change-Number: 8410 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Sat, 28 Oct 2017 05:06:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8410 ) Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter updating .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180 PS1, Line 1180: bloom_filter_.directory.clear(); > I think the bigger problem here is actually that the memory of the string i shrink_to_fit in the next line should free the memory. It's not enforced by the standard, but the standard didn't specify the capacity of a newly constructed string() as well. -- To view, visit http://gerrit.cloudera.org:8080/8410 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Gerrit-Change-Number: 8410 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Sat, 28 Oct 2017 05:00:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8410 ) Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter updating .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8410/1/be/src/runtime/coordinator.cc@1180 PS1, Line 1180: bloom_filter_.directory.clear(); I think the bigger problem here is actually that the memory of the string is not freed (meaning we have a big chunk of untracked memory!). You could create a new string on the stack and move() the contents into that string, so the memory is freed. Then I think the old assumption of 0 capacity is valid again. -- To view, visit http://gerrit.cloudera.org:8080/8410 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Gerrit-Change-Number: 8410 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Sat, 28 Oct 2017 04:41:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6118: Fix assertion failure in coordinator bloom filter updating
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8410 Change subject: IMPALA-6118: Fix assertion failure in coordinator bloom filter updating .. IMPALA-6118: Fix assertion failure in coordinator bloom filter updating Coordinator::UpdateFilter wrongly assumed std::string::capacity() to be 0 after std::string::clear(), resulting in an DCHECK failure in memory tracking. This patch uses std::string::shrink_to_fit, calculates the difference of the capacity and releases memory in the memory tracker accordingly. Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed --- M be/src/runtime/coordinator.cc 1 file changed, 7 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/8410/1 -- To view, visit http://gerrit.cloudera.org:8080/8410 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Gerrit-Change-Number: 8410 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 2: Any idea how this test succeeded in the pre-commit tests? -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 04:01:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 2: Actually nvm, I see it only fails i exhaustive. Makes sense. -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 04:02:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8405 ) Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py File bin/load-data.py: http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py@114 PS2, Line 114: # When HiveServer2 is configured to use "local" mode (i.e., MR jobs are run > Let's mention the HADOOP JIRA in case it's ever fixed and we can remove the Good idea. Done. -- To view, visit http://gerrit.cloudera.org:8080/8405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Gerrit-Change-Number: 8405 Gerrit-PatchSet: 3 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 03:48:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
Hello Joe McDonnell, Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8405 to look at the new patch set (#3). Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). .. IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). This is a revert of a revert, re-enabling parallel data load. It avoid the race condition by explicitly configuring the temporary directory in question in load-data.py. When the parallel data load change went in, we discovered a race with a signature of: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist The number in this path is milliseconds since the epoch, and the race occurs when two queries submitted to HiveServer2, running with the local runner, hit the same millisecond time stamp. The upstream bug is https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which is now marked as a dupe). I've tested this by running data load 5 times on the same machines where it failed before. I also ran data load manually and inspected the system to make sure that the temporary directories are getting created as expected in /tmp/impala-data-load-*. Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 --- M bin/load-data.py M testdata/bin/create-load-data.sh M testdata/bin/run-hive-server.sh M testdata/bin/run-step.sh 4 files changed, 59 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/3 -- To view, visit http://gerrit.cloudera.org:8080/8405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Gerrit-Change-Number: 8405 Gerrit-PatchSet: 3 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 02:51:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. IMPALA-5243: Speed up code gen for wide Avro tables. HdfsAvroScanner::CodegenMaterializeTuple generates a function linear in size to the number of columns. On 1000 column tables, codegen time is significant. This commit roughly halves it for wide columns. (Note that this had been much worse in recent history (<= Impala 2.9).) It does so by breaking up MaterializeTuple() into multiple smaller functions, and then calls them in order. When breaking up into 200-column chunks, there is a noticeable speed-up. I've made the helper code for generating LLVM function prototypes have a mutable function name, so that the builder can be re-used multiple times. I've checked by inspecting optimized LLVM that in the case where there's only 1 helper function, code gets inlined so that there doesn't seem to be an extra function. I measured codegen time for various "step sizes." The case where there are no helper functions is about 2.7s. The best case was about a step size of 200, with timings of 1.3s. For the query "select count(int_col16) from functional_avro.widetable_1000_cols", codegen times as a function of step size are roughly as follows. This is averaged across 5 executions, and rounded to 0.1s. step time 10 2.4 50 2.5 75 2.9 100 3.0 125 3.0 150 1.4 175 1.3 200 1.3 <-- chosen step size 225 1.5 250 1.4 300 1.6 400 1.6 500 1.8 1000 2.7 The raw data was generated like so, with some code that let me change the step size at runtime: $(for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do for try in $(seq 5); do echo $step > /tmp/step_size.txt; echo -n "$step "; impala-shell.sh -q "select count(int_col16) from functional_avro.widetable_1000_cols; profile;" 2> /dev/null | grep -A9 'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' | sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo; done; done) | tee out.txt ... 200 CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms ... 1000 CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms I have run the core tests with this change. Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Reviewed-on: http://gerrit.cloudera.org:8080/8211 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/codegen/llvm-codegen.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h 3 files changed, 158 insertions(+), 100 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 9 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. IMPALA-1575: Part 1: eagerly release query exec resources Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Reviewed-on: http://gerrit.cloudera.org:8080/8303 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/test-env.cc M be/src/service/client-request-state.cc 10 files changed, 140 insertions(+), 89 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 16: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 02:45:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1413/ -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 02:35:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 2: Code-Review+2 I'll +2 this to unblock our exhaustive tests. Please follow up with Alex on whether this is problem was expected b/c of IMPALA-886. -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 02:35:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Hello Lars Volker, Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8409 to look at the new patch set (#2). Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. IMPALA-6123: Fix column order of a query test in test_inline_view_limit Currently a "select *" query in test_inline_view_limit fails during exhaustive testing because Impala returns columns from HBase tables in a different order (IMPALA-886) than the one expected. This fix ensures the column order is consistent by specifying the output columns in the right order in the select query. Testing: Tested locally, with and without exhaustive exploration strategy. Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 --- M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8409/2 -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG@10 PS1, Line 10: because the output column order is different from : the expected one for it corresponding hbase table > how about: ... because Impala returns columns from HBase tables in a differ yes this sounds better. -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 02:29:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. IMPALA-6093: diagnostics for flaky TestHashJoinTimer We don't know the root cause yet but try to improve things: * Eliminate one possible cause of flakiness - unfinished fragments left from previous queries. * Print a profile if an assertion fails so we can see why it failed. Testing: Ran core tests. Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Reviewed-on: http://gerrit.cloudera.org:8080/8403 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M tests/query_test/test_hash_join_timer.py 1 file changed, 18 insertions(+), 6 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 01:55:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8409 ) Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8409/1//COMMIT_MSG@10 PS1, Line 10: because the output column order is different from : the expected one for it corresponding hbase table how about: ... because Impala returns columns from HBase tables in a different order (IMPALA-886). -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 01:19:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6123: Fix column order of a query test in test inline view limit
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8409 Change subject: IMPALA-6123: Fix column order of a query test in test_inline_view_limit .. IMPALA-6123: Fix column order of a query test in test_inline_view_limit Currently a "select *" query in test_inline_view_limit fails during exhaustive testing because the output column order is different from the expected one for it corresponding hbase table. This fix ensures the column order is consistent by specifying the output columns in the right order in the select query. Testing: Tested locally, with and without exhaustive exploration strategy. Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 --- M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8409/1 -- To view, visit http://gerrit.cloudera.org:8080/8409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Gerrit-Change-Number: 8409 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8405 ) Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py File bin/load-data.py: http://gerrit.cloudera.org:8080/#/c/8405/2/bin/load-data.py@114 PS2, Line 114: # When HiveServer2 is configured to use "local" mode (i.e., MR jobs are run Let's mention the HADOOP JIRA in case it's ever fixed and we can remove the workaround. -- To view, visit http://gerrit.cloudera.org:8080/8405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Gerrit-Change-Number: 8405 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 28 Oct 2017 00:14:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would fit into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 4 files changed, 387 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/3 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 3 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31 PS1, Line 31: to avoid this by first checking if the result would into int128. > fit? Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166 PS2, Line 166: LeadingZeroAdjustment > I think the name could make it clearer what it's computing. Maybe something Done. I like the new name. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170 PS2, Line 170: floor(log2(b)) > The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, Yes, exactly. Fixed the comment. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181 PS2, Line 181: same_sign > We usually make numeric/bool template arguments upper case to make it clear Separated this large function into 3 smaller ones, as you suggested. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185 PS2, Line 185: // This is expected to be handled by the caller. > What about the case when they're zero? That's also meant to be handled by t It wasn't possible for this condition to fail because at least one must be greater than zero for the following reasons: - If they are both zero, we will not be calling this function, because of the leading zero test. - If one is negative and the other is zero, we will invert it on line 327 I modified this check in the new implementation. Yes we do have cases, where we add zero. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272 PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale)); > A brief comment that this is guaranteed by the frontend migh help people ne Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277 PS2, Line 277: DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value"; > extra indent here Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287 PS2, Line 287: if (this_scale < other_scale) { > It might be slightly easier to follow if the branches were(delta_scale < 0) Rewrote the code to make it more clear. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299 PS2, Line 299: into > long line. Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305 PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, other_scale, > It feels a bit weird that we're calculating delta_scale above and inside Ad Delta scale is different in AdjustToSameScale(). There are two delta scales in this patch: between x and y and between max(x_scale, y_scale) and result_scale. What we are doing here and in AddLarge() is first adjust to the same scale, do the addition, then scale down to result_scale. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310 PS2, Line 310: if (delta_scale > 0) { > This might need a brief comment to explain why it's necessary (or perhaps t Added comment. http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java: http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228 PS2, Line 228:* TODO: implement V2 rules for ADD/SUB. > TODO not needed. Done -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 2 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Sat, 28 Oct 2017 00:02:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 12: (5 comments) I think this is very close. http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h@352 PS12, Line 352: MemTracker* dict_mem_tracker() { return scan_node_->mem_tracker(); } See comment in HdfsParquetTableWriter. Also, see how we pass the mem_tracker to data_page_pool_. http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h File be/src/exec/hdfs-parquet-table-writer.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h@81 PS12, Line 81: /// Memory tracker to track the memory used by encoder. : MemTracker* dict_mem_tracker(); When I'm reading the code, I'm thinking that this is hiding more than it is illuminating. We use this in exactly one place. I would rather have the exact code right there with a good comment explaining which mem_tracker we are using. http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@179 PS12, Line 179: dict_encoder_base_->ClearIndices(); : dict_encoder_base_->Close(); When Close() does a call to ClearIndices(), this can be simplified to just a Close() call. http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@322 PS12, Line 322:plain_encoded_value_size_, :parent_->dict_mem_tracker())); Nit: Indentation in this case should be even with the "D" in new DictEncoder. It goes against every editor's typical behavior, but it's the standard we have. Also, feel free to keep plain_encoded_value_size_ on the first line. (It's on the edge of our 90 char limit.) http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63 PS12, Line 63: void Close() { : ReleaseBytes(); I think Close() should do ClearIndices(). -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 12 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 23:59:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8372 ) Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG@18 PS3, Line 18: (unless done so explicitly). Better to say, "unless the test does so explicitly." -- To view, visit http://gerrit.cloudera.org:8080/8372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a Gerrit-Change-Number: 8372 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Fri, 27 Oct 2017 23:07:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 2: (2 comments) I'll punt to Sailesh for the answer to one of Henry's questions. http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml File docs/topics/impala_ssl.xml: http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@171 PS2, Line 171: This value is used in some organizations to disallow TLS 1.0 and 1.1. > This seems redundant, as that's what "Allow any TLS version of 1.2 higher." Hmm I was trying to come up a subtle way to indicate, "consider using this value if your organization is security-conscious". I'm not an expert on TLS/SSL vulns but I did turn up this one that suggests some problems are in both 1.0 and 1.1 but not 1.2. https://nakedsecurity.sophos.com/2013/02/07/boffins-crack-https-encryptionin-lucky-thirteen-attack/ http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@177 PS2, Line 177: TLSv1.2 may not work > How does it 'not work' - does the daemon fail to start, or does the daemon Good question for Sailesh! -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 23:04:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:55:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1412/ -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1411/ -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 16: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:54:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Hello Sailesh Mukil, Joe McDonnell, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8303 to look at the new patch set (#15). Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. IMPALA-1575: Part 1: eagerly release query exec resources Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/test-env.cc M be/src/service/client-request-state.cc 10 files changed, 140 insertions(+), 89 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/15 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 15 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 14: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87 PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) { > maybe explain that this is the test constructor Done -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:53:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (15 comments) http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@123 PS6, Line 123: until : // any in-flight RPC completes if the preceding RPC is still in-flight. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@135 PS6, Line 135: free frees http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@258 PS6, Line 258: stack code http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@261 PS6, Line 261: stack code http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@264 PS6, Line 264: thread threads http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@279 PS6, Line 279: thread threads http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@104 PS6, Line 104: CachedProtobufRowBatch OutboundRowBatch http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@144 PS6, Line 144: /// Populate a row batch from a serialized protobuf input_batch by copying : /// input_batch's tuple_data into the row batch's mempool and converting all : /// offsets in the data back into pointers. stale http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@455 PS6, Line 455: tuple_offsets input_* http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@531 PS6, Line 531: CachedProtobufRowBatch OutboundProtoRowBatch http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc File be/src/util/network-util.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc@41 PS6, Line 41: using kudu::Sockaddr; undo. Bad rebase. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/util/network-util.cc@120 PS6, Line 120: Sockaddr sock; undo http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto File common/protobuf/data_stream_service.proto: http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@38 PS6, Line 38: 'tuple_offsets' tuple offsets' buffer http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@43 PS6, Line 43: 'tuple_data' tuple data's buffer http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto File common/protobuf/row_batch.proto: http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto@25 PS6, Line 25: The indices of the sidecars are included in the header below. delete -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 27 Oct 2017 22:41:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE
Tim Wood has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8372 ) Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE .. IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE The .test files in this commit are ones held out from IMPALA-5376 because of observed anomalies with returned decimal precision that I previously controlled with TRUNCATE(). This ticket was filed after team members expressed a preference not to use TRUNCATE() and to use actual results generated within numerical range of expected, unless the results could not be controlled otherwise. Rounds of testing for this commit show that the earlier anomalies no longer occur, clearing the way for their inclusion in our TPC-DS suite. It has been observed that these tests tend to fail with the DECIMAL_V2 option set (unless done so explicitly). Testing: The gerrit_verify_dryrun_external (build #24) job passes with this change. The fix (rebased hereon) to IMPALA-6106 (finally) cleared up a source of flapping for these tests. Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a --- A testdata/workloads/tpcds/queries/tpcds-q26.test A testdata/workloads/tpcds/queries/tpcds-q28.test A testdata/workloads/tpcds/queries/tpcds-q47.test A testdata/workloads/tpcds/queries/tpcds-q57.test A testdata/workloads/tpcds/queries/tpcds-q59.test A testdata/workloads/tpcds/queries/tpcds-q63.test M tests/query_test/test_tpcds_queries.py 7 files changed, 837 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8372/3 -- To view, visit http://gerrit.cloudera.org:8080/8372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a Gerrit-Change-Number: 8372 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 14: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87 PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) { maybe explain that this is the test constructor -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:21:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. IMPALA-2758: Remove BufferedTupleStream::GetRows This patch removes BufferedTupleStream::GetRows. This function pins a stream and reads all the rows into a single batch. It is not a good API since it creates an arbitrarily large row batch. In this patch the call sites pin the stream and then directly use GetNext to retrieve a single batch at a time. Testing: It passes existing tests. A test case for GetRows is removed. Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Reviewed-on: http://gerrit.cloudera.org:8080/8226 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h 5 files changed, 63 insertions(+), 99 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:15:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1410/ -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:12:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 3: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:11:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:04:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3436: Return a decimal when rounding a double
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8398 ) Change subject: IMPALA-3436: Return a decimal when rounding a double .. Patch Set 1: (13 comments) http://gerrit.cloudera.org:8080/#/c/8398/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8398/1//COMMIT_MSG@16 PS1, Line 16: This is implemented by introducing a "hack" where we make the parser How about we say rewrite or translation instead of hack? I don't think it's really a terrible hack - it's a minimal change. http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc File be/src/exprs/decimal-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc@111 PS1, Line 111: DCHECK(!scale.is_null); Add a FE test case in AnalyzeExprsTest.java that makes sure this is enforced. The existing round() functions that take a DECIMAL argument must have a non-NULL, constant second argument. New tests should be added to make sure that is also enforced for your new round() function. There's also a subtle change in behavior we should consider. The existing round() function accepts non-constant arguments, should the new round function() also allow that or not? http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions-ir.cc@116 PS1, Line 116: bool ovf = false; overflow http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions.h File be/src/exprs/decimal-functions.h: http://gerrit.cloudera.org:8080/#/c/8398/1/be/src/exprs/decimal-functions.h@58 PS1, Line 58: FunctionContext*, const DoubleVal&, const IntVal&); add arg names for consistency http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@284 PS1, Line 284: [['round_v1','dround_v1'], Why keep the dround_v1 alias? http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@286 PS1, Line 286: [['round','dround'], 'DECIMAL', ['DOUBLE', 'INT'], 'impala::DecimalFunctions::RoundTo'], Can you organize these and comment on whether these are used in v1/v2 or both? Seems confusing now. http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@287 PS1, Line 287: [['round','dround','round_v1','dround_v1'], Why keep the dround_v1 alias? http://gerrit.cloudera.org:8080/#/c/8398/1/common/function-registry/impala_functions.py@403 PS1, Line 403: [['round','dround','round_v1','dround_v1'], Why keep the dround_v1 alias in these? http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/cup/sql-parser.cup@70 PS1, Line 70: private TQueryOptions queryOptions; add TODO to remove when decimal v1 is gone, maybe come up with a specific thing you can grep for later in the code, e.g. DECIMAL_V1 and use that consistently in the TODOs for removal http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@100 PS1, Line 100: FunctionCallExpr functionCallExpr = new FunctionCallExpr(fnName, params); newline http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@111 PS1, Line 111: // nullif(x, y) -> if(x DISTINCT FROM y, x, NULL) newline http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@385 PS1, Line 385:* This can only be called for functions that return wildcard decimals and the first Is this comment still accurate? http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@407 PS1, Line 407: // The situtation where none of the parameters are decimal, but the return type is * Shrink comment to: None of the parameters are decimal but the return type is decimal. * It's not clear to me why we'd use (38,0) in this case, can you explain? For example, round(double, int) does not have decimal args. Shouldn't the return type be (38,X) where X is the value of the second arg if it is a constant? -- To view, visit http://gerrit.cloudera.org:8080/8398 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I240ec70da7996bbfefcf6438eba4dff8d33d7bd6 Gerrit-Change-Number: 8398 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Fri, 27 Oct 2017
[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
Hello Joe McDonnell, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8405 to look at the new patch set (#2). Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). .. IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). This is a revert of a revert, re-enabling parallel data load. It avoid the race condition by explicitly configuring the temporary directory in question in load-data.py. When the parallel data load change went in, we discovered a race with a signature of: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist The number in this path is milliseconds since the epoch, and the race occurs when two queries submitted to HiveServer2, running with the local runner, hit the same millisecond time stamp. The upstream bug is https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which is now marked as a dupe). I've tested this by running data load 5 times on the same machines where it failed before. I also ran data load manually and inspected the system to make sure that the temporary directories are getting created as expected in /tmp/impala-data-load-*. Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 --- M bin/load-data.py M testdata/bin/create-load-data.sh M testdata/bin/run-hive-server.sh M testdata/bin/run-step.sh 4 files changed, 58 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/2 -- To view, visit http://gerrit.cloudera.org:8080/8405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Gerrit-Change-Number: 8405 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8405 Change subject: IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). .. IMPALA-6108, IMPALA-6070: Parallel data load (re-instated). This is a revert of a revert, re-enabling parallel data load. It avoid the race condition by explicitly configuring the temporary directory in question in load-data.py. When the parallel data load change went in, we discovered a race with a signature of: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist The number in this path is milliseconds since the epoch, and the race occurs when two queries submitted to HiveServer2, running with the local runner, hit the same millisecond time stamp. The upstream bug is https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which is now marked as a dupe). I've tested this by running data load 5 times on the same machines where it failed before. I also ran data load manually and inspected the system to make sure that the temporary directories are getting created as expected in /tmp/impala-data-load-*. Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 --- M bin/load-data.py M testdata/bin/create-load-data.sh M testdata/bin/run-hive-server.sh M testdata/bin/run-step.sh 4 files changed, 61 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8405/1 -- To view, visit http://gerrit.cloudera.org:8080/8405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Gerrit-Change-Number: 8405 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: > Patch Set 8: > > > Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is > > not in pre-review-test, so this was my first experience with that. > > You can also use > https://jenkins.impala.io/view/Utility/job/gerrit-verify-dryrun-external/ , > which does run clang-tidy. Thanks for the pointer. I'm giving it a shot at https://jenkins.impala.io/job/gerrit-verify-dryrun-external/25/ My local clang-tidy build was clean: $ bin/run_clang_tidy.sh | tee clang_tidy.txt $ cat clang_tidy.txt | grep ']' $ -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 21:39:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: > Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is > not in pre-review-test, so this was my first experience with that. You can also use https://jenkins.impala.io/view/Utility/job/gerrit-verify-dryrun-external/ , which does run clang-tidy. -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 21:32:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 8: TPCDS results: +-++++-++++-+---+ | Workload| Query | File Format| Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +-++++-++++-+---+ | TPCDS(_300) | TPCDS-Q73 | kudu / none / none | 12.01 | 7.99| R +50.42% | 0.31%| 1.04%| 1 | 4 | | TPCDS(_300) | TPCDS-Q68 | kudu / none / none | 17.02 | 13.00 | R +30.97% | 0.82%| 0.26%| 1 | 4 | | TPCDS(_300) | TPCDS-Q79 | kudu / none / none | 16.61 | 12.95 | R +28.31% | 0.56%| 0.54%| 1 | 4 | | TPCDS(_300) | TPCDS-Q1 | kudu / none / none | 2.36 | 1.99| +18.19% | 2.86%| 1.38%| 1 | 4 | | TPCDS(_300) | TPCDS-Q53 | kudu / none / none | 2.18 | 2.08| +4.96% | 0.32%| 1.98%| 1 | 4 | | TPCDS(_300) | TPCDS-Q8 | kudu / none / none | 4.03 | 3.93| +2.38% | 0.43%| 0.38%| 1 | 4 | | TPCDS(_300) | TPCDS-Q98 | kudu / none / none | 7.80 | 7.67| +1.72% | 0.66%| 0.81%| 1 | 4 | | TPCDS(_300) | TPCDS-Q7 | kudu / none / none | 3.75 | 3.71| +1.15% | 1.14%| 1.17%| 1 | 4 | | TPCDS(_300) | TPCDS-Q4 | kudu / none / none | 29.59 | 29.27 | +1.11% | 4.13%| 3.67%| 1 | 4 | | TPCDS(_300) | TPCDS-Q65 | kudu / none / none | 13.97 | 14.01 | -0.28% | 2.70%| 1.43%| 1 | 4 | | TPCDS(_300) | TPCDS-Q28 | kudu / none / none | 4.06 | 4.08| -0.35% | 0.84%| 0.48%| 1 | 4 | | TPCDS(_300) | TPCDS-Q89 | kudu / none / none | 2.46 | 2.48| -0.62% | 2.60%| 1.81%| 1 | 4 | | TPCDS(_300) | TPCDS-Q55 | kudu / none / none | 2.44 | 2.46| -0.83% | 1.08%| 5.41%| 1 | 4 | | TPCDS(_300) | TPCDS-Q42 | kudu / none / none | 2.42 | 2.47| -1.89% | 0.26%| 4.50%| 1 | 4 | | TPCDS(_300) | TPCDS-Q23 | kudu / none / none | 90.96 | 95.30 | -4.55% | 1.06%| 2.21%| 1 | 4 | | TPCDS(_300) | TPCDS-Q43 | kudu / none / none | 5.05 | 5.34| -5.49% | 0.89%| 4.17%| 1 | 4 | | TPCDS(_300) | TPCDS-Q3 | kudu / none / none | 4.11 | 4.36| -5.84% | 2.48%| 1.15%| 1 | 4 | | TPCDS(_300) | TPCDS-Q61 | kudu / none / none | 8.56 | 9.93| -13.83% | 0.78%| 1.84%| 1 | 4 | | TPCDS(_300) | TPCDS-Q2 | kudu / none / none | 12.19 | 15.68 | I -22.28% | * 27.57% * | 1.11%| 1 | 4 | | TPCDS(_300) | TPCDS-Q47 | kudu / none / none | 16.82 | 21.82 | I -22.91% | 1.16%| 1.25%| 1 | 4 | | TPCDS(_300) | TPCDS-Q19 | kudu / none / none | 4.78 | 6.32| I -24.29% | 1.14%| 0.97%| 1 | 4 | | TPCDS(_300) | TPCDS-Q46 | kudu / none / none | 6.58 | 8.77| I -24.88% | 0.86%| 1.01%| 1 | 4 | | TPCDS(_300) | TPCDS-Q88 | kudu / none / none | 14.89 | 19.84 | I -24.95% | 0.17%| 3.79%| 1 | 4 | | TPCDS(_300) | TPCDS-Q59 | kudu / none / none | 24.05 | 35.61 | I -32.45% | * 11.97% * | * 20.61% * | 1 | 4 | | TPCDS(_300) | TPCDS-Q34 | kudu / none / none | 3.94 | 6.82| I -42.24% | 2.37%| 4.12%| 1 | 4 | | TPCDS(_300) | TPCDS-Q27 | kudu / none / none | 3.97 | 7.91| I -49.85% |
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Henry Robinson has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml File docs/topics/impala_ssl.xml: http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@171 PS2, Line 171: This value is used in some organizations to disallow TLS 1.0 and 1.1. This seems redundant, as that's what "Allow any TLS version of 1.2 higher." means. http://gerrit.cloudera.org:8080/#/c/8401/2/docs/topics/impala_ssl.xml@177 PS2, Line 177: TLSv1.2 may not work How does it 'not work' - does the daemon fail to start, or does the daemon silently accept connections <1.2? -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 21:03:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359 PS11, Line 359: was not available > Okay, but my original question was whether the comment should be reworded f Done http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95 PS11, Line 95: if (!released_exec_resources_) ReleaseExecResources(); > What I proposed was not to handle it in Init(), but let the caller handle i Should have read it more carefully, sorry. I try to avoid function APIs that require callers to undo side-effects of a failed function - that seems error-prone in general but I guess it is ok in this specific case. http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88 PS11, Line 88: local_query_state_->AcquireExecResourceRefcount(); // Decremented in ReleaseResources(). > Why are exvaluating exprs query-wide resources. Aren't they per-thread reso I'd consider per-thread execution resources to be a subset of query-wide execution resources. I.e. releasing a query's execution resources while threads are still consuming execution resources is not valid in my mind. The comments in QueryState weren't very clear about what holding a refcount meant so I tried to improve them. -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 20:57:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Hello Sailesh Mukil, Joe McDonnell, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8303 to look at the new patch set (#14). Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. IMPALA-1575: Part 1: eagerly release query exec resources Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/test-env.cc M be/src/service/client-request-state.cc 10 files changed, 138 insertions(+), 89 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8303/14 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 8: Oof. GVO caught a (correct) clang-tidy issue. Alas, clang-tidy is not in pre-review-test, so this was my first experience with that. I've made the trivial change (removed a const in llvm-codegen.h) and am running it through the relevant gauntlet locally. -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 8 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 20:56:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 ) Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro .. Patch Set 14: Code-Review+1 Thanks. This approach looks okay to me, please see if Michael can do the +2 review. -- To view, visit http://gerrit.cloudera.org:8080/8146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Gerrit-Change-Number: 8146 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 20:33:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1407/ -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 20:27:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@75 PS3, Line 75: g two > see comment in row-batch.h about this terminology. Done http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@178 PS3, Line 178: // The two OutboundRowBatch which are re-used across multiple RPCs. Each entry contains : // a RowBatchHeaderPB and the buffers for the serialized tuple offsets and data. When : // one is being used for an in-flight RPC, the execution thread continues to run and : // serializes another row batch into the other entry. 'current_batch_idx_' is the index : // of the entry being used by the in-flight or last completed RPC. : // : // TODO: replace this with an ac > We need to access these fields from the callback (e.g. due to retry). req_ removed in PS5. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@52 PS3, Line 52: class OutboundRowBatch { > ProtoRowBatch is a conceptual representation of a serialized row batch in b Removed in PS5. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@89 PS3, Line 89: > Had hard time coming up with a good name to indicate the re-use of the vect Renamed to OutboundRowBatch in PS5. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@431 PS3, Line 431: /// : /// 'uncompressed_size': Updated with the uncompressed size of 'tuple_data'. : /// : /// 'is_compressed': Sets to true if compression is applied on 'tuple_data'. : /// False otherwise. : /// : /// Returns error status if serialization failed. Returns OK otherwise. : /// TODO: clean this up once the thrift RPC implementation is removed. : Status Serialize(bool full_dedup, vector* tuple_offsets, string* tuple_data, : int64_t* uncompressed_size, bool* is_compressed); : : /// Shared implementation between thrift and protobuf to deserialize a row batch. : /// : /// 'input_tuple_offsets': an int32_t array of tuples; offsets into 'input_tuple_data'. : /// Used for populating the tuples in the row batch with actual pointers. : /// : /// 'input_tuple_data': contains pointer and size of tuples' data buffer. : /// If 'is_compressed' is true, the data is compressed. : /// : /// 'uncompressed_size': the uncompressed size of 'input_tuple_data' if it's compressed. : /// : /// 'is_compressed': True if 'input_tuple_data' is compressed. : /// : /// TODO: clean this up once the thrift RPC implementation is removed. : void Deserialize(const kudu::Slice& tuple_offsets, const kudu::Slice& tuple_data, : int64_t uncompressed_size, bool is_compressed); : : typedef FixedSizeHashTableDedupMap; : : /// The total size of all data represented in this row batch (tuples and referenced : /// string and collection data). This is the size of the row batch after removing all : /// gaps in the auxiliary and deduplicated tuples (i.e. the smallest footprint for the : /// row batch). If the distinct_tuples argument is non-null, full deduplication is : /// enabled. The distinct_tuples map must be empty. : int64_t TotalByteSize(DedupMap* distinct_tuples); : : void SerializeInternal(int64_t size, DedupMap* distinct_tuples, : vector* tuple_offsets, string* tuple_data); : : /// All members below need to be handled in R > let's leave a TODO about cleaning this all up once we can remove the thrift TODO added. Cannot file a JIRA as apache JIRA is being re-indexed. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/service/data-stream-service.cc@64 PS3, Line 64: : : : : : : : : > see comment in row-batch.h. I think we should do this later
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8023 to look at the new patch set (#6). Change subject: IMPALA-4856: Port data stream service to KRPC .. IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multi-plex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also prevents some issues with thrift RPC in which a thread may be stuck in an RPC call without being able to check for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, there are 64 service threads. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC to block for extended period of time when the receiver is not yet created when the call arrives. In KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to exceed, the request will be stashed in a blocked_sender_ queue to be processed later. The stashed RPC request will not be responded to until it is processed so as to exert back pressure to the client. An alternative would be to reply with an error and the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING --- * Build passed with FLAGS_use_krpc=true. TO DO - * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 --- M be/src/common/status.cc M be/src/common/status.h M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/kudu-util.h M be/src/rpc/CMakeLists.txt M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/CMakeLists.txt M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h A be/src/runtime/krpc-data-stream-sender.cc A be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/CMakeLists.txt A be/src/service/data-stream-service.cc A be/src/service/data-stream-service.h M be/src/service/impala-server.cc M be/src/util/network-util.cc M cmake_modules/FindProtobuf.cmake M common/protobuf/CMakeLists.txt A common/protobuf/common.proto A common/protobuf/data_stream_service.proto A common/protobuf/row_batch.proto M common/thrift/generate_error_codes.py 34 files changed, 2,932 insertions(+), 175 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/6 -- To view, visit
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359 PS11, Line 359: was not available > We do call it on the process MemTracker in QueryState::Init(). Okay, but my original question was whether the comment should be reworded for that case: ... if this tracker is part of the query hierarchy. i.e. "not available" seems misleading now that we don't get this from 'state' (which is the thing that wasn't always available). http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95 PS11, Line 95: if (!released_exec_resources_) ReleaseExecResources(); > My initial concern was that handling it in Init() would get messy in the ca What I proposed was not to handle it in Init(), but let the caller handle it. i.e. Init() always gets a resource ref count and QueryExecMgr::StartQuery() has to handle the error. It already has to handle this and other errors and release ref counts if it won't be passing them along... Is there a reason to not just do that? http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88 PS11, Line 88: local_query_state_->AcquireExecResourceRefcount(); // Decremented in ReleaseResources(). > The resources used for evaluating exprs, etc - this is used in various test Why are exvaluating exprs query-wide resources. Aren't they per-thread resources? -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 19:37:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 2: (4 comments) I included the "long-form" description of the RHEL/CentOS issue from Sailesh's email. http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml File docs/topics/impala_ssl.xml: http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@145 PS1, Line 145: . In , you can use startup : options for the impal > I assume you meant catalogd and statestored in 2 of these? Done http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@158 PS1, Line 158: > mention this the default may be? Right, the default is stated on the next line. http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@177 PS1, Line 177: s of , TLSv1.2 may not work for > Elaborate on the default state a little bit: "Default is empty and then Imp Done http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@182 PS1, Line 182: > Could you also please add a note saying that TLSv1.2 may not work on CentOS Done -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 19:35:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Hello Bharath Vissapragada, Henry Robinson, Michael Brown, Sailesh Mukil, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8401 to look at the new patch set (#2). Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. IMPALA-5473: [DOCS] Document TLS min version & cipher options Under the doc JIRA IMPALA-6065. Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e --- M docs/topics/impala_ssl.xml 1 file changed, 71 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/8401/2 -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
John Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12 PS1, Line 12: In this patch, the beharior is changed so that an error is produced in typo: beharior should be behavior -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 27 Oct 2017 19:16:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. Patch Set 6: Code-Review+2 Glad we have automated checking for dropped status now, it's super-easy to miss. -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 18:47:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1408/ -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 18:46:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. Patch Set 6: Sorry I forgot to handle some error. -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 18:35:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Hello Thomas Tauber-Marshall, Tim Armstrong, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8226 to look at the new patch set (#6). Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. IMPALA-2758: Remove BufferedTupleStream::GetRows This patch removes BufferedTupleStream::GetRows. This function pins a stream and reads all the rows into a single batch. It is not a good API since it creates an arbitrarily large row batch. In this patch the call sites pin the stream and then directly use GetNext to retrieve a single batch at a time. Testing: It passes existing tests. A test case for GetRows is removed. Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 --- M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h 5 files changed, 63 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/8226/6 -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8023 to look at the new patch set (#5). Change subject: IMPALA-4856: Port data stream service to KRPC .. IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multi-plex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also prevents some issues with thrift RPC in which a thread may be stuck in an RPC call without being able to check for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, there are 64 service threads. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC to block for extended period of time when the receiver is not yet created when the call arrives. In KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to exceed, the request will be stashed in a blocked_sender_ queue to be processed later. The stashed RPC request will not be responded to until it is processed so as to exert back pressure to the client. An alternative would be to reply with an error and the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING --- * Build passed with FLAGS_use_krpc=true. TO DO - * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 --- M be/src/common/status.cc M be/src/common/status.h M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/kudu-util.h M be/src/rpc/CMakeLists.txt M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/CMakeLists.txt M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h A be/src/runtime/krpc-data-stream-sender.cc A be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/CMakeLists.txt A be/src/service/data-stream-service.cc A be/src/service/data-stream-service.h M be/src/service/impala-server.cc M be/src/util/network-util.cc M cmake_modules/FindProtobuf.cmake M common/protobuf/CMakeLists.txt A common/protobuf/common.proto A common/protobuf/data_stream_service.proto A common/protobuf/row_batch.proto M common/thrift/generate_error_codes.py 34 files changed, 2,929 insertions(+), 175 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/5 -- To view, visit
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py File tests/query_test/test_hash_join_timer.py: http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py@103 PS2, Line 103: for impalad in ImpalaCluster().impalads: : verifier = MetricVerifier(impalad.service) : verifier.wait_for_metric("impala-server.num-fragments-in-flight", 0) nice addition! -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:46:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml File docs/topics/impala_ssl.xml: http://gerrit.cloudera.org:8080/#/c/8401/1/docs/topics/impala_ssl.xml@182 PS1, Line 182: Could you also please add a note saying that TLSv1.2 may not work on CentOS/RHEL 6, even if OpenSSL 1.0.1 is available? It is due to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1497859 We are still working on a fix. -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 1 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:39:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:34:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 ) Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro .. Patch Set 14: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Gerrit-Change-Number: 8146 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:33:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 2: I implemented Bikram's suggestion from the JIRA. -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:09:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py File tests/query_test/test_hash_join_timer.py: http://gerrit.cloudera.org:8080/#/c/8403/2/tests/query_test/test_hash_join_timer.py@150 PS2, Line 150: assert (asyn_build), "Join is not prepared asynchronously: {0}".format(profile) To test that the profile was printed ok, I modified this to always fail. -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 17:08:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6093: diagnostics for flaky TestHashJoinTimer
Tim Armstrong has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8403 ) Change subject: IMPALA-6093: diagnostics for flaky TestHashJoinTimer .. IMPALA-6093: diagnostics for flaky TestHashJoinTimer We don't know the root cause yet but try to improve things: * Eliminate one possible cause of flakiness - unfinished fragments left from previous queries. * Print a profile if an assertion fails so we can see why it failed. Testing: Ran core tests. Change-Id: Ic33296931db807abb960db43b99e5fd0f256 --- M tests/query_test/test_hash_join_timer.py 1 file changed, 18 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/8403/2 -- To view, visit http://gerrit.cloudera.org:8080/8403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic33296931db807abb960db43b99e5fd0f256 Gerrit-Change-Number: 8403 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1407/ -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 16:38:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 16:38:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8211 ) Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. .. Patch Set 7: Code-Review+1 Carrying the +1. I've not made any changes here (except rebases). I've run the core tests successfully. I had run into https://issues.apache.org/jira/browse/IMPALA-6118 and https://issues.apache.org/jira/browse/IMPALA-6106 in previous testing. I didn't see any evidence that this change breaks the build specifically. The specific failed GVO build had the following assertion. That's IMPALA-6099, which is also reasonably unrelated to the Avro-only changes here. F1023 19:06:33.677640 43548 filter-context.cc:49] Check failed: it != counters.end() Tried to increment unknown counter group *** Check failure stack trace: *** @ 0x2f1e11d google::LogMessage::Fail() F1023 19:06:33.677934 43350 filter-context.cc:49] Check failed: it != counters.end() Tried to increment unknown counter group *** Check failure stack trace: *** @ 0x2f1f9c2 google::LogMessage::SendToLog() @ 0x2f1daf7 google::LogMessage::Flush() @ 0x2f1e11d google::LogMessage::Fail() @ 0x2f210be google::LogMessageFatal::~LogMessageFatal() @ 0x2f1f9c2 google::LogMessage::SendToLog() @ 0x1ca4d99 impala::FilterStats::IncrCounters() @ 0x1ca72a4 impala::FilterContext::CheckForAlwaysFalse() @ 0x2f1daf7 google::LogMessage::Flush() @ 0x1aa537d impala::HdfsScanNodeBase::PartitionPassesFilters() @ 0x2f210be google::LogMessageFatal::~LogMessageFatal() @ 0x1b0c9c6 impala::HdfsParquetScanner::GetNextInternal() @ 0x1ca4d99 impala::FilterStats::IncrCounters() @ 0x1b0acca impala::HdfsParquetScanner::ProcessSplit() @ 0x1a99cc0 impala::HdfsScanNode::ProcessSplit() @ 0x1ca72a4 impala::FilterContext::CheckForAlwaysFalse() @ 0x1a99136 impala::HdfsScanNode::ScannerThread() @ 0x1a985d3 _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_17ThreadResourceMgr12ResourcePoolEENKUlvE_clEv @ 0x1aa537d impala::HdfsScanNodeBase::PartitionPassesFilters() @ 0x1a9a501 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_17ThreadResourceMgr12ResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE @ 0x171bdfc boost::function0<>::operator()() @ 0x1b0c9c6 impala::HdfsParquetScanner::GetNextInternal() @ 0x19f3393 impala::Thread::SuperviseThread() @ 0x1b0acca impala::HdfsParquetScanner::ProcessSplit() @ 0x19fbf26 boost::_bi::list4<>::operator()<>() @ 0x1a99cc0 impala::HdfsScanNode::ProcessSplit() @ 0x19fbe69 boost::_bi::bind_t<>::operator()() @ 0x19fbe2c boost::detail::thread_data<>::run() @ 0x1a99136 impala::HdfsScanNode::ScannerThread() @ 0x1a985d3 _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_17ThreadResourceMgr12ResourcePoolEENKUlvE_clEv @ 0x20a7c9a thread_proxy @ 0x1a9a501 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_17ThreadResourceMgr12ResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE @ 0x7f0cd8e3a6ba start_thread @ 0x7f0cd8b703dd clone -- To view, visit http://gerrit.cloudera.org:8080/8211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Gerrit-Change-Number: 8211 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 16:17:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8311 ) Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc@5959 PS2, Line 5959: "cast(trunc(cast('2012-09-10 01:02:03' as timestamp), 'DAY') as string)", > I don't think the additional test cases add any value unless you add greate Done http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift File common/thrift/Exprs.thrift: http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift@92 PS2, Line 92: MILLISECONDS, > We shouldn't have redundant field definitions here; instead we should be le I think we need them even though they look like redundant. See my comment at line 194: https://gerrit.cloudera.org/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc -- To view, visit http://gerrit.cloudera.org:8080/8311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9 Gerrit-Change-Number: 8311 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 27 Oct 2017 09:43:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC
Hello Greg Rahn, Zach Amsden, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8311 to look at the new patch set (#3). Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC .. IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC Return type of EXTRACT/DATE_PART is changed from INT to BIGINT because INT data type cannot cover NANOSECOND's value. * Add the following fields to EXTRACT and DATE_PART: WEEK DOW DOY SECOND/SECONDS MICROSECOND/MICROSECONDS NANOSECOND/NANOSECONDS * Add the following field to TRUNC: SS * Testing: Extend unit tests in expr-test Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9 --- M be/src/exprs/expr-test.cc M be/src/exprs/udf-builtins-ir.cc M be/src/exprs/udf-builtins.cc M be/src/exprs/udf-builtins.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift 6 files changed, 210 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/8311/3 -- To view, visit http://gerrit.cloudera.org:8080/8311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9 Gerrit-Change-Number: 8311 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Zach Amsden