[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: PREVIEW: build ORC C++ lib in toolchain .. Patch Set 3: Code-Review+1 LGTM. I can build the orc patch based on this. Thank you for solving these! -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 07:47:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9331 ) Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1941/ -- To view, visit http://gerrit.cloudera.org:8080/9331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 Gerrit-Change-Number: 9331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 07:14:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9331 ) Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 Gerrit-Change-Number: 9331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 06:49:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 25: (33 comments) http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc@1445 PS25, Line 1445: ; << "Already provided a buffer" was nice http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@84 PS25, Line 84: GetNextRange: returns to the caller the next scan range it should process. : /// This is based on disk load. This also begins reading the data in this scan : /// range. This is blocking. that could probably use updating? or maybe rephrase to talk about states or something rather than API calls. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@109 PS25, Line 109: Memory usage in the IoMgr comes from queued read buffers. If we queue the minimum : /// (i.e. 1), then the disks are idle while we are processing the buffer. If we don't : /// limit the queue, then it possible we end up queueing the entire data set (i.e. CPU : /// is slower than disks) and run out of memory. this seems a bit outdated http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@116 PS25, Line 116: In that case each query will need fewer queued buffers and : /// therefore have less memory usage. : // and that http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@136 PS25, Line 136: a client buffer provided by the caller : /// when constructing the scan range. in that case, is it required that the single buffer can fit the entire range? http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@147 PS25, Line 147: buffers are owned how does a buffer become owned by the caller? I guess that means when it's returned by GetNext() but not yet ReturnBuffer()? How does the caller know how many buffers there are in play in order to satisfy this requirement? or does it have to just conservatively assume that it should only "own" one buffer at a time? So it seems we effectively still have a client cooperation requirement in the new code (which I think is okay). It's just that instead of leading to a reservation violation, it would lead to a resource deadlock. I do agree the waiting on buffers rather than on queue space is an improvement though. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@149 PS25, Line 149: it can allocate at least three : /// max-sized buffers per scan range the API below (AllocateBuffersForRange()) doesn't seem to give the caller (explicit) control over how many buffers are allocated. I guess it's implicit because of the maximum IO buffer size? http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@269 PS25, Line 269: , "on return" or "If 'needs_buffers' is returned as true" to make it clearer this is an out-only param http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@282 PS25, Line 282: range is : /// available what does it mean for a range to be available (from an API perspective)? http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@288 PS25, Line 288: 'needs_buffers' is set to true this returns true in '*needs_buffers' http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@300 PS25, Line 300: allocate allocated http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@474 PS25, Line 474: /// Reads the specified scan range and calls HandleReadFinished() when done. this should probably mention how it might not do anything if there's no buffer available http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@375 PS25, Line 375: // Add the range to the queue of the disk the range is on that doesn't seem to match the code block http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@387 PS25, Line 387: // Can't schedule until the caller allocates some buffers for the range. that comment seems to talk about only one case handled here http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@391 PS25, Line 391: *needs_buffers ? ScheduleMode::BY_CALLER why is this case different than when AddScanRanges() has NO_BUFFER (which does UPON_GETNEXT)? I guess in this case we don't want it to go through the next_sc
[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9154 ) Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala .. Patch Set 3: (4 comments) Thanks for fixing this, it would be good to get this merged. The test coverage looks good and the fix is valid, I just noticed the test may be flaky and the structure of the error handling in the code could be improved to reduce the chance of similar mistakes in future. http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc@223 PS3, Line 223: CreateFromMemory This function also has the same bad pattern of not closing 'codegen' on errors - we should probably fix that too while we're here. http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc@1315 PS3, Line 1315: Status status = CreateFromFile(nullptr, &pool, nullptr, file, module_id, &codegen); CreateFromFile() should really close the LlvmCodegen object on failure - it's awkward requiring the caller to do it. Might be easier to use the "goto error" pattern with a cleanup block at the end of the CreateFromFile() function. http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py@332 PS3, Line 332: with open("bad_udf.ll", "w") as f: The local and HDFS file creation will race if test_udf_errors() runs with itself concurrently. Would be best to use tempfile.mkstemp() for the local file and include the unique_database string in the path of the HDFS file. http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py@338 PS3, Line 338: check_call Use call() instead of check_call(), since we want to execute the next cleanup step even on errors. -- To view, visit http://gerrit.cloudera.org:8080/9154 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb Gerrit-Change-Number: 9154 Gerrit-PatchSet: 3 Gerrit-Owner: anujphadke Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Thu, 15 Feb 2018 06:29:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 10 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 05:04:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ The following 2 locks have shown to be frequent points of contention on recent perf runs: - qs_map_lock_ - client_request_state_map_lock_ Since these are process wide locks, any threads waiting on these locks potentially slow down the runtime of a query. I tried to address this previously by converting the client_request_state_map_lock_ to a reader-writer lock. This showed great perf improvements in the general case, however, there were edge cases with big regressions as well. In the general case, strict readers of the map got through so quickly that we were able to see a reduction in the number of client connections created, since this lock was contended for in the context of Thrift threads too. The bad case is when writers were starved trying to register a new query since there were so many readers. Changing the starve option resulted in worse read performance all round. Another approach which is relatively simpler is to shard the locks, which proves to be very effective with no regressions. The maps and locks are sharded to a default of 4 buckets initally. Query IDs are created by using boost::uuids::random_generator. We use the high bits of a query ID to assign queries to buckets. I verified that the distribution of the high bits of a query ID are even across buckets on my local machine: For 10,000 queries sharded across 4 buckets, the distribution was: bucket[0]: 2500 bucket[1]: 2489 bucket[2]: 2566 bucket[3]: 2445 A micro-benchmark is added to measure the improvement in performance. This benchmark creates multiple threads each of which creates a QueryState and accesses it multiple times. We can see improvements in the range 2x - 3.5x. BEFORE: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 1ms Total Time (#Queries: 50 #Accesses: 100) : 8ms Total Time (#Queries: 50 #Accesses: 1000) : 54ms Total Time (#Queries: 500 #Accesses: 100) : 76ms Total Time (#Queries: 500 #Accesses: 1000) : 543ms AFTER: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles Total Time (#Queries: 50 #Accesses: 100) : 4ms Total Time (#Queries: 50 #Accesses: 1000) : 15ms Total Time (#Queries: 500 #Accesses: 100) : 46ms Total Time (#Queries: 500 #Accesses: 1000) : 151ms This change introduces a ShardedQueryMap, which is used to replace the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_, and their corresponding locks, thereby abstracting away the access to the maps locks. For operations that need to happen on every entry in the ShardedQueryMap maps, a new function ShardedQueryMap::DoFuncForAllEntries() is introduced which takes a user supplied lambda and passes it every individual map entry and executes it. NOTE: This microbenchmark has shown that SpinLock has better performance than boost::mutex for the qs_map_lock_'s, so that change has been made too. TODO: Add benchmark for client_request_state_map_lock_ too. The APIs around that are more complicated, so this patch only includes the benchmarking of qs_map_lock_. TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap. Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Reviewed-on: http://gerrit.cloudera.org:8080/8363 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/process-wide-locks-benchmark.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/util/sharded-query-map-util.h 8 files changed, 379 insertions(+), 59 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 11 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5139: Update mvn-quiet.sh to print execution content to log file
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/9273 ) Change subject: IMPALA-5139: Update mvn-quiet.sh to print execution content to log file .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9273/5/bin/mvn-quiet.sh File bin/mvn-quiet.sh: http://gerrit.cloudera.org:8080/#/c/9273/5/bin/mvn-quiet.sh@35 PS5, Line 35: mvn "$@" | tee -a $LOG_FILE | grep -E -e WARNING -e ERROR -e SUCCESS -e FAILURE -e Test; then I'll repeat my caveat about not being any sort of bash expert. I recommend running my comments past someone else -- e.g., Phil Z. or Michael. I feel like there is something that is still confusing about this code, and some of it has to do with this patch, but a lot of it preceded your patch as well. * I still think mvn being piped to mvn is an odd construction, and probably something to be avoided * I think that the combined interaction between "set -eo pipefail" and the if-block and the piping is a little hard to reason about * I strongly suspect that inside the if-block, as it stands, "echo [...] exited with code $?" will *always* show an exit code of 0, regardless of whether mvn failed or not * That said, I do think that "exit 1" will execute if mvn failed, so at least the script will fail as expected Stepping back for a minute to consider what we want this script to do, I think it's something like this: * we don't want to see the verbose output of mvn, but we want to capture it in a log file * however, if there are lines ERRORS or WARNINGS or SUCCESS or FAILURE, we want to send those lines to the console * in the chain of mvn | tee | grep, we want to exit with a non-zero status if mvn fails, but we don't esepcially care about the exit status of tee and certainly not grep I think that all of those things can be accomplished with something like this. set -uo pipefail # i.e., get rid of set -e [...] mvn $IMPALA_MAVEN_OPTIONS "$@" | tee -a $LOG_FILE | \ grep -E -e WARNING -e ERROR -e SUCCESS -e FAILURE -e Test MVN_EXIT_STATUS=${PIPESTATUS[0]} echo "mvn exited with ${MVN_EXIT_STATUS}" exit ${MVN_EXIT_STATUS} Getting rid of "set -e" means that we can continue to execute in this script even if mvn fails, but without having to resort to "if mvn [...]; then...". Everything gets logged, but only lines matching our grep pattern go to console. mvn exit status is captured in ${PIPESTATUS[0]}, but tee and grep status are ignored. The script exits with the same status as mvn. This manages to avoid piping mvn to mvn. FYI, this is the page I read about to clarify my understanding of pipefail and PIPESTATUS: https://unix.stackexchange.com/questions/14270/get-exit-status-of-process-thats-piped-to-another -- To view, visit http://gerrit.cloudera.org:8080/9273 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I475b17a4dccfa624dda61402491b461c53473f8b Gerrit-Change-Number: 9273 Gerrit-PatchSet: 5 Gerrit-Owner: Nithya Janarthanan Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 15 Feb 2018 04:59:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 3: This should have read: > However, the dependencies are often *not* just between > the existence of the object but between the state of the bojects > (e.g. whether Init() was called yet), so it doesn't always help. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 03:44:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 3: > (2 comments) > > > Patch Set 1: > > > > (6 comments) > > > > I went through the change and found that I would prefer to not > access dependencies through the ExecEnv singleton, but instead > inject them through the ctors and Init(). That way it feels easier > to me to follow the code. I don't feel strongly about it though. > > I'm curious how do others feel about this? > My feeling is that if something is a singleton, then it's usually easier for me to follow the code if it's accessed directly rather than adding indirection by passing around pointers to it (which ultimately I'll often need to trace back to figure out where the object came from anyway). Additionally, at least in some places, the architecture dictates some of these daemon services are singletons, so I don't see the advantage in pretending that they parametrizable. On the other hand, passing by argument can help show dependencies between objects. However, the dependencies are often just between the existence of the object but between the state of the bojects (e.g. whether Init() was called yet), so it doesn't always help. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 03:43:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add KRPC test flag .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py@264 PS7, Line 264: "-use_krpc=tru > seems more consistent to do -use_krpc=true here as other options above seem Done http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py@138 PS7, Line 138: --use_krpc") > why not just --use_krpc here ? Good point, done. -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 8 Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 03:27:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Hello Michael Ho, Michael Brown, Sailesh Mukil, David Knupp, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9291 to look at the new patch set (#8). Change subject: IMPALA-6508: add KRPC test flag .. IMPALA-6508: add KRPC test flag This change adds a flag "--use_krpc" to start-impala-cluster.py. The flag is currently passed as an argument to the impalad daemon. In the future it will also enable KRPC for the catalogd and statestored daemons. This change also adds a flag "--test_krpc" to pytest. When running tests using "impala-py.test --test_krpc", the test cluster will be started by passing "--use_krpc" to start-impala-cluster.py (see above). This change also adds a SkipIf to skip tests based on whether the cluster was started with KRPC support or not. - SkipIf.not_krpc can be used to mark a test that depends on KRPC. - SkipIf.not_thrift can be used to mark a test that depends on Thrift RPC. This change adds a meta test to make sure that the new SkipIf decorators work correctly. The test should be removed as soon as real tests have been added with the new decorators. Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab --- M bin/run-all-tests.sh M bin/start-impala-cluster.py M tests/common/custom_cluster_test_suite.py M tests/common/skip.py A tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_exchange_delays.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_rpc_exception.py 9 files changed, 85 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9291/8 -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 8 Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 3: (2 comments) > Patch Set 1: > > (6 comments) > > I went through the change and found that I would prefer to not access > dependencies through the ExecEnv singleton, but instead inject them through > the ctors and Init(). That way it feels easier to me to follow the code. I > don't feel strongly about it though. I'm curious how do others feel about this? > > However, if we want to go with ExecEnv, we should probably remove some of the > injected dependencies as well to keep it consistent. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h File be/src/rpc/rpc-mgr-test-base.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h@133 PS1, Line 133: // Takes over ownership of the newly created 'service' which needs to have a lifetime > The RpcMgr no longer owns the rpc service. RpcMgr::Shutdown() will call rpc Shouldn't the RpcMgr own the services then? http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@83 PS1, Line 83: mem_tracker_.reset(new MemTracker(-1, "Data Stream Manager Deferred RPCs", > It seems the common pattern is to have the MemTracker inside the owning obj That makes sense to me, thanks for the explanation. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 03:23:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 01:35:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. IMPALA-6519: API to allocate unreserved buffer The motivation is to allow allocation of buffers without reservation in ExchangeNode. Currently this it not possible because IncreaseReservationToFit() followed by AllocateBuffer() is non-atomic. We need to handle concurrent allocations in ExchangeNode because there may be multiple batches being received at a given time. This is a temporary solution until we can implement proper reservations in ExchangeNode (IMPALA-6524). Testing: Added basic unit test. Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Reviewed-on: http://gerrit.cloudera.org:8080/9250 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/runtime/bufferpool/buffer-pool-internal.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h 6 files changed, 131 insertions(+), 33 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1940/ -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 10 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 01:26:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 10: Code-Review+2 Fixed clang-tidy issues. Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 10 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 01:26:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Hello Philip Zeyliger, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8363 to look at the new patch set (#10). Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ The following 2 locks have shown to be frequent points of contention on recent perf runs: - qs_map_lock_ - client_request_state_map_lock_ Since these are process wide locks, any threads waiting on these locks potentially slow down the runtime of a query. I tried to address this previously by converting the client_request_state_map_lock_ to a reader-writer lock. This showed great perf improvements in the general case, however, there were edge cases with big regressions as well. In the general case, strict readers of the map got through so quickly that we were able to see a reduction in the number of client connections created, since this lock was contended for in the context of Thrift threads too. The bad case is when writers were starved trying to register a new query since there were so many readers. Changing the starve option resulted in worse read performance all round. Another approach which is relatively simpler is to shard the locks, which proves to be very effective with no regressions. The maps and locks are sharded to a default of 4 buckets initally. Query IDs are created by using boost::uuids::random_generator. We use the high bits of a query ID to assign queries to buckets. I verified that the distribution of the high bits of a query ID are even across buckets on my local machine: For 10,000 queries sharded across 4 buckets, the distribution was: bucket[0]: 2500 bucket[1]: 2489 bucket[2]: 2566 bucket[3]: 2445 A micro-benchmark is added to measure the improvement in performance. This benchmark creates multiple threads each of which creates a QueryState and accesses it multiple times. We can see improvements in the range 2x - 3.5x. BEFORE: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 1ms Total Time (#Queries: 50 #Accesses: 100) : 8ms Total Time (#Queries: 50 #Accesses: 1000) : 54ms Total Time (#Queries: 500 #Accesses: 100) : 76ms Total Time (#Queries: 500 #Accesses: 1000) : 543ms AFTER: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles Total Time (#Queries: 50 #Accesses: 100) : 4ms Total Time (#Queries: 50 #Accesses: 1000) : 15ms Total Time (#Queries: 500 #Accesses: 100) : 46ms Total Time (#Queries: 500 #Accesses: 1000) : 151ms This change introduces a ShardedQueryMap, which is used to replace the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_, and their corresponding locks, thereby abstracting away the access to the maps locks. For operations that need to happen on every entry in the ShardedQueryMap maps, a new function ShardedQueryMap::DoFuncForAllEntries() is introduced which takes a user supplied lambda and passes it every individual map entry and executes it. NOTE: This microbenchmark has shown that SpinLock has better performance than boost::mutex for the qs_map_lock_'s, so that change has been made too. TODO: Add benchmark for client_request_state_map_lock_ too. The APIs around that are more complicated, so this patch only includes the benchmarking of qs_map_lock_. TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap. Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/process-wide-locks-benchmark.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/util/sharded-query-map-util.h 8 files changed, 379 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/10 -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 10 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9331 ) Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 Gerrit-Change-Number: 9331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 01:23:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9331 ) Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC .. Patch Set 1: FWIW, this is the error message in question: https://github.com/apache/impala/blob/master/common/thrift/generate_error_codes.py#L227-L228 -- To view, visit http://gerrit.cloudera.org:8080/9331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 Gerrit-Change-Number: 9331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 15 Feb 2018 01:04:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9331 Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC .. IMPALA-6512: Fix test_exchange_delays for KRPC The sender timed out error message diverges between Thrift and KRPC slightly due to the source address being not readily available with Thrift RPC implementation. This leads to failure in test_exchange_delays when KRPC is enabled. This change fixes the problem by shortening the error message string to match against. Testing done: Tested with KRPC enabled in the code and verified the tests passed. Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 --- M testdata/workloads/functional-query/queries/QueryTest/exchange-delays-zero-rows.test M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test 2 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/9331/1 -- To view, visit http://gerrit.cloudera.org:8080/9331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9 Gerrit-Change-Number: 9331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/ -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 15 Feb 2018 00:56:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 3: Code-Review+1 I'm okay with this change modulo the choice of limit and ordering of memtracker updates. Tim, can you do the +2 once you guys agree on that and you're happy with the change? Lars, please see if you have any further comments. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 00:43:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176 PS2, Line 176: service_mem_tracker_->Release(transfer_size); > The process limit is checked by looking at TC-malloc and the buffer pool, r The parent trackers are both process memory tracker. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 00:36:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add KRPC test flag .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py@264 PS7, Line 264: "-use_krpc %s" seems more consistent to do -use_krpc=true here as other options above seem to explicitly set the boolean flag. http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py@138 PS7, Line 138: --impalad_args=-use_krpc why not just --use_krpc here ? -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 7 Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 15 Feb 2018 00:22:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176 PS2, Line 176: service_mem_tracker_->Release(transfer_size); > There may be a temporary window in which the process mem limit is exceeded, The process limit is checked by looking at TC-malloc and the buffer pool, rather than adding up the children consumption. So, I don't think this impacts the process limit. Who are the parents of these two trackers? Tim, what's your take on the order that the Consume() and Release() should occur in? -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 15 Feb 2018 00:06:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc@304 PS2, Line 304: RETURN_IF_ERROR(data_svc_->Init()); > now that we don't pass parameters, if there are still dependencies in the o Done http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176 PS2, Line 176: service_mem_tracker_->Release(transfer_size); > what is the impact of the transient double counting now? There may be a temporary window in which the process mem limit is exceeded, leading to other callers of MemTracker::TryConsume() to fail. Worst case, we may be double counting by # service threads * avg row batch size. On the other hand, leaving the window open may lead to a higher chance of going over the process mem limit. http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h File be/src/service/data-stream-service.h: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h@45 PS2, Line 45: /// Initializes the service by registering it with the singleton RPC manager. > note here that the RpcMgr::Init() must have already been called (or whateve Done -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 23:57:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9282 to look at the new patch set (#3). Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. IMPALA-6116: Bound memory usage of DataStreamSevice's service queue The fix for IMPALA-6193 added a memory tracker for the memory consumed by the payloads in the service queue of DataStreamService. This change extends it by introducing a bound on the memory usage for that service queue. In addition, it deprecates FLAGS_datastream_service_queue_depth and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags only take effect when KRPC is in use and KRPC was never enabled in any previous releases so it seems safe to do this flag replacement. The new flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount of memory which can be consumed by the service queue of DataStreamService. This allows a more direct control over the memory usage of the queue instead of inferring via the number of entries in the queue. The default value of this flag is left at 0, in which case it will be set to 20% of process memory limit. Testing done: exhaustive debug builds. Updated data-stream-test to exercise the case in which the payload is larger than the limit. Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/mem-tracker.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M tests/custom_cluster/test_krpc_mem_usage.py 15 files changed, 283 insertions(+), 164 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/3 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs
Lars Volker has abandoned this change. ( http://gerrit.cloudera.org:8080/9320 ) Change subject: DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs .. Abandoned Squashed into IMPALA-6269 -- To view, visit http://gerrit.cloudera.org:8080/9320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc Gerrit-Change-Number: 9320 Gerrit-PatchSet: 2 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Hello Michael Ho, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9292 to look at the new patch set (#5). Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. IMPALA-6269: Expose KRPC metrics on debug webpage This change exposes KRPC metrics on the /rpcz debug web page. This change also exposes metrics for rejected RPCs on the /metrics debug web page. This change also fixes a bug in PrettyPrinter::GetByteUnit(), which previously did not work for unsigned values due to an implicit cast. This change is based on a change by Sailesh Mukil. TODO: testing, once IMPALA-6508 has been submitted. Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impalad-main.cc M be/src/util/histogram-metric.h M be/src/util/pretty-printer.h M common/thrift/Metrics.thrift M common/thrift/metrics.json M tests/webserver/test_web_pages.py M www/rpcz.tmpl 15 files changed, 424 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/5 -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 5 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9292 ) Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. Patch Set 5: PS5 exposes metrics for rejected RPCs on /metrics. -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 5 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 14 Feb 2018 23:37:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates This adds a set of "prioritized" statestore topics that are small but are important to deliver in a timely manner. These are delivered more frequently by a separate thread pool to reduce the window for stale admission control and scheduling information. The contract between statestore and subscriber is changed so that the statestore can send concurrent Update() RPCs for disjoint sets of topics. This required changes to the subscriber implementation, which assumed that only one Update RPC would arrive at a time. It also changes the locking in the statestore so that the prioritized update threads don't get stuck behind the catalog threads holding 'topic_lock_'. Specifically, it uses a reader-writer lock to protect modification of the set of topics and a reader-writer lock per topic to allow the topic data to be read by multiple threads concurrently. Added metrics to monitor the per-topic update interval. Testing: Ran core tests. Inspected metrics on Impala daemons, saw that membership and request queue processing times had more samples recorded than the catalog topic, reflecting the increased frequency. Ran under thread sanitizer, made sure no data races were reported in Statestore or StatestoreSubscriber. Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Reviewed-on: http://gerrit.cloudera.org:8080/9123 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-server.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M common/thrift/metrics.json M tests/custom_cluster/test_admission_controller.py M tests/statestore/test_statestore.py M www/statestore_subscribers.tmpl M www/statestore_topics.tmpl 15 files changed, 845 insertions(+), 485 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 16: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 22:44:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1939/ -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@569 PS4, Line 569: } > That bug probably wouldn't have happened if the code was just put into the Yes I undermined myself a bit there. http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570 PS4, Line 570: if (success != nullptr) *success = false; > I was suggesting just moving the logic directly into BufferPool::AllocateUn I'll stick with this for now. -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 5: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@569 PS4, Line 569: DCHECK(success != nullptr); > There was a bug here because we didn't test the failure case. Extended the That bug probably wouldn't have happened if the code was just put into the callers :) http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570 PS4, Line 570: // The client may not have reserved the memory. > I had trouble judging whether it was better to have one function with more I was suggesting just moving the logic directly into BufferPool::AllocateUnreservedBuffer and BufferPool::AllocateBuffer. Or can they not access the required impl_ fields? Given that they don't really share much other than what's under the lock (and have two control flow paramters - reserved & success), seems clearer to just have this code inlined. But if you prefer to leave it the way it is now, that's okay too. -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 21:48:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG@14 PS4, Line 14: > let's note that this is a temporary solution so that we can introduce real Done http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h File be/src/runtime/bufferpool/buffer-pool.h: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@246 PS4, Line 246: /// operations for 'client', except for operations on the same 'handle'. > maybe note that if there is an unexpected runtime failure during allocation Done http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@250 PS4, Line 250: /// out of that instead of relying on the "best effort" interface. > how about explicitly saying that the interface is provided so to help trans Done http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@562 PS4, Line 562: Ensure we have the reservation required first. > that's not really the case when 'reserved' is true -- the client needs to e Restructured the comments to separate the two cases. http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@569 PS4, Line 569: } There was a bug here because we didn't test the failure case. Extended the test to exercise it. http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570 PS4, Line 570: if (success != nullptr) *success = false; > given that there's only one callsite with reserved==true and one with reser I had trouble judging whether it was better to have one function with more complex control flow vs two functions with simpler control flow. Getting the accounting right was a little subtle so I thought this way we'd get better test coverage, but I can see the argument for the other approach if you think it is superior. -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 21:29:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Hello Michael Ho, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9250 to look at the new patch set (#5). Change subject: IMPALA-6519: API to allocate unreserved buffer .. IMPALA-6519: API to allocate unreserved buffer The motivation is to allow allocation of buffers without reservation in ExchangeNode. Currently this it not possible because IncreaseReservationToFit() followed by AllocateBuffer() is non-atomic. We need to handle concurrent allocations in ExchangeNode because there may be multiple batches being received at a given time. This is a temporary solution until we can implement proper reservations in ExchangeNode (IMPALA-6524). Testing: Added basic unit test. Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 --- M be/src/runtime/bufferpool/buffer-pool-internal.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h 6 files changed, 131 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/9250/5 -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/ -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 5: Code-Review+2 Thanks, I think this makes sense. -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 14 Feb 2018 21:10:45 + Gerrit-HasComments: No
[Impala-ASF-CR] DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs
Lars Volker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/9320 ) Change subject: DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs .. DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/exec-env.cc M common/thrift/metrics.json 6 files changed, 51 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/9320/2 -- To view, visit http://gerrit.cloudera.org:8080/9320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc Gerrit-Change-Number: 9320 Gerrit-PatchSet: 2 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-6520: Add metrics for number of rejected RPCs
Lars Volker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9320 Change subject: IMPALA-6520: Add metrics for number of rejected RPCs .. IMPALA-6520: Add metrics for number of rejected RPCs Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/exec-env.cc M common/thrift/metrics.json 6 files changed, 51 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/9320/1 -- To view, visit http://gerrit.cloudera.org:8080/9320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc Gerrit-Change-Number: 9320 Gerrit-PatchSet: 1 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1937/ -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 19:15:37 + Gerrit-HasComments: No
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: PREVIEW: build ORC C++ lib in toolchain .. Patch Set 3: It looks like my first issue may be fixed by ORC-266 https://issues.apache.org/jira/browse/ORC-266 . When they release ORC 1.5 we might be able to remove that custom patch. -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 19:13:51 + Gerrit-HasComments: No
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: PREVIEW: build ORC C++ lib in toolchain .. Patch Set 2: Also bumped the version, thanks for letting me know! -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 19:07:14 + Gerrit-HasComments: No
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Hello Quanlong Huang, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9274 to look at the new patch set (#3). Change subject: PREVIEW: build ORC C++ lib in toolchain .. PREVIEW: build ORC C++ lib in toolchain Includes a patch to allow pointing ORC at our versions of the dependencies instead of building them itself. Testing: Confirmed that it built locally, confirmed that the built utilities were runnable after sourcing impala-config.sh. TODO: build on more Linux distros, check that the built libraries are ususable. Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e --- M buildall.sh A source/orc/build.sh A source/orc/orc-1.4.3-patches/0001-Allow-building-against-external-versions-of-dependen.patch A source/orc/orc-1.4.3-patches/0002-ORC-239.-Install-missing-Statistics.hh-header-file.patch 4 files changed, 214 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/74/9274/3 -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: PREVIEW: build ORC C++ lib in toolchain .. Patch Set 2: It looks like they fixed the issue on master with this commit, but I added a simple patch to the 1.4.2 branch. commit 06a013cabecea8b9c560abd69d4c2d03509867a5 Author: Jim Crist Date: Wed Nov 1 17:20:00 2017 -0500 Fix installation to include all header files - Fixes installation to include all necessary header files in `include/orc` - Removes a few unnecessary includes Fixes #185 Signed-off-by: Owen O'Malley -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 19:02:30 + Gerrit-HasComments: No
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Hello Quanlong Huang, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9274 to look at the new patch set (#2). Change subject: PREVIEW: build ORC C++ lib in toolchain .. PREVIEW: build ORC C++ lib in toolchain Includes a patch to allow pointing ORC at our versions of the dependencies instead of building them itself. Testing: Confirmed that it built locally, confirmed that the built utilities were runnable after sourcing impala-config.sh. TODO: build on more Linux distros, check that the built libraries are ususable. Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e --- M buildall.sh A source/orc/build.sh A source/orc/orc-1.4.2-patches/0001-Allow-building-against-external-versions-of-dependen.patch A source/orc/orc-1.4.2-patches/0002-Install-Statistics.hh.patch 4 files changed, 212 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/74/9274/2 -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc@304 PS2, Line 304: RETURN_IF_ERROR(data_svc_->Init()); now that we don't pass parameters, if there are still dependencies in the ordering of these methods, how about noting that? http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176 PS2, Line 176: service_mem_tracker_->Release(transfer_size); what is the impact of the transient double counting now? http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h File be/src/service/data-stream-service.h: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h@45 PS2, Line 45: /// Initializes the service by registering it with the singleton RPC manager. note here that the RpcMgr::Init() must have already been called (or whatever the ordering dependency is). -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 18:27:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9063 ) Change subject: IMPALA-5801: Clean up codegen GetType() interface .. Patch Set 7: I have uploaded a patch with the template approach. (please ignore patch set 6, which contains some file by accident) The patch also contains some simplification around the changed call sites (like replacing GetTypes(...) with convenience functions), so it is not "rename only". I have also noticed several places where codegen is used in over complicated ways, for example "llvm::ConstantInt::get(llvm::Type::getInt1Ty(codegen->context()), false)" could be simply "codegen->false_value()". I can do some GetConst cleanup here, or I can create a ticket for it. -- To view, visit http://gerrit.cloudera.org:8080/9063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 Gerrit-Change-Number: 9063 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 18:22:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 4: @tarmstrong. oops. sorry, must have picked the wrong change-id when squashing multiple commits -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 18:19:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1936/ -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 17:57:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 17:57:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6482: add QUERY TIME LIMIT S option
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9227 ) Change subject: IMPALA-6482: add QUERY_TIME_LIMIT_S option .. Patch Set 5: Is there already a time limit for time spent in the AC queue? If not, maybe we'll need that as well since this clock starts after that? -- To view, visit http://gerrit.cloudera.org:8080/9227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8 Gerrit-Change-Number: 9227 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 14 Feb 2018 17:37:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 15: Code-Review+2 Thanks for the larger scale testing. -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 17:36:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9053 to look at the new patch set (#5). Change subject: IMPALA-6416: extend Thread::Create to track instance id .. IMPALA-6416: extend Thread::Create to track instance id This commit builds upon IMPALA-3703. Each thread that was created through Thread::Create() has a ThreadDebugInfo object on the stack frame of Thread::SuperviseThread(). This object has stack allocated char buffers that can be read during a debug session even if we only have minidumps. However, with the old solution ThreadDebugInfo::instance_id was set manually for each thread. It is too easy to forget to set instance_id every time we create a new thread. This commit has the assumption that if a thread has an instance id associated, then the threads spawned by it will always work on the same instance id. In Thread::StartThread the parent thread passes its ThreadDebugInfo object to its child who copies the instance id, and also stores the name and system thread id of its parent. This means if we set ThreadDebugInfo::instance_id in some "root thread", then all descendant threads will annotate themselves with the instance id automatically. Since threads also record the name (and a system thread id) of their parent, it might be also possible to reconstruct the thread creation graph. With GDB I tested if it copies the instance id at every place where we previously needed to set it manually. I added an automated test to thread-debug-info-test.cc Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 --- M be/src/common/thread-debug-info-test.cc M be/src/common/thread-debug-info.cc M be/src/common/thread-debug-info.h M be/src/exec/blocking-join-node.cc M be/src/exec/hdfs-scan-node.cc M be/src/runtime/fragment-instance-state.cc M be/src/util/thread.cc M be/src/util/thread.h 8 files changed, 74 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/5 -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h File be/src/common/thread-debug-info.h: http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115 PS4, Line 115: system_thread_id_ > Are you saying that thread_debug_info_ pointer may point to stray > memory? Yes, I think that can potentially happen. > If that's the case, maybe we shouldn't keep that pointer > but instead have a way to traverse TDIs and find it if still > exists? We can use the system thread id to find the parent thread and then its TDI if it still exists. I also added a copy of the parent thread name. Even if the parent thread exits, at least we will now what was its name. We can also check if parent thread name equals to the TDI's thread name that we found via the system thread id. If they're not the same, we'll know that the parent thread was re-cycled in some way. http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119 PS4, Line 119: boost::thread::id boost_thread_id_; : int64_t system_thread_id_ = 0; > I'm not sure what you mean by "switch to the parent thread > quickly". I was just thinking of a GDB session. GDB prints out the system thread id for each thread, so we can traverse the threads quickly if we know the ID. It can be also useful if we write a script for that and we don't want to iterate through all the threads. > I don't think I fully understand the tradeoffs. Just seems > confusing to have multiple IDs and I don't understand how each of > these fields helps in debugging. Perhaps some comments attached to > them explaining how one can use them for debugging would help? I removed boost::thread::id, and added the parent's thread name. I think it is more clear, and knowing the parent thread name can be useful on its own. And we can also use it to verify the parent thread. -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 14 Feb 2018 17:32:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG@14 PS4, Line 14: let's note that this is a temporary solution so that we can introduce real reservation accounting in the exchange node as a second step. http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h File be/src/runtime/bufferpool/buffer-pool.h: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@246 PS4, Line 246: /// operations for 'client', except for operations on the same 'handle'. maybe note that if there is an unexpected runtime failure during allocation, the reservation may still have been increased? http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@250 PS4, Line 250: /// out of that instead of relying on the "best effort" interface. how about explicitly saying that the interface is provided so to help transition components to the buffer pool so that they can implement reservation accounting as a second step during that transition? http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@562 PS4, Line 562: Ensure we have the reservation required first. that's not really the case when 'reserved' is true -- the client needs to ensure that. Maybe move that first sentence into the else and int he first case say: the client manages the reservation and must ensure it is available, or something? http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570 PS4, Line 570: if (success != nullptr) *success = false; given that there's only one callsite with reserved==true and one with reserved==false, maybe refactor this to move the reservation accounting into the callers. Actually, maybe the whole thing should just go into the callers since there's just a couple of lines of common code here. -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 17:31:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface
Hello Lars Volker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9063 to look at the new patch set (#7). Change subject: IMPALA-5801: Clean up codegen GetType() interface .. IMPALA-5801: Clean up codegen GetType() interface Several functions that return llvm::(Pointer)Type were renamed to make them shorter or indicate their roles more clearly. Some additional convenience functions were created to make some common codegen tasks simpler. The renamed functions can be found in llvm-codegen.h/cc. Changes in other files are mainly renamed calls with the same functionality. Testing: No new tests are necessary, as no functionality was changed. Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/exec-node.cc M be/src/exec/filter-context.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/text-converter.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/tuple.cc M be/src/runtime/types.cc M be/src/util/tuple-row-compare.cc 26 files changed, 339 insertions(+), 361 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/7 -- To view, visit http://gerrit.cloudera.org:8080/9063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 Gerrit-Change-Number: 9063 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface
Hello Lars Volker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9063 to look at the new patch set (#6). Change subject: IMPALA-5801: Clean up codegen GetType() interface .. IMPALA-5801: Clean up codegen GetType() interface Several functions that return llvm::(Pointer)Type were renamed to make them shorter or indicate their roles more clearly. Some additional convenience functions were created to make some common codegen tasks simpler. The renamed functions can be found in llvm-codegen.h/cc. Changes in other files are mainly renamed calls with the same functionality. Testing: No new tests are necessary, as no functionality was changed. Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/exec-node.cc A be/src/exec/exec-node.cc.save A be/src/exec/exec-node.cc.save.1 A be/src/exec/exec-node.cc.save.2 A be/src/exec/exec-node.cc.save.3 M be/src/exec/filter-context.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/text-converter.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/tuple.cc M be/src/runtime/types.cc M be/src/util/tuple-row-compare.cc 30 files changed, 3,347 insertions(+), 361 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/6 -- To view, visit http://gerrit.cloudera.org:8080/9063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 Gerrit-Change-Number: 9063 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9250 ) Change subject: IMPALA-6519: API to allocate unreserved buffer .. Patch Set 4: @kwho I think you took over my Change-Id for your draft patch. I deleted the draft for now. -- To view, visit http://gerrit.cloudera.org:8080/9250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983 Gerrit-Change-Number: 9250 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 16:53:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries
Hello Attila Jeges, Dimitris Tsirogiannis, Tim Armstrong, Csaba Ringhofer, Alex Behm, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9005 to look at the new patch set (#12). Change subject: IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries .. IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries If a scalar subquery is used with a binary predicate, or, used in an arithmetic expression, it must return only one row/column to be valid. If this cannot be guaranteed at parse time through a single row aggregate or limit clause, Impala fails the query like such. E.g., currently the following query is not allowed: SELECT bigint_col FROM alltypesagg WHERE id = (SELECT id FROM alltypesagg WHERE id = 1) However, it would be allowed if the query contained a LIMIT 1 clause, or instead of id it was max(id). This commit makes the example valid by introducing a runtime check to test if the subquery returns a single row. If the subquery returns more than one row, it aborts the query with an error. I added a new node type, called CardinalityCheckNode. It is created during planning on top of the subquery when needed, then during execution it checks if its child only returns a single row. I extended the frontend tests and e2e tests as well. Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 --- M be/src/exec/CMakeLists.txt A be/src/exec/cardinality-check-node.cc A be/src/exec/cardinality-check-node.h M be/src/exec/exec-node.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java A fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 28 files changed, 958 insertions(+), 78 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/9005/12 -- To view, visit http://gerrit.cloudera.org:8080/9005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 Gerrit-Change-Number: 9005 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9005 ) Change subject: IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries .. Patch Set 11: (24 comments) http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h File be/src/exec/cardinality-check-node.h: http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h@27 PS11, Line 27: /// Node that passes along the row pulled from its child unchanged. > First sentence should crisply state purpose of this node, e.g.: Thanks, I am using your sentences :) http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h@29 PS11, Line 29: /// Note that this node must be a blocking node! > Please give an explanation why, e.g.: Thanks again, I don't think I could formulate it so nicely! :) http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h File be/src/exec/cardinality-check-node.h: http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h@29 PS9, Line 29: /// Note that this node must be a blocking node! > Thanks. To my defense in PS1 it was called ScalarCheckNode and it only allowed one row :) Later when I renamed it I felt that some extra flexibility suits the new name. But I see that it is unlikely that we'll ever need this flexibility. http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h@38 PS9, Line 38: virtual Status Reset(RuntimeState* state) override; > Nice experiments and tests! Seems really hard to hit this case with a real Thanks! Done. http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc File be/src/exec/cardinality-check-node.cc: http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc@68 PS11, Line 68: DCHECK_LE(rows_collected, 1); > DCHECK that rows_collected is either 0 or 1. The current check accepts nega Done http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc@85 PS11, Line 85: row_batch_->DeepCopyTo(output_row_batch); > Why do we deep copy the row twice? Once in Open() should be sufficient. Done http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java File fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java: http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java@175 PS11, Line 175: StmtRewriter.rewriteNonScalarSubqueries(operand, analyzer); > We should still keep the analysis and rewrite phases distinctly separate as I'm a bit confused about how to organise this. In AnalysisContext.analyze() I see the following phases: * analyze * rewrite exprs * rewrite stmts * re-analyze if needed If I don't do anything here, this code will throw an exception that needs to be handled somewhere, but the real problem is that the first analyze phase will remain unfinished. Then, we would start applying rewrites on the partially analyzed tree. So, some modifications are needed to analyzeImpl() I think. I tried to simply return from it when we encounter a non-scalar subquery. Then, do the rewrite in StmtRewriter, but analyze() is also called during statement rewriting. This is problematic because the AnalysisExceptions are wrapped in an IllegalStateException and that causes some tests to fail. This happens when the subquery is still invalid after making it scalar, e.g.: select id from functional.alltypestiny where int_col = (select max(timestamp_col) from functional.alltypessmall) ERROR: AnalysisException: null CAUSED BY: IllegalStateException: Failed analysis after expr substitution. CAUSED BY: AnalysisException: operands of type INT and TIMESTAMP are not comparable: int_col = (SELECT max(timestamp_col) FROM functional.alltypessmall) My subquery "rewrite" only changes the type of the subquery. Aren't these kind of type deductions belong to the analysis phase? Maybe I should move rewriteNonScalarSubqueries() to Expr and rename it to something like makeContainedSubquerySalar() ? Am I missing something? Do you have something in mind how should I deal with it? http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@177 PS11, Line 177: for (Expr expr: children_) { > Move to the rewrite phase. see comment in ArithmeticExpr.java http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analys
[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: PREVIEW: build ORC C++ lib in toolchain .. Patch Set 1: Hi Tim, one file (include/orc/Statistics.hh) is not added to build/orc-1.4.2-p1/include/orc. After manually copying it into the dir, I was able to build my orc-patch. Upgrading from orc-1.2.3 to orc-1.4.2 bring so many benefits. I was able to simplify my patch by using the ORC lib to parse the orc file tail, instead of writing codes to do this in the initial patch. BTW, recently orc-1.4.3 was released. We can use the latest version. -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 14 Feb 2018 10:15:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 1: (14 comments) http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG@20 PS1, Line 20: this flag is left at 0, in which case it will be set to 20% of process > It seems too aggressive for small-to-medium deployments for sure :). It mig Pushed it down to 5% for now. Until IMPALA-5859 is fixed, RPC retries may end up consuming quite a lot more network bandwidth. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.h File be/src/rpc/impala-service-pool.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.h@43 PS1, Line 43: MemTracker* mem_tracker > it would be good to document these parameters. Is this the mem_tracker that Done http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc@150 PS1, Line 150: or > , otherwise Done http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc@152 PS1, Line 152: std:: > nit: names.h should take care of that Done http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h File be/src/rpc/rpc-mgr-test-base.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h@133 PS1, Line 133: // Takes over ownership of the newly created 'service'. > Can you explain why this method is needed? Could you just pass in a pointer The RpcMgr no longer owns the rpc service. RpcMgr::Shutdown() will call rpc service's Shutdown() function so the lifetime of the service should be as long as the rpc_mgr_. That's why the ownership is transferred to RpcMgrTestBase class instead. Comments added. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr.h@127 PS1, Line 127: mem_tracker > document. and this is the service's mem tracker right? if so, maybe service Done http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc@304 PS1, Line 304: RETURN_IF_ERROR(data_svc_->Init(rpc_mgr_.get())); > The RpcMgr has a getter in the ExecEnv, too, so you could remove it from th Done http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc@305 PS1, Line 305: data_svc_ > this one, too This is a bit tricky as this is needed in data-stream-test.cc. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.h@234 PS1, Line 234: handed over to DataStreamService > is that correct? Isn't incoming_request_tracker the DataStreamService's mem DataStreamMgr / Exchange node to be more precise. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@83 PS1, Line 83: mem_tracker_.reset(new MemTracker(-1, "Data Stream Manager Deferred RPCs", It seems the common pattern is to have the MemTracker inside the owning object (e.g. exec node, DataStreamRecvr etc) so having it externally in ExecEnv seems to be an uncommon pattern. The lifetime of the MemTracker of DataStreamMgr shouldn't be longer than that of the DataStreamMgr. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@175 PS1, Line 175: Release > I guess we were using the "Local" versions before since we were just transf The "local" variant doesn't allow the tracker to have memory limit. There is a window but then both of these trackers have process_mem_tracker as their parents. I can change the order of Consume() and Release() to avoid under-counting if that's a big concern at all. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.h File be/src/service/data-stream-service.h: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.h@58 PS1, Line 58: MemTracker* mem_tracker() { return mem_tracker_.get(); } > does that need to be public? It's used in ExecEnv::Init(). http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc@37 PS1, Line 37: DEFINE_int64(datastream_service_queue_mem_limit, 0, > It would be good to use ParseUtil::ParseMemSpec() for this for consistency Good idea. Done. http://gerrit.cloudera.org:8080/#/c/9282/1/b
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9282 to look at the new patch set (#2). Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. IMPALA-6116: Bound memory usage of DataStreamSevice's service queue The fix for IMPALA-6193 added a memory tracker for the memory consumed by the payloads in the service queue of DataStreamService. This change extends it by introducing a bound on the memory usage for that service queue. In addition, it deprecates FLAGS_datastream_service_queue_depth and replaces it with FLAGS_datastream_service_queue_max_mem. These flags only take effect when KRPC is in use and KRPC was never enabled in any previous releases so it seems safe to do this flag replacement. The new flag FLAGS_datastream_service_queue_max_mem directly dictates the amount of memory which can be consumed by the service queue of DataStreamService. This allows a more direct control over the memory usage of the queue instead of inferring via the number of entries in the queue. The default value of this flag is left at 0, in which case it will be set to 20% of process memory limit. Testing done: exhaustive debug builds. Updated data-stream-test to exercise the case in which the payload is larger than the limit. Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/mem-tracker.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M tests/custom_cluster/test_krpc_mem_usage.py 15 files changed, 282 insertions(+), 164 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/2 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong