[ https://issues.apache.org/jira/browse/IMPALA-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461283#comment-16461283 ]
ASF subversion and git services commented on IMPALA-6920:
---------------------------------------------------------

Commit c8c6947797c8a872728b01872b1d3872526f663c in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c8c6947 ]

IMPALA-6920: fix inconsistencies with scanner thread tokens

The first scanner thread to start now takes a "required" token, which
always succeeds. Only additional threads try to get "optional" tokens,
which can fail. Previously threads always requested optional tokens,
which could fail and leave the scan node without any running threads
until its callback is invoked.

This allows us to remove the "reserved optional token" and
set_max_quota() interfaces from ThreadResourceManager. There should be
no behavioural changes in ThreadResourceMgr in cases when those features
are not used.

Also switch Kudu to using the same logic for implementing
NUM_SCANNER_THREADS (it was not switched over to the improved HDFS
scanner logic added in IMPALA-2831).

Do some cleanup in ThreadResourceMgr code while we're here:
* Fix some benign data races in ThreadResourceMgr by switching to
  AtomicInt* classes.
* Remove pointless object caching (TCMalloc will do better).
* Reduce dependencies on the thread-resource-mgr.h header.

Testing:
Ran core tests.

Ran a few queries under TSAN, checked that it didn't report any more
races in this code after fixing those data races.

I couldn't construct a regression test because there are no easily
testable consequences of the change - the main difference is that
some scanner threads start earlier when there is pressure on scanner
thread tokens but that is hard to construct a robust test around.

Change-Id: I16d31d72441aff7293759281d0248e641df43704
Reviewed-on: http://gerrit.cloudera.org:8080/10186
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Multithreaded scans are not guaranteed to get a thread token immediately
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-6920
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6920
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.12.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: resource-management
>
> This bug applies to multithreaded HDFS and Kudu scans.
> So what happens is that we reserve an optional token for the first scanner
> thread but that can be taken by any other operator in the same fragment. What
> happens in one fragment in TPC-DS q18a is:
> 1. The hash join grabs an extra token for the join build. I guess it does
> this early so it gets an optional token before other fragments can grab them.
> 2. The scan node reserves an optional token in Open(). This optional token is
> already in use by the hash join.
> 3. The scan node tries to start the first scanner thread, but there are no
> optional tokens available, so it can't start any.
> 4. Eventually the optional token is given up and the scanner thread can start.
> If #4 always happens without the scan making progress, then no deadlock is
> possible, but if there's any kind of circular dependency, this can deadlock.
> Kudu scans also do not implement the num_scanner_threads query option in the
> same way as HDFS scans - the IMPALA-2831 changes were not applied to kudu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
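
For readers following the token sequence in the issue description, here is a rough, single-threaded Python sketch of the difference between the old and new start-up logic. It is a toy model only: ThreadTokenPool, contended_pool and both start_scanner_threads_* functions are hypothetical names made up for illustration and are not the ThreadResourceMgr API, and the quota of 2 is an assumed value chosen so that the hash join's optional token exhausts the pool.

```python
class ThreadTokenPool:
    """Toy model of a pool of thread tokens (hypothetical, not Impala code)."""

    def __init__(self, quota):
        self.quota = quota        # limit that only optional requests respect
        self.num_required = 0
        self.num_optional = 0

    def acquire_required(self):
        """A required token is always granted, even when over quota."""
        self.num_required += 1

    def try_acquire_optional(self):
        """An optional token is granted only while the pool is under quota."""
        if self.num_required + self.num_optional >= self.quota:
            return False
        self.num_optional += 1
        return True


def start_scanner_threads_old(pool, wanted):
    """Old behaviour: every scanner thread asks for an optional token."""
    started = 0
    for _ in range(wanted):
        if not pool.try_acquire_optional():
            break
        started += 1
    return started


def start_scanner_threads_new(pool, wanted):
    """New behaviour: the first thread takes a required token, the rest optional."""
    started = 0
    for i in range(wanted):
        if i == 0:
            pool.acquire_required()
        elif not pool.try_acquire_optional():
            break
        started += 1
    return started


def contended_pool():
    """Pool state after step 1 of the issue: the hash join holds the spare token."""
    pool = ThreadTokenPool(quota=2)   # assumed quota for the illustration
    pool.acquire_required()           # the fragment's own thread
    assert pool.try_acquire_optional()  # join build thread takes the optional token
    return pool


print(start_scanner_threads_old(contended_pool(), wanted=3))  # prints 0
print(start_scanner_threads_new(contended_pool(), wanted=3))  # prints 1
```

In the contended case the old logic starts no scanner thread until another operator gives its token back, while the new logic always starts one. The real code must also be safe under concurrent callers (the commit mentions switching to AtomicInt* classes), which this sketch deliberately ignores.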