Michael Ho created IMPALA-8685:
----------------------------------

             Summary: Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES
                 Key: IMPALA-8685
                 URL: https://issues.apache.org/jira/browse/IMPALA-8685
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Michael Ho

The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default. This means that up to 3 different executors can process a given remote scan range. Over time, the data of a given remote scan range will be spread across these 3 executors. My understanding is that the default is not 1 in order to avoid hot spots in pathological cases. On the other hand, this may mean that we do not maximize the utilization of the file handle cache and the data cache. Also, for small clusters (e.g. a 3 node cluster), the default value may render deterministic remote scan range scheduling ineffective, because every executor becomes a candidate for every scan range.

We may want to re-evaluate the default value of {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. One idea is to set it to min(3, half of cluster size) so that it works reasonably well with small clusters, which may be rather common for demo purposes. There may also be other criteria for evaluating the default value.

cc'ing [~joemcdonnell], [~tlipcon] and [~drorke]
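
To make the min(3, half of cluster size) idea concrete, here is a minimal Python sketch. It is not Impala code: the function names, the round-down interpretation of "half", the clamp to at least 1, and the hash-based candidate selection are all assumptions made for illustration, and Impala's actual scheduler may pick candidates differently.

```python
import hashlib


def default_num_candidates(cluster_size: int, cap: int = 3) -> int:
    """Hypothetical default: min(cap, half of cluster size), but at least 1."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be positive")
    # Assumptions: "half" rounds down, and the result is clamped to 1 so that
    # a 1-node cluster still gets a candidate.
    return max(1, min(cap, cluster_size // 2))


def pick_candidates(scan_range: str, executors: list[str], n: int) -> list[str]:
    """Illustrative stand-in for deterministic candidate selection.

    Ranks executors by a hash of (scan range, executor) and keeps the first n.
    This is NOT Impala's scheduling algorithm, only a way to show the effect
    of the candidate count.
    """
    def score(executor: str) -> str:
        return hashlib.sha256(f"{scan_range}|{executor}".encode()).hexdigest()

    return sorted(executors, key=score)[:n]


if __name__ == "__main__":
    executors = ["exec-1", "exec-2", "exec-3"]  # hypothetical 3 node cluster
    # With 3 candidates on 3 nodes, every executor is a candidate.
    print(pick_candidates("file.parq:0", executors, 3))
    # With the proposed default, only one executor serves this scan range.
    print(pick_candidates("file.parq:0", executors, default_num_candidates(3)))
```

Under these assumptions a 3 node cluster would get 1 candidate, a 4 node cluster 2, and clusters of 6 or more nodes would keep the current value of 3.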