[ https://issues.apache.org/jira/browse/IMPALA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar reassigned IMPALA-2638:
------------------------------------

    Assignee: Sahil Takiar

> Retry queries that fail during scheduling
> -----------------------------------------
>
>                 Key: IMPALA-2638
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2638
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 2.3.0
>            Reporter: Henry Robinson
>            Assignee: Sahil Takiar
>            Priority: Minor
>              Labels: scalability
>
> An important building block for node-decommissioning is the ability to retry
> queries if they fail during scheduling for some recoverable reason (e.g. RPC
> failed due to unreachable host, fragment could not be started due to memory
> pressure).
> To do this we can detect failures during {{Coordinator::Exec()}}, cancel the
> running query and then re-start from somewhere in
> {{QueryExecState::ExecQueryOrDmlRequest()}} - updating a local blacklist of
> nodes so that we know to avoid those that have caused failures.
> There are some subtleties though:
> * Queries shouldn't be retried more than a small number of times, in case
> they *cause* the outage (there might be a good way to figure that out at the
> time)
> * If the query is restarted from the scheduling step (rather than completely
> restarting), some care will have to be taken to ensure that none of the old
> query's fragments that are being cancelled can affect the new query's
> operation in any way (there are several ways to do this).
> Eventually the failures will propagate to the rest of the cluster via the
> statestore - this mechanism allows queries to recover and continue while the
> statestore detects the failure.
> This JIRA doesn't address restarting queries that have suffered failures
> part-way through execution, because that's strictly harder and not (as)
> needed for decommissioning.
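
The retry loop described above could look roughly like the sketch below. It is
an illustration only, written in Python rather than Impala's C++, and it is not
taken from the Impala code base: RecoverableSchedulingError, exec_with_retry,
accept_report and the start_fragments callback are hypothetical names, and the
default of three attempts is an assumed value. The sketch covers the three
points from the description: a local blacklist of hosts that caused failures, a
small cap on the number of retries, and an attempt number that lets reports
from a cancelled attempt's fragments be ignored (one of the several possible
ways to isolate the new attempt).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


class RecoverableSchedulingError(Exception):
    """Hypothetical error: starting fragments failed on `host` for a reason
    that may not recur elsewhere (unreachable host, memory pressure)."""

    def __init__(self, host: str, reason: str):
        super().__init__(f"{host}: {reason}")
        self.host = host


@dataclass
class RetryResult:
    attempt: int         # 1-based number of the attempt that succeeded
    hosts: List[str]     # hosts the successful attempt was scheduled on
    blacklist: Set[str]  # hosts avoided because of earlier failures


def accept_report(current_attempt: int, report_attempt: int) -> bool:
    """Drop reports from fragments that belong to a cancelled attempt."""
    return report_attempt == current_attempt


def exec_with_retry(
    all_hosts: List[str],
    start_fragments: Callable[[int, List[str]], None],
    max_attempts: int = 3,  # assumed cap; the issue only says "a small number"
) -> RetryResult:
    blacklist: Set[str] = set()  # local to this query, not cluster-wide
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        candidates = [h for h in all_hosts if h not in blacklist]
        if not candidates:
            raise RuntimeError("no hosts left to schedule on") from last_error
        try:
            # The attempt number is passed down so that fragments can tag
            # their reports with it (see accept_report).
            start_fragments(attempt, candidates)
            return RetryResult(attempt, candidates, blacklist)
        except RecoverableSchedulingError as err:
            # Cancelling the failed attempt's fragments would happen here;
            # then remember the host and reschedule without it.
            blacklist.add(err.host)
            last_error = err
    raise RuntimeError(f"giving up after {max_attempts} attempts") from last_error


def flaky_start(attempt: int, hosts: List[str]) -> None:
    # Stand-in for fragment start-up: host-b is unreachable in this example.
    if "host-b" in hosts:
        raise RecoverableSchedulingError("host-b", "RPC failed: unreachable")


result = exec_with_retry(["host-a", "host-b", "host-c"], flaky_start)
```

In this example the first attempt fails on host-b, the host is added to the
local blacklist, and the second attempt is scheduled on host-a and host-c only.
Any other exception type is deliberately not caught, so non-recoverable
failures are not retried.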