Sahil Takiar created IMPALA-9254:
------------------------------------

             Summary: Queries should only be retried if all fragments fail with retryable errors
                 Key: IMPALA-9254
                 URL: https://issues.apache.org/jira/browse/IMPALA-9254
             Project: IMPALA
          Issue Type: Sub-task
            Reporter: Sahil Takiar

Currently, Impala only propagates an {{overall_status}} from an executor to the coordinator. The {{overall_status}} is set in the {{QueryState}}, and "If multiple fragments have errors, the first fragment to hit an error is given preference."

The issue is that if multiple fragments fail, some of the errors might warrant a retry while others do not. For example, one fragment could fail due to faulty disks, while others fail because a memory limit was exceeded. Such queries shouldn't be retried, because the retried query is likely to fail again with the same non-retryable error.

This can only happen within a specific time window. Since any fragment failure causes the entire query to be cancelled, the non-retryable error must occur after the retryable error but before the query is cancelled.
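
The proposed rule can be sketched as follows. This is a minimal illustration in Python, not Impala code: {{FragmentError}}, its fields, and {{should_retry}} are hypothetical names, and it assumes the coordinator can see every fragment error along with a retryable flag, which is not the case today since only a single {{overall_status}} is propagated.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FragmentError:
    """Hypothetical record of one fragment failure (not an Impala type)."""
    fragment_id: str
    message: str
    retryable: bool


def should_retry(errors: Iterable[FragmentError]) -> bool:
    """Retry only if at least one fragment failed and every failure is retryable."""
    errors = list(errors)
    # With no errors nothing failed, so there is nothing to retry.
    return bool(errors) and all(e.retryable for e in errors)


# The scenario from this issue: a disk failure arrives first, then a
# memory-limit failure arrives before the query is cancelled.
disk = FragmentError("F01", "disk I/O error", retryable=True)
mem = FragmentError("F02", "memory limit exceeded", retryable=False)

# Looking only at the first error would say "retry".
first_error_only = disk.retryable

# Looking at all errors says "do not retry".
all_errors = should_retry([disk, mem])
```

Here {{first_error_only}} mirrors the current first-error-wins behavior, while {{all_errors}} applies the rule in the summary. The two disagree only when a non-retryable error lands inside the window described above.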