[
https://issues.apache.org/jira/browse/IMPALA-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17909095#comment-17909095
]
Michael Smith commented on IMPALA-3978:
---------------------------------------
We should also consider increasing this default. Under heavy load, fragments can
take several minutes to return a batch.
It might help to add an ACK confirming that the message was received and that
the fragment still exists, to reduce the time spent waiting here. It seems most
timeouts occur because the fragment takes a while to generate the first
response batch.
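A rough sketch of the ACK idea, as a small simulation rather than Impala code.
Everything here is hypothetical: the function name, the two-phase split and the
timeout values (5s for the ACK, 120s for the first batch) are assumptions for
illustration only. The point is that a short ACK wait lets the sender fail fast
when the receiver is missing, and keep the long wait for a receiver that is
known to exist but is slow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaitResult:
    ok: bool
    reason: str


def wait_for_receiver(ack_delay_s: Optional[float],
                      batch_delay_s: Optional[float],
                      ack_timeout_s: float = 5.0,       # assumed value
                      batch_timeout_s: float = 120.0    # assumed value
                      ) -> WaitResult:
    """Model a sender that waits first for an ACK, then for the first batch.

    ack_delay_s:   seconds until the receiver acknowledges (None = never).
    batch_delay_s: seconds until the first response batch (None = never).
    """
    # Phase 1: short wait. No ACK means the fragment is missing or has failed,
    # so a longer timeout would not help.
    if ack_delay_s is None or ack_delay_s > ack_timeout_s:
        return WaitResult(False, "no ACK: receiver fragment missing or failed")
    # Phase 2: the fragment exists, so a long wait is justified.
    if batch_delay_s is None or batch_delay_s > batch_timeout_s:
        return WaitResult(False, "ACKed but slow: system is likely overloaded")
    return WaitResult(True, "ok")
```

With this split, only the second failure would be a reason to raise the
timeout.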
> Reword datastream timeout error message to highlight situations where
> changing the timeout makes sense
> ------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-3978
> URL: https://issues.apache.org/jira/browse/IMPALA-3978
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec
> Affects Versions: Impala 2.7.0
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Priority: Major
>
> This error message:
> {quote}
> Datastream sender timed-out waiting for recvr for fragment instance:
> 9f4cdc5cf9c63511:3ad1e5076f9a41e2 (time-out was: 2m).
> Increase --datastream_sender_timeout_ms if you see this message frequently.
> {quote}
> is printed out no matter what the cause of the timeout. However, increasing
> the timeout only really makes sense if the system is overloaded, i.e.
> fragments are being started but they're taking a very long time. We should
> reword the message to make this clearer, so when timeouts are being hit due to
> failures, users don't think they have to reconfigure the system.
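One way the rewording described in the quoted issue could look, as a sketch
only. The `TimeoutCause` enum, the `timeout_message` function and the wording
of the second message are all hypothetical, and the sketch assumes the sender
can tell the two causes apart, which the issue does not establish. Only the
first message text and the flag name come from the issue itself.

```python
from enum import Enum


class TimeoutCause(Enum):
    RECEIVER_SLOW = "slow"        # fragment started but is taking a long time
    RECEIVER_MISSING = "missing"  # fragment failed or never started


def timeout_message(instance_id: str, timeout: str, cause: TimeoutCause) -> str:
    """Build a timeout error that suggests tuning only when tuning can help."""
    base = ("Datastream sender timed-out waiting for recvr for fragment "
            f"instance: {instance_id} (time-out was: {timeout}).")
    if cause is TimeoutCause.RECEIVER_SLOW:
        # Overload case: raising the timeout is a reasonable response.
        return (base + " Increase --datastream_sender_timeout_ms if you see "
                "this message frequently.")
    # Failure case (hypothetical wording): do not suggest reconfiguring.
    return (base + " The receiving fragment may have failed; increasing the "
            "timeout is unlikely to help.")


print(timeout_message("9f4cdc5cf9c63511:3ad1e5076f9a41e2", "2m",
                      TimeoutCause.RECEIVER_MISSING))
```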