[ https://issues.apache.org/jira/browse/KUDU-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783805#comment-16783805 ]
Mike Percy commented on KUDU-1868:
----------------------------------

Merged as part of these patches from Will:
* [https://gerrit.cloudera.org/c/12338/]
* [https://gerrit.cloudera.org/c/12363/]

> Java client mishandles socket read timeouts for scans
> -----------------------------------------------------
>
>                 Key: KUDU-1868
>                 URL: https://issues.apache.org/jira/browse/KUDU-1868
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> Scan calls from the Java client that take longer than the socket read timeout
> get retried (unless the operation timeout has expired) instead of being
> killed. Users will see this:
> {code}
> org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in scan request
> {code}
> Note that the right behavior here would still end up killing the scanner, so
> this is really a problem the user has to deal with! It's usually caused by
> slow IO, combined with very selective scans.
> Workaround: set defaultSocketReadTimeoutMs higher, ideally equal to
> defaultOperationTimeoutMs (the defaults are 10 and 30 seconds respectively).
> But really the user should investigate why the individual scans are so slow.
> One potentially easy fix is to handle retries differently for scanners so
> that the user gets a nicer exception. A harder fix is to handle socket read
> timeouts completely differently: the timeout should be per-RPC and not
> per-TabletClient like it is right now.
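
The "potentially easy fix" in the issue description (handling retries differently for scanners) could be modeled roughly as below. This is only an illustrative sketch, not the change merged in the Gerrit patches and not Kudu client code: `ScanRetryPolicy`, `RpcKind`, `Action`, and `onSocketReadTimeout` are hypothetical names, and the only values taken from the issue are the 10 and 30 second defaults.

```java
public class ScanRetryPolicy {
    /** Hypothetical RPC categories; a real client distinguishes many more. */
    enum RpcKind { SCAN, OTHER }

    /** Hypothetical outcomes after a socket read timeout fires. */
    enum Action { RETRY, FAIL_SCAN, FAIL_OPERATION_TIMED_OUT }

    /**
     * Decides what to do when an RPC hits the socket read timeout.
     * Assumption: elapsedMs is the time spent on the whole operation so far.
     */
    static Action onSocketReadTimeout(RpcKind kind, long elapsedMs, long operationTimeoutMs) {
        if (elapsedMs >= operationTimeoutMs) {
            // The operation deadline has passed, so no retry for any RPC kind.
            return Action.FAIL_OPERATION_TIMED_OUT;
        }
        if (kind == RpcKind.SCAN) {
            // Retrying a scan is what leads to "Invalid call sequence ID",
            // so fail the scanner with a clearer error instead.
            return Action.FAIL_SCAN;
        }
        // Other RPCs keep the existing retry behavior.
        return Action.RETRY;
    }

    public static void main(String[] args) {
        long socketReadTimeoutMs = 10_000; // default named in the issue
        long operationTimeoutMs = 30_000;  // default named in the issue
        // prints FAIL_SCAN
        System.out.println(onSocketReadTimeout(RpcKind.SCAN, socketReadTimeoutMs, operationTimeoutMs));
        // prints RETRY
        System.out.println(onSocketReadTimeout(RpcKind.OTHER, socketReadTimeoutMs, operationTimeoutMs));
    }
}
```

In this model, the workaround of raising the socket read timeout to match the operation timeout means the first branch is always taken, so a slow scan would surface as an operation timeout rather than a retried scan.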