[ 
https://issues.apache.org/jira/browse/KUDU-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783805#comment-16783805
 ] 

Mike Percy commented on KUDU-1868:
----------------------------------

Merged as part of these patches from Will:
 * [https://gerrit.cloudera.org/c/12338/]
 * [https://gerrit.cloudera.org/c/12363/]

 

> Java client mishandles socket read timeouts for scans
> -----------------------------------------------------
>
>                 Key: KUDU-1868
>                 URL: https://issues.apache.org/jira/browse/KUDU-1868
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> Scan calls from the Java client that take more than the socket read timeout 
> get retried (unless the operation timeout has expired) instead of being 
> killed. Users will see this:
> {code}
> org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in 
> scan request
> {code}
> Note that the right behavior here would still end up killing the scanner, so 
> this is really a problem the user has to deal with! It's usually caused by 
> slow IO combined with very selective scans.
> Workaround: set defaultSocketReadTimeoutMs higher, ideally equal to 
> defaultOperationTimeoutMs (the defaults are 10 and 30 seconds respectively). 
> But really the user should investigate why individual scans are so slow.
> One potentially easy fix is to handle retries differently for scanners so 
> that the user gets a nicer exception. A harder fix is to handle socket read 
> timeouts completely differently: the timeout should be per-RPC, not per 
> TabletClient as it is right now.
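
The retry decision described above, and the effect of the workaround, can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class, enum, and method names are hypothetical and do not come from the Kudu client code base, and `sketchedFix` is just one possible reading of the "easy fix" (scan calls are failed with a timeout instead of being retried). The 10 and 30 second values are the defaults quoted in the description.

```java
// Toy model only: none of these names exist in the Kudu Java client.
final class ScanTimeoutSketch {
    enum Action { RETRY, FAIL_WITH_TIMEOUT }

    // Defaults quoted in the issue description (10 s and 30 s).
    static final long DEFAULT_SOCKET_READ_TIMEOUT_MS = 10_000;
    static final long DEFAULT_OPERATION_TIMEOUT_MS = 30_000;

    /**
     * Behavior as reported: when the socket read timeout fires, the call is
     * retried unless the operation timeout has already expired.
     */
    static Action reportedBehavior(long elapsedMs, long operationTimeoutMs) {
        return elapsedMs < operationTimeoutMs ? Action.RETRY : Action.FAIL_WITH_TIMEOUT;
    }

    /**
     * Assumed reading of the "easy fix": scan calls are never retried after
     * a socket read timeout, so the user sees a timeout error directly.
     */
    static Action sketchedFix(boolean isScan, long elapsedMs, long operationTimeoutMs) {
        if (isScan) {
            return Action.FAIL_WITH_TIMEOUT;
        }
        return reportedBehavior(elapsedMs, operationTimeoutMs);
    }

    public static void main(String[] args) {
        // The socket read timeout fires once elapsed time reaches its value.
        // Defaults: fires at 10 s, operation timeout is 30 s, so the scan is retried.
        System.out.println(reportedBehavior(
                DEFAULT_SOCKET_READ_TIMEOUT_MS, DEFAULT_OPERATION_TIMEOUT_MS));

        // Workaround: socket read timeout raised to equal the operation timeout,
        // so by the time it fires the operation timeout has expired as well.
        System.out.println(reportedBehavior(
                DEFAULT_OPERATION_TIMEOUT_MS, DEFAULT_OPERATION_TIMEOUT_MS));

        // Sketched fix: a scan call fails with a timeout even under the defaults.
        System.out.println(sketchedFix(
                true, DEFAULT_SOCKET_READ_TIMEOUT_MS, DEFAULT_OPERATION_TIMEOUT_MS));
    }
}
```

In this model the default settings lead to a retry of the scan call, which is the path that ends in the "Invalid call sequence ID" error, while the workaround and the sketched fix both end in a plain timeout. In the affected Java client, the two settings named in the workaround are, to my knowledge, set through methods of the same names on `KuduClient.KuduClientBuilder`.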



