[jira] [Updated] (KUDU-694) Re-visit C++ client scan retry logic

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-694:
-----------------------------
Target Version/s: 1.8.0  (was: 1.5.0)

> Re-visit C++ client scan retry logic
> 
>
> Key: KUDU-694
> URL: https://issues.apache.org/jira/browse/KUDU-694
> Project: Kudu
> Issue Type: Bug
> Components: client
> Affects Versions: Private Beta
> Reporter: Andrew Wang
> Priority: Major
>
> There are a number of remaining issues with scanner robustness, even after 
> KUDU-597:
> * Once a node is marked as failed, it will not be used again in the call. 
> This is more of an issue with longer timeouts (since the node is more likely 
> to come back), or if the scan is LEADER_ONLY (since only one node being down 
> leads to unavailability).
> * In the LEADER_ONLY case, since we don't refresh quorum information within 
> the call, we won't recover when a failover happens.
> * The scanner code calls a number of other RPCs that are not retried on 
> error, e.g. LookupTabletByKey or RefreshProxy's DNS resolution in 
> GetTabletServer.
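
The behaviour the issue asks for can be sketched roughly as follows. This is a minimal illustration, not the Kudu client code: every type and function name here (Replica, Selection, ScanWithRetry, the lookup and try_scan callbacks) is hypothetical, and a fixed attempt budget stands in for the real deadline and backoff handling. The sketch keeps no permanent list of failed nodes, and it re-fetches the replica list on every round so that a leader change becomes visible within the same call.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types for illustration only; none of these are Kudu client APIs.
struct Replica {
  std::string host;
  bool is_leader;
};

enum class Selection { kLeaderOnly, kAnyReplica };

// Result of a single scan attempt against a single replica.
enum class AttemptResult { kOk, kUnreachable, kNotLeader };

struct ScanOutcome {
  bool ok;
  std::string served_by;  // host that answered; empty on failure
  int attempts;           // attempts consumed from the budget
};

// Retries a scan until one attempt succeeds or max_attempts is used up.
//  - lookup stands in for the tablet lookup; it is called once per round, so
//    replica roles are refreshed within the call.
//  - try_scan performs one attempt against one replica.
// A replica that failed in an earlier round is tried again in later rounds.
ScanOutcome ScanWithRetry(
    Selection selection, int max_attempts,
    const std::function<std::vector<Replica>()>& lookup,
    const std::function<AttemptResult(const Replica&)>& try_scan) {
  assert(max_attempts > 0);
  int attempts = 0;
  while (attempts < max_attempts) {
    const std::vector<Replica> replicas = lookup();  // refresh every round
    bool tried_any = false;
    for (const Replica& r : replicas) {
      if (selection == Selection::kLeaderOnly && !r.is_leader) continue;
      if (attempts >= max_attempts) break;
      tried_any = true;
      ++attempts;
      if (try_scan(r) == AttemptResult::kOk) {
        return {true, r.host, attempts};
      }
    }
    // No eligible replica this round (for example, no leader known yet):
    // charge the round against the budget so the loop always terminates.
    if (!tried_any) ++attempts;
  }
  return {false, "", attempts};
}
```

In a real client the attempt budget would be replaced by the scan deadline, with a backoff between rounds, and the lookup itself would also need to be retried on error.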



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-694) Re-visit C++ client scan retry logic

2017-07-06 Thread Alexey Serbin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-694:
--------------------------------
Summary: Re-visit C++ client scan retry logic  (was: Revist C++ client scan 
retry logic)
