[ 
https://issues.apache.org/jira/browse/ACCUMULO-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037441#comment-14037441
 ] 

Mike Drob commented on ACCUMULO-2868:
-------------------------------------

Todd outlines some more [advanced 
logic|https://issues.apache.org/jira/browse/HDFS-599?focusedCommentId=12756258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12756258]
 for HDFS deciding when to mark a node as dead, rather than just X retries * Y 
seconds.

> Make master configurable in when it kills tablet servers
> --------------------------------------------------------
>
>                 Key: ACCUMULO-2868
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2868
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 1.6.0
>            Reporter: Bill Havanki
>              Labels: admin, configuration, master
>
> On a cluster with a flaky network, the master may be unable to contact a 
> tserver for some moderate amount of time and then direct it to terminate, 
> even though the tserver is still up. (See {{gatherTableInformation()}} and 
> {{StatusThread}}.) It does not appear possible to configure the master to be 
> more forgiving in these checks. Relevant constants:
> * {{DEFAULT_WAIT_FOR_WATCHER}} - interval between server checks
> * {{MAX_BAD_STATUS_COUNT}} - the maximum number of failed attempts allowed 
> before killing the tserver
> Making one or both of those configurable, or some other pertinent parameter 
> configurable, would allow cluster admins to cope with mild network maladies. 
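
As a rough illustration of the request above, the check could be driven by two settings instead of fixed constants. This is only a sketch under stated assumptions, not the master's actual code: the class name BadStatusTracker, its constructor parameters, and its methods are hypothetical, and the rule "halt once the consecutive failure count exceeds the limit" is an assumed reading of MAX_BAD_STATUS_COUNT.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: tracks consecutive failed status checks per tserver,
 * with the limit and check interval supplied by configuration rather than
 * hard-coded constants.
 */
final class BadStatusTracker {
    // Configurable stand-in for MAX_BAD_STATUS_COUNT (assumed semantics).
    private final int maxBadStatusCount;
    // Configurable stand-in for DEFAULT_WAIT_FOR_WATCHER, in milliseconds.
    private final long checkIntervalMs;
    // Consecutive failure count, keyed by server name.
    private final Map<String, Integer> badCounts = new HashMap<>();

    BadStatusTracker(int maxBadStatusCount, long checkIntervalMs) {
        if (maxBadStatusCount < 0 || checkIntervalMs <= 0) {
            throw new IllegalArgumentException("limit must be >= 0 and interval > 0");
        }
        this.maxBadStatusCount = maxBadStatusCount;
        this.checkIntervalMs = checkIntervalMs;
    }

    /**
     * Records the result of one status check.
     * Returns true when the server has failed more consecutive checks than
     * the configured limit and should be directed to terminate.
     */
    boolean record(String server, boolean reachable) {
        if (reachable) {
            // Any successful contact clears the failure history.
            badCounts.remove(server);
            return false;
        }
        int count = badCounts.merge(server, 1, Integer::sum);
        return count > maxBadStatusCount;
    }

    /** Approximate outage the master tolerates: allowed failures * interval. */
    long approximateToleranceMs() {
        return (long) maxBadStatusCount * checkIntervalMs;
    }
}
```

With a limit of 3 and a 1000 ms interval, this sketch tolerates three consecutive failed checks and flags the tserver on the fourth; raising either value lengthens the tolerated outage, which is the knob an admin on a flaky network would want.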



--
This message was sent by Atlassian JIRA
(v6.2#6252)
