[ https://issues.apache.org/jira/browse/HBASE-21806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755559#comment-16755559 ]
Duo Zhang commented on HBASE-21806:
-----------------------------------

One of my goals in introducing AsyncFSWAL was to set a smaller timeout when writing the WAL, so that if there are slow DNs the sync fails soon and we roll a new WAL...

But anyway, the proposal here may still be useful. For example, we can set the timeout to 5 seconds and the slow-sync threshold to 2 seconds: if we haven't got a response after 5 seconds, the sync fails; if the sync does come back but still takes more than 2 seconds, we roll a new WAL.

> add an option to roll WAL on very slow syncs
> --------------------------------------------
>
>                 Key: HBASE-21806
>                 URL: https://issues.apache.org/jira/browse/HBASE-21806
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Major
>         Attachments: HBASE-21806.patch
>
>
> In large heterogeneous clusters sometimes a slow datanode can cause WAL syncs
> to be very slow. In this case, before the bad datanode recovers, or is
> discovered and repaired, it would be helpful to roll WAL on a very slow sync
> to get a new pipeline.
> Otherwise the slow WAL will impact write latency for a long time (slow writes
> result in fewer writes, which in turn result in the WAL not being rolled for
> longer)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
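
The timeout-plus-threshold idea from the comment above can be sketched roughly as follows. This is only an illustration of the decision logic, not code from the attached patch or from HBase itself: the class SlowSyncPolicy, the Action enum, and the decide method are hypothetical names, and the 5 second / 2 second values are just the ones used as an example in the comment.

```java
import java.time.Duration;

/**
 * Hypothetical sketch of the policy described in the comment.
 * None of these names exist in HBase; they are for illustration only.
 */
public final class SlowSyncPolicy {

    /** What the WAL should do after observing a sync. */
    public enum Action { NONE, ROLL_WAL, FAIL_SYNC_AND_ROLL }

    private final Duration syncTimeout;       // e.g. 5 seconds in the comment
    private final Duration slowSyncThreshold; // e.g. 2 seconds in the comment

    public SlowSyncPolicy(Duration syncTimeout, Duration slowSyncThreshold) {
        // The threshold only makes sense if it is below the hard timeout.
        if (slowSyncThreshold.compareTo(syncTimeout) >= 0) {
            throw new IllegalArgumentException("threshold must be below timeout");
        }
        this.syncTimeout = syncTimeout;
        this.slowSyncThreshold = slowSyncThreshold;
    }

    /**
     * @param elapsed time since the sync was issued
     * @param acked   whether the datanode pipeline has responded
     */
    public Action decide(Duration elapsed, boolean acked) {
        if (!acked) {
            // No response yet: fail only once the hard timeout is reached.
            return elapsed.compareTo(syncTimeout) >= 0
                ? Action.FAIL_SYNC_AND_ROLL
                : Action.NONE;
        }
        // The sync came back; roll if it took longer than the threshold.
        return elapsed.compareTo(slowSyncThreshold) > 0
            ? Action.ROLL_WAL
            : Action.NONE;
    }
}
```

With a 5 second timeout and a 2 second threshold, a sync acknowledged after 3 seconds would succeed but trigger a roll, while a sync still unacknowledged at 5 seconds would be failed.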