[ 
https://issues.apache.org/jira/browse/HBASE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438854#comment-13438854
 ] 

stack commented on HBASE-6490:
------------------------------

What should we increase it to, N?  We can't increase it just for the WAL... it'd 
be global?
                
> 'dfs.client.block.write.retries' value could be increased in HBase
> ------------------------------------------------------------------
>
>                 Key: HBASE-6490
>                 URL: https://issues.apache.org/jira/browse/HBASE-6490
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.96.0
>         Environment: all
>            Reporter: nkeywal
>            Priority: Minor
>
> When allocating a new node during writing, hdfs tries 
> 'dfs.client.block.write.retries' times (default 3) to write the block. When 
> it fails, it goes back to the namenode for a new list, and raises an error if 
> the number of retries is reached. In HBase, if the error occurs while we're 
> writing an hlog file, it will trigger a region server abort (as hbase does not 
> trust the log anymore). For the simple case (new, and as such empty, log file), 
> this seems to be ok, and we don't lose data. There could be some complex 
> cases if the error occurs on an hlog file with multiple blocks already written.
> Log lines are:
> "Exception in createBlockOutputStream", then "Abandoning block " followed by 
> "Excluding datanode " for a retry.
> IOException: "Unable to create new block.", when the number of retries is 
> reached.
> Probability of occurrence seems quite low: (number of bad nodes / number of 
> nodes)^(number of retries), and it implies that you have a region server 
> without its datanode. But it's per new block.
> Increasing the default value of 'dfs.client.block.write.retries' could make 
> sense to be better covered in chaotic conditions.
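
To get a feel for the estimate quoted above, here is a minimal sketch that 
evaluates (number of bad nodes / number of nodes)^(number of retries). The 
function names and the cluster numbers are hypothetical, and the second helper 
assumes that block allocations fail independently, which the issue does not 
state.

```python
def block_write_failure_probability(bad_nodes, total_nodes, retries):
    """Per-block estimate from the issue text: (bad / total) ** retries."""
    if total_nodes <= 0 or not 0 <= bad_nodes <= total_nodes:
        raise ValueError("need 0 <= bad_nodes <= total_nodes and total_nodes > 0")
    if retries < 1:
        raise ValueError("retries must be at least 1")
    return (bad_nodes / total_nodes) ** retries


def failure_probability_over_blocks(per_block_probability, new_blocks):
    """Chance of at least one failure across new_blocks allocations.

    Assumption (not from the issue): each allocation fails independently.
    """
    return 1.0 - (1.0 - per_block_probability) ** new_blocks


if __name__ == "__main__":
    # Hypothetical cluster: 100 datanodes, 5 of them bad.
    for retries in (3, 5, 10):
        p = block_write_failure_probability(5, 100, retries)
        print(f"retries={retries}: per-block estimate {p:.3e}")
```

With the default of 3 and these made-up numbers, the per-block estimate is 
about 1.25e-4. Since it applies to every new block, the second helper can be 
used to see how it accumulates over many allocations.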

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
