[jira] [Commented] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL

stack (JIRA) Thu, 27 Aug 2015 16:47:24 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717769#comment-14717769
 ]


stack commented on HBASE-14317:
-------------------------------

That patch don't look bad to me.  Usually we just return immediately out of the 
wait on safe point so we won't be doing all the currentTimeMillis calls all the 
time, only when we waiting on safe point.

> Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
> -----------------------------------------------------
>
>                 Key: HBASE-14317
>                 URL: https://issues.apache.org/jira/browse/HBASE-14317
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.1.1
>            Reporter: stack
>            Priority: Critical
>         Attachments: HBASE-14317.patch, [Java] RS stuck on WAL sync to a dead 
> DN - Pastebin.com.html, raw.php, subset.of.rs.log
>
>
> hbase-1.1.1 and hadoop-2.7.1
> We try to roll logs because can't append (See HDFS-8960) but we get stuck. 
> See attached thread dump and associated log. What is interesting is that 
> syncers are waiting to take syncs to run and at same time we want to flush so 
> we are waiting on a safe point but there seems to be nothing in our ring 
> buffer; did we go to roll log and not add safe point sync to clear out 
> ringbuffer?
> Needs a bit of study. Try to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL

Reply via email to