[jira] [Commented] (HDFS-4621) additional logging to help diagnose slow QJM logSync

Todd Lipcon (JIRA) Wed, 27 Mar 2013 11:45:17 -0700

    [ https://issues.apache.org/jira/browse/HDFS-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615618#comment-13615618 ]


Todd Lipcon commented on HDFS-4621:
-----------------------------------

Thanks. I decided not to change the warning thresholds: they're set to 1 
second, which is long enough that you wouldn't expect any well-configured 
disk to take that long to fsync. Even if the timeout is 20 seconds, you'd 
probably want to know when your IO takes more than tens of milliseconds, so 
1 second should be conservative. Committing to branch-2 and trunk.
                
> additional logging to help diagnose slow QJM logSync
> ----------------------------------------------------
>
>                 Key: HDFS-4621
>                 URL: https://issues.apache.org/jira/browse/HDFS-4621
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, qjm
>    Affects Versions: 2.0.3-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: hdfs-4621.txt
>
>
> I've been working on diagnosing an issue with a cluster that occasionally 
> sees slow logSync calls to QJM. A few more pieces of logging would help 
> with this:
> - in the warning messages on the client side leading up to a timeout, include 
> which nodes have responded and which ones are still pending
> - on the server side, when we actually call FileChannel.force, log a warning 
> if the sync takes longer than 1 second
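
To make the two logging points above concrete, here is a minimal Java sketch.
It is not the attached patch: the class name SlowSyncLogging, the helper
names, the message wording, and the node names in the usage note are all
hypothetical. It only assumes the standard FileChannel.force call named in
the description and the 1 second threshold mentioned there.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, for illustration only.
class SlowSyncLogging {
    // Assumed threshold, mirroring the 1 second mentioned in the issue.
    static final long WARN_SYNC_MILLIS = 1000;

    /**
     * Client side: build a warning that names which nodes have responded
     * and which are still pending while waiting on a quorum.
     */
    static String describeQuorumWait(long waitedMs, long timeoutMs,
            Collection<String> allNodes, Set<String> responded) {
        List<String> done = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        // Preserve the order of allNodes so the message is stable.
        for (String node : allNodes) {
            if (responded.contains(node)) {
                done.add(node);
            } else {
                pending.add(node);
            }
        }
        return String.format(
            "Waited %d ms (timeout=%d ms) for a response. "
                + "Responded: %s. Pending: %s",
            waitedMs, timeoutMs, done, pending);
    }

    /** True when a sync duration is above the warning threshold. */
    static boolean shouldWarn(long elapsedMs) {
        return elapsedMs > WARN_SYNC_MILLIS;
    }

    /**
     * Server side: time FileChannel.force and warn if it was slow.
     * Returns the elapsed time in milliseconds.
     */
    static long forceAndTime(FileChannel channel) throws IOException {
        long start = System.nanoTime();
        channel.force(true); // flush file data and metadata to the device
        long elapsedMs =
            TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        if (shouldWarn(elapsedMs)) {
            // A real service would use its logging framework here.
            System.err.println("WARN: sync took " + elapsedMs + " ms");
        }
        return elapsedMs;
    }
}
```

With three hypothetical nodes jn1, jn2, jn3 of which only jn1 has answered,
describeQuorumWait reports jn1 as responded and jn2, jn3 as pending, which
is the information the first bullet asks for. Using System.nanoTime rather
than wall-clock time keeps the measurement unaffected by clock adjustments.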

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
