[jira] [Updated] (HDFS-6110) adding more slow action log in critical write path

stack (JIRA) Fri, 25 Apr 2014 13:09:14 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


stack updated HDFS-6110:
------------------------

    Attachment: HDFS-6110v6.txt

[~xieliang007]'s latest patch, with the offline review feedback I got from our 
Todd folded in (see below): one threshold for the dfsclient (a higher one, so 
folks running MR jobs don't get annoyed by all the WARNings about slow i/o), and 
another for the datanode side, which is much lower so we can see bad i/os.

{code}
16:38 < todd> stack: just looked at 6110. had one more thought after commenting 
on the JIRA
16:38 < todd> you think we should add a separate config for client vs server?
16:38 < todd> I'm afraid that the 300ms default may be a little aggressive for 
the client - people using hadoop fs -put to upload files may get kind of 
nervous the next time they upgrade if they start seeing warnings
16:38 < todd> MR jobs too
16:39 < todd> may be better to have the client default be 10sec or something 
really long, and then HBase could tune it down for WAL files
16:39 < stack> todd: thanks boss
16:39 < todd> you think i'm crazy?
16:39 < stack> no
16:39 < stack> Testing it, it is "illuminating" to see how long stuff takes
16:39 < todd> k. yea
16:39 < todd> I had a patch like that once on the server side
16:39 < stack> Was worried though that it'd freak folks out.
16:40 < stack> Or, rather, they'd ignore what is being said and just consider 
it 'noise'.
16:40 < todd> yea
16:40 < todd> for a throughput app it is kind of noise
16:40 < todd> but hbase could definitely tune the default inside the RS down
16:40 < stack> Let me do as you suggest.
16:40 < todd> k
16:40 < stack> Thanks for review.
16:40 < todd> feel free to paste this convo into the jira so it makes sense :)
16:40 < todd> didn't want to post yet another comment and pollute everyone's 
mailboxes
16:41  * stack nod
{code}
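
To make the two-threshold idea from the conversation concrete, here is a minimal sketch in plain Java. It is not the code in the attached patch: the class name, the constants, and the message format are all hypothetical, and the 10sec client value is only the number Todd floated in the chat, next to the 300ms figure he mentions as the existing default.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: one slow-i/o warning threshold per side.
 * None of these names come from the HDFS-6110 patch itself.
 */
public class SlowIoWarner {
    // Assumed values, taken from the chat: a "really long" client default
    // and the 300ms figure for the datanode side.
    public static final long CLIENT_THRESHOLD_MS = 10_000L;
    public static final long DATANODE_THRESHOLD_MS = 300L;

    private final String side;
    private final long thresholdMs;
    private final List<String> warnings = new ArrayList<>();

    public SlowIoWarner(String side, long thresholdMs) {
        this.side = side;
        this.thresholdMs = thresholdMs;
    }

    /** Records a warning and returns true if the op took longer than the threshold. */
    public boolean check(String op, long elapsedMs) {
        if (elapsedMs > thresholdMs) {
            warnings.add(String.format("WARN %s: slow %s took %dms (threshold=%dms)",
                    side, op, elapsedMs, thresholdMs));
            return true;
        }
        return false;
    }

    public List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        SlowIoWarner client = new SlowIoWarner("dfsclient", CLIENT_THRESHOLD_MS);
        SlowIoWarner datanode = new SlowIoWarner("datanode", DATANODE_THRESHOLD_MS);

        // The same 450ms operation is noise for a throughput client
        // but worth a WARN on the datanode side.
        long elapsedMs = 450L;
        client.check("write packet", elapsedMs);
        datanode.check("write packet", elapsedMs);

        for (String w : client.warnings()) System.out.println(w);
        for (String w : datanode.warnings()) System.out.println(w);
    }
}
```

With these assumed values only the datanode side logs anything for a 450ms write. A latency-sensitive client such as an HBase regionserver could construct its warner with a much lower threshold for WAL files, which is the tuning Todd describes.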

> adding more slow action log in critical write path
> --------------------------------------------------
>
>                 Key: HDFS-6110
>                 URL: https://issues.apache.org/jira/browse/HDFS-6110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6110-v2.txt, HDFS-6110.txt, HDFS-6110v3.txt, 
> HDFS-6110v4.txt, HDFS-6110v5.txt, HDFS-6110v6.txt
>
>
> After digging into an HBase write spike caused by slow buffer I/O in our 
> cluster, we realized we had better add more abnormal-latency warning logs to 
> the write flow, so that if others hit an HLog sync spike, we can get more 
> detail from the HDFS side at the same time.
> Patch will be uploaded soon.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
