[ https://issues.apache.org/jira/browse/HDFS-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HDFS-6110:
------------------------
    Attachment: HDFS-6110v6.txt

[~xieliang007] 's latest patch adding in offline review feedback I got from our Todd (See below): i.e. having one threshold for dfsclient (a higher one so folks MR'ing don't get annoyed by all the WARNings about slow i/o), and then another for datanode side which is much lower so we can see bad i/os.

{code}
16:38 < todd> stack: just looked at 6110. had one more thought after commenting on the JIRA
16:38 < todd> you think we should add a separate config for client vs server?
16:38 < todd> I'm afraid that the 300ms default may be a little aggressive for the client - people using hadoop fs -put to upload files may get kind of nervous the next time they upgrade if they start seeing warnings
16:38 < todd> MR jobs too
16:39 < todd> may be better to have the client default be 10sec or something really long, and then HBase could tune it down for WAL files
16:39 < stack> todd: thanks boss
16:39 < todd> you think i'm crazy?
16:39 < stack> no
16:39 < stack> Testing it, it is "illuminating" to see how long stuff takes
16:39 < todd> k. yea
16:39 < todd> I had a patch like that once on the server side
16:39 < stack> Was worried though that it'd freak folks out.
16:40 < stack> Or, rather, they'd ignore what is being said and just consider it 'noise'.
16:40 < todd> yea
16:40 < todd> for a throughput app it is kind of noise
16:40 < todd> but hbase could definitely tune the default inside the RS down
16:40 < stack> Let me do as you suggest.
16:40 < todd> k
16:40 < stack> Thanks for review.
16:40 < todd> feel free to paste this convo into the jira so it makes sense :)
16:40 < todd> didn't want to post yet another comment and pollute everyone's mailboxes
16:41 * stack nod
{code}

> adding more slow action log in critical write path
> --------------------------------------------------
>
>                 Key: HDFS-6110
>                 URL: https://issues.apache.org/jira/browse/HDFS-6110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6110-v2.txt, HDFS-6110.txt, HDFS-6110v3.txt,
>                      HDFS-6110v4.txt, HDFS-6110v5.txt, HDFS-6110v6.txt
>
>
> After digging into an HBase write spike issue caused by slow buffer I/O in our
> cluster, I realized we had better add more abnormal-latency warning logs to the
> write flow, so that if others hit an HLog sync spike, they can get more
> detailed info from the HDFS side at the same time.
> Patch will be uploaded soon.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
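
For anyone skimming the thread, the split discussed in the chat (a low threshold on the datanode side, a much higher one on the client side) can be sketched roughly as below. This is an illustration only, not the code in HDFS-6110v6.txt: the class `SlowIoWarner`, its `report` method, and both constants are hypothetical names, and the 300 ms and 10 s values are just the figures floated in the conversation, not confirmed defaults.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the HDFS-6110 patch. All names are hypothetical. */
class SlowIoWarner {
    // Assumed values, taken from the numbers mentioned in the chat log.
    static final long DATANODE_THRESHOLD_MS = 300;    // low, so bad i/os show up
    static final long CLIENT_THRESHOLD_MS = 10_000;   // high, so throughput jobs stay quiet

    private final String side;
    private final long thresholdMs;
    // Warnings are collected in a list here; real code would go through a logger.
    final List<String> warnings = new ArrayList<>();

    SlowIoWarner(String side, long thresholdMs) {
        this.side = side;
        this.thresholdMs = thresholdMs;
    }

    /** Records a warning only when the operation took longer than the threshold. */
    boolean report(String op, long elapsedMs) {
        if (elapsedMs <= thresholdMs) {
            return false;
        }
        warnings.add(String.format("Slow %s on %s took %dms (threshold=%dms)",
                op, side, elapsedMs, thresholdMs));
        return true;
    }

    public static void main(String[] args) {
        SlowIoWarner datanode = new SlowIoWarner("datanode", DATANODE_THRESHOLD_MS);
        SlowIoWarner client = new SlowIoWarner("client", CLIENT_THRESHOLD_MS);

        long elapsedMs = 450; // the same slow-ish flush, seen from both sides
        datanode.report("flush", elapsedMs);
        client.report("flush", elapsedMs);

        // Only the datanode side warns:
        // Slow flush on datanode took 450ms (threshold=300ms)
        datanode.warnings.forEach(System.out::println);
        client.warnings.forEach(System.out::println);
    }
}
```

With per-side thresholds, a latency-sensitive caller such as HBase could pass a smaller client value for its WAL files, while `hadoop fs -put` users and MR jobs keep the quiet default.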