[ https://issues.apache.org/jira/browse/HDFS-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153706#comment-14153706 ]
Colin Patrick McCabe commented on HDFS-7055:
--------------------------------------------

bq. I think SpanReceiverHost#getUniqueLocalTraceFileName is useful but it should belong to htrace. Can I port it to htrace later and remove from hadoop on the next bumping of htrace version?

Yeah, absolutely.

bq. I attached screenshot of spans for reference. It shows trace of getting 1MB of file by FsShell on pseudo distributed cluster with .004 patch. The trace consists of over 500 spans in this case.... Setting hadoop.trace.sampler=ProbabilitySampler did not reduce the number of spans above because Trace#startSpan always start span without regarding to sampler when there is ongoing trace.

Well, I guess it depends on what you mean by "granular." :) I certainly don't want all trace spans to be activated randomly. We need to see the parent/child relationships between the spans. I think the granularity of individual reads is just about right. Less than that, and we start not being able to see the big picture. More than that, and we can't effectively do random sampling.

But you are right that we have too many trace spans here. I thought about this a little more, and I don't think we have to create a trace span for each BlockReader operation. We can create trace spans only for the operations that actually perform I/O to the datanode: that is, only the reads that actually result in data being read from the DN, rather than every read done via a BlockReader. Similarly, for BlockReaderLocal, we can trace the times we fill up the buffer, but not every call into BlockReaderLocal.
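
To make the "trace the buffer fills, not every call" idea concrete, here is a minimal, self-contained Java sketch. It does not use the real htrace or HDFS classes: RecordingTracer, BufferedBlockReader, and spansForRead are hypothetical names, and the 2 KB read size and 64 KB buffer size are made-up numbers chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the HDFS BlockReader or the htrace API. */
class BufferFillTracingSketch {

    /** Stand-in for a tracer: records the name of every span it is asked to start. */
    static final class RecordingTracer {
        final List<String> spans = new ArrayList<>();

        void span(String name) {
            spans.add(name);
        }
    }

    /** Serves small reads from a buffer and starts a span only when the buffer is refilled. */
    static final class BufferedBlockReader {
        private final RecordingTracer tracer;
        private final int bufferSize;
        private long remainingOnDatanode; // bytes not yet fetched (assumed model of the DN)
        private int buffered = 0;         // bytes currently available in the local buffer

        BufferedBlockReader(RecordingTracer tracer, int bufferSize, long blockLength) {
            this.tracer = tracer;
            this.bufferSize = bufferSize;
            this.remainingOnDatanode = blockLength;
        }

        /** Returns the number of bytes read, or -1 at the end of the block. */
        int read(int len) {
            if (buffered == 0) {
                if (remainingOnDatanode == 0) {
                    return -1;
                }
                // Only this path does "real" I/O, so only this path is traced.
                tracer.span("fillBuffer");
                buffered = (int) Math.min(bufferSize, remainingOnDatanode);
                remainingOnDatanode -= buffered;
            }
            // Reads served from the buffer start no span.
            int n = Math.min(len, buffered);
            buffered -= n;
            return n;
        }
    }

    /** Reads a whole block in readSize chunks and returns how many spans were started. */
    static int spansForRead(long blockLength, int readSize, int bufferSize) {
        RecordingTracer tracer = new RecordingTracer();
        BufferedBlockReader reader = new BufferedBlockReader(tracer, bufferSize, blockLength);
        while (reader.read(readSize) > 0) {
            // Discard the data; only the span count matters here.
        }
        return tracer.spans.size();
    }

    public static void main(String[] args) {
        System.out.println(spansForRead(1L << 20, 2048, 64 * 1024));
    }
}
```

With these assumed sizes, reading 1 MB takes 512 read calls but starts only 16 spans, one per buffer fill.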

> Add tracing to DFSInputStream
> -----------------------------
>
> Key: HDFS-7055
> URL: https://issues.apache.org/jira/browse/HDFS-7055
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 2.6.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-7055.002.patch, HDFS-7055.003.patch, HDFS-7055.004.patch, screenshot-get-1mb.png
>
>
> Add tracing to DFSInputStream.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)