[ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903734#action_12903734 ]
Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

I think Doug's request to provide benchmarks is reasonable. We should benchmark more.
So I modified {{NNThroughputBenchmark}} to benchmark {{getFileStatus()}}.

In current trunk {{getFileStatus()}} calls {{Groups.getGroups()}}, which has a {{LOG.debug()}} in it. {{NNThroughputBenchmark}} lets you specify a logLevel, so if I run it with {{logLevel = DEBUG}}, the debug message from {{Groups.getGroups()}} is printed for every {{getFileStatus()}} call, and no other messages are printed.

I compared two versions of the code with {{logLevel = INFO}}:
# Unmodified {{Groups.getGroups()}}, where the argument for the debug call is built by concatenating strings.
# Modified {{Groups.getGroups()}} with {{if(LOG.isDebugEnabled())}} before the logging statement.

Results:
# 10,000 calls of {{getFileStatus()}} in (1) yield 22,905 ops/sec.
10,000 calls of {{getFileStatus()}} in (2) yield 24,610 ops/sec.
This is about a 7% difference.
# I increased the namespace to 100,000 files and performed that many calls of {{getFileStatus()}} for both setups. I still see a 3-4% improvement in case (2). The gain is smaller because it is absorbed by the increased cost of navigating down the namespace tree.
# Did the same with 1,000 files. (2) is better by about 12-13%.

{{Groups.getGroups()}} is a very popular method: it is part of user authentication, so this log message is part of each and every name-node call. In many cases the cost of concatenating strings in it will be absorbed by disk I/O operations, but it is still good to have this improvement.

I'll post the patch for {{NNThroughputBenchmark}} in another jira shortly.
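To make the difference between the two compared versions concrete, here is a minimal sketch of the guarded and unguarded logging patterns. It is not the Hadoop code: {{SimpleLog}}, {{describe()}}, the message text and the user names are hypothetical stand-ins, and the sketch only counts how often the message string gets built rather than measuring throughput.

```java
class DebugGuardSketch {

    /** Hypothetical stand-in for a commons-logging style Log (assumption, not Hadoop's logger). */
    static final class SimpleLog {
        private final boolean debugEnabled;
        int linesWritten = 0;

        SimpleLog(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        boolean isDebugEnabled() {
            return debugEnabled;
        }

        void debug(String message) {
            // The message is dropped when debug is off, but the caller has
            // already paid for building it by the time we get here.
            if (debugEnabled) {
                linesWritten++;
            }
        }
    }

    /** Counts how many times the debug message string was actually built. */
    static int messagesBuilt = 0;

    /** Hypothetical message; the concatenation is the cost being avoided. */
    static String describe(String user) {
        messagesBuilt++;
        return "looked up groups for user '" + user + "'";
    }

    /** Version (1): the argument is always evaluated, even at INFO level. */
    static void unguarded(SimpleLog log, String user) {
        log.debug(describe(user));
    }

    /** Version (2): the argument is evaluated only when debug is enabled. */
    static void guarded(SimpleLog log, String user) {
        if (log.isDebugEnabled()) {
            log.debug(describe(user));
        }
    }

    /** Runs the chosen version 'calls' times and returns the number of messages built. */
    static int countBuilds(boolean debugEnabled, boolean useGuard, int calls) {
        SimpleLog log = new SimpleLog(debugEnabled);
        messagesBuilt = 0;
        for (int i = 0; i < calls; i++) {
            if (useGuard) {
                guarded(log, "user" + i);
            } else {
                unguarded(log, "user" + i);
            }
        }
        return messagesBuilt;
    }

    public static void main(String[] args) {
        int calls = 10_000;
        System.out.println("debug off, unguarded: " + countBuilds(false, false, calls) + " messages built");
        System.out.println("debug off, guarded:   " + countBuilds(false, true, calls) + " messages built");
        System.out.println("debug on,  guarded:   " + countBuilds(true, true, calls) + " messages built");
    }
}
```

With debug disabled, the unguarded version still builds one string per call, while the guarded version builds none. That avoided work is what the benchmark numbers above are measuring.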

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is
> true, in some cases it's expensive to construct the string that is being
> printed to log. It's much easier to always use LOG.isDebugEnabled() because
> it's easier to check (rather than in each case reason whether it's necessary
> or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.