[ 
https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903734#action_12903734
 ] 

Konstantin Shvachko commented on HADOOP-6884:
---------------------------------------------

I think Doug's request to provide benchmarks is reasonable. We should benchmark 
more.
So I modified {{NNThroughputBenchmark}}, to benchmark {{getFileStatus()}}. In 
current trunk {{getFileStatus()}} calls {{Groups.getGroups()}}, which has a 
LOG.debug() in it. In {{NNThroughputBenchmark}} you can specify a logLevel, so 
if I run it with {{logLevel = DEBUG}} the debug message from 
{{Groups.getGroups()}} is printed for every {{getFileStatus()}} call, and no 
other messages are printed.

I compared two versions of the code with {{logLevel = INFO}}:
# Unmodified {{Groups.getGroups()}}, where the argument for the debug call is 
calculated as a sum of strings.
# Modified {{Groups.getGroups()}} with {{if(LOG.isDebugEnabled())}} before the 
logging statement.

Results:
# 10,000 calls of {{getFileStatus()}} in (1) yield 22,905 ops/sec
 10,000 calls of {{getFileStatus()}} in (2) yield 24,610 ops/sec
 This about 7% difference.
# I increased the namespace to 100,000 files and performed that many calls of 
{{getFileStatus()}} for both setups. I still see 3-4% improvement in case (2).
  The gain is less because it is absorbed by the increased cost of navigating 
down the namespace tree.
# Did the same with 1,000 files. (2) is better about 12-13%.

{{Groups.getGroups()}} is a very popular method it is a part of user 
authentication, so this log message is a part of each and every name-node call. 
In many cases the cost of concatenating strings in it will be absorbed by disk 
io operations, but it is still good to have this improvement.

I'll post the patch for {{NNThroughputBenchmark}} in another jira shortly.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, 
> HADOOP-6884-0.22.patch
>
>
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is 
> true, in some cases it's expensive to construct the string that is being 
> printed to log. It's much easier to always use LOG.isDebugEnabled() because 
> it's easier to check (rather than in each case reason whether it's necessary 
> or not).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to