[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis

Hadoop QA (JIRA) Thu, 23 May 2013 16:47:22 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665847#comment-13665847
 ]


Hadoop QA commented on HDFS-4817:
---------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12584581/HDFS-4817.004.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4432//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4432//console

This message is automatically generated.
                
> make HDFS advisory caching configurable on a per-file basis
> -----------------------------------------------------------
>
>                 Key: HDFS-4817
>                 URL: https://issues.apache.org/jira/browse/HDFS-4817
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, 
> HDFS-4817.004.patch
>
>
> HADOOP-7753 and related JIRAs introduced some performance optimizations for 
> the DataNode.  One of them was readahead.  When readahead is enabled, the 
> DataNode starts reading the next bytes it thinks it will need in the block 
> file, before the client requests them.  This helps hide the latency of 
> rotational media and send larger reads down to the device.  Another 
> optimization was "drop-behind."  Using this optimization, we could remove 
> files from the Linux page cache after they were no longer needed.
> Using {{dfs.datanode.drop.cache.behind.writes}} and 
> {{dfs.datanode.drop.cache.behind.reads}} can improve performance  
> substantially on many MapReduce jobs.  In our internal benchmarks, we have 
> seen speedups of 40% on certain workloads.  The reason is because if we know 
> the block data will not be read again any time soon, keeping it out of memory 
> allows more memory to be used by the other processes on the system.  See 
> HADOOP-7714 for more benchmarks.
> We would like to turn on these configurations on a per-file or per-client 
> basis, rather than on the DataNode as a whole.  This will allow more users to 
> actually make use of them.  It would also be good to add unit tests for the 
> drop-cache code path, to ensure that it is functioning as we expect.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis

Reply via email to