[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463154#comment-13463154
 ] 

Konstantin Shvachko commented on MAPREDUCE-4651:
------------------------------------------------

Ravi, 
- Even on a single node you can specify a larger -nrFiles value if the node has 
multiple drives. I usually set the number of map slots equal to the number of 
drives on a node.
- I don't know how big the file you created with -write was before you ran the 
reads. If it was 10 MB, then the reads could not have covered more than that. 
Check the DFSIO summary; it prints how much data was read.
- You probably ran the reads right after creating the file, so the data was 
still in the buffer cache. I usually clear the cache before each test run. (On 
Linux: 'echo 1 > /proc/sys/vm/drop_caches')
- Also, -fileSize is replaced by -size in my patch. It specifies how much data 
you want to read/write/append, rather than the size of a file. Originally, with 
only read and write, the two were the same.
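
A rough sketch of the preparation steps above, for illustration only. The 
helper names (run_cmd, drop_page_cache, count_data_dirs) and the DRY_RUN 
switch are made up here and are not part of DFSIO or the patch. Counting the 
comma-separated entries of dfs.data.dir as a stand-in for the number of drives 
is also an assumption; it only holds if each entry sits on its own drive. The 
'sync' before dropping the cache is an extra step that flushes dirty pages so 
more of the cache can be freed.

```shell
#!/bin/sh
# Sketch only: prepare a node for a cold-cache DFSIO read run.
# All function names and DRY_RUN are hypothetical, chosen for this example.

# DRY_RUN=1 (the default here) only prints commands; use DRY_RUN=0 as root to run them.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

drop_page_cache() {
    # Flush dirty pages first, then ask the kernel to drop the page cache.
    # Writing to drop_caches requires root.
    run_cmd sync
    run_cmd sh -c 'echo 1 > /proc/sys/vm/drop_caches'
}

count_data_dirs() {
    # Count the non-empty comma-separated entries in a dfs.data.dir-style value.
    # Assumption: one entry per physical drive.
    echo "$1" | tr ',' '\n' | grep -c .
}

# Example values; replace with the node's real dfs.data.dir setting.
drives=$(count_data_dirs "/data/1/dfs,/data/2/dfs,/data/3/dfs")
echo "drives: $drives"

drop_page_cache
# ... run the DFSIO read test here, then check its summary for the bytes read ...
```

With DRY_RUN left at 1 the script only prints the drive count and the commands 
it would run, so it is safe to try on any machine before using it as root.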
                
> Benchmarking random reads with DFSIO
> ------------------------------------
>
>                 Key: MAPREDUCE-4651
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4651
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: benchmarks, test
>    Affects Versions: 1.0.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: randomDFSIO.patch, randomDFSIO.patch, randomDFSIO.patch
>
>
> TestDFSIO measures throughput of HDFS write, read, and append operations. It 
> will be useful to have an option to use it for benchmarking random reads.

