Re: set number of map tasks in GridMix2

2011-03-18 Thread Denny Ye
Hi Pedro, You are right: the number of map tasks is defined by the number of input splits. By default, one DFS block is one split. In your first example, each file is a single block in DFS, so it has 10 map tasks for ten blocks. In the second example, the default block size is 64 MB in DFS, each
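
The counting rule described in this reply can be sketched as follows. This is a rough illustration, not Hadoop code: the function name `estimate_map_tasks` is hypothetical, and it assumes one split per DFS block, that splits never span files, and a 64 MB block size. It ignores any refinements a real input format may apply.

```python
def estimate_map_tasks(file_sizes, block_size):
    """Approximate map-task count, assuming one input split per DFS block."""
    # Each non-empty file contributes ceil(size / block_size) splits,
    # because a split is assumed never to span two files.
    return sum(-(-size // block_size) for size in file_sizes if size > 0)


KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# Ten 1 KB files: each file fits in a single block.
small = estimate_map_tasks([1 * KB] * 10, 64 * MB)

# Ten 1 GB files: each file spans 1 GB / 64 MB = 16 blocks.
large = estimate_map_tasks([1 * GB] * 10, 64 * MB)

print(small, large)
```

Under these assumptions the ten small files give 10 map tasks and the ten 1 GB files give 160.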

Re: mapred.min.split.size

2011-03-18 Thread Pedro Costa
As I understand it, mapred.min.split.size defines the minimum size of a split. In the case below: (1) HDFS block size = 32MB, mapred.min.split.size=64MB (mapred.min.split.size can only be set larger than the HDFS block size). When I run MapReduce, it means that a map will process one input split of 64MB o
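
The case in this message can be worked through with a small sketch. It assumes the split size is computed as max(minimum split size, min(maximum split size, block size)), which is the rule as commonly described for Hadoop's file-based input formats. The function name `split_size` and the default maximum are illustrative assumptions, not Hadoop's API.

```python
def split_size(block_size, min_split_size, max_split_size=2**63 - 1):
    """Assumed rule: clamp the block size between the min and max split sizes."""
    return max(min_split_size, min(max_split_size, block_size))


MB = 1024 * 1024

# Case (1) from the message: 32 MB blocks, 64 MB minimum split size.
case_one = split_size(32 * MB, 64 * MB)

# A minimum below the block size leaves the split equal to one block.
no_effect = split_size(32 * MB, 1 * MB)

print(case_one // MB, no_effect // MB)
```

Under this assumed rule, case (1) yields 64 MB splits, so each map would read two 32 MB blocks, while a minimum smaller than the block size changes nothing.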

Re: mapred.min.split.size

2011-03-18 Thread Marcos Ortiz
On 3/18/2011 3:54 PM, Pedro Costa wrote: Hi What's the purpose of the parameter "mapred.min.split.size"? Thanks, There are many parameters that control the number of map tasks for a job, and mapred.min.split.size controls the minimum size of a split. Other parameters are: - mapreduce.

Re: mapred.min.split.size

2011-03-18 Thread Ted Yu
Cycling bits: http://search-hadoop.com/m/O7sT4278lbG/but+it+seems+a+trade+off+with+the+number+of+files+that+have+to+be+shuffled+for+the&subj=RE+HDFS+block+size+v+s+mapred+min+split+size On Fri, Mar 18, 2011 at 12:54 PM, Pedro Costa wrote: > Hi > > What's the purpose of the parameter "mapred.min.

mapred.min.split.size

2011-03-18 Thread Pedro Costa
Hi What's the purpose of the parameter "mapred.min.split.size"? Thanks, -- Pedro

What do the examples of Gridmix2 do?

2011-03-18 Thread Pedro Costa
Hi, I don't know what the Gridmix examples do. Where can I find an explanation of them? Thanks -- Pedro

Re: set number of map tasks in GridMix2

2011-03-18 Thread Pedro Costa
I have another question. Is the number of map tasks defined by the number of input splits? For example, if I run an example that reads 10 txt files of 1 KB each, does it mean that 10 map tasks will run? And if I have 10 txt files of 1 GB each, how many map tasks will run? Thanks, On Fri, Mar 18,

set number of map tasks in GridMix2

2011-03-18 Thread Pedro Costa
Hi, I would like to define the number of map tasks to use in GridMix2. For example, I would like to run the GridMixMonsterQuery in GridMix2 with 5 maps, another run with 10 maps, and another with 20 maps. How can I do that? Thanks, -- Pedro

WordCount - sleep modification

2011-03-18 Thread Robert Grandl
Hi all, I want to modify the WordCount application in order to delay the execution of maps. I have tried putting a sleep in the map function, but even with a 1 ns sleep and 128 MB blocks it took almost 30 minutes to complete. However, if regular execution takes almost 1:30 minutes, I want to put a dela