Hi Pedro, I can answer a couple of these.
On Tue, Jan 5, 2010 at 5:46 PM, psdc1978 <psdc1...@gmail.com> wrote:

> 1 - What are the difference between the classes:
> org.apache.hadoop.mapred.Reducer.java and
> org.apache.hadoop.mapreduce.Reducer.java? In which case the 2 reducers
> are used?
>
> 2 - The same question for the Mapper.java?

These classes were refactored in 0.20. The older ones (in the mapred
package) were kept to maintain backwards compatibility.

> 4 - What's the purpose of the property in hdfs-site.xml called
> "dfs.replication"?
>
> I've read what is defined in the Hadoop site,
> "dfs.replication - Default block replication. The actual number of
> replications can be specified when the file is created. The default is
> used if replication is not specified in create time. ", but I still
> haven't understand it. Is it in how many machines a file will be
> replicated?

Pretty much. Note that an HDFS file is stored as a collection of large
blocks (64 MB by default), and it is these blocks that are replicated.

Ed
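To illustrate the difference in shape: in the new org.apache.hadoop.mapreduce package, Reducer is a class you extend and reduce() receives an Iterable of values plus a single Context object, whereas in the old org.apache.hadoop.mapred package Reducer is an interface and reduce() takes an Iterator plus separate OutputCollector and Reporter arguments. The sketch below mimics the new-API shape with tiny stand-in types (the Context and Reducer classes here are simplified mocks, not the real Hadoop classes) so it can run without Hadoop on the classpath:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
  // Simplified stand-in for org.apache.hadoop.mapreduce.Reducer.Context.
  static class Context {
    final Map<String, Integer> out = new LinkedHashMap<>();
    void write(String key, int value) { out.put(key, value); }
  }

  // Simplified stand-in for the new-API base class: an abstract class you
  // extend, not an interface you implement as in the old mapred package.
  static abstract class Reducer<K, V> {
    protected abstract void reduce(K key, Iterable<V> values, Context context);
  }

  // Typical sum reducer written against the new-API shape: values arrive
  // as an Iterable, and results are emitted through the Context.
  static class SumReducer extends Reducer<String, Integer> {
    @Override
    protected void reduce(String key, Iterable<Integer> values, Context context) {
      int sum = 0;
      for (int v : values) sum += v;
      context.write(key, sum);
    }
  }

  public static void main(String[] args) {
    Context ctx = new Context();
    new SumReducer().reduce("word", Arrays.asList(1, 2, 3), ctx);
    System.out.println(ctx.out.get("word")); // prints 6
  }
}
```

A real 0.20 job would extend org.apache.hadoop.mapreduce.Reducer<Text, IntWritable, Text, IntWritable> with exactly this structure.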
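For reference, the property goes in hdfs-site.xml; the value 3 shown below is only an example (it is also the usual out-of-the-box default):

```xml
<!-- hdfs-site.xml: default number of replicas kept of each HDFS block.
     Individual files can override this at create time. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

The replication factor of an existing file can also be changed afterwards from the shell, e.g. `hadoop fs -setrep 2 /path/to/file`.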