No, you either have small enough data that you can have it all go to a single
reducer, or you can set up a (sampling) partitioner so that the partitions are
sorted and you can get globally sorted output from multiple reduces. Take a
look at the TeraSort example for this.
Arun
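As a rough illustration of the sampling-partitioner idea above, here is a minimal plain-Java sketch. It is not Hadoop or TeraSort code: the class name RangePartitionSketch and its methods are hypothetical, and integer keys are assumed for simplicity. Split points are taken from a sorted sample, each key is routed to the partition whose range contains it, and each partition is sorted on its own (standing in for one reduce). Because the partition ranges are ordered, concatenating the outputs in partition order gives a globally sorted result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative stand-in for a sampling (range) partitioner; not Hadoop code.
class RangePartitionSketch {

    // Pick (numPartitions - 1) split points from a sorted copy of the sample.
    // Assumes the sample holds at least numPartitions keys.
    static int[] splitPoints(int[] sample, int numPartitions) {
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sorted[i * sorted.length / numPartitions];
        }
        return splits;
    }

    // Partition index = number of split points that are <= key,
    // so larger keys never land in a lower-numbered partition.
    static int partitionFor(int key, int[] splits) {
        int p = 0;
        while (p < splits.length && key >= splits[p]) {
            p++;
        }
        return p;
    }

    // Route every key, sort each partition independently (one "reduce" each),
    // then concatenate the partitions in index order.
    static List<Integer> sortViaPartitions(int[] keys, int[] sample, int numPartitions) {
        int[] splits = splitPoints(sample, numPartitions);
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (int k : keys) {
            parts.get(partitionFor(k, splits)).add(k);
        }
        List<Integer> out = new ArrayList<>();
        for (List<Integer> part : parts) {
            Collections.sort(part);
            out.addAll(part);
        }
        return out;
    }
}
```

The quality of the sample only affects how evenly the keys are spread across partitions; the ordering guarantee holds for any sorted set of split points.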
On Jul 26, 2011, at 3
**
See if this (very old) reply from Mikhail helps.
http://search-hadoop.com/m/QFVD1kEmQT
Here is the patch he is referring to.
http://m1.archiveorange.com/m/att/RNVYm/ArchiveOrange_8dEcdJI4bXFkKHBnsll8YzTc8u8a.patch
**replying in a hurry
On Tue, Jul 26, 2011 at 12:28 PM, John Armstrong
wrote:
> I
I'm back to trying to add libraries to the classpath instead of handing
around a fat JAR. This time I've served up my directory full of JARs on
NFS, which each node in my cluster has mounted at /mnt/hadoop-libs. Now my
question is how to add that (local) directory to the classpath of the
mapper a
Hi,
I'm using the avro format both for input and output, for a mapper and a
reducer. I would like to output multiple avro items with different schemata.
For sequence files I would use the MultipleOutputs class from the mapreduce
package.
I looked into the same class but from the old package "mapr
yes, it works for me! I sincerely appreciate your help! thanks!
At 2011-07-26 17:19:34,"Harsh J" wrote:
>Ah wait, guess I figured your problem -- you may not be reutilizing
>the Configuration instance inside your mapper.
>
>Override the configure() method in your mapper and get the value out
>
Ah wait, I think I figured out your problem -- you may not be reusing
the Configuration instance inside your mapper.
Override the configure() method in your mapper and get the value out
of the configuration instance passed to the mapper instead of
instantiating a new one (with defaults).
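A minimal sketch of that suggestion, assuming the old org.apache.hadoop.mapred API that the JobConf in the question implies, and Hadoop on the classpath. The mapper name ReducerCountMapper, its getNumReducers() accessor, and the key/value types are placeholders chosen for illustration, not taken from the original program.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper; the input/output types are placeholders.
class ReducerCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Filled in by configure() before any call to map().
    private int numReducers = 1;

    @Override
    public void configure(JobConf job) {
        // Read from the JobConf handed in by the framework, which carries
        // the driver's settings, rather than building a new JobConf here.
        numReducers = job.getNumReduceTasks();
    }

    // Accessor added only so the value can be inspected in this sketch.
    int getNumReducers() {
        return numReducers;
    }

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Example use: emit each input line with the reducer count seen by this task.
        output.collect(value, new IntWritable(numReducers));
    }
}
```

On the driver side, the count would be set with setNumReduceTasks(...) on the same JobConf that is submitted with the job.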
2011/7/26
This shouldn't be the case, as the configuration is supposed to
propagate (even the partitioner is something that'd consume this).
Could you post a cleaned-up version of your whole Driver code?
2011/7/26 rabbit_cheng :
> In my map function, I need to know the number of reducer, the code segment
>
In my map function, I need to know the number of reducers. The code segment in
my program looks like this:
JobConf job = new JobConf(driverClass.class);
int numReducer = job.getNumReduceTasks();
but the call to job.getNumReduceTasks() always returns the value
"1". I have tested many
Hello,
You can also use the Cascading API (http://www.cascading.org/), which greatly
simplifies job chaining.
At Thales we tried both the native MR and Cascading approaches, and we obtained
very good results (productivity and performance) using Cascading.
regards