Re: Merge Reducers Outputs

2011-07-26 Thread Arun C Murthy
No. Either your data is small enough that it can all go to a single reducer, or you can set up a (sampling) partitioner so that the partitions are sorted and you get globally sorted output from multiple reduces. Take a look at the TeraSort example for this. Arun On Jul 26, 2011, at 3
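
A minimal sketch of the sampling-partitioner idea from the reply above, written in plain Java rather than against the Hadoop API (it is not the TeraSort code). It assumes integer keys, and the names SampledRangePartitionSketch, splitPoints, partitionFor and globallySorted are made up for illustration. Split points are picked from a sample of the keys, each key goes to the partition whose range contains it, each partition is sorted independently (as a reducer would do), and concatenating the partitions in order gives a globally sorted result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SampledRangePartitionSketch {

    // Pick numPartitions - 1 split points at evenly spaced positions
    // of the sorted sample. The sample must be non-empty.
    static int[] splitPoints(int[] sample, int numPartitions) {
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sorted[(int) ((long) i * sorted.length / numPartitions)];
        }
        return splits;
    }

    // Partition index = number of split points that are <= key, so every
    // key in partition p is smaller than every key in partition p + 1.
    static int partitionFor(int key, int[] splits) {
        int p = 0;
        while (p < splits.length && key >= splits[p]) {
            p++;
        }
        return p;
    }

    // Simulate the job: route keys to partitions, sort each partition on
    // its own, then concatenate the partitions in index order.
    static List<Integer> globallySorted(int[] keys, int[] sample, int numPartitions) {
        int[] splits = splitPoints(sample, numPartitions);
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (int k : keys) {
            parts.get(partitionFor(k, splits)).add(k);
        }
        List<Integer> out = new ArrayList<>();
        for (List<Integer> part : parts) {
            Collections.sort(part); // each "reducer" sorts only its own keys
            out.addAll(part);
        }
        return out;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int[] keys = new int[50];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = rnd.nextInt(1000);
        }
        // Use every 5th key as the sample (an arbitrary choice for the demo).
        int[] sample = new int[10];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = keys[i * 5];
        }
        System.out.println(globallySorted(keys, sample, 4));
    }
}
```

A poor sample only makes the partitions uneven in size; the concatenated output is still sorted, because ordering depends on the split points being non-decreasing, not on how well they balance the load.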

Re: Adding files to map/reduce classpath

2011-07-26 Thread Shrijeet Paliwal
** See if this (very old) reply from Mikhail helps. http://search-hadoop.com/m/QFVD1kEmQT Here is the patch he is referring to. http://m1.archiveorange.com/m/att/RNVYm/ArchiveOrange_8dEcdJI4bXFkKHBnsll8YzTc8u8a.patch **replying in hurry On Tue, Jul 26, 2011 at 12:28 PM, John Armstrong wrote: > I

Adding files to map/reduce classpath

2011-07-26 Thread John Armstrong
I'm back to trying to add libraries to the classpath instead of handing around a fat JAR. This time I've served up my directory full of JARs on NFS, which each node in my cluster has mounted at /mnt/hadoop-libs. Now my question is how to add that (local) directory to the classpath of the mapper a

Multiple avro outputs from a reducer

2011-07-26 Thread Vyacheslav Zholudev
Hi, I'm using the avro format both for input and output, for a mapper and a reducer. I would like to output multiple avro items with different schemata. For sequence files I would use the MultipleOutputs class from the mapreduce package. I looked into the same class but from the old package "mapr

Re:Re: how can I get the number of reducer in Map

2011-07-26 Thread rabbit_cheng
Yes, it works for me! I sincerely appreciate your help! Thanks! At 2011-07-26 17:19:34,"Harsh J" wrote: >Ah wait, guess I figured out your problem -- you may not be reusing >the Configuration instance inside your mapper. > >Override the configure() method in your mapper and get the value out >

Re: how can I get the number of reducer in Map

2011-07-26 Thread Harsh J
Ah wait, guess I figured out your problem -- you may not be reusing the Configuration instance inside your mapper. Override the configure() method in your mapper and get the value out of the configuration instance passed to the mapper instead of instantiating a new one (with defaults). 2011/7/26
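
A rough illustration of the difference described above, kept free of Hadoop dependencies so it runs on its own. StubJobConf is a hypothetical stand-in for JobConf that models only the reducer count (defaulting to 1, the value reported in the original question), and ReducerCountMapperSketch is likewise an invented name. In a real old-API mapper the same idea would go in configure(JobConf), reading getNumReduceTasks() from the instance the framework passes in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's JobConf: it models only the
// reducer-count setting and its default of 1.
class StubJobConf {
    private final Map<String, Integer> values = new HashMap<>();

    void setNumReduceTasks(int n) {
        values.put("mapred.reduce.tasks", n);
    }

    int getNumReduceTasks() {
        return values.getOrDefault("mapred.reduce.tasks", 1);
    }
}

class ReducerCountMapperSketch {
    private int numReducers;

    // Mirrors configure(JobConf): keep the value from the configuration
    // that is handed to the mapper.
    void configure(StubJobConf job) {
        numReducers = job.getNumReduceTasks();
    }

    int numReducers() {
        return numReducers;
    }

    // The mistaken variant: a freshly constructed configuration knows
    // nothing about the driver's settings, so it reports the default.
    static int fromFreshConf() {
        return new StubJobConf().getNumReduceTasks();
    }

    public static void main(String[] args) {
        StubJobConf driverConf = new StubJobConf();
        driverConf.setNumReduceTasks(8); // what the driver would set

        ReducerCountMapperSketch mapper = new ReducerCountMapperSketch();
        mapper.configure(driverConf);

        System.out.println("from passed-in conf: " + mapper.numReducers());
        System.out.println("from fresh conf:     " + fromFreshConf());
    }
}
```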

Re: how can I get the number of reducer in Map

2011-07-26 Thread Harsh J
This shouldn't be the case, as the configuration is supposed to propagate (even the partitioner is something that'd consume this). Could you post a cleaned-up version of your whole Driver code? 2011/7/26 rabbit_cheng : > In my map function, I need to know the number of reducers, the code segment >

how can I get the number of reducer in Map

2011-07-26 Thread rabbit_cheng
In my map function, I need to know the number of reducers. The relevant code segment in my program looks like this: JobConf job = new JobConf(driverClass.class); int numReducer = job.getNumReduceTasks(); but the call to job.getNumReduceTasks() always returns the value "1". I have tested many

RE: MR 0.20.2 job chaining

2011-07-26 Thread MONTMORY Alain
Hello, You can also use the Cascading API (http://www.cascading.org/), which greatly simplifies job chaining. At Thales we tried both the native MR and the Cascading approaches, and we obtained very good results (productivity and performance) using Cascading... Regards [@@THALES GROUP RESTRICTED@@] -Messa