Re: Perl Mapper with Java Reducer

2011-09-07 Thread Amareshwari Sri Ramadasu
You can look at Hadoop streaming: http://hadoop.apache.org/common/docs/r0.20.0/streaming.html Thanks Amareshwari On 9/7/11 1:38 PM, Bejoy KS bejoy.had...@gmail.com wrote: Hi, is it possible to have my mapper in Perl and my reducer in Java? In my existing legacy system some larger process is
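
Under streaming, a reducer is just a program that reads tab-separated key/value lines, grouped by key, on standard input and writes its results to standard output, so it can be written in Java while the mapper stays in Perl. Below is a minimal sketch, assuming the Perl mapper emits `key<TAB>count` lines; the class name `StreamingSumReducer` and the summing logic are illustrative assumptions, not something taken from the thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical streaming reducer: sums integer counts per key.
// Assumes input lines look like "key<TAB>count" and arrive grouped by key.
public class StreamingSumReducer {

    // Pure function so the logic can be checked without a cluster.
    public static List<String> reduce(List<String> sortedLines) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        long sum = 0;
        for (String line : sortedLines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip lines without a key/value separator
            }
            String key = line.substring(0, tab);
            long value = Long.parseLong(line.substring(tab + 1).trim());
            if (currentKey != null && !key.equals(currentKey)) {
                out.add(currentKey + "\t" + sum); // key changed: flush total
                sum = 0;
            }
            currentKey = key;
            sum += value;
        }
        if (currentKey != null) {
            out.add(currentKey + "\t" + sum); // flush the last key
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Buffers all input for simplicity; a production reducer would
        // process and emit line by line instead.
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        for (String result : reduce(lines)) {
            System.out.println(result);
        }
    }
}
```

A class like this would be supplied as the reducer command, with the Perl script as the mapper command. The exact invocation depends on the Hadoop version and is covered in the streaming documentation linked in the reply.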

Re: Need 0.20.2 new API documentation/examples, where are they?

2011-03-31 Thread Amareshwari Sri Ramadasu
John, the examples and libraries have been rewritten to use the new API in branch 0.21, so you can have a look at them there. The new API in branch 0.20 is not stable yet, and the old API is un-deprecated in branch 0.21, so you can still use the old API. Thanks Amareshwari On 3/30/11 11:38 PM, John Therrell jtherr...@gmail.com

Re: JobClient using deprecated JobConf

2010-09-22 Thread Amareshwari Sri Ramadasu
In 0.21, JobClient methods are available in org.apache.hadoop.mapreduce.Job and org.apache.hadoop.mapreduce.Cluster classes. On 9/22/10 3:07 PM, Martin Becker _martinbec...@web.de wrote: Hello, I am using the Hadoop MapReduce version 0.20.2 and soon 0.21. I wanted to use the JobClient class

Re: Changing default separator for streaming application

2010-06-16 Thread Amareshwari Sri Ramadasu
Final output is written by OutputFormat. By default, TextOutputFormat will write \t as the key-value separator. You can specify a different key-value separator for TextOutputFormat by specifying the value for configuration property mapred.textoutputformat.separator. Try setting ' ' for the
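
To make the effect of the separator concrete, here is a small plain-Java sketch that mimics how a text record is laid out as key, separator, value. It does not use any Hadoop classes: `SeparatorDemo` and `formatRecord` are hypothetical names, and `java.util.Properties` stands in for the job configuration. Only the property name `mapred.textoutputformat.separator` and the tab default come from the message.

```java
import java.util.Properties;

// Plain-Java imitation of a key/value text record writer.
// Not Hadoop code: Properties is a stand-in for the job configuration.
public class SeparatorDemo {

    static final String SEPARATOR_PROPERTY = "mapred.textoutputformat.separator";

    // Joins key and value with the configured separator, defaulting to a tab.
    public static String formatRecord(Properties conf, String key, String value) {
        String separator = conf.getProperty(SEPARATOR_PROPERTY, "\t");
        return key + separator + value;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(formatRecord(conf, "word", "3")); // tab-separated by default

        conf.setProperty(SEPARATOR_PROPERTY, " ");
        System.out.println(formatRecord(conf, "word", "3")); // now space-separated
    }
}
```

On a real job the property is set in the job configuration; for streaming this is typically done with a `-D property=value` generic option, but check the streaming documentation for your version.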

Re: multiple outputs

2010-06-08 Thread Amareshwari Sri Ramadasu
also be used inside a mapper? So basically I pipe data into different reducers from the mapper. Of course I could do two separate jobs, but that would be very inefficient as I would have to read through all the data twice. cheers -- Torsten On Tue, Jun 8, 2010 at 06:22, Amareshwari Sri Ramadasu amar

Re: multiple outputs

2010-06-07 Thread Amareshwari Sri Ramadasu
MultipleOutputs has been ported to the new API through http://issues.apache.org/jira/browse/MAPREDUCE-370 See the discussion on the JIRA issue, and the javadoc and test case for an example of how to use it. Thanks Amareshwari On 6/7/10 8:08 PM, Torsten Curdt tcu...@apache.org wrote: I need to emit to different output

Re: Reducers are stuck fetching map data.

2010-01-20 Thread Amareshwari Sri Ramadasu
ReadTimeOuts are found to be costly during shuffle if the map runtime is high. Please see HADOOP-3327 (http://issues.apache.org/jira/browse/HADOOP-3327) for shuffle improvements done for ReadTimeOut specifically. Thanks Amareshwari On 1/20/10 6:07 PM, Suhail Rehman suhailreh...@gmail.com wrote: