Re: Type mismatch in key from map when replacing Mapper with MultithreadMapper

2011-09-27 Thread Arsen Zahray
Hey! Thank you for replying! Please confirm that I understand you correctly: 1. Use a class which extends Mapper: class MyMapper extends Mapper { public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { // implement all logic here
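The wiring that usually goes with this approach is to keep MyMapper as a plain Mapper subclass and point MultithreadedMapper at it from the driver, rather than overriding map() in a MultithreadedMapper subclass. A minimal sketch of such a driver follows; the driver class name, the Text map-output value type, and the five-thread setting are assumptions for illustration, not taken verbatim from this thread:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultithreadedDriver {

    // The worker mapper still extends the plain Mapper class; the IntWritable
    // key matches the exception quoted later in this thread, the Text value
    // type is an assumption.
    public static class MyMapper
            extends Mapper<LongWritable, Text, IntWritable, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // implement all logic here
            context.write(new IntWritable((int) key.get()), value);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "multithreaded example");
        job.setJarByClass(MultithreadedDriver.class);

        // MultithreadedMapper is the job's mapper; the real work is delegated
        // to MyMapper, which it runs in several threads per input split.
        job.setMapperClass(MultithreadedMapper.class);
        MultithreadedMapper.setMapperClass(job, MyMapper.class);
        MultithreadedMapper.setNumberOfThreads(job, 5);

        // These must match what MyMapper actually emits; a mismatch here is
        // what produces the "Type mismatch in key from map" IOException.
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that MultithreadedMapper runs the configured number of MyMapper.map() calls concurrently over the same input split, so the map logic must be thread-safe.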

Re: Type mismatch in key from map when replacing Mapper with MultithreadMapper

2011-09-27 Thread Kamesh
On Wednesday 28 September 2011 11:33 AM, Arsen Zahray wrote: MultithreadMapper extends MultithreadedMapper<LongWritable, Text, IntWritable, MyPage> { ConcurrentLinkedQueue scrapers = new ConcurrentLinkedQueue(); public static final int nThreads = 5; public MyMultithreadMapper(

Type mismatch in key from map when replacing Mapper with MultithreadMapper

2011-09-27 Thread Arsen Zahray
I'd like to implement a MultithreadMapper for my MapReduce job. For this I replaced Mapper with MultithreadMapper in working code. Here's the exception I'm getting: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.IntWritable, received org.apache.hadoop.io.L

How to send objects to map task?

2011-09-27 Thread Zhiwei Xiao
Hi, My application needs to send some objects to map tasks, which specify how to process the input records. I know I can transfer them as strings via the configuration file, but I would prefer to leverage Hadoop's Writable interface, since the objects require recursive serialization. I tried to create a
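One common way to do this, not shown in the truncated message above, is to serialize the Writable into the job Configuration with org.apache.hadoop.io.DefaultStringifier, which stores the object's serialized bytes as an encoded string under a configuration key. A rough sketch, where the ProcessingSpec class, its fields, and the my.app.processing.spec key are all made-up names:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DefaultStringifier;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

/** Hypothetical Writable describing how the mapper should process records. */
public class ProcessingSpec implements Writable {
    private Text name = new Text();
    private int threshold;

    public void write(DataOutput out) throws IOException {
        name.write(out);          // nested Writables serialize themselves,
        out.writeInt(threshold);  // which covers the recursive case
    }

    public void readFields(DataInput in) throws IOException {
        name.readFields(in);
        threshold = in.readInt();
    }

    /** Driver side: store the object in the job configuration. */
    public static void storeInConf(Configuration conf, ProcessingSpec spec)
            throws IOException {
        DefaultStringifier.store(conf, spec, "my.app.processing.spec");
    }

    /** Task side: typically called once from Mapper.setup(context). */
    public static ProcessingSpec loadFromConf(Configuration conf)
            throws IOException {
        return DefaultStringifier.load(conf, "my.app.processing.spec",
                ProcessingSpec.class);
    }
}

If the objects are large, writing them to a small HDFS file (or the distributed cache) and reading them back in setup() is the usual alternative to stuffing them into the configuration.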

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Mike Spreitzer
It looks to me like Oozie will not do what was asked. In http://yahoo.github.com/oozie/releases/3.0.0/WorkflowFunctionalSpec.html#a0_Definitions I see: 3.2.2 Map-Reduce Action ... The workflow job will wait until the Hadoop map/reduce job completes before continuing to the next action in the

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Arun C Murthy
On Sep 27, 2011, at 12:09 PM, Kevin Burton wrote: > Is it possible to connect the output of one map reduce job so that it is the > input to another map reduce job? > > Basically… the reduce() outputs a key that will be passed to another map() > function without having to store intermediate d

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Marcos Luis Ortiz Valmaseda
Have you considered Oozie for this? It's a workflow engine developed by the Yahoo! engineers. Yahoo/oozie at GitHub: https://github.com/yahoo/oozie Oozie at InfoQ: http://www.infoq.com/articles/introductionOozie Oozie's examples: http://www.infoq.com/articles/oozieexample http://yahoo.github.com/oo

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Arko Provo Mukherjee
Hi, I am not sure how you can avoid the filesystem; however, I did it as follows: // For Job 1 FileInputFormat.addInputPath(job1, new Path(args[0])); FileOutputFormat.setOutputPath(job1, new Path(args[1])); // For Job 2 FileInputFormat.addInputPath(job2, new Path(args[1])); FileOutputFormat.setO
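Spelled out, the pattern in the snippet above is a single driver that submits job 1, waits for it, and then submits job 2 with job 1's output directory as its input. A minimal sketch under that assumption (the class name and the args[2] output path are illustrative, and the mapper/reducer setup is elided):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoStageDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Job 1: reads args[0], writes its map/reduce output to args[1].
        Job job1 = new Job(conf, "stage 1");
        job1.setJarByClass(TwoStageDriver.class);
        // job1.setMapperClass(...); job1.setReducerClass(...); etc.
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        FileOutputFormat.setOutputPath(job1, new Path(args[1]));

        // Block until job 1 finishes; its output directory becomes job 2's input.
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        // Job 2: reads the intermediate directory args[1], writes to args[2].
        Job job2 = new Job(conf, "stage 2");
        job2.setJarByClass(TwoStageDriver.class);
        // job2.setMapperClass(...); job2.setReducerClass(...); etc.
        FileInputFormat.addInputPath(job2, new Path(args[1]));
        FileOutputFormat.setOutputPath(job2, new Path(args[2]));

        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}

The intermediate directory args[1] stays on HDFS after job 2 has read it; it can be removed afterwards with FileSystem.delete() if it is no longer needed.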

output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Kevin Burton
Is it possible to connect the output of one map reduce job so that it is the input to another map reduce job? Basically… the reduce() outputs a key that will be passed to another map() function without having to store intermediate data to the filesystem. Kevin -- Founder/CEO Spinn3r.com Loc

Re: System.out.println in Map / Reduce

2011-09-27 Thread Arko Provo Mukherjee
Great! I got it. Thank you so much! Warm regards Arko On Sep 26, 2011, at 10:18 PM, Subroto Sanyal wrote: > Hi Arko, > Request you to look into the “userlogs” folder of the corresponding task. > It will have three files: sysout, syslog and syserr. Your System.out.println() > will be
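As a small illustration of where each kind of output ends up (the LoggingMapper class is hypothetical; the file locations follow the standard per-task userlogs layout described in the quoted reply):

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LoggingMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Lands in the task attempt's userlogs directory, in the stdout file,
        // not on the console of the client that submitted the job.
        System.out.println("processing offset " + key.get());

        // Lands in the syslog file of the same userlogs directory.
        LOG.info("processing offset " + key.get());

        context.write(key, value);
    }
}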