Re: How to Create an effective chained MapReduce program.

2011-09-08 Thread ilyal levin
* open a SequenceFile.Reader on the sequence file
* in a loop, call next(key,val) on the reader to read the next key/val pair in the file (see: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next(org.apache.hadoop.io.Writable,%20org.apache

Re: How to Create an effective chained MapReduce program.

2011-09-07 Thread Lance Norskog
You might find this easier to understand if you use one of the low-level job-scripting languages like Oozie or Hamake. They put the whole assemblage into one file.
On Wed, Sep 7, 2011 at 3:17 PM, David Rosenstrauch wrote:
> * open a SequenceFile.Reader on the sequence file
> * in a

Re: How to Create an effective chained MapReduce program.

2011-09-07 Thread David Rosenstrauch
* open a SequenceFile.Reader on the sequence file
* in a loop, call next(key,val) on the reader to read the next key/val pair in the file (see: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next(org.apache.hadoop.io.Writable,%20org.apache.hadoop.i

Re: How to Create an effective chained MapReduce program.

2011-09-07 Thread ilyal levin
Can you be more specific about how to do this? In general, is there a way to convert the binary files I have to text files?
On Tue, Sep 6, 2011 at 11:26 PM, David Rosenstrauch wrote:
> On 09/06/2011 01:57 AM, Niels Basjes wrote:
>
>> Hi,
>>
>> In the past I've had the same situation where I ne

Re: How to Create an effective chained MapReduce program.

2011-09-06 Thread David Rosenstrauch
On 09/06/2011 01:57 AM, Niels Basjes wrote:
> Hi,
>
> In the past I've had the same situation where I needed the data for debugging. Back then I chose to create a second job with simply SequenceFileInputFormat, IdentityMapper, IdentityReducer and finally TextOutputFormat. In my situation that worked

Re: How to Create an effective chained MapReduce program.

2011-09-06 Thread ilyal levin
I need it because the intermediate data is also part of the solution to the problem my algorithm solves, so I need to log this information somehow. The key is Text and the value is ArrayWritable (TextArrayWritable).
On Tue, Sep 6, 2011 at 8:57 AM, Niels Basjes wrote:
> Hi,
>
> In the past I've had

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread Niels Basjes
Hi,
In the past I've had the same situation where I needed the data for debugging. Back then I chose to create a second job using just SequenceFileInputFormat, IdentityMapper, IdentityReducer and finally TextOutputFormat. In my situation that worked great for my purpose.
-- Met vriendelijke gr

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread Joey Echeverria
Why do you need to see the intermediate data as text? What are the types of your key and values?
-Joey
On Sep 5, 2011 6:54 PM, "ilyal levin" wrote:
> OK, so now I'm using SequenceFileInputFormat and SequenceFileOutputFormat
> and it works fine but the output of the reducer is
> now a binary fi

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread ilyal levin
OK, so now I'm using SequenceFileInputFormat and SequenceFileOutputFormat and it works fine, but the output of the reducer is now a binary file (not text), so I can't read the data. How can I solve this? I need the data (in text form) of the intermediate stages in the chain.
Thanks
On Tue, S

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread ilyal levin
Thanks for the help.
On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen wrote:
> The binary file will allow you to pass the output from the first reducer to
> the second mapper. For example, if you output Text, IntWritable from the
> first one in SequenceFileOutputFormat, then you are able to retriev

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread Roger Chen
The binary file will allow you to pass the output from the first reducer to the second mapper. For example, if you output Text, IntWritable from the first one in SequenceFileOutputFormat, then you can retrieve Text, IntWritable as the input of the second mapper. The idea of chaining

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread ilyal levin
Thanks for the reply. I tried it, but it creates a binary file which I cannot read (I need the result of the first job). The other thing is, how can I use this file in the next chained mapper? That is, how can I retrieve the keys and the values in the map function?
Ilyal
On Mon, Sep 5, 2011 at 7

Re: How to Create an effective chained MapReduce program.

2011-09-05 Thread Joey Echeverria
Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
-Joey
On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin wrote:
> Hi
> I'm trying to write a chained MapReduce program. I'm doing so with a simple
> loop where in each iteration I
> create a job, execute it, and every time the current

How to Create an effective chained MapReduce program.

2011-09-05 Thread ilyal levin
Hi
I'm trying to write a chained MapReduce program. I'm doing so with a simple loop where in each iteration I create a job and execute it, and each time the current job's output is the next job's input. How can I configure the outputFormat of the current job and the inputFormat of the next job so t
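
The loop described in this first message can be sketched as plain path wiring, with no Hadoop dependency: each stage reads the directory the previous stage wrote. This is only an illustration. The class ChainPlan, its Stage type, the plan method and the "stage-N" directory naming are all hypothetical names, not part of Hadoop or of the poster's program. In a real driver, each stage would presumably configure a job that reads its input with SequenceFileInputFormat and writes its output with SequenceFileOutputFormat, as suggested in the replies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): plans the input and output
// directories for a chain of jobs run one after another in a loop.
public class ChainPlan {

    /** One job in the chain: the directory it reads and the directory it writes. */
    public static final class Stage {
        public final int index;
        public final String input;
        public final String output;

        Stage(int index, String input, String output) {
            this.index = index;
            this.input = input;
            this.output = output;
        }
    }

    /** Builds the plan so that stage i+1 reads exactly what stage i wrote. */
    public static List<Stage> plan(String firstInput, String workDir, int stages) {
        List<Stage> result = new ArrayList<>();
        String input = firstInput;
        for (int i = 0; i < stages; i++) {
            // Assumed naming scheme: one output directory per stage.
            String output = workDir + "/stage-" + i;
            result.add(new Stage(i, input, output));
            // The current job's output becomes the next job's input.
            input = output;
        }
        return result;
    }

    public static void main(String[] args) {
        for (Stage s : plan("input", "work", 3)) {
            System.out.println(s.index + ": " + s.input + " -> " + s.output);
        }
    }
}
```

Giving every stage its own output directory keeps the intermediate results on disk after the chain finishes, which matters here because the poster needs the intermediate data as part of the solution.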