* open a SequenceFile.Reader on the sequence file
* in a loop, call next(key,val) on the reader to read the next key/val pair
in the file (see: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next(org.apache.hadoop.io.Writable,%20org.apache
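
A minimal sketch of the two steps above, assuming the Hadoop jars and the file's key and value classes (for example a custom TextArrayWritable with a no-argument constructor) are on the classpath. The class name SequenceFileDump and the helpers dump and asText are made up for illustration. The SequenceFile.Reader(FileSystem, Path, Configuration) constructor used here is deprecated in later Hadoop releases, so this may need adjusting for your version.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical helper class: dumps a SequenceFile as tab-separated text.
public class SequenceFileDump {

    /** Renders a Writable as text; an ArrayWritable is printed element by element. */
    static String asText(Writable w) {
        if (w instanceof ArrayWritable) {
            return Arrays.toString(((ArrayWritable) w).toStrings());
        }
        return w.toString();
    }

    /** Prints each record as "key<TAB>value" and returns the number of records read. */
    static long dump(Configuration conf, Path file, PrintStream out) throws IOException {
        FileSystem fs = file.getFileSystem(conf);
        // Step 1: open a SequenceFile.Reader on the sequence file.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
        try {
            // The file header names the key and value classes; create one
            // reusable instance of each for next(key, val) to fill in.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable val = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            long count = 0;
            // Step 2: next(key, val) returns false at end of file.
            while (reader.next(key, val)) {
                out.println(asText(key) + "\t" + asText(val));
                count++;
            }
            return count;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path of the sequence file, e.g. one part file of a job's output
        dump(new Configuration(), new Path(args[0]), System.out);
    }
}
```

A job's output directory usually holds one sequence file per reducer, so the sketch would be run once per part file.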
You might find this easier to understand if you use one of the
low-level job-scripting languages like Oozie or Hamake. They put the whole
assemblage into one file.
On Wed, Sep 7, 2011 at 3:17 PM, David Rosenstrauch wrote:
> * open a SequenceFile.Reader on the sequence file
> * in a
* open a SequenceFile.Reader on the sequence file
* in a loop, call next(key,val) on the reader to read the next key/val
pair in the file (see:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next(org.apache.hadoop.io.Writable,%20org.apache.hadoop.i
Can you be more specific about how to do this? In general, is there a way
to convert the binary files I have to text files?
On Tue, Sep 6, 2011 at 11:26 PM, David Rosenstrauch wrote:
> On 09/06/2011 01:57 AM, Niels Basjes wrote:
>
>> Hi,
>>
>> In the past I've had the same situation where I ne
On 09/06/2011 01:57 AM, Niels Basjes wrote:
Hi,
In the past I've had the same situation where I needed the data for
debugging. Back then I chose to create a second job with simply
SequenceFileInputFormat, IdentityMapper, IdentityReducer and finally
TextOutputFormat.
In my situation that worked
I need it because the intermediate data is also part of the solution to the
problem my algorithm solves.
I somehow need to log this information.
The key is Text and the value is an ArrayWritable (TextArrayWritable).
On Tue, Sep 6, 2011 at 8:57 AM, Niels Basjes wrote:
> Hi,
>
> In the past I've had
Hi,
In the past I've had the same situation where I needed the data for
debugging. Back then I chose to create a second job with simply
SequenceFileInputFormat, IdentityMapper, IdentityReducer and finally
TextOutputFormat.
In my situation that worked great for my purpose.
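
A rough sketch of such a conversion job, assuming the old org.apache.hadoop.mapred API (where IdentityMapper and IdentityReducer live). The class name SeqFileToText and the configure helper are made up for illustration, and Text/Text in main is only a placeholder for the real key and value classes. TextOutputFormat writes each key and value via toString(), so a custom value class such as TextArrayWritable would need a meaningful toString() for the output to be readable.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Hypothetical driver: copies a SequenceFile to plain text without changing the records.
public class SeqFileToText {

    /** Builds the job configuration; keyClass/valueClass must match the input file. */
    static JobConf configure(Path in, Path out, Class<?> keyClass, Class<?> valueClass) {
        JobConf conf = new JobConf(SeqFileToText.class);
        conf.setJobName("seqfile-to-text");
        // Read binary key/value pairs, pass them through untouched...
        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        // ...and write them as "key<TAB>value" lines using toString().
        conf.setOutputFormat(TextOutputFormat.class);
        conf.setOutputKeyClass(keyClass);
        conf.setOutputValueClass(valueClass);
        FileInputFormat.setInputPaths(conf, in);
        FileOutputFormat.setOutputPath(conf, out);
        return conf;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: SequenceFile input dir, args[1]: text output dir (must not exist yet).
        // Text/Text is a placeholder; substitute the actual key and value classes.
        JobClient.runJob(configure(new Path(args[0]), new Path(args[1]), Text.class, Text.class));
    }
}
```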
--
Met vriendelijke gr
Why do you need to see the intermediate data as text?
What are the types of your key and values?
-Joey
On Sep 5, 2011 6:54 PM, "ilyal levin" wrote:
> OK, so now I'm using SequenceFileInputFormat and SequenceFileOutputFormat
> and it works fine, but the output of the reducer is
> now a binary fi
OK, so now I'm using SequenceFileInputFormat and SequenceFileOutputFormat
and it works fine, but the output of the reducer is
now a binary file (not text), so I can't understand the data. How can I solve
this? I need the data (in text form) of the intermediate stages in the
chain.
Thanks
On Tue, S
Thanks for the help.
On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen wrote:
> The binary file will allow you to pass the output from the first reducer to
> the second mapper. For example, if you output Text, IntWritable from the
> first one in SequenceFileOutputFormat, then you are able to retriev
The binary file lets you pass the output of the first reducer to
the second mapper. For example, if the first job writes Text, IntWritable pairs
with SequenceFileOutputFormat, then the second mapper receives Text,
IntWritable pairs as its input. The idea of chaining
Thanks for the reply.
I tried it, but it creates a binary file which I cannot understand (I need
the result of the first job).
The other thing is: how can I use this file in the next chained mapper? I.e.,
how can I retrieve the keys and the values in the map function?
Ilyal
On Mon, Sep 5, 2011 at 7
Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
-Joey
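
A sketch of what that could look like in a chaining loop, assuming the old org.apache.hadoop.mapred API. ChainDriver, stageDir, configureStage and the "stage-N" directory naming are made up for illustration; IdentityMapper/IdentityReducer and the Text/IntWritable output types are placeholders for each stage's real classes. Because every stage reads with SequenceFileInputFormat, the very first input is assumed to be a SequenceFile as well.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Hypothetical driver: runs several jobs in sequence, each reading the previous one's output.
public class ChainDriver {

    /** Output directory of stage i (illustrative naming scheme). */
    static Path stageDir(Path workDir, int i) {
        return new Path(workDir, "stage-" + i);
    }

    /** Configures one stage that reads and writes SequenceFiles. */
    static JobConf configureStage(Path in, Path out) {
        JobConf conf = new JobConf(ChainDriver.class);
        // Binary in, binary out: the next mapper receives the same key/value
        // types that this stage's reducer emitted.
        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        // Placeholders: substitute the stage's real mapper, reducer and types.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(conf, in);
        FileOutputFormat.setOutputPath(conf, out);
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Path input = new Path(args[0]);      // initial SequenceFile input
        Path workDir = new Path(args[1]);    // parent of the per-stage output dirs
        int stages = Integer.parseInt(args[2]);
        for (int i = 0; i < stages; i++) {
            Path output = stageDir(workDir, i);
            // runJob blocks until the job finishes and throws IOException if it fails.
            JobClient.runJob(configureStage(input, output));
            input = output;                  // this stage's output feeds the next stage
        }
    }
}
```

Keeping every stage's output in its own directory leaves the intermediate results on disk for later inspection.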
On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin wrote:
> Hi
> I'm trying to write a chained MapReduce program. I'm doing so with a simple
> loop where in each iteration I
> create a job, execute it, and every time the current
Hi
I'm trying to write a chained MapReduce program. I'm doing so with a simple
loop where in each iteration I
create a job and execute it, and every time the current job's output is the next
job's input.
How can I configure the outputFormat of the current job and the inputFormat
of the next job so t