Fwd: Read key and values from HDFS

2012-04-01 Thread Pedro Costa
Has anyone answered this question? I can't find a reply.

-- Forwarded message --
From: Pedro Costa psdc1...@gmail.com
Date: 30 March 2012 18:19
Subject: Read key and values from HDFS
To: mapreduce-user mapreduce-user@hadoop.apache.org



The ReduceTask can save its output using several output formats:
InternalFileOutputFormat, SequenceFileOutputFormat, TeraOutputFormat, etc.

How can I read the keys and the values from the output file? Can anyone
give me an example? Is there a way to write a single method that can read
all the different output formats?


Thanks,

-- 
Best regards,






Re: Read key and values from HDFS

2012-03-30 Thread GUOJUN Zhu
I believe an OutputFormat is not required to save both the key and the
value, so there is no guarantee that you can extract (key, value) pairs
from a file generated by an arbitrary OutputFormat.

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_...@freddiemac.com
Financial Engineering
Freddie Mac






Re: Read key and values from HDFS

2012-03-30 Thread kasi subrahmanyam
Hi Pedro, I am not sure there is a single method for reading the data in
output files written by different output formats.
For sequence files, though, you can use the SequenceFile.Reader class in
the API.





Re: Read key and values from HDFS

2012-03-30 Thread Harsh J
Hi,

Just use "hadoop fs -text <file>". It reads most of these files
without breaking a sweat :)

You can look at its implementation inside FsShell.java if you want to
implement/reuse things in Java.
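
For anyone who wants a feel for the approach before opening FsShell.java: as far as I recall, the -text code peeks at the first bytes of the file and picks a decoder from them (0x1f8b for gzip, the bytes 'S' 'E' 'Q' for a SequenceFile), and falls back to raw bytes otherwise. Below is a minimal JDK-only sketch of that dispatch. MagicSniffer, Kind, and sniff are hypothetical names made up for illustration, not Hadoop classes, and the exact checks may differ between Hadoop versions, so treat the real source as the reference.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of Hadoop): guesses a container format from leading bytes. */
class MagicSniffer {
    enum Kind { GZIP, SEQUENCE_FILE, PLAIN }

    /**
     * Reads at most three bytes from the stream and classifies it.
     * The stream is consumed, so a caller has to reopen it (or seek back
     * to offset 0) before handing it to the matching decoder.
     */
    static Kind sniff(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        try {
            int magic = data.readUnsignedShort();   // first two bytes, big-endian
            if (magic == 0x1f8b) {                  // gzip magic number (RFC 1952)
                return Kind.GZIP;
            }
            if (magic == 0x5345 && data.readByte() == 'Q') {  // 'S' 'E' then 'Q'
                return Kind.SEQUENCE_FILE;
            }
        } catch (EOFException e) {
            // File is shorter than the magic number: treat it as plain bytes.
        }
        return Kind.PLAIN;
    }

    public static void main(String[] args) throws IOException {
        // A SequenceFile starts with "SEQ" followed by a one-byte version number.
        byte[] seqHeader = {'S', 'E', 'Q', 6};
        System.out.println(sniff(new ByteArrayInputStream(seqHeader)));  // prints SEQUENCE_FILE
    }
}
```

Once the kind is known, a SequenceFile would go to SequenceFile.Reader and a gzip file through java.util.zip.GZIPInputStream. As noted earlier in the thread, an OutputFormat that does not store keys still cannot give them back, whatever reader is used.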






-- 
Harsh J