Fwd: Read key and values from HDFS
Has anyone answered this question? I can't find a reply.

-- Forwarded message --
From: Pedro Costa psdc1...@gmail.com
Date: 30 March 2012 18:19
Subject: Read key and values from HDFS
To: mapreduce-user mapreduce-user@hadoop.apache.org

The ReduceTask can save its output using several output formats: InternalFileOutputFormat, SequenceFileOutputFormat, TeraOutputFormat, etc. How can I read the keys and the values from the output file? Can anyone give me an example? Is there a way to write a single method that can read all the different output formats?

Thanks,
-- Best regards,
Re: Read key and values from HDFS
I believe an OutputFormat is not required to save both the key and the value, so there is no guarantee that you can extract (key, value) pairs from a file generated by an arbitrary OutputFormat.

Zhu, Guojun
Modeling Sr Graduate
Financial Engineering, Freddie Mac
Re: Read key and values from HDFS
Hi Pedro,

I am not sure there is a single method for reading the data in output files written by different output formats. For sequence files, though, you can use the SequenceFile.Reader class in the API: the reader exposes the key and value classes recorded in the file header, and you can then iterate over the records with next(key, value).
Re: Read key and values from HDFS
Hi,

Just use hadoop fs -text <file>. It will read most of these files without breaking a sweat :) You can look at its implementation inside FsShell.java if you want to implement or reuse the same logic in Java.

On Fri, Mar 30, 2012 at 11:01 PM, kasi subrahmanyam kasisubbu...@gmail.com wrote:
> I am not sure we have a single method for reading the data in output
> files for different output formats. But for sequence files we can use
> the SequenceFile.Reader class in the API to read the sequence files.

-- Harsh J
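For anyone who wants the "one method for several formats" idea in Java, one common approach is to sniff the first bytes of the file and pick a reader from that. SequenceFiles start with the bytes 'S', 'E', 'Q' followed by a version byte, and gzip streams start with 0x1f 0x8b. The sketch below uses only the JDK so it can be run without a cluster. The names FormatSniffer, Kind and sniff are hypothetical and are not part of the Hadoop API, and treating everything unrecognized as plain text is an assumption. On HDFS the InputStream would come from FileSystem.open(path), and a SEQUENCE_FILE result would then be handed to SequenceFile.Reader.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not Hadoop API): guesses a file's container format
// from its leading magic bytes.
public class FormatSniffer {

    public enum Kind { SEQUENCE_FILE, GZIP, PLAIN }

    // Reads at most three bytes from the stream and classifies it.
    // Assumption: anything that is neither a SequenceFile nor gzip is PLAIN.
    public static Kind sniff(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        try {
            int magic = data.readUnsignedShort(); // first two bytes, big-endian
            if (magic == 0x1f8b) {                // gzip magic number (RFC 1952)
                return Kind.GZIP;
            }
            if (magic == 0x5345 && data.readByte() == 'Q') { // 'S' 'E' then 'Q'
                return Kind.SEQUENCE_FILE;
            }
        } catch (EOFException e) {
            // Fewer bytes than a header needs: fall through to PLAIN.
        }
        return Kind.PLAIN;
    }

    public static void main(String[] args) throws IOException {
        byte[] seqHeader = {'S', 'E', 'Q', 6};
        byte[] gzipHeader = {(byte) 0x1f, (byte) 0x8b, 8};
        byte[] text = "key\tvalue\n".getBytes("UTF-8");

        System.out.println(sniff(new ByteArrayInputStream(seqHeader)));
        System.out.println(sniff(new ByteArrayInputStream(gzipHeader)));
        System.out.println(sniff(new ByteArrayInputStream(text)));
    }
}
```

This only identifies the container. As noted earlier in the thread, an arbitrary OutputFormat may not store both key and value, so a PLAIN result says nothing about whether (key, value) pairs can be recovered from the file.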