Sent from my iPad
On 2014-9-24, at 8:13 AM, Steve Lewis <lordjoe2...@gmail.com> wrote:

> When I experimented in Spark with an InputFormat I had used in Hadoop for a long time, I found:
> 1) it must extend org.apache.hadoop.mapred.FileInputFormat (the deprecated class), not org.apache.hadoop.mapreduce.lib.input.FileInputFormat
> 2) initialize needs to be called in the constructor
> 3) the key/value types must not be Hadoop Writables, since those are not serializable - mine extended FileInputFormat<Text, Text>, which fails, while FileInputFormat<StringBuffer, StringBuffer> does work; I don't think that is allowed in Hadoop
>
> Are these statements correct? If so, it seems most Hadoop InputFormats - certainly the custom ones I create - require serious modifications to work. Does anyone have samples of the use of a Hadoop InputFormat?
>
> Since I am working with problems where a directory of multiple files is processed, and some files are many gigabytes in size with multiline complex records, an input format is a requirement.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
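
[For reference, a minimal sketch of the pattern asked about above. It uses Hadoop's built-in old-API KeyValueTextInputFormat rather than a custom format, and a placeholder HDFS path; the key point is converting the non-serializable Text Writables to Strings immediately. This is a sketch under those assumptions, not code from the thread.]

```scala
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapred.KeyValueTextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object HadoopInputFormatExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hadoop-input-format"))

    // SparkContext.hadoopFile takes an old-API (org.apache.hadoop.mapred)
    // InputFormat; use newAPIHadoopFile for org.apache.hadoop.mapreduce formats.
    val rdd = sc.hadoopFile[Text, Text, KeyValueTextInputFormat]("hdfs:///path/to/dir")

    // Hadoop Writables such as Text are not java.io.Serializable, and Hadoop
    // reuses the same object for each record, so convert to plain Strings
    // right away, before any shuffle, cache, or collect.
    val strings = rdd.map { case (k, v) => (k.toString, v.toString) }

    strings.take(5).foreach(println)
    sc.stop()
  }
}
```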