Re: Does anyone have experience with using Hadoop InputFormats?

2015-08-01 Thread Antsy.Rao
Sent from my iPad On 2014-9-24, at 8:13 AM, Steve Lewis lordjoe2...@gmail.com wrote: When I experimented with using an InputFormat I had used in Hadoop for a long time, I found 1) it must extend org.apache.hadoop.mapred.FileInputFormat (the deprecated class not

Re: Does anyone have experience with using Hadoop InputFormats?

2014-09-24 Thread Steve Lewis
I tried newAPIHadoopFile and it works, except that my original InputFormat extends InputFormat<Text,Text> and has a RecordReader<Text,Text>. This throws a not-Serializable exception on Text; changing the type to InputFormat<StringBuffer, StringBuffer> works with minor code changes. I do not, however,

Re: Does anyone have experience with using Hadoop InputFormats?

2014-09-24 Thread Russ Weeks
I use newAPIHadoopRDD with AccumuloInputFormat. It produces a PairRDD using Accumulo's Key and Value classes, both of which extend Writable. Works like a charm. I use the same InputFormat for all my MR jobs. -Russ On Wed, Sep 24, 2014 at 9:33 AM, Steve Lewis lordjoe2...@gmail.com wrote: I

Re: Does anyone have experience with using Hadoop InputFormats?

2014-09-24 Thread Steve Lewis
Do your custom Writable classes implement Serializable? I think that is the only real issue; my code uses vanilla Text

Re: Does anyone have experience with using Hadoop InputFormats?

2014-09-24 Thread Russ Weeks
No, they do not implement Serializable. There are a couple of places where I've had to do a Text-to-String conversion but generally it hasn't been a problem. -Russ On Wed, Sep 24, 2014 at 10:27 AM, Steve Lewis lordjoe2...@gmail.com wrote: Do your custom Writable classes implement Serializable - I

Re: Does anyone have experience with using Hadoop InputFormats?

2014-09-24 Thread Steve Lewis
Hmmm - I have only tested in local mode but I got a java.io.NotSerializableException: org.apache.hadoop.io.Text at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1180) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1528) at
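
For readers following the thread, the failure above can be reproduced without Spark or Hadoop on the classpath. The sketch below is only an illustration under stated assumptions: PlainWritable is a hypothetical stand-in for a class like Text that does not implement java.io.Serializable, and isJavaSerializable is a helper invented here, not part of any library. It shows that plain Java serialization rejects such an object, while the String obtained from toString() passes, which is in the spirit of the Text-String conversion mentioned earlier in the thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class SerializationCheck {

    // Hypothetical stand-in for a Writable-style class such as Text:
    // it carries data but does NOT implement java.io.Serializable.
    public static final class PlainWritable {
        private final String value;

        public PlainWritable(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Hypothetical helper: returns true if Java serialization accepts the
    // object, false if it is rejected with NotSerializableException.
    public static boolean isJavaSerializable(Object obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        PlainWritable record = new PlainWritable("record-1");
        // The non-Serializable object is rejected by ObjectOutputStream.
        System.out.println(isJavaSerializable(record));
        // Its String form is accepted, since String implements Serializable.
        System.out.println(isJavaSerializable(record.toString()));
    }
}
```

Whether this is the exact path that raised the exception in the job above cannot be confirmed from the truncated stack trace; the sketch only shows the general mechanism behind a NotSerializableException.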

Does anyone have experience with using Hadoop InputFormats?

2014-09-23 Thread Steve Lewis
When I experimented with using an InputFormat I had used in Hadoop for a long time, I found 1) it must extend org.apache.hadoop.mapred.FileInputFormat (the deprecated class, not org.apache.hadoop.mapreduce.lib.input.FileInputFormat) 2) initialize needs to be called in the constructor 3)