I'm not that familiar with the Python APIs, but you should be able to configure a Job object with your custom InputFormat and pass the required configuration (i.e. job.getConfiguration()) to newAPIHadoopRDD to get the required RDD.
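A rough sketch of what that looks like from the PySpark side. Note this is an assumption-heavy example: in the Python API you pass fully-qualified class names as strings rather than a Job object, and "com.example.MyBinaryInputFormat" plus the key/value classes below are placeholders for whatever your compiled InputFormat actually produces (the jar has to be on the classpath, e.g. via --jars).

```python
def hadoop_rdd_args(input_dir):
    """Build the arguments PySpark's newAPIHadoopRDD expects:
    fully-qualified Java class names (as strings) plus a Hadoop
    configuration dict standing in for job.getConfiguration()."""
    return dict(
        # Placeholder: your custom InputFormat class, compiled from the .java file
        inputFormatClass="com.example.MyBinaryInputFormat",
        # Placeholders: whatever key/value Writable types your InputFormat emits
        keyClass="org.apache.hadoop.io.LongWritable",
        valueClass="org.apache.hadoop.io.BytesWritable",
        # Hadoop conf entries your InputFormat needs, e.g. the input path
        conf={"mapreduce.input.fileinputformat.inputdir": input_dir},
    )

# With a live SparkContext `sc`, usage would then look like:
# rdd = sc.newAPIHadoopRDD(**hadoop_rdd_args("hdfs:///data/myfile.bin"))
```

Again, untested in your setup; the main point is that the Python side only ever sees class-name strings and a conf dict, never the Java Job object itself.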
On Wed, Aug 13, 2014 at 2:59 PM, Tassilo Klein <tjkl...@gmail.com> wrote:
> Hi,
>
> I'd like to read in a (binary) file from Python for which I have defined a
> Java InputFormat (.java) definition. However, I am now stuck on how to use
> it from Python, and I didn't find anything in the newsgroups either.
> As far as I know, I have to use the newAPIHadoopRDD function, but I am
> not sure how to use it in combination with my custom InputFormat.
> Does anybody have a short snippet of code showing how to do it?
> Thanks in advance.
> Best,
> Tassilo
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Using-Hadoop-InputFormat-in-Python-tp12067.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org