I'm not that familiar with the Python APIs, but you should be able to configure a Job object with your custom InputFormat and pass the required configuration (i.e. job.getConfiguration()) to newAPIHadoopRDD to get the required RDD.
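A rough sketch of what that looks like from the PySpark side. Note this is an assumption-heavy example: in the Python API you pass fully-qualified class names as strings rather than a Job object, and "com.example.MyBinaryInputFormat" plus the key/value classes below are placeholders for whatever your compiled InputFormat actually produces (the jar has to be on the classpath, e.g. via --jars).

```python
def hadoop_rdd_args(input_dir):
    """Build the arguments PySpark's newAPIHadoopRDD expects:
    fully-qualified Java class names (as strings) plus a Hadoop
    configuration dict standing in for job.getConfiguration()."""
    return dict(
        # Placeholder: your custom InputFormat class, compiled from the .java file
        inputFormatClass="com.example.MyBinaryInputFormat",
        # Placeholders: whatever key/value Writable types your InputFormat emits
        keyClass="org.apache.hadoop.io.LongWritable",
        valueClass="org.apache.hadoop.io.BytesWritable",
        # Hadoop conf entries your InputFormat needs, e.g. the input path
        conf={"mapreduce.input.fileinputformat.inputdir": input_dir},
    )

# With a live SparkContext `sc`, usage would then look like:
# rdd = sc.newAPIHadoopRDD(**hadoop_rdd_args("hdfs:///data/myfile.bin"))
```

Again, untested in your setup; the main point is that the Python side only ever sees class-name strings and a conf dict, never the Java Job object itself.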
On Wed, Aug 13, 2014 at 2:59 PM, Tassilo Klein <tjkl...@gmail.com> wrote:
> Hi,
>
> I'd like to read in a (binary) file from Python for which I have defined a
> Java InputFormat (.java) definition. However, I am now stuck on how to use
> it from Python, and I didn't find anything in the newsgroups either.
> As far as I know, I have to use the newAPIHadoopRDD function, but I am
> not sure how to use it in combination with my custom InputFormat.
> Does anybody have a short snippet of code showing how to do it?
> Thanks in advance.
> Best,
> Tassilo
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Using-Hadoop-InputFormat-in-Python-tp12067.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org