My guess is that if you are just wrapping the Spark SQL APIs, you can get
away with not having to reimplement a lot of the complexities in PySpark,
such as storing everything in RDDs as pickled byte arrays, pipelining RDDs,
and doing aggregations and joins in the Python interpreters.
Since the canoni
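To make the "just wrap the API" idea concrete, here is a minimal sketch in Python. It is a toy, not Spark code: `FakeBackend`, `BackendTable`, and `DataFrameHandle` are hypothetical names invented for illustration, and the in-process backend merely stands in for a separate JVM. The point it tries to show is that the guest-language wrapper holds only a handle and forwards each call, so no data is serialized into the guest interpreter.

```python
# Toy illustration only: FakeBackend stands in for a JVM process, and none of
# these names are real Spark APIs.
import itertools


class FakeBackend:
    """Holds the real objects and executes method calls by handle id."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def register(self, obj):
        handle = next(self._ids)
        self._objects[handle] = obj
        return handle

    def call(self, handle, method, *args):
        result = getattr(self._objects[handle], method)(*args)
        # Simple values are returned directly; anything else stays on the
        # backend and only a new handle crosses the boundary.
        if isinstance(result, (int, float, str, bool, list, type(None))):
            return result
        return self.register(result)


class BackendTable:
    """Stand-in for a backend-side DataFrame: rows are plain dicts."""

    def __init__(self, rows):
        self.rows = rows

    def filter_eq(self, column, value):
        return BackendTable([r for r in self.rows if r[column] == value])

    def count(self):
        return len(self.rows)


class DataFrameHandle:
    """Guest-language wrapper: stores a handle, never the data itself."""

    def __init__(self, backend, handle):
        self._backend = backend
        self._handle = handle

    def filter_eq(self, column, value):
        new_handle = self._backend.call(self._handle, "filter_eq", column, value)
        return DataFrameHandle(self._backend, new_handle)

    def count(self):
        return self._backend.call(self._handle, "count")


backend = FakeBackend()
df = DataFrameHandle(backend, backend.register(BackendTable([
    {"lang": "R", "year": 2015},
    {"lang": "Python", "year": 2013},
    {"lang": "R", "year": 2014},
])))
r_count = df.filter_eq("lang", "R").count()
print(r_count)
```

In a real binding the `call` step would cross a process boundary (for example over a socket), but the wrapper side would stay about this thin.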
Hi Vasili,
It so happens that the entire SparkR code was merged to Apache Spark in a
single pull request. So you can see at once all the required changes in
https://github.com/apache/spark/pull/5096. It's 12,043 lines and took more
than 20 people about a year to write as I understand it.
On Mon, J
Shivaram,
Vis-a-vis Haskell support, I am reading DataFrame.R,
SparkRBackend*, context.R, et al. Am I headed in the correct
direction? Yes or no, please give more guidance. Thank you.
Kind regards,
Vasili
On Tue, Jun 23, 2015 at 1:46 PM, Shivaram Venkataraman
wrote:
> Every language h
The SparkR code is in the `R` directory, i.e.
https://github.com/apache/spark/tree/master/R
Shivaram
On Wed, Jun 24, 2015 at 8:45 AM, Vasili I. Galchin
wrote:
> Matei,
>
> Last night I downloaded the Spark bundle.
> In order to save me time, can you give me the name of the SparkR example
>
Matei,
Last night I downloaded the Spark bundle.
In order to save me time, can you give me the name of the SparkR example
and where it is in the Spark tree?
Thanks,
Bill
On Tuesday, June 23, 2015, Matei Zaharia wrote:
> Just FYI, it would be easiest to follow SparkR's example and add
Just FYI, it would be easiest to follow SparkR's example and add the DataFrame
API first. Other APIs will be designed to work on DataFrames (most notably
machine learning pipelines), and the surface of this API is much smaller than
that of the RDD API. This API will also give you great performance as
Every language has its own quirks / features -- so I don't think there
exists a document on how to go about doing this for a new language. The
most related write up I know of is the wiki page on PySpark internals
https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals written
by Josh Ro
Hello,
I want to add support for another language (other than Scala,
Java, et al.). Where is the documentation that explains how to provide support for
a new language?
Thank you,
Vasili