Re: how can I write a language wrapper?

2015-06-29 Thread Daniel Darabos
Hi Vasili, It so happens that the entire SparkR code was merged into Apache Spark in a single pull request, so you can see all the required changes at once in https://github.com/apache/spark/pull/5096. It's 12,043 lines and, as I understand it, took more than 20 people about a year to write.

Re: how can I write a language wrapper?

2015-06-29 Thread Vasili I. Galchin
Shivaram, Vis-à-vis Haskell support, I am reading DataFrame.R, SparkRBackend*, context.R, et al. Am I headed in the correct direction? Yes or no, please give more guidance. Thank you. Kind regards, Vasili

Re: how can I write a language wrapper?

2015-06-29 Thread Justin Uang
My guess is that if you are just wrapping the Spark SQL APIs, you can get away with not having to reimplement a lot of the complexities in PySpark, like storing everything in RDDs as pickled byte arrays, pipelining RDDs, and doing aggregations and joins in the Python interpreters. Since the
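To illustrate the "thin wrapper" idea Justin describes, here is a minimal sketch (invented names, not Spark's actual API or the SparkR backend): the guest language holds only handles, and every method call is forwarded to a backend that owns the real objects, so no execution logic needs reimplementing in the wrapper.

```python
# Hypothetical sketch of a thin language wrapper (not Spark code): the
# guest language holds handles; a backend owns the real objects.

class BackendSim:
    """Stands in for the JVM backend: owns objects, invokes methods."""
    def __init__(self):
        self.objects = {}
        self.next_id = 0

    def register(self, obj):
        oid = self.next_id
        self.next_id += 1
        self.objects[oid] = obj
        return oid

    def invoke(self, oid, method, *args):
        result = getattr(self.objects[oid], method)(*args)
        if isinstance(result, (int, float, str, bool)):
            return result                      # primitives cross the boundary
        # Non-primitive results stay on the backend; return a new handle.
        return DataFrameProxy(self, self.register(result))

class FakeDataFrame:
    """Backend-side object; in real Spark this would be a JVM DataFrame."""
    def __init__(self, rows):
        self.rows = rows
    def select(self, column):
        return FakeDataFrame([{column: r[column]} for r in self.rows])
    def count(self):
        return len(self.rows)

class DataFrameProxy:
    """Guest-language handle: no data or execution logic lives here."""
    def __init__(self, backend, oid):
        self.backend = backend
        self.oid = oid
    def select(self, column):
        return self.backend.invoke(self.oid, "select", column)
    def count(self):
        return self.backend.invoke(self.oid, "count")

backend = BackendSim()
df = DataFrameProxy(backend, backend.register(
    FakeDataFrame([{"age": 25, "name": "a"}, {"age": 17, "name": "b"}])))
print(df.select("age").count())  # -> 2
```

In a real wrapper the `invoke` call would cross a process boundary (a socket or an RPC layer) rather than a Python function call, but the shape of the protocol is the same.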

Re: how can I write a language wrapper?

2015-06-24 Thread Shivaram Venkataraman
The SparkR code is in the `R` directory, i.e. https://github.com/apache/spark/tree/master/R Shivaram

Re: how can I write a language wrapper?

2015-06-24 Thread Vasili I. Galchin
Matei, Last night I downloaded the Spark bundle. To save me time, can you give me the name of the SparkR example and where it is in the Spark tree? Thanks, Bill

Re: how can I write a language wrapper?

2015-06-23 Thread Shivaram Venkataraman
Every language has its own quirks / features -- so I don't think there exists a document on how to go about doing this for a new language. The most closely related write-up I know of is the wiki page on PySpark internals https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals written by Josh
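Two of the PySpark-internals ideas mentioned in this thread (data stored as pickled byte arrays, and narrow transformations pipelined into a single pass) can be sketched in miniature. This is simplified, invented code, not actual PySpark internals:

```python
import pickle

# Toy RDD: partitions are pickled byte strings, and map/filter only
# record deferred functions; collect() runs the whole pipeline in one
# pass over each partition (pipelining).

class MiniRDD:
    def __init__(self, pickled_partitions, pipeline=None):
        self.partitions = pickled_partitions   # list of pickled byte strings
        self.pipeline = pipeline or []         # deferred per-element steps

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        data = list(data)
        size = max(1, (len(data) + num_partitions - 1) // num_partitions)
        parts = [pickle.dumps(data[i:i + size])
                 for i in range(0, len(data), size)]
        return cls(parts)

    def map(self, f):
        # No work happens here: just extend the pipeline.
        return MiniRDD(self.partitions, self.pipeline + [("map", f)])

    def filter(self, f):
        return MiniRDD(self.partitions, self.pipeline + [("filter", f)])

    def collect(self):
        out = []
        for part in self.partitions:
            for x in pickle.loads(part):       # deserialize once per partition
                keep = True
                for kind, f in self.pipeline:  # whole pipeline per element
                    if kind == "map":
                        x = f(x)
                    elif kind == "filter" and not f(x):
                        keep = False
                        break
                if keep:
                    out.append(x)
        return out

rdd = MiniRDD.parallelize(range(10))
result = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0).collect()
print(result)  # -> [0, 4, 16, 36, 64]
```

Real PySpark adds much more (a daemon of Python worker processes, batched serialization, a socket protocol to the JVM), which is exactly the complexity the DataFrame-only approach above avoids.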

Re: how can I write a language wrapper?

2015-06-23 Thread Matei Zaharia
Just FYI, it would be easiest to follow SparkR's example and add the DataFrame API first. Other APIs will be designed to work on DataFrames (most notably machine learning pipelines), and the surface of this API is much smaller than that of the RDD API. This API will also give you great performance
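One reason a DataFrame wrapper gets native performance in any language can be sketched as follows (hypothetical names, not Spark's Catalyst): operations build expressions as plain data, so the backend can evaluate them without ever calling back into the wrapper language per row.

```python
# Sketch: a column expression is data, not a closure, so a JVM-side
# (here: plain Python) evaluator can run it natively. Invented names.

class Col:
    def __init__(self, name):
        self.name = name
    def __gt__(self, value):
        return ("gt", self.name, value)        # expression as plain data

def evaluate(expr, row):
    """Backend-side evaluator; in Spark this is JVM/Catalyst code."""
    op, name, value = expr
    if op == "gt":
        return row[name] > value
    raise ValueError(op)

rows = [{"age": 25}, {"age": 17}, {"age": 30}]
expr = Col("age") > 21                         # builds data, runs nothing
print([r for r in rows if evaluate(expr, r)])  # -> [{'age': 25}, {'age': 30}]
```

Contrast this with an RDD `filter(lambda ...)`: the lambda can only run in a guest-language interpreter, which is why the RDD route forces per-row serialization and worker processes.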

how can I write a language wrapper?

2015-06-23 Thread Vasili I. Galchin
Hello, I want to add language support for another language (other than Scala, Java, et al.). Where is the documentation that explains how to provide support for a new language? Thank you, Vasili