Re: Spark Core and ways of "talking" to it for enhancing application language support

2015-07-14 Thread Vasili I. Galchin
thanks.

On Tuesday, July 14, 2015, Shivaram Venkataraman wrote:

> Both SparkR and the PySpark API call into the JVM Spark API (i.e.
> JavaSparkContext, JavaRDD etc.). They use different methods (Py4J vs. the
> R-Java bridge) to call into the JVM based on libraries available / features
> supported in each language. So for Haskell, one would need to see what is
> the best way to call the underlying Java API functions from Haskell and get
> results back.
>
> Thanks
> Shivaram
>
> On Mon, Jul 13, 2015 at 8:51 PM, Vasili I. Galchin wrote:
>
>> Hello,
>>
>>  So far I think there are at least two ways (maybe more) to interact
>> with the Spark Core from various programming languages: the PySpark API
>> and the R API. From reading the code, the PySpark and R approaches seem
>> very disparate ... with the latter using the R-Java bridge. Regarding
>> Haskell, I am trying to decide which way to go. I realize that, like in
>> any open-source effort, the approaches have varied based on history. Is
>> there an intent to adopt one approach as the standard? (Not trying to
>> start a war :-).
>>
>> Vasili
>>
>> BTW I guess the Java and Scala APIs are simpler, given the nature of
>> both languages vis-a-vis the JVM?
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>
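Shivaram's description above (every language binding ultimately forwards calls to the JVM Spark API) can be sketched in miniature in Python. Everything here is illustrative: `JvmStub` is a hypothetical, in-process stand-in for `JavaSparkContext`/`JavaRDD`, where a real binding (Py4J for PySpark, the socket bridge for SparkR) would cross a process boundary.

```python
# Minimal sketch of a guest-language binding that forwards every call to
# the JVM Spark API. JvmStub is a hypothetical in-process stand-in for the
# JVM side; a real bridge (Py4J, the SparkR socket backend) would send
# these calls to a separate JVM process instead.

class JvmStub:
    """Pretend JVM: offers parallelize/count like JavaSparkContext + JavaRDD."""
    def parallelize(self, data):
        return list(data)            # pretend this is a JavaRDD handle

    def count(self, rdd_handle):
        return len(rdd_handle)

class GuestRDD:
    """Guest-language RDD wrapper: holds a JVM handle, delegates every call."""
    def __init__(self, jvm, handle):
        self._jvm = jvm
        self._handle = handle

    def count(self):
        return self._jvm.count(self._handle)

class GuestContext:
    """Guest-language SparkContext wrapper."""
    def __init__(self, jvm):
        self._jvm = jvm

    def parallelize(self, data):
        return GuestRDD(self._jvm, self._jvm.parallelize(data))

sc = GuestContext(JvmStub())
rdd = sc.parallelize([1, 2, 3, 4])
print(rdd.count())  # -> 4
```

A Haskell binding would take the same shape: keep an opaque handle to the JVM object and delegate each operation, so the guest language never reimplements Spark's execution logic.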


Spark Core and ways of "talking" to it for enhancing application language support

2015-07-13 Thread Vasili I. Galchin
Hello,

 So far I think there are at least two ways (maybe more) to interact
with the Spark Core from various programming languages: the PySpark API
and the R API. From reading the code, the PySpark and R approaches seem
very disparate ... with the latter using the R-Java bridge. Regarding
Haskell, I am trying to decide which way to go. I realize that, like in
any open-source effort, the approaches have varied based on history. Is
there an intent to adopt one approach as the standard? (Not trying to
start a war :-).

Vasili

BTW I guess the Java and Scala APIs are simpler, given the nature of
both languages vis-a-vis the JVM?




Spark application examples

2015-07-11 Thread Vasili I. Galchin
Hello,

  Reading the slides entitled "DATABRICKS", written by Holden Karau et al.

  I am also reading Spark application examples under
../spark/examples/src/main/*.

  Take the example drivers data-manipulation.R and
dataframe.R. Question: where in these drivers are the worker "bees"
spawned?

Vasili




Re: language-independent RDD Spark core code?

2015-07-10 Thread Vasili I. Galchin
I think I found this RDD code.

On Fri, Jul 10, 2015 at 7:00 PM, Vasili I. Galchin  wrote:
> I am looking at the R side, but I am curious what the RDD core side
> looks like. I am not sure which directory to look in. Any pointers?
>
> Thanks,
>
> Vasili




language-independent RDD Spark core code?

2015-07-10 Thread Vasili I. Galchin
I am looking at the R side, but I am curious what the RDD core side
looks like. I am not sure which directory to look in. Any pointers?

Thanks,

Vasili




PySpark vs R

2015-07-09 Thread Vasili I. Galchin
Hello,

Just trying to get up to speed (one week in .. please be patient with me).

I have been reading several docs .. plus ...

reading the PySpark vs. R code. I don't see a common structure shared by
the Python and R implementations. ??

Probably I should read the native Scala code, yes?

Kind thx,

Vasili




Re: callJMethod?

2015-07-09 Thread Vasili I. Galchin
Now I want to look at the PySpark side for comparison. I assume it uses
the same mechanism for remote function calls!!

Maybe this is in the slides .. I assume there are multiple JVMs for load
balancing and fault tolerance, yes?


How can I get a single PDF with all the slides together, rather than a
slide show?

Vasili

On Thu, Jul 9, 2015 at 4:41 PM, Shivaram Venkataraman wrote:
> callJMethod is a private R function that is defined in
> https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31
>
> callJMethod serializes the function names, arguments and sends them over a
> socket to the JVM. This is the socket-based R to JVM bridge described in the
> Spark Summit talk
> https://spark-summit.org/2015/events/sparkr-the-past-the-present-and-the-future/
>
> Thanks
> Shivaram
>
> On Thu, Jul 9, 2015 at 4:28 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>>I am reading R code, e.g. RDD.R, DataFrame.R, etc. I see that
>> callJMethod is repeatedly called. Is callJMethod part of the Spark Java API?
>> Thx.
>>
>> Vasili
>
>
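Shivaram's explanation above (callJMethod serializes a method name and arguments and sends them over a socket to the JVM) can be sketched in plain Python. The JSON wire format and the one-call server below are assumptions made purely for illustration; the real SparkR backend uses its own binary protocol.

```python
# Toy version of the callJMethod idea: the guest process serializes a
# method name plus arguments, ships them over a socket, and a stand-in
# "JVM backend" executes the call and returns the result. JSON framing
# and the single-call server are illustrative assumptions only.
import json
import socket
import threading

def jvm_backend(server_sock):
    """Stand-in for the JVM side: serve exactly one method call."""
    methods = {"toUpper": lambda s: s.upper(), "textLength": len}
    conn, _ = server_sock.accept()
    with conn:
        request = json.loads(conn.recv(4096).decode())
        result = methods[request["method"]](*request["args"])
        conn.sendall(json.dumps({"result": result}).encode())

def call_j_method(port, method, *args):
    """Guest-language side: serialize the call, block on the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(json.dumps({"method": method, "args": list(args)}).encode())
        return json.loads(sock.recv(4096).decode())["result"]

server = socket.socket()
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
server.listen(1)
backend = threading.Thread(target=jvm_backend, args=(server,))
backend.start()
reply = call_j_method(server.getsockname()[1], "toUpper", "spark")
backend.join()
server.close()
print(reply)  # -> SPARK
```

The key property, which the real bridge shares, is that the guest side holds no Spark logic at all: it only names a method on a remote object and waits for the serialized result.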




Re: callJMethod?

2015-07-09 Thread Vasili I. Galchin
very nice explanation. Thx

On Thu, Jul 9, 2015 at 4:41 PM, Shivaram Venkataraman wrote:
> callJMethod is a private R function that is defined in
> https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31
>
> callJMethod serializes the function names, arguments and sends them over a
> socket to the JVM. This is the socket-based R to JVM bridge described in the
> Spark Summit talk
> https://spark-summit.org/2015/events/sparkr-the-past-the-present-and-the-future/
>
> Thanks
> Shivaram
>
> On Thu, Jul 9, 2015 at 4:28 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>>I am reading R code, e.g. RDD.R, DataFrame.R, etc. I see that
>> callJMethod is repeatedly called. Is callJMethod part of the Spark Java API?
>> Thx.
>>
>> Vasili
>
>




callJMethod?

2015-07-09 Thread Vasili I. Galchin
Hello,

   I am reading R code, e.g. RDD.R, DataFrame.R, etc. I see that
callJMethod is repeatedly called. Is callJMethod part of the Spark Java API?
Thx.

Vasili


Spark and Haskell support

2015-07-08 Thread Vasili I. Galchin
Hello,

  1) I have been rereading kind email responses to my Spark queries. Thx.

  2) I have also been reading "R" code:

 1) RDD.R

 2) DataFrame.R

  3) All of the following APIs =>
https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals

  4) Python ... https://spark.apache.org/docs/latest/programming-guide.html


   Based on the above points, when I see calls in e.g. DataFrame.R
to a function "callJMethod" ... two questions ...

- is callJMethod calling into the JVM, or better yet into the
Spark Java API? Please answer in a precise way without
ambiguities ... this will save too many back-and-forth queries

- if the above is true, how is the JVM "imported" into the
DataFrame.R code?

 I would appreciate it if your answers were interleaved
with my questions, to avoid ambiguities ..


Thx,

Vasili




Re: how can I write a language "wrapper"?

2015-06-29 Thread Vasili I. Galchin
Shivaram,

Regarding Haskell support, I am reading DataFrame.R,
SparkRBackend*, context.R, et al. Am I headed in the correct
direction? Yes or no, please give more guidance. Thank you.

Kind regards,

Vasili



On Tue, Jun 23, 2015 at 1:46 PM, Shivaram Venkataraman wrote:
> Every language has its own quirks / features -- so I don't think there
> exists a document on how to go about doing this for a new language. The most
> closely related write-up I know of is the wiki page on PySpark internals
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals written
> by Josh Rosen -- it covers some of the issues, like closure capture,
> serialization, and JVM communication, that you'll need to handle for a new
> language.
>
> Thanks
> Shivaram
>
> On Tue, Jun 23, 2015 at 1:35 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>>   I want to add language support for another language (other than
>> Scala, Java, et al.). Where is documentation that explains how to provide
>> support for a new language?
>>
>> Thank you,
>>
>> Vasili
>
>
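Of the issues Shivaram lists above, closure serialization is the easiest to demonstrate. PySpark actually uses cloudpickle, which can also serialize lambdas and closures that plain pickle rejects; the sketch below sticks to stdlib pickle with a module-level function, and `run_on_worker` is a hypothetical stand-in for a remote executor.

```python
# Sketch of shipping a function from the driver to a worker: serialize it,
# send the bytes, deserialize and apply on the other side. PySpark uses
# cloudpickle for this; plain pickle (used here) only handles module-level
# functions by reference, which is enough for the sketch.
import pickle

def add_one(x):
    """Module-level function: plain pickle can serialize it by reference."""
    return x + 1

def run_on_worker(payload, partition):
    """Hypothetical remote worker: rebuild the function, apply it."""
    fn = pickle.loads(payload)
    return [fn(item) for item in partition]

payload = pickle.dumps(add_one)              # driver side
results = run_on_worker(payload, [1, 2, 3])  # worker side
print(results)  # -> [2, 3, 4]
```

A new language binding has to solve the same three problems the wiki page names: capture the user's function, serialize it portably, and hand it to the JVM-managed workers for execution.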




Re: R - Scala interface used in Spark?

2015-06-26 Thread Vasili I. Galchin
thx Reynold!

Vasya

On Fri, Jun 26, 2015 at 7:03 PM, Reynold Xin  wrote:

> Take a look at this for Python:
>
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
>
>
> On Fri, Jun 26, 2015 at 6:06 PM, Reynold Xin  wrote:
>
>> You doing something for Haskell??
>>
>> On Fri, Jun 26, 2015 at 5:21 PM, Vasili I. Galchin wrote:
>>
>>> How about Python??
>>>
>>> On Friday, June 26, 2015, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>>>
>>>> We don't use the rscala package in SparkR -- We have an in built R-JVM
>>>> bridge that is customized to work with various deployment modes. You can
>>>> find more details in my Spark Summit 2015 talk.
>>>>
>>>> Thanks
>>>> Shivaram
>>>>
>>>> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>>>>
>>>>> A friend sent the below:
>>>>>
>>>>> http://cran.r-project.org/web/packages/rscala/index.html
>>>>>
>>>>> Is this the "glue" between R and Scala that is used in Spark?
>>>>>
>>>>> Vasili
>>>>>
>>>>
>>>>
>>
>


Re: R - Scala interface used in Spark?

2015-06-26 Thread Vasili I. Galchin
How about Python??

On Friday, June 26, 2015, Shivaram Venkataraman wrote:

> We don't use the rscala package in SparkR -- We have an in built R-JVM
> bridge that is customized to work with various deployment modes. You can
> find more details in my Spark Summit 2015 talk.
>
> Thanks
> Shivaram
>
> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>
>> A friend sent the below:
>>
>> http://cran.r-project.org/web/packages/rscala/index.html
>>
>> Is this the "glue" between R and Scala that is used in Spark?
>>
>> Vasili
>>
>
>


Re: R - Scala interface used in Spark?

2015-06-26 Thread Vasili I. Galchin
URL, please!! A URL for your work, please.

On Friday, June 26, 2015, Shivaram Venkataraman wrote:

> We don't use the rscala package in SparkR -- We have an in built R-JVM
> bridge that is customized to work with various deployment modes. You can
> find more details in my Spark Summit 2015 talk.
>
> Thanks
> Shivaram
>
> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>
>> A friend sent the below:
>>
>> http://cran.r-project.org/web/packages/rscala/index.html
>>
>> Is this the "glue" between R and Scala that is used in Spark?
>>
>> Vasili
>>
>
>


R - Scala interface used in Spark?

2015-06-26 Thread Vasili I. Galchin
A friend sent the below:

http://cran.r-project.org/web/packages/rscala/index.html

Is this the "glue" between R and Scala that is used in Spark?

Vasili


Re: how can I write a language "wrapper"?

2015-06-24 Thread Vasili I. Galchin
Matei,

 Last night I downloaded the Spark bundle.
To save me time, can you give me the name of the SparkR example
and where it is in the Spark tree?

Thanks,

Bill

On Tuesday, June 23, 2015, Matei Zaharia  wrote:

> Just FYI, it would be easiest to follow SparkR's example and add the
> DataFrame API first. Other APIs will be designed to work on DataFrames
> (most notably machine learning pipelines), and the surface of this API is
> much smaller than of the RDD API. This API will also give you great
> performance as we continue to optimize Spark SQL.
>
> Matei
>
> On Jun 23, 2015, at 1:46 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>
> Every language has its own quirks / features -- so I don't think there
> exists a document on how to go about doing this for a new language. The
> most closely related write-up I know of is the wiki page on PySpark internals
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
> written by Josh Rosen -- it covers some of the issues, like closure capture,
> serialization, and JVM communication, that you'll need to handle for a new
> language.
>
> Thanks
> Shivaram
>
> On Tue, Jun 23, 2015 at 1:35 PM, Vasili I. Galchin wrote:
>
>> Hello,
>>
>>   I want to add language support for another language (other than
>> Scala, Java, et al.). Where is documentation that explains how to provide
>> support for a new language?
>>
>> Thank you,
>>
>> Vasili
>>
>
>
>
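Matei's suggestion above (bind the DataFrame API first) works because a guest-language DataFrame can be a thin handle that forwards every operation to the engine, which does the optimization and execution. The sketch below is illustrative only: `EngineDF` is a hypothetical stand-in for the JVM-side DataFrame, and a real bridge would ship column expressions, not host-language lambdas, across the boundary.

```python
# Sketch of a thin guest-language DataFrame wrapper. EngineDF is a
# hypothetical stand-in for the JVM-side DataFrame; the wrapper holds no
# data and no execution logic, only a handle it delegates to.

class EngineDF:
    """Pretend JVM-side DataFrame holding rows as dicts."""
    def __init__(self, rows):
        self._rows = rows

    def filter(self, predicate):
        return EngineDF([r for r in self._rows if predicate(r)])

    def count(self):
        return len(self._rows)

class GuestDataFrame:
    """Guest-language wrapper: every operation is forwarded to the engine."""
    def __init__(self, jdf):
        self._jdf = jdf

    def filter(self, predicate):
        return GuestDataFrame(self._jdf.filter(predicate))

    def count(self):
        return self._jdf.count()

df = GuestDataFrame(EngineDF([{"age": 25}, {"age": 41}, {"age": 19}]))
adults = df.filter(lambda row: row["age"] >= 21)
print(adults.count())  # -> 2
```

Because the wrapper carries no execution logic, the binding surface stays small, which is exactly why the DataFrame API is an easier first target than the full RDD API.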


how can I write a language "wrapper"?

2015-06-23 Thread Vasili I. Galchin
Hello,

  I want to add language support for another language (other than Scala,
Java, et al.). Where is documentation that explains how to provide support
for a new language?

Thank you,

Vasili