Re: Spark Core and ways of "talking" to it for enhancing application language support
Thanks.

On Tuesday, July 14, 2015, Shivaram Venkataraman wrote:
> Both SparkR and the PySpark API call into the JVM Spark API (i.e.
> JavaSparkContext, JavaRDD, etc.). They use different methods (Py4J vs. the
> R-Java bridge) to call into the JVM, based on the libraries available and
> features supported in each language. So for Haskell, one would need to see
> what is the best way to call the underlying Java API functions from Haskell
> and get results back.
>
> Thanks
> Shivaram
>
> On Mon, Jul 13, 2015 at 8:51 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>> So far I think there are at least two ways (maybe more) to interact with
>> the Spark Core from various programming languages: the PySpark API and the
>> R API. From reading the code, the PySpark approach and the R approach seem
>> very disparate, with the latter using the R-Java bridge. Regarding Haskell,
>> I am trying to decide which way to go. I realize that, like any open
>> software effort, the approaches varied based on history. Is there an intent
>> to adopt one approach as the standard? (Not trying to start a war :-).)
>>
>> Vasili
>>
>> BTW, I guess the Java and Scala APIs are simple given the nature of both
>> languages vis-a-vis the JVM?
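To make the Py4J path concrete, here is a minimal Python sketch of calling the JVM Spark API (JavaSparkContext, JavaRDD) over a Py4J gateway. It assumes a Py4J GatewayServer is already running in a JVM that has Spark on its classpath and listening on Py4J's default port; PySpark's own launcher normally starts that JVM and gateway for you, so treat this as an illustration of the mechanism rather than how PySpark is actually wired up.

    # Sketch: calling the JVM Spark API over a Py4J gateway.
    # Assumes a GatewayServer is already running in a Spark-equipped JVM.
    from py4j.java_gateway import JavaGateway

    gateway = JavaGateway()  # connects to Py4J's default port (25333)
    jvm = gateway.jvm

    # Build a SparkConf and a JavaSparkContext through the JVM view.
    conf = jvm.org.apache.spark.SparkConf() \
        .setAppName("bridge-sketch") \
        .setMaster("local[*]")
    jsc = jvm.org.apache.spark.api.java.JavaSparkContext(conf)

    # parallelize() expects a java.util.List, so build one explicitly.
    data = jvm.java.util.ArrayList()
    for x in [1, 2, 3, 4]:
        data.add(x)

    rdd = jsc.parallelize(data)
    print(rdd.count())  # -> 4

    jsc.stop()

A Haskell binding would need an analogous way to open a connection to the JVM, resolve classes and methods, and marshal arguments and results back and forth.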
Spark Core and ways of "talking" to it for enhancing application language support
Hello,

So far I think there are at least two ways (maybe more) to interact with the Spark Core from various programming languages: the PySpark API and the R API. From reading the code, the PySpark approach and the R approach seem very disparate, with the latter using the R-Java bridge. Regarding Haskell, I am trying to decide which way to go. I realize that, like any open software effort, the approaches varied based on history. Is there an intent to adopt one approach as the standard? (Not trying to start a war :-).)

Vasili

BTW, I guess the Java and Scala APIs are simple given the nature of both languages vis-a-vis the JVM?
Spark application examples
Hello,

I am reading the slides entitled "DATABRICKS" written by Holden Karau et al. I am also reading the Spark application examples under ../spark/examples/src/main/*. Take, for example, the driver programs data-manipulation.R and dataframe.R. Question: where in these drivers are the worker "bees" spawned?

Vasili
Re: language-independent RDD Spark core code?
I think I found the RDD code.

On Fri, Jul 10, 2015 at 7:00 PM, Vasili I. Galchin wrote:
> I am looking at the R side, but I am curious what the RDD core side looks
> like. I am not sure which directory to look inside.
>
> Thanks,
>
> Vasili
language-independent RDD Spark core code?
I am looking at the R side, but I am curious what the RDD core side looks like. I am not sure which directory to look inside.

Thanks,

Vasili
PySpark vs R
Hello,

I am just trying to get up to speed (it has been a week, so please be patient with me). I have been reading several docs, plus reading the PySpark and R code. I don't see an invariant shared between the Python and R implementations. Probably I should read the native Scala code, yes?

Kind thx,

Vasili
Re: callJMethod?
Now I want to look at the PySpark side for comparison. I assume the same mechanism is used to do the remote function call -- maybe it is covered in the slides. I assume there are multiple JVMs for load balancing and fault tolerance, yes? How can I get one PDF with all the slides together rather than a slide show?

Vasili

On Thu, Jul 9, 2015 at 4:41 PM, Shivaram Venkataraman wrote:
> callJMethod is a private R function that is defined in
> https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31
>
> callJMethod serializes the function name and arguments and sends them over a
> socket to the JVM. This is the socket-based R-to-JVM bridge described in the
> Spark Summit talk
> https://spark-summit.org/2015/events/sparkr-the-past-the-present-and-the-future/
>
> Thanks
> Shivaram
>
> On Thu, Jul 9, 2015 at 4:28 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>> I am reading the R code, e.g. RDD.R, DataFrame.R, etc. I see that
>> callJMethod is called repeatedly. Is callJMethod part of the Spark Java
>> API? Thx.
>>
>> Vasili
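The mechanism Shivaram describes is essentially a request/response protocol over a socket. Here is a toy Python sketch of that general idea; the wire format, port, and object/method names below are invented for illustration and are not SparkR's actual protocol, which lives in backend.R on the R side and the RBackend class on the JVM side.

    # Toy sketch of a socket-based "callJMethod"-style bridge.
    # The framing and field layout are made up; SparkR's real protocol differs.
    import socket
    import struct

    def _recv_exact(sock, n):
        # Read exactly n bytes from the socket.
        data = b""
        while len(data) < n:
            chunk = sock.recv(n - len(data))
            if not chunk:
                raise ConnectionError("backend closed the connection")
            data += chunk
        return data

    def call_jvm_method(host, port, object_id, method, *args):
        # Serialize the target object id, method name, and arguments, send the
        # request as a length-prefixed frame, and read the reply the same way.
        parts = [object_id, method] + [str(a) for a in args]
        body = b"\x00".join(p.encode("utf-8") for p in parts)
        with socket.create_connection((host, port)) as sock:
            sock.sendall(struct.pack(">i", len(body)) + body)
            (reply_len,) = struct.unpack(">i", _recv_exact(sock, 4))
            return _recv_exact(sock, reply_len).decode("utf-8")

    # Hypothetical usage against a backend that holds a SparkContext object:
    # print(call_jvm_method("localhost", 12345, "sparkContext", "version"))

PySpark, by contrast, uses Py4J rather than a hand-rolled socket protocol, but the overall shape (serialize a call, ship it to the JVM, read back a result) is the same.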
Re: callJMethod?
Very nice explanation. Thx.

On Thu, Jul 9, 2015 at 4:41 PM, Shivaram Venkataraman wrote:
> callJMethod is a private R function that is defined in
> https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31
>
> callJMethod serializes the function name and arguments and sends them over a
> socket to the JVM. This is the socket-based R-to-JVM bridge described in the
> Spark Summit talk
> https://spark-summit.org/2015/events/sparkr-the-past-the-present-and-the-future/
>
> Thanks
> Shivaram
>
> On Thu, Jul 9, 2015 at 4:28 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>> I am reading the R code, e.g. RDD.R, DataFrame.R, etc. I see that
>> callJMethod is called repeatedly. Is callJMethod part of the Spark Java
>> API? Thx.
>>
>> Vasili
callJMethod?
Hello,

I am reading the R code, e.g. RDD.R, DataFrame.R, etc. I see that callJMethod is called repeatedly. Is callJMethod part of the Spark Java API? Thx.

Vasili
Spark and Haskell support
Hello,

1) I have been rereading the kind email responses to my Spark queries. Thx.
2) I have also been reading the "R" code:
   1) RDD.R
   2) DataFrame.R
   3) All of the following APIs => https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals
   4) Python ... https://spark.apache.org/docs/latest/programming-guide.html

Based on the above points, when I see calls to a function "callJMethod" in e.g. DataFrame.R, I have two questions:

- Is callJMethod calling into the JVM, or better yet the Spark Java API? Please answer in a precise, mathematical way without ambiguities; this will save many back-and-forth queries.
- If the above is true, how is the JVM "imported" into the DataFrame.R code?

I would appreciate it if your answers were interleaved with my questions, to avoid ambiguities.

Thx,

Vasili
Re: how can I write a language "wrapper"?
Shivaram,

Vis-a-vis Haskell support, I am reading DataFrame.R, SparkRBackend*, context.R, et al. Am I headed in the correct direction? Yes or no, please give more guidance. Thank you.

Kind regards,

Vasili

On Tue, Jun 23, 2015 at 1:46 PM, Shivaram Venkataraman wrote:
> Every language has its own quirks / features -- so I don't think there
> exists a document on how to go about doing this for a new language. The most
> related write-up I know of is the wiki page on PySpark internals,
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals, written
> by Josh Rosen -- it covers some of the issues, like closure capture,
> serialization, and JVM communication, that you'll need to handle for a new
> language.
>
> Thanks
> Shivaram
>
> On Tue, Jun 23, 2015 at 1:35 PM, Vasili I. Galchin wrote:
>>
>> Hello,
>>
>> I want to add language support for another language (other than Scala,
>> Java, et al.). Where is the documentation that explains how to provide
>> support for a new language?
>>
>> Thank you,
>>
>> Vasili
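One of the issues the PySpark internals page mentions, closure capture, can be illustrated with a small Python sketch: the driver-side binding serializes a user function together with the free variables it closes over, so the function can be shipped to worker processes. PySpark does this with a vendored copy of cloudpickle; the sketch below uses the standalone cloudpickle package (a third-party dependency you would need installed) purely to show the idea.

    # Sketch of closure capture and serialization, one of the problems any new
    # language binding has to solve. Plain pickle cannot handle lambdas or
    # captured locals, which is why cloudpickle (or an equivalent) is needed.
    import cloudpickle

    def make_closure():
        threshold = 10  # captured free variable; must travel with the function
        return lambda x: x + threshold

    func = make_closure()

    # "Driver" side: serialize the function plus everything it closes over.
    payload = cloudpickle.dumps(func)

    # "Worker" side: deserialize and apply, as an executor would per partition.
    restored = cloudpickle.loads(payload)
    print(restored(5))  # -> 15

A Haskell binding would face the same question in its own terms: how to capture and ship user functions (or, alternatively, how to avoid shipping them by expressing operations declaratively, as the DataFrame API does).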
Re: R - Scala interface used in Spark?
Thx, Reynold!

Vasya

On Fri, Jun 26, 2015 at 7:03 PM, Reynold Xin wrote:
> Take a look at this for Python:
>
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
>
> On Fri, Jun 26, 2015 at 6:06 PM, Reynold Xin wrote:
>> Are you doing something for Haskell?
>>
>> On Fri, Jun 26, 2015 at 5:21 PM, Vasili I. Galchin wrote:
>>> How about Python?
>>>
>>> On Friday, June 26, 2015, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>>>> We don't use the rscala package in SparkR -- we have an in-built R-JVM
>>>> bridge that is customized to work with various deployment modes. You can
>>>> find more details in my Spark Summit 2015 talk.
>>>>
>>>> Thanks
>>>> Shivaram
>>>>
>>>> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>>>>> A friend sent the below:
>>>>>
>>>>> http://cran.r-project.org/web/packages/rscala/index.html
>>>>>
>>>>> Is this the "glue" between R and Scala that is used in Spark?
>>>>>
>>>>> Vasili
Re: R - Scala interface used in Spark?
How about Python?

On Friday, June 26, 2015, Shivaram Venkataraman wrote:
> We don't use the rscala package in SparkR -- we have an in-built R-JVM
> bridge that is customized to work with various deployment modes. You can
> find more details in my Spark Summit 2015 talk.
>
> Thanks
> Shivaram
>
> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>> A friend sent the below:
>>
>> http://cran.r-project.org/web/packages/rscala/index.html
>>
>> Is this the "glue" between R and Scala that is used in Spark?
>>
>> Vasili
Re: R - Scala interface used in Spark?
URL, please! A URL, please, for your work.

On Friday, June 26, 2015, Shivaram Venkataraman wrote:
> We don't use the rscala package in SparkR -- we have an in-built R-JVM
> bridge that is customized to work with various deployment modes. You can
> find more details in my Spark Summit 2015 talk.
>
> Thanks
> Shivaram
>
> On Fri, Jun 26, 2015 at 3:19 PM, Vasili I. Galchin wrote:
>> A friend sent the below:
>>
>> http://cran.r-project.org/web/packages/rscala/index.html
>>
>> Is this the "glue" between R and Scala that is used in Spark?
>>
>> Vasili
R - Scala interface used in Spark?
A friend sent the below:

http://cran.r-project.org/web/packages/rscala/index.html

Is this the "glue" between R and Scala that is used in Spark?

Vasili
Re: how can I write a language "wrapper"?
Matei,

Last night I downloaded the Spark bundle. In order to save me time, can you give me the name of the SparkR example and tell me where it is in the Spark tree?

Thanks,

Bill

On Tuesday, June 23, 2015, Matei Zaharia wrote:
> Just FYI, it would be easiest to follow SparkR's example and add the
> DataFrame API first. Other APIs will be designed to work on DataFrames (most
> notably machine learning pipelines), and the surface of this API is much
> smaller than that of the RDD API. This API will also give you great
> performance as we continue to optimize Spark SQL.
>
> Matei
>
> On Jun 23, 2015, at 1:46 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>
> Every language has its own quirks / features -- so I don't think there
> exists a document on how to go about doing this for a new language. The most
> related write-up I know of is the wiki page on PySpark internals,
> https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals, written
> by Josh Rosen -- it covers some of the issues, like closure capture,
> serialization, and JVM communication, that you'll need to handle for a new
> language.
>
> Thanks
> Shivaram
>
> On Tue, Jun 23, 2015 at 1:35 PM, Vasili I. Galchin wrote:
>> Hello,
>>
>> I want to add language support for another language (other than Scala,
>> Java, et al.). Where is the documentation that explains how to provide
>> support for a new language?
>>
>> Thank you,
>>
>> Vasili
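To give a sense of what Matei means by the smaller DataFrame surface, here is roughly what that surface looks like from PySpark (Spark 1.4-era API). Each call maps onto a single JVM-side DataFrame method, so a wrapper that follows SparkR's example mostly just forwards these calls over its bridge; it does not have to ship user closures the way the RDD API does. The application name is made up, and the JSON path assumes you run from the Spark source tree.

    # Rough illustration of the DataFrame surface a new language binding would
    # target first (PySpark, Spark 1.4-era API).
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext("local[*]", "dataframe-surface-sketch")
    sqlContext = SQLContext(sc)

    df = sqlContext.read.json("examples/src/main/resources/people.json")

    # A handful of methods (select, filter, groupBy, agg, collect) covers most
    # of what the binding needs to forward to the JVM.
    adults = df.filter(df.age >= 18).select("name", "age")
    counts = adults.groupBy("age").count()
    print(counts.collect())

    sc.stop()

Because the operations are declarative column expressions rather than arbitrary host-language functions, Spark SQL can optimize them regardless of which language the wrapper is written in.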
how can I write a language "wrapper"?
Hello,

I want to add language support for another language (other than Scala, Java, et al.). Where is the documentation that explains how to provide support for a new language?

Thank you,

Vasili