[ 
https://issues.apache.org/jira/browse/SPARK-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711973#comment-14711973
 ] 

Shivaram Venkataraman commented on SPARK-7264:
----------------------------------------------

Sorry for the delay in getting back on this, and thanks everybody for your 
comments. A number of issues are intermingled in the original design doc, so 
I have created two separate, smaller design docs that we can hopefully 
iterate on to build things out.

In this JIRA, I'd like to focus on running parallel R functions on Spark to 
provide functionality similar to `snow`, `doParallel`, etc. I've created a new 
document detailing the syntax and other aspects of this at 
https://docs.google.com/document/d/1oegI3OjmK_a-ME4m7sdL4ZlzY7wkXzfaX69GqQqK0VI/edit#

In https://issues.apache.org/jira/browse/SPARK-6817, I have posted a design doc 
that outlines running user-defined R functions on Spark DataFrames. Let's 
continue the discussion about that API in SPARK-6817.

> SparkR API for parallel functions
> ---------------------------------
>
>                 Key: SPARK-7264
>                 URL: https://issues.apache.org/jira/browse/SPARK-7264
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR
>            Reporter: Shivaram Venkataraman
>
> This is a JIRA to discuss design proposals for enabling parallel R 
> computation in SparkR without exposing the entire RDD API. 
> The rationale for this is that the RDD API has a number of low-level 
> functions and we would like to expose a more lightweight API that is both 
> friendly to R users and easy to maintain.
> http://goo.gl/GLHKZI has a first-cut design doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

