[ 
https://issues.apache.org/jira/browse/SPARK-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340355#comment-14340355
 ] 

Hari Sekhon edited comment on SPARK-5654 at 2/27/15 4:42 PM:
-------------------------------------------------------------

Ok, replace the word "packaging" with upstream integration and support, similar 
to HCatalog going into Hive, because it makes sense. This way it's standardized 
across all platforms, rather than left to the whim of a particular vendor's 
packaging strategy to bolt it on for you, or DIY. I agree with Matei and Jason 
that it seems like a logical extension of the major language support that makes 
Spark so accessible. A lot of people know R and feel more comfortable sticking 
with RStudio; this would surely benefit the Apache Spark project's popularity 
and accessibility even more, and help Databricks etc.


was (Author: harisekhon):
Ok, replace the word "packaging" with upstream integration and support, similar 
to HCatalog going into Hive, because it makes sense. This way it's standardized 
across all platforms, rather than left to the whim of a particular vendor's 
packaging strategy to bolt it on for you, or DIY. I agree with Matei and Jason 
that it seems like a logical extension of the major language support that makes 
Spark so accessible. A lot of people know R and feel more comfortable sticking 
with RStudio; this would surely benefit the Apache Spark project.

> Integrate SparkR into Apache Spark
> ----------------------------------
>
>                 Key: SPARK-5654
>                 URL: https://issues.apache.org/jira/browse/SPARK-5654
>             Project: Spark
>          Issue Type: New Feature
>          Components: Project Infra
>            Reporter: Shivaram Venkataraman
>
> The SparkR project [1] provides a lightweight frontend for launching Spark 
> jobs from R. The project was started at the AMPLab around a year ago and has 
> been incubated as its own project to make sure it can be easily merged into 
> upstream Spark, i.e. it does not introduce any external dependencies. SparkR's 
> goals are similar to PySpark's, and it shares a similar design pattern, as 
> described in our meetup talk [2] and Spark Summit presentation [3].
> Integrating SparkR into the Apache project will enable R users to use Spark 
> out of the box, and given R's large user base, it will help the Spark project 
> reach more users. Additionally, work-in-progress features like R integration 
> with ML Pipelines and DataFrames can be better achieved by developing in a 
> unified code base.
> SparkR is available under the Apache 2.0 License and does not have any 
> external dependencies beyond requiring users to have R and Java installed on 
> their machines. SparkR's developers come from many organizations, including 
> UC Berkeley, Alteryx, and Intel, and we will support future development and 
> maintenance after the integration.
> [1] https://github.com/amplab-extras/SparkR-pkg
> [2] http://files.meetup.com/3138542/SparkR-meetup.pdf
> [3] http://spark-summit.org/2014/talk/sparkr-interactive-r-programs-at-scale-2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
