[ 
https://issues.apache.org/jira/browse/SPARK-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340026#comment-14340026
 ] 

Hari Sekhon commented on SPARK-5654:
------------------------------------

SparkR absolutely must go into mainline and be shipped in core Spark by all 
the vendors. Having to deal with it separately, compile it ourselves (serial id 
mismatches, recompiling to align versions) and go without support because it's 
an add-on are all major barriers to enterprise adoption, and they hurt Spark's 
offering too.

The stakeholders at my current banking client are literally crying out for 
SparkR over and over, and when we tell them to go use PySpark instead they 
still insist that R is too important a language to them.

Also, some vendors that want to build on Spark would benefit from SparkR to 
replace existing product workflows where standard R integration is currently 
used (e.g. Alteryx).

> Integrate SparkR into Apache Spark
> ----------------------------------
>
>                 Key: SPARK-5654
>                 URL: https://issues.apache.org/jira/browse/SPARK-5654
>             Project: Spark
>          Issue Type: New Feature
>          Components: Project Infra
>            Reporter: Shivaram Venkataraman
>
> The SparkR project [1] provides a light-weight frontend to launch Spark jobs 
> from R. The project was started at the AMPLab around a year ago and has been 
> incubated as its own project to make sure it can be easily merged into 
> upstream Spark, i.e. without introducing any external dependencies. SparkR’s 
> goals are similar to PySpark's and it shares a similar design pattern, as 
> described in our meetup talk [2] and Spark Summit presentation [3].
> Integrating SparkR into the Apache project will enable R users to use Spark 
> out of the box and, given R’s large user base, will help the Spark project 
> reach more users. Additionally, work-in-progress features like providing R 
> integration with ML Pipelines and DataFrames can be better achieved by 
> development in a unified code base.
> SparkR is available under the Apache 2.0 License and does not have any 
> external dependencies other than requiring users to have R and Java installed 
> on their machines. SparkR’s developers come from many organizations, 
> including UC Berkeley, Alteryx and Intel, and we will support future 
> development and maintenance after the integration.
> [1] https://github.com/amplab-extras/SparkR-pkg
> [2] http://files.meetup.com/3138542/SparkR-meetup.pdf
> [3] http://spark-summit.org/2014/talk/sparkr-interactive-r-programs-at-scale-2



