I would guess that the technology behind Azure R Server is Revolution
Enterprise DistributedR/ScaleR. I don't know the details, but see the
statement in the "Step 6. Install R packages" section of the given
documentation page. However, if you need to install R packages on the worker
nodes
Thanks for the link, Sun. I believe running external scripts such as R code
on DataFrames is a much-needed facility, for example for algorithms that are
not available in MLlib; invoking them from an R script would definitely be a
powerful feature when your app is Scala/Python based, you don't
Hi Gilad,
You can try the dapply() and gapply() functions in SparkR in Spark 2.0. Yes,
R is required to be installed on each worker node.
However, if your Spark application is Scala/Java based, running R code on
DataFrames is not supported for now. There is a closed JIRA
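For illustration, here is a minimal sketch of what dapply()/gapply() usage
looks like in SparkR 2.0; the dataset, column names, and schemas below are
placeholders, not anything from this thread:

library(SparkR)
sparkR.session()

# 'faithful' is a built-in R dataset, used here only for illustration.
df <- as.DataFrame(faithful)

# dapply(): apply an R function to each partition of a SparkDataFrame.
# The output schema must be declared up front.
schema <- structType(structField("eruptions", "double"),
                     structField("waiting", "double"))
res <- dapply(df, function(part) {
  # 'part' is a plain R data.frame holding one partition
  part$waiting <- part$waiting * 60
  part
}, schema)
head(collect(res))

# gapply(): same idea, but the function runs once per group of rows.
agg <- gapply(df, "waiting", function(key, x) {
  data.frame(key, max_eruptions = max(x$eruptions))
}, structType(structField("waiting", "double"),
              structField("max_eruptions", "double")))

Note that both run the R function inside an R process on the executors,
which is why R has to be installed on each worker node.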
It looks like it. "DataFrame UDFs in R" is resolved in Spark 2.0:
https://issues.apache.org/jira/browse/SPARK-6817
Here's some of the code:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
/**
* A function wrapper
Here we (or certainly I) are not talking about R Server, but plain vanilla
R as used with Spark and SparkR. Currently, SparkR doesn't distribute R
code at all (it used to, sort of), so I'm wondering if that is changing
back.
On Wed, Jun 29, 2016 at 10:53 PM, John Aherne wrote:
I don't think R Server requires R on the executor nodes. I originally set
up a SparkR cluster for our data scientist on Azure, which required that I
install R on each node, but for the R Server setup there is an extra edge
node with R Server that they connect to. From what little research I was
Oh, interesting: does this really mean the return of distributing R
code from the driver to executors and running it remotely, or do I
misunderstand? That would require having R on the executor nodes like
it used to?
On Wed, Jun 29, 2016 at 5:53 PM, Xinh Huynh wrote:
> There is some new SparkR functionality coming in Spark 2.0, such as "dapply".
Still, you need SparkR.
> On 29 Jun 2016, at 19:14, John Aherne wrote:
>
> Microsoft Azure has an option to create a Spark cluster with R Server. MS
> bought RevoScale (I think that was the name) and just recently deployed it.
Microsoft Azure has an option to create a Spark cluster with R Server. MS
bought RevoScale (I think that was the name) and just recently deployed it.
On Wed, Jun 29, 2016 at 10:53 AM, Xinh Huynh wrote:
> There is some new SparkR functionality coming in Spark 2.0, such as
> "dapply".
There is some new SparkR functionality coming in Spark 2.0, such as
"dapply". You could use SparkR to load a Parquet file and then run "dapply"
to apply a function to each partition of a DataFrame.
Info about loading Parquet file:
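A minimal sketch of loading a Parquet file into a SparkDataFrame with
SparkR 2.0 (the path here is hypothetical):

library(SparkR)
sparkR.session()

# Path is made up; any Parquet file readable by Spark works.
df <- read.df("/data/events.parquet", source = "parquet")
printSchema(df)
head(df)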
Try Spark's pipe() on RDDs: you can invoke the R script from pipe, pushing
the stuff you want processed to the R script's stdin
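The R side of that pipe approach is just a script that reads records
line-by-line from stdin and writes results to stdout. A rough sketch,
assuming the script is invoked with something like
rdd.pipe("Rscript myscript.R") (the script name and transformation are
made up):

#!/usr/bin/env Rscript
# Each input record arrives as one line on stdin; every line written
# to stdout becomes one output record of the piped RDD.
con <- file("stdin", open = "r")
while (length(line <- readLines(con, n = 1)) > 0) {
  writeLines(toupper(line))  # placeholder per-record transformation
}
close(con)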
On Wed, Jun 29, 2016 at 7:10 PM, Gilad Landau wrote:
> Hello,
>
> I want to use R code as part of a Spark application (the same way I would do
Hello,
I want to use R code as part of a Spark application (the same way I would with
Scala/Python). I want to be able to run R syntax as a map function on a big
Spark DataFrame loaded from a Parquet file.
Is this even possible, or is the only way to use R as part of RStudio
orchestration?