Hi Richard,

> Would it be possible to access the session API from within ROSE,
> to get for example the images that are generated by R / openCPU

Technically it would be possible, although doing so could add significant 
runtime cost per task, mainly from extracting image data from the R session, 
serializing it, and then moving it across the cluster for each and every image.

From a design perspective, ROSE was intended for use within Spark-scale 
applications where R object data is the primary task output, that is, output 
in a format that can be rapidly serialized and easily processed. Are there 
real-world use cases where Spark-scale applications capable of generating 10k, 
100k, or even millions of image files would actually need to capture and store 
those images? If so, how, practically speaking, would these images ever be 
used? I'm just not sure. Maybe you could describe your own use case to provide 
some insight?

> and the logging to stdout that is logged by R?

If you are referring to the R console output (generated within the R session 
during the execution of an OCPUTask), then this data could certainly be 
captured, optionally, and returned on an OCPUResult. Again, can you provide any 
details on how you might use this console output in a real-world application?

As an aside, for simple standalone Spark applications that will only ever run 
on a single host (no cluster), you could consider using an alternative library 
called fluent-r. This library is also available under my GitHub account, 
[see here](https://github.com/onetapbeyond/fluent-r). The fluent-r library 
already has support for retrieving R objects, R console output, and R graphics 
device images/plots. However, it is not as lightweight as ROSE and it is not 
designed to work in a clustered environment. ROSE, on the other hand, is 
designed for scale.

David

"All that is gold does not glitter, Not all those who wander are lost."



-------- Original Message --------
Subject: Re: ROSE: Spark + R on the JVM.
Local Time: January 12 2016 6:56 pm
UTC Time: January 12 2016 11:56 pm
From: rsiebel...@gmail.com
To: m...@vijaykiran.com
CC: cjno...@gmail.com,themarchoffo...@protonmail.com,user@spark.apache.org,d...@spark.apache.org



Hi,

this looks great and seems to be very usable.
Would it be possible to access the session API from within ROSE, to get for 
example the images that are generated by R / openCPU and the logging to stdout 
that is logged by R?

thanks in advance,
Richard



On Tue, Jan 12, 2016 at 10:16 PM, Vijay Kiran <m...@vijaykiran.com> wrote:

I think it would be this: https://github.com/onetapbeyond/opencpu-spark-executor

> On 12 Jan 2016, at 18:32, Corey Nolet <cjno...@gmail.com> wrote:
>
> David,
>
> Thank you very much for announcing this! It looks like it could be very 
> useful. Would you mind providing a link to the github?
>
> On Tue, Jan 12, 2016 at 10:03 AM, David <themarchoffo...@protonmail.com> 
> wrote:
> Hi all,
>
> I'd like to share news of the recent release of a new Spark package, ROSE.
>
> ROSE is a Scala library offering access to the full scientific computing 
> power of the R programming language to Apache Spark batch and streaming 
> applications on the JVM. Where Apache SparkR lets data scientists use Spark 
> from R, ROSE is designed to let Scala and Java developers use R from Spark.
>
> The project is available and documented on GitHub and I would encourage you 
> to take a look. Any feedback, questions etc very welcome.
>
> David
>
> "All that is gold does not glitter, Not all those who wander are lost."
>




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
