You can create standalone jobs in SparkR: they are just R files that you run with the sparkR script. The commands in the file are sent to a Spark cluster, and the examples in the SparkR repository (https://github.com/amplab-extras/SparkR-pkg#examples-unit-tests) are in fact standalone jobs.
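For reference, a minimal sketch of what such a standalone R file could look like (untested; the file name, app name and master URL are placeholders, and the calls follow the API shown in the SparkR-pkg README):

# standalone_example.R -- minimal standalone SparkR job sketch
library(SparkR)

# Connect to a cluster; replace "local" with your Spark master URL.
sc <- sparkR.init(master = "local", appName = "StandaloneExample")

# Distribute a small vector, square each element on the workers,
# and bring the results back to the driver.
rdd <- parallelize(sc, 1:10)
squares <- lapply(rdd, function(x) { x * x })
print(collect(squares))

You would then run it with something like ./sparkR standalone_example.R from the SparkR-pkg checkout.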
However, I don't think that will completely solve your use case of combining Streaming + R. We don't yet have a way to call R functions from Spark's Java or Scala API. So right now, one thing you can try is to save data from Spark Streaming to HDFS and then run a SparkR job which reads in the files (a rough sketch of what that could look like is below the quoted message).

Regarding the other idea of calling R from Scala -- it might be possible to do that in your code if the classpath etc. is set up correctly. I haven't tried it out, but do let us know if you get it to work.

Thanks
Shivaram

On Mon, Apr 7, 2014 at 2:21 PM, pawan kumar <pkv...@gmail.com> wrote:
> Hi,
>
> Is it possible to create a standalone job in Scala using SparkR? If
> possible, can you provide me with information on the setup process
> (like the dependencies in SBT and where to include the JAR files)?
>
> This is my use case:
>
> 1. I have a Spark Streaming standalone job running on my local machine
> which streams Twitter data.
> 2. I have an R script which performs sentiment analysis.
>
> I am looking for an optimal way to combine these two operations into a
> single job and run it using the "sbt run" command.
>
> I came across this document which talks about embedding R into Scala
> (http://dahl.byu.edu/software/jvmr/dahl-payne-uppalapati-2013.pdf) but was
> not sure if that would work well within the Spark context.
>
> Thanks,
> Pawan Venugopal
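Here is a rough, untested sketch of the SparkR half of the two-step workaround: read the tweets that the streaming job saved to HDFS and apply an R function to each record. The HDFS path and score_sentiment() are placeholders for your own output directory and sentiment-analysis code.

# read_tweets.R -- sketch of the SparkR side of the workaround
library(SparkR)

sc <- sparkR.init(master = "local", appName = "TweetSentiment")

# Each line of the saved files is assumed to hold one tweet's text.
tweets <- textFile(sc, "hdfs://localhost:9000/user/pawan/tweets/")

# Placeholder: your sentiment-analysis function, defined in (or sourced
# from) your R script. It is shipped to the workers as part of the closure.
score_sentiment <- function(text) {
  nchar(text)  # replace with a real sentiment score for the tweet
}

scores <- lapply(tweets, score_sentiment)
print(take(scores, 10))

On the Scala side, the streaming job would just need to write its output as plain text (e.g. with saveAsTextFiles on the DStream) so the SparkR job has files it can pick up.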