Ok, thanks. It seems that --jars is not behaving as expected: I get a ClassNotFoundException for even the simplest object from my library. In any case, I need to do at least a filter transformation before collecting the HBase RDD into R, so I will have to go the route of using the Scala spark-shell to transform, collect, and save the data to the local filesystem, and then visualise the file with R, until custom RDD transformations are exposed in the SparkR API.
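Roughly, the R end of that workaround would look like the sketch below, assuming the Scala spark-shell has already written the filtered rows out as CSV (the path and column names are hypothetical placeholders):

    # Read the file that the Scala spark-shell saved to the local filesystem.
    # Path and column names are hypothetical placeholders.
    scans <- read.csv("/tmp/hbase-filtered/part-00000",
                      header = FALSE,
                      col.names = c("rowkey", "value"))

    # Simple visualisation of the filtered values.
    hist(scans$value, main = "Filtered HBase scan values")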
On 13 July 2015 at 10:27, Sun, Rui <rui....@intel.com> wrote:
> Hi, Michal,
>
> SparkR comes with a JVM backend that supports Java object instantiation and
> calling Java instance and static methods from the R side. As defined in
> https://github.com/apache/spark/blob/master/R/pkg/R/backend.R:
> newJObject() creates an instance of a Java class;
> callJMethod() calls an instance method of a Java object;
> callJStatic() calls a static method of a Java class.
>
> If the task is as simple as data visualization, you can use the above
> low-level functions to create an instance of your HBase RDD on the JVM side,
> collect the data to the R side, and visualize it.
>
> However, if you want to do HBase RDD transformations and HBase table
> updates, things are quite complex right now. SparkR supports the majority of
> the RDD API (though it is not exposed publicly in the 1.4 release), allowing
> transformation functions written in R, but it currently only supports RDD
> sources from text files and SparkR DataFrames, so your HBase RDDs can't be
> used by the SparkR RDD API for further processing.
>
> You can use --jars to include your Scala library so that it can be accessed
> by the JVM backend.
>
> ________________________________
> From: Michal Haris [michal.ha...@visualdna.com]
> Sent: Sunday, July 12, 2015 6:39 PM
> To: user@spark.apache.org
> Subject: Including additional scala libraries in sparkR
>
> I have a Spark program with a custom optimised RDD for HBase scans and
> updates. I have a small library of objects in Scala to support efficient
> serialisation, partitioning, etc. I would like to use R as an analysis and
> visualisation front-end. I tried to use rJava (i.e. not using SparkR) and
> got as far as initialising the Spark context, but I encountered problems
> with HBase dependencies (HBaseConfiguration: Unsupported major.minor
> version 51.0), so I tried SparkR. However, I can't figure out how to make
> my custom Scala classes available to SparkR other than re-implementing them
> in R. Is there a way to include and invoke additional Scala objects and
> RDDs within a SparkR shell/job? Something similar to additional jars and an
> init script in a normal spark-submit/shell.
>
> --
> Michal Haris
> Technical Architect
> direct line: +44 (0) 207 749 0229
> www.visualdna.com | t: +44 (0) 207 734 7033
> 31 Old Nichol Street
> London
> E2 7HR

--
Michal Haris
Technical Architect
direct line: +44 (0) 207 749 0229
www.visualdna.com | t: +44 (0) 207 734 7033
31 Old Nichol Street
London
E2 7HR
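For reference, a minimal sketch of the low-level backend calls described above, as used from the sparkR shell. The class, method, and jar names are hypothetical placeholders for your own library; the functions are internal in the 1.4 release, hence the SparkR::: prefix:

    # Start the shell with the custom jar on the classpath, e.g.:
    #   sparkR --jars my-hbase-lib.jar     (hypothetical jar name)

    # callJStatic() invokes a static method of a Java/Scala class. Here it
    # calls a hypothetical factory method that builds the HBase RDD on the
    # JVM side; in the thread above, this is the kind of call that fails
    # with ClassNotFoundException when the jar is passed via --jars.
    rdd <- SparkR:::callJStatic("com.example.hbase.HBaseScans", "scan", "my_table")

    # callJMethod() invokes an instance method on the returned Java object
    # reference, assuming the result is of a type the backend can serialize
    # back to the R side.
    rows <- SparkR:::callJMethod(rdd, "collect")

    # newJObject() instantiates a Java class directly.
    arr <- SparkR:::newJObject("java.util.ArrayList")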