There was a fix for `--jars` that went into 1.4.1:
https://github.com/apache/spark/commit/2579948bf5d89ac2d822ace605a6a4afce5258d6
Shivaram

On Tue, Jul 14, 2015 at 4:18 AM, Sun, Rui <rui....@intel.com> wrote:

> Could you give more details about the misbehavior of --jars for SparkR?
> Maybe it's a bug.
>
> ________________________________
> From: Michal Haris [michal.ha...@visualdna.com]
> Sent: Tuesday, July 14, 2015 5:31 PM
> To: Sun, Rui
> Cc: Michal Haris; user@spark.apache.org
> Subject: Re: Including additional scala libraries in sparkR
>
> OK, thanks. It seems that --jars is not behaving as expected: I'm getting
> "class not found" for even the simplest object from my library. In any
> case, I have to do at least a filter transformation before collecting the
> HBase RDD into R, so I will have to go the route of using the Scala spark
> shell to transform, collect, and save to the local filesystem, and then
> visualise the file with R until custom RDD transformations are exposed in
> the SparkR API.
>
> On 13 July 2015 at 10:27, Sun, Rui <rui....@intel.com> wrote:
>
> Hi, Michal,
>
> SparkR comes with a JVM backend that supports Java object instantiation
> and calling Java instance and static methods from the R side. As defined
> in https://github.com/apache/spark/blob/master/R/pkg/R/backend.R:
> newJObject() creates an instance of a Java class;
> callJMethod() calls an instance method of a Java object;
> callJStatic() calls a static method of a Java class.
>
> If the task is as simple as data visualization, you can use these
> low-level functions to create an instance of your HBase RDD on the JVM
> side, collect the data to the R side, and visualize it.
>
> However, if you want to do HBase RDD transformations and HBase table
> updates, things are quite complex at the moment. SparkR supports the
> majority of the RDD API (though it is not exposed publicly in the 1.4
> release), allowing transformation functions written in R, but currently it
> only supports RDDs sourced from text files and SparkR DataFrames, so your
> HBase RDDs can't be used by the SparkR RDD API for further processing.
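The low-level backend calls described above could be combined roughly as follows. This is a minimal sketch, not tested against the thread's setup: the class name `com.example.hbase.HBaseScan` and its methods are placeholders for whatever your own jar exposes, and the `:::` operator is needed because these functions are internal (not exported) in the 1.4 release.

```r
library(SparkR)

# Start a SparkR context with your Scala library on the classpath
# (equivalent to passing --jars on the command line).
sc <- sparkR.init(appName = "hbase-viz", sparkJars = "my-hbase-lib.jar")

# Hypothetical: call a static factory method on a Scala object in your jar
# to build the HBase RDD on the JVM side.
jrdd <- SparkR:::callJStatic("com.example.hbase.HBaseScan", "scanTable",
                             sc, "my_table")

# Hypothetical: invoke an instance method on the returned Java object,
# pulling the data to the R side for visualization.
localData <- SparkR:::callJMethod(jrdd, "collect")
```

Any filtering would still have to happen on the JVM side (e.g. inside `scanTable`), since, as noted above, R transformation functions can't run against an arbitrary Java RDD in 1.4.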
> You can use --jars to include your Scala library so that it can be
> accessed by the JVM backend.
>
> ________________________________
> From: Michal Haris [michal.ha...@visualdna.com]
> Sent: Sunday, July 12, 2015 6:39 PM
> To: user@spark.apache.org
> Subject: Including additional scala libraries in sparkR
>
> I have a Spark program with a custom optimised RDD for HBase scans and
> updates, and a small library of Scala objects to support efficient
> serialisation, partitioning, etc. I would like to use R as an analysis and
> visualisation front-end. I tried rJava (i.e. not using SparkR) and got as
> far as initialising the Spark context, but I ran into problems with HBase
> dependencies (HBaseConfiguration: Unsupported major.minor version 51.0),
> so I tried SparkR. However, I can't figure out how to make my custom Scala
> classes available to SparkR other than re-implementing them in R. Is there
> a way to include and invoke additional Scala objects and RDDs within a
> SparkR shell/job? Something similar to the additional jars and init script
> in a normal spark submit/shell.
>
> --
> Michal Haris
> Technical Architect
> direct line: +44 (0) 207 749 0229
> www.visualdna.com | t: +44 (0) 207 734 7033
> 31 Old Nichol Street
> London
> E2 7HR
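For reference, the --jars launch discussed in the thread would look roughly like this. The jar path is a placeholder, and per the commit linked at the top, Spark 1.4.1 or later is needed for --jars to work correctly with SparkR:

```shell
# Sketch: launch the SparkR shell with a custom Scala library on the
# classpath of both the driver and the executors.
./bin/sparkR --jars /path/to/my-hbase-lib.jar
```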