RE: Add jar files on classpath when submitting tasks to Spark

2016-11-02 Thread Jan Botorek
same path and the folder on all workers contains the same .jar files. Thank you for your help, Regards, Jan

RE: Add jar files on classpath when submitting tasks to Spark

2016-11-01 Thread Jan Botorek
Are you submitting your job through spark-submit? -- Dr Mich Talebzadeh

RE: Add jar files on classpath when submitting tasks to Spark

2016-11-01 Thread Jan Botorek
On 1 November 2016 at 13:04, Vinod Mangipudi <vinod...@gmail.com> wrote: unsubscribe

RE: Add jar files on classpath when submitting tasks to Spark

2016-11-01 Thread Jan Botorek
There are options to specify external jars in the form of --jars, --driver-class-path, etc., depending on the Spark version.
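A minimal spark-submit sketch illustrating those flags (the class name, jar names, and paths below are hypothetical):

    # --jars distributes the listed jars to driver and executors (comma-separated);
    # --driver-class-path prepends entries to the driver classpath (colon-separated on Linux)
    spark-submit \
      --class com.example.Main \
      --master local[*] \
      --jars /opt/libs/dep1.jar,/opt/libs/dep2.jar \
      --driver-class-path /opt/libs/dep1.jar:/opt/libs/dep2.jar \
      app.jar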

Add jar files on classpath when submitting tasks to Spark

2016-11-01 Thread Jan Botorek
Hello, I have a problem trying to make jar files available on the classpath when submitting a task to Spark. In my spark-defaults.conf file I have the configuration: spark.driver.extraClassPath = path/to/folder/with/jars. All jars in the folder are available in spark-shell. The problem is that the jars
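A minimal sketch of the relevant settings (paths are illustrative). Note that extraClassPath is a plain JVM classpath string, so a directory of jars needs the /* wildcard, and the executor side has its own property:

    # spark-defaults.conf
    spark.driver.extraClassPath    /path/to/folder/with/jars/*
    spark.executor.extraClassPath  /path/to/folder/with/jars/*

    # or equivalently, per submission:
    spark-submit \
      --driver-class-path "/path/to/folder/with/jars/*" \
      --conf spark.executor.extraClassPath="/path/to/folder/with/jars/*" \
      --class com.example.Main app.jar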

RE: Help needed in parsing JSon with nested structures

2016-10-31 Thread Jan Botorek
Hello, From my point of view, it would be more efficient and probably more "readable" if you just extracted the required data using some JSON parsing library (Gson, Jackson), constructed some global object (or pre-processed the data), and then began with the Spark operations. Jan
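A minimal Scala sketch of that approach using Jackson (the field names, the JSON layout, and the Event case class are hypothetical; spark is an existing SparkSession, as in spark-shell):

    import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
    import spark.implicits._

    // Hypothetical flattened record; the real fields depend on your JSON.
    case class Event(id: String, city: String)

    object Json {
      // One mapper per JVM; created lazily on each executor.
      lazy val mapper = new ObjectMapper()
    }

    def parse(line: String): Event = {
      val root: JsonNode = Json.mapper.readTree(line)
      // Pull only the required values out of the nested structure up front.
      Event(root.path("id").asText, root.path("payload").path("city").asText)
    }

    // Pre-process first, then start the Spark operations on flat records.
    val events = spark.read.textFile("events.json").map(parse)  // Dataset[Event]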

RE: No of partitions in a Dataframe

2016-10-27 Thread Jan Botorek
Hello, Nipun. In my opinion, converting the dataframe to an RDD wouldn't be a costly operation, since DataFrame (Dataset) operations are always executed as RDDs under the hood. I don't know which version of Spark you run, but I suppose you use 2.0. I would therefore go for:
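A minimal sketch of that approach, assuming an existing DataFrame/Dataset named df (Spark 2.0):

    // Number of partitions via the underlying RDD; getNumPartitions only
    // inspects the lineage and does not launch a Spark job.
    val numPartitions = df.rdd.getNumPartitions
    println(s"number of partitions: $numPartitions")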