Andy, thanks for the reply. If we download all the dependencies to a separate location and link them with the Spark job jar on the cluster, is that the best way to run a Spark job?
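For example, something like this (the /opt/spark-libs paths, the HBase jar versions and the thin non-assembly jar name are only placeholders, not our actual layout):

  # placeholders: dependency location, jar versions and the thin application jar name
  bin/spark-submit \
    --class hbase.spark.chetan.com.SparkHbaseJob \
    --jars /opt/spark-libs/hbase-client-1.2.4.jar,/opt/spark-libs/hbase-common-1.2.4.jar \
    /home/chetan/hbase-spark/SparkMSAPoc-1.0.jar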
Thanks.

On Fri, Dec 23, 2016 at 8:34 PM, Andy Dang <nam...@gmail.com> wrote:

> I used to use an uber jar in Spark 1.x because of classpath issues (we
> couldn't re-model our dependencies based on our code, and thus the cluster's
> runtime dependencies could be very different from running Spark directly in
> the IDE). We had to use the userClasspathFirst "hack" to work around this.
>
> With Spark 2, it's easier to replace dependencies (say, Guava) than before.
> We moved away from deploying a superjar and just pass the libraries as part
> of the Spark jars (we still can't use Guava v19 or later because Spark uses
> a deprecated method that's no longer available, but that's not a big issue
> for us).
>
> -------
> Regards,
> Andy
>
> On Fri, Dec 23, 2016 at 6:44 AM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>
>> Hello Spark Community,
>>
>> For Spark job creation I use sbt-assembly to build an uber ("super") jar
>> and then submit it with spark-submit.
>>
>> Example:
>>
>> bin/spark-submit --class hbase.spark.chetan.com.SparkHbaseJob /home/chetan/hbase-spark/SparkMSAPoc-assembly-1.0.jar
>>
>> But other folks have argued for a leaner (non-uber) jar. Can you please
>> explain the industry-standard best practice for this?
>>
>> Thanks,
>>
>> Chetan Khatri.
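P.S. For context, the assembly jar referenced above comes from a fairly standard sbt-assembly setup, roughly like the sketch below; the plugin version, Spark version and merge rules are illustrative, not our exact build:

  // project/plugins.sbt (plugin version is illustrative)
  addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

  // build.sbt
  name := "SparkMSAPoc"
  version := "1.0"
  scalaVersion := "2.11.8"

  // Spark itself is "provided": the cluster supplies it at runtime,
  // so it is not bundled into the uber jar.
  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "2.0.2" % "provided",
    "org.apache.spark" %% "spark-sql"  % "2.0.2" % "provided"
  )

  // Resolve duplicate files (e.g. META-INF entries) pulled in from multiple jars.
  assemblyMergeStrategy in assembly := {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case _                             => MergeStrategy.first
  }

Running "sbt assembly" with a setup like this produces (by default) target/scala-2.11/SparkMSAPoc-assembly-1.0.jar, which is then handed to spark-submit as shown in the quoted mail.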