Thank you, Sandy! I'll investigate use of the extraClassPath variable. Both 
options are helpful.

Thanks,

Matt

On Jun 17, 2015, at 8:01 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

Hi Matt,

If you place your jars on HDFS in a public location, YARN will cache them on 
each node after the first download.  You can also use the 
spark.executor.extraClassPath config to point to them.
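
As a rough illustration of the two options above, here is a minimal Python 
sketch that assembles a spark-submit command line. The helper name 
build_submit_command and every path, class name, and HDFS URL passed to it 
are hypothetical placeholders, not values from this thread. It assumes the 
jars were already copied to the same local directory on every node (for 
extraClassPath) or uploaded to a world-readable HDFS location (for --jars), 
and the right --master value depends on your Spark version.

```python
import shlex


def build_submit_command(app_jar, main_class, master="yarn",
                         local_jar_dir=None, hdfs_jars=()):
    """Assemble a spark-submit command line as a list of arguments.

    Illustrative helper only: it builds the argument list and does not
    run spark-submit or check that any path exists.
    """
    cmd = ["spark-submit", "--master", master, "--class", main_class]

    if local_jar_dir:
        # Option 1: jars pre-deployed on each node's local disk.
        # "dir/*" is the JVM classpath wildcard for all jars in a directory.
        classpath = local_jar_dir.rstrip("/") + "/*"
        cmd += ["--conf", "spark.executor.extraClassPath=" + classpath]

    if hdfs_jars:
        # Option 2: jars stored on HDFS, passed as a comma-separated list.
        cmd += ["--jars", ",".join(hdfs_jars)]

    # The application jar comes last, after all options.
    cmd.append(app_jar)
    return cmd


# Hypothetical values, for illustration only.
example = build_submit_command(
    "myapp.jar",
    "com.example.Main",
    local_jar_dir="/opt/myapp/lib/",
    hdfs_jars=["hdfs:///apps/myapp/lib/dep-a.jar",
               "hdfs:///apps/myapp/lib/dep-b.jar"],
)
print(" ".join(shlex.quote(arg) for arg in example))
```

Note that spark.executor.extraClassPath only affects executors; if the driver 
needs the same jars, spark.driver.extraClassPath is the corresponding setting.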

-Sandy

On Wed, Jun 17, 2015 at 4:47 PM, Sweeney, Matt <mswee...@fourv.com> wrote:
Hi folks,

I'm looking to deploy Spark on YARN and I have read through the docs 
(https://spark.apache.org/docs/latest/running-on-yarn.html). One question that 
I still have is whether there is an alternative means of including your own app 
jars, as opposed to the process in the "Adding Other Jars" section of the docs. 
The app jars and dependencies that I need to include are significant in size 
(100s of MBs), and I'd rather deploy them in advance onto the cluster nodes' 
disks so that I don't incur that network overhead for each spark-submit that is 
executed.

Thanks in advance for your help!

Matt
