I think we just need to update the docs; the current wording is a bit
unclear. At the time we worded it fairly sternly because we really
wanted people to use --jars when we deprecated SPARK_CLASSPATH. But
there are other types of deployments where there is a legitimate need
to augment the classpath of every executor.

I think it should probably say something more like

"Extra classpath entries to append to the classpath of executors. This
is sometimes used in deployment environments where dependencies of
Spark are present in a specific place on all nodes".

Kannan - if you want to submit a patch I can help review it.
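For reference, the spark-env.sh idea from the thread below could be sketched
roughly like this. The helper body and the jar path are hypothetical; a real
get_hbase_jars_for_cp would inspect the node's HBase installation and emit a
classpath for whichever version is present:

```shell
# Sketch only -- get_hbase_jars_for_cp and its path are made-up examples.
get_hbase_jars_for_cp() {
  # A real implementation would detect the installed HBase version here.
  echo "/opt/hbase/current/lib/*"
}

executor_extra_cp=$(get_hbase_jars_for_cp)
export executor_extra_cp
echo "$executor_extra_cp"   # prints /opt/hbase/current/lib/*
```

One caveat, as far as I can tell: spark-defaults.conf is read as a plain
properties file and does not expand shell variables, so the value would still
need to reach the config some other way, e.g. passing
--conf spark.executor.extraClassPath="$executor_extra_cp" to spark-submit.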

On Thu, Feb 26, 2015 at 8:24 PM, Kannan Rajah <kra...@maprtech.com> wrote:
> Thanks Marcelo. Do you think it would be useful to have
> spark.executor.extraClassPath pick up an environment variable that can be
> set from spark-env.sh? Here is an example.
>
> spark-env.sh
> ------------------
> executor_extra_cp=$(get_hbase_jars_for_cp)
> export executor_extra_cp
>
> spark-defaults.conf
> ---------------------
> spark.executor.extraClassPath = ${executor_extra_cp}
>
> This will let us add logic inside the get_hbase_jars_for_cp function to pick
> the right version of the hbase jars. There could be multiple versions
> installed on the node.
>
>
>
> --
> Kannan
>
> On Thu, Feb 26, 2015 at 6:08 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> On Thu, Feb 26, 2015 at 5:12 PM, Kannan Rajah <kra...@maprtech.com> wrote:
>> > Also, I would like to know if there is a localization overhead when we
>> > use spark.executor.extraClassPath. Again, in the case of hbase, these
>> > jars would typically be available on all nodes, so there is no need to
>> > localize them from the node where the job was submitted. I am wondering
>> > whether the SPARK_CLASSPATH approach would skip localization; that would
>> > be an added benefit.
>> > Please clarify.
>>
>> spark.executor.extraClassPath doesn't localize anything. It just
>> prepends those classpath entries to the usual classpath used to launch
>> the executor. There's no copying of files or anything, so they're
>> expected to exist on the nodes.
>>
>> It's basically exactly the same as SPARK_CLASSPATH, but broken down into
>> two options (one for the executors, and one for the driver).
>>
>> --
>> Marcelo
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
