Yes, I think that is the case. I haven't tried that before, but it should
work.
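
For example, if you launch the parent application directly with java,
something like this (the jar paths and main class are just illustrative)
should put the Alluxio client on the JVM's classpath:

  java -cp "myapp.jar:/path/to/alluxio-client.jar" com.example.MyApp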

Thanks,
Gene

On Fri, Apr 13, 2018 at 11:32 AM, Jason Boorn <jbo...@gmail.com> wrote:

> Hi Gene -
>
> Are you saying that I just need to figure out how to get the Alluxio jar
> into the classpath of my parent application?  If it shows up in the
> classpath then Spark will automatically know that it needs to use it when
> communicating with Alluxio?
>
> Apologies for going back-and-forth on this - I feel like my particular use
> case is clouding what is already a tricky issue.
>
> On Apr 13, 2018, at 2:26 PM, Gene Pang <gene.p...@gmail.com> wrote:
>
> Hi Jason,
>
> Alluxio does work with Spark in master=local mode. This is because both
> spark-submit and spark-shell have command-line options to set the classpath
> for the JVM that is being started.
>
> If you are not using spark-submit or spark-shell, you will have to figure
> out how to configure that JVM instance with the proper properties.
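>
> For example (the jar path is illustrative):
>
>   spark-shell --driver-class-path /path/to/alluxio-client.jar
>   spark-submit --driver-class-path /path/to/alluxio-client.jar myapp.jar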
>
> Thanks,
> Gene
>
> On Fri, Apr 13, 2018 at 10:47 AM, Jason Boorn <jbo...@gmail.com> wrote:
>
>> Ok thanks - I was basing my design on this:
>>
>> https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
>>
>> Wherein it says: "Once the SparkSession is instantiated, you can
>> configure Spark's runtime config properties."
>> Apparently the suite of runtime configs you can change does not
>> include the classpath.
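>>
>> For example, a runtime property like spark.sql.shuffle.partitions can be
>> changed on a live session, while the extraClassPath properties cannot:
>>
>>   // fine after the session exists: a runtime SQL property
>>   sparkSession.conf.set("spark.sql.shuffle.partitions", "8")
>>   // silently ineffective: the JVM's classpath was fixed at launch
>>   sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)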
>>
>> So the answer to my original question is basically this:
>>
>> When using local (pseudo-cluster) mode, there is no way to add external
>> jars to the Spark instance.  This means that Alluxio will not work with
>> Spark when Spark is run in master=local mode.
>>
>> Thanks again - often getting a definitive “no” is almost as good as a
>> yes.  Almost ;)
>>
>> On Apr 13, 2018, at 1:21 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> There are two things you're doing wrong here:
>>
>> On Thu, Apr 12, 2018 at 6:32 PM, jb44 <jbo...@gmail.com> wrote:
>>
>> Then I can add the Alluxio client library like so:
>>   sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>
>>
>> First: you can't modify the JVM configuration after the JVM has already
>> started. So this line does nothing, since Spark can't re-launch your
>> application with a new JVM.
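>>
>> The driver's classpath has to be set when the JVM is launched, e.g. via
>> spark-submit's --driver-class-path option (the jar path is illustrative):
>>
>>   spark-submit --driver-class-path /path/to/alluxio-client.jar myapp.jar
>>
>> or, if you start the JVM yourself, by putting the jar on java's -cp.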
>>
>>   sparkSession.conf.set("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>
>>
>> Second: there is a lot of configuration that you cannot set after the
>> application has already started. For example, this option will most
>> probably be ignored once the session is created, since the executors
>> will already have started.
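>>
>> If you want to set the executor classpath programmatically, it has to go
>> into the config before the session (and hence the executors) is created;
>> a rough sketch (note that in master=local mode there are no separate
>> executor JVMs, so this alone will not help there):
>>
>>   import org.apache.spark.sql.SparkSession
>>
>>   val spark = SparkSession.builder()
>>     // must be in the conf before the executors launch
>>     .config("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>     .getOrCreate()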
>>
>> I'm not sure what happens when you use dynamic allocation, but in
>> general these post-hoc config changes are not expected to take effect.
>>
>> The documentation could be clearer about this (especially about settings
>> that only apply at spark-submit time), but that's the gist of it.
>>
>>
>> --
>> Marcelo
