Hi,

I believe the HiveContext uses a different class loader, one that falls back to the system class loader when it can't find a class, rather than consulting the thread's context class loader. The system class loader only contains the classpath passed through --driver-class-path and spark.executor.extraClassPath. The jars in --jars are resolved after the JVM is already running, so they can't be added to the system class loader; instead they live in a separate context class loader, which the HiveContext doesn't use, hence the lost dependencies.
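To make that concrete, here is a minimal standalone sketch (not actual Spark source; the jar path and class name are placeholders) showing why a class reachable only through a context class loader is invisible to lookups that go through the system class loader:

    import java.net.{URL, URLClassLoader}

    object LoaderDemo {
      def main(args: Array[String]): Unit = {
        // Simulate --jars: a URLClassLoader created after JVM startup and
        // installed as the thread's context class loader. The jar path and
        // class name below are placeholders for illustration only.
        val extraJar = new URL("file:/tmp/elasticsearch-spark.jar")
        val contextLoader =
          new URLClassLoader(Array(extraJar), ClassLoader.getSystemClassLoader)
        Thread.currentThread().setContextClassLoader(contextLoader)

        val cls = "org.elasticsearch.spark.sql.DefaultSource"

        // Most of Spark resolves through the context class loader, so with a
        // real jar at that path this lookup would succeed:
        //   Class.forName(cls, true, Thread.currentThread().getContextClassLoader)

        // A lookup that only consults the system class loader fails, because
        // the jar was never on the JVM's original classpath. This mirrors the
        // behavior described above.
        try Class.forName(cls, true, ClassLoader.getSystemClassLoader)
        catch {
          case _: ClassNotFoundException =>
            println(s"system class loader cannot see $cls")
        }
      }
    }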
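For reference, the workaround Gen describes below would look roughly like this (the jar path is a placeholder):

    ./bin/spark-submit \
      --driver-class-path /path/to/elasticsearch-spark.jar \
      --conf spark.executor.extraClassPath=/path/to/elasticsearch-spark.jar \
      ...

Both settings are applied before the driver and executor JVMs start, so the jar ends up on the system classpath that the HiveContext's loader can see.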
I know what I wrote may be a little complicated, so please let me know if anything is unclear. HTH.

Best,
Burak

On Tue, Jul 14, 2015 at 11:15 PM, gen tang <gen.tan...@gmail.com> wrote:
> Hi,
>
> I ran into an interesting problem with the --jars option.
> Since I use a third-party dependency, elasticsearch-spark, I pass its jar
> with the following command:
>
> ./bin/spark-submit --jars path-to-dependencies ...
>
> It works well. However, if I use HiveContext.sql, Spark loses the
> dependencies that I passed. It seems that the execution of HiveContext
> overrides the configuration. (But if we check sparkContext._conf, the
> configuration is unchanged.)
>
> But if I pass the dependencies with --driver-class-path
> and spark.executor.extraClassPath, the problem disappears.
>
> Does anyone know why this interesting problem happens?
>
> Thanks a lot in advance for your help.
>
> Cheers,
> Gen