Hey there,

We believe we may have run into a class-loading bug w/ the Guava libraries
when trying to configure a different version of the Hive metastore. It
looks like another user ran into this too -- see the email on this list
with the subject "Troubles interacting with different version of Hive
metastore".

We figured out why it's happening and have a work-around, but we're not
sure whether this is a bug or just a known idiosyncrasy of Guava
dependencies.

We're running the latest version of Spark (1.5.1) and patched versions of
Hadoop 2.2.0 and Hive 1.0.0 -- old, we know :).

We set "spark.sql.hive.metastore.version" to "1.0.0" and
"spark.sql.hive.metastore.jars" to
"<path_to_hive>/lib/*:<output_of_hadoop_classpath_cmd>". When trying to
launch the spark-shell, the sqlContext would fail to initialize with:

java.lang.ClassNotFoundException: java.lang.NoClassDefFoundError:
*com/google/common/base/Predicate* when creating Hive client using
classpath: <all the jars>
Please make sure that jars for your version of hive and hadoop are included
in the paths passed to SQLConfEntry(key = spark.sql.hive.metastore.jars,
defaultValue=builtin, doc=...
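For reference, this is roughly how we launch the shell with those settings
(a sketch -- <path_to_hive> is a placeholder for our actual install path):

```shell
# Build the metastore classpath from the Hive lib dir plus hadoop classpath,
# then pass both metastore settings to spark-shell.
METASTORE_JARS="<path_to_hive>/lib/*:$(hadoop classpath)"
spark-shell \
  --conf spark.sql.hive.metastore.version=1.0.0 \
  --conf "spark.sql.hive.metastore.jars=${METASTORE_JARS}"
```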

We verified that the Guava libraries are in the huge list of included
jars, but the
org.apache.spark.sql.hive.client.IsolatedClientLoader.isSharedClass
method seems to assume that *all* "com.google" classes (excluding
"com.google.cloud") should be loaded from the base class loader. The
Spark libraries appear to have *some* "com.google.common.base" classes
shaded in, but not all of them.
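To illustrate the matching rule we mean, here's a tiny sketch (our
paraphrase of the prefix check, not the actual Spark code):

```shell
# Paraphrase of the "shared class" prefix check we observed: anything under
# com.google.* except com.google.cloud.* is taken from Spark's base class
# loader, even when the metastore classpath supplies its own Guava jar.
is_shared() {
  case "$1" in
    com.google.cloud.*) echo no ;;   # explicitly excluded
    com.google.*)       echo yes ;;  # loaded from the base class loader
    *)                  echo no ;;
  esac
}

is_shared com.google.common.base.Predicate  # -> yes
is_shared com.google.cloud.Something        # -> no
```

So the Predicate class is resolved against Spark's (partially shaded)
Guava rather than the Guava jar we pass in, which matches the failure we
see.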

I searched through existing JIRA tickets but didn't see anything relevant
to this. Let me know if this is a bug that should be added to JIRA.

Thanks!
Joey
