Re: Guava ClassLoading Issue When Using Different Hive Metastore Version

2015-11-05 Thread Marcelo Vanzin
On Thu, Nov 5, 2015 at 3:41 PM, Joey Paskhay wrote:
> We verified the Guava libraries are in the huge list of included jars,
> but we saw that the
> org.apache.spark.sql.hive.client.IsolatedClientLoader.isSharedClass method
> seems to assume that *all* "com.google" classes (excluding
> "com.google.cloud") should be loaded from the base class loader. The Spark
> libraries seem to have *some* "com.google.common.base" classes shaded in,
> but not all.

Yeah, it seems to me like HiveContext should not be trying to include
Guava in the shared list at all; the goal is to not have any Guava
classes show up in Spark's classpath. Unfortunately, that's currently
not possible, because some types are exposed in the Java API (the ones
that are not shaded).
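
One concrete example of such an exposed type, for readers following along:
Spark 1.x's Java API returns Guava's Optional directly, e.g. from
JavaPairRDD.leftOuterJoin, so that exact class has to stay visible to
application code. A minimal spark-shell sketch, assuming Guava is on the
classpath:

    import com.google.common.base.Optional

    // Guava's Optional (not java.util.Optional) -- the type that leaks
    // through Spark 1.x's public Java API, e.g. in leftOuterJoin results.
    val present: Optional[String] = Optional.of("x")
    val absent: Optional[String] = Optional.absent()
    println((present.isPresent, absent.isPresent)) // (true,false)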

Could you file a bug to track this?


-- 
Marcelo




Guava ClassLoading Issue When Using Different Hive Metastore Version

2015-11-05 Thread Joey Paskhay
Hey there,

We believe we may have run into a class loading bug w/ Guava libraries when
trying to configure a different version of the Hive metastore. Saw another
user ran into this too -- email on this list with subject "Troubles
interacting with different version of Hive metastore".

We figured out why it's happening and have a work-around, but not sure if
this is a bug or just a known idiosyncrasy w/ Guava dependencies.

We're running the latest version of Spark (1.5.1) and patched versions of
Hadoop 2.2.0 and Hive 1.0.0 -- old, we know :).

We set "spark.sql.hive.metastore.version" to "1.0.0" and
"spark.sql.hive.metastore.jars" to
"/lib/*:". When trying to
launch the spark-shell, the sqlContext would fail to initialize with:

    java.lang.ClassNotFoundException: java.lang.NoClassDefFoundError:
    com/google/common/base/Predicate when creating Hive client using
    classpath: 
    Please make sure that jars for your version of hive and hadoop are
    included in the paths passed to SQLConfEntry(key =
    spark.sql.hive.metastore.jars, defaultValue=builtin, doc=...
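
For reference, the relevant settings looked roughly like the following
spark-defaults.conf entries (the paths here are hypothetical stand-ins,
since our real values were elided above):

    # Hypothetical install locations -- substitute your own.
    spark.sql.hive.metastore.version  1.0.0
    spark.sql.hive.metastore.jars     /opt/hive-1.0.0/lib/*:/opt/hadoop-2.2.0/share/hadoop/common/*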

We verified the Guava libraries are in the huge list of included jars,
but we saw that the
org.apache.spark.sql.hive.client.IsolatedClientLoader.isSharedClass method
seems to assume that *all* "com.google" classes (excluding
"com.google.cloud") should be loaded from the base class loader. The Spark
libraries seem to have *some* "com.google.common.base" classes shaded in,
but not all.
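
For reference, the check in question looks roughly like the sketch below
(paraphrased from the Spark 1.5.x source, not verbatim):

    // Paraphrased sketch of IsolatedClientLoader.isSharedClass. "Shared"
    // classes are loaded from the base class loader instead of the
    // isolated Hive client class loader.
    def isSharedClass(name: String): Boolean = {
      val sharedPrefixes = Seq.empty[String] // user-configurable extras
      name.contains("slf4j") ||
      name.contains("log4j") ||
      name.startsWith("org.apache.spark.") ||
      name.startsWith("scala.") ||
      // The blanket match below is what we hit: every com.google.* class
      // except com.google.cloud.* is forced onto the base class loader.
      (name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
      sharedPrefixes.exists(name.startsWith)
    }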

I searched through existing JIRA tickets but didn't see anything relevant
to this. Let me know if this is a bug that should be added to JIRA.

Thanks!
Joey


Re: Guava ClassLoading Issue When Using Different Hive Metastore Version

2015-11-05 Thread Michael Armbrust
I would be in favor of limiting the scope here. The problem you might run
into is that FinalizableReferenceQueue uses the system class loader, and I
ran into weird issues if you don't share those classes across the boundary.
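
To make the "share those classes" point concrete: two class loaders that
do not delegate to a common parent yield distinct runtime classes, even
when loading the same class from the same jar. A minimal, self-contained
sketch (the jar path is hypothetical):

    import java.net.{URL, URLClassLoader}

    // Hypothetical jar location, for illustration only.
    val guavaJar = new URL("file:/opt/jars/guava-14.0.1.jar")

    // parent = null: shares nothing with the application class loader.
    val isolated = new URLClassLoader(Array(guavaJar), null)
    // Delegates to the application class loader first.
    val sharing = new URLClassLoader(Array(guavaJar), getClass.getClassLoader)

    val a = isolated.loadClass("com.google.common.base.Predicate")
    val b = sharing.loadClass("com.google.common.base.Predicate")
    println(a eq b) // false: same name, distinct classes; casts across fail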

On Thu, Nov 5, 2015 at 3:51 PM, Marcelo Vanzin wrote:

> On Thu, Nov 5, 2015 at 3:41 PM, Joey Paskhay wrote:
> > We verified the Guava libraries are in the huge list of included jars,
> > but we saw that the
> > org.apache.spark.sql.hive.client.IsolatedClientLoader.isSharedClass
> > method seems to assume that *all* "com.google" classes (excluding
> > "com.google.cloud") should be loaded from the base class loader. The
> > Spark libraries seem to have *some* "com.google.common.base" classes
> > shaded in, but not all.
>
> Yeah, it seems to me like HiveContext should not be trying to include
> Guava in the shared list at all; the goal is to not have any Guava
> classes show up in Spark's classpath. Unfortunately, that's currently
> not possible, because some types are exposed in the Java API (the ones
> that are not shaded).
>
> Could you file a bug to track this?
>
>
> --
> Marcelo
>