Ratandeep Ratti created HIVE-11878:
--------------------------------------
Summary: ClassNotFoundException can possibly occur if multiple
jars are registered in Hive
Key: HIVE-11878
URL: https://issues.apache.org/jira/browse/HIVE-11878
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.2.1
Reporter: Ratandeep Ratti
Assignee: Ratandeep Ratti
When we register a jar on the Hive console, Hive creates a fresh
URLClassLoader which includes the path of the jar being registered and all
the jar paths of the parent classloader. The parent classloader is the current
ThreadContextClassLoader. Once the URLClassLoader is created, Hive sets it as
the current ThreadContextClassLoader.
So if we register multiple jars in Hive, there will be multiple URLClassLoaders
created, each classloader including the jars from its parent and the one extra
jar to be registered. The last URLClassLoader created will end up as the
current ThreadContextClassLoader. (See details:
org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
Now here's an example in which the above strategy can lead to a
ClassNotFoundException. We register 2 jars *j1* and *j2* in the Hive console.
*j1* contains the UDF class *c1*, which internally relies on class *c2* in jar
*j2*. We register *j1* first,
the URLClassLoader *u1* is created and also set as the
ThreadContextClassLoader. We register *j2* next, the new URLClassLoader created
will be *u2* with *u1* as parent and *u2* becomes the new
ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2*
whereas *u1* only has paths to *j1* (For details see:
org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
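
The registration sequence above can be sketched with plain JDK classes. This is
a minimal illustration, not Hive's actual code: the class RegisterJarSketch, the
helper register, and the file names j1.jar and j2.jar are hypothetical, and the
jar files do not need to exist for the chain to be built.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegisterJarSketch {

    // Hypothetical stand-in for the strategy described in this issue:
    // copy the parent's URLs, append the new jar, and make the previous
    // loader the parent of the newly created one.
    public static URLClassLoader register(URLClassLoader parent, String jarPath) {
        List<URL> urls = new ArrayList<>(Arrays.asList(parent.getURLs()));
        try {
            urls.add(Paths.get(jarPath).toUri().toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad jar path: " + jarPath, e);
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    public static void main(String[] args) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        try {
            // Assumed starting point: an empty URLClassLoader on top of the
            // original context classloader.
            URLClassLoader base = new URLClassLoader(new URL[0], original);

            URLClassLoader u1 = register(base, "j1.jar");
            thread.setContextClassLoader(u1);   // u1 is now the context loader

            URLClassLoader u2 = register(u1, "j2.jar");
            thread.setContextClassLoader(u2);   // u2 replaces it

            System.out.println("u1 URLs: " + u1.getURLs().length);
            System.out.println("u2 URLs: " + u2.getURLs().length);
            System.out.println("u2 parent is u1: " + (u2.getParent() == u1));
        } finally {
            thread.setContextClassLoader(original);
        }
    }
}
```

Running main prints 1 URL for u1, 2 URLs for u2, and confirms that u1 is the
parent of u2, which is the shape of the chain that the rest of this report
relies on.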
Now when we register class *c1* under a temporary function in Hive, we load the
class using {code} Class.forName("c1", true,
Thread.currentThread().getContextClassLoader()) {code} . The current
ThreadContextClassLoader is *u2*, and it has the path to the class *c1*, but
note that class-loaders work by delegating to the parent class-loader first. In
this case class *c1* will be found and *defined* by class-loader *u1*.
Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say
initialize) is called in *c1* which references the class *c2*, *c2* will not
be found, since the class-loader used to search for *c2* will be *u1* (the
caller's defining class-loader is used to resolve the classes it references),
and *u1* only has the path to *j1*.
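
The failure can be traced with a toy model of parent-first delegation. The
sketch below does not use real JVM class loaders: DelegationModel and its
nested Loader type are hypothetical, and each Loader simply records which class
names are on its own path. It only illustrates which loader ends up defining
*c1* and why *c2* is then out of reach.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DelegationModel {

    /** Toy loader: a name, an optional parent, and the classes on its own path. */
    public static final class Loader {
        final String name;
        final Loader parent;
        final Set<String> visible;

        public Loader(String name, Loader parent, String... classes) {
            this.name = name;
            this.parent = parent;
            this.visible = new HashSet<>(Arrays.asList(classes));
        }

        /**
         * Parent-first lookup. Returns the loader that would define the
         * class, or null if no loader in the chain can find it.
         */
        public Loader definingLoader(String className) {
            if (parent != null) {
                Loader found = parent.definingLoader(className);
                if (found != null) {
                    return found;
                }
            }
            return visible.contains(className) ? this : null;
        }
    }

    public static void main(String[] args) {
        Loader u1 = new Loader("u1", null, "c1");        // only j1 on its path
        Loader u2 = new Loader("u2", u1, "c1", "c2");    // j1 and j2 on its path

        // Loading c1 through u2 delegates to u1, which defines it.
        Loader definerOfC1 = u2.definingLoader("c1");
        System.out.println("c1 defined by " + definerOfC1.name);

        // c1's references are resolved through its defining loader, u1.
        Loader definerOfC2 = definerOfC1.definingLoader("c2");
        System.out.println("c2 via c1's loader: "
                + (definerOfC2 == null ? "not found" : definerOfC2.name));

        // The same lookup through u2 would have succeeded.
        System.out.println("c2 via u2: " + u2.definingLoader("c2").name);
    }
}
```

The model reports that *c1* is defined by u1, that *c2* is not found when
resolved through u1, and that it would be found through u2. In a real JVM the
second lookup is where the ClassNotFoundException surfaces.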
I've added a qtest to demonstrate the problem. Please see the attached patch.