Akhil, thanks for the suggestions.
I tried sc.addJar, --jars, and --conf spark.executor.extraClassPath, and
none of them helped. I also added the jars to compute-classpath.sh; that
did not change anything either. I checked the classpath of the running
executor and made sure the hive jars are there. The most confusing part
for me is that the executor can create HiveConf objects just fine, yet
cannot find the class when the task deserializer is at work.
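
For reference, this is roughly what I tried (the /opt/hive/lib paths are
placeholders for my actual hive-0.13.1 jar locations, so treat this as a
sketch):

import org.apache.spark.{SparkConf, SparkContext}

// Set before the SparkContext is created; same effect as passing
// --conf spark.executor.extraClassPath=... to spark-submit.
val conf = new SparkConf()
  .set("spark.executor.extraClassPath", "/opt/hive/lib/*")
val sc = new SparkContext(conf)

// Also tried shipping the jar explicitly (and the --jars flag).
sc.addJar("/opt/hive/lib/hive-common-0.13.1.jar")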

On 20 April 2015 at 14:18, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Can you try sc.addJar("/path/to/your/hive/jar")? I think it will resolve
> it.
>
> Thanks
> Best Regards
>
> On Mon, Apr 20, 2015 at 12:26 PM, Manku Timma <manku.tim...@gmail.com>
> wrote:
>
>> Akhil,
>> But the first case of creating a HiveConf on the executor (the map case)
>> works fine; only the second case fails. I suspect some foul play with
>> classloaders.
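>>
>> One quick check I am thinking of (just a sketch, assuming sc is the
>> SparkContext) is to ask the executor which classloader can resolve
>> HiveConf:
>>
>> sc.parallelize(1 to 1).map { _ =>
>>   val cl = Thread.currentThread.getContextClassLoader
>>   // Throws ClassNotFoundException here if the executor's context
>>   // classloader cannot see HiveConf; otherwise prints its loader.
>>   String.valueOf(Class.forName(
>>     "org.apache.hadoop.hive.conf.HiveConf", false, cl).getClassLoader)
>> }.collect().foreach(println)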
>>
>> On 20 April 2015 at 12:20, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>
>>> Looks like a missing jar. Try printing the classpath and make sure the
>>> hive jar is present.
>>>
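>>> Something like this sketch (run from the driver) should dump the
>>> executor JVM's classpath so you can grep for the hive jars:
>>>
>>> sc.parallelize(1 to 1).map { _ =>
>>>   System.getProperty("java.class.path") // the executor's JVM classpath
>>> }.collect().foreach(println)
>>>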
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Apr 20, 2015 at 11:52 AM, Manku Timma <manku.tim...@gmail.com>
>>> wrote:
>>>
>>>> I am using spark-1.3 built with the hadoop-provided, hive-provided and
>>>> hive-0.13.1 profiles. I am running a simple spark job on a yarn cluster,
>>>> adding all the hadoop2 and hive13 jars to the spark classpaths.
>>>>
>>>> If I build spark without hive-provided, I don't face any issue. But with
>>>> hive-provided I get a "java.lang.NoClassDefFoundError:
>>>> org/apache/hadoop/hive/conf/HiveConf" in the yarn executor.
>>>>
>>>> The code is below:
>>>>
>>>> import org.apache.spark._
>>>> import org.apache.spark.sql._
>>>> import org.apache.hadoop.hive.conf.HiveConf
>>>>
>>>> object Simple {
>>>>   def main(args: Array[String]) = {
>>>>     val sc = new SparkContext(new SparkConf())
>>>>     val sqlC = new org.apache.spark.sql.hive.HiveContext(sc)
>>>>
>>>>     // Job 1: instantiate HiveConf inside the closure on the executor.
>>>>     val x = sc.parallelize(1 to 2).map { _ =>
>>>>       val h = new HiveConf
>>>>       h.getBoolean("hive.test", false)
>>>>     }
>>>>     x.collect.foreach(v => println(s"----------------- $v ----------------"))
>>>>
>>>>     // Job 2: run a Hive query through HiveContext.
>>>>     val result = sqlC.sql("""
>>>>       select * from products_avro order by month, name, price
>>>>       """)
>>>>     result.collect.foreach(println)
>>>>   }
>>>> }
>>>>
>>>> The first job (the map) runs fine: HiveConf is instantiated and the
>>>> conf variable is looked up as expected. But the second job (the
>>>> select * query) throws the NoClassDefFoundError.
>>>>
>>>> The task deserializer is what throws the exception; it is unable to
>>>> find the class on its classpath. I am not sure what is different from
>>>> the first job, which also involved HiveConf.
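>>>>
>>>> My current guess at the difference (unverified, and the Holder class
>>>> below is purely hypothetical): in the first job HiveConf is referenced
>>>> only inside the closure body, so the class is resolved lazily, when the
>>>> task actually runs. In the second job the serialized task graph seems
>>>> to contain an object with a HiveConf-typed field, and Java
>>>> deserialization reflects over declared field types
>>>> (Class.getDeclaredFields0 at the top of the trace below) before any
>>>> task code runs, so the class must be loadable even if the field is
>>>> never read:
>>>>
>>>> // Hypothetical sketch, not spark code: deserializing a Holder forces
>>>> // HiveConf to load, because ObjectStreamClass inspects the declared
>>>> // field types while computing the serialVersionUID.
>>>> class Holder extends Serializable {
>>>>   @transient var conf: HiveConf = _
>>>> }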
>>>>
>>>> 157573 [task-result-getter-3] 2015/04/20 11:01:48:287 WARN TaskSetManager: Lost task 0.2 in stage 2.0 (TID 4, localhost): java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
>>>>     at java.lang.Class.getDeclaredFields0(Native Method)
>>>>     at java.lang.Class.privateGetDeclaredFields(Class.java:2436)
>>>>     at java.lang.Class.getDeclaredField(Class.java:1946)
>>>>     at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
>>>>     at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
>>>>     at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
>>>>     at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>>>>     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>>>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
>>>>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
>>>>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
>>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
>>>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
>>>>     at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>     ... 70 more
>>>>
>>>> -- Thanks
>>>>
>>>
>>>
>>
>
