RE: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy

2014-12-29 Thread Andrew Lee
Hi Patrick,
I manually hardcoded the Hive version to 0.13.1a and it works. It turns out 
that, for some reason, 0.13.1 was being picked up instead of the 0.13.1a 
version from Maven.
So my solution was: hardcode hive.version to 0.13.1a in my case, since I am 
building against Hive 0.13 only. The pom.xml was hardcoded with that version 
string, and the final JAR now works with hive-exec 0.13.1a embedded.
Possible reason why it didn't work? I suspect our internal environment was 
picking up 0.13.1, since we use our own Maven repo as a proxy and cache. 
0.13.1a did appear in our own repo, replicated from the Maven Central repo, 
but during the build Maven picked up 0.13.1 instead of 0.13.1a.
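
For reference, a rough sketch of how to pin the forked version from the 
command line and double-check what Maven actually resolves (assuming that 
passing -Dhive.version on the command line behaves the same as hardcoding it 
in pom.xml, which is what I actually did):

# Build against the forked hive-exec (note 0.13.1a, not 0.13.1)
mvn -U -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 \
  -Dhadoop.version=2.4.1 -Dyarn.version=2.4.1 -Dhive.version=0.13.1a \
  -DskipTests install

# Confirm which hive-exec version was resolved (helps catch a proxy/cache
# still serving 0.13.1)
mvn -Phive -Phive-0.13.1 -Dhive.version=0.13.1a dependency:tree | grep hive-exec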

> Date: Wed, 10 Dec 2014 12:23:08 -0800
> Subject: Re: Build Spark 1.2.0-rc1 encounter exceptions when running 
> HiveContext - Caused by: java.lang.ClassNotFoundException: 
> com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
> From: pwend...@gmail.com
> To: alee...@hotmail.com
> CC: dev@spark.apache.org
> 
> Hi Andrew,
> 
> It looks like somehow you are including jars from the upstream Apache
> Hive 0.13 project on your classpath. For Spark 1.2 Hive 0.13 support,
> we had to modify Hive to use a different version of Kryo that was
> compatible with Spark's Kryo version.
> 
> https://github.com/pwendell/hive/commit/5b582f242946312e353cfce92fc3f3fa472aedf3
> 
> I would look through the actual classpath and make sure you aren't
> including your own hive-exec jar somehow.
> 
> - Patrick
> 
> On Wed, Dec 10, 2014 at 9:48 AM, Andrew Lee  wrote:
> > Apologies for the format; somehow it got messed up and the linefeeds were 
> > removed. Here's a reformatted version.
> > Hi All,
> > I tried to set SPARK_CLASSPATH in spark-env.sh to include the auxiliary 
> > JARs and the datanucleus*.jars from Hive; however, when I run HiveContext, 
> > it gives me the following error:
> >
> > Caused by: java.lang.ClassNotFoundException: 
> > com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
> >
> > I have checked the JARs with jar tf, and it looks like this class is 
> > already included (shaded) in the assembly JAR 
> > (spark-assembly-1.2.0-hadoop2.4.1.jar), which is already on the system 
> > classpath. I couldn't figure out what is going on with the shading of the 
> > esotericsoftware JARs here. Any help is appreciated.
> >
> >
> > How to reproduce the problem?
> > Run the following 3 statements in spark-shell. (This is how I launched my 
> > spark-shell: cd /opt/spark; ./bin/spark-shell --master yarn --deploy-mode 
> > client --queue research --driver-memory 1024M)
> >
> > import org.apache.spark.SparkContext
> > val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> > hiveContext.hql("CREATE TABLE IF NOT EXISTS spark_hive_test_table (key INT, 
> > value STRING)")
> >
> >
> >
> > For reference, my environment:
> > Apache Hadoop 2.4.1
> > Apache Hive 0.13.1
> > Apache Spark branch-1.2 (installed under /opt/spark/, and config under 
> > /etc/spark/)
> > Maven build command:
> >
> > mvn -U -X -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 -Dhadoop.version=2.4.1 
> > -Dyarn.version=2.4.1 -Dhive.version=0.13.1 -DskipTests install
> >
> > Source Code commit label: eb4d457a870f7a281dc0267db72715cd00245e82
> >
> > My spark-env.sh had the following contents when I executed spark-shell:
> >> HADOOP_HOME=/opt/hadoop/
> >> HIVE_HOME=/opt/hive/
> >> HADOOP_CONF_DIR=/etc/hadoop/
> >> YARN_CONF_DIR=/etc/hadoop/
> >> HIVE_CONF_DIR=/etc/hive/
> >> HADOOP_SNAPPY_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f 
> >> -name "snappy-java-*.jar")
> >> HADOOP_LZO_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name 
> >> "hadoop-lzo-*.jar")
> >> SPARK_YARN_DIST_FILES=/user/spark/libs/spark-assembly-1.2.0-hadoop2.4.1.jar
> >> export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native
> >> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
> >> export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
> >> export 
> >> SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_SNAPPY_JAR:$HADOOP_LZO_JAR:$HIVE_CONF_DIR:/opt/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/hive/lib/datanucleus-core-3.2.10.jar:/opt/hive/lib/datanucleus-rdbms-3.2.9.jar
> >
> >
> >> Here's what I see from my stack trace.
> >> warning: there were 1 deprecation warning(s); re-run with -deprecation for 
> >> details

Re: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy

2014-12-10 Thread Patrick Wendell
Hi Andrew,

It looks like somehow you are including jars from the upstream Apache
Hive 0.13 project on your classpath. For Spark 1.2 Hive 0.13 support,
we had to modify Hive to use a different version of Kryo that was
compatible with Spark's Kryo version.

https://github.com/pwendell/hive/commit/5b582f242946312e353cfce92fc3f3fa472aedf3

I would look through the actual classpath and make sure you aren't
including your own hive-exec jar somehow.
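
A rough way to check, sketched below; the jar paths are only examples taken 
from your spark-env.sh, and the expectation (not verified here) is that the 
upstream 0.13.1 hive-exec jar bundles the unshaded org/objenesis classes 
while the forked 0.13.1a and the Spark assembly carry the shaded package:

# Which hive-related entries actually end up on the Spark classpath?
echo "$SPARK_CLASSPATH" | tr ':' '\n' | grep -i hive

# Which Objenesis packaging does a given jar carry?
jar tf /opt/hive/lib/hive-exec-0.13.1.jar | grep -i objenesis | head
jar tf spark-assembly-1.2.0-hadoop2.4.1.jar | grep 'shaded/org/objenesis' | head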

- Patrick

On Wed, Dec 10, 2014 at 9:48 AM, Andrew Lee  wrote:
> Apologies for the format; somehow it got messed up and the linefeeds were 
> removed. Here's a reformatted version.
> Hi All,
> I tried to set SPARK_CLASSPATH in spark-env.sh to include the auxiliary JARs 
> and the datanucleus*.jars from Hive; however, when I run HiveContext, it 
> gives me the following error:
>
> Caused by: java.lang.ClassNotFoundException: 
> com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy
>
> I have checked the JARs with jar tf, and it looks like this class is already 
> included (shaded) in the assembly JAR (spark-assembly-1.2.0-hadoop2.4.1.jar), 
> which is already on the system classpath. I couldn't figure out what is going 
> on with the shading of the esotericsoftware JARs here. Any help is 
> appreciated.
>
>
> How to reproduce the problem?
> Run the following 3 statements in spark-shell. (This is how I launched my 
> spark-shell: cd /opt/spark; ./bin/spark-shell --master yarn --deploy-mode 
> client --queue research --driver-memory 1024M)
>
> import org.apache.spark.SparkContext
> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> hiveContext.hql("CREATE TABLE IF NOT EXISTS spark_hive_test_table (key INT, 
> value STRING)")
>
>
>
> For reference, my environment:
> Apache Hadoop 2.4.1
> Apache Hive 0.13.1
> Apache Spark branch-1.2 (installed under /opt/spark/, and config under 
> /etc/spark/)
> Maven build command:
>
> mvn -U -X -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 -Dhadoop.version=2.4.1 
> -Dyarn.version=2.4.1 -Dhive.version=0.13.1 -DskipTests install
>
> Source Code commit label: eb4d457a870f7a281dc0267db72715cd00245e82
>
> My spark-env.sh had the following contents when I executed spark-shell:
>> HADOOP_HOME=/opt/hadoop/
>> HIVE_HOME=/opt/hive/
>> HADOOP_CONF_DIR=/etc/hadoop/
>> YARN_CONF_DIR=/etc/hadoop/
>> HIVE_CONF_DIR=/etc/hive/
>> HADOOP_SNAPPY_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name 
>> "snappy-java-*.jar")
>> HADOOP_LZO_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name 
>> "hadoop-lzo-*.jar")
>> SPARK_YARN_DIST_FILES=/user/spark/libs/spark-assembly-1.2.0-hadoop2.4.1.jar
>> export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native
>> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
>> export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
>> export 
>> SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_SNAPPY_JAR:$HADOOP_LZO_JAR:$HIVE_CONF_DIR:/opt/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/hive/lib/datanucleus-core-3.2.10.jar:/opt/hive/lib/datanucleus-rdbms-3.2.9.jar
>
>
>> Here's what I see from my stack trace.
>> warning: there were 1 deprecation warning(s); re-run with -deprecation for 
>> details
>> Hive history 
>> file=/home/hive/log/alti-test-01/hive_job_log_b5db9539-4736-44b3-a601-04fa77cb6730_1220828461.txt
>> java.lang.NoClassDefFoundError: 
>> com/esotericsoftware/shaded/org/objenesis/strategy/InstantiatorStrategy
>>   at 
>> org.apache.hadoop.hive.ql.exec.Utilities.<clinit>(Utilities.java:925)
>>   at 
>> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9718)
>>   at 
>> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9712)
>>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:434)
>>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
>>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
>>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
>>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
>>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
>>   at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:305)
>>   at 
>> org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
>>   at 
>> org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
>>   at 
>> org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
>>   at 
>> org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
>>   at 
>> org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
>>   at 
>> org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
>>   at 
>> org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
>>   at 
>> org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)

RE: Build Spark 1.2.0-rc1 encounter exceptions when running HiveContext - Caused by: java.lang.ClassNotFoundException: com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy

2014-12-10 Thread Andrew Lee
Apologies for the format; somehow it got messed up and the linefeeds were 
removed. Here's a reformatted version.
Hi All,
I tried to set SPARK_CLASSPATH in spark-env.sh to include the auxiliary JARs 
and the datanucleus*.jars from Hive; however, when I run HiveContext, it gives 
me the following error:

Caused by: java.lang.ClassNotFoundException: 
com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy

I have checked the JARs with jar tf, and it looks like this class is already 
included (shaded) in the assembly JAR (spark-assembly-1.2.0-hadoop2.4.1.jar), 
which is already on the system classpath. I couldn't figure out what is going 
on with the shading of the esotericsoftware JARs here. Any help is appreciated.
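
The check was roughly the following (the exact grep pattern here is 
approximate):

# The class shows up under com/esotericsoftware/shaded/org/objenesis/strategy/
jar tf spark-assembly-1.2.0-hadoop2.4.1.jar | grep 'objenesis/strategy/InstantiatorStrategy'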


How to reproduce the problem?
Run the following 3 statements in spark-shell. (This is how I launched my 
spark-shell: cd /opt/spark; ./bin/spark-shell --master yarn --deploy-mode 
client --queue research --driver-memory 1024M)

import org.apache.spark.SparkContext
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.hql("CREATE TABLE IF NOT EXISTS spark_hive_test_table (key INT, 
value STRING)")



For reference, my environment:
Apache Hadoop 2.4.1
Apache Hive 0.13.1
Apache Spark branch-1.2 (installed under /opt/spark/, and config under 
/etc/spark/)
Maven build command:

mvn -U -X -Phadoop-2.4 -Pyarn -Phive -Phive-0.13.1 -Dhadoop.version=2.4.1 
-Dyarn.version=2.4.1 -Dhive.version=0.13.1 -DskipTests install

Source Code commit label: eb4d457a870f7a281dc0267db72715cd00245e82

My spark-env.sh had the following contents when I executed spark-shell:
> HADOOP_HOME=/opt/hadoop/
> HIVE_HOME=/opt/hive/
> HADOOP_CONF_DIR=/etc/hadoop/
> YARN_CONF_DIR=/etc/hadoop/
> HIVE_CONF_DIR=/etc/hive/
> HADOOP_SNAPPY_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name 
> "snappy-java-*.jar")
> HADOOP_LZO_JAR=$(find $HADOOP_HOME/share/hadoop/common/lib/ -type f -name 
> "hadoop-lzo-*.jar")
> SPARK_YARN_DIST_FILES=/user/spark/libs/spark-assembly-1.2.0-hadoop2.4.1.jar
> export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
> export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
> export 
> SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_SNAPPY_JAR:$HADOOP_LZO_JAR:$HIVE_CONF_DIR:/opt/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/hive/lib/datanucleus-core-3.2.10.jar:/opt/hive/lib/datanucleus-rdbms-3.2.9.jar


> Here's what I see from my stack trace.
> warning: there were 1 deprecation warning(s); re-run with -deprecation for 
> details
> Hive history 
> file=/home/hive/log/alti-test-01/hive_job_log_b5db9539-4736-44b3-a601-04fa77cb6730_1220828461.txt
> java.lang.NoClassDefFoundError: 
> com/esotericsoftware/shaded/org/objenesis/strategy/InstantiatorStrategy
>   at org.apache.hadoop.hive.ql.exec.Utilities.<clinit>(Utilities.java:925)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9718)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.validate(SemanticAnalyzer.java:9712)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:434)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
>   at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:305)
>   at 
> org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
>   at 
> org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
>   at 
> org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
>   at 
> org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
>   at 
> org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
>   at 
> org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
>   at 
> org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
>   at 
> org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
>   at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
>   at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:102)
>   at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:106)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:16)
>   at $iwC$$iwC$$iwC.<init>(<console>:21)
>   at $iwC$$iwC.<init>(<console>:23)
>   at $iwC.<init>(<console>:25)
>   at <init>(<console>:27)
>   at .<init>(<console>:31)
>   at .<clinit>()
>   at .<init>(<console>:7)
>   at .<clinit>()
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun