As Marcelo said, CDH5.1.3 is based on Hadoop 2.3, so please try:

  ./make-distribution.sh -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.3 -DskipTests
See the details of how to change the profile at
https://spark.apache.org/docs/latest/building-with-maven.html

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai

On Fri, Dec 5, 2014 at 12:54 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
> When building against Hadoop 2.x, you need to enable the appropriate
> profile, aside from just specifying the version, e.g. "-Phadoop-2.3"
> for Hadoop 2.3.
>
> On Fri, Dec 5, 2014 at 12:51 PM, <spark.dubovsky.ja...@seznam.cz> wrote:
>> Hi devs,
>>
>> I have been playing with your amazing Spark here in Prague for some time
>> and have stumbled on something I would like to ask about. I create
>> assembly jars from source and then use them to run simple jobs on our
>> 2.3.0-cdh5.1.3 cluster using YARN; see [1] for an example of my usage.
>> Originally I created assemblies with sbt, as in [2], which runs just
>> fine. Then, after reading the Maven-preferred stories here on the dev
>> list, I found the make-distribution.sh script in the root of the codebase
>> and wanted to give it a try. I used it to create an assembly with both
>> [3] and [4].
>>
>> But I am not able to use the assemblies created by make-distribution,
>> because jobs built against them cannot be submitted to the cluster. Here
>> is what happens:
>> - run [3] or [4]
>> - recompile the app against the new assembly
>> - submit the job using the new assembly with a command like [1]
>> - the submit fails; the important parts of the stack trace are in [5]
>>
>> My guess is that this is due to an improper version of protobuf being
>> included in the assembly jar. My questions are:
>> - Can you confirm this hypothesis?
>> - What is the difference between the sbt and mvn ways of creating the
>> assembly? I mean, sbt works and mvn does not...
>> - What additional options do I need to pass to make-distribution to make
>> it work?
>>
>> Any help/explanation here would be appreciated.
>>
>> Jakub
>> ----------------------
>> [1] ./bin/spark-submit --num-executors 200 --master yarn-cluster --conf
>> spark.yarn.jar=assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar
>> --class org.apache.spark.mllib.CreateGuidDomainDictionary root-0.1.jar ${args}
>>
>> [2] ./sbt/sbt -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive assembly/assembly
>>
>> [3] ./make-distribution.sh -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests
>>
>> [4] ./make-distribution.sh -Dyarn.version=2.3.0 -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests
>>
>> [5] Exception in thread "main" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
>>     at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:79)
>>     at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
>>     at org.apache.hadoop.yarn.client.RMProxy$1.run(RMProxy.java:134)
>>     ...
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>     at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:76)
>>     ... 27 more
>> Caused by: java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnServiceProtos$SubmitApplicationRequestProto
>> overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org