My Spark also works well with Hadoop 2.2.0, but my Shark does not, because of a protobuf version problem.
In the Shark directory, I found two versions of protobuf, and both are loaded onto the classpath:

[root@bigdata001 shark-0.9.0]# find . -name "proto*.jar"
./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar

2014-03-27 2:26 GMT+08:00 yao <[email protected]>:

> @qingyang, Spark 0.9.0 works perfectly for me when accessing (read/write)
> data on HDFS. BTW, if you look at pom.xml, you have to choose the yarn
> profile to compile Spark, so that it won't include protobuf 2.4.1 in your
> final jars. Here is the command line we use to compile Spark with Hadoop 2.2:
>
> mvn -U -Dyarn.version=2.2.0 -Dhadoop.version=2.2.0 -Pyarn -DskipTests package
>
> Thanks
> -Shengzhe
>
>
> On Wed, Mar 26, 2014 at 12:04 AM, qingyang li <[email protected]> wrote:
>
> > Egor, I have run into the same problem that you asked about in this thread:
> >
> > http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E
> >
> > Have you fixed this problem?
> >
> > I am using Shark to read a table which I have created on HDFS.
> >
> > I found that the Shark lib_managed directory contains two protobuf*.jar files:
> >
> > [root@bigdata001 shark-0.9.0]# find . -name "proto*.jar"
> > ./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
> > ./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar
> >
> > My Hadoop is using protobuf-java-2.5.0.jar.
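
For anyone checking their own tree for the same conflict, here is a rough sketch that reduces the find output to the distinct protobuf-java versions present. The function name protobuf_versions is made up for illustration, and it assumes the jars follow the protobuf-java-<version>[-suffix].jar naming seen in the listing; it only reports what is on disk, not which jar actually wins on the classpath.

```shell
# Hypothetical helper (not part of Spark or Shark): print each distinct
# protobuf-java version found under a directory, one per line.
# Assumes jar names look like protobuf-java-<version>[-suffix].jar.
protobuf_versions() {
  find "${1:-.}" -name 'protobuf-java-*.jar' \
    | sed -E 's#.*/protobuf-java-([0-9][0-9.]*[0-9]).*\.jar$#\1#' \
    | sort -u
}
```

Running protobuf_versions . from the Shark directory and getting more than one line back would suggest that mixed protobuf versions are bundled.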
