Is 1.0.8 working for you? You indicated your last known good version is 1.0.0.
Maybe we can track down where it broke.

> On Sep 16, 2014, at 12:25 AM, Paul Wais <pw...@yelp.com> wrote:
>
> Thanks Christian! I tried compiling from source but am still getting the
> same Hadoop client version error when reading from HDFS. Will have to poke
> deeper... perhaps I've got some classpath issues. FWIW I compiled using:
>
>   $ MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m" mvn \
>       -Phadoop-2.3 -Dhadoop.version=2.3.0 -DskipTests clean package
>
> and Hadoop 2.3 / cdh5 from
> http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.0.0.tar.gz
>
>> On Mon, Sep 15, 2014 at 6:49 PM, Christian Chua <cc8...@icloud.com> wrote:
>>
>> Hi Paul.
>>
>> I would recommend building your own 1.1.0 distribution:
>>
>>   ./make-distribution.sh --name hadoop-personal-build-2.4 --tgz \
>>       -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests
>>
>> I downloaded the "Pre-built for Hadoop 2.4" binary, and it had this strange
>> behavior where
>>
>>   spark-submit --master yarn-cluster ...
>>
>> will work, but
>>
>>   spark-submit --master yarn-client ...
>>
>> will fail. With the personal build obtained from the command above, both work.
>>
>> -Christian
>>
>>> On Sep 15, 2014, at 6:28 PM, Paul Wais <pw...@yelp.com> wrote:
>>>
>>> Dear List,
>>>
>>> I'm having trouble getting Spark 1.1 to use the Hadoop 2 API for
>>> reading SequenceFiles. In particular, I'm seeing:
>>>
>>>   Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>>>   Server IPC version 7 cannot communicate with client version 4
>>>       at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>       at com.sun.proxy.$Proxy7.getProtocolVersion(Unknown Source)
>>>       ...
>>>
>>> when invoking JavaSparkContext#newAPIHadoopFile() with the args
>>> validSequenceFileURI, SequenceFileInputFormat.class, Text.class,
>>> BytesWritable.class, new Job().getConfiguration() -- pretty close to
>>> the unit test here:
>>> https://github.com/apache/spark/blob/f0f1ba09b195f23f0c89af6fa040c9e01dfa8951/core/src/test/java/org/apache/spark/JavaAPISuite.java#L916
>>>
>>> This error indicates to me that Spark is using an old Hadoop client to
>>> do reads. Oddly, I'm able to do /writes/ fine, i.e. I can write via
>>> JavaPairRDD#saveAsNewAPIHadoopFile() to my HDFS cluster.
>>>
>>> Do I need to explicitly build Spark for modern Hadoop? I previously
>>> had an HDFS cluster running Hadoop 2.3.0 and was getting a similar
>>> error (server is using version 9, client is using version 4).
>>>
>>> I'm using Spark 1.1 cdh4 as well as Hadoop cdh4 from the links posted
>>> on Spark's site:
>>>   * http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0-bin-cdh4.tgz
>>>   * http://d3kbcqa49mib13.cloudfront.net/hadoop-2.0.0-cdh4.2.0.tar.gz
>>>
>>> What distro of Hadoop is used at Databricks? Are there distros of
>>> Spark 1.1 and Hadoop that should work together out-of-the-box?
>>> (Previously I had Spark 1.0.0 and Hadoop 2.3 working fine.)
>>>
>>> Thanks for any help anybody can give me here!
>>> -Paul
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
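
[Editor's note: for reference, below is a minimal, self-contained sketch of the read Paul describes, using the same argument list (validSequenceFileURI, SequenceFileInputFormat.class, Text.class, BytesWritable.class, new Job().getConfiguration()). The HDFS path and app name are placeholders, and this assumes the Spark 1.1 Java API; it is not Paul's actual code.]

    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SequenceFileRead {
        public static void main(String[] args) throws Exception {
            JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("seqfile-read"));

            // Placeholder URI; substitute a real path on your cluster.
            String validSequenceFileURI = "hdfs://namenode:8020/path/to/file.seq";

            // newAPIHadoopFile routes the read through the Hadoop 2 (mapreduce)
            // API. The "client version 4" error in the thread suggests an old
            // Hadoop 1.x client jar is on the classpath despite this call.
            JavaPairRDD<Text, BytesWritable> rdd = sc.newAPIHadoopFile(
                validSequenceFileURI,
                SequenceFileInputFormat.class,
                Text.class,
                BytesWritable.class,
                new Job().getConfiguration());

            // Force an actual read from HDFS to surface any RPC version mismatch.
            System.out.println("records: " + rdd.count());
            sc.stop();
        }
    }

Note the RPC mismatch only surfaces on an action (count() here), which is consistent with writes appearing to work while reads fail through a different client path.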