There's a version incompatibility between your Hadoop jars: the "Server IPC version 9
cannot communicate with client version 4" in your trace means the HDFS client bundled
with your Spark/SparkR build is Hadoop 1.x, while your cluster's NameNode is Hadoop 2.x.
You need to make sure you build Spark (and the SparkR package) against Hadoop
2.5.0-cdh5.3.1.
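Something along these lines should do it (a rough, untested sketch; adjust the build
profiles to your environment, and note you may need Cloudera's Maven repository
available to resolve the cdh artifacts):

# build Spark 1.2.0 itself against the CDH Hadoop client
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0-cdh5.3.1 -DskipTests clean package

# then rebuild the SparkR package from SparkR-pkg, passing both variables together
SPARK_VERSION=1.2.0 SPARK_HADOOP_VERSION=2.5.0-cdh5.3.1 ./install-dev.sh

Setting SPARK_VERSION and SPARK_HADOOP_VERSION in the same invocation helps ensure the
SparkR assembly is built against the cdh client rather than the default Hadoop 1.x one.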

Thanks
Best Regards

On Fri, Apr 17, 2015 at 5:17 AM, lalasriza . <lala.s.r...@gmail.com> wrote:

> Dear everyone,
>
> right now I am working with SparkR on cluster. The following are the
> package versions installed on the cluster:
> ----
> 1) Hadoop and Yarn:
> Hadoop 2.5.0-cdh5.3.1
> Subversion http://github.com/cloudera/hadoop -r
> 4cda8416c73034b59cc8baafbe3666b074472846
> Compiled by jenkins on 2015-01-28T00:46Z
> Compiled with protoc 2.5.0
> From source with checksum 6a018149a764de4b8992755df9a2a1b
>
> 2) Spark: Spark version 1.2.0
> For the SparkR installation, I followed the guide at
> https://github.com/amplab-extras/SparkR-pkg by cloning the SparkR-pkg
> repository. Then, inside SparkR-pkg, I ran:
> SPARK_VERSION=1.2.0 ./install-dev.sh
> SPARK_HADOOP_VERSION=2.5.0-cdh5.3.1 ./install-dev.sh
>  ----
>
> After the installation, I tested SparkR as follows:
> MASTER=spark://xxx:7077 ./sparkR
> R> rdd <- parallelize(sc, 1:10)
> R> partitionSum <- lapplyPartition(rdd, function(part) { Reduce("+", part)
> })
> R> collect(partitionSum) # 15, 40
>
> I got the expected result. However, whenever I try to read a file, either from HDFS
> or from the local file system, it fails. For example,
> R> lines <- textFile(sc, "hdfs://xxx:8020/user/lala/simulation/README.md")
> R> count(lines)
>
> The following are the errors I got:
> ------
> collect on 2 failed with java.lang.reflect.InvocationTargetException
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.handleMethodCall(SparkRBackendHandler.scala:111)
>         at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:58)
>         at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:19)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
>         at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>         at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:201)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
>         at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1328)
>         at org.apache.spark.rdd.RDD.collect(RDD.scala:780)
>         at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:309)
>         at org.apache.spark.api.java.JavaRDD.collect(JavaRDD.scala:32)
>         ... 25 more
> Error: returnStatus == 0 is not TRUE
>
> -----
> I have read some comments suggesting that these errors are caused by a version
> mismatch between the master node and its workers. However, I am not sure; there may
> be another reason. In any case, I do not know how to solve it, so I am looking
> forward to your ideas and advice.
>
> Many thanks in advance,
>
> Regards,
>
> Lala SR
>
>
