[jira] [Commented] (SPARK-524) spark integration issue with Cloudera hadoop

Sean Owen (JIRA) Sun, 13 Jul 2014 03:23:27 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060070#comment-14060070
 ]


Sean Owen commented on SPARK-524:
---------------------------------

Can I ask a meta-question? This JIRA is an example, but just one. I see 
hundreds of JIRAs that likely have no further action.

Some are likely obsoleted by time and subsequent changes, like this one -- CDH 
integration is much different now and presumably fixes this. Some are feature 
requests or changes that de facto don't have support and therefore won't be 
committed. These seem like they should be closed, for clarity. Bugs are riskier 
to close in case they identify a real issue that still exists.

Is there any momentum for cleaning up issues like this, or anything I can do 
to help get that started?

> spark integration issue with Cloudera hadoop
> --------------------------------------------
>
>                 Key: SPARK-524
>                 URL: https://issues.apache.org/jira/browse/SPARK-524
>             Project: Spark
>          Issue Type: Bug
>            Reporter: openreserach
>
> Hi, 
> 1. I am using a single EC2 instance with pre-built Mesos (ami-0fcb7966). (Same 
> issue if I build Mesos from source code in a local VM.)
> 2. I followed the instructions at 
> https://github.com/mesos/spark/wiki/Running-spark-on-mesos with some tweaks.
> 3. I installed Cloudera cdhu5 via yum (not using pre-built Hadoop due to lack of 
> documentation).
> 4. ./spark-shell.sh
> import spark._
> val sc = new SparkContext("localhost:5050","passwd")
> val ec2 = sc.textFile("hdfs://localhost:8020/tmp/passwd")
> If I keep val HADOOP_VERSION = "0.20.205.0" in project/SparkBuild.scala, then
> at val file = sc.textFile("hdfs://localhost:8020/tmp/passwd")
> I get the error:
> Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. 
> (client = 61, server = 63)
> If I set val HADOOP_VERSION = "0.20.2-cdh3u5" or val HADOOP_VERSION = 
> "0.20.2-cdh3u3", 
> I get this error at ec2.count():
> ERROR spark.SimpleJob: Task 0:0 failed more than 4 times; aborting job
> like the one reported at 
> http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/201108.mbox/%3cbd25ae7a-c9dc-4020-ad40-41c66dcaa...@eecs.berkeley.edu%3E
> Please let me know if you cannot replicate this error, and provide more 
> instructions on how Spark integrates with Cloudera Hadoop.
> Thanks
> -QH



--
This message was sent by Atlassian JIRA
(v6.2#6252)
