Hi,
I am not able to read from HDFS (Intel Distribution of Hadoop, Hadoop version 1.0.3) from spark-shell (Spark version 1.2.1). I built Spark using the command mvn -Dhadoop.version=1.0.3 clean package, started spark-shell, and read an HDFS file using sc.textFile(), and the exception is:
 WARN hdfs.DFSClient: Failed to connect to /10.88.6.133:50010, add to deadNodes and continue
 java.net.SocketTimeoutException: 120000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.88.6.131:44264 remote=/10.88.6.133:50010]
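
For reference, the full sequence was as follows (the HDFS URI below is only a placeholder for our namenode, and the count() action is just one way of triggering the read):

    mvn -Dhadoop.version=1.0.3 clean package
    ./bin/spark-shell

    scala> val lines = sc.textFile("hdfs://<namenode-host>:<port>/path/to/file")
    scala> lines.count()   // the read fails here with the timeout above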

The same problem is asked in this earlier thread on mail-archives.us.apache.org: "RE: Spark is unable to read from HDFS".

As suggested in the above mail: "In addition to specifying HADOOP_VERSION=1.0.3 in the ./project/SparkBuild.scala file, you will need to specify the libraryDependencies and name "spark-core" resolvers. Otherwise, sbt will fetch version 1.0.3 of hadoop-core from Apache instead of Intel. You can set up your own local or remote repository that you specify."

Now HADOOP_VERSION is deprecated and -Dhadoop.version should be used instead. Can anybody please elaborate on how to specify that SBT should fetch hadoop-core from Intel, which is in our internal repository?
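
To make the question concrete, is something like the following, added to the settings in ./project/SparkBuild.scala, the right direction? The repository name and URL are placeholders for our internal repository, and I am not sure whether hadoop-core also needs an explicit libraryDependencies entry:

    // placeholder URL for our internal repository
    resolvers += "intel-internal" at "http://repo.internal.example.com/maven"
    // make sure hadoop-core 1.0.3 is resolved from that repository rather than Apache
    libraryDependencies += "org.apache.hadoop" % "hadoop-core" % "1.0.3"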
Thanks & Regards,
Meethu M
