Are you able to connect to the NameNode UI at MACHINE_IP:50070? Check what the URI is there.
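Beyond the web UI, the authoritative answer is whatever `fs.defaultFS` says. A minimal sketch — the `hdfs getconf` command assumes Hadoop 2.x binaries on the PATH of the NameNode host; the rest is plain POSIX shell, runnable anywhere:

```shell
# On the NameNode host (Hadoop 2.x), this prints the exact URI clients must use:
#   hdfs getconf -confKey fs.defaultFS
# Given such a URI, the host:port part can be split out in plain shell:
uri="hdfs://172.26.49.156:54310/bibudh/healthcare/data/cloudera_challenge/patients.csv"
hostport=${uri#hdfs://}   # strip the scheme
hostport=${hostport%%/*}  # drop the path, keeping host:port
echo "NameNode RPC endpoint: $hostport"   # -> 172.26.49.156:54310
```

If the port printed here differs from the one in the client's load path, the client will get Connection refused regardless of whether HDFS is up.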
If the UI doesn't open, it means your HDFS is not up; try to start it using start-dfs.sh.

On Thu, Apr 28, 2016 at 2:59 AM, Bibudh Lahiri <bibudhlah...@gmail.com> wrote:

> Hi,
> I installed Hadoop 2.6.0 today on one of the machines (172.26.49.156),
> got HDFS running on it (both NameNode and DataNode on the same machine) and
> copied the files to HDFS. However, from the same machine, when I try to
> load the same CSV with the following statement:
>
> sqlContext.read.format("com.databricks.spark.csv").option("header",
> "false").load("hdfs://172.26.49.156:54310/bibudh/healthcare/data/cloudera_challenge/patients.csv")
>
> I get the error
>
> java.net.ConnectException: Call From impetus-i0276.impetus.co.in/127.0.0.1
> to impetus-i0276:54310 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
>
> I have changed the port number to 8020 but the same error gets reported.
>
> Even the following command is not working from the command line, when
> launched from the HADOOP_HOME folder:
>
> bin/hdfs dfs -ls hdfs://172.26.49.156:54310/
>
> which was working earlier when issued from the other machine
> (172.26.49.55), from under HADOOP_HOME for Hadoop 1.0.4.
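One detail in the error above worth noting: the client resolves its own host as impetus-i0276.impetus.co.in/127.0.0.1, which often indicates the hostname maps to loopback in /etc/hosts, so the NameNode may be listening only on 127.0.0.1 and refusing connections addressed to the real IP. A sketch of the relevant core-site.xml entry, assuming Hadoop 2.x key names and that port 54310 is the intended RPC port:

```xml
<!-- etc/hadoop/core-site.xml (Hadoop 2.x); bind the NameNode RPC address
     to the machine's real IP, not localhost, so remote clients can reach it.
     The port here must match the port in every hdfs:// URI clients use. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.26.49.156:54310</value>
  </property>
</configuration>
```

After changing this, the NameNode must be restarted (stop-dfs.sh, then start-dfs.sh) for the new bind address to take effect.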
> I set up ~/.bashrc as follows, when I installed Hadoop 2.6.0:
>
> export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
> export HADOOP_HOME=/usr/local/hadoop-2.6.0
> export HADOOP_INSTALL=$HADOOP_HOME
> export HADOOP_PREFIX=$HADOOP_HOME
> export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
> export HADOOP_COMMON_HOME=$HADOOP_PREFIX
> export HADOOP_HDFS_HOME=$HADOOP_PREFIX
> export YARN_HOME=$HADOOP_PREFIX
> export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
> export SPARK_HOME=/home/impadmin/spark-1.6.0-bin-hadoop2.6
>
> PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin
> export HADOOP_CONF_DIR=$HADOOP_HOME
> export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
> export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
> export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
>
> Am I getting the port number wrong, or is it some other config param
> that I should check? What's the general rule here?
>
> Thanks
> Bibudh
>
> On Tue, Apr 26, 2016 at 7:51 PM, Davies Liu <dav...@databricks.com> wrote:
>
>> The Spark package you are using is packaged with Hadoop 2.6, but the
>> HDFS is Hadoop 1.0.4, they are not compatible.
>>
>> On Tue, Apr 26, 2016 at 11:18 AM, Bibudh Lahiri <bibudhlah...@gmail.com>
>> wrote:
>> > Hi,
>> > I am trying to load a CSV file which is on HDFS. I have two machines:
>> > IMPETUS-1466 (172.26.49.156) and IMPETUS-1325 (172.26.49.55). Both have
>> > Spark 1.6.0 pre-built for Hadoop 2.6 and later, but for both, I had
>> > existing Hadoop clusters running Hadoop 1.0.4. I have launched HDFS from
>> > 172.26.49.156 by running start-dfs.sh from it, copied files from the
>> > local file system to HDFS and can view them with hadoop fs -ls.
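A small side observation on the ~/.bashrc quoted above: HADOOP_CONF_DIR is exported twice, first as $HADOOP_HOME and later as $HADOOP_PREFIX/etc/hadoop. In shell the last assignment wins, which here happens to be the correct location for Hadoop 2.x configs, but the earlier line is dead and worth deleting to avoid confusion. A minimal demonstration (plain shell, paths taken from the quoted file):

```shell
# The last export wins; after both lines run, HADOOP_CONF_DIR points at
# the etc/hadoop directory, the right place for Hadoop 2.x config files.
HADOOP_HOME=/usr/local/hadoop-2.6.0
HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME              # dead assignment, overwritten below
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop # effective value
echo "$HADOOP_CONF_DIR"   # -> /usr/local/hadoop-2.6.0/etc/hadoop
```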
>> > However, when I am trying to load the CSV file from the pyspark shell
>> > (launched by bin/pyspark --packages com.databricks:spark-csv_2.10:1.3.0)
>> > from IMPETUS-1325 (172.26.49.55) with the following commands:
>> >
>> > >>> from pyspark.sql import SQLContext
>> > >>> sqlContext = SQLContext(sc)
>> > >>> patients_df = sqlContext.read.format("com.databricks.spark.csv").option("header",
>> > ... "false").load("hdfs://172.26.49.156:54310/bibudh/healthcare/data/cloudera_challenge/patients.csv")
>> >
>> > I get the following error:
>> >
>> > java.io.EOFException: End of File Exception between local host is:
>> > "IMPETUS-1325.IMPETUS.CO.IN/172.26.49.55"; destination host is:
>> > "IMPETUS-1466":54310; : java.io.EOFException; For more details see:
>> > http://wiki.apache.org/hadoop/EOFException
>> >
>> > I have changed the port number from 54310 to 8020, but then I get the
>> > error
>> >
>> > java.net.ConnectException: Call From IMPETUS-1325.IMPETUS.CO.IN/172.26.49.55
>> > to IMPETUS-1466:8020 failed on connection exception:
>> > java.net.ConnectException: Connection refused; For more details see:
>> > http://wiki.apache.org/hadoop/ConnectionRefused
>> >
>> > To me it seemed like this may result from a version mismatch between the
>> > Spark Hadoop client and my Hadoop cluster, so I have made the following
>> > changes:
>> >
>> > 1) Added the following lines to conf/spark-env.sh
>> >
>> > export HADOOP_HOME="/usr/local/hadoop-1.0.4"
>> > export HADOOP_CONF_DIR="$HADOOP_HOME/conf"
>> > export HDFS_URL="hdfs://172.26.49.156:8020"
>> >
>> > 2) Downloaded Spark 1.6.0, pre-built with user-provided Hadoop, and in
>> > addition to the three lines above, added the following line to
>> > conf/spark-env.sh
>> >
>> > export SPARK_DIST_CLASSPATH="/usr/local/hadoop-1.0.4/bin/hadoop"
>> >
>> > but none of it seems to work.
>> > However, the following command works from
>> > 172.26.49.55 and gives the directory listing:
>> >
>> > /usr/local/hadoop-1.0.4/bin/hadoop fs -ls hdfs://172.26.49.156:54310/
>> >
>> > Any suggestion?
>> >
>> > Thanks
>> >
>> > Bibudh
>> >
>> > --
>> > Bibudh Lahiri
>> > Data Scientist, Impetus Technologies
>> > 5300 Stevens Creek Blvd
>> > San Jose, CA 95129
>> > http://knowthynumbers.blogspot.com/
>
> --
> Bibudh Lahiri
> Senior Data Scientist, Impetus Technologies
> 720 University Avenue, Suite 130
> Los Gatos, CA 95129
> http://knowthynumbers.blogspot.com/

--
Thanks and Regards,
Saurav Sinha
Contact: 9742879062
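The error pattern running through this thread (EOFException from one machine, Connection refused on a guessed port) usually comes down to the client building an hdfs:// URI whose host:port does not match what the NameNode actually serves. A plain-Python sketch, no Spark required, for checking a URI before handing it to Spark; the fallback of 8020 when no port is given is an assumption based on the common Hadoop 2.x default, not something stated in the thread:

```python
from urllib.parse import urlparse

def hdfs_endpoint(uri):
    """Return the (host, port) an HDFS client would dial for this URI.

    Port 8020 is assumed as the default when the URI omits one, matching
    typical Hadoop 2.x deployments (verify against your fs.defaultFS).
    """
    parsed = urlparse(uri)
    return parsed.hostname, parsed.port or 8020

host, port = hdfs_endpoint(
    "hdfs://172.26.49.156:54310/bibudh/healthcare/data/cloudera_challenge/patients.csv")
print(host, port)  # -> 172.26.49.156 54310
```

Comparing this tuple against the output of `hdfs getconf -confKey fs.defaultFS` on the NameNode host separates "wrong address" failures from genuine protocol-version mismatches like the Hadoop 1.0.4 vs. 2.6 incompatibility Davies pointed out.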