Hi All -

I used the Cloudera .debs to install a trivial Hadoop setup on a single host.
log/hadoop-hdfs/hadoop-hdfs-namenode.log shows version 2.0.0-cdh4.3.1

However, I have not been able to load data into the cluster. For example, in pySpark:

tW = sc.textFile( "http://my.domain/www_shared/a_report.csv" )

  says "No file system for scheme http"
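
A possible workaround, sketched under assumptions: sc.textFile resolves paths through Hadoop's file system layer, which here has no handler for http, so one option is to download the file first and hand Spark the local copy. The helper name fetch_to_local is hypothetical, and this assumes the driver and workers share the same host (as in this single-host setup).

```python
import shutil
import tempfile

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

def fetch_to_local(url, suffix=".csv"):
    """Download `url` into a temporary file and return the local path.

    Hypothetical helper: the temp file is not deleted automatically.
    """
    response = urlopen(url)
    try:
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as out:
            shutil.copyfileobj(response, out)
            return out.name
    finally:
        response.close()

# Possible usage in the pySpark shell (not run here):
#   local_path = fetch_to_local("http://my.domain/www_shared/a_report.csv")
#   tW = sc.textFile("file://" + local_path)
```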

tW = sc.textFile( '/home/dbb/a_text_file.csv' )

says "TypeError: unsupported operand type(s) for -: 'unicode' and 'float' "

That happens for both a single-column CSV of floats with no header
and a multi-column CSV with a header.
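
One possible reading of the TypeError, offered as a guess: textFile yields each line as a unicode string, so any later arithmetic on the records fails until the fields are converted to floats. A minimal sketch of that conversion follows; parse_floats and is_numeric_row are hypothetical names, and the comma separator is an assumption.

```python
def parse_floats(line, sep=","):
    """Split one CSV line and convert every field to float."""
    return [float(field) for field in line.split(sep)]

def is_numeric_row(line, sep=","):
    """Return True if every field parses as a float (False for header or blank lines)."""
    try:
        parse_floats(line, sep)
        return True
    except ValueError:
        return False

# Possible usage in the pySpark shell (not run here):
#   tW = sc.textFile('/home/dbb/a_text_file.csv')
#   rows = tW.filter(is_numeric_row).map(parse_floats)
```

Filtering before mapping drops a header row instead of raising on it, which is only appropriate if non-numeric lines can safely be ignored.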

(Clearly I can load CSV and use psycopg2 in Python generally,)
but I do not understand the next steps for reading the data into HDFS.
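
For the HDFS step, a rough sketch under stated assumptions: copy the file in with the `hadoop fs -put` command, then point textFile at an hdfs:// URI. The NameNode address localhost:8020 and the destination /user/dbb/ are assumptions, not values taken from this install, and the function names are hypothetical.

```python
import subprocess

def hdfs_put_command(local_path, hdfs_path):
    """Build the `hadoop fs -put` command line as an argument list."""
    return ["hadoop", "fs", "-put", local_path, hdfs_path]

def hdfs_uri(hdfs_path, host="localhost", port=8020):
    """Build an hdfs:// URI; host and port are assumed NameNode settings."""
    return "hdfs://{0}:{1}{2}".format(host, port, hdfs_path)

def put_into_hdfs(local_path, hdfs_path):
    """Copy a local file into HDFS and return the URI to give to Spark."""
    subprocess.check_call(hdfs_put_command(local_path, hdfs_path))
    return hdfs_uri(hdfs_path)

# Possible usage in the pySpark shell (not run here):
#   uri = put_into_hdfs('/home/dbb/a_text_file.csv', '/user/dbb/a_text_file.csv')
#   tW = sc.textFile(uri)
```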

PS: I see the GUI port number is 4040 now, and the IPYTHON=1 flag works fine.

  thanks -Brian

