Hi All -
I used the Cloudera .debs to install a trivial Hadoop setup on a single host.
log/hadoop-hdfs/hadoop-hdfs-namenode.log shows version 2.0.0-cdh-4.3.1
However, I have not been able to load data into the cluster. For example, in pySpark:
tW = sc.textFile( "http://my.domain/www_shared/a_report.csv" )
says "No file system for scheme http"
tW = sc.textFile( '/home/dbb/a_text_file.csv' )
says "TypeError: unsupported operand type(s) for -: 'unicode' and 'float'"
That happens for both a single-column float CSV with no header
and a multi-column CSV with a header.
(Clearly I can load CSV files and use psycopg2 in Python generally.)
But I do not understand the next steps for reading the data into
HDFS.
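
A minimal sketch of one possible cause of the TypeError above, assuming the lines come back as text and have to be converted before any arithmetic. It uses plain Python with an in-memory sample rather than Spark; the sample values and the helper name parse_floats are hypothetical, not taken from my actual file.

```python
# Hypothetical sample standing in for a single-column float CSV with no header.
sample_lines = ["1.5", "2.25", "4.0"]


def parse_floats(lines):
    """Convert each text line to a float, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


values = parse_floats(sample_lines)
mean = sum(values) / len(values)

# Subtracting a float works once the lines are numbers rather than text.
centered = [v - mean for v in values]
```

In pySpark the equivalent conversion would presumably be a map over the RDD, for example tW.map(float), before doing any subtraction. A multi-column CSV with a header would need the header dropped and each line split first.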
PS: I see the GUI port number is 4040 now, and the IPYTHON=1 flag
works fine.
thanks -Brian