loading data into a trivial single cluster

Brian Hamlin Sat, 14 Sep 2013 09:44:04 -0700

Hi All -

I used the Cloudera .debs to install a trivial Hadoop setup on a single host.

log/hadoop-hdfs/hadoop-hdfs-namenode.log shows version 2.0.0-cdh-4.3.1


However, I am not able to load data into the cluster. For example, in pySpark:

tW = sc.textFile( "http://my.domain/www_shared/a_report.csv" )

  says "No file system for scheme http"

tW = sc.textFile( '/home/dbb/a_text_file.csv' )

says "TypeError: unsupported operand type(s) for -: 'unicode' and 'float' "


that is for both a single column float field CSV with no header,
and a multi-column CSV with header
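
One possible reading of the TypeError above, offered only as a guess: sc.textFile yields each line as a string, so subtracting a float from a raw field fails until the field is converted. A minimal sketch of that conversion follows. The helper name parse_row and the sample lines are hypothetical, and the list comprehension stands in for an RDD so that the snippet runs without Spark.

```python
import csv

def parse_row(line):
    """Split one CSV line and convert every field to float.

    Returns None when any field is non-numeric (for example a header row).
    parse_row is a hypothetical helper, not part of pySpark.
    """
    fields = next(csv.reader([line]))
    try:
        return [float(f) for f in fields]
    except ValueError:
        return None

# Hypothetical sample standing in for the lines an RDD would hold.
sample_lines = ["x,y", "1.5,2.0", "3.25,4.0"]

# Drop rows that failed to parse, then do arithmetic on real floats.
rows = [r for r in (parse_row(l) for l in sample_lines) if r is not None]
diffs = [r[1] - r[0] for r in rows]

# With a live SparkContext the same helper could be applied as:
#   sc.textFile(path).map(parse_row).filter(lambda r: r is not None)
```

If the error comes from somewhere else (for instance an argument passed to textFile itself), this conversion would not help; the full traceback would show which operation is failing.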

(clearly I can load CSV and psycopg2 in Python generally)

but I am not understanding the next steps to read the data into the HDFS system.

ps- I see the GUI port number is 4040 now, and the IPYTHON=1 flag works fine


  thanks -Brian
