Turns out to be an issue with the number of fields being read: one of the fields may be missing from the raw data file, which causes this error. Michael Armbrust pointed it out in another thread.
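For anyone hitting the same error, a quick way to confirm this is the cause (just a sketch; I'm assuming tab-delimited rows that should have 5 fields, matching the snippet further down) is to count the rows whose split length is off:

  val expected = 5                  // number of fields the parsing code assumes (p(0) .. p(4))
  val badRows = sc.textFile("rawdata/*")
    .map(_.split("\t", -1))         // limit -1 keeps trailing empty fields instead of dropping them
    .filter(_.length != expected)
  badRows.count()                   // non-zero means some rows are missing fields
  badRows.take(5).foreach(r => println(r.mkString("|")))   // inspect a few offenders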
Just wanted to add more info: I was using Spark SQL, reading in the tab-delimited raw data files and converting the timestamp to Date format:
sc.textFile("rawdata/*").map(_.split("\t")).map(p => Point(df.format(new
Date( p(0).trim.toLong*1000L )), p(1), p(2).trim.toInt ,p(3).trim.toInt,
p(4).trim.toI
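Since that line got cut off, here is a rough, defensive version of the same pipeline. This is only a sketch: the Point field names, the date pattern in df, and the assumption that the truncated p(4) field is also an Int are mine, not from the original data.

  import java.text.SimpleDateFormat
  import java.util.Date

  // Hypothetical field names; only the types are implied by the original snippet.
  case class Point(ts: String, tag: String, x: Int, y: Int, z: Int)

  val df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")

  val points = sc.textFile("rawdata/*")
    .map(_.split("\t", -1))          // limit -1 keeps trailing empty fields
    .filter(_.length >= 5)           // skip rows that are missing fields (the cause of the error)
    .map(p => Point(
      df.format(new Date(p(0).trim.toLong * 1000L)),
      p(1),
      p(2).trim.toInt,
      p(3).trim.toInt,
      p(4).trim.toInt))              // assumed Int; the original message was cut off at "toI"

With the filter in place, rows with too few fields are simply skipped instead of blowing up when p(4) is accessed.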