Re: spark-shell -- running into ArrayIndexOutOfBoundsException

2014-07-23 Thread buntu
Turns out to be an issue with the number of fields being read: one of the fields may be missing from the raw data file, which causes this error. Michael Armbrust pointed it out in another thread.
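
For reference, a minimal sketch of a guard in spark-shell (where sc is the predefined SparkContext) that drops rows with missing fields before indexing into the split array; the expected field count of 5 is an assumption taken from the snippet further down the thread:

    // Split with limit -1 so trailing empty fields are kept, then drop short rows
    // instead of letting p(4) throw ArrayIndexOutOfBoundsException.
    val expectedFields = 5  // assumed column count
    val rows = sc.textFile("rawdata/*")
      .map(_.split("\t", -1))
      .filter(_.length >= expectedFields)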

Re: spark-shell -- running into ArrayIndexOutOfBoundsException

2014-07-23 Thread buntu
Just wanted to add more info: I was using Spark SQL, reading in the tab-delimited raw data files and converting the epoch timestamp to a Date string: sc.textFile("rawdata/*").map(_.split("\t")).map(p => Point(df.format(new Date(p(0).trim.toLong * 1000L)), p(1), p(2).trim.toInt, p(3).trim.toInt, p(4).trim.toInt, ...
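
For completeness, a sketch of that pipeline with the missing-field guard added; the Point case class, its field names, and the date pattern are assumptions reconstructed from the truncated snippet above, not the actual schema:

    import java.text.SimpleDateFormat
    import java.util.Date

    // Hypothetical case class matching the visible fields; the real one may have more columns.
    case class Point(day: String, f1: String, f2: Int, f3: Int, f4: Int)

    val df = new SimpleDateFormat("yyyy-MM-dd")  // assumed date pattern

    val points = sc.textFile("rawdata/*")
      .map(_.split("\t", -1))      // limit -1 keeps trailing empty fields
      .filter(_.length >= 5)       // skip rows with missing fields
      .map(p => Point(
        df.format(new Date(p(0).trim.toLong * 1000L)),  // epoch seconds -> date string
        p(1),
        p(2).trim.toInt,
        p(3).trim.toInt,
        p(4).trim.toInt))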