Thanks for the help. Yes, I did not realize that the first header line has a different separator.
By the way, is there a way to drop the first line that contains the header? Something along the following lines:

    sc.textFile(inp_file)
      .drop(1)  // or tail() to drop the header line
      .map....  // rest of the processing

I could not find a drop() function, or a way to take the bottom n elements, for an RDD.

Alternatively, a way to create the case class schema from the header line of the file and use the rest for the data would be useful, just as a suggestion. Currently I am just deleting this header line manually before processing the file in Spark.

Thanks
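
For reference, here is a minimal sketch of one commonly used workaround, not a built-in drop(): skip the first element of partition 0 with mapPartitionsWithIndex. It assumes a single input file, so that the header is the first line of the first partition. The names DropHeaderSketch and dropHeader, the command-line argument for the path, and the local master setting are all made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DropHeaderSketch {
  // Hypothetical helper: skip the first element of partition 0 only.
  // All other partitions are passed through unchanged.
  def dropHeader(index: Int, lines: Iterator[String]): Iterator[String] =
    if (index == 0) lines.drop(1) else lines

  def main(args: Array[String]): Unit = {
    // Assumption: the input path is given as the first argument.
    val inpFile = args(0)

    // Assumption: local master, for trying this out on one machine.
    val conf = new SparkConf()
      .setAppName("DropHeaderSketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    try {
      val lines: RDD[String] = sc.textFile(inpFile)

      // The header is still available here if the column names are needed.
      val header: String = lines.first()

      // Everything except the header line.
      val data: RDD[String] =
        lines.mapPartitionsWithIndex((idx, iter) => dropHeader(idx, iter))

      println("header: " + header)
      data.take(5).foreach(println) // rest of the processing would go here
    } finally {
      sc.stop()
    }
  }
}
```

With several input files, each file would carry its own header and only the one in partition 0 would be skipped; in that case, filtering out lines equal to the header may be the simpler option, assuming no data line is identical to it.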