Re: Fwd: Spark SQL: ArrayIndexOutofBoundsException

SK Thu, 02 Oct 2014 16:05:13 -0700

Thanks for the help. Yes, I had not realized that the header line (the first
line of the file) uses a different separator.


By the way, is there a way to drop the first line that contains the header?
Something along the following lines:

      sc.textFile(inp_file)
          .drop(1)  // or tail() to drop the header line 
          .map....  // rest of the processing 

I could not find a drop() function, or any way to take the bottom n elements,
for an RDD. Alternatively, a way to create the case class schema from the
header line of the file and use the remaining lines as the data would be
useful, just as a suggestion. Currently I am just deleting the header line
manually before processing the file in Spark.
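
For reference, one workaround that is often suggested for this is to drop the
first line of the first partition with mapPartitionsWithIndex, since the
header of a single text file ends up at the start of partition 0. The
following is only a sketch under assumptions: the object name DropHeader, the
helper dropHeader, the fallback path "input.csv" and the "," separator are all
hypothetical and not taken from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DropHeader {
  // Drop the first line of partition 0 only; all other partitions pass through.
  // Assumption: the input is a single file, so its header is the first
  // element of the first partition.
  def dropHeader(lines: RDD[String]): RDD[String] =
    lines.mapPartitionsWithIndex { (idx, iter) =>
      if (idx == 0) iter.drop(1) else iter
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DropHeader").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical input path; pass the real one as the first argument.
      val inpFile = args.headOption.getOrElse("input.csv")
      val data = dropHeader(sc.textFile(inpFile))
      // Assumed separator for the data lines; adjust to the actual file.
      data.map(_.split(",")).take(5).foreach(r => println(r.mkString("|")))
    } finally {
      sc.stop()
    }
  }
}
```

If the input is a glob or directory of several files that each carry a header,
this only removes the header of the first one. In that case, reading the
header with first() and filtering out lines equal to it may be a better fit,
at the cost of also dropping any data line identical to the header.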


thanks





