You can filter out the header with startsWith.

On Thu, Oct 2, 2014 at 4:04 PM, SK <skrishna...@gmail.com> wrote:
> Thanks for the help. Yes, I did not realize that the first header line
> has a different separator.
>
> By the way, is there a way to drop the first line that contains the
> header? Something along the following lines:
>
>     sc.textFile(inp_file)
>       .drop(1)   // or tail() to drop the header line
>       .map ...   // rest of the processing
>
> I could not find a drop() function, or a way to take the bottom (n)
> elements of an RDD. Alternatively, a way to create the case class schema
> from the header line of the file and use the rest for the data would be
> useful - just as a suggestion. Currently I am just deleting this header
> line manually before processing the file in Spark.
>
> thanks
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-ArrayIndexOutofBoundsException-tp15639p15642.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
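The filter-with-startsWith approach suggested above can be sketched like this; it is untested against a live cluster, and `inp_file` and the `"name,"` header prefix are placeholders for your own file and header:

```scala
// Sketch, assuming a SparkContext `sc` and a CSV whose header starts
// with "name," — adjust the prefix to match your actual header line.

// Option 1: filter out any line that looks like the header.
val data = sc.textFile(inp_file)
  .filter(line => !line.startsWith("name,"))

// Option 2: drop only the first line of the first partition, which is
// where the header sits for an uncompressed, unsplit-header text file.
val data2 = sc.textFile(inp_file)
  .mapPartitionsWithIndex { (idx, iter) =>
    if (idx == 0) iter.drop(1) else iter
  }
```

The filter variant is simpler but scans every line and would also drop data lines that happen to start with the same prefix; the `mapPartitionsWithIndex` variant removes exactly one line and avoids a full comparison pass.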