Spark can definitely process data with optional fields. How you handle them
depends on what you want to do with the results; it's more of an object
design / knowing-your-Scala-types question.
E.g., Scala has a built-in type, Option, specifically for handling optional
data, and it works nicely with pattern matching.
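
To make the Option suggestion concrete, here is a minimal sketch in plain Scala. The Person record, its field layout (name, age, optional email) and the helper names are hypothetical and only for illustration; they are not taken from the original question.

```scala
object OptionalFields {
  // Hypothetical record: name and age are required, email is optional.
  case class Person(name: String, age: Int, email: Option[String])

  // Parse one comma-separated line. A missing or empty third field becomes None.
  // The limit of -1 makes split keep trailing empty fields.
  def parse(line: String): Person = {
    val fields = line.split(",", -1).map(_.trim)
    val email =
      if (fields.length > 2 && fields(2).nonEmpty) Some(fields(2)) else None
    Person(fields(0), fields(1).toInt, email)
  }

  // Pattern matching forces both cases (present / absent) to be handled.
  def describe(p: Person): String = p.email match {
    case Some(addr) => s"${p.name} can be reached at $addr"
    case None       => s"${p.name} has no email on record"
  }
}
```

A function like parse could then be passed to map over the lines of the input, so that every record has the same type no matter how many values its line had.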
Yes, I think you first need a map step that computes the number of values
in every line. Then you can group all the records that have the same number
of values. Once they are grouped, you know how many kinds of arrays you have.
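
The idea is easiest to see on an ordinary Scala collection first. This is only a sketch: fieldCount and groupByFieldCount are made-up names, and it assumes a plain comma-separated format with no quoted commas.

```scala
object GroupByFieldCount {
  // Number of comma-separated values in a line.
  // The limit of -1 keeps trailing empty fields, so "a,b," counts as 3.
  def fieldCount(line: String): Int = line.split(",", -1).length

  // Map from "number of values" to the lines that have that many values.
  def groupByFieldCount(lines: Seq[String]): Map[Int, Seq[String]] =
    lines.groupBy(fieldCount)
}
```

The keys of the resulting map are the distinct line lengths, i.e. how many kinds of arrays there are. On an RDD the same shape should be expressible with its own groupBy or keyBy using the same fieldCount function.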
val dataRDD = sc.textFile("file.csv")
val dataLengthRDD = dataRDD