Hi all,
I am a newbie Spark user with many doubts, so sorry if this is a silly
question.
I am dealing with tabular data formatted as text files, so when I first
load the data, my code is like this:
case class data_class(
  V1: String,
  V2: String,
  V3: String,
  V4: String,
  V5: String)
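For context, here is a minimal sketch of how a case class like this is typically used to load a delimited text file into an RDD. The file name, the comma delimiter, and the assumption of exactly five columns per line are all made up for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

case class DataRow(
  V1: String, V2: String, V3: String, V4: String, V5: String)

object LoadExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("load-example").setMaster("local[*]"))
    // Assumed: comma-delimited file, five fields per line.
    val rows = sc.textFile("data.txt")
      .map(_.split(","))
      .map(a => DataRow(a(0), a(1), a(2), a(3), a(4)))
    rows.take(5).foreach(println)
    sc.stop()
  }
}
```

The case class gives every row a fixed, typed schema, which is what the class-based approach discussed below relies on.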
Thank you for your fast reply.
We are considering the Map[String, String] solution, but there are some
details we have not worked out yet. What would happen if we had different
data types for different fields? Also, with this solution we have to
repeat the field names for every row that we store.
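To illustrate both concerns (field names and values here are invented): with a Map-based row, every record carries its own copy of the key strings, and a field that is really numeric either stays a String and must be parsed at every access, or forces a Map[String, Any] with casts:

```scala
// Map-based rows: the keys "V1"/"V2" are repeated in every record.
val row1 = Map("V1" -> "alice", "V2" -> "42")
val row2 = Map("V1" -> "bob",   "V2" -> "17")

// V2 is conceptually an Int, but the map stores it as a String,
// so it has to be parsed at every use:
val total = Seq(row1, row2).map(_("V2").toInt).sum  // 59
```

A case class avoids both problems: the field names live once in the class definition, and each field has its own static type.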
If you intern the strings it will be more efficient, but still significantly
more expensive than the class-based approach.
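A small sketch of what interning the key strings means on the JVM: interned strings resolve to one canonical String object, so the per-row maps can share a single copy of each key instead of allocating duplicates. Each map entry still carries more overhead than a case class field, which is the residual cost mentioned above.

```scala
// Two distinct String objects with the same contents:
val k1 = new String("V1").intern()
val k2 = new String("V1").intern()

// After intern(), both references point to the same canonical object,
// so `eq` (reference equality) holds:
println(k1 eq k2)  // true
```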
** VERY EXPERIMENTAL **
We are working with EPFL on a lightweight syntax for naming the results of
Spark transformations in Scala (and are going to make it interoperate with