I have a set of CSV that I need to perform ETL on, with the plan to re-use a lot of code between each file in a parent abstract class.
I tried creating the following simple abstract class that will have a parameterized type of a case class that represents the schema being read in. This won't compile, it just complains about not being able to find an encoder, but I'm importing the implicits and don't believe this error. scala> import spark.implicits._ import spark.implicits._ scala> scala> case class RawTemp(f1: String, f2: String, temp: Long, created_at: java.sql.Timestamp, data_filename: String) defined class RawTemp scala> scala> abstract class RawTable[A](inDir: String) { | def load() = { | spark.read | .option("header", "true") | .option("mode", "FAILFAST") | .option("escape", "\"") | .option("nullValue", "") | .option("indferSchema", "true") | .csv(inDir) | .as[A] | } | } <console>:27: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. .as[A] scala> class TempTable extends RawTable[RawTemp]("/user/drake/t.csv") <console>:13: error: not found: type RawTable class TempTable extends RawTable[RawTemp]("/user/drake/t.csv") ^ What's odd is that this output looks okay: scala> val RTEncoder = Encoders.product[RawTemp] RTEncoder: org.apache.spark.sql.Encoder[RawTemp] = class[f1[0]: string, f2[0]: string, temp[0]: bigint, created_at[0]: timestamp, data_filename[0]: string] scala> RTEncoder.schema res4: org.apache.spark.sql.types.StructType = StructType(StructField(f1,StringType,true), StructField(f2,StringType,true), StructField(temp,LongType,false), StructField(created_at,TimestampType,true), StructField(data_filename,StringType,true)) scala> RTEncoder.clsTag res5: scala.reflect.ClassTag[RawTemp] = RawTemp Any ideas? -- Donald Drake Drake Consulting http://www.drakeconsulting.com/ https://twitter.com/dondrake <http://www.MailLaunder.com/> 800-733-2143