Ah, sorry, I should have read this more carefully. Do you mind sharing your code so I can test it?
I would like to reproduce it. I just tested this myself but couldn't reproduce the problem, as shown below (is this what you're doing?):

import java.sql.Date
import org.apache.spark.sql.Dataset

case class ClassData(a: String, b: Date)

val ds: Dataset[ClassData] = Seq(
  ("a", Date.valueOf("1990-12-13")),
  ("a", Date.valueOf("1990-12-13")),
  ("a", Date.valueOf("1990-12-13"))
).toDF("a", "b").as[ClassData]

ds.write.csv("/tmp/data.csv")
spark.read.csv("/tmp/data.csv").show()

prints as below:

+---+----+
|_c0| _c1|
+---+----+
|  a|7651|
|  a|7651|
|  a|7651|
+---+----+

2016-08-19 9:27 GMT+09:00 Efe Selcuk <efema...@gmail.com>:

> Thanks for the response. The problem with that thought is that I don't
> think I'm dealing with a complex nested type. It's just a dataset where
> every record is a case class with only simple types as fields, strings and
> dates. There's no nesting.
>
> That's what confuses me about how it's interpreting the schema. The schema
> seems to be one complex field rather than a bunch of simple fields.
>
> On Thu, Aug 18, 2016, 5:07 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
>> Hi Efe,
>>
>> If my understanding is correct, writing/reading complex types is not
>> supported because the CSV format can't represent nested types.
>>
>> I guess supporting them when writing via the external CSV library is
>> rather a bug.
>>
>> It would be great if we could write nested types to CSV and read them
>> back, but I don't think we can.
>>
>> Thanks!
>>
>> On 19 Aug 2016 6:33 a.m., "Efe Selcuk" <efema...@gmail.com> wrote:
>>
>>> We have an application working in Spark 1.6. It uses the Databricks CSV
>>> library as the output format when writing out.
>>>
>>> I'm attempting an upgrade to Spark 2. Whether I write with the native
>>> DataFrameWriter#csv() method or by first specifying the
>>> "com.databricks.spark.csv" format (I suspect the underlying format is the
>>> same, but I don't know how to verify that), I get the following error:
>>>
>>> java.lang.UnsupportedOperationException: CSV data source does not
>>> support struct<[bunch of field names and types]> data type
>>>
>>> There are 20 fields, mostly plain strings with a couple of dates. The
>>> source object is a Dataset[T] where T is a case class with various fields.
>>> The write call just looks like: someDataset.write.csv(outputPath)
>>>
>>> Googling turned up this fairly recent pull request:
>>> https://mail-archives.apache.org/mod_mbox/spark-commits/201605.mbox/%3c65d35a72bd05483392857098a2635...@git.apache.org%3E
>>>
>>> If I'm reading that correctly, the schema shows that each record has one
>>> field of this complex struct type, and the validation thinks it's something
>>> it can't serialize. I would expect the schema to have a bunch of fields in
>>> it matching the case class, so maybe there's something I'm
>>> misunderstanding.
>>>
>>> Efe
>>>
>>
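For anyone hitting the same error, here is a minimal sketch (not from the thread; the case class, path, and app name are hypothetical stand-ins) of checking with printSchema() whether Spark sees the case class fields as flat top-level columns or as a single struct column before writing to CSV:

import java.sql.Date
import org.apache.spark.sql.SparkSession

// Hypothetical stand-in for the 20-field case class described above.
case class Record(name: String, created: Date)

val spark = SparkSession.builder().master("local[*]").appName("csv-schema-check").getOrCreate()
import spark.implicits._

val ds = Seq(
  Record("a", Date.valueOf("1990-12-13")),
  Record("b", Date.valueOf("1991-01-01"))
).toDS()

// Expect two top-level fields here (name, created), not one struct column.
// A single struct column is what triggers the UnsupportedOperationException on write.
ds.printSchema()

ds.write.mode("overwrite").csv("/tmp/record_csv")

If printSchema() shows a single struct-typed column instead of flat fields, that would match the schema Efe describes and explain why the CSV writer rejects it.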