Re: [Spark2] Error writing "complex" type to CSV

2016-08-22 Thread Hyukjin Kwon
Whether it writes the data as garbage or as a string representation, the output cannot be loaded back. So I'd say both behaviors are wrong and are bugs. I think it'd be great if we could write and read back CSV in its own format, but I guess we can't for now.

2016-08-20 2:54 GMT+09:00 Efe Selcuk :

Re: [Spark2] Error writing "complex" type to CSV

2016-08-19 Thread Efe Selcuk
Okay so this is partially PEBKAC. I just noticed that there's a debugging field at the end that's another case class with its own simple fields - *that's* the struct that was showing up in the error, not the entry itself. This raises a different question. What has changed that this is no longer
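
For readers who hit the same situation (a case class field that is itself a case class, which Spark sees as a struct column), one possible workaround is to flatten the struct into top-level columns before writing CSV. This is only a minimal sketch: the `Entry` and `DebugInfo` classes, their field names, and the output path are hypothetical, since the thread does not show the real types.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record types; the real classes are not shown in the thread.
case class DebugInfo(source: String, attempt: Int)
case class Entry(name: String, created: java.sql.Date, debug: DebugInfo)

object FlattenBeforeCsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-before-csv")
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(
      Entry("a", java.sql.Date.valueOf("2016-08-18"), DebugInfo("job-1", 1))
    ).toDS()

    // `debug` is a struct column in the schema; pull its fields up to
    // top-level columns so every column written to CSV is a simple type.
    val flat = ds.select(
      $"name",
      $"created",
      $"debug.source".as("debug_source"),
      $"debug.attempt".as("debug_attempt")
    )

    // Assumed output location; overwrite so the sketch can be re-run.
    flat.write.mode("overwrite").option("header", "true").csv("/tmp/entries_csv")

    spark.stop()
  }
}
```

Calling `ds.printSchema()` before the `select` is a quick way to check whether any column is a struct in the first place.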

Re: [Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Hyukjin Kwon
Ah, BTW, there is an issue, SPARK-16216, about printing dates and timestamps here. So please ignore the integer values for dates.

2016-08-19 9:54 GMT+09:00 Hyukjin Kwon :
> Ah, sorry, I should have read this carefully. Do you mind if I ask your
> codes to test?
>
> I would

Re: [Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Hyukjin Kwon
Ah, sorry, I should have read this carefully. Would you mind sharing your code so I can test it? I would like to reproduce this. I just tested it myself but couldn't reproduce it, as below (this is what you're doing, right?):

case class ClassData(a: String, b: Date)
val ds: Dataset[ClassData] = Seq(

Re: [Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Efe Selcuk
Thanks for the response. The problem with that thought is that I don't think I'm dealing with a complex nested type. It's just a dataset where every record is a case class with only simple types as fields, strings and dates. There's no nesting. That's what confuses me about how it's interpreting

Re: [Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Hyukjin Kwon
Hi Efe, If my understanding is correct, writing and reading complex types is not supported because the CSV format can't represent nested types. I guess the fact that the external CSV library supports writing them is rather a bug. I think it'd be great if we can write and read back CSV in

[Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Efe Selcuk
We have an application working in Spark 1.6. It uses the databricks csv library for the output format when writing out. I'm attempting an upgrade to Spark 2. When writing with both the native DataFrameWriter#csv() method and with first specifying the "com.databricks.spark.csv" format (I suspect