Hi everyone, I'm writing a Scala program that uses Spark CSV <https://github.com/databricks/spark-csv> to read CSV files from a directory. After reading the CSVs as data frames I need to convert them to Avro format, since I eventually need to turn that data into a GenericRecord <https://avro.apache.org/docs/1.7.6/api/java/org/apache/avro/generic/GenericData.Record.html> for further processing. I know I can call toJSON on the DF and get valid JSON, but I need the output to be Avro-compliant (some of these CSV files have nullable fields, which require special handling <http://stackoverflow.com/questions/27485580/how-to-fix-expected-start-union-got-value-number-int-when-converting-json-to-av> when converting to Avro).
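For context, here is a minimal sketch of the approach I've been considering: read the directory with spark-csv and write it back out with the Databricks spark-avro package <https://github.com/databricks/spark-avro>, which maps nullable DataFrame columns to Avro union types ["null", <type>]. The paths are placeholders, and this assumes both packages are on the classpath (e.g. via --packages):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import com.databricks.spark.avro._

object CsvToAvro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("csv-to-avro"))
    val sqlContext = new SQLContext(sc)

    // Read every CSV in the directory; with inferSchema the columns come
    // back nullable, which spark-avro represents as Avro unions.
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("/path/to/csv/dir") // placeholder path

    // Write as Avro; nullable DataFrame fields become ["null", <type>]
    // unions in the generated Avro schema.
    df.write.avro("/path/to/avro/out") // placeholder path
  }
}
```

The resulting Avro files could then be read back with Avro's GenericDatumReader to obtain GenericRecords, but that still goes through disk rather than converting the rows in memory, which is what I'd prefer to avoid.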
Does anyone have an idea how this can be done without normalizing all the fields myself? Thanks in advance -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Converting-CSV-files-to-Avro-tp25985.html Sent from the Apache Spark User List mailing list archive at Nabble.com.