Hi everyone,

I'm writing a Scala program that uses Spark CSV
(https://github.com/databricks/spark-csv) to read CSV files from a
directory. After reading the CSVs as DataFrames, I need to convert them to
Avro format, since I eventually need to turn that data into a GenericRecord
(https://avro.apache.org/docs/1.7.6/api/java/org/apache/avro/generic/GenericData.Record.html)
for further processing.
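
For context, here is roughly how I'm reading the files today (the
directory path and app name below are just placeholders):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext

  val sc = new SparkContext(new SparkConf().setAppName("csv-to-avro"))
  val sqlContext = new SQLContext(sc)

  // Read every CSV in the directory as one DataFrame via spark-csv
  val df = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true")       // first line of each file is the header
    .option("inferSchema", "true")  // infer column types; some come out nullable
    .load("/data/input/")           // placeholder path to the CSV directory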
I know I can call toJSON on the DataFrame and get valid JSON, but I need
the output to be Avro-compliant: some fields in these CSV files are
nullable, and nullable fields need special handling when converting JSON
to Avro (see
http://stackoverflow.com/questions/27485580/how-to-fix-expected-start-union-got-value-number-int-when-converting-json-to-av).
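
To make the problem concrete, this is the gap I run into (the column name
"age" is just an example):

  // toJSON returns one JSON string per row (an RDD[String] in Spark 1.x)
  val jsonRows = df.toJSON

  // For a nullable int column, a row comes out as plain JSON, e.g.
  //   {"age": 42}   or   {"age": null}
  // but Avro's JSON decoder for a ["null", "int"] union expects the
  // tagged form
  //   {"age": {"int": 42}}
  // which is what the linked Stack Overflow question is about.
  jsonRows.take(3).foreach(println)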

Does anyone have an idea how this can be done, short of normalizing all
the fields myself?

Thanks in advance 



