I recently worked a bit with datasources and Parquet in Spark, and someone asked me to write an XML datasource plugin. So I did this:
https://github.com/HyukjinKwon/spark-xml

It tries to get rid of the in-line format, just like the JSON datasource in Spark. Although I haven't added a CI tool for it yet, it appears to work for the test code and rough use cases. Maybe you can try it :).

Thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Parsing-a-large-XML-file-using-Spark-tp19239p25272.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.