+1

On Fri, Jul 28, 2023 at 15:54 Sean Owen <sro...@gmail.com> wrote:
> +1 I think that porting the package 'as is' into Spark is probably
> worthwhile.
> That's relatively easy; the code is already pretty battle-tested and not
> that big and even originally came from Spark code, so is more or less
> similar already.
>
> One thing it never got was DSv2 support, which means XML reading would
> still be somewhat behind other formats. (I was not able to implement it.)
> This isn't a necessary goal right now, but would be possibly part of the
> logic of moving it into the Spark code base.
>
> On Fri, Jul 28, 2023 at 5:38 PM Sandip Agarwala
> <sandip.agarw...@databricks.com.invalid> wrote:
>
>> Dear Spark community,
>>
>> I would like to start the vote for "SPIP: XML data source support".
>>
>> XML is a widely used data format. An external spark-xml package (
>> https://github.com/databricks/spark-xml) is available to read and write
>> XML data in spark. Making spark-xml built-in will provide a better user
>> experience for Spark SQL and structured streaming. The proposal is to
>> inline code from the spark-xml package.
>>
>> SPIP link:
>>
>> https://docs.google.com/document/d/1ZaOBT4-YFtN58UCx2cdFhlsKbie1ugAn-Fgz_Dddz-Q/edit?usp=sharing
>>
>> JIRA:
>> https://issues.apache.org/jira/browse/SPARK-44265
>>
>> Discussion Thread:
>> https://lists.apache.org/thread/q32hxgsp738wom03mgpg9ykj9nr2n1fh
>>
>> Please vote on the SPIP for the next 72 hours:
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because __.
>>
>> Thanks, Sandip