+ 1
> On Jul 29, 2023, at 13:06, Adrian Pop-Tifrea <poptifreaadr...@gmail.com> wrote:
>
> +1, the more data source formats, the better, and if the solution is already
> thoroughly tested, I say we should go for it.
>
> On Sat, Jul 29, 2023, 06:35 Xiao Li <gatorsm...@gmail.com> wrote:
>> +1
>>
>> On Fri, Jul 28, 2023 at 15:54 Sean Owen <sro...@gmail.com> wrote:
>>> +1 I think that porting the package 'as is' into Spark is probably
>>> worthwhile.
>>> That's relatively easy; the code is already pretty battle-tested and not
>>> that big and even originally came from Spark code, so is more or less
>>> similar already.
>>>
>>> One thing it never got was DSv2 support, which means XML reading would
>>> still be somewhat behind other formats. (I was not able to implement it.)
>>> This isn't a necessary goal right now, but would be possibly part of the
>>> logic of moving it into the Spark code base.
>>>
>>> On Fri, Jul 28, 2023 at 5:38 PM Sandip Agarwala
>>> <sandip.agarw...@databricks.com.invalid> wrote:
>>>> Dear Spark community,
>>>>
>>>> I would like to start the vote for "SPIP: XML data source support".
>>>>
>>>> XML is a widely used data format. An external spark-xml package
>>>> (https://github.com/databricks/spark-xml) is available to read and write
>>>> XML data in Spark. Making spark-xml built-in will provide a better user
>>>> experience for Spark SQL and Structured Streaming. The proposal is to
>>>> inline code from the spark-xml package.
>>>>
>>>> SPIP link:
>>>> https://docs.google.com/document/d/1ZaOBT4-YFtN58UCx2cdFhlsKbie1ugAn-Fgz_Dddz-Q/edit?usp=sharing
>>>>
>>>> JIRA:
>>>> https://issues.apache.org/jira/browse/SPARK-44265
>>>>
>>>> Discussion Thread:
>>>> https://lists.apache.org/thread/q32hxgsp738wom03mgpg9ykj9nr2n1fh
>>>>
>>>> Please vote on the SPIP for the next 72 hours:
>>>> [ ] +1: Accept the proposal as an official SPIP
>>>> [ ] +0
>>>> [ ] -1: I don’t think this is a good idea because __.
>>>>
>>>> Thanks, Sandip