Re: Spark Docker Official Image is now available

2023-07-19 Thread Ruifeng Zheng
Awesome, thank you Yikun for driving this!
On Thu, Jul 20, 2023 at 9:12 AM Hyukjin Kwon wrote:
> This is amazing, finally!
>
> On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote:
>
>> The spark Docker Official Image is now available:
>> https://hub.docker.com/_/spark
>>
>> $ docker run -it --rm

Re: Spark Docker Official Image is now available

2023-07-19 Thread Hyukjin Kwon
This is amazing, finally!
On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote:
> The spark Docker Official Image is now available:
> https://hub.docker.com/_/spark
>
> $ docker run -it --rm spark /opt/spark/bin/spark-shell
> $ docker run -it --rm spark:python3 /opt/spark/bin/pyspark
> $ docker

Spark Docker Official Image is now available

2023-07-19 Thread Yikun Jiang
The spark Docker Official Image is now available:
https://hub.docker.com/_/spark
$ docker run -it --rm spark /opt/spark/bin/spark-shell
$ docker run -it --rm spark:python3 /opt/spark/bin/pyspark
$ docker run -it --rm spark:r /opt/spark/bin/sparkR
We had a longer review journey than we

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Maciej
That's a great idea, as long as we can keep additional dependencies under control.
Best regards,
Maciej Szymkiewicz
Web: https://zero323.net
PGP: A30CEF0C31A501EC
On 7/19/23 18:22, Franco Patano wrote:
+1 Many people have struggled with incorporating this separate library into their Spark

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Franco Patano
+1 Many people have struggled with incorporating this separate library into their Spark pipelines.
On Wed, Jul 19, 2023 at 10:53 AM Burak Yavuz wrote:
> +1 on adding to Spark. Community involvement will make the XML reader
> better.
>
> Best,
> Burak
>
> On Wed, Jul 19, 2023 at 3:25 AM Martin

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Burak Yavuz
+1 on adding to Spark. Community involvement will make the XML reader better.
Best,
Burak
On Wed, Jul 19, 2023 at 3:25 AM Martin Andersson wrote:
> Alright, makes sense to add it then.
> --
> From: Hyukjin Kwon
> Sent: Wednesday, July 19, 2023 11:01
> To:

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Martin Andersson
Alright, makes sense to add it then.
From: Hyukjin Kwon
Sent: Wednesday, July 19, 2023 11:01
To: Martin Andersson
Cc: Sandip Agarwala; dev@spark.apache.org
Subject: Re: [DISCUSS] SPIP: XML data source support

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Hyukjin Kwon
Here are the benefits of having it as a built-in source:
- We can leverage the community to improve the Spark XML (not within Databricks repositories).
- We can share the same core for XML expressions (e.g., from_xml and to_xml like from_csv, from_json, etc.).
- It is more to

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Martin Andersson
How much of an effort is it to use the spark-xml library today? What's the drawback to keeping this as an external library as-is?
Best Regards,
Martin
From: Hyukjin Kwon
Sent: Wednesday, July 19, 2023 01:27
To: Sandip Agarwala
Cc: dev@spark.apache.org
Subject: