Re: Spark Docker Official Image is now available

2023-07-20 Thread Dongjoon Hyun
Thank you! Dongjoon On Thu, Jul 20, 2023 at 8:40 AM Xiao Li wrote: > Thank you, Yikun! This is great! > > On Wed, Jul 19, 2023 at 7:55 PM Ruifeng Zheng wrote: > >> Awesome, thank you YiKun for driving this! >> >> On Thu, Jul 20, 2023 at 9:12 AM Hyukjin Kwon >> wrote: >> >>> This is amazing,

Re: Spark Docker Official Image is now available

2023-07-20 Thread Xiao Li
Thank you, Yikun! This is great! On Wed, Jul 19, 2023 at 7:55 PM Ruifeng Zheng wrote: > Awesome, thank you YiKun for driving this! > > On Thu, Jul 20, 2023 at 9:12 AM Hyukjin Kwon wrote: > >> This is amazing, finally! >> >> On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote: >> >>> The spark

Re: Spark Docker Official Image is now available

2023-07-19 Thread Ruifeng Zheng
Awesome, thank you YiKun for driving this! On Thu, Jul 20, 2023 at 9:12 AM Hyukjin Kwon wrote: > This is amazing, finally! > > On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote: > >> The spark Docker Official Image is now available: >> https://hub.docker.com/_/spark >> >> $ docker run -it --rm

Re: Spark Docker Official Image is now available

2023-07-19 Thread Hyukjin Kwon
This is amazing, finally! On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote: > The spark Docker Official Image is now available: > https://hub.docker.com/_/spark > > $ docker run -it --rm *spark* /opt/spark/bin/spark-shell > $ docker run -it --rm *spark*:python3 /opt/spark/bin/pyspark > $ docker

Spark Docker Official Image is now available

2023-07-19 Thread Yikun Jiang
The spark Docker Official Image is now available: https://hub.docker.com/_/spark $ docker run -it --rm *spark* /opt/spark/bin/spark-shell $ docker run -it --rm *spark*:python3 /opt/spark/bin/pyspark $ docker run -it --rm *spark*:r /opt/spark/bin/sparkR We had a longer review journey than we

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Maciej
That's a great idea, as long as we can keep additional dependencies under control. Best regards, Maciej Szymkiewicz Web: https://zero323.net PGP: A30CEF0C31A501EC On 7/19/23 18:22, Franco Patano wrote: +1 Many people have struggled with incorporating this separate library into their Spark

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Franco Patano
+1 Many people have struggled with incorporating this separate library into their Spark pipelines. On Wed, Jul 19, 2023 at 10:53 AM Burak Yavuz wrote: > +1 on adding to Spark. Community involvement will make the XML reader > better. > > Best, > Burak > > On Wed, Jul 19, 2023 at 3:25 AM Martin

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Burak Yavuz
+1 on adding to Spark. Community involvement will make the XML reader better. Best, Burak On Wed, Jul 19, 2023 at 3:25 AM Martin Andersson wrote: > Alright, makes sense to add it then. > -- > *From:* Hyukjin Kwon > *Sent:* Wednesday, July 19, 2023 11:01 > *To:*

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Martin Andersson
Alright, makes sense to add it then. From: Hyukjin Kwon Sent: Wednesday, July 19, 2023 11:01 To: Martin Andersson Cc: Sandip Agarwala ; dev@spark.apache.org Subject: Re: [DISCUSS] SPIP: XML data source support EXTERNAL SENDER. Do not click links or open

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Hyukjin Kwon
Here are the benefits of having it as a built-in source: - We can leverage the community to improve the Spark XML (not within Databricks repositories). - We can share the same core for XML expressions (e.g., from_xml and to_xml like from_csv, from_json, etc.). - It is more to

Re: [DISCUSS] SPIP: XML data source support

2023-07-19 Thread Martin Andersson
How much of an effort is it to use the spark-xml library today? What's the drawback to keeping this as an external library as-is? Best Regards, Martin From: Hyukjin Kwon Sent: Wednesday, July 19, 2023 01:27 To: Sandip Agarwala Cc: dev@spark.apache.org Subject:

Re: [DISCUSS] SPIP: XML data source support

2023-07-18 Thread Hyukjin Kwon
Yeah I support this. XML is a pretty outdated format TBH but is still used in many legacy systems. For example, the Wikipedia dump is one case. Even when you look at stats for CSV vs XML vs JSON, some show that XML is used more than CSV. On Wed, Jul 19, 2023 at 12:58 AM Sandip Agarwala <

Re: Spark Scala SBT Local build fails

2023-07-18 Thread Varun Shah
++ DEV community On Mon, Jul 17, 2023 at 4:14 PM Varun Shah wrote: > Resending this message with a proper Subject line > > Hi Spark Community, > > I am trying to set up my forked apache/spark project locally for my 1st > Open Source Contribution, by building and creating a package as mentioned

Re: Spark 3.5 Branch Cut

2023-07-17 Thread Yuanjian Li
Further reminder for the release timeline:
Date: Event
July 17th 2023: Code freeze. Release branch cut.
Late July 2023: QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.
August 2023: Release candidates (RC), voting, etc. until final release passes
Please

Re: Spark 3.5 Branch Cut

2023-07-17 Thread Raghu Angadi
Thanks Yuanjian for accepting these for warmfix. Raghu. On Mon, Jul 17, 2023 at 1:04 PM Yuanjian Li wrote: > Hi, all > > FYI, I cut branch-3.5 as https://github.com/apache/spark/tree/branch-3.5 > > Here is the complete list of exception merge requests received before the > cut: > >- > >

Re: Spark 3.5 Branch Cut

2023-07-17 Thread Dongjoon Hyun
Thank you so much, Yuanjian! Dongjoon. On Mon, Jul 17, 2023 at 1:05 PM Yuanjian Li wrote: > Hi, all > > FYI, I cut branch-3.5 as https://github.com/apache/spark/tree/branch-3.5 > > Here is the complete list of exception merge requests received before the > cut: > >- > >SPARK-44421:

Spark 3.5 Branch Cut

2023-07-17 Thread Yuanjian Li
Hi, all FYI, I cut branch-3.5 as https://github.com/apache/spark/tree/branch-3.5 Here is the complete list of exception merge requests received before the cut: - SPARK-44421: Reattach to existing execute in Spark Connect (server mechanism) - SPARK-44423: Reattach to existing

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-16 Thread Herman van Hovell
Hi Yuanjian, For the ongoing encoder work for the connect scala client I'd like to get the following tickets in: - SPARK-44396 : Direct Arrow Deserialization - SPARK-9 :

Re: Data Contracts

2023-07-16 Thread Phillip Henry
No worries. Have you had a chance to look at it? Since this thread has gone dead, I assume there is no appetite for adding data contract functionality..? Regards, Phillip On Mon, 19 Jun 2023, 11:23 Deepak Sharma, wrote: > Sorry for using simple in my last email. > It’s not going to be

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-15 Thread Enrico Minack
Speaking of JdbcDialect, is there any interest in getting upserts for JDBC into 3.5.0? [SPARK-19335][SPARK-38200][SQL] Add upserts for writing to JDBC: https://github.com/apache/spark/pull/41518 [SPARK-19335][SPARK-38200][SQL] Add upserts for writing to JDBC using MERGE INTO with temp table:

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-14 Thread Jia Fan
Can we put [SPARK-44262][SQL] Add `dropTable` and `getInsertStatement` to JdbcDialect into 3.5.0? https://github.com/apache/spark/pull/41855 Since this is the last major version update of 3.x, I think we need to make sure JdbcDialect can support more databases. Gengliang Wang on Sat, Jul 15, 2023

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-14 Thread Gengliang Wang
Hi Yuanjian, Besides the abovementioned changes, it would be great to include the UI page for Spark Connect: SPARK-44394 . Best Regards, Gengliang On Fri, Jul 14, 2023 at 11:44 AM Julek Sompolski wrote: > Thank you, > My changes that you

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-14 Thread Julek Sompolski
Thank you, My changes that you listed are tracked under this Epic: https://issues.apache.org/jira/browse/SPARK-43754 I am also working on https://issues.apache.org/jira/browse/SPARK-44422, didn't mention it before because I have hopes that this one will make it before the cut. (Unrelated) My

Re: [Reminder] Spark 3.5 Branch Cut

2023-07-14 Thread Raghu Angadi
Thank you. We plan to get remaining major pieces for Streaming Spark Connect (Epic SPARK-42938 ). I would like to request a warmfix exception for the following tweaks and improvements over the next two weeks (all in the same epic). -

Re: Time for Spark v3.5.0 release

2023-07-14 Thread Yuanjian Li
Thanks for raising all the requests. Let's stick to the previously agreed branch cut time. Based on past practice, let's label the above requests as exception features. I have just sent out a branch cut reminder titled "[Reminder] Spark 3.5 Branch Cut." Please ensure that all your requests are

[Reminder] Spark 3.5 Branch Cut

2023-07-14 Thread Yuanjian Li
Hi everyone, As discussed earlier in "Time for Spark v3.5.0 release", I will cut branch-3.5 on *Monday, July 17th at 1 pm PST* as scheduled. Please plan your PR merge accordingly with the given timeline. Currently, we have received the following exception merge requests: - SPARK-44421:

Re: Time for Spark v3.5.0 release

2023-07-14 Thread Julek Sompolski
I am working on SPARK-44421, SPARK-44423 and SPARK-44424 in Spark Connect to support execution reconnection. A week or two of warmfix grace period would be much appreciated for this work. Best regards, Juliusz Sompolski On Fri, Jul 14, 2023 at 5:40 PM Raghu Angadi wrote: > We have a bunch of

Re: Time for Spark v3.5.0 release

2023-07-14 Thread Raghu Angadi
We have a bunch of work in progress for Spark Connect trying to meet the branch cut deadline. Moving to 17th is certainly welcome. Is it feasible to extend it by a couple of more days? Alternatively, we could have a relaxed warmfix process for Spark Connect code for a week or two since it does

Unsubscribe

2023-07-13 Thread Dumas Hwang

unsubscribe

2023-07-13 Thread Raffael Bottoli Schemmer
unsubscribe

Re: Apache Arrow integration issue with Spark involving Netty

2023-07-13 Thread Dane Pitkin
I just want to add that there is a Spark Jira issue[1] for upgrading Netty once Arrow v13.0.0 is released this month. [1] https://issues.apache.org/jira/projects/SPARK/issues/SPARK-44212 On Thu, Jul 6, 2023 at 2:25 PM Dane Pitkin wrote: > Hi all, > > The next release of Apache Arrow v13.0.0

Re: [VOTE][RESULT] Python Data Source API

2023-07-11 Thread Mich Talebzadeh
Hi Allison, Great job and thanks for your efforts in driving this. Looking forward to seeing it in action soon! Best Mich Talebzadeh, Solutions Architect/Engineering Lead Palantir Technologies Limited London United Kingdom view my Linkedin profile

[VOTE][RESULT] Python Data Source API

2023-07-10 Thread Allison Wang
The vote passes with 12 +1s (8 binding +1s) and one +0 (binding). (* = binding) +1: - Hyukjin Kwon * - Xiao Li * - Denny Lee - Martin Grund - Mich Talebzadeh - Huaxin Gao * - Holden Karau * - Reynold Xin * - Jungtaek Lim - Ruifeng Zheng * - Takuya Ueshin * - Matei Zaharia * +0: Maciej

Re: [VOTE][SPIP] Python Data Source API

2023-07-10 Thread Jungtaek Lim
Just to be fully sure, SPIP does not cover streaming, but if the performance is not great compared to the JVM based implementation in any way (which I expect so), I don't think it's good to integrate with streaming which targets lower latency. That's the reason I gave +1 although it's not covering

Re: [VOTE][SPIP] Python Data Source API

2023-07-10 Thread Matei Zaharia
+1 > On Jul 10, 2023, at 10:19 AM, Takuya UESHIN wrote: > > +1 > > On Sun, Jul 9, 2023 at 10:05 PM Ruifeng Zheng > wrote: >> +1 >> >> On Mon, Jul 10, 2023 at 8:20 AM Jungtaek Lim > > wrote: >>> +1 >>> >>> On Sat, Jul 8, 2023

Re: [VOTE][SPIP] Python Data Source API

2023-07-10 Thread Takuya UESHIN
+1 On Sun, Jul 9, 2023 at 10:05 PM Ruifeng Zheng wrote: > +1 > > On Mon, Jul 10, 2023 at 8:20 AM Jungtaek Lim > wrote: > >> +1 >> >> On Sat, Jul 8, 2023 at 4:13 AM Reynold Xin >> wrote: >> >>> +1! >>> >>> >>> On Fri, Jul 7 2023 at 11:58 AM, Holden Karau >>> wrote: >>> +1 On

Unsubscribe

2023-07-10 Thread Bode, Meikel
Unsubscribe

Re: [VOTE][SPIP] Python Data Source API

2023-07-09 Thread Ruifeng Zheng
+1 On Mon, Jul 10, 2023 at 8:20 AM Jungtaek Lim wrote: > +1 > > On Sat, Jul 8, 2023 at 4:13 AM Reynold Xin > wrote: > >> +1! >> >> >> On Fri, Jul 7 2023 at 11:58 AM, Holden Karau >> wrote: >> >>> +1 >>> >>> On Fri, Jul 7, 2023 at 9:55 AM huaxin gao >>> wrote: >>> +1 On Fri,

Re: [VOTE][SPIP] Python Data Source API

2023-07-09 Thread Jungtaek Lim
+1 On Sat, Jul 8, 2023 at 4:13 AM Reynold Xin wrote: > +1! > > > On Fri, Jul 7 2023 at 11:58 AM, Holden Karau > wrote: > >> +1 >> >> On Fri, Jul 7, 2023 at 9:55 AM huaxin gao wrote: >> >>> +1 >>> >>> On Fri, Jul 7, 2023 at 8:59 AM Mich Talebzadeh < >>> mich.talebza...@gmail.com> wrote: >>>

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread Reynold Xin
+1! On Fri, Jul 7 2023 at 11:58 AM, Holden Karau < hol...@pigscanfly.ca > wrote: > > +1 > > > On Fri, Jul 7, 2023 at 9:55 AM huaxin gao < huaxin.ga...@gmail.com > wrote: > > > >> +1 >> >> >> On Fri, Jul 7, 2023 at 8:59 AM Mich Talebzadeh < mich.talebza...@gmail.com >> > wrote: >> >>

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread Holden Karau
+1 On Fri, Jul 7, 2023 at 9:55 AM huaxin gao wrote: > +1 > > On Fri, Jul 7, 2023 at 8:59 AM Mich Talebzadeh > wrote: > >> +1 for me >> >> Mich Talebzadeh, >> Solutions Architect/Engineering Lead >> Palantir Technologies Limited >> London >> United Kingdom >> >> >>view my Linkedin profile

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread huaxin gao
+1 On Fri, Jul 7, 2023 at 8:59 AM Mich Talebzadeh wrote: > +1 for me > > Mich Talebzadeh, > Solutions Architect/Engineering Lead > Palantir Technologies Limited > London > United Kingdom > > >view my Linkedin profile > > > >

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread Mich Talebzadeh
+1 for me Mich Talebzadeh, Solutions Architect/Engineering Lead Palantir Technologies Limited London United Kingdom view my Linkedin profile https://en.everybodywiki.com/Mich_Talebzadeh *Disclaimer:* Use it at your own risk.

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread Martin Grund
+1 (non-binding) On Fri, Jul 7, 2023 at 12:05 AM Denny Lee wrote: > +1 (non-binding) > > On Fri, Jul 7, 2023 at 00:50 Maciej wrote: > >> +0 >> >> Best regards, >> Maciej Szymkiewicz >> >> Web: https://zero323.net >> PGP: A30CEF0C31A501EC >> >> On 7/6/23 17:41, Xiao Li wrote: >> >> +1 >> >>

Re: [VOTE][SPIP] Python Data Source API

2023-07-06 Thread Denny Lee
+1 (non-binding) On Fri, Jul 7, 2023 at 00:50 Maciej wrote: > +0 > > Best regards, > Maciej Szymkiewicz > > Web: https://zero323.net > PGP: A30CEF0C31A501EC > > On 7/6/23 17:41, Xiao Li wrote: > > +1 > > Xiao > > On Wed, Jul 5, 2023 at 17:28, Hyukjin Kwon wrote: > >> +1. >> >> See

Apache Arrow integration issue with Spark involving Netty

2023-07-06 Thread Dane Pitkin
Hi all, The next release of Apache Arrow v13.0.0 coming this month[1] has upgraded Netty to v4.1.94.Final[2] due to a moderate severity CVE[3]. We are seeing that Spark using Netty v4.1.93.Final is not compatible with Arrow v13.0.0, throwing an exception at runtime[4]. There has been some talk in

Re: [VOTE][SPIP] Python Data Source API

2023-07-06 Thread Maciej
+0 Best regards, Maciej Szymkiewicz Web: https://zero323.net PGP: A30CEF0C31A501EC On 7/6/23 17:41, Xiao Li wrote: +1 Xiao On Wed, Jul 5, 2023 at 17:28, Hyukjin Kwon wrote: +1. See https://youtu.be/yj7XlTB1Jvc?t=604 :-). On Thu, 6 Jul 2023 at 09:15, Allison Wang wrote: Hi all,

Re: [VOTE][SPIP] Python Data Source API

2023-07-06 Thread Xiao Li
+1 Xiao On Wed, Jul 5, 2023 at 17:28, Hyukjin Kwon wrote: > +1. > > See https://youtu.be/yj7XlTB1Jvc?t=604 :-). > > On Thu, 6 Jul 2023 at 09:15, Allison Wang > wrote: > >> Hi all, >> >> I'd like to start the vote for SPIP: Python Data Source API. >> >> The high-level summary for the SPIP is that it aims to

Re: [VOTE][SPIP] Python Data Source API

2023-07-05 Thread Hyukjin Kwon
+1. See https://youtu.be/yj7XlTB1Jvc?t=604 :-). On Thu, 6 Jul 2023 at 09:15, Allison Wang wrote: > Hi all, > > I'd like to start the vote for SPIP: Python Data Source API. > > The high-level summary for the SPIP is that it aims to introduce a simple > API in Python for Data Sources. The idea

[VOTE][SPIP] Python Data Source API

2023-07-05 Thread Allison Wang
Hi all, I'd like to start the vote for SPIP: Python Data Source API. The high-level summary for the SPIP is that it aims to introduce a simple API in Python for Data Sources. The idea is to enable Python developers to create data sources without learning Scala or dealing with the complexities of

Re: Time for Spark v3.5.0 release

2023-07-04 Thread Xinrong Meng
+1 Thank you! On Tue, Jul 4, 2023 at 3:04 PM Jungtaek Lim wrote: > +1 > > On Wed, Jul 5, 2023 at 2:23 AM L. C. Hsieh wrote: > >> +1 >> >> Thanks Yuanjian. >> >> On Tue, Jul 4, 2023 at 7:45 AM yangjie01 wrote: >> > >> > +1 >> > >> > >> > >> > From: Maxim Gekk >> > Date: Tuesday, July 4, 2023 17:24 >> >

Re: Time for Spark v3.5.0 release

2023-07-04 Thread Jungtaek Lim
+1 On Wed, Jul 5, 2023 at 2:23 AM L. C. Hsieh wrote: > +1 > > Thanks Yuanjian. > > On Tue, Jul 4, 2023 at 7:45 AM yangjie01 wrote: > > > > +1 > > > > > > > > From: Maxim Gekk > > Date: Tuesday, July 4, 2023 17:24 > > To: Kent Yao > > Cc: "dev@spark.apache.org" > > Subject: Re: Time for Spark v3.5.0

Re: Time for Spark v3.5.0 release

2023-07-04 Thread L. C. Hsieh
+1 Thanks Yuanjian. On Tue, Jul 4, 2023 at 7:45 AM yangjie01 wrote: > > +1 > > > > From: Maxim Gekk > Date: Tuesday, July 4, 2023 17:24 > To: Kent Yao > Cc: "dev@spark.apache.org" > Subject: Re: Time for Spark v3.5.0 release > > > > +1 > > On Tue, Jul 4, 2023 at 11:55 AM Kent Yao wrote: > > +1, thank you

Re: Time for Spark v3.5.0 release

2023-07-04 Thread yangjie01
+1 From: Maxim Gekk Date: Tuesday, July 4, 2023 17:24 To: Kent Yao Cc: "dev@spark.apache.org" Subject: Re: Time for Spark v3.5.0 release +1 On Tue, Jul 4, 2023 at 11:55 AM Kent Yao wrote: +1, thank you Kent On 2023/07/04 05:32:52 Dongjoon Hyun wrote: > +1 > > Thank you, Yuanjian

Re: Time for Spark v3.5.0 release

2023-07-04 Thread Jia Fan
+1 On Tue, Jul 4, 2023 at 17:23, Maxim Gekk wrote: > +1 > > On Tue, Jul 4, 2023 at 11:55 AM Kent Yao wrote: > >> +1, thank you >> >> Kent >> >> On 2023/07/04 05:32:52 Dongjoon Hyun wrote: >> > +1 >> > >> > Thank you, Yuanjian >> > >> > Dongjoon >> > >> > On Tue, Jul 4, 2023 at 1:03 AM Hyukjin Kwon >> wrote:

Re: Time for Spark v3.5.0 release

2023-07-04 Thread Maxim Gekk
+1 On Tue, Jul 4, 2023 at 11:55 AM Kent Yao wrote: > +1, thank you > > Kent > > On 2023/07/04 05:32:52 Dongjoon Hyun wrote: > > +1 > > > > Thank you, Yuanjian > > > > Dongjoon > > > > On Tue, Jul 4, 2023 at 1:03 AM Hyukjin Kwon > wrote: > > > > > Yeah one day postponed shouldn't be a big deal.

Re: Time for Spark v3.5.0 release

2023-07-04 Thread Kent Yao
+1, thank you Kent On 2023/07/04 05:32:52 Dongjoon Hyun wrote: > +1 > > Thank you, Yuanjian > > Dongjoon > > On Tue, Jul 4, 2023 at 1:03 AM Hyukjin Kwon wrote: > > > Yeah one day postponed shouldn't be a big deal. > > > > On Tue, Jul 4, 2023 at 7:10 AM Yuanjian Li wrote: > > > >> Hi All,

Re: Time for Spark v3.5.0 release

2023-07-03 Thread Dongjoon Hyun
+1 Thank you, Yuanjian Dongjoon On Tue, Jul 4, 2023 at 1:03 AM Hyukjin Kwon wrote: > Yeah one day postponed shouldn't be a big deal. > > On Tue, Jul 4, 2023 at 7:10 AM Yuanjian Li wrote: > >> Hi All, >> >> According to the Spark versioning policy at >>

Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Hyukjin Kwon
The demo was really amazing. On Tue, 4 Jul 2023 at 09:17, Farshid Ashouri wrote: > This is wonderful news! > > On Tue, 4 Jul 2023 at 01:14, Gengliang Wang wrote: > >> Dear Apache Spark community, >> >> We are delighted to announce the launch of a groundbreaking tool that >> aims to make Apache

Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gengliang Wang
Dear Apache Spark community, We are delighted to announce the launch of a groundbreaking tool that aims to make Apache Spark more user-friendly and accessible - the English SDK . Powered by the application of Generative AI, the English SDK

Re: Time for Spark v3.5.0 release

2023-07-03 Thread Hyukjin Kwon
Yeah one day postponed shouldn't be a big deal. On Tue, Jul 4, 2023 at 7:10 AM Yuanjian Li wrote: > Hi All, > > According to the Spark versioning policy at > https://spark.apache.org/versioning-policy.html, should we cut > *branch-3.5* on *July 17th, 2023*? (We initially proposed July 16th,

Time for Spark v3.5.0 release

2023-07-03 Thread Yuanjian Li
Hi All, According to the Spark versioning policy at https://spark.apache.org/versioning-policy.html, should we cut *branch-3.5* on *July 17th, 2023*? (We initially proposed July 16th, but since it's a Sunday, I suggest we postpone it by one day). I would like to volunteer as the release

Re: Beginner - Looking for starter issues

2023-06-29 Thread Jia Fan
Hi Harry, Maybe you can start with https://issues.apache.org/jira/browse/SPARK-37935 Jia Fan > On Jun 28, 2023, at 08:09, Harry wrote: > > Hi, > > I am looking to pick up some tasks on ASF Jira. > I have a basic understanding of how things work in the Spark code base.

Beginner - Looking for starter issues

2023-06-27 Thread Harry
Hi, I am looking to pick up some tasks on ASF Jira. I have a basic understanding of how things work in the Spark code base. So I am thinking if I can start with some simple tasks to get ramped up. I tried searching on JIRA open issues and there were many. It was confusing as some tasks are

Unsubscribe

2023-06-27 Thread Amogh Desai
Unsubscribe

[VOTE][RESULT] PySpark Test Framework

2023-06-26 Thread Amanda Liu
The vote passes with 10 +1s (nine binding +1s) and one +0. Thank you all for your participation and comments! (* = binding) +1: - Holden Karau (*) - Reynold Xin (*) - Mich Talebzadeh - Maciej Szymkiewicz (*) - Hyukjin Kwon (*) - Dongjoon Hyun (*) - Ruifeng Zheng (*) - Xinrong Meng (*) -

Re: [DISCUSS] SPIP: Python Data Source API

2023-06-25 Thread Reynold Xin
Personally I'd love this, but I agree with some of the earlier comments that this should not be Python specific (meaning I should be able to implement a data source in Python and then make it usable across all languages Spark supports). I think we should find a way to make this reusable beyond

Re: [DISCUSS] SPIP: Python Data Source API

2023-06-25 Thread Maciej
Thanks for your feedback Martin. However, if the primary intended purpose of this API is to provide an interface for endpoint querying, then I find this proposal even less convincing. Neither the Spark execution model nor the data source API (full or restricted as proposed here) are a good

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-24 Thread Yikun Jiang
+1 Regards, Yikun On Fri, Jun 23, 2023 at 6:17 AM L. C. Hsieh wrote: > +1 > > On Thu, Jun 22, 2023 at 3:10 PM Xinrong Meng wrote: > > > > +1 > > > > Thanks for driving that! > > > > On Wed, Jun 21, 2023 at 10:25 PM Ruifeng Zheng > wrote: > >> > >> +1 > >> > >> On Thu, Jun 22, 2023 at 1:11 

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread yangjie01
Thanks Dongjoon ~ On 2023/6/24 at 10:29, "L. C. Hsieh" wrote: Thanks Dongjoon! On Fri, Jun 23, 2023 at 7:10 PM Hyukjin Kwon wrote: > > Thanks! > > On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan > wrote: >> >> >>

Re:[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread beliefer
Thanks, Dongjoon Hyun! Congratulations too! At 2023-06-24 07:57:05, "Dongjoon Hyun" wrote: We are happy to announce the availability of Apache Spark 3.4.1! Spark 3.4.1 is a maintenance release containing stability fixes. This release is based on the branch-3.4 maintenance branch of Spark.

Re: [DISCUSS] SPIP: Python Data Source API

2023-06-24 Thread Martin Grund
Hey, I would like to express my strong support for Python Data Sources even though they might not be immediately as powerful as Scala-based data sources. One element that is easily lost in this discussion is how much faster the iteration speed is with Python compared to Scala. Due to the dynamic

Re: [DISCUSS] SPIP: Python Data Source API

2023-06-24 Thread Maciej
With such limited scope (both language availability and features) do we have any representative examples of sources that could significantly benefit from providing this API, compared to other available options, such as batch imports, direct queries from vectorized UDFs or even interfacing

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread L. C. Hsieh
Thanks Dongjoon! On Fri, Jun 23, 2023 at 7:10 PM Hyukjin Kwon wrote: > > Thanks! > > On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan wrote: >> >> >> Thanks Dongjoon ! >> >> Regards, >> Mridul >> >> On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun wrote: >>> >>> We are happy to announce the

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Hyukjin Kwon
Thanks! On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan wrote: > > Thanks Dongjoon ! > > Regards, > Mridul > > On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun wrote: > >> We are happy to announce the availability of Apache Spark 3.4.1! >> >> Spark 3.4.1 is a maintenance release containing

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Mridul Muralidharan
Thanks Dongjoon ! Regards, Mridul On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun wrote: > We are happy to announce the availability of Apache Spark 3.4.1! > > Spark 3.4.1 is a maintenance release containing stability fixes. This > release is based on the branch-3.4 maintenance branch of Spark.

[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.1! Spark 3.4.1 is a maintenance release containing stability fixes. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade to this stable release. To download Spark 3.4.1,

Re: [VOTE][RESULT] Release Spark 3.4.1 (RC1)

2023-06-23 Thread Dongjoon Hyun
Thank you, Mridul. :) On Fri, Jun 23, 2023 at 7:26 AM Mridul Muralidharan wrote: > A late +1 from me too … forgot to send this yesterday :-) > > Regards, > Mridul > > On Fri, Jun 23, 2023 at 3:20 AM Dongjoon Hyun wrote: > >> The vote passes with 15 +1s (10 binding +1s). >> Thanks to all who

Re: [VOTE][RESULT] Release Spark 3.4.1 (RC1)

2023-06-23 Thread Mridul Muralidharan
A late +1 from me too … forgot to send this yesterday :-) Regards, Mridul On Fri, Jun 23, 2023 at 3:20 AM Dongjoon Hyun wrote: > The vote passes with 15 +1s (10 binding +1s). > Thanks to all who helped with the release! > > (* = binding) > +1: > - Jia Fan > - Dongjoon Hyun * > - Liang-Chi

[VOTE][RESULT] Apache Spark PMC asks Databricks to differentiate its Spark version string

2023-06-23 Thread Dongjoon Hyun
The vote failed with one +1 (binding), one +0 (binding), and three -1s (two binding -1s). Thanks to all for your participation. (* = binding) +1: - Dongjoon Hyun * +0: - Maciej Szymkiewicz * -1: - Sean Owen * - Hyukjin Kwon * - Mich Talebzadeh

[VOTE][RESULT] Release Spark 3.4.1 (RC1)

2023-06-23 Thread Dongjoon Hyun
The vote passes with 15 +1s (10 binding +1s). Thanks to all who helped with the release! (* = binding) +1: - Jia Fan - Dongjoon Hyun * - Liang-Chi Hsieh * - Yang Jie - Hyukjin Kwon * - Huaxin Gao * - Ruifeng Zheng * - Peter Toth - Xinrong Meng * - Jacek Laskowski - Yuming Wang * - Chao Sun * -

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Thomas graves
+1 Tom On Mon, Jun 19, 2023 at 9:41 PM Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.4.1. > > The vote is open until June 23rd 1AM (PST) and passes if a majority +1 PMC > votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this
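
The passing rule quoted in the vote mail above (a majority of +1 PMC votes, with a minimum of three +1 votes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `release_vote_passes` is hypothetical, and it assumes that "majority" means more binding +1 votes than binding -1 votes and that the minimum of three applies to binding votes.

```python
def release_vote_passes(binding_plus_ones: int, binding_minus_ones: int) -> bool:
    """Check the assumed rule: at least three binding +1 votes,
    and more binding +1 votes than binding -1 votes."""
    has_minimum = binding_plus_ones >= 3
    has_majority = binding_plus_ones > binding_minus_ones
    return has_minimum and has_majority


# Hypothetical tally: ten binding +1 votes and no binding -1 votes.
print(release_vote_passes(10, 0))  # True
```

Non-binding votes are left out of the check on purpose, since under this reading they inform the result but do not decide it.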

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-22 Thread L. C. Hsieh
+1 On Thu, Jun 22, 2023 at 3:10 PM Xinrong Meng wrote: > > +1 > > Thanks for driving that! > > On Wed, Jun 21, 2023 at 10:25 PM Ruifeng Zheng wrote: >> >> +1 >> >> On Thu, Jun 22, 2023 at 1:11 PM Dongjoon Hyun >> wrote: >>> >>> +1 >>> >>> Dongjoon >>> >>> On Wed, Jun 21, 2023 at 8:56 PM

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-22 Thread Xinrong Meng
+1 Thanks for driving that! On Wed, Jun 21, 2023 at 10:25 PM Ruifeng Zheng wrote: > +1 > > On Thu, Jun 22, 2023 at 1:11 PM Dongjoon Hyun > wrote: > >> +1 >> >> Dongjoon >> >> On Wed, Jun 21, 2023 at 8:56 PM Hyukjin Kwon >> wrote: >> >>> +1 >>> >>> On Thu, 22 Jun 2023 at 02:20, Jacek

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Gengliang Wang
+1 On Thu, Jun 22, 2023 at 11:14 AM Driesprong, Fokko wrote: > Thank you for running the release Dongjoon > > +1 > > Tested against Iceberg and it looks good. > > > On Thu, Jun 22, 2023 at 18:03, yangjie01 wrote: > >> +1 >> >> >> >> *From:* Dongjoon Hyun >> *Date:* Thursday, June 22, 2023 23:35 >>

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Driesprong, Fokko
Thank you for running the release Dongjoon +1 Tested against Iceberg and it looks good. On Thu, Jun 22, 2023 at 18:03, yangjie01 wrote: > +1 > > > > *From:* Dongjoon Hyun > *Date:* Thursday, June 22, 2023 23:35 > *To:* Chao Sun > *Cc:* Yuming Wang , Jacek Laskowski , > dev > *Subject:* Re: [VOTE]

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread yangjie01
+1 From: Dongjoon Hyun Date: Thursday, June 22, 2023 23:35 To: Chao Sun Cc: Yuming Wang , Jacek Laskowski , dev Subject: Re: [VOTE] Release Spark 3.4.1 (RC1) Thank you everyone for your participation. The vote is open until June 23rd 1AM (PST) and I'll conclude this vote after that. Dongjoon. On Thu,

Re: [VOTE] Apache Spark PMC asks Databricks to differentiate its Spark version string

2023-06-22 Thread Mich Talebzadeh
Sorry, I believe I gave my opinion on it but did not vote. -1 for me For reasons already brought up and discussed. HTH Mich Talebzadeh, Solutions Architect/Engineering Lead Palantir Technologies Limited London United Kingdom view my Linkedin profile

Re: [VOTE] Apache Spark PMC asks Databricks to differentiate its Spark version string

2023-06-22 Thread Dongjoon Hyun
Thank you, Sean, Mich, Hyukjin, and Maciej for your participation. The vote is open until June 23rd 1AM (PST) and I'll conclude this vote after that. Dongjoon. PS. Steve's email seems to have arrived in this thread by mistake. :) On Wed, Jun 21, 2023 at 3:12 AM Steve Loughran wrote: > I'd say

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Dongjoon Hyun
Thank you everyone for your participation. The vote is open until June 23rd 1AM (PST) and I'll conclude this vote after that. Dongjoon. On Thu, Jun 22, 2023 at 8:29 AM Chao Sun wrote: > +1 > > On Thu, Jun 22, 2023 at 6:52 AM Yuming Wang wrote: > > > > +1. > > > > On Thu, Jun 22, 2023 at

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Chao Sun
+1 On Thu, Jun 22, 2023 at 6:52 AM Yuming Wang wrote: > > +1. > > On Thu, Jun 22, 2023 at 4:41 PM Jacek Laskowski wrote: >> >> +1 >> >> Builds and runs fine on Java 17, macOS. >> >> $ ./dev/change-scala-version.sh 2.13 >> $ mvn \ >>

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Yuming Wang
+1. On Thu, Jun 22, 2023 at 4:41 PM Jacek Laskowski wrote: > +1 > > Builds and runs fine on Java 17, macOS. > > $ ./dev/change-scala-version.sh 2.13 > $ mvn \ > -Pkubernetes,hadoop-cloud,hive,hive-thriftserver,scala-2.13,volcano,connect > \ > -DskipTests \ > clean install > > $ python/run-tests

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-22 Thread Jacek Laskowski
+1 Builds and runs fine on Java 17, macOS. $ ./dev/change-scala-version.sh 2.13 $ mvn \ -Pkubernetes,hadoop-cloud,hive,hive-thriftserver,scala-2.13,volcano,connect \ -DskipTests \ clean install $ python/run-tests --parallelism=1 --testnames 'pyspark.sql.session SparkSession.sql' ... Tests

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Ruifeng Zheng
+1 On Thu, Jun 22, 2023 at 1:11 PM Dongjoon Hyun wrote: > +1 > > Dongjoon > > On Wed, Jun 21, 2023 at 8:56 PM Hyukjin Kwon wrote: > >> +1 >> >> On Thu, 22 Jun 2023 at 02:20, Jacek Laskowski wrote: >> >>> +0 >>> >>> Regards, >>> Jacek Laskowski >>> >>> "The Internals Of" Online Books

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Dongjoon Hyun
+1 Dongjoon On Wed, Jun 21, 2023 at 8:56 PM Hyukjin Kwon wrote: > +1 > > On Thu, 22 Jun 2023 at 02:20, Jacek Laskowski wrote: > >> +0 >> >> Regards, >> Jacek Laskowski >> >> "The Internals Of" Online Books >> Follow me on https://twitter.com/jaceklaskowski

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Hyukjin Kwon
+1 On Thu, 22 Jun 2023 at 02:20, Jacek Laskowski wrote: > +0 > > Regards, > Jacek Laskowski > > "The Internals Of" Online Books > Follow me on https://twitter.com/jaceklaskowski > > > > > On Wed, Jun 21, 2023 at 5:11 PM

Re: [VOTE] Release Spark 3.4.1 (RC1)

2023-06-21 Thread Xinrong Meng
+1 Thank you! On Wed, Jun 21, 2023 at 1:14 AM Peter Toth wrote: > +1 > > On Wed, Jun 21, 2023 at 9:43, Ruifeng Zheng wrote: > >> +1 >> >> On Wed, Jun 21, 2023 at 2:26 PM huaxin gao >> wrote: >> >>> +1 >>> >>> On Tue, Jun 20, 2023 at 11:21 PM Hyukjin Kwon >>> wrote: >>> +1

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Jacek Laskowski
+0 Regards, Jacek Laskowski "The Internals Of" Online Books Follow me on https://twitter.com/jaceklaskowski On Wed, Jun 21, 2023 at 5:11 PM Amanda Liu wrote: > Hi all, > > I'd like to start the vote for SPIP: PySpark

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Amanda Liu
Yes, let's extend the vote by two days in light of traveling for pride weekend and conferences. Best, Amanda Liu On Wed, Jun 21, 2023 at 8:41 AM Maciej wrote: > +1 > > -- > Best regards, > Maciej Szymkiewicz > > Web: https://zero323.net > PGP: A30CEF0C31A501EC > > > On 6/21/23 17:35, Holden

Re: [VOTE][SPIP] PySpark Test Framework

2023-06-21 Thread Maciej
+1 -- Best regards, Maciej Szymkiewicz Web: https://zero323.net PGP: A30CEF0C31A501EC On 6/21/23 17:35, Holden Karau wrote: A small request, it’s pride weekend in San Francisco where some of the core developers are and right before one of the larger spark related conferences so more folks
