Re: [FYI] SPARK-45981: Improve Python language test coverage

2023-12-02 Thread Hyukjin Kwon
Awesome! On Sat, Dec 2, 2023 at 2:33 PM Dongjoon Hyun wrote: > Hi, All. > > As a part of Apache Spark 4.0.0 (SPARK-44111), the Apache Spark community > starts to have test coverage for all supported Python versions from Today. > > - https://github.com/apache/spark/actions/runs/7061665420 > >

Apache Spark 3.3.4 EOL Release?

2023-12-01 Thread Dongjoon Hyun
Hi, All. Since the Apache Spark 3.3.0 RC6 vote passed on Jun 14, 2022, branch-3.3 has been maintained and served well until now. - https://github.com/apache/spark/releases/tag/v3.3.0 (tagged on Jun 9th, 2022) - https://lists.apache.org/thread/zg6k1spw6k1c7brgo6t7qldvsqbmfytm (vote result on June

[FYI] SPARK-45981: Improve Python language test coverage

2023-12-01 Thread Dongjoon Hyun
Hi, All. As a part of Apache Spark 4.0.0 (SPARK-44111), the Apache Spark community starts to have test coverage for all supported Python versions from Today. - https://github.com/apache/spark/actions/runs/7061665420 Here is a summary. 1. Main CI: All PRs and commits on `master` branch are

10x to 100x faster df.groupby().applyInPandas()

2023-12-01 Thread Enrico Minack
Hi devs, I am looking for some PySpark dev that is interested in some 10x to 100x speed up of df.groupby().applyInPandas() for small groups. A PoC and benchmark can be found at https://github.com/apache/spark/pull/37360#issuecomment-1228293766. I suppose, the same approach could be taken

Re:[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread beliefer
Congratulations! At 2023-12-01 01:23:55, "Dongjoon Hyun" wrote: We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance

Unsubscribe

2023-11-30 Thread Devarshi Vyas

unsubscribe

2023-11-30 Thread Sandeep Vinayak
- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-30 Thread Kumar K
+1 On Fri, Nov 10, 2023 at 8:51 PM Khalid Mammadov wrote: > +1 > > On Fri, 10 Nov 2023, 15:23 Peter Toth, wrote: > >> +1 >> >> On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen >> wrote: >> >>> +1 >>> >>> On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: >>> just curious what happened on google’s

[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade to this

[VOTE][RESULT] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Dongjoon Hyun
The vote passes with 6 +1s (3 binding +1s) and one non-binding -1. Thanks to all who helped with the release! (* = binding) +1: - Dongjoon Hyun * - Kent Yao - Yang Jie - Mridul Muralidharan * - Liang-Chi Hsieh * - Jia Fan +0: None -1: - Marc Le Bihan
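The tally above can be reproduced with a small sketch. It assumes one reading of the rule quoted in the vote thread (a majority of binding +1 votes over binding -1 votes, with at least three binding +1s); the `Vote` class and `tally` function are hypothetical names used for illustration only and are not part of any Apache tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True for PMC (binding) votes


def tally(votes, min_binding_plus=3):
    """Count votes and apply the assumed pass rule:
    at least `min_binding_plus` binding +1s, and more binding +1s than binding -1s."""
    plus = [v for v in votes if v.value > 0]
    minus = [v for v in votes if v.value < 0]
    binding_plus = sum(1 for v in plus if v.binding)
    binding_minus = sum(1 for v in minus if v.binding)
    passed = binding_plus >= min_binding_plus and binding_plus > binding_minus
    return {
        "plus": len(plus),
        "minus": len(minus),
        "binding_plus": binding_plus,
        "binding_minus": binding_minus,
        "passed": passed,
    }


# Votes as listed in the result message (* = binding).
votes = [
    Vote("Dongjoon Hyun", +1, True),
    Vote("Kent Yao", +1, False),
    Vote("Yang Jie", +1, False),
    Vote("Mridul Muralidharan", +1, True),
    Vote("Liang-Chi Hsieh", +1, True),
    Vote("Jia Fan", +1, False),
    Vote("Marc Le Bihan", -1, False),
]

result = tally(votes)
print(result)
```

Under these assumptions the non-binding -1 is counted but does not affect the outcome, which matches the result reported in the message.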

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Dongjoon Hyun
Thank you all. This vote passed. I will conclude this vote. Dongjoon. On 2023/11/30 09:53:17 Jia Fan wrote: > +1 > > On Thu, Nov 30, 2023 at 12:33, L. C. Hsieh wrote: > > > +1 > > > > Thanks Dongjoon! > > > > On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan > > wrote: > > > > > > +1 > > > > > >

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Jia Fan
+1 On Thu, Nov 30, 2023 at 12:33, L. C. Hsieh wrote: > +1 > > Thanks Dongjoon! > > On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan > wrote: > > > > +1 > > > > Signatures, digests, etc check out fine. > > Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes > > > > Regards, > > Mridul

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread L. C. Hsieh
+1 Thanks Dongjoon! On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan wrote: > > +1 > > Signatures, digests, etc check out fine. > Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes > > Regards, > Mridul > > On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote: >> >>

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-29 Thread Anish Shrigondekar
Hi dev, Addressed the comments that Jungtaek had on the doc. Bumping the thread once again to see if other folks have any feedback on the proposal. Thanks, Anish On Mon, Nov 27, 2023 at 8:15 PM Jungtaek Lim wrote: > Kindly bump for better reach after the long holiday. Please kindly review >

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-29 Thread Shiqi Sun
Hi Zhou, Thanks for the reply. For the language choice, since I don't think I've used many k8s components written in Java on k8s, I can't really tell, but at least for the components written in Golang, they are well-organized, easy to read/maintain and run well in general. In addition, goroutines

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Mridul Muralidharan
+1 Signatures, digests, etc check out fine. Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes Regards, Mridul On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote: > +1(non-binding) > > Jie Yang > > On 2023/11/29 02:08:04 Kent Yao wrote: > > +1(non-binding) > > > > Kent Yao >

[sql] how to connect query stage to Spark job/stages?

2023-11-29 Thread Chenghao Lyu
Hi, I am seeking advice on measuring the performance of each QueryStage (QS) when AQE is enabled in Spark SQL. Specifically, I need help to automatically map a QS to its corresponding jobs (or stages) to get the QS runtime metrics. I recorded the QS structure via a customized injected Query

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Yang Jie
Thank you very much for the feedback from Dongjoon and Xiao Li. After carefully reading https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq, I have decided to abandon the deletion of HiveContext. As Xiao Li said, its maintenance cost is not high, but it will increase the cost of

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Xiao Li
Thank you for raising it in the dev list. I do not think we should remove HiveContext based on the cost of break and maintenance. FYI, when releasing Spark 3.0, we had a lot of discussions about the related topics https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq Dongjoon Hyun

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Dongjoon Hyun
Thank you for the heads-up. I agree with your intention and the fact that it's not useful in Apache Spark 4.0.0. However, as you know, historically, it was removed once and explicitly added back to the Apache Spark 3.0 via the vote. SPARK-31088 Add back HiveContext and createExternalTable (As a

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Yang Jie
+1(non-binding) Jie Yang On 2023/11/29 02:08:04 Kent Yao wrote: > +1(non-binding) > > Kent Yao > > On 2023/11/27 01:12:53 Dongjoon Hyun wrote: > > Hi, Marc. > > > > Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release > > blocker for Apache Spark 3.4.2. > > > > When the

Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread 杨杰
Hi all, In SPARK-46171 (apache/spark#44077 [1]), I’m trying to remove the deprecated HiveContext from Apache Spark 4.0 since HiveContext has been marked as deprecated after Spark 2.0. This is a long-deprecated API, it should be replaced with SparkSession with enableHiveSupport now, so I think

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-28 Thread Kent Yao
+1(non-binding) Kent Yao On 2023/11/27 01:12:53 Dongjoon Hyun wrote: > Hi, Marc. > > Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release > blocker for Apache Spark 3.4.2. > > When the patch is ready, we can consider it for 3.4.3. > > In addition, note that we categorized

[RESULT][VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
Hi Spark dev, The vote[1] has now closed. The results are: +1 Votes(*=binding): - Mridul Muralidharan* - Ye Zhou - Dongjoon Hyun* - Reynold Xin* - Yang Jie - Gengliang Wang* - Ruifeng Zheng* - Binjie Yang - Kent Yao 0 Votes: None -1 Votes: None The vote is successful with 5 binding +1 votes.

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
+1(non-binding) I will raise a new thread for the result. Thank you all for the vote. Thanks Kent On 2023/11/28 02:48:33 Binjie Yang wrote: > + 1 > > Thanks, > Binjie Yang > > On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > > +1 > > > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: >

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-28 Thread Zhou Jiang
Hi Shiqi, Thanks for the cross-posting here - sorry for the response delay during the holiday break :) We prefer Java for the operator project as it's JVM-based and widely familiar within the Spark community. This choice aims to facilitate better adoption and ease of onboarding for future

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-27 Thread Jungtaek Lim
Kindly bump for better reach after the long holiday. Please kindly review the proposal which opens the chance to address complex use cases of streaming. Thanks! On Thu, Nov 23, 2023 at 8:19 AM Jungtaek Lim wrote: > Thanks Anish for proposing SPIP and initiating this thread! I believe this >

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-27 Thread Binjie Yang
+ 1 Thanks, Binjie Yang On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > +1 > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > > > +1 > > > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > > wrote: > > > >> +1 > >> > >> > >> > >> *From**: *Reynold Xin > >> *Date**: *Saturday, November 25, 2023 14:35 >

Join push down in DSv2

2023-11-27 Thread Stefan Hagedorn
Hi, At the Spark Summit 2017 Ioana Delaney presented an approach for join pushdown in Apache Spark [1]. Is there any intent to actually bring this into Spark, especially in the DSv2 interface? Does anyone know if there's ongoing work or a document about this? [1]

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-26 Thread Ruifeng Zheng
+1 On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > +1 > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > wrote: > >> +1 >> >> >> >> *From**: *Reynold Xin >> *Date**: *Saturday, November 25, 2023 14:35 >> *To**: *Dongjoon Hyun >> *Cc**: *Ye Zhou , Mridul Muralidharan < >> mri...@gmail.com>, Kent Yao ,

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-26 Thread Dongjoon Hyun
Hi, Marc. Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release blocker for Apache Spark 3.4.2. When the patch is ready, we can consider it for 3.4.3. In addition, note that we categorized release-blocker-level issues by marking 'Blocker' priority with `Target Version` before

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread Gengliang Wang
+1 On Sat, Nov 25, 2023 at 2:50 AM yangjie01 wrote: > +1 > > > > *From**: *Reynold Xin > *Date**: *Saturday, November 25, 2023 14:35 > *To**: *Dongjoon Hyun > *Cc**: *Ye Zhou , Mridul Muralidharan < > mri...@gmail.com>, Kent Yao , dev > *Subject**: *Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Marc Le Bihan
-1 If you can wait that the last remaining problem with Generics (?) is entirely solved, that causes this exception to be thrown : java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
+1 Dongjoon. On 2023/11/25 10:48:41 Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.4.2. > > The vote is open until November 30th 1AM (PST) and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread yangjie01
+1 From: Reynold Xin Date: Saturday, November 25, 2023 14:35 To: Dongjoon Hyun Cc: Ye Zhou, Mridul Muralidharan, Kent Yao, dev Subject: Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files +1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun wrote: +1 Thanks,

[VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.4.2. The vote is open until November 30th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.4.2 [ ] -1 Do not release this package

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Reynold Xin
+1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun wrote: > > +1 > > > Thanks, > Dongjoon. > > On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou wrote: > > >> +1(non-binding) >> >> On Fri, Nov 24, 2023 at 11:16 Mridul

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou wrote: > +1(non-binding) > > On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan > wrote: > >> >> +1 >> >> Regards, >> Mridul >> >> On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: >> >>> Hi Spark Dev, >>> >>> Following the discussion

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Ye Zhou
+1(non-binding) On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan wrote: > > +1 > > Regards, > Mridul > > On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> Following the discussion [1], I'd like to start the vote for the SPIP [2]. >> >> The SPIP aims to improve the test

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Mridul Muralidharan
+1 Regards, Mridul On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > Hi Spark Dev, > > Following the discussion [1], I'd like to start the vote for the SPIP [2]. > > The SPIP aims to improve the test coverage and develop experience for > Spark UI-related javascript codes. > > This thread will

[VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Hi Spark Dev, Following the discussion [1], I'd like to start the vote for the SPIP [2]. The SPIP aims to improve the test coverage and develop experience for Spark UI-related javascript codes. This thread will be open for at least the next 72 hours. Please vote accordingly, [ ] +1: Accept

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Thank you all. I will start an official vote for this SPIP. Kent On 2023/11/22 03:05:42 Mridul Muralidharan wrote: > This should be a very good addition ! > > Regards, > Mridul > > On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun > wrote: > > > Thank you for proposing a new UI test framework

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-22 Thread Shiqi Sun
Hi all, Sorry for being late to the party. I went through the SPIP doc and I think this is a great proposal! I left a comment in the SPIP doc a couple days ago, but I don't see much activity there and no one replied, so I wanted to cross-post it here to get some feedback. I'm Shiqi Sun, and I

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Jungtaek Lim
Thanks Anish for proposing SPIP and initiating this thread! I believe this SPIP will help a bunch of complex use cases on streaming. dev@: We are coincidentally initiating this discussion in thanksgiving holidays. We understand people in the US may not have time to review the SPIP, and we plan to

[DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Anish Shrigondekar
Hi dev, I would like to start a discussion on "Structured Streaming - Arbitrary State API v2". This proposal aims to address a bunch of limitations we see today using mapGroupsWithState/flatMapGroupsWithState operator. The detailed set of limitations is described in the SPIP doc. We propose to

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Mridul Muralidharan
This should be a very good addition ! Regards, Mridul On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Wenchen Fan
+1, very useful! On Wed, Nov 22, 2023 at 10:29 AM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> This is a call to

Help for testing Windows specific fix (SPARK-23015)

2023-11-21 Thread Hyukjin Kwon
Hi all, I used to have my Windows environment in another laptop but that laptop is broken now so I don't have Windows env to test Windows PRs out (e.g., https://github.com/apache/spark/pull/43706). If anyone has a Windows env, would appreciate it if you take a look at this. Thanks.

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Dongjoon Hyun
Thank you for proposing a new UI test framework for Apache Spark 4.0. It looks very useful. Thanks, Dongjoon. On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > Hi Spark Dev, > > This is a call to discuss a new SPIP: Testing Framework for > Spark UI Javascript files [1]. The SPIP aims to

[DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Kent Yao
Hi Spark Dev, This is a call to discuss a new SPIP: Testing Framework for Spark UI Javascript files [1]. The SPIP aims to improve the test coverage and develop experience for Spark UI-related javascript codes. The Jest [2], a JavaScript Testing Framework licensed under MIT, will be used to build

[VOTE][RESULT] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-17 Thread L. C. Hsieh
Hi all, The vote passes with 19 +1s (11 binding +1s). Thanks to all who reviews the SPIP doc and votes! (* = binding) +1: - Ye Zhou - L. C. Hsieh (*) - Chao Sun (*) - Vakaris Baškirov - DB Tsai (*) - Holden Karau (*) - Lucian Neghina - Mridul Muralidharan (*) - Huaxin Gao (*) - Cheng Pan -

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-16 Thread Gabor Somogyi
+1 (non-binding) I think it's good from directional perspective. Apache Flink is already using this approach for quite some time in production. The overall conclusion is that it's a big gain :) G On Tue, Nov 14, 2023 at 6:42 PM L. C. Hsieh wrote: > Hi all, > > I’d like to start a vote for

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Jungtaek Lim
+1 (non-binding) On Thu, Nov 16, 2023 at 4:23 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue,

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ruifeng Zheng
+1 On Thu, Nov 16, 2023 at 8:34 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue, Nov 14, 2023 at

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ilan Filonenko
+1 (non-binding) On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > +1 > > On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >>> wrote: >>> +1 On Tue, Nov 14, 2023

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Xiao Li
+1 On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > +1 > > On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >>> +1 DB Tsai | https://www.dbtsai.com/ |

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Dongjoon Hyun
+1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Yikun Jiang
+1 Regards, Yikun On Wed, Nov 15, 2023 at 4:26 PM huaxin gao wrote: > +1 > > On Tue, Nov 14, 2023 at 10:45 AM Holden Karau > wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >> >>> +1 >>> >>> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >>> >>> On Nov 14,

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread bo yang
+1 On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > +1 > > On Tue, Nov 14, 2023 at 10:45 AM Holden Karau > wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >> >>> +1 >>> >>> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >>> >>> On Nov 14, 2023, at 10:14 

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Yuming Wang
+1 On Wed, Nov 15, 2023 at 2:44 AM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com> wrote: >> >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Cheng Pan
+1 (non-binding) Thanks, Cheng Pan > On Nov 15, 2023, at 01:41, L. C. Hsieh wrote: > > Hi all, > > I’d like to start a vote for SPIP: An Official Kubernetes Operator for > Apache Spark. > > The proposal is to develop an official Java-based Kubernetes operator > for Apache Spark to automate

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread huaxin gao
+1 On Tue, Nov 14, 2023 at 10:45 AM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com> wrote: >> >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Mridul Muralidharan
+1 Regards, Mridul On Tue, Nov 14, 2023 at 12:45 PM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com>

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Holden Karau
+1 On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > +1 > > DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 > > On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < > vakaris.bashki...@gmail.com> wrote: > > +1 (non-binding) > > > On Tue, Nov 14, 2023 at 8:03 PM Chao Sun wrote: > >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread DB Tsai
+1 DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 > On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov > wrote: > > +1 (non-binding) > > On Tue, Nov 14, 2023 at 8:03 PM Chao Sun > wrote: >> +1 >> >> On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh >

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Vakaris Baškirov
+1 (non-binding) On Tue, Nov 14, 2023 at 8:03 PM Chao Sun wrote: > +1 > > On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh wrote: > > > > +1 > > > > On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > > > > > +1(Non-binding) > > > > > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > > >> > >

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Chao Sun
+1 On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh wrote: > > +1 > > On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > > > +1(Non-binding) > > > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > >> > >> Hi all, > >> > >> I’d like to start a vote for SPIP: An Official Kubernetes Operator for

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread L. C. Hsieh
+1 On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > +1(Non-binding) > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: >> >> Hi all, >> >> I’d like to start a vote for SPIP: An Official Kubernetes Operator for >> Apache Spark. >> >> The proposal is to develop an official Java-based

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Ye Zhou
+1(Non-binding) On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > Hi all, > > I’d like to start a vote for SPIP: An Official Kubernetes Operator for > Apache Spark. > > The proposal is to develop an official Java-based Kubernetes operator > for Apache Spark to automate the deployment and

[VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread L. C. Hsieh
Hi all, I’d like to start a vote for SPIP: An Official Kubernetes Operator for Apache Spark. The proposal is to develop an official Java-based Kubernetes operator for Apache Spark to automate the deployment and simplify the lifecycle management and orchestration of Spark applications and Spark

Why create/drop/alter/rename partition does not post listener event in ExternalCatalogWithListener?

2023-11-14 Thread 李响
Dear Spark Community: In ExternalCatalogWithListener , I see postToAll() is called for create/drop/alter/rename database/table/function to post

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-13 Thread L. C. Hsieh
Thanks for all the support from the community for the SPIP proposal. Since all questions/discussion are settled down (if I didn't miss any major ones), if no more questions or concerns, I'll be the shepherd for this SPIP proposal and call for a vote tomorrow. Thank you all! On Mon, Nov 13, 2023

Spark Docker Official image (Java 17) coming soon

2023-11-13 Thread Yikun Jiang
We added the Java 17 support for Apache Spark docker official image at [1]. (Thanks @vakarisbk efforts) After the [2] merge in future, the first java17 series docker official image will be available. You can also have a try on ghcr test image: all in one image: ghcr.io/apache/spark-docker/spark

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-13 Thread Zhou Jiang
Hi Holden, Thanks a lot for your feedback! Yes, this proposal attempts to integrate existing solutions, especially from CRD perspective. The proposed schema retains similarity with current designs, while reducing duplicates and maintaining a single source of truth from conf properties. It also

Re: Apache Spark 3.4.2 (?)

2023-11-12 Thread Dongjoon Hyun
Thank you all. Here is an update. Thanks to your help, all open blocker issues (including correctness issues) are resolved. However, I'm still waiting for this additional alternative approach PR for the previously resolved JIRAs. https://github.com/apache/spark/pull/43760 (for Apache Spark

De-serialization by Java encoder : Spark 3.4.x doesn't support anymore fields having an accessor but no setter? (Encoder fails on many "NoSuchElementException: None.get" since 3.4.x [SPARK-45311])

2023-11-12 Thread Marc Le Bihan
Hello, I am writing to check if what I am encountering is bug or the behavior that is expected from Spark 3.4.x and over. I've noticed that analysis quickly fails on a "/NoSuchElementException: None.get/" with the JavaBeanEncoder in deserialization since 3.4.x, if a candidate field has a

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-11-12 Thread Pavan Kotikalapudi
Here is an initial Implementation draft PR https://github.com/apache/spark/pull/42352 and design doc: https://docs.google.com/document/d/1_YmfCsQQb9XhRdKh0ijbc-j8JKGtGBxYsk_30NVSTWo/edit?usp=sharing On Sun, Nov 12, 2023 at 5:24 PM Pavan Kotikalapudi wrote: > Hi Dev community, > > Just bumping

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-11-12 Thread Pavan Kotikalapudi
Hi Dev community, Just bumping to see if there are more reviews to evaluate this idea of adding auto-scaling to structured streaming. Thanks again, Pavan On Wed, Aug 23, 2023 at 2:49 PM Pavan Kotikalapudi wrote: > Thanks for the review Mich. > > I have updated the Q4 with as concise

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Holden Karau
To be clear: I am generally supportive of the idea (+1) but have some follow-up questions: Have we taken the time to learn from the other operators? Do we have a compatible CRD/API or not (and if so why?) The API seems to assume that everything is packaged in the container in advance, but I

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Zhou Jiang
resending cc dev for record - sorry forgot to reply all earlier :) For 1 - I'm more leaning towards 'official' as this aims to provide Spark users a community-recommended way to automate and manage Spark deployments on k8s. It does not mean the current / other options would become off-standard

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Zhou Jiang
I'd say that's actually the other way round. A user may either 1. Use spark-submit, this works with or without operator. Or, 2. Deploy the operator, create the Spark Applications with kubectl / clients - so that the Operator does spark-submit for you. We may also continue this discussion in the

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-11 Thread Mich Talebzadeh
Thanks Zhou for your response to my points raised (private communication) If we start with a base model and cluster, minimal footprint for the tool, then we can establish the operational parameters needed. So +1 for me too. HTH

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Cheng Pan
> Not really - this is not designed to be a replacement for the current > approach. That's what I assumed too. But my question is, as a user, how to write a spark-submit command to submit a Spark app to leverage this operator? Thanks, Cheng Pan > On Nov 11, 2023, at 03:21, Zhou Jiang wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread kazuyuki tanimura
+1 Kazu > On Nov 10, 2023, at 10:05 AM, Khalid Mammadov > wrote: > > +1 > > On Fri, 10 Nov 2023, 15:23 Peter Toth, > wrote: >> +1 >> >> On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen > > wrote: >>> +1 >>> >>> Fri, Nov 10, 2023

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Khalid Mammadov
+1 On Fri, 10 Nov 2023, 15:23 Peter Toth, wrote: > +1 > > On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen > wrote: > >> +1 >> >> On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: >> >>> just curious what happened on google’s spark operator? >>> >>> On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Mich Talebzadeh
Hi, Looks like a good idea but before committing myself, I have a number of design questions having looked at SPIP itself: 1. Will the name "Standard add-on Kubernetes operator to Spark '' describe it better? 2. We are still struggling with improving Spark driver start-up time.

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Peter Toth
+1 On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen wrote: > +1 > > On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: > >> just curious what happened on google’s spark operator? >> >> On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Bjørn Jørgensen
+1 On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: > just curious what happened on google’s spark operator? > > On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote:

Re: Apache Spark 3.4.2 (?)

2023-11-10 Thread Kent Yao
+1 On Thu, Nov 9, 2023 at 18:18, Maxim Gekk wrote: > > +1 > > On Wed, Nov 8, 2023 at 5:29 AM kazuyuki tanimura > wrote: >> >> +1 >> >> Kazu >> >> On Nov 7, 2023, at 5:23 PM, L. C. Hsieh wrote: >> >> +1 >> >> On Tue, Nov 7, 2023 at 4:56 PM Dongjoon Hyun wrote: >> >> >> Thank you all! >> >> Dongjoon >> >>

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Yuming Wang
+1 On Fri, Nov 10, 2023 at 10:01 AM Ilan Filonenko wrote: > +1 > > On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: >> >>> +1 for creating an official Kubernetes operator for Apache Spark >>> >>> On Fri, Nov 10, 2023 at 12:38 AM

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Cheng Pan
Thanks for this impressive proposal. I have a basic question: how does spark-submit work with this operator? Or does it require that we use `kubectl apply -f spark-job.yaml` (or a K8s client programmatically) to submit a Spark app? Thanks, Cheng Pan > On Nov 10, 2023, at 04:05, Zhou Jiang

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread L. C. Hsieh
+1 On Thu, Nov 9, 2023 at 7:57 PM Chao Sun wrote: > > +1 > > > On Thu, Nov 9, 2023 at 6:36 PM Xiao Li wrote: > > > > +1 > > > > On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: > >> > >> +1 > >> > >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >>> > >>> +1 > >>> > >>> To be completely transparent, I am

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Nan Zhu
just curious what happened on google’s spark operator? On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: > +1 > > On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: >> >>> +1 for creating an official Kubernetes operator for

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Chao Sun
+1 On Thu, Nov 9, 2023 at 6:36 PM Xiao Li wrote: > > +1 > > On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: >> >> +1 >> >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: >>> >>> +1 >>> >>> To be completely transparent, I am employed in the same department as Zhou >>> at Apple. >>> >>> I support this

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Xiao Li
+1 On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: > +1 > > On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >> +1 >> >> To be completely transparent, I am employed in the same department as >> Zhou at Apple. >> >> I support this proposal, provided that we witness community adoption >> following the release

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Ilan Filonenko
+1 On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > +1 > > On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: > >> +1 for creating an official Kubernetes operator for Apache Spark >> >> On Fri, Nov 10, 2023 at 12:38 AM huaxin gao >> wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 3:14 PM DB

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Ryan Blue
+1 On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: > +1 for creating an official Kubernetes operator for Apache Spark > > On Fri, Nov 10, 2023 at 12:38 AM huaxin gao > wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: >> >>> +1 >>> >>> To be completely transparent, I am

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Hussein Awala
+1 for creating an official Kubernetes operator for Apache Spark On Fri, Nov 10, 2023 at 12:38 AM huaxin gao wrote: > +1 > > On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >> +1 >> >> To be completely transparent, I am employed in the same department as >> Zhou at Apple. >> >> I support this

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread huaxin gao
+1 On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > +1 > > To be completely transparent, I am employed in the same department as Zhou > at Apple. > > I support this proposal, provided that we witness community adoption > following the release of the Flink Kubernetes operator, streamlining Flink

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread DB Tsai
+1 To be completely transparent, I am employed in the same department as Zhou at Apple. I support this proposal, provided that we witness community adoption following the release of the Flink Kubernetes operator, streamlining Flink deployment on Kubernetes. A well-maintained official Spark
