[RESULT][VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
Hi Spark dev,
The vote[1] has now closed. The results are:
+1 Votes (* = binding):
- Mridul Muralidharan*
- Ye Zhou
- Dongjoon Hyun*
- Reynold Xin*
- Yang Jie
- Gengliang Wang*
- Ruifeng Zheng*
- Binjie Yang
- Kent Yao
0 Votes: None
-1 Votes: None
The vote is successful with 5 binding +1 votes.
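As a quick cross-check of the tally in this result message, the count can be reproduced with a few lines of Python. This is only a sketch: the voter list is copied from the message, a trailing "*" marks a binding vote as in the message's own legend, and the names `PLUS_ONE` and `tally` are made up for illustration.

```python
# Voter list copied from the result message; a trailing "*" marks a binding vote.
PLUS_ONE = [
    "Mridul Muralidharan*", "Ye Zhou", "Dongjoon Hyun*", "Reynold Xin*",
    "Yang Jie", "Gengliang Wang*", "Ruifeng Zheng*", "Binjie Yang", "Kent Yao",
]


def tally(voters):
    """Return (total, binding) counts for a list of voter names."""
    binding = sum(1 for name in voters if name.endswith("*"))
    return len(voters), binding


total, binding = tally(PLUS_ONE)
print(f"{total} +1 votes, {binding} binding")  # 9 +1 votes, 5 binding
```

The binding count of 5 matches the figure reported in the message.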

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
+1(non-binding) I will raise a new thread for the result. Thank you all for the vote. Thanks Kent On 2023/11/28 02:48:33 Binjie Yang wrote: > + 1 > > Thanks, > Binjie Yang > > On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > > +1 > > > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: >

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-28 Thread Zhou Jiang
Hi Shiqi, Thanks for the cross-posting here - sorry for the response delay during the holiday break :) We prefer Java for the operator project as it's JVM-based and widely familiar within the Spark community. This choice aims to facilitate better adoption and ease of onboarding for future

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-27 Thread Jungtaek Lim
Bumping this for better reach after the long holiday. Please review the proposal, which opens up the chance to address complex streaming use cases. Thanks! On Thu, Nov 23, 2023 at 8:19 AM Jungtaek Lim wrote: > Thanks Anish for proposing SPIP and initiating this thread! I believe this >

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-27 Thread Binjie Yang
+ 1 Thanks, Binjie Yang On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > +1 > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > > > +1 > > > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > > wrote: > > > >> +1 > >> > >> > >> > >> *From**: *Reynold Xin > >> *Date**: *Saturday, November 25, 2023, 14:35 >

Join push down in DSv2

2023-11-27 Thread Stefan Hagedorn
Hi, At the Spark Summit 2017 Ioana Delaney presented an approach for join pushdown in Apache Spark [1]. Is there any intent to actually bring this into Spark, especially in the DSv2 interface? Does anyone know if there's ongoing work or a document about this? [1]

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-26 Thread Ruifeng Zheng
+1 On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > +1 > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > wrote: > >> +1 >> >> >> >> *From**: *Reynold Xin >> *Date**: *Saturday, November 25, 2023, 14:35 >> *To**: *Dongjoon Hyun >> *Cc**: *Ye Zhou , Mridul Muralidharan < >> mri...@gmail.com>, Kent Yao ,

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-26 Thread Dongjoon Hyun
Hi, Marc. Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release blocker for Apache Spark 3.4.2. When the patch is ready, we can consider it for 3.4.3. In addition, note that we categorized release-blocker-level issues by marking 'Blocker' priority with `Target Version` before

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread Gengliang Wang
+1 On Sat, Nov 25, 2023 at 2:50 AM yangjie01 wrote: > +1 > > > > *From**: *Reynold Xin > *Date**: *Saturday, November 25, 2023, 14:35 > *To**: *Dongjoon Hyun > *Cc**: *Ye Zhou , Mridul Muralidharan < > mri...@gmail.com>, Kent Yao , dev > *Subject**: *Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Marc Le Bihan
-1, if you can wait until the last remaining problem with generics (?) is entirely solved, which causes this exception to be thrown: java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
+1 Dongjoon. On 2023/11/25 10:48:41 Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.4.2. > > The vote is open until November 30th 1AM (PST) and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread yangjie01
+1 From: Reynold Xin Date: Saturday, November 25, 2023, 14:35 To: Dongjoon Hyun Cc: Ye Zhou , Mridul Muralidharan , Kent Yao , dev Subject: Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files +1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun wrote: +1 Thanks,

[VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.4.2. The vote is open until November 30th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.4.2 [ ] -1 Do not release this package
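The passing rule stated in this message (a majority of +1 PMC votes, with a minimum of 3 +1 votes) can be sketched as a small check. This is one plausible reading of the rule, not an official definition: it assumes only binding (PMC) votes count and that "majority" means more binding +1 than binding -1 votes. The function name and the sample tallies are hypothetical.

```python
def release_vote_passes(binding_plus, binding_minus, minimum=3):
    """Assumed reading of the rule: at least `minimum` binding +1 votes,
    and more binding +1 votes than binding -1 votes."""
    return binding_plus >= minimum and binding_plus > binding_minus


# Hypothetical tallies, for illustration only.
print(release_vote_passes(4, 1))  # True
print(release_vote_passes(2, 0))  # False: fewer than 3 binding +1 votes
print(release_vote_passes(3, 3))  # False: no majority
```

Non-binding votes are deliberately left out of the check, since the message ties the outcome to PMC votes only.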

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Reynold Xin
+1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > wrote: > > +1 > > > Thanks, > Dongjoon. > > On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou wrote: > > >> +1(non-binding) >> >> On Fri, Nov 24, 2023 at 11:16 Mridul

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou wrote: > +1(non-binding) > > On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan > wrote: > >> >> +1 >> >> Regards, >> Mridul >> >> On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: >> >>> Hi Spark Dev, >>> >>> Following the discussion

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Ye Zhou
+1(non-binding) On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan wrote: > > +1 > > Regards, > Mridul > > On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> Following the discussion [1], I'd like to start the vote for the SPIP [2]. >> >> The SPIP aims to improve the test

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Mridul Muralidharan
+1 Regards, Mridul On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > Hi Spark Dev, > > Following the discussion [1], I'd like to start the vote for the SPIP [2]. > > The SPIP aims to improve the test coverage and developer experience for > Spark UI-related JavaScript code. > > This thread will

[VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Hi Spark Dev, Following the discussion [1], I'd like to start the vote for the SPIP [2]. The SPIP aims to improve the test coverage and developer experience for Spark UI-related JavaScript code. This thread will be open for at least the next 72 hours. Please vote accordingly, [ ] +1: Accept

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Thank you all. I will start an official vote for this SPIP. Kent On 2023/11/22 03:05:42 Mridul Muralidharan wrote: > This should be a very good addition ! > > Regards, > Mridul > > On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun > wrote: > > > Thank you for proposing a new UI test framework

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-22 Thread Shiqi Sun
Hi all, Sorry for being late to the party. I went through the SPIP doc and I think this is a great proposal! I left a comment in the SPIP doc a couple days ago, but I don't see much activity there and no one replied, so I wanted to cross-post it here to get some feedback. I'm Shiqi Sun, and I

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Jungtaek Lim
Thanks Anish for proposing the SPIP and initiating this thread! I believe this SPIP will help with a bunch of complex streaming use cases. dev@: We are coincidentally initiating this discussion during the Thanksgiving holidays. We understand people in the US may not have time to review the SPIP, and we plan to

[DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Anish Shrigondekar
Hi dev, I would like to start a discussion on "Structured Streaming - Arbitrary State API v2". This proposal aims to address a number of limitations we see today with the mapGroupsWithState/flatMapGroupsWithState operators. The detailed set of limitations is described in the SPIP doc. We propose to

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Mridul Muralidharan
This should be a very good addition ! Regards, Mridul On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Wenchen Fan
+1, very useful! On Wed, Nov 22, 2023 at 10:29 AM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> This is a call to

Help for testing Windows specific fix (SPARK-23015)

2023-11-21 Thread Hyukjin Kwon
Hi all, I used to have my Windows environment on another laptop, but that laptop is broken now, so I don't have a Windows env to test Windows PRs (e.g., https://github.com/apache/spark/pull/43706). If anyone has a Windows env, I would appreciate it if you could take a look at this. Thanks.

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Dongjoon Hyun
Thank you for proposing a new UI test framework for Apache Spark 4.0. It looks very useful. Thanks, Dongjoon. On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > Hi Spark Dev, > > This is a call to discuss a new SPIP: Testing Framework for > Spark UI Javascript files [1]. The SPIP aims to

[DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Kent Yao
Hi Spark Dev, This is a call to discuss a new SPIP: Testing Framework for Spark UI Javascript files [1]. The SPIP aims to improve the test coverage and developer experience for Spark UI-related JavaScript code. Jest [2], a JavaScript testing framework licensed under MIT, will be used to build

[VOTE][RESULT] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-17 Thread L. C. Hsieh
Hi all, The vote passes with 19 +1s (11 binding +1s). Thanks to all who reviewed the SPIP doc and voted! (* = binding) +1: - Ye Zhou - L. C. Hsieh (*) - Chao Sun (*) - Vakaris Baškirov - DB Tsai (*) - Holden Karau (*) - Lucian Neghina - Mridul Muralidharan (*) - Huaxin Gao (*) - Cheng Pan -

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-16 Thread Gabor Somogyi
+1 (non-binding) I think it's good from a directional perspective. Apache Flink has already been using this approach in production for quite some time. The overall conclusion is that it's a big gain :) G On Tue, Nov 14, 2023 at 6:42 PM L. C. Hsieh wrote: > Hi all, > > I’d like to start a vote for

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Jungtaek Lim
+1 (non-binding) On Thu, Nov 16, 2023 at 4:23 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue,

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ruifeng Zheng
+1 On Thu, Nov 16, 2023 at 8:34 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue, Nov 14, 2023 at

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ilan Filonenko
+1 (non-binding) On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > +1 > > On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >>> wrote: >>> +1 On Tue, Nov 14, 2023

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Xiao Li
+1 On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > +1 > > On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >>> +1 DB Tsai | https://www.dbtsai.com/ |

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Dongjoon Hyun
+1 - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Yikun Jiang
+1 Regards, Yikun On Wed, Nov 15, 2023 at 4:26 PM huaxin gao wrote: > +1 > > On Tue, Nov 14, 2023 at 10:45 AM Holden Karau > wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >> >>> +1 >>> >>> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >>> >>> On Nov 14,

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread bo yang
+1 On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > +1 > > On Tue, Nov 14, 2023 at 10:45 AM Holden Karau > wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >> >>> +1 >>> >>> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >>> >>> On Nov 14, 2023, at 10:14 

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Yuming Wang
+1 On Wed, Nov 15, 2023 at 2:44 AM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com> wrote: >> >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Cheng Pan
+1 (non-binding) Thanks, Cheng Pan > On Nov 15, 2023, at 01:41, L. C. Hsieh wrote: > > Hi all, > > I’d like to start a vote for SPIP: An Official Kubernetes Operator for > Apache Spark. > > The proposal is to develop an official Java-based Kubernetes operator > for Apache Spark to automate

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread huaxin gao
+1 On Tue, Nov 14, 2023 at 10:45 AM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com> wrote: >> >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Mridul Muralidharan
+1 Regards, Mridul On Tue, Nov 14, 2023 at 12:45 PM Holden Karau wrote: > +1 > > On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > >> +1 >> >> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 >> >> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < >> vakaris.bashki...@gmail.com>

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Holden Karau
+1 On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: > +1 > > DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 > > On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov < > vakaris.bashki...@gmail.com> wrote: > > +1 (non-binding) > > > On Tue, Nov 14, 2023 at 8:03 PM Chao Sun wrote: > >> +1

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread DB Tsai
+1 DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 > On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov > wrote: > > +1 (non-binding) > > On Tue, Nov 14, 2023 at 8:03 PM Chao Sun > wrote: >> +1 >> >> On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh >

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Vakaris Baškirov
+1 (non-binding) On Tue, Nov 14, 2023 at 8:03 PM Chao Sun wrote: > +1 > > On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh wrote: > > > > +1 > > > > On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > > > > > +1(Non-binding) > > > > > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > > >> > >

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Chao Sun
+1 On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh wrote: > > +1 > > On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > > > +1(Non-binding) > > > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > >> > >> Hi all, > >> > >> I’d like to start a vote for SPIP: An Official Kubernetes Operator for

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread L. C. Hsieh
+1 On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou wrote: > > +1(Non-binding) > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: >> >> Hi all, >> >> I’d like to start a vote for SPIP: An Official Kubernetes Operator for >> Apache Spark. >> >> The proposal is to develop an official Java-based

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Ye Zhou
+1(Non-binding) On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh wrote: > Hi all, > > I’d like to start a vote for SPIP: An Official Kubernetes Operator for > Apache Spark. > > The proposal is to develop an official Java-based Kubernetes operator > for Apache Spark to automate the deployment and

[VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread L. C. Hsieh
Hi all, I’d like to start a vote for SPIP: An Official Kubernetes Operator for Apache Spark. The proposal is to develop an official Java-based Kubernetes operator for Apache Spark to automate the deployment and simplify the lifecycle management and orchestration of Spark applications and Spark

Why create/drop/alter/rename partition does not post listener event in ExternalCatalogWithListener?

2023-11-14 Thread 李响
Dear Spark Community: In ExternalCatalogWithListener, I see postToAll() is called for create/drop/alter/rename database/table/function to post

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-13 Thread L. C. Hsieh
Thanks for all the support from the community for the SPIP proposal. Since all questions/discussion are settled down (if I didn't miss any major ones), if no more questions or concerns, I'll be the shepherd for this SPIP proposal and call for a vote tomorrow. Thank you all! On Mon, Nov 13, 2023

Spark Docker Official image (Java 17) coming soon

2023-11-13 Thread Yikun Jiang
We added Java 17 support for the Apache Spark Docker official image at [1]. (Thanks to @vakarisbk for the effort.) Once [2] is merged, the first Java 17 series Docker official image will be available. You can also try the ghcr test image: all in one image: ghcr.io/apache/spark-docker/spark

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-13 Thread Zhou Jiang
Hi Holden, Thanks a lot for your feedback! Yes, this proposal attempts to integrate existing solutions, especially from CRD perspective. The proposed schema retains similarity with current designs, while reducing duplicates and maintaining a single source of truth from conf properties. It also

Re: Apache Spark 3.4.2 (?)

2023-11-12 Thread Dongjoon Hyun
Thank you all. Here is an update. Thanks to your help, all open blocker issues (including correctness issues) are resolved. However, I'm still waiting for this additional alternative approach PR for the previously resolved JIRAs. https://github.com/apache/spark/pull/43760 (for Apache Spark

De-serialization by Java encoder : Spark 3.4.x doesn't support anymore fields having an accessor but no setter? (Encoder fails on many "NoSuchElementException: None.get" since 3.4.x [SPARK-45311])

2023-11-12 Thread Marc Le Bihan
Hello, I am writing to check whether what I am encountering is a bug or the behavior that is expected from Spark 3.4.x onward. I've noticed that analysis quickly fails with a "/NoSuchElementException: None.get/" in the JavaBeanEncoder during deserialization since 3.4.x, if a candidate field has a

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-11-12 Thread Pavan Kotikalapudi
Here is an initial Implementation draft PR https://github.com/apache/spark/pull/42352 and design doc: https://docs.google.com/document/d/1_YmfCsQQb9XhRdKh0ijbc-j8JKGtGBxYsk_30NVSTWo/edit?usp=sharing On Sun, Nov 12, 2023 at 5:24 PM Pavan Kotikalapudi wrote: > Hi Dev community, > > Just bumping

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-11-12 Thread Pavan Kotikalapudi
Hi Dev community, Just bumping to see if there are more reviews to evaluate this idea of adding auto-scaling to structured streaming. Thanks again, Pavan On Wed, Aug 23, 2023 at 2:49 PM Pavan Kotikalapudi wrote: > Thanks for the review Mich. > > I have updated the Q4 with as concise

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Holden Karau
To be clear: I am generally supportive of the idea (+1) but have some follow-up questions: Have we taken the time to learn from the other operators? Do we have a compatible CRD/API or not (and if so why?) The API seems to assume that everything is packaged in the container in advance, but I

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Zhou Jiang
resending cc dev for record - sorry forgot to reply all earlier :) For 1 - I'm more leaning towards 'official' as this aims to provide Spark users a community-recommended way to automate and manage Spark deployments on k8s. It does not mean the current / other options would become off-standard

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Zhou Jiang
I'd say that's actually the other way round. A user may either 1. Use spark-submit, this works with or without operator. Or, 2. Deploy the operator, create the Spark Applications with kubectl / clients - so that the Operator does spark-submit for you. We may also continue this discussion in the

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-11 Thread Mich Talebzadeh
Thanks Zhou for your response to my points raised (private communication) If we start with a base model and cluster, minimal footprint for the tool, then we can establish the operational parameters needed. So +1 for me too. HTH

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Cheng Pan
> Not really - this is not designed to be a replacement for the current > approach. That's what I assumed too. But my question is, as a user, how to write a spark-submit command to submit a Spark app to leverage this operator? Thanks, Cheng Pan > On Nov 11, 2023, at 03:21, Zhou Jiang wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread kazuyuki tanimura
+1 Kazu > On Nov 10, 2023, at 10:05 AM, Khalid Mammadov > wrote: > > +1 > > On Fri, 10 Nov 2023, 15:23 Peter Toth, > wrote: >> +1 >> >> On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen > > wrote: >>> +1 >>> >>> Fri, Nov 10, 2023

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Khalid Mammadov
+1 On Fri, 10 Nov 2023, 15:23 Peter Toth, wrote: > +1 > > On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen > wrote: > >> +1 >> >> On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: >> >>> just curious what happened on google’s spark operator? >>> >>> On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Mich Talebzadeh
Hi, Looks like a good idea but before committing myself, I have a number of design questions having looked at the SPIP itself: 1. Will the name "Standard add-on Kubernetes operator to Spark" describe it better? 2. We are still struggling with improving Spark driver start-up time.

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Peter Toth
+1 On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen wrote: > +1 > > On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: > >> just curious what happened on google’s spark operator? >> >> On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote:

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-10 Thread Bjørn Jørgensen
+1 On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: > just curious what happened on google’s spark operator? > > On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote:

Re: Apache Spark 3.4.2 (?)

2023-11-10 Thread Kent Yao
+1 On Thu, Nov 9, 2023 at 18:18, Maxim Gekk wrote: > > +1 > > On Wed, Nov 8, 2023 at 5:29 AM kazuyuki tanimura > wrote: >> >> +1 >> >> Kazu >> >> On Nov 7, 2023, at 5:23 PM, L. C. Hsieh wrote: >> >> +1 >> >> On Tue, Nov 7, 2023 at 4:56 PM Dongjoon Hyun wrote: >> >> >> Thank you all! >> >> Dongjoon >> >>

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Yuming Wang
+1 On Fri, Nov 10, 2023 at 10:01 AM Ilan Filonenko wrote: > +1 > > On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: >> >>> +1 for creating an official Kubernetes operator for Apache Spark >>> >>> On Fri, Nov 10, 2023 at 12:38 AM

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Cheng Pan
Thanks for this impressive proposal. I have a basic question: how does spark-submit work with this operator? Or does it require that we use `kubectl apply -f spark-job.yaml` (or a K8s client, programmatically) to submit a Spark app? Thanks, Cheng Pan > On Nov 10, 2023, at 04:05, Zhou Jiang

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread L. C. Hsieh
+1 On Thu, Nov 9, 2023 at 7:57 PM Chao Sun wrote: > > +1 > > > On Thu, Nov 9, 2023 at 6:36 PM Xiao Li wrote: > > > > +1 > > > > On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: > >> > >> +1 > >> > >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >>> > >>> +1 > >>> > >>> To be completely transparent, I am

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Nan Zhu
just curious what happened on google’s spark operator? On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: > +1 > > On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: >> >>> +1 for creating an official Kubernetes operator for

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Chao Sun
+1 On Thu, Nov 9, 2023 at 6:36 PM Xiao Li wrote: > > +1 > > On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: >> >> +1 >> >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: >>> >>> +1 >>> >>> To be completely transparent, I am employed in the same department as Zhou >>> at Apple. >>> >>> I support this

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Xiao Li
+1 On Thu, Nov 9, 2023 at 16:53, huaxin gao wrote: > +1 > > On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >> +1 >> >> To be completely transparent, I am employed in the same department as >> Zhou at Apple. >> >> I support this proposal, provided that we witness community adoption >> following the release

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Ilan Filonenko
+1 On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > +1 > > On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: > >> +1 for creating an official Kubernetes operator for Apache Spark >> >> On Fri, Nov 10, 2023 at 12:38 AM huaxin gao >> wrote: >> >>> +1 >>> >>> On Thu, Nov 9, 2023 at 3:14 PM DB

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Ryan Blue
+1 On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: > +1 for creating an official Kubernetes operator for Apache Spark > > On Fri, Nov 10, 2023 at 12:38 AM huaxin gao > wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: >> >>> +1 >>> >>> To be completely transparent, I am

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Hussein Awala
+1 for creating an official Kubernetes operator for Apache Spark On Fri, Nov 10, 2023 at 12:38 AM huaxin gao wrote: > +1 > > On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >> +1 >> >> To be completely transparent, I am employed in the same department as >> Zhou at Apple. >> >> I support this

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread huaxin gao
+1 On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > +1 > > To be completely transparent, I am employed in the same department as Zhou > at Apple. > > I support this proposal, provided that we witness community adoption > following the release of the Flink Kubernetes operator, streamlining Flink

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread DB Tsai
+1 To be completely transparent, I am employed in the same department as Zhou at Apple. I support this proposal, provided that we witness community adoption following the release of the Flink Kubernetes operator, streamlining Flink deployment on Kubernetes. A well-maintained official Spark

Re: ASF board report draft for Nov 2023

2023-11-09 Thread Matei Zaharia
Alright, done and posted. > On Nov 6, 2023, at 10:55 AM, Dongjoon Hyun wrote: > > Thank you, Matei. > > It would be great if we can include upcoming plans briefly. > > - Apache Spark 3.4.2 > (https://lists.apache.org/thread/35o2169l5r05k2mknqjy9mztq3ty1btr) > - Apache Spark 3.3.4 EOL

[DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Zhou Jiang
Hi Spark community, I'm reaching out to initiate a conversation about the possibility of developing a Java-based Kubernetes operator for Apache Spark. Following the operator pattern ( https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), Spark users may manage applications and related

Re: Apache Spark 3.4.2 (?)

2023-11-09 Thread Maxim Gekk
+1 On Wed, Nov 8, 2023 at 5:29 AM kazuyuki tanimura wrote: > +1 > > Kazu > > On Nov 7, 2023, at 5:23 PM, L. C. Hsieh wrote: > > +1 > > On Tue, Nov 7, 2023 at 4:56 PM Dongjoon Hyun > wrote: > > > Thank you all! > > Dongjoon > > On Mon, Nov 6, 2023 at 6:03 PM Holden Karau wrote: > > > +1 > >

Re: Apache Spark 3.4.2 (?)

2023-11-07 Thread kazuyuki tanimura
+1 Kazu > On Nov 7, 2023, at 5:23 PM, L. C. Hsieh wrote: > > +1 > > On Tue, Nov 7, 2023 at 4:56 PM Dongjoon Hyun wrote: >> >> Thank you all! >> >> Dongjoon >> >> On Mon, Nov 6, 2023 at 6:03 PM Holden Karau wrote: >>> >>> +1 >>> >>> On Mon, Nov 6, 2023 at 4:30 PM yangjie01 >>> wrote:

Re: Apache Spark 3.4.2 (?)

2023-11-07 Thread L. C. Hsieh
+1 On Tue, Nov 7, 2023 at 4:56 PM Dongjoon Hyun wrote: > > Thank you all! > > Dongjoon > > On Mon, Nov 6, 2023 at 6:03 PM Holden Karau wrote: >> >> +1 >> >> On Mon, Nov 6, 2023 at 4:30 PM yangjie01 wrote: >>> >>> +1 >>> >>> >>> >>> From: Yuming Wang >>> Date: Tuesday, November 7, 2023, 07:00 >>> To:

Re: Apache Spark 3.4.2 (?)

2023-11-07 Thread Dongjoon Hyun
Thank you all! Dongjoon On Mon, Nov 6, 2023 at 6:03 PM Holden Karau wrote: > +1 > > On Mon, Nov 6, 2023 at 4:30 PM yangjie01 > wrote: > >> +1 >> >> >> >> *From**: *Yuming Wang >> *Date**: *Tuesday, November 7, 2023, 07:00 >> *To**: *Santosh Pingale >> *Cc**: *Dongjoon Hyun , dev < >>

Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread Holden Karau
+1 On Mon, Nov 6, 2023 at 4:30 PM yangjie01 wrote: > +1 > > > > *From**: *Yuming Wang > *Date**: *Tuesday, November 7, 2023, 07:00 > *To**: *Santosh Pingale > *Cc**: *Dongjoon Hyun , dev > > *Subject**: *Re: Apache Spark 3.4.2 (?) > > > > +1 > > > > On Tue, Nov 7, 2023 at 3:55 AM Santosh Pingale > wrote: > >

Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread yangjie01
+1 From: Yuming Wang Date: Tuesday, November 7, 2023, 07:00 To: Santosh Pingale Cc: Dongjoon Hyun , dev Subject: Re: Apache Spark 3.4.2 (?) +1 On Tue, Nov 7, 2023 at 3:55 AM Santosh Pingale wrote: Makes sense given the nature of those commits. On Mon, Nov 6, 2023, 7:52 PM Dongjoon Hyun

Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread Yuming Wang
+1 On Tue, Nov 7, 2023 at 3:55 AM Santosh Pingale wrote: > Makes sense given the nature of those commits. > > On Mon, Nov 6, 2023, 7:52 PM Dongjoon Hyun > wrote: > >> Hi, All. >> >> Apache Spark 3.4.1 tag was created on Jun 19th and `branch-3.4` has 103 >> commits including important security

Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread Santosh Pingale
Makes sense given the nature of those commits. On Mon, Nov 6, 2023, 7:52 PM Dongjoon Hyun wrote: > Hi, All. > > Apache Spark 3.4.1 tag was created on Jun 19th and `branch-3.4` has 103 > commits including important security and correctness patches like > SPARK-44251, SPARK-44805, and

Re: On adding applyInArrow to groupBy and cogroup

2023-11-06 Thread Hyukjin Kwon
Sounds good, I'll review the PR. On Fri, 3 Nov 2023 at 14:08, Abdeali Kothari wrote: > Seeing more support for arrow based functions would be great. > Gives more control to application developers. And so pandas just becomes 1 > of the available options. > > On Fri, 3 Nov 2023, 21:23 Luca

Re: ASF board report draft for Nov 2023

2023-11-06 Thread Dongjoon Hyun
Thank you, Matei. It would be great if we can include upcoming plans briefly. - Apache Spark 3.4.2 (https://lists.apache.org/thread/35o2169l5r05k2mknqjy9mztq3ty1btr) - Apache Spark 3.3.4 EOL (December 16th) Dongjoon. On 2023/11/06 05:32:11 Matei Zaharia wrote: > It’s time to send our

Apache Spark 3.4.2 (?)

2023-11-06 Thread Dongjoon Hyun
Hi, All. Apache Spark 3.4.1 tag was created on Jun 19th and `branch-3.4` has 103 commits including important security and correctness patches like SPARK-44251, SPARK-44805, and SPARK-44940. https://github.com/apache/spark/releases/tag/v3.4.1 $ git log --oneline v3.4.1..HEAD | wc -l

ASF board report draft for Nov 2023

2023-11-05 Thread Matei Zaharia
It’s time to send our project’s quarterly report to the ASF board on Wednesday November 8th. Here’s what I wrote as a draft; let me know any suggested changes. = Issues for the board: - None Project status: - We released Apache Spark 3.5 on September 15, a feature

Re: [DISCUSS] SPIP: ShuffleManager short name registration via SparkPlugin

2023-11-05 Thread Alessandro Bellina
Thanks for the comments Reynold. This is an ease of use change, and it is not absolutely required (as other ease of use changes are not required either). That said, do we not want to invest in making Spark easier to configure for the average user, or even the user that is trying out Spark? Here

Re: [DISCUSS] SPIP: ShuffleManager short name registration via SparkPlugin

2023-11-04 Thread Reynold Xin
Why do we need this? The reason data source APIs need it is because it will be used by very unsophisticated end users and used all the time (for each connection / query). Shuffle is something you set up once, presumably by fairly sophisticated admins / engineers. On Sat, Nov 04, 2023 at 2:42

[DISCUSS] SPIP: ShuffleManager short name registration via SparkPlugin

2023-11-04 Thread Alessandro Bellina
Hello devs, I would like to start discussion on the SPIP "ShuffleManager short name registration via SparkPlugin" The idea behind this change is to allow a driver plugin (spark.plugins) to export ShuffleManagers via short names, along with sensible default configurations. Users can then use this

Re: On adding applyInArrow to groupBy and cogroup

2023-11-03 Thread Abdeali Kothari
Seeing more support for arrow based functions would be great. Gives more control to application developers. And so pandas just becomes 1 of the available options. On Fri, 3 Nov 2023, 21:23 Luca Canali, wrote: > Hi Enrico, > > > > +1 on supporting Arrow on par with Pandas. Besides the frameworks

RE: On adding applyInArrow to groupBy and cogroup

2023-11-03 Thread Luca Canali
Hi Enrico, +1 on supporting Arrow on par with Pandas. Besides the frameworks and libraries that you mentioned I add awkward array, a library used in High Energy Physics (for those interested more details on how we tested awkward array with Spark from back when mapInArrow was introduced can be

unsubscribe

2023-11-03 Thread Stefan Hagedorn

Re: Spark 3.2.1 parquet read error

2023-10-30 Thread Mich Talebzadeh
Hi, The error when reading Parquet data in Spark 3.2.1 is due to a schema mismatch between the Parquet file and the Spark schema. The Parquet file contains INT32 data for the ss_sold_time_sk column, while the Spark schema expects it to be BIGINT. This schema mismatch is causing the error.
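If the diagnosis above holds, the problem comes down to one column whose physical type in the file differs from the type the reader expects. The following is a minimal sketch of that comparison in plain Python. The helper name find_type_mismatches and the dictionary-based schemas are hypothetical and are not Spark or Parquet APIs; the second column is made up for contrast, and BIGINT is written as INT64, its usual Parquet physical type.

```python
from typing import Dict, List, Tuple


def find_type_mismatches(
    file_schema: Dict[str, str],
    expected_schema: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (column, file_type, expected_type) for columns whose types differ.

    Hypothetical helper for illustration; columns missing from the file are skipped.
    """
    mismatches = []
    for column, expected_type in expected_schema.items():
        file_type = file_schema.get(column)
        if file_type is not None and file_type != expected_type:
            mismatches.append((column, file_type, expected_type))
    return mismatches


# ss_sold_time_sk and its two types come from the message above;
# ss_quantity is an invented column that matches on both sides.
file_schema = {"ss_sold_time_sk": "INT32", "ss_quantity": "INT32"}
expected_schema = {"ss_sold_time_sk": "INT64", "ss_quantity": "INT32"}

mismatches = find_type_mismatches(file_schema, expected_schema)
for column, file_type, expected_type in mismatches:
    print(f"{column}: file has {file_type}, reader expects {expected_type}")
```

In practice the two schemas would be read from the Parquet footer and from the table definition; the sketch only shows the column-by-column comparison that would surface the mismatch described in the reply.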

Spark 3.2.1 parquet read error

2023-10-30 Thread Suryansh Agnihotri
Hello spark-dev, I have loaded TPC-DS data in Parquet format using Spark *3.0.2*, and while reading it from Spark *3.2.1*, my query is failing with the error below. Later I set spark.sql.parquet.enableVectorizedReader=false, but it resulted in a different error. I am also providing output of

Re: On adding applyInArrow to groupBy and cogroup

2023-10-28 Thread Adam Binford
I'm definitely +1 to include this. - It seems like an odd feature parity gap to have a map function but no group apply function. - There's currently no way to use large arrow types with applyInPandas, which can lead to errors hitting the 2 GiB max string/binary array size. I have a PR to Arrow
