Re: [DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-10 Thread Bjørn Jørgensen
t 8:47 AM Wenchen Fan wrote: > >> This should just be a llm-facing index page of Spark docs? Given the >> amount of APIs Spark provides today, I think this index page should be >> useful to humans as well. >> >> On Wed, Sep 10, 2025 at 10:46 PM Dongjoon Hyun >

Re: [DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-10 Thread Allison Wang
uage scopes. On Wed, Sep 10, 2025 at 8:47 AM Wenchen Fan wrote: > This should just be a llm-facing index page of Spark docs? Given the > amount of APIs Spark provides today, I think this index page should be > useful to humans as well. > > On Wed, Sep 10, 2025 at 10:46 PM Dongjoon

Re: [DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-10 Thread Wenchen Fan
This should just be a llm-facing index page of Spark docs? Given the amount of APIs Spark provides today, I think this index page should be useful to humans as well. On Wed, Sep 10, 2025 at 10:46 PM Dongjoon Hyun wrote: > Thank you, Allison and Hyukjin. > > IIUC, this proposal is no

Re: [DISCUSS] Release Apache Spark 3.5.7

2025-09-10 Thread Max Gekk
+1 On Wed, Sep 10, 2025 at 8:13 AM Shaoyun Chen wrote: > +1 > > SPARK-46941[1] also fixed an issue with incorrect results. > > 1. https://issues.apache.org/jira/browse/SPARK-46941 > > On Wed, Sep 10, 2025 at 11:49, Yang Jie wrote: > > > > +1 > > > > On 20

Re: [DISCUSS] Release Apache Spark 3.5.7

2025-09-09 Thread Yang Jie
edin profile > > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> > > > > > > > > > > > > On Tue, 9 Sept 2025 at 16:51, Peter Toth wrote: > > > >> Hi dev list, > >> > >> Apache Spark 3.5.6 was released on May

Re: [DISCUSS] Release Apache Spark 3.5.7

2025-09-09 Thread Wenchen Fan
> > > > On Tue, 9 Sept 2025 at 16:51, Peter Toth wrote: > >> Hi dev list, >> >> Apache Spark 3.5.6 was released on May 29, 2025, so it's been more than 3 >> months. >> As far as I can see, we have ~40 unreleased commits on the branch and 34 >>

Re: [DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-09 Thread Allison Wang
> idea. > > On Tue, 9 Sept 2025 at 10:22, Allison Wang wrote: > >> Hi all, >> >> I’d like to propose adding llms.txt files to the Spark documentation. >> >> As more users rely on AI-assisted tools and LLMs to learn, write Spark >> code, and troubleshoot

Re: [DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-09 Thread Hyukjin Kwon
so it's basically adding one text file for llm, right? I think it's a good idea. On Tue, 9 Sept 2025 at 10:22, Allison Wang wrote: > Hi all, > > I’d like to propose adding llms.txt files to the Spark documentation. > > As more users rely on AI-assisted tools and L

Re: [DISCUSS] Release Apache Spark 3.5.7

2025-09-09 Thread Mich Talebzadeh
Agreed +1 Dr Mich Talebzadeh, Architect | Data Science | Financial Crime | Forensic Analysis | GDPR view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> On Tue, 9 Sept 2025 at 16:51, Peter Toth wrote: > Hi dev list, > > Apache Spark 3.5.6

Re: [DISCUSS] Release Apache Spark 3.5.7

2025-09-09 Thread Dongjoon Hyun
+1 Yes, it's a perfect timing to deliver Spark 3.5.7. Thank you for volunteering for it, Peter. Dongjoon. On 2025/09/09 15:49:36 Peter Toth wrote: > Hi dev list, > > Apache Spark 3.5.6 was released on May 29, 2025, so it's been more than 3 > months. > As far

Re: SPARK-51166 Prepare Apache Spark 4.1.0 for November 2025

2025-09-09 Thread Dongjoon Hyun
Hi, Xiao. Apache Spark project has a world-wide community which is working on November and the community already decided to put more efforts via the monthly releases. Let me rephrase the community schedule. Apache Spark 4.1.0-preview1 (2025-09-02) Apache Spark 4.1.0-preview2 (2025-10-02

Re: SPARK-51166 Prepare Apache Spark 4.1.0 for November 2025

2025-09-08 Thread Xiao Li
ecause we have Apache > Spark 4.1.0-preview1 already. > > In addition, I expect Apache Spark 4.1.0-preview2 in October. So, the > 4.1.0 release will be smoother than ever. > > I will volunteer as the release manager of Apache Spark 4.1.0 to finish it > on time. > > Dongjoon. >

[DISCUSS] SPIP: Add llms.txt files to Spark Documentation

2025-09-08 Thread Allison Wang
Hi all, I’d like to propose adding llms.txt files to the Spark documentation. As more users rely on AI-assisted tools and LLMs to learn, write Spark code, and troubleshoot issues, it’s increasingly important that these tools point back to the up-to-date official documentation. This will help

Re: SPARK-51166 Prepare Apache Spark 4.1.0 for November 2025

2025-09-08 Thread Dongjoon Hyun
Thank you, Holden. Yes, it's true and I agree with all your comments. At this time, we are in a much better situation because we have Apache Spark 4.1.0-preview1 already. In addition, I expect Apache Spark 4.1.0-preview2 in October. So, the 4.1.0 release will be smoother than ever. I

SPARK-51166 Prepare Apache Spark 4.1.0 for November 2025

2025-09-08 Thread Dongjoon Hyun
Hi, All. As of now, the Apache Spark Versioning Policy page is a little outdated because it still shows only the delivered Spark 4.0.0 release window. https://spark.apache.org/versioning-policy.html Since Apache Spark 4.0.0 was announced on May 23rd, I believe we can release 4.1.0 after 6

Re: [External] [ANNOUNCE] Announcing Apache Spark 4.1.0-preview1

2025-09-08 Thread Ofir Manor
ust my two cents, Ofir From: Hyukjin Kwon Sent: Wednesday, September 3, 2025 3:31 AM To: dev Subject: [External] [ANNOUNCE] Announcing Apache Spark 4.1.0-preview1 Hi, all. To enable wide-scale community testing of the upcoming Spark 4.1.0 release, the Apache Spa

Re: [DISCUSS][SPIP] JDBC Driver for Spark Connect

2025-09-08 Thread Martin Grund
IP. > > Thanks, > Cheng Pan > > > > On Sep 4, 2025, at 14:16, Cheng Pan wrote: > > Hi all, > > I’d like to propose introducing a JDBC driver for Spark Connect. > > I believe enabling JDBC support would facilitate a smoother transition > from Spark Thrift Serv

Re: [DISCUSS][SPIP] JDBC Driver for Spark Connect

2025-09-07 Thread Cheng Pan
e introducing a JDBC driver for Spark Connect. > > I believe enabling JDBC support would facilitate a smoother transition > from Spark Thrift Server to Spark Connect Server, and accelerate the > overall adoption of Spark Connect within the community. > > Looking forward to your

Re: [ANNOUNCE] Apache Spark 4.0.1 released

2025-09-07 Thread Hyukjin Kwon
Yay! On Sun, 7 Sept 2025 at 13:54, Dongjoon Hyun wrote: > We are happy to announce the availability of Apache Spark 4.0.1! > > Spark 4.0.1 is the first maintenance release based on the branch-4.0 > maintenance branch of Spark. It contains many fixes including security and > corr

[VOTE][RESULT] Release Spark 4.0.1 (RC1)

2025-09-07 Thread Dongjoon Hyun
The vote passes with 21 +1s (9 binding +1s). Thanks to all who helped with the release! (* = binding) +1: - Max Gekk * - Yang Jie * - Peter Toth - Kent Yao * - John Zhuge - Wenchen Fan * - Cheng Pan - Kousuke Saruta * - Liang-Chi Hsieh * - Dongjoon Hyun * - Huaxin Gao * - Prashant Singh - Jafeer A

[ANNOUNCE] Apache Spark 4.0.1 released

2025-09-06 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 4.0.1! Spark 4.0.1 is the first maintenance release based on the branch-4.0 maintenance branch of Spark. It contains many fixes including security and correctness domains. We strongly recommend all 4.0 users to upgrade to this stable

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-06 Thread Jules Damji
+1 > > On Tue, Sep 2, 2025 at 7:48 AM <dongj...@apache.org> wrote: > > > Please vote on releasing the following candidate as Apache Spark version > > 4.0.1. > > > > The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a > > majo

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-06 Thread Dongjoon Hyun
Thank you all. This vote passed. I'll conclude this vote. Dongjoon. On 2025/09/04 18:02:47 Szehon Ho wrote: > +1 (non binding) > > Checked signature, checksum, and ran basic test on spark-4.0.1-bin-hadoop3. > > Thanks Dongjoon > Szehon > > On Tue, Sep 2,

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-04 Thread Szehon Ho
+1 (non binding) Checked signature, checksum, and ran basic test on spark-4.0.1-bin-hadoop3. Thanks Dongjoon Szehon On Tue, Sep 2, 2025 at 11:50 PM Jungtaek Lim wrote: > +1 (non-binding) > > Thanks, > Jungtaek Lim (HeartSaVioR) > > On Wed, Sep 3, 2025 at 9:16 AM kazuyuki

[DISCUSS][SPIP] JDBC Driver for Spark Connect

2025-09-03 Thread Cheng Pan
Hi all, I’d like to propose introducing a JDBC driver for Spark Connect. I believe enabling JDBC support would facilitate a smoother transition from Spark Thrift Server to Spark Connect Server, and accelerate the overall adoption of Spark Connect within the community. Looking forward to your

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-03 Thread Peter Toth
+1 On Tue, Sep 2, 2025 at 11:49 AM Yang Jie wrote: > +1 > > On 2025/09/02 08:17:17 Max Gekk wrote: > > +1 > > > > On Tue, Sep 2, 2025 at 7:48 AM wrote: > > > > > Please vote on releasing the following candidate as Apache Spark > version > >

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-03 Thread huaxin gao
Tue, Sep 2, 2025 at 1:48 PM wrote: > > >> > > >> Please vote on releasing the following candidate as Apache Spark > version 4.0.1. > > >> > > >> The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a > majority +1 PMC votes are ca

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-03 Thread Holden Karau
< > dongj...@apache.org>, "dev@spark.apache.org" > *Subject: *RE: [EXTERNAL] [VOTE] Release Spark 4.0.1 (RC1) > > > > +1 (non-binding) > > > > On Tue, Sep 2, 2025 at 10:07 AM Anish Shrigondekar > wrote: > > +1 > > > > Thanks, >

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Jungtaek Lim
Vlad > wrote: > >> +1 (non-binding) >> >> >> >> Thank you, >> >> >> >> Vlad >> >> >> >> *From: *Zhou Jiang >> *Date: *Tuesday, September 2, 2025 at 10:10 AM >> *To: *Anish Shrigondekar >> *Cc: *

Re: [ANNOUNCE] Announcing Apache Spark 4.1.0-preview1

2025-09-02 Thread Dongjoon Hyun
Great! Thank you, Hyukjin. Dongjoon. On 2025/09/03 00:31:53 Hyukjin Kwon wrote: > Hi, all. > > To enable wide-scale community testing of the upcoming Spark 4.1.0 release, > the Apache Spark community has posted a Spark 4.1.0-preview1 release > <https://dist.apache.org/repos/

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Dongjoon Hyun
Hi, Yuming. As a release manager, I already evaluated the same request and made a decision on the PR. https://github.com/apache/spark/pull/52165#issuecomment-3240831583 ( [SPARK-53420][BUILD] Upgrade Parquet to 1.16.0 ) Apache Parquet 1.16.0 was just released **one hour** ago. We need to spend

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Max Gekk
+1 On Tue, Sep 2, 2025 at 7:48 AM wrote: > Please vote on releasing the following candidate as Apache Spark version > 4.0.1. > > The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a > majority +1 PMC votes are cast, with > a minimum of 3 +1 votes. > > [ ]

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Kent Yao
+1 On Tuesday, Sep 2, 2025, Peter Toth wrote: > +1 > > On Tue, Sep 2, 2025 at 11:49 AM Yang Jie wrote: > >> +1 >> >> On 2025/09/02 08:17:17 Max Gekk wrote: >> > +1 >> > >> > On Tue, Sep 2, 2025 at 7:48 AM wrote: >> > >> >

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Jafeer Ali
>>> On 2025/09/02 08:17:17 Max Gekk wrote: >>>>> > +1 >>>>> > >>>>> > On Tue, Sep 2, 2025 at 7:48 AM wrote: >>>>> > >>>>> > > Please vote on releasing the following candidate as Apache Spark >>

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Zhou Jiang
>> >>> Dongjoon >>> >>> On 2025/09/02 15:23:55 "L. C. Hsieh" wrote: >>> > +1 >>> > >>> > On Tue, Sep 2, 2025 at 6:08 AM Wenchen Fan >>> wrote: >>> > > >>> > > +1 >>> > >

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Yuming Wang
>> >> *From: *Zhou Jiang >> *Date: *Tuesday, September 2, 2025 at 10:10 AM >> *To: *Anish Shrigondekar >> *Cc: *huaxin gao , Dongjoon Hyun < >> dongj...@apache.org>, "dev@spark.apache.org" >> *Subject: *RE: [EXTERNAL] [VOTE] Release Spark 4.0.

[ANNOUNCE] Announcing Apache Spark 4.1.0-preview1

2025-09-02 Thread Hyukjin Kwon
Hi, all. To enable wide-scale community testing of the upcoming Spark 4.1.0 release, the Apache Spark community has posted a Spark 4.1.0-preview1 release <https://dist.apache.org/repos/dist/release/spark/spark-4.1.0-preview1/>. This preview is not a stable release in terms of either

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread kazuyuki tanimura
>> >> >> From: Zhou Jiang mailto:zhou.c.ji...@gmail.com>> >> Date: Tuesday, September 2, 2025 at 10:10 AM >> To: Anish Shrigondekar >> Cc: huaxin gao mailto:huaxin.ga...@gmail.com>>, >> Dongjoon Hyun mailto:dongj...@apache.org>>,

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Rozov, Vlad
+1 (non-binding) Thank you, Vlad From: Zhou Jiang Date: Tuesday, September 2, 2025 at 10:10 AM To: Anish Shrigondekar Cc: huaxin gao , Dongjoon Hyun , "dev@spark.apache.org" Subject: RE: [EXTERNAL] [VOTE] Release Spark 4.0.1 (RC1) +1 (non-binding) On Tue, Sep 2, 2025 at 10:0

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Anish Shrigondekar
>> > On Tue, Sep 2, 2025 at 6:08 AM Wenchen Fan wrote: >> > > >> > > +1 >> > > >> > > On Tue, Sep 2, 2025 at 1:48 PM wrote: >> > >> >> > >> Please vote on releasing the following candidate as Apache Spark >> ve

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Dongjoon Hyun
+1 Dongjoon On 2025/09/02 15:23:55 "L. C. Hsieh" wrote: > +1 > > On Tue, Sep 2, 2025 at 6:08 AM Wenchen Fan wrote: > > > > +1 > > > > On Tue, Sep 2, 2025 at 1:48 PM wrote: > >> > >> Please vote on releasing the following candida

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Prashant Singh
>>>> >>> On Tue, Sep 2, 2025 at 11:49 AM Yang Jie wrote: >>> >>>> +1 >>>> >>>> On 2025/09/02 08:17:17 Max Gekk wrote: >>>> > +1 >>>> > >>>> > On Tue, Sep 2, 2025 at 7:48 AM wrote: >>>>

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Cheng Pan
+1 (non-binding) Env info: Hadoop 3.4.2, OpenJDK 17, Ubuntu focal arm64 I tested Spark on YARN mode with ESS enabled, and Spark Standalone mode, run some basic queries, everything looks good. Thanks, Cheng Pan > On Sep 2, 2025, at 13:47, dongj...@apache.org wrote: > > Pleas

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread L. C. Hsieh
+1 On Tue, Sep 2, 2025 at 6:08 AM Wenchen Fan wrote: > > +1 > > On Tue, Sep 2, 2025 at 1:48 PM wrote: >> >> Please vote on releasing the following candidate as Apache Spark version >> 4.0.1. >> >> The vote is open until Fri, 05 Sep 2025 22:47:52 PDT

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread John Zhuge
ote: >>> > +1 >>> > >>> > On Tue, Sep 2, 2025 at 7:48 AM wrote: >>> > >>> > > Please vote on releasing the following candidate as Apache Spark >>> version >>> > > 4.0.1. >>> > > >>>

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Kousuke Saruta
+1 On Tue, Sep 2, 2025 at 22:09, Wenchen Fan wrote: > +1 > > On Tue, Sep 2, 2025 at 1:48 PM wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 4.0.1. >> >> The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a >>

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Wenchen Fan
+1 On Tue, Sep 2, 2025 at 1:48 PM wrote: > Please vote on releasing the following candidate as Apache Spark version > 4.0.1. > > The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a > majority +1 PMC votes are cast, with > a minimum of 3 +1 votes. > > [ ]

Re: [VOTE] Release Spark 4.0.1 (RC1)

2025-09-02 Thread Yang Jie
+1 On 2025/09/02 08:17:17 Max Gekk wrote: > +1 > > On Tue, Sep 2, 2025 at 7:48 AM wrote: > > > Please vote on releasing the following candidate as Apache Spark version > > 4.0.1. > > > > The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if

[VOTE] Release Spark 4.0.1 (RC1)

2025-09-01 Thread dongjoon
Please vote on releasing the following candidate as Apache Spark version 4.0.1. The vote is open until Fri, 05 Sep 2025 22:47:52 PDT and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 4.0.1 [ ] -1 Do not release this package
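The passing rule quoted in this vote email can be sketched in a few lines of Python. This is only one possible reading of "a majority +1 PMC votes are cast, with a minimum of 3 +1 votes": it assumes that only binding (PMC) votes count and that "majority" means more binding +1 votes than binding -1 votes. The function name `vote_passes` and the counts below are hypothetical and are not taken from the thread.

```python
def vote_passes(binding_plus: int, binding_minus: int, minimum_plus: int = 3) -> bool:
    """Return True if a release vote passes under the assumed reading.

    Assumption: only binding (PMC) votes are counted, the vote needs at
    least `minimum_plus` binding +1 votes, and binding +1 votes must
    outnumber binding -1 votes.
    """
    return binding_plus >= minimum_plus and binding_plus > binding_minus


# Hypothetical tallies, for illustration only.
print(vote_passes(binding_plus=5, binding_minus=0))  # enough +1s, no -1s
print(vote_passes(binding_plus=2, binding_minus=0))  # below the minimum of 3
print(vote_passes(binding_plus=3, binding_minus=3))  # no majority
```

Non-binding votes are deliberately ignored in this sketch; they are reported in the result emails but, under this reading, do not affect the outcome.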

Re: Apache Spark 4.0.1 ?

2025-08-28 Thread Dongjoon Hyun
> >> status. > >> > >> Thanks, > >> Dongjoon. > >> > >> On 2025/08/26 16:07:35 Anish Shrigondekar wrote: > >> > +1 - Thanks Dongjoon ! > >> > > >> > Thanks, > >> > Anish > >> > > >> > On Tue,

Re: Apache Spark 4.0.1 ?

2025-08-27 Thread Jungtaek Lim
ish >> > >> > On Tue, Aug 26, 2025 at 8:42 AM Wenchen Fan >> wrote: >> > >> > > +1, I think we can still do the PyPi upload with the shared >> spark-upload >> > > account, though it's not ideal. >> > > >>

Re: Apache Spark 4.0.1 ?

2025-08-27 Thread Jules Damji
> > > +1, I think we can still do the PyPi upload with the shared spark-upload > > account, though it's not ideal. > > > > On Tue, Aug 26, 2025 at 4:37 PM Peter Toth <peter.t...@gmail.com> wrote: > > > >> +1 > >> > >> On Tue, Aug 26, 2

Re: Apache Spark 4.0.1 ?

2025-08-27 Thread Jules Damji
> > > +1, I think we can still do the PyPi upload with the shared spark-upload > > account, though it's not ideal. > > > > On Tue, Aug 26, 2025 at 4:37 PM Peter Toth <peter.t...@gmail.com> wrote: > > > >> +1 > >> > >>

Re: [DISCUSS] Proposal to Add Theta and Tuple Sketches to Spark SQL

2025-08-26 Thread Daniel Tenedorio
Hi, I can help review. I helped review the original implementation of HLL sketch aggregate functions into Spark from Ryan Berti earlier. Sorry for not seeing this Spark mailing list thread earlier, I've been out on parental leave for a while (but back now). Best Daniel On 2025/06/04 23:

Re: [DISCUSS] Proposal to Add Theta and Tuple Sketches to Spark SQL

2025-08-26 Thread Daniel Tenedorio
Thanks! I can help with the review for this. On 2025/07/16 17:34:50 "Boumalhab, Chris" wrote: > PR is ready for review here: https://github.com/apache/spark/pull/51298 > - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Rozov, Vlad
+1 Thank you, Vlad From: Ángel Álvarez Pascua Date: Tuesday, August 26, 2025 at 9:24 AM To: Dongjoon Hyun , Dongjoon Hyun Cc: dev Subject: RE: [EXTERNAL] Apache Spark 4.0.1 ? +1. Thanks @Dongjoon Hyun<mailto:dongjoon.h...@gmail.com> On Tue, 26 Aug 2025, 18:20, Dongjoon Hyun mailto

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Ángel Álvarez Pascua
> On 2025/08/26 16:07:35 Anish Shrigondekar wrote: > > +1 - Thanks Dongjoon ! > > > > Thanks, > > Anish > > > > On Tue, Aug 26, 2025 at 8:42 AM Wenchen Fan wrote: > > > > > +1, I think we can still do the PyPi upload with the shared > spark-upload >

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Dongjoon Hyun
Aug 26, 2025 at 8:42 AM Wenchen Fan wrote: > > > +1, I think we can still do the PyPi upload with the shared spark-upload > > account, though it's not ideal. > > > > On Tue, Aug 26, 2025 at 4:37 PM Peter Toth wrote: > > > >> +1 > >> > >

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Anish Shrigondekar
+1 - Thanks Dongjoon ! Thanks, Anish On Tue, Aug 26, 2025 at 8:42 AM Wenchen Fan wrote: > +1, I think we can still do the PyPi upload with the shared spark-upload > account, though it's not ideal. > > On Tue, Aug 26, 2025 at 4:37 PM Peter Toth wrote: > >> +1 >

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Wenchen Fan
+1, I think we can still do the PyPi upload with the shared spark-upload account, though it's not ideal. On Tue, Aug 26, 2025 at 4:37 PM Peter Toth wrote: > +1 > > On Tue, Aug 26, 2025 at 5:10 AM Yang Jie wrote: > >> +1, thank you Dongjoon >> >> Thanks >

Re: Apache Spark 4.0.1 ?

2025-08-26 Thread Peter Toth
r driving this. > > > > > > Thanks, > > > Cheng Pan > > > > > > > > > > > > On Aug 26, 2025, at 00:31, Dongjoon Hyun > wrote: > > > > > > Hi, All. > > > > > > Since the Apache Spark 4.0.0 tag was created i

Re: Apache Spark 4.0.1 ?

2025-08-25 Thread Yang Jie
ug 26, 2025, at 00:31, Dongjoon Hyun wrote: > > > > Hi, All. > > > > Since the Apache Spark 4.0.0 tag was created in May, more than three > > months have passed. > > > > https://github.com/apache/spark/releases/tag/v4.0.0 (2025-05-19) > > >

Re: Apache Spark 4.0.1 ?

2025-08-25 Thread Kent Yao
+1, thank you Dongjoon On Tue, Aug 26, 2025 at 10:15, Cheng Pan wrote: > +1, thank you for driving this. > > Thanks, > Cheng Pan > > > > On Aug 26, 2025, at 00:31, Dongjoon Hyun wrote: > > Hi, All. > > Since the Apache Spark 4.0.0 tag was created in May, more than three

Re: Apache Spark 4.0.1 ?

2025-08-25 Thread Cheng Pan
+1, thank you for driving this. Thanks, Cheng Pan > On Aug 26, 2025, at 00:31, Dongjoon Hyun wrote: > > Hi, All. > > Since the Apache Spark 4.0.0 tag was created in May, more than three months > have passed. > > https://github.com/apache/spark/releases/tag/v4.

Re: Apache Spark 4.0.1 ?

2025-08-25 Thread Bjørn Jørgensen
+1 Thank you, @Dongjoon Hyun On Mon, 25 Aug 2025 at 18:32, Dongjoon Hyun wrote: > Hi, All. > > Since the Apache Spark 4.0.0 tag was created in May, more than three > months have passed. > > https://github.com/apache/spark/releases/tag/v4.0.0 (2025-05-19) > > So far,

Apache Spark 4.0.1 ?

2025-08-25 Thread Dongjoon Hyun
Hi, All. Since the Apache Spark 4.0.0 tag was created in May, more than three months have passed. https://github.com/apache/spark/releases/tag/v4.0.0 (2025-05-19) So far, 124 commits (mostly bug fixes) have been merged into the branch-4.0 branch. $ git log --oneline v4.0.0...HEAD | wc

Re: [Spark SQL][Parquet]: Question about support for Parquet TIME data

2025-08-25 Thread Sarah Gilmore
Hi all, I opened a sub-task (SPARK-53368) of SPARK-51162 to track future discussions. Here's a link[1] to the new JIRA issue. I created a subtask of SPARK-51162 instead of SPARK-51342 since the latter is already a subtask. Thanks for taking the time to consider this enhancement! Best Re

Re: [Spark SQL][Parquet]: Question about support for Parquet TIME data

2025-08-23 Thread serge rielau . com
Wouldn’t isAdjustedToUTC=false imply TIME WITH LOCAL TIMEZONE? That would be a "different" type. Personally, I’d much rather see Spark support TIME/TIMESTAMP WITH TIMEZONE TIMESTAMP WITH LOCAL TIMEZONE has been providing a rich set of "interesting" challenges over the years

Re: [Spark SQL][Parquet]: Question about support for Parquet TIME data

2025-08-23 Thread Max Gekk
UTC=true. I do believe it shouldn't be by default because it is incorrect semantically. Could you open a sub-task of SPARK-51342 for future discussions, please. Yours faithfully, Max Gekk On Tue, Aug 19, 2025 at 10:51 PM Sarah Gilmore wrote: > Hi all, > > My name is Sarah Gi

[Spark SQL][Parquet]: Question about support for Parquet TIME data

2025-08-19 Thread Sarah Gilmore
Hi all, My name is Sarah Gilmore, and I am a software developer at MathWorks[1] as well as a committer for the apache/arrow project. I noticed that the Spark ecosystem is introducing a new data type called TimeType[2] to represent time of day values in the upcoming 4.1.0 release, and I'm

Re: [Spark SQL]: Questions regarding Parquet In Predicate Behavior

2025-08-13 Thread Asif Shahid
Hi Yian, How are you? Though I do not have complete understanding of the code involved, but one difference between or and In is that In would be on a single column, while or pred can involve different columns , so may provide better filter? Regards Asif On Wed, Aug 13, 2025, 10:45 PM Yian Liou wr

[Spark SQL]: Questions regarding Parquet In Predicate Behavior

2025-08-13 Thread Yian Liou
Hi Everyone, I was exploring the details of the Parquet In Predicate in ParquetFilters.scala and had some lingering questions. What are the advantages of pushing ORs rather than an IN predicate from Parquet when the number of items is less than or equal to the InFilterThreshold? I also see that

Re: Problems when using Spark connect on Spark 3.5

2025-07-27 Thread Remzi Yang
Just checked again that the second issue did not appear on 4.0.0. Please ignore the 2nd one On 2025/07/28 03:36:10 Remzi Yang wrote: > Hi Spark experts, > > I met 2 problems when using Spark Connect on 3.5.5. And I am sure they are > not fixed on 4.0.0. Hope you can take a look w

Problems when using Spark connect on Spark 3.5

2025-07-27 Thread Remzi Yang
Hi Spark experts, I met 2 problems when using Spark Connect on 3.5.5. And I am sure they are not fixed on 4.0.0. Hope you can take a look when you have time: 1. No way to do health check on Spark connect service. Spark doesn’t provide a way to enable k8s liveness and readiness probes on

Re: [PR] [SPARK-52941] Make GitHub Actions work for spark-connect-rust [spark-connect-rust]

2025-07-26 Thread via GitHub
xuanyuanking merged PR #2: URL: https://github.com/apache/spark-connect-rust/pull/2 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr

Any contributors interested in SPARK-46912

2025-07-24 Thread Melton Smith
Last month I’ve opened a PR, which fixes StandAlone mode bug in environments (detailed description inside the PR) Is anybody interested in reviewing and merging it? Regards, Melton - To unsubscribe e-mail: dev-unsubscr...@spark.

[PR] [SPARK-52941] Make GitHub Actions work for spark-connect-rust [spark-connect-rust]

2025-07-24 Thread via GitHub
sarutak opened a new pull request, #2: URL: https://github.com/apache/spark-connect-rust/pull/2 # Description This PR proposes to let GA work, and includes fix for a typo in `README.md` I confirmed this change works [on my forked repository](https://github.com/sarutak/spark

Re: [PR] feat: merge `spark-connect-rs` with apache project [spark-connect-rust]

2025-07-23 Thread via GitHub
xuanyuanking merged PR #1: URL: https://github.com/apache/spark-connect-rust/pull/1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr

Re: [PR] feat: merge `spark-connect-rs` with apache project [spark-connect-rust]

2025-07-20 Thread via GitHub
xuanyuanking commented on PR #1: URL: https://github.com/apache/spark-connect-rust/pull/1#issuecomment-3094973078 Waiting for a response from the general Incubator group. If there are no objections, I’ll merge the initial import PR in 48 hours. Ip-clearance registration: https

Re: [VOTE] Release Spark 3.5.7 (RC1)

2025-07-17 Thread Hyukjin Kwon
Date: *Wednesday, July 16, 2025 at 3:53 PM > *To: *Vlad Rozov > *Cc: *"dev@spark.apache.org" > *Subject: *RE: [EXTERNAL] [VOTE] Release Spark 3.5.7 (RC1) > > > > Ah this was 3.5.7, not 3.5.6. 3.5.7 has not been released yet. I am > dealing with the preview right now

Re: [VOTE] Release Spark 3.5.7 (RC1)

2025-07-16 Thread Rozov, Vlad
I see. Should branch-3.5 use 3.5.7-SNAPSHOT instead of 3.5.8-SNAPSHOT? Thank you, Vlad From: Hyukjin Kwon Date: Wednesday, July 16, 2025 at 3:53 PM To: Vlad Rozov Cc: "dev@spark.apache.org" Subject: RE: [EXTERNAL] [VOTE] Release Spark 3.5.7 (RC1) Ah this was 3.5.7, not 3.5.6. 3.5

Re: [VOTE] Release Spark 3.5.7 (RC1)

2025-07-16 Thread Hyukjin Kwon
>> >> >> *From: *Hyukjin Kwon >> *Date: *Monday, June 9, 2025 at 9:01 AM >> *To: *"dev@spark.apache.org" >> *Subject: *RE: [EXTERNAL] [VOTE] Release Spark 3.5.7 (RC1) >> >> >> >> These RC artifacts were dropped properly. >

Re: [VOTE] Release Spark 3.5.7 (RC1)

2025-07-16 Thread Hyukjin Kwon
> > > Vlad > > > > *From: *Hyukjin Kwon > *Date: *Monday, June 9, 2025 at 9:01 AM > *To: *"dev@spark.apache.org" > *Subject: *RE: [EXTERNAL] [VOTE] Release Spark 3.5.7 (RC1) > > > > These RC artifacts were dropped properly. > > > > On

Re: [VOTE] Release Spark 3.5.7 (RC1)

2025-07-16 Thread Rozov, Vlad
What will be the next version on 3.5 branch? Will it be 3.5.7 or 3.5.8 (version in the pom files)? Thank you, Vlad From: Hyukjin Kwon Date: Monday, June 9, 2025 at 9:01 AM To: "dev@spark.apache.org" Subject: RE: [EXTERNAL] [VOTE] Release Spark 3.5.7 (RC1) These RC artifacts we

Re: [DISCUSS] Proposal to Add Theta and Tuple Sketches to Spark SQL

2025-07-16 Thread Boumalhab, Chris
PR is ready for review here: https://github.com/apache/spark/pull/51298

Re: [PR] feat: merge `spark-connect-rs` with apache project [spark-connect-rust]

2025-07-16 Thread via GitHub
xuanyuanking commented on PR #1: URL: https://github.com/apache/spark-connect-rust/pull/1#issuecomment-3077258514 @sjrusso8 I’m submitting the incubator email following the process outlined at https://incubator.apache.org/ip-clearance/ip-clearance-template.html. I’ll report back here once

Re: [PR] feat: merge `spark-connect-rs` with apache project [spark-connect-rust]

2025-07-15 Thread via GitHub
sjrusso8 commented on PR #1: URL: https://github.com/apache/spark-connect-rust/pull/1#issuecomment-3073467335 @xuanyuanking @sarutak Let me know if there is anything else needed from my side! Super excited about this project being added to apache -- This is an automated message from

Re: [VOTE][RESULT] Release Spark 4.1.0-preview1 (RC1)

2025-07-14 Thread Hyukjin Kwon
There is a bit of a problem that I am trying to fix now. It will take a few more days for the release announcement :-). On Mon, 14 Jul 2025 at 08:52, Hyukjin Kwon wrote: > The vote passes with 17 +1s (8 binding +1s) and no -1s. > > (* = binding) > > +1: > Hyukjin Kwon (*) > Dongjoon Hyun (*) > V

[VOTE][RESULT] Release Spark 4.1.0-preview1 (RC1)

2025-07-13 Thread Hyukjin Kwon
The vote passes with 17 +1s (8 binding +1s) and no -1s. (* = binding) +1: Hyukjin Kwon (*) Dongjoon Hyun (*) Vlad Rozov Sakthi Kousuke Saruta (*) Wenchen Fan (*) Sandy Ryza Max Gekk (*) Anton Okolnychyi Jungtaek Lim Kent Yao (*) Peter Toth Jules Damji Ángel Álvarez Pascua Yang Jie (*) Szehon Ho Y

Re: [Question] Use of `super(Class, cls)` in Spark codebase

2025-07-13 Thread Hyukjin Kwon
Yeah, I think it's good to fix them. please go ahead with opening a JIRA and filing a PR. I think it's good to start fixing them. On Sat, 12 Jul 2025 at 18:32, Kyungjun Lee wrote: > Hi all, > > I'm a developer trying to make my first contribution to Apache Spark, a

[Question] Use of `super(Class, cls)` in Spark codebase

2025-07-12 Thread Kyungjun Lee
Hi all, I'm a developer trying to make my first contribution to Apache Spark, and while exploring the codebase I came across something I was curious about. In several places, such as this test case: https://github.com/apache/spark/pull/51225/files
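The excerpt is cut off before the actual question, so the following is only a guess at the pattern being discussed: replacing the explicit two-argument `super(Class, cls)` call with the zero-argument `super()` available in Python 3. The class names below are hypothetical and do not come from the Spark codebase; the sketch only shows that both spellings resolve to the same parent method inside a `classmethod`.

```python
import unittest


class BaseTestCase(unittest.TestCase):
    """Hypothetical base class standing in for a shared test fixture."""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.resource = "base"


class ExplicitStyleTest(BaseTestCase):
    @classmethod
    def setUpClass(cls):
        # Explicit two-argument form, a holdover from Python 2 compatibility.
        super(ExplicitStyleTest, cls).setUpClass()
        cls.style = "explicit"


class ZeroArgStyleTest(BaseTestCase):
    @classmethod
    def setUpClass(cls):
        # Python 3 zero-argument form; the compiler supplies the class
        # and the first argument automatically.
        super().setUpClass()
        cls.style = "zero-arg"


# Both subclasses end up running BaseTestCase.setUpClass.
ExplicitStyleTest.setUpClass()
ZeroArgStyleTest.setUpClass()
print(ExplicitStyleTest.resource, ZeroArgStyleTest.resource)
```

The zero-argument form avoids repeating the class name, so renaming a class cannot leave a stale reference behind.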

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-11 Thread Yuming Wang
, Jul 12, 2025 at 3:13 AM Szehon Ho wrote: > +1 (non-binding) > > Checked signature, checksum, basic functionality of > spark-4.1.0-preview1-bin-hadoop3 > > Thanks for setting this up ! > Szehon > > On Thu, Jul 10, 2025 at 11:18 PM Yang Jie wrote: > >> +1 >

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-11 Thread Szehon Ho
+1 (non-binding) Checked signature, checksum, basic functionality of spark-4.1.0-preview1-bin-hadoop3 Thanks for setting this up ! Szehon On Thu, Jul 10, 2025 at 11:18 PM Yang Jie wrote: > +1 > > On 2025/07/11 04:23:27 Ángel Álvarez Pascua wrote: > > +1 (non-binding) >

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Yang Jie
> >>> It was a mistake in the email. The artifact shouldn't have a problem. > >>> > >>> On Thu, 10 Jul 2025 at 16:00, Hyukjin Kwon wrote: > >>> > >>>> oh yeah. I think I should change the email contents. > >>>>

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Ángel Álvarez Pascua
te: >>> >>>> oh yeah. I think I should change the email contents. >>>> >>>> On Thu, 10 Jul 2025 at 15:02, Saruta, Kousuke >>>> wrote: >>>> >>>>> Using dev1 rather than preview1 seems intended. >>>>>

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Jules Damji
1 seems intended. https://github.com/apache/spark/blob/v4.1.0-preview1-rc1/dev/create-release/release-build.sh#L127 From: Jungtaek Lim <kabhwan.opensou...@gmail.com> Date: Thursday, July 10, 2025 14:30 To: Kent Yao <y...@apache.org> Cc: Anton Okolnychyi <aokolnyc...@gmail.com>, Max Gekk <max.g...@

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Peter Toth
kjin Kwon wrote: >> >>> oh yeah. I think I should change the email contents. >>> >>> On Thu, 10 Jul 2025 at 15:02, Saruta, Kousuke >>> wrote: >>> >>>> Using dev1 rather than preview1 seems intended. >>>> >>>>

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Kent Yao
>> On Thu, 10 Jul 2025 at 15:02, Saruta, Kousuke >> wrote: >> >>> Using dev1 rather than preview1 seems intended. >>> >>> https://github.com/apache/spark/blob/v4.1.0-preview1- >>> rc1/dev/create-release/release-build.sh#L127 >>> >

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Hyukjin Kwon
an preview1 seems intended. >> >> >> https://github.com/apache/spark/blob/v4.1.0-preview1-rc1/dev/create-release/release-build.sh#L127 >> >> >> >> *From: *Jungtaek Lim >> *Date: *Thursday, July 10, 2025 14:30 >> *To: *Kent Yao >> *Cc: *Anton Okolny

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-10 Thread Hyukjin Kwon
oh yeah. I think I should change the email contents. On Thu, 10 Jul 2025 at 15:02, Saruta, Kousuke wrote: > Using dev1 rather than preview1 seems intended. > > > https://github.com/apache/spark/blob/v4.1.0-preview1-rc1/dev/create-release/release-build.sh#L127 > > > > *送信
