Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Jungtaek Lim
+1 love to see it! On Thu, May 2, 2024 at 10:08 AM Holden Karau wrote: > +1 :) yay previews > > On Wed, May 1, 2024 at 5:36 PM Chao Sun wrote: > >> +1 >> >> On Wed, May 1, 2024 at 5:23 PM Xiao Li wrote: >> >>> +1 for next Monday. >>> >>> We can do more previews when the other features are

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Holden Karau
+1 :) yay previews On Wed, May 1, 2024 at 5:36 PM Chao Sun wrote: > +1 > > On Wed, May 1, 2024 at 5:23 PM Xiao Li wrote: > >> +1 for next Monday. >> >> We can do more previews when the other features are ready for preview. >> >> On Wed, May 1, 2024 at 08:46, Tathagata Das wrote: >> >>> Next week sounds

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Wenchen Fan
Hi Erik, Thanks for sharing your thoughts! Note: developer APIs are also public APIs (such as Data Source V2 API, Spark Listener API, etc.), so breaking changes should be avoided as much as we can and new APIs should be mentioned in the release notes. Breaking binary compatibility is also a

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Chao Sun
+1 On Wed, May 1, 2024 at 5:23 PM Xiao Li wrote: > +1 for next Monday. > > We can do more previews when the other features are ready for preview. > > On Wed, May 1, 2024 at 08:46, Tathagata Das wrote: > >> Next week sounds great! Thank you Wenchen! >> >> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: >>

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Hyukjin Kwon
SGTM On Thu, 2 May 2024 at 02:06, Dongjoon Hyun wrote: > +1 for next Monday. > > Dongjoon. > > On Wed, May 1, 2024 at 8:46 AM Tathagata Das > wrote: > >> Next week sounds great! Thank you Wenchen! >> >> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: >> >>> Yea I think a preview release

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Xiao Li
+1 for next Monday. We can do more previews when the other features are ready for preview. On Wed, May 1, 2024 at 08:46, Tathagata Das wrote: > Next week sounds great! Thank you Wenchen! > > On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: > >> Yea I think a preview release won't hurt (without a branch

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Dongjoon Hyun
+1 for next Monday. Dongjoon. On Wed, May 1, 2024 at 8:46 AM Tathagata Das wrote: > Next week sounds great! Thank you Wenchen! > > On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: > >> Yea I think a preview release won't hurt (without a branch cut). We don't >> need to wait for all the

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Erik Krogen
Thanks for raising this important discussion Wenchen! Two points I would like to raise, though I'm fully supportive of any improvements in this regard, my points below notwithstanding -- I am not intending to let perfect be the enemy of good here. On a similar note as Santosh's comment, we should

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Wenchen Fan
Good point, Santosh! I was originally targeting end users who write queries with Spark, as this is probably the largest user base. But we should definitely consider other users who deploy and manage Spark clusters. Those users are usually more tolerant of behavior changes and I think it should be

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Tathagata Das
Next week sounds great! Thank you Wenchen! On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: > Yea I think a preview release won't hurt (without a branch cut). We don't > need to wait for all the ongoing projects to be ready. How about we do a > 4.0 preview release based on the current master

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Santosh Pingale
Thanks Wenchen for starting this! How do we define "the user" for Spark? 1. End users: There are some users that use Spark as a service from a provider 2. Providers/Operators: There are some users that provide Spark as a service for their internal (on-prem setup with yarn/k8s)/external (Something

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Wenchen Fan
Yea I think a preview release won't hurt (without a branch cut). We don't need to wait for all the ongoing projects to be ready. How about we do a 4.0 preview release based on the current master branch next Monday? On Wed, May 1, 2024 at 11:06 PM Tathagata Das wrote: > Hey all, > > Reviving

Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Tathagata Das
Hey all, Reviving this thread, but Spark master has already accumulated a huge amount of changes. As a downstream project maintainer, I want to really start testing the new features and other breaking changes, and it's hard to do that without a Preview release. So the sooner we make a Preview

Re: Potential Impact of Hive Upgrades on Spark Tables

2024-05-01 Thread Mich Talebzadeh
It is important to consider potential impacts on Spark tables stored in the Hive metastore during an "upgrade". Depending on the upgrade path, the Hive metastore schema or SerDes behavior might change, requiring adjustments in the Spark code or configurations. I mentioned the need to test the