Re: [DISCUSS] clarify the definition of behavior changes

2024-05-15 Thread Wenchen Fan
Date: Thursday, 2 May 2024 at 11:57 To: Wenchen Fan Cc: Erik Krogen, Spark dev list <dev@spark.apache.org> Subject: Re: [DISCUSS] clarify the definition of behavior changes CAUTION: This email originates from an external party (outside of Pal

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-02 Thread Will Raschkowski
appreciate Wenchen's stricter definition of "behavior changes" (especially for silent ones). From: Nimrod Ofek Date: Thursday, 2 May 2024 at 11:57 To: Wenchen Fan Cc: Erik Krogen, Spark dev list Subject: Re: [DISCUSS] clarify the definition of behavior changes

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-02 Thread Nimrod Ofek
Hi Erik and Wenchen, I think that a good practice with public APIs, and with internal APIs that have a big impact and a lot of usage, is usually to ease in changes by giving new parameters defaults that keep the former behaviour in a method with the previous signature, marked with a deprecation notice, and
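A minimal sketch of the practice Nimrod describes, written in Python for brevity. The names `trim`, `trim_values`, and `trim_tabs` are hypothetical and are not Spark APIs; the point is only that the new parameter defaults to the former behaviour and the previous signature stays callable with a deprecation notice.

```python
import warnings


def trim_values(values, trim_tabs=False):
    """New entry point (hypothetical).

    `trim_tabs` is the newly added parameter. Its default, False,
    preserves the former behaviour of stripping spaces only.
    """
    chars = " \t" if trim_tabs else " "
    return [v.strip(chars) for v in values]


def trim(values):
    """Previous signature (hypothetical), kept so existing callers still work."""
    warnings.warn(
        "trim() is deprecated; use trim_values() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegate with the default so old callers see unchanged results.
    return trim_values(values)
```

Existing callers of `trim` get the same output as before plus a warning, while new callers opt in to the changed behaviour explicitly with `trim_values(values, trim_tabs=True)`.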

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Wenchen Fan
Hi Erik, Thanks for sharing your thoughts! Note: developer APIs are also public APIs (such as Data Source V2 API, Spark Listener API, etc.), so breaking changes should be avoided as much as we can and new APIs should be mentioned in the release notes. Breaking binary compatibility is also a

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Erik Krogen
Thanks for raising this important discussion Wenchen! Two points I would like to raise, though I'm fully supportive of any improvements in this regard, my points below notwithstanding -- I am not intending to let perfect be the enemy of good here. On a similar note as Santosh's comment, we should

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Wenchen Fan
Good point, Santosh! I was originally targeting end users who write queries with Spark, as this is probably the largest user base. But we should definitely consider other users who deploy and manage Spark clusters. Those users are usually more tolerant of behavior changes and I think it should be

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Santosh Pingale
Thanks Wenchen for starting this! How do we define "the user" for Spark? 1. End users: There are some users that use Spark as a service from a provider 2. Providers/Operators: There are some users that provide Spark as a service for their internal (on-prem setup with YARN/k8s)/external (Something

[DISCUSS] clarify the definition of behavior changes

2024-04-30 Thread Wenchen Fan
Hi all, It's exciting to see innovations keep happening in the Spark community and Spark keeps evolving itself. To make these innovations available to more users, it's important to help users upgrade to newer Spark versions easily. We've done a good job on it: the PR template requires the author