+1 on fixing known issues to protect the quality of Apache Spark releases, rather than assuming “not a regression” means we should ship with known problems.
Longer term, once we move to a more frequent release cadence, if an issue comes from a new feature, we should default that feature off (or gate it) without blocking the release. But if it is a day-zero bug that isn't technically a regression, we should still address it before releasing.

Sent from my iPhone

> On Dec 17, 2025, at 3:19 AM, Mark Hamstra <[email protected]> wrote:
>
> On a little higher level, not restricted to just this issue/PR, there
> is a distinct difference between "if there is no regression, then we
> can release without fixing the issue" and "if there is no regression,
> then we must release without fixing the issue". I don't believe that
> the latter has ever been established as agreed-upon policy in the
> Spark project. I also don't believe that it is a good policy: there
> are issues worth taking the time to fix (or at least carefully
> discuss) even if they are not regressions.
>
>> On Tue, Dec 16, 2025 at 5:54 AM Herman van Hovell via dev
>> <[email protected]> wrote:
>>
>> Dongjoon,
>>
>> I have a couple of problems with this course of action:
>>
>> You seem to be favoring speed over quality here. Even if my vote were
>> erroneous, you should give me more than two hours to respond. This is a
>> global community; not everyone is awake at the same time. As far as I
>> know, we try to follow a consensus-driven decision-making process here;
>> this seems to be diametrically opposed to that.
>> The problem itself is serious since it can cause driver crashes. In
>> general, I believe that we should not be in the business of shipping
>> obviously broken things. The only thing you are doing now is increasing
>> toil by forcing us to release a patch version almost immediately.
>> The offending change was backported to a maintenance release. That is
>> something different from it being a previously known problem.
>> I am not sure I follow the PR argument. You merged my initial PR without
>> even checking in with me. That PR fixed the issue; it just needed proper
>> tests and some touch-ups (again, quality is important). I opened a
>> follow-up that contains proper testing, and yes, it fails because of a
>> change in error types; it happens, and I will fix it. The statement that
>> we don't have a fix is untrue, and the fact that you state otherwise
>> makes me seriously doubt your judgement here. You could have asked me or
>> someone else, or you could have leaned in and checked it yourself.
>>
>> I would like to understand why there is such a rush here.
>>
>> Kind regards,
>> Herman
>>
>>> On Tue, Dec 16, 2025 at 7:27 AM Dongjoon Hyun <[email protected]> wrote:
>>>
>>> After rechecking, this vote passed.
>>>
>>> I'll send a vote result email.
>>>
>>> Dongjoon.
>>>
>>> On 2025/12/16 11:03:39 Dongjoon Hyun wrote:
>>>> Hi, All.
>>>>
>>>> I've been working with Herman's PRs so far.
>>>>
>>>> As a kind of fact-checking, I need to correct two things in the RC3
>>>> thread.
>>>>
>>>> First, Herman claimed that he found a regression in Apache Spark
>>>> 4.1.0, but that is not accurate because Apache Spark 4.0.1 has also
>>>> had SPARK-53342 since 2025-09-06.
>>>>
>>>> Second, although Herman shared a patch with us last Friday, Herman
>>>> also made another PR containing the main code change 9 hours ago. In
>>>> addition, unfortunately, it hasn't passed our CIs yet. This simply
>>>> means that there is no complete patch in the community yet for either
>>>> Apache Spark 4.1.0 or 4.0.2.
>>>>
>>>> https://github.com/apache/spark/pull/53480
>>>> ([SPARK-54696][CONNECT] Clean-up Arrow Buffers - follow-up)
>>>>
>>>> In short, he seems to have blocked RC3 by mistake. I'm re-checking
>>>> the situation around the RC3 vote and `branch-4.1`.
>>>>
>>>> Dongjoon.
>>>>
>>>>>>> On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
>>>>>>>> I pasted a non-existing link for the root cause.
>>>>>>>> The actual link is here:
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-53342
>>>>>>>>
>>>>>>>> On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hey Dongjoon,
>>>>>>>>>
>>>>>>>>> Regarding your questions:
>>>>>>>>>
>>>>>>>>> 1. If you define a large-ish local relation (which makes us cache
>>>>>>>>> it on the server side) and keep using it, we leak off-heap memory
>>>>>>>>> every time it is used. At some point the OS will OOM-kill the
>>>>>>>>> driver. While I have a repro, testing it like this in CI is not a
>>>>>>>>> good idea. As an alternative, I am working on a test that checks
>>>>>>>>> buffer clean-up. For the record, I don't appreciate the term
>>>>>>>>> `claim` here; I am not blocking a release without genuine concern.
>>>>>>>>> 2. The root cause is
>>>>>>>>> https://databricks.atlassian.net/browse/SPARK-53342 and not the
>>>>>>>>> large local relations work.
>>>>>>>>> 3. A PR has been open since Friday:
>>>>>>>>> https://github.com/apache/spark/pull/53452. I hope that I can get
>>>>>>>>> it merged today.
>>>>>>>>> 4. I don't see a reason why.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Herman
>>>>>>>>>
>>>>>>>>> On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> How can we verify the regression, Herman?
>>>>>>>>>>
>>>>>>>>>> It's a little difficult for me to evaluate your claim so far due
>>>>>>>>>> to the lack of shared information. Specifically, there has been
>>>>>>>>>> no update for the last 3 days on "SPARK-54696 (Spark Connect
>>>>>>>>>> LocalRelation support leaks off-heap memory)" after you created
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> Could you provide us more technical information about your Spark
>>>>>>>>>> Connect issue?
>>>>>>>>>>
>>>>>>>>>> 1. How can we reproduce your claim? Do you have a test case?
>>>>>>>>>>
>>>>>>>>>> 2. For the root cause, I'm wondering if you mean literally
>>>>>>>>>> SPARK-53917 (Support large local relations) or another JIRA
>>>>>>>>>> issue. Which commit is the root cause?
>>>>>>>>>>
>>>>>>>>>> 3. Since you assigned SPARK-54696 to yourself 3 days ago, do you
>>>>>>>>>> want to provide a PR soon?
>>>>>>>>>>
>>>>>>>>>> 4. If you need more time, shall we simply revert the root cause
>>>>>>>>>> from Apache Spark 4.1.0?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Dongjoon
>>>>>>>>>>
>>>>>>>>>> On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
>>>>>>>>>>> Yes. It is a regression in Spark 4.1. The root cause is a
>>>>>>>>>>> change where we fail to clean up allocated (off-heap) buffers.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Herman.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you mean that is a regression in Apache Spark 4.1.0?
>>>>>>>>>>>>
>>>>>>>>>>>> If so, do you know what the root cause was?
>>>>>>>>>>>>
>>>>>>>>>>>> Dongjoon.
>>>>>>>>>>>>
>>>>>>>>>>>> On 2025/12/13 23:09:02 Herman van Hovell via dev wrote:
>>>>>>>>>>>>> -1. We need to get
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-54696 fixed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Dec 13, 2025 at 11:07 AM Jules Damji
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1 non-binding
>>>>>>>>>>>>>> —
>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>> Pardon the dumb thumb typos :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Dec 11, 2025, at 8:34 AM, [email protected] wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please vote on releasing the following candidate as Apache
>>>>>>>>>>>>>>> Spark version 4.1.0.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The vote is open until Sun, 14 Dec 2025 09:34:31 PST and
>>>>>>>>>>>>>>> passes if a majority of +1 PMC votes are cast, with a
>>>>>>>>>>>>>>> minimum of 3 +1 votes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [ ] +1 Release this package as Apache Spark 4.1.0
>>>>>>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>>>>>>>> https://spark.apache.org/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The tag to be voted on is v4.1.0-rc3 (commit e221b56be7b):
>>>>>>>>>>>>>>> https://github.com/apache/spark/tree/v4.1.0-rc3
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The release files, including signatures, digests, etc. can
>>>>>>>>>>>>>>> be found at:
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>>>>>>>>>> https://downloads.apache.org/spark/KEYS
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1508/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The documentation corresponding to this release can be
>>>>>>>>>>>>>>> found at:
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The list of bug fixes going into 4.1.0 can be found at the
>>>>>>>>>>>>>>> following URL:
>>>>>>>>>>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12355581
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FAQ
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> =========================
>>>>>>>>>>>>>>> How can I help test this release?
>>>>>>>>>>>>>>> =========================
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you are a Spark user, you can help us test this release
>>>>>>>>>>>>>>> by taking an existing Spark workload and running it on this
>>>>>>>>>>>>>>> release candidate, then reporting any regressions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you're working in PySpark, you can set up a virtual env
>>>>>>>>>>>>>>> and install the current RC via "pip install
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
>>>>>>>>>>>>>>> and see if anything important breaks.
>>>>>>>>>>>>>>> In Java/Scala, you can add the staging repository to your
>>>>>>>>>>>>>>> project's resolvers and test with the RC (make sure to
>>>>>>>>>>>>>>> clean up the artifact cache before/after so you don't end
>>>>>>>>>>>>>>> up building with an out-of-date RC going forward).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>> To unsubscribe e-mail: [email protected]
