Hi all,

The blocking issues discussed earlier have been resolved and backported to branch-4.2.
I plan to cut 4.2.0 RC1 shortly today. Please let me know as soon as possible if you are aware of any remaining release blockers.

Thanks,
Huaxin

On Wed, Jun 3, 2026 at 11:06 AM Anton Okolnychyi <[email protected]> wrote:

> I think we finally agreed on the way forward for both items, so we should
> be able to wrap up quickly.
>
> On Tue, Jun 2, 2026 at 09:27 huaxin gao <[email protected]> wrote:
>
>> Hi all,
>>
>> Just to keep everyone updated, the Spark 4.2.0 RC1 cut is still pending
>> on the two DSv2 transaction fixes:
>>
>> 1. https://issues.apache.org/jira/browse/SPARK-56695
>> 2. https://issues.apache.org/jira/browse/SPARK-56995
>>
>> Once these fixes are merged and backported to branch-4.2, I will proceed
>> with cutting RC1.
>>
>> Thanks,
>> Huaxin
>>
>> On Thu, May 28, 2026 at 10:04 AM Anton Okolnychyi <[email protected]>
>> wrote:
>>
>>> We have addressed the CDC items I mentioned earlier. I spent more time
>>> looking into CDC support in general and didn't find any new issues.
>>>
>>> Andreas opened two PRs to address the transaction problems above. I am
>>> going to review them today, and the plan is to get the PRs in by the end
>>> of this week (hopefully) or early next week (if iterations are needed).
>>>
>>> - Anton
>>>
>>> On Thu, May 28, 2026 at 09:07 huaxin gao <[email protected]> wrote:
>>>
>>>> Thanks Szehon for the update. These two Auto CDC PRs look good to me.
>>>>
>>>> Anton and Andreas, could you share the current status of the DSv2
>>>> transaction fixes for SPARK-56695 and SPARK-56995, and when you expect
>>>> them to be merged and backported to branch-4.2?
>>>>
>>>> Once these pending items are in, I can proceed with cutting RC1.
>>>>
>>>> Thanks,
>>>> Huaxin
>>>>
>>>> On Wed, May 27, 2026 at 4:56 PM Szehon Ho <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Huaxin
>>>>>
>>>>> Thanks for all the hard work doing the release!
>>>>>
>>>>> It'd be nice to get these two PRs by Anish in for the Spark 4.2 feature
>>>>> Auto CDC (although it's not the end of the world if we cannot).
>>>>>
>>>>> - https://github.com/apache/spark/pull/53073
>>>>> - https://github.com/apache/spark/pull/56160
>>>>>
>>>>> The first one is a day-0 bug for SDP and the second is a validation
>>>>> that'd be awkward to add after the release.
>>>>>
>>>>> We will aim to get them in by EOD, but that depends on CI.
>>>>>
>>>>> Thanks!
>>>>> Szehon
>>>>>
>>>>> On Tue, May 26, 2026 at 7:37 PM Cheng Pan <[email protected]> wrote:
>>>>>
>>>>>> I apologize for any inconvenience caused.
>>>>>>
>>>>>> My intention was to keep the PR open for at least 1-2 workdays (based
>>>>>> on the size and complexity of the patch, and not wanting to keep it
>>>>>> open so long that it blocks the release process) so that developers
>>>>>> from all time zones would have the opportunity to review it, but I was
>>>>>> completely unaware that Monday is a holiday in the US. The merge
>>>>>> happened on Tue 11:48 AM PDT, after a formal approval from a PMC
>>>>>> member active in the SQL area; half a workday is indeed too short for
>>>>>> reviewers based in the US to review.
>>>>>>
>>>>>> I apologize again, and I'm happy to address any post-review comments.
>>>>>>
>>>>>> Thanks,
>>>>>> Cheng Pan
>>>>>>
>>>>>>
>>>>>> On May 27, 2026, at 09:15, huaxin gao <[email protected]> wrote:
>>>>>>
>>>>>> Hi Cheng,
>>>>>>
>>>>>> Thanks for working on this fix.
>>>>>>
>>>>>> Since this has already been merged into branch-4.2, I will trust
>>>>>> your judgment on the fix itself, but I do have some concerns about the
>>>>>> process.
>>>>>>
>>>>>> The PR was opened over the weekend, Monday was a US holiday, and the
>>>>>> 12-hour notice was sent at 10:59 PM Monday night Pacific time. In
>>>>>> practice, that did not leave enough review time before merging into
>>>>>> the release branch.
>>>>>> This is especially concerning for a last-minute change close to RC
>>>>>> that includes an API change and behavior changes beyond the narrow
>>>>>> correctness issue.
>>>>>>
>>>>>> For future 4.2.0 release-branch changes, could we please allow more
>>>>>> practical review time?
>>>>>>
>>>>>> Thanks,
>>>>>> Huaxin
>>>>>>
>>>>>> On Mon, May 25, 2026 at 10:59 PM Cheng Pan <[email protected]> wrote:
>>>>>>
>>>>>>> Huaxin, thank you for replying.
>>>>>>>
>>>>>>> I would not treat it as a hard blocker, given that it has existed for
>>>>>>> a long time and the impact scope is fairly narrow, but it would still
>>>>>>> be good to include the fix in 4.2.0 given that it is a relatively
>>>>>>> small change.
>>>>>>>
>>>>>>> > The PR also includes API changes and new TABLESAMPLE SYSTEM
>>>>>>> > support ...
>>>>>>> > … unless you think the correctness fix needs to be split out
>>>>>>> > separately.
>>>>>>>
>>>>>>> The 3 parts mentioned in the PR description can be split into
>>>>>>> dedicated PRs, but the correctness fix for (1) requires the API
>>>>>>> change; the changes for (2) and (3) are small, and I put them
>>>>>>> together mainly to demonstrate why the API change makes sense. I'm
>>>>>>> fine with splitting the PR and deferring the "new TABLESAMPLE SYSTEM
>>>>>>> support" to 4.3 if you think it's risky.
>>>>>>>
>>>>>>> The PR has been reviewed and approved by cloud-fan. I will leave it
>>>>>>> open for another 12 hours and merge it as is if there are no further
>>>>>>> comments.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cheng Pan
>>>>>>>
>>>>>>>
>>>>>>> On May 26, 2026, at 00:53, huaxin gao <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Cheng,
>>>>>>>
>>>>>>> Thanks for flagging this. The withReplacement = true pushdown issue
>>>>>>> looks valid, but the impact seems fairly narrow. It mainly affects
>>>>>>> users doing JDBC TABLESAMPLE pushdown with withReplacement = true on
>>>>>>> PostgreSQL or Databricks.
>>>>>>> The PR also includes API changes and new TABLESAMPLE SYSTEM support,
>>>>>>> which feels more like a 4.2.1 candidate than a last-minute RC change.
>>>>>>>
>>>>>>> Could you evaluate the risk of merging at the last minute? Otherwise
>>>>>>> I'd prefer 4.2.1, unless you think the correctness fix needs to be
>>>>>>> split out separately.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Huaxin
>>>>>>>
>>>>>>> On Mon, May 25, 2026 at 3:27 AM Cheng Pan <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Huaxin,
>>>>>>>>
>>>>>>>> I found some issues in the implementation of JDBC connector
>>>>>>>> TABLESAMPLE pushdown and opened SPARK-57040 and
>>>>>>>> https://github.com/apache/spark/pull/56092. Since you are the author
>>>>>>>> of this feature, it would be great if you could take a look and
>>>>>>>> evaluate whether this is a blocker and should be included in 4.2.0.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Cheng Pan
>>>>>>>>
>>>>>>>>
>>>>>>>> On May 18, 2026, at 11:40, huaxin gao <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I plan to cut Spark 4.2.0 RC1 on May 20, assuming there are no
>>>>>>>> outstanding release blockers.
>>>>>>>>
>>>>>>>> If you have any fixes that must be included in 4.2.0, please make
>>>>>>>> sure they are merged/backported to branch-4.2 before then. If you
>>>>>>>> are aware of any release blockers, please reply with the JIRA/PR and
>>>>>>>> current status.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Huaxin
