We have addressed the CDC items I mentioned earlier. I spent more time looking into CDC support in general and didn't find any new issues.
Andreas opened two PRs to address the transaction problems above. I am going to review them today, and the plan is to get the PRs in by the end of this week (hopefully) or early next week (if iterations are needed).

- Anton

On Thu, May 28, 2026 at 09:07, huaxin gao <[email protected]> wrote:

> Thanks Szehon for the update. These two Auto CDC PRs look good to me.
>
> Anton and Andreas, could you share the current status of the DSv2
> transaction fixes for SPARK-56695 and SPARK-56995, and when you expect them
> to be merged and backported to branch-4.2?
>
> Once these pending items are in, I can proceed with cutting RC1.
>
> Thanks,
> Huaxin
>
> On Wed, May 27, 2026 at 4:56 PM Szehon Ho <[email protected]> wrote:
>
>> Hi Huaxin
>>
>> Thanks for all the hard work doing the release!
>>
>> It'd be nice to get these two PRs by Anish in for the Spark 4.2 feature
>> Auto CDC (although it's not the end of the world if we cannot).
>>
>> - https://github.com/apache/spark/pull/53073
>> - https://github.com/apache/spark/pull/56160
>>
>> The first one is a day 0 bug for SDP and the second is a validation that'd
>> be awkward to add after the release.
>>
>> We will aim to get them in by EOD, but that depends on CI.
>>
>> Thanks!
>> Szehon
>>
>> On Tue, May 26, 2026 at 7:37 PM Cheng Pan <[email protected]> wrote:
>>
>>> I apologize for any inconvenience caused.
>>>
>>> My intention was to keep the PR open for at least 1-2 workdays (based on the
>>> size and complexity of the patch, and not wanting to keep it open so long
>>> that it blocks the release process) so that developers from all time zones
>>> would have the opportunity to review it, but I was completely unaware that
>>> Monday was a holiday in the US. The merge happened on Tue 11:48 AM PDT,
>>> after a formal approval from a PMC member active in the SQL area; half a
>>> workday is indeed too short for reviewers based in the US to review.
>>>
>>> Apologies again, and I'm happy to address any post-review comments.
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>>
>>>
>>> On May 27, 2026, at 09:15, huaxin gao <[email protected]> wrote:
>>>
>>> Hi Cheng,
>>>
>>> Thanks for working on this fix.
>>>
>>> Since this has already been merged into branch-4.2, I will trust your
>>> judgment on the fix itself, but I do have some concerns about the process.
>>>
>>> The PR was opened over the weekend, Monday was a US holiday, and the
>>> 12-hour notice was sent at 10:59 PM Monday night Pacific time. In practice,
>>> that did not leave enough review time before merging into the release
>>> branch. This is especially concerning for a last-minute change close to RC
>>> that includes an API change and behavior changes beyond the narrow
>>> correctness issue.
>>>
>>> For future 4.2.0 release-branch changes, could we please allow more
>>> practical review time?
>>>
>>> Thanks,
>>> Huaxin
>>>
>>> On Mon, May 25, 2026 at 10:59 PM Cheng Pan <[email protected]> wrote:
>>>
>>>> Huaxin, thank you for replying.
>>>>
>>>> I would not treat it as a hard blocker, given that it has existed for a
>>>> long time and the impact scope is fairly narrow, but it would still be
>>>> good to include the fix in 4.2.0, given that it is a relatively small
>>>> change.
>>>>
>>>> > The PR also includes API changes and new TABLESAMPLE SYSTEM support
>>>> ...
>>>> > … unless you think the correctness fix needs to be split out
>>>> separately.
>>>>
>>>> The three parts mentioned in the PR description can be split into
>>>> dedicated PRs, but the correctness fix for (1) requires the API change;
>>>> the changes for (2) and (3) are small, and I put them together mainly to
>>>> demonstrate why the API change makes sense. I'm fine with splitting the PR
>>>> and deferring the "new TABLESAMPLE SYSTEM support" to 4.3 if you think
>>>> it's risky.
>>>>
>>>> The PR has been reviewed and approved by cloud-fan. I will leave it
>>>> open for another 12 hours and merge it as is if there are no further
>>>> comments.
>>>>
>>>> Thanks,
>>>> Cheng Pan
>>>>
>>>>
>>>>
>>>> On May 26, 2026, at 00:53, huaxin gao <[email protected]> wrote:
>>>>
>>>> Hi Cheng,
>>>>
>>>> Thanks for flagging this. The withReplacement = true pushdown issue
>>>> looks valid, but the impact seems fairly narrow. It mainly affects users
>>>> doing JDBC TABLESAMPLE pushdown with withReplacement = true on PostgreSQL
>>>> or Databricks. The PR also includes API changes and new TABLESAMPLE SYSTEM
>>>> support, which feels more like a 4.2.1 candidate than a last-minute RC
>>>> change.
>>>>
>>>> Could you evaluate the risk of merging at the last minute? Otherwise
>>>> I'd prefer 4.2.1, unless you think the correctness fix needs to be split
>>>> out separately.
>>>>
>>>> Thanks,
>>>>
>>>> Huaxin
>>>>
>>>> On Mon, May 25, 2026 at 3:27 AM Cheng Pan <[email protected]> wrote:
>>>>
>>>>> Hi Huaxin,
>>>>>
>>>>> I found some issues in the implementation of JDBC connector
>>>>> TABLESAMPLE pushdown and opened SPARK-57040 and
>>>>> https://github.com/apache/spark/pull/56092. Since you are the author of
>>>>> this feature, it would be great if you could take a look and evaluate
>>>>> whether this is a blocker and should be included in 4.2.0.
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>>
>>>>>
>>>>> On May 18, 2026, at 11:40, huaxin gao <[email protected]> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I plan to cut Spark 4.2.0 RC1 on May 20, assuming there are no
>>>>> outstanding release blockers.
>>>>>
>>>>> If you have any fixes that must be included in 4.2.0, please make sure
>>>>> they are merged/backported to branch-4.2 before then. If you are
>>>>> aware of any release blockers, please reply with the JIRA/PR and current
>>>>> status.
>>>>>
>>>>> Thanks,
>>>>> Huaxin
>>>>>
>>>>>
>>>>>
>>>>
>>>
