Re: Apache Spark 3.3 Release

Xiao Li Mon, 14 Mar 2022 21:54:52 -0700

To make our release time more predictable, shall we collect the PRs and wait
three more days before the branch cut?


Please list all the actively developed feature work we plan to release with
Spark 3.3. We should avoid merging any new feature work that is not being
discussed in this email thread. Below is my list:

   - #35789 [SPARK-32268][SQL] Row-level Runtime Filtering
   <https://github.com/apache/spark/pull/35789>
   - #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized
   reader <https://github.com/apache/spark/pull/34659>
   - #35848 [SPARK-38548][SQL] New SQL function: try_sum
   <https://github.com/apache/spark/pull/35848>




On Mon, Mar 14, 2022 at 9:17 PM Chao Sun <[email protected]> wrote:

> I mainly mean:
>
>   - [SPARK-35801] Row-level operations in Data Source V2
>   - [SPARK-37166] Storage Partitioned Join
>
> The corresponding PRs are actively being reviewed:
>
> - https://github.com/apache/spark/pull/35395
> - https://github.com/apache/spark/pull/35657
>
> It seems there are ongoing PRs for other SPIPs as well, but I'm not
> involved in those, so I'm not quite sure whether they are intended for
> the 3.3 release.
>
> Chao
>
> On Mon, Mar 14, 2022 at 8:53 PM Xiao Li <[email protected]> wrote:
> >
> > Could you please list which features we want to finish before the branch cut? How long will they take?
> >
> > Xiao
> >
> > On Mon, Mar 14, 2022 at 1:30 PM Chao Sun <[email protected]> wrote:
> >>
> >> Hi Max,
> >>
> >> As there is still some ongoing work for the SPIPs listed above, can we still merge them after the branch cut?
> >>
> >> Thanks,
> >> Chao
> >>
> >> On Mon, Mar 14, 2022 at 6:12 AM Maxim Gekk <[email protected]> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> Since there are no actual blockers for Spark 3.3.0 and no significant objections, I am going to cut branch-3.3 after 15th March at 00:00 PST. Please let us know if you have any concerns about that.
> >>>
> >>> Best regards,
> >>> Max Gekk
> >>>
> >>>
> >>> On Thu, Mar 3, 2022 at 9:44 PM Maxim Gekk <[email protected]> wrote:
> >>>>
> >>>> Hello All,
> >>>>
> >>>> I would like to bring up the topic of the new Spark 3.3 release. According to the public schedule at https://spark.apache.org/versioning-policy.html, we planned to start the code freeze and release branch cut on March 15th, 2022. Since this date is coming soon, I would like to draw your attention to the topic and gather any objections that you might have.
> >>>>
> >>>> Below is the list of ongoing and active SPIPs:
> >>>>
> >>>> Spark SQL:
> >>>> - [SPARK-31357] DataSourceV2: Catalog API for view metadata
> >>>> - [SPARK-35801] Row-level operations in Data Source V2
> >>>> - [SPARK-37166] Storage Partitioned Join
> >>>>
> >>>> Spark Core:
> >>>> - [SPARK-20624] Add better handling for node shutdown
> >>>> - [SPARK-25299] Use remote storage for persisting shuffle data
> >>>>
> >>>> PySpark:
> >>>> - [SPARK-26413] RDD Arrow Support in Spark Core and PySpark
> >>>>
> >>>> Kubernetes:
> >>>> - [SPARK-36057] Support Customized Kubernetes Schedulers
> >>>>
> >>>> We should probably finish any remaining work for Spark 3.3, then switch to QA mode, cut a branch, and keep everything on track. I would like to volunteer to help drive this process.
> >>>>
> >>>> Best regards,
> >>>> Max Gekk
>
