Hi, All.
Since the Apache Spark 3.1.1 tag was created (Feb 21),
172 new patches, including 9 correctness patches and 4 K8s patches,
have arrived at branch-3.1.
Shall we make a new release, Apache Spark 3.1.2, as the second release of
the 3.1 line?
I'd like to volunteer as the release manager for Apache Spark 3.1.2.
I'm thinking about starting the first RC next week.
$ git log --oneline v3.1.1..HEAD | wc -l
172
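For anyone who wants to double-check that a particular fix below is
already on branch-3.1, grepping the same log for its JIRA id should
work (assuming a local checkout of branch-3.1 with the v3.1.1 tag),
e.g. for SPARK-34534:

$ git log --oneline v3.1.1..HEAD | grep SPARK-34534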
# Known correctness issues
SPARK-34534 New protocol FetchShuffleBlocks in OneForOneBlockFetcher
leads to data loss or correctness issues
SPARK-34545 PySpark Python UDF returns inconsistent results when
applying 2 UDFs with different return types to 2 columns together
SPARK-34681 Full outer shuffled hash join when building left side
produces wrong result
SPARK-34719 Fail if the view query has duplicated column names
SPARK-34794 Nested higher-order functions broken in DSL
SPARK-34829 transform_values returns identical values when used
with a UDF that returns a reference type
SPARK-34833 Apply right-padding correctly for correlated subqueries
SPARK-35381 Fix lambda variable name issues in nested DataFrame
functions in R APIs
SPARK-35382 Fix lambda variable name issues in nested DataFrame
functions in Python APIs
# Notable K8s patches since K8s GA
SPARK-34674 Close SparkContext after the Main method has finished
SPARK-34948 Add ownerReference to executor configmap to fix leakages
SPARK-34820 Add apt-get update before gnupg install
SPARK-34361 In case of downscaling, avoid killing executors already
known by the scheduler backend in the pod allocator
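As a rough cross-check, the K8s-side commits can also be listed by
filtering the same log on the kubernetes resource-manager path
(assuming the relevant changes touch that directory):

$ git log --oneline v3.1.1..HEAD -- resource-managers/kubernetes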
Bests,
Dongjoon.