Hi All,

Results of voting for Spark 3.3.0 RC5 are:
+1: Sean Owen (*), Dongjoon Hyun (*), Yuming Wang, Yikun Jiang, Martin Grigorov, Thomas Graves (*), Gengliang Wang, L. C. Hsieh (*), Cheng Su, Chris Nauroth, Cheng Pan
0: Hyukjin Kwon (*)
-1: Jungtaek Lim, Jerry Peng, Prashant Singh, Huaxin Gao

I consider the voting as *failed*. I will prepare RC6 as soon as the issues mentioned in the thread are solved.

Maxim Gekk
Software Engineer
Databricks, Inc.

On Wed, Jun 8, 2022 at 9:18 PM huaxin gao <huaxin.ga...@gmail.com> wrote:

> I agree with Prashant, -1 from me too because this may break Iceberg usage.
>
> Thanks,
> Huaxin
>
> On Wed, Jun 8, 2022 at 10:07 AM Prashant Singh <prashant010...@gmail.com> wrote:
>
>> -1 from my side as well; found this today.
>>
>> While testing Apache Iceberg with 3.3, we found a bug where partition discovery on a table with null partition values throws an NPE; earlier we used to get `DEFAULT_PARTITION_NAME`.
>>
>> Please look into https://issues.apache.org/jira/browse/SPARK-39417 for more details.
>>
>> Regards,
>> Prashant Singh
>>
>> On Wed, Jun 8, 2022 at 10:27 PM Jerry Peng <jerry.boyang.p...@gmail.com> wrote:
>>
>>> I agree with Jungtaek, -1 from me because of the recently introduced issue of the Kafka source throwing an error with an incorrect error message. This may mislead users and cause unnecessary confusion.
>>>
>>> On Wed, Jun 8, 2022 at 12:04 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> Apologies for the late participation.
>>>>
>>>> I'm sorry, but -1 (non-binding) from me.
>>>>
>>>> Unfortunately I found a major user-facing issue which seriously hurts UX of the Kafka data source.
>>>>
>>>> In some cases, the Kafka data source can throw IllegalStateException when failOnDataLoss=true; that condition is bound to the state of the Kafka topic (not a Spark issue).
>>>>
>>>> With the recent change in Spark, IllegalStateException is now treated as an "internal error", and Spark gives incorrect guidance to end users, telling them that Spark has a bug and encouraging them to file a JIRA ticket, which is simply wrong.
>>>>
>>>> Previously, the Kafka data source provided an error message with the context of why it failed and how to work around it. I feel this is a serious regression in UX.
>>>>
>>>> Please look into https://issues.apache.org/jira/browse/SPARK-39412 for more details.
>>>>
>>>> On Wed, Jun 8, 2022 at 3:40 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>>
>>>>> Okay. Thankfully the binary release is fine per https://github.com/apache/spark/blob/v3.3.0-rc5/dev/create-release/release-build.sh#L268
>>>>> The source package (and GitHub tag) has 3.3.0.dev0, and the binary package has 3.3.0. Technically this is no longer a blocker because the PyPI upload can still be made correctly.
>>>>> I lowered the priority to critical. I switch my -1 to 0.
>>>>>
>>>>> On Wed, 8 Jun 2022 at 15:17, Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>>>
>>>>>> Arrrgh .. I am very sorry that I found this problem late.
>>>>>> RC 5 does not have the correct version of PySpark; see
>>>>>> https://github.com/apache/spark/blob/v3.3.0-rc5/python/pyspark/version.py#L19
>>>>>> I think the release script was broken because the version now has 'str' type; see
>>>>>> https://github.com/apache/spark/blob/v3.3.0-rc5/dev/create-release/release-tag.sh#L88
>>>>>> I filed a JIRA at https://issues.apache.org/jira/browse/SPARK-39411
>>>>>>
>>>>>> -1 from me
>>>>>>
>>>>>> On Wed, 8 Jun 2022 at 13:16, Cheng Pan <pan3...@gmail.com> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> * Verified SPARK-39313 has been addressed [1]
>>>>>>> * Passed integration test w/ Apache Kyuubi (Incubating) [2]
>>>>>>>
>>>>>>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/123
>>>>>>> [2] https://github.com/apache/incubator-kyuubi/pull/2817
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cheng Pan
>>>>>>>
>>>>>>> On Wed, Jun 8, 2022 at 7:04 AM Chris Nauroth <cnaur...@apache.org> wrote:
>>>>>>> >
>>>>>>> > +1 (non-binding)
>>>>>>> >
>>>>>>> > * Verified all checksums.
>>>>>>> > * Verified all signatures.
>>>>>>> > * Built from source, with multiple profiles, to full success, for Java 11 and Scala 2.13:
>>>>>>> >   * build/mvn -Phadoop-3 -Phadoop-cloud -Phive-thriftserver -Pkubernetes -Pscala-2.13 -Psparkr -Pyarn -DskipTests clean package
>>>>>>> > * Tests passed.
>>>>>>> > * Ran several examples successfully:
>>>>>>> >   * bin/spark-submit --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.12-3.3.0.jar
>>>>>>> >   * bin/spark-submit --class org.apache.spark.examples.sql.hive.SparkHiveExample examples/jars/spark-examples_2.12-3.3.0.jar
>>>>>>> >   * bin/spark-submit examples/src/main/python/streaming/network_wordcount.py localhost 9999
>>>>>>> > * Tested some of the issues that blocked prior release candidates:
>>>>>>> >   * bin/spark-sql -e 'SELECT (SELECT IF(x, 1, 0)) AS a FROM (SELECT true) t(x) UNION SELECT 1 AS a;'
>>>>>>> >   * bin/spark-sql -e "select date '2018-11-17' > 1"
>>>>>>> >   * SPARK-39293 ArrayAggregate fix
>>>>>>> >
>>>>>>> > Chris Nauroth
>>>>>>> >
>>>>>>> > On Tue, Jun 7, 2022 at 1:30 PM Cheng Su <chen...@fb.com.invalid> wrote:
>>>>>>> >>
>>>>>>> >> +1 (non-binding). Built and ran some internal tests for Spark SQL.
>>>>>>> >>
>>>>>>> >> Thanks,
>>>>>>> >> Cheng Su
>>>>>>> >>
>>>>>>> >> From: L. C. Hsieh <vii...@gmail.com>
>>>>>>> >> Date: Tuesday, June 7, 2022 at 1:23 PM
>>>>>>> >> To: dev <dev@spark.apache.org>
>>>>>>> >> Subject: Re: [VOTE] Release Spark 3.3.0 (RC5)
>>>>>>> >>
>>>>>>> >> +1
>>>>>>> >>
>>>>>>> >> Liang-Chi
>>>>>>> >>
>>>>>>> >> On Tue, Jun 7, 2022 at 1:03 PM Gengliang Wang <ltn...@gmail.com> wrote:
>>>>>>> >> >
>>>>>>> >> > +1 (non-binding)
>>>>>>> >> >
>>>>>>> >> > Gengliang
>>>>>>> >> >
>>>>>>> >> > On Tue, Jun 7, 2022 at 12:24 PM Thomas Graves <tgraves...@gmail.com> wrote:
>>>>>>> >> >>
>>>>>>> >> >> +1
>>>>>>> >> >>
>>>>>>> >> >> Tom Graves
>>>>>>> >> >>
>>>>>>> >> >> On Sat, Jun 4, 2022 at 9:50 AM Maxim Gekk <maxim.g...@databricks.com.invalid> wrote:
>>>>>>> >> >> >
>>>>>>> >> >> > Please vote on releasing the following candidate as Apache Spark version 3.3.0.
>>>>>>> >> >> >
>>>>>>> >> >> > The vote is open until 11:59pm Pacific time June 8th and passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>> >> >> >
>>>>>>> >> >> > [ ] +1 Release this package as Apache Spark 3.3.0
>>>>>>> >> >> > [ ] -1 Do not release this package because ...
>>>>>>> >> >> >
>>>>>>> >> >> > To learn more about Apache Spark, please see http://spark.apache.org/
>>>>>>> >> >> >
>>>>>>> >> >> > The tag to be voted on is v3.3.0-rc5 (commit 7cf29705272ab8e8c70e8885a3664ad8ae3cd5e9):
>>>>>>> >> >> > https://github.com/apache/spark/tree/v3.3.0-rc5
>>>>>>> >> >> >
>>>>>>> >> >> > The release files, including signatures, digests, etc. can be found at:
>>>>>>> >> >> > https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-bin/
>>>>>>> >> >> >
>>>>>>> >> >> > Signatures used for Spark RCs can be found in this file:
>>>>>>> >> >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>> >> >> >
>>>>>>> >> >> > The staging repository for this release can be found at:
>>>>>>> >> >> > https://repository.apache.org/content/repositories/orgapachespark-1406
>>>>>>> >> >> >
>>>>>>> >> >> > The documentation corresponding to this release can be found at:
>>>>>>> >> >> > https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-docs/
>>>>>>> >> >> >
>>>>>>> >> >> > The list of bug fixes going into 3.3.0 can be found at the following URL:
>>>>>>> >> >> > https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>>>>>> >> >> >
>>>>>>> >> >> > This release is using the release script of the tag v3.3.0-rc5.
>>>>>>> >> >> >
>>>>>>> >> >> > FAQ
>>>>>>> >> >> >
>>>>>>> >> >> > =========================
>>>>>>> >> >> > How can I help test this release?
>>>>>>> >> >> > =========================
>>>>>>> >> >> > If you are a Spark user, you can help us test this release by taking an existing Spark workload, running it on this release candidate, and reporting any regressions.
>>>>>>> >> >> >
>>>>>>> >> >> > If you're working in PySpark, you can set up a virtual env, install the current RC, and see if anything important breaks. In the Java/Scala world, you can add the staging repository to your project's resolvers and test with the RC (make sure to clean up the artifact cache before/after so you don't end up building with an out-of-date RC going forward).
>>>>>>> >> >> >
>>>>>>> >> >> > ===========================================
>>>>>>> >> >> > What should happen to JIRA tickets still targeting 3.3.0?
>>>>>>> >> >> > ===========================================
>>>>>>> >> >> > The current list of open tickets targeted at 3.3.0 can be found at:
>>>>>>> >> >> > https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" = 3.3.0
>>>>>>> >> >> >
>>>>>>> >> >> > Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else, please retarget to an appropriate release.
>>>>>>> >> >> >
>>>>>>> >> >> > ==================
>>>>>>> >> >> > But my bug isn't fixed?
>>>>>>> >> >> > ==================
>>>>>>> >> >> > In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from the previous release.
>>>>>>> >> >> > That being said, if there is something which is a regression that has not been correctly targeted, please ping me or a committer to help target the issue.
>>>>>>> >> >> >
>>>>>>> >> >> > Maxim Gekk
>>>>>>> >> >> > Software Engineer
>>>>>>> >> >> > Databricks, Inc.
>>>>>>> >> >>
>>>>>>> >> >> ---------------------------------------------------------------------
>>>>>>> >> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
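P.S. For anyone following the verification steps mentioned in this thread (checking the checksums of the artifacts under the dist URL), here is a minimal sketch of the digest comparison in Python. It assumes a shasum-style `.sha512` file whose first whitespace-separated token is the hex digest; files produced in other formats (e.g. `gpg --print-md` output) would need extra normalization first.

```python
import hashlib


def sha512_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a local artifact through SHA-512 without loading it into memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(artifact: str, sha512_file: str) -> bool:
    """Compare a computed digest against the published one.

    Assumes the hex digest is the first whitespace-separated token in the
    .sha512 file (shasum-style output); normalizes case before comparing.
    """
    with open(sha512_file) as f:
        expected = f.read().split()[0].lower()
    return sha512_hex(artifact) == expected
```

Note that this only checks integrity; the GPG signatures against the KEYS file still need to be verified separately.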