Very sorry. When I was fixing `SPARK-45242` 
(https://github.com/apache/spark/pull/43594), I noticed that its `Affects 
Version` and `Fix Version` were both 4.0, and I didn't realize 
that it had also been merged into branch-3.5, so I didn't advocate for 
SPARK-45357 to be backported to branch-3.5.

As far as I know, the condition to trigger this test failure is: when using 
Maven to test the `connect` module, if `sparkTestRelation` in 
`SparkConnectProtoSuite` is not the first `DataFrame` to be initialized, then 
the `id` of `sparkTestRelation` will no longer be 0. So, I think this is indeed 
related to the order in which Maven executes the test cases in the `connect` 
module.
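
To make the condition above concrete, here is a minimal toy sketch in Python. It is not Spark code: `ToyDataFrame`, `collect_metrics_plan`, `normalized`, and `run_suite` are hypothetical names, and the per-process counter is only an assumption about how ids end up depending on creation order. It shows why a comparison against a hard-coded id of 0 passes only when the test relation is the first object created, and that normalizing the id before comparing is one possible way to remove that dependency.

```python
import itertools


class ToyDataFrame:
    """Toy stand-in for a DataFrame whose id comes from a shared counter."""

    def __init__(self, name, id_counter):
        self.name = name
        self.id = next(id_counter)  # id depends on how many were created before


def collect_metrics_plan(df, metric_name):
    # Stand-in for a plan node that embeds the id of the DataFrame it observes.
    return ("CollectMetrics", metric_name, df.id)


def normalized(plan):
    # Replace the embedded id with a fixed value before comparing plans.
    node, metric_name, _ = plan
    return (node, metric_name, 0)


def run_suite(frames_created_first):
    # A fresh counter simulates a fresh process (assumption of this sketch).
    id_counter = itertools.count()
    # Simulate other suites creating DataFrames earlier in the same process.
    for i in range(frames_created_first):
        ToyDataFrame(f"other_{i}", id_counter)
    test_relation = ToyDataFrame("sparkTestRelation", id_counter)
    actual = collect_metrics_plan(test_relation, "my_metric")
    expected = ("CollectMetrics", "my_metric", 0)  # hard-coded id 0
    strict_match = actual == expected
    normalized_match = normalized(actual) == normalized(expected)
    return strict_match, normalized_match
```

With `run_suite(0)` the strict comparison matches; with `run_suite(2)` it does not, while the normalized comparison matches in both cases.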

I have submitted a backport PR (https://github.com/apache/spark/pull/45141) to 
branch-3.5, and if necessary, we can merge it to fix this test issue.

Jie Yang

From: Jungtaek Lim <kabhwan.opensou...@gmail.com>
Date: Friday, February 16, 2024, 22:15
To: Sean Owen <sro...@gmail.com>, Rui Wang <amaliu...@apache.org>
Cc: dev <dev@spark.apache.org>
Subject: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

I traced back relevant changes and got a sense of what happened.

Yangjie figured out the issue via 
link<https://mailshield.baidu.com/check?q=8dOSfwXDFpe5HSp%2b%2bgCPsNQ52B7S7TAFG56Vj3tiFgMkCyOrQEGbg03AVWDX5bwwyIW7sZx3JZox3w8Jz1iw%2bPjaOZYmLWn2>.
 It's a tricky issue according to the comments from Yangjie - the test 
depends on the order in which test suites are executed. He said it does not fail in 
sbt, hence the CI build couldn't catch it.
He fixed it via 
link<https://mailshield.baidu.com/check?q=ojK3dg%2fDFf3xmQ8SPzsIou3EKaE1ZePctdB%2fUzhWmewnZb5chnQM1%2f8D1JDJnkxF>,
 but we missed that the offending commit was also ported back to 3.5, 
hence the fix wasn't ported back to 3.5.

Surprisingly, I can't reproduce it locally even with Maven. In my attempt to 
reproduce, SparkConnectProtoSuite was executed third: 
SparkConnectStreamingQueryCacheSuite, then ExecuteEventsManagerSuite, and then 
SparkConnectProtoSuite. Maybe it is very specific to the environment, not just Maven? 
My env: MBP with M1 Pro chip, macOS 14.3.1, OpenJDK 17.0.9. I used build/mvn (Maven 
3.8.8).

I'm not 100% sure this is something we should fail the release for, as it's a 
test-only issue and sounds very environment-dependent, but I'll respect your call on the vote.

Btw, looks like Rui also made a relevant fix via 
link<https://mailshield.baidu.com/check?q=TUbVzroxG%2fbi2P4qN0kbggzXuPzSN%2bKDoUFGhS9xMet8aXVw6EH0rMr1MKJqp2E2>
 (not to fix the failing test but to fix other issues), but this also wasn't 
ported back to 3.5. @Rui Wang <amaliu...@apache.org> Do you think this is 
a regression issue and warrants a new RC?


On Fri, Feb 16, 2024 at 11:38 AM Sean Owen 
<sro...@gmail.com> wrote:
Is anyone seeing this Spark Connect test failure? Then again, I have some weird 
issue with this env that always fails 1 or 2 tests that nobody else can 
replicate.

- Test observe *** FAILED ***
  == FAIL: Plans do not match ===
  !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS max_val#0, 
sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric, [min(id#0) AS 
min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 44
   +- LocalRelation <empty>, [id#0, name#0]                                     
                            +- LocalRelation <empty>, [id#0, name#0] 
(PlanTest.scala:179)

On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim 
<kabhwan.opensou...@gmail.com> wrote:
DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2, as I belatedly found a 
doc generation issue after tagging RC1.

Please vote on releasing the following candidate as Apache Spark version 3.5.1.

The vote is open until February 18th 9AM (PST) and passes if a majority +1 PMC 
votes are cast, with
a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.5.1
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see 
https://spark.apache.org/

The tag to be voted on is v3.5.1-rc2 (commit 
fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
https://github.com/apache/spark/tree/v3.5.1-rc2

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1452/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-docs/

The list of bug fixes going into 3.5.1 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12353495

FAQ

=========================
How can I help test this release?
=========================

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark you can set up a virtual env and install
the current RC via "pip install 
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/pyspark-3.5.1.tar.gz"
and see if anything important breaks.
In Java/Scala, you can add the staging repository to your project's 
resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).

===========================================
What should happen to JIRA tickets still targeting 3.5.1?
===========================================

The current list of open tickets targeted at 3.5.1 can be found at:
https://issues.apache.org/jira/projects/SPARK
 and search for "Target Version/s" = 3.5.1

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==================
But my bug isn't fixed?
==================

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.
