Yeah, let's get that fix in, but it seems to be a minor, test-only issue, so it
should not block the release.

On Fri, Feb 16, 2024, 9:30 AM yangjie01 <yangji...@baidu.com> wrote:

> Very sorry. When I was fixing SPARK-45242
> (https://github.com/apache/spark/pull/43594), I noticed that the
> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
> didn't realize that it had also been merged into branch-3.5, so I didn't
> advocate for SPARK-45357 to be backported to branch-3.5.
>
>
>
> As far as I know, the condition that triggers this test failure is: when
> using Maven to test the `connect` module, if `sparkTestRelation` in
> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
> then the `id` of `sparkTestRelation` will no longer be 0. So I think this
> is indeed related to the order in which Maven executes the test cases in
> the `connect` module.
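To illustrate the failure mode described above, here is a minimal, hypothetical sketch (plain Java, not Spark's actual internals; all names are my own): a process-wide plan-id counter means "the first relation gets id 0" only holds under one particular suite ordering.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a global plan-id counter; the class and method
// names are illustrative, not Spark's real ones.
public class PlanIdOrderDemo {
    private static final AtomicLong PLAN_ID = new AtomicLong(0);

    static long newRelationId() {
        // Each new relation consumes the next id, process-wide.
        return PLAN_ID.getAndIncrement();
    }

    public static void main(String[] args) {
        // If another suite's relation is created first, it takes id 0...
        long otherSuiteRelation = newRelationId();
        // ...and the test relation no longer gets the id a test expects.
        long testRelation = newRelationId();
        System.out.println(testRelation); // prints 1, not 0
    }
}
```

Any assertion pinned to a specific id value is then at the mercy of which suites ran earlier in the same JVM, which would explain why sbt (different suite ordering/forking) does not reproduce what Maven does.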
>
>
>
> I have submitted a backport PR
> <https://github.com/apache/spark/pull/45141> to branch-3.5, and if
> necessary, we can merge it to fix this test issue.
>
>
>
> Jie Yang
>
>
>
> *From:* Jungtaek Lim <kabhwan.opensou...@gmail.com>
> *Date:* Friday, February 16, 2024, 22:15
> *To:* Sean Owen <sro...@gmail.com>, Rui Wang <amaliu...@apache.org>
> *Cc:* dev <dev@spark.apache.org>
> *Subject:* Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>
>
>
> I traced back relevant changes and got a sense of what happened.
>
>
>
> Yangjie figured out the issue via link
> <https://mailshield.baidu.com/check?q=8dOSfwXDFpe5HSp%2b%2bgCPsNQ52B7S7TAFG56Vj3tiFgMkCyOrQEGbg03AVWDX5bwwyIW7sZx3JZox3w8Jz1iw%2bPjaOZYmLWn2>.
> It's a tricky issue according to Yangjie's comments: the test depends on
> the order in which the test suites execute. He said it does not fail in
> sbt, hence the CI build couldn't catch it.
>
> He fixed it via link
> <https://mailshield.baidu.com/check?q=ojK3dg%2fDFf3xmQ8SPzsIou3EKaE1ZePctdB%2fUzhWmewnZb5chnQM1%2f8D1JDJnkxF>,
> but we missed that the offending commit had also been ported back to 3.5,
> hence the fix wasn't ported back to 3.5.
>
>
>
> Surprisingly, I can't reproduce it locally, even with Maven. In my attempt
> to reproduce, SparkConnectProtoSuite was executed third:
> SparkConnectStreamingQueryCacheSuite, then ExecuteEventsManagerSuite, and
> then SparkConnectProtoSuite. Maybe it's very specific to the environment,
> not just Maven? My env: MBP with an M1 Pro chip, macOS 14.3.1, OpenJDK
> 17.0.9. I used build/mvn (Maven 3.8.8).
>
>
>
> I'm not 100% sure this is something we should fail the release over, as
> it's test-only and sounds very environment-dependent, but I'll respect
> your call on the vote.
>
>
>
> Btw, it looks like Rui also made a relevant fix via link
> <https://mailshield.baidu.com/check?q=TUbVzroxG%2fbi2P4qN0kbggzXuPzSN%2bKDoUFGhS9xMet8aXVw6EH0rMr1MKJqp2E2>
> (not to fix the failing test but to fix other issues), but this also
> wasn't ported back to 3.5. @Rui Wang <amaliu...@apache.org> Do you think
> this is a regression and warrants a new RC?
>
>
>
>
>
> On Fri, Feb 16, 2024 at 11:38 AM Sean Owen <sro...@gmail.com> wrote:
>
> Is anyone seeing this Spark Connect test failure? Then again, I have some
> weird issue with this env that always fails 1 or 2 tests that nobody else
> can replicate.
>
>
>
> - Test observe *** FAILED ***
>   == FAIL: Plans do not match ===
>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 0
>    CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 44
>    +- LocalRelation <empty>, [id#0, name#0]
>    +- LocalRelation <empty>, [id#0, name#0]
>   (PlanTest.scala:179)
>
>
>
> On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
> DISCLAIMER: The RC for Apache Spark 3.5.1 starts with RC2, as I belatedly
> discovered a doc generation issue after tagging RC1.
>
>
>
> Please vote on releasing the following candidate as Apache Spark version
> 3.5.1.
>
> The vote is open until February 18th 9AM (PST) and passes if a majority of
> +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.5.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v3.5.1-rc2 (commit
> fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
> https://github.com/apache/spark/tree/v3.5.1-rc2
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1452/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-docs/
>
> The list of bug fixes going into 3.5.1 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>
> FAQ
>
> =========================
> How can I help test this release?
> =========================
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC via "pip install
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/pyspark-3.5.1.tar.gz
> "
> and see if anything important breaks.
> If you're working in Java/Scala, you can add the staging repository to
> your project's resolvers and test with the RC (make sure to clean up the
> artifact cache before/after so you don't end up building with an
> out-of-date RC going forward).
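For the staging-repository route above, a minimal sketch of the resolver entry (Maven `pom.xml`; the repository `id` is my own arbitrary choice, while the URL is the staging repository listed earlier in this email):

```xml
<!-- Add the 3.5.1 RC2 staging repository so RC artifacts can resolve. -->
<repositories>
  <repository>
    <!-- The id is arbitrary; the url is the staging repo from this thread. -->
    <id>spark-3.5.1-rc2-staging</id>
    <url>https://repository.apache.org/content/repositories/orgapachespark-1452/</url>
  </repository>
</repositories>
```

sbt users can do the equivalent with a `resolvers +=` entry pointing at the same URL.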
>
> ===========================================
> What should happen to JIRA tickets still targeting 3.5.1?
> ===========================================
>
> The current list of open tickets targeted at 3.5.1 can be found at:
> https://issues.apache.org/jira/projects/SPARK
> and search for "Target Version/s" = 3.5.1
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==================
> But my bug isn't fixed?
> ==================
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something that is a regression
> and has not been correctly targeted, please ping me or a committer to
> help target the issue.
>
>
