Hi Yuanjian,

This is a correctness issue that we should probably fix in 3.5:
https://issues.apache.org/jira/browse/SPARK-44871
https://github.com/apache/spark/pull/42559

Cheers,
Peter

yangjie01 <yangji...@baidu.com.invalid> wrote (on Sat, Aug 12, 2023, 15:38):

> Hi, Yuanjian,
>
> Maybe there is another issue that needs to be fixed:
>
> - [SPARK-44784] <https://issues.apache.org/jira/browse/SPARK-44784>
>   Failure in testing `SparkSessionE2ESuite` using Maven
>
> Maven daily tests are still failing:
> https://github.com/apache/spark/actions/runs/5832898984/job/15819181762
>
> I think we should address this issue before the release of Apache Spark
> 3.5.0.
>
> Jie Yang
>
> *From:* Yuanjian Li <xyliyuanj...@gmail.com>
> *Date:* Saturday, August 12, 2023, 15:20
> *To:* Yuming Wang <yumw...@apache.org>
> *Cc:* yangjie01 <yangji...@baidu.com.invalid>, Sean Owen <sro...@gmail.com>,
>   Spark dev list <dev@spark.apache.org>
> *Subject:* Re: [VOTE] Release Apache Spark 3.5.0 (RC1)
>
> Thanks for all updates!
>
> The vote has failed. Here is the status of known blockers:
>
> - [SPARK-44719] NoClassDefFoundError when using Hive UDF - *Resolved*
> - [SPARK-44653] non-trivial DataFrame unions should not break caching - *Resolved*
> - [SPARK-43646] Test failure of Connect: from_protobuf_messageClassName - *WIP*
>
> I'll cut RC2 once all blockers are resolved.
>
> Yuming Wang <yumw...@apache.org> wrote on Tue, Aug 8, 2023 at 05:29:
>
> -1. I found a NoClassDefFoundError bug:
> https://issues.apache.org/jira/browse/SPARK-44719
>
> On Mon, Aug 7, 2023 at 11:24 AM yangjie01 <yangji...@baidu.com.invalid> wrote:
>
> I submitted a PR last week to try and solve this issue:
> https://github.com/apache/spark/pull/42236
>
> *From:* Sean Owen <sro...@gmail.com>
> *Date:* Monday, August 7, 2023, 11:05
> *To:* Yuanjian Li <xyliyuanj...@gmail.com>
> *Cc:* Spark dev list <dev@spark.apache.org>
> *Subject:* Re: [VOTE] Release Apache Spark 3.5.0 (RC1)
>
> Let's keep testing 3.5.0 of course while that change is going in. (See
> https://github.com/apache/spark/pull/42364#issuecomment-1666878287)
>
> Otherwise testing is pretty much as usual, except I get this test failure
> in Connect, which is new. Anyone else? This is Java 8, Scala 2.13, Debian 12.
>
> - from_protobuf_messageClassName_options *** FAILED ***
>   org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS]
>   Could not load Protobuf class with name
>   org.apache.spark.connect.proto.StorageLevel.
>   org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf
>   Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar
>   with Protobuf classes needs to be shaded (com.google.protobuf.* -->
>   org.sparkproject.spark_protobuf.protobuf.*).
>   at org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3554)
>   at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:198)
>   at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:156)
>   at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
>   at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
>   at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
>   at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
>   at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
>   at org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:73)
>   at scala.collection.immutable.List.map(List.scala:246)
>
> On Sat, Aug 5, 2023 at 5:42 PM Sean Owen <sro...@gmail.com> wrote:
>
> I'm still testing other combinations, but it looks like tests fail on Java
> 17 after building with Java 8, which should be a normal supported
> configuration.
>
> This is described at https://github.com/apache/spark/pull/41943
> and looks like it is resolved by moving back to Scala 2.13.8 for now.
>
> Unless I'm missing something we need to fix this for 3.5 or it's not clear
> the build will run on Java 17.
>
> On Fri, Aug 4, 2023 at 5:45 PM Yuanjian Li <xyliyuanj...@gmail.com> wrote:
>
> Please vote on releasing the following candidate (RC1) as Apache Spark
> version 3.5.0.
>
> The vote is open until 11:59pm Pacific time *Aug 9th* and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.5.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.5.0-rc1 (commit
> 7e862c01fc9a1d3b47764df8b6a4b5c4cafb0807):
> https://github.com/apache/spark/tree/v3.5.0-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1444
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc1-docs/
>
> The list of bug fixes going into 3.5.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12352848
>
> This release is using the release script of the tag v3.5.0-rc1.
>
> FAQ
>
> =========================
> How can I help test this release?
> =========================
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running it on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===========================================
> What should happen to JIRA tickets still targeting 3.5.0?
> ===========================================
> The current list of open tickets targeted at 3.5.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK
> and search for "Target Version/s" = 3.5.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==================
> But my bug isn't fixed?
> ==================
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted, please ping me or a committer to
> help target the issue.
>
> Thanks,
> Yuanjian Li