Hi Yuanjian,

This is a correctness issue that we should probably fix in 3.5:
https://issues.apache.org/jira/browse/SPARK-44871 /
https://github.com/apache/spark/pull/42559

Cheers,
Peter

yangjie01 <yangji...@baidu.com.invalid> wrote (on Sat, Aug 12, 2023, at 15:38):

> Hi, Yuanjian,
>
>
>
> There may be another issue that needs to be fixed:
>
>
>
> -    [SPARK-44784] <https://issues.apache.org/jira/browse/SPARK-44784>
> Failure in testing `SparkSessionE2ESuite` using Maven
>
>
>
> Maven daily tests are still failing:
> https://github.com/apache/spark/actions/runs/5832898984/job/15819181762
>
>
>
> I think we should address this issue before the release of Apache Spark
> 3.5.0.
>
>
>
> Jie Yang
>
>
>
> *From:* Yuanjian Li <xyliyuanj...@gmail.com>
> *Date:* Saturday, August 12, 2023, 15:20
> *To:* Yuming Wang <yumw...@apache.org>
> *Cc:* yangjie01 <yangji...@baidu.com.invalid>, Sean Owen <sro...@gmail.com>,
> Spark dev list <dev@spark.apache.org>
> *Subject:* Re: [VOTE] Release Apache Spark 3.5.0 (RC1)
>
>
>
> Thanks for all the updates!
>
> The vote has failed. Here is the status of known blockers:
>
>    - [SPARK-44719] <https://issues.apache.org/jira/browse/SPARK-44719>
>      NoClassDefFoundError when using Hive UDF - *Resolved*
>    - [SPARK-44653] <https://issues.apache.org/jira/browse/SPARK-44653>
>      non-trivial DataFrame unions should not break caching - *Resolved*
>    - [SPARK-43646] <https://issues.apache.org/jira/browse/SPARK-43646>
>      Test failure of Connect: from_protobuf_messageClassName - *WIP*
>
> I'll cut RC2 once all blockers are resolved.
>
>
>
>
>
> Yuming Wang <yumw...@apache.org> wrote on Tue, Aug 8, 2023 at 05:29:
>
> -1. I found a NoClassDefFoundError bug:
> https://issues.apache.org/jira/browse/SPARK-44719.
>
>
>
> On Mon, Aug 7, 2023 at 11:24 AM yangjie01 <yangji...@baidu.com.invalid>
> wrote:
>
>
>
> I submitted a PR last week to try and solve this issue:
> https://github.com/apache/spark/pull/42236.
>
>
>
> *From:* Sean Owen <sro...@gmail.com>
> *Date:* Monday, August 7, 2023, 11:05
> *To:* Yuanjian Li <xyliyuanj...@gmail.com>
> *Cc:* Spark dev list <dev@spark.apache.org>
> *Subject:* Re: [VOTE] Release Apache Spark 3.5.0 (RC1)
>
>
>
>
>
> Let's keep testing 3.5.0 of course while that change is going in. (See
> https://github.com/apache/spark/pull/42364#issuecomment-1666878287)
>
>
>
> Otherwise testing is pretty much as usual, except I get this test failure
> in Connect, which is new. Anyone else? This is Java 8, Scala 2.13, Debian
> 12.
>
>
>
> - from_protobuf_messageClassName_options *** FAILED ***
>   org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS]
> Could not load Protobuf class with name
> org.apache.spark.connect.proto.StorageLevel.
> org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf
> Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar
> with Protobuf classes needs to be shaded (com.google.protobuf.* -->
> org.sparkproject.spark_protobuf.protobuf.*).
>   at
> org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3554)
>   at
> org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:198)
>   at
> org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:156)
>   at
> org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
>   at
> org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
>   at
> org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
>   at
> org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
>   at
> org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
>   at
> org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:73)
>   at scala.collection.immutable.List.map(List.scala:246)
>
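> For context, the failing test exercises a call roughly along these lines (a
> sketch only; the local SparkSession, DataFrame, column name, and empty options
> map are illustrative assumptions, not the actual test body):
>
>   import org.apache.spark.sql.SparkSession
>   import org.apache.spark.sql.functions.col
>   import org.apache.spark.sql.protobuf.functions.from_protobuf
>
>   val spark = SparkSession.builder().master("local[*]")
>     .appName("protobuf-check").getOrCreate()
>   val df = spark.range(1).selectExpr("CAST(NULL AS BINARY) AS payload")
>
>   // Naming a message class makes Spark load it and check that it extends the
>   // shaded Message type (org.sparkproject.spark_protobuf.protobuf.Message); a
>   // class compiled against an unshaded or differently shaded protobuf yields
>   // the CANNOT_LOAD_PROTOBUF_CLASS error shown above.
>   val parsed = df.select(
>     from_protobuf(col("payload"),
>       "org.apache.spark.connect.proto.StorageLevel",
>       new java.util.HashMap[String, String]()).as("parsed"))
>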
>
>
> On Sat, Aug 5, 2023 at 5:42 PM Sean Owen <sro...@gmail.com> wrote:
>
> I'm still testing other combinations, but it looks like tests fail on Java
> 17 after building with Java 8, which should be a normal supported
> configuration.
>
> This is described at https://github.com/apache/spark/pull/41943
> and looks like it is resolved by moving back to Scala 2.13.8 for now.
>
> Unless I'm missing something, we need to fix this for 3.5, or it's not clear
> the build will run on Java 17.
>
>
>
> On Fri, Aug 4, 2023 at 5:45 PM Yuanjian Li <xyliyuanj...@gmail.com> wrote:
>
> Please vote on releasing the following candidate (RC1) as Apache Spark
> version 3.5.0.
>
>
>
> The vote is open until 11:59pm Pacific time *Aug 9th* and passes if a
> majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
>
>
> [ ] +1 Release this package as Apache Spark 3.5.0
>
> [ ] -1 Do not release this package because ...
>
>
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
>
>
> The tag to be voted on is v3.5.0-rc1 (commit
> 7e862c01fc9a1d3b47764df8b6a4b5c4cafb0807):
>
> https://github.com/apache/spark/tree/v3.5.0-rc1
>
>
>
> The release files, including signatures, digests, etc. can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc1-bin/
>
>
>
> Signatures used for Spark RCs can be found in this file:
>
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
>
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1444
>
>
>
> The documentation corresponding to this release can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc1-docs/
>
>
>
> The list of bug fixes going into 3.5.0 can be found at the following URL:
>
> https://issues.apache.org/jira/projects/SPARK/versions/12352848
>
>
>
> This release uses the release script from the tag v3.5.0-rc1.
>
>
>
> FAQ
>
>
>
> =========================
>
> How can I help test this release?
>
> =========================
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload, running it on this release candidate, and then
> reporting any regressions.
>
>
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward). A
> sketch of an sbt setup for this is included below.
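>
> For example, a minimal build.sbt sketch for testing against the staging
> repository listed above (a sketch only; the module, Scala version, and cache
> locations are illustrative assumptions, not official instructions):
>
>   // build.sbt -- resolve artifacts from the RC1 staging repository
>   resolvers += "Apache Spark 3.5.0 RC1 staging" at
>     "https://repository.apache.org/content/repositories/orgapachespark-1444/"
>
>   // Spark 3.5 is published for Scala 2.12 and 2.13; pick the one your
>   // project uses
>   scalaVersion := "2.12.18"
>
>   // build and run your existing tests against the release candidate
>   libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.0"
>
> Afterwards, remove the cached 3.5.0 artifacts (e.g. under ~/.ivy2 and the
> Coursier cache) so you don't keep building against a stale RC.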
>
>
>
> ===========================================
>
> What should happen to JIRA tickets still targeting 3.5.0?
>
> ===========================================
>
> The current list of open tickets targeted at 3.5.0 can be found at:
>
> https://issues.apache.org/jira/projects/SPARK
> and search for "Target Version/s" = 3.5.0
>
>
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else, please retarget to an
> appropriate release.
>
>
>
> ==================
>
> But my bug isn't fixed?
>
> ==================
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted, please ping me or a committer to
> help target the issue.
>
>
>
> Thanks,
>
> Yuanjian Li
>
>
>
>
