Thanks, Xiao. I will close this vote within a couple of hours.

On Fri, Feb 26, 2021 at 4:30 PM, Xiao Li <gatorsm...@gmail.com> wrote:

I confirmed that Q17 and Q39a/b have matching results between Spark 3.0 and 3.1 after enabling spark.sql.legacy.statisticalAggregate. The result changes are expected. For more details, you can read the PR: https://github.com/apache/spark/pull/29983/ Also, the result of Q18 is affected by the overflow checking in Spark. These issues exist in all the releases. We will continue to improve our ANSI mode and fix them in the upcoming releases.

Thus, I change my vote from -1 to +1.

As Ismaël suggested, we can add some GitHub Actions to validate the TPC-DS and TPC-H results for small-scale datasets.

Cheers,

Xiao
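For reference, the behavior difference behind this flag can be reproduced in a spark-shell with a minimal, self-contained sketch; the single-value input below is illustrative and is not taken from the TPC-DS data:

    // Assumes `spark` is the SparkSession provided by spark-shell.
    // Statistical aggregates over a single value divide by zero; under the
    // legacy flag they should return NaN (the Spark 3.0 behavior), while
    // with the flag off (the 3.1 default) they should return NULL. Q17 and
    // Q39a/b aggregate some single-row groups, which is where the two
    // releases diverged.
    spark.conf.set("spark.sql.legacy.statisticalAggregate", "true")
    spark.sql("SELECT stddev_samp(col) FROM VALUES (1.0) AS t(col)").show()

    spark.conf.set("spark.sql.legacy.statisticalAggregate", "false")
    spark.sql("SELECT stddev_samp(col) FROM VALUES (1.0) AS t(col)").show()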
On Thu, Feb 25, 2021 at 12:16 PM, Ismaël Mejía <ieme...@gmail.com> wrote:

Since the TPC-DS performance tests are one of the main validation sources for regressions on Spark releases, maybe it is time to automate the query output validation to find correctness issues eagerly (it would also be nice to validate the performance regressions, but correctness >>> performance).

This has been a long-standing open issue [1] that is probably worth addressing, and it seems that automating this via GitHub Actions could be relatively straightforward.

[1] https://github.com/databricks/spark-sql-perf/issues/184
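A very rough sketch of what one such automated check could look like follows; the file layout, query id, and golden files are hypothetical, and a real setup would more likely build on the spark-sql-perf harness than on hand-rolled I/O:

    import java.nio.charset.StandardCharsets
    import java.nio.file.{Files, Paths}
    import org.apache.spark.sql.SparkSession

    // Sketch: run one TPC-DS query against a small pre-generated dataset
    // and compare its output to a checked-in golden file. Assumes the
    // TPC-DS tables are already registered (e.g. as local Parquet tables).
    object TpcdsGoldenCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("tpcds-golden-check")
          .master("local[*]")
          .getOrCreate()

        val queryId = "q17" // hypothetical per-query .sql/.out layout
        val sql = new String(
          Files.readAllBytes(Paths.get(s"queries/$queryId.sql")), StandardCharsets.UTF_8)
        val expected = new String(
          Files.readAllBytes(Paths.get(s"golden/$queryId.out")), StandardCharsets.UTF_8).trim

        // Sort rows so the comparison is insensitive to output ordering.
        val actual = spark.sql(sql).collect().map(_.mkString("|")).sorted.mkString("\n")

        assert(actual == expected, s"$queryId output differs from the golden file")
        spark.stop()
      }
    }

Running something like this per query in a GitHub Actions job over a small-scale dataset could catch result changes like the ones above before an RC is cut.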
On Wed, Feb 24, 2021 at 8:15 PM, Reynold Xin <r...@databricks.com> wrote:

+1 Correctness issues are serious!

On Wed, Feb 24, 2021 at 11:08 AM, Mridul Muralidharan <mri...@gmail.com> wrote:

That is indeed cause for concern.
+1 on extending the voting deadline until we finish investigation of this.

Regards,
Mridul

On Wed, Feb 24, 2021 at 12:55 PM, Xiao Li <gatorsm...@gmail.com> wrote:

-1 Could we extend the voting deadline?

A few TPC-DS queries (q17, q18, q39a, q39b) are returning different results between Spark 3.0 and Spark 3.1. We need a few more days to understand whether these changes are expected.

Xiao

On Wed, Feb 24, 2021 at 10:41 AM, Mridul Muralidharan <mri...@gmail.com> wrote:

Sounds good, thanks for clarifying, Hyukjin!
+1 on release.

Regards,
Mridul

On Wed, Feb 24, 2021 at 2:46 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

I remember HiveExternalCatalogVersionsSuite was flaky for a while, which was fixed in https://github.com/apache/spark/commit/0d5d248bdc4cdc71627162a3d20c42ad19f24ef4, and KafkaDelegationTokenSuite is flaky (https://issues.apache.org/jira/browse/SPARK-31250).

On Wed, Feb 24, 2021 at 5:19 PM, Mridul Muralidharan <mri...@gmail.com> wrote:

Signatures, digests, etc. check out fine.
Checked out the tag and built/tested with -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes.

I keep getting test failures with:
* org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
* org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite
(Note: I remove the $HOME/.m2 and $HOME/.ivy2 paths before building.)

Removing these suites gets the build through, though. Does anyone have suggestions on how to fix it? I did not face this with RC1.

Regards,
Mridul

On Mon, Feb 22, 2021 at 12:57 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

Please vote on releasing the following candidate as Apache Spark version 3.1.1.

The vote is open until February 24th 11 PM PST and passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.1.1
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v3.1.1-rc3 (commit 1d550c4e90275ab418b9161925049239227f3dc9):
https://github.com/apache/spark/tree/v3.1.1-rc3

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1367

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-docs/

The list of bug fixes going into 3.1.1 can be found at the following URL:
https://s.apache.org/41kf2

This release is using the release script of the tag v3.1.1-rc3.

FAQ

===================
What happened to 3.1.0?
===================

There was a technical issue during Apache Spark 3.1.0 preparation, and it was discussed and decided to skip 3.1.0.
Please see https://spark.apache.org/news/next-official-release-spark-3.1.1.html for more details.

=========================
How can I help test this release?
=========================

If you are a Spark user, you can help us test this release by taking an existing Spark workload and running it on this release candidate, then reporting any regressions.

If you're working in PySpark, you can set up a virtual env and install the current RC via "pip install https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/pyspark-3.1.1.tar.gz" and see if anything important breaks.
In Java/Scala, you can add the staging repository to your project's resolvers and test with the RC (make sure to clean up the artifact cache before/after so you don't end up building with an out-of-date RC going forward).
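For the Java/Scala route, a minimal build.sbt sketch follows; the resolver URL is the staging repository from this email, while the dependency list is just an example:

    // build.sbt: resolve the 3.1.1 RC3 artifacts from the staging repository.
    // Clearing the local cache (e.g. ~/.ivy2/cache) before and after testing
    // avoids accidentally building against a stale RC later.
    resolvers += "Apache Spark 3.1.1 RC3 staging" at
      "https://repository.apache.org/content/repositories/orgapachespark-1367"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "3.1.1",
      "org.apache.spark" %% "spark-sql"  % "3.1.1"
    )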
===========================================
What should happen to JIRA tickets still targeting 3.1.1?
===========================================

The current list of open tickets targeted at 3.1.1 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" = 3.1.1

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else, please retarget to an appropriate release.

==================
But my bug isn't fixed?
==================

In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from the previous release. That being said, if there is something which is a regression that has not been correctly targeted, please ping me or a committer to help target the issue.