Since the TPC-DS performance tests are one of the main validation sources
for regressions in Spark releases, maybe it is time to automate validation
of the query outputs so that correctness issues are caught early (it would
also be nice to check for performance regressions, but correctness >>>
performance).

This has been a long-standing open issue [1] that is probably worth
addressing, and automating it via GitHub Actions seems like it could be
relatively straightforward.
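
To make the idea concrete, here is a rough sketch of what the comparison
step might look like. It is only an illustration under my own assumptions,
not something that exists in spark-sql-perf: the function names
(normalize_value, diff_results, validate) are hypothetical, rows are assumed
to be plain tuples (for example collected from each Spark version with
df.collect()), rows are compared as multisets so ordering differences are
ignored, and floats are rounded to a fixed number of digits to absorb tiny
numeric differences.

```python
import math
from collections import Counter


def normalize_value(v, digits):
    """Make a single cell comparable across runs (hypothetical helper)."""
    if isinstance(v, float):
        # NaN never equals itself, so map it to a sentinel string.
        return "NaN" if math.isnan(v) else round(v, digits)
    return v


def diff_results(expected, actual, digits=6):
    """Compare two result sets as multisets of rows, ignoring row order.

    Returns (missing, unexpected); both are empty Counters on a match.
    """
    exp = Counter(tuple(normalize_value(v, digits) for v in row) for row in expected)
    act = Counter(tuple(normalize_value(v, digits) for v in row) for row in actual)
    return exp - act, act - exp


def validate(queries, digits=6):
    """queries maps a query name to (expected_rows, actual_rows).

    Returns a dict holding only the queries whose outputs differ.
    """
    failures = {}
    for name, (expected, actual) in queries.items():
        missing, unexpected = diff_results(expected, actual, digits)
        if missing or unexpected:
            failures[name] = (missing, unexpected)
    return failures
```

A CI job could run each query on both versions, pass the collected rows to
validate(), and fail the build if the returned dict is non-empty. Rounding is
a crude tolerance (two values that straddle a rounding boundary would still
be reported as different), so a real check would probably want an explicit
epsilon comparison for floating-point columns.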

[1] https://github.com/databricks/spark-sql-perf/issues/184


On Wed, Feb 24, 2021 at 8:15 PM Reynold Xin <r...@databricks.com> wrote:

> +1 Correctness issues are serious!
>
>
> On Wed, Feb 24, 2021 at 11:08 AM, Mridul Muralidharan <mri...@gmail.com>
> wrote:
>
>> That is indeed cause for concern.
>> +1 on extending the voting deadline until we finish investigation of this.
>>
>> Regards,
>> Mridul
>>
>>
>> On Wed, Feb 24, 2021 at 12:55 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>
>>> -1 Could we extend the voting deadline?
>>>
>>> A few TPC-DS queries (q17, q18, q39a, q39b) are returning different
>>> results between Spark 3.0 and Spark 3.1. We need a few more days to
>>> understand whether these changes are expected.
>>>
>>> Xiao
>>>
>>>
>>> On Wed, Feb 24, 2021 at 10:41 AM Mridul Muralidharan <mri...@gmail.com> wrote:
>>>
>>>>
>>>> Sounds good, thanks for clarifying, Hyukjin!
>>>> +1 on release.
>>>>
>>>> Regards,
>>>> Mridul
>>>>
>>>>
>>>> On Wed, Feb 24, 2021 at 2:46 AM Hyukjin Kwon <gurwls...@gmail.com>
>>>> wrote:
>>>>
>>>>> I remember HiveExternalCatalogVersionsSuite was flaky for a while,
>>>>> which was fixed in
>>>>> https://github.com/apache/spark/commit/0d5d248bdc4cdc71627162a3d20c42ad19f24ef4
>>>>> and KafkaDelegationTokenSuite is flaky (
>>>>> https://issues.apache.org/jira/browse/SPARK-31250).
>>>>>
>>>>> On Wed, Feb 24, 2021 at 5:19 PM Mridul Muralidharan <mri...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> Signatures, digests, etc check out fine.
>>>>>> Checked out the tag and built/tested with -Pyarn -Phadoop-2.7 -Phive
>>>>>> -Phive-thriftserver -Pmesos -Pkubernetes
>>>>>>
>>>>>> I keep getting test failures with
>>>>>> * org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
>>>>>> * org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite.
>>>>>> (Note: I remove the $HOME/.m2 and $HOME/.ivy2 paths before the build)
>>>>>>
>>>>>> Removing these suites gets the build through, though. Does anyone
>>>>>> have suggestions on how to fix it? I did not face this with RC1.
>>>>>>
>>>>>> Regards,
>>>>>> Mridul
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 22, 2021 at 12:57 AM Hyukjin Kwon <gurwls...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>> version 3.1.1.
>>>>>>>
>>>>>>> The vote is open until February 24th 11PM PST and passes if a
>>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>>
>>>>>>> [ ] +1 Release this package as Apache Spark 3.1.1
>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>
>>>>>>> To learn more about Apache Spark, please see
>>>>>>> http://spark.apache.org/
>>>>>>>
>>>>>>> The tag to be voted on is v3.1.1-rc3 (commit
>>>>>>> 1d550c4e90275ab418b9161925049239227f3dc9):
>>>>>>> https://github.com/apache/spark/tree/v3.1.1-rc3
>>>>>>>
>>>>>>> The release files, including signatures, digests, etc. can be found
>>>>>>> at:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/
>>>>>>>
>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>>
>>>>>>> The staging repository for this release can be found at:
>>>>>>>
>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1367
>>>>>>>
>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-docs/
>>>>>>>
>>>>>>> The list of bug fixes going into 3.1.1 can be found at the following
>>>>>>> URL:
>>>>>>> https://s.apache.org/41kf2
>>>>>>>
>>>>>>> This release is using the release script of the tag v3.1.1-rc3.
>>>>>>>
>>>>>>> FAQ
>>>>>>>
>>>>>>> ===================
>>>>>>> What happened to 3.1.0?
>>>>>>> ===================
>>>>>>>
>>>>>>> There was a technical issue during Apache Spark 3.1.0 preparation,
>>>>>>> and it was discussed and decided to skip 3.1.0.
>>>>>>> Please see
>>>>>>> https://spark.apache.org/news/next-official-release-spark-3.1.1.html for
>>>>>>> more details.
>>>>>>>
>>>>>>> =========================
>>>>>>> How can I help test this release?
>>>>>>> =========================
>>>>>>>
>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>> an existing Spark workload and running on this release candidate,
>>>>>>> then reporting any regressions.
>>>>>>>
>>>>>>> If you're working in PySpark, you can set up a virtual env and install
>>>>>>> the current RC via "pip install
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/pyspark-3.1.1.tar.gz"
>>>>>>> and see if anything important breaks.
>>>>>>> In Java/Scala, you can add the staging repository to your project's
>>>>>>> resolvers and test with the RC (make sure to clean up the artifact
>>>>>>> cache before/after so you don't end up building with an out-of-date
>>>>>>> RC going forward).
>>>>>>>
>>>>>>> ===========================================
>>>>>>> What should happen to JIRA tickets still targeting 3.1.1?
>>>>>>> ===========================================
>>>>>>>
>>>>>>> The current list of open tickets targeted at 3.1.1 can be found by
>>>>>>> going to https://issues.apache.org/jira/projects/SPARK and searching
>>>>>>> for "Target Version/s" = 3.1.1
>>>>>>>
>>>>>>> Committers should look at those and triage. Extremely important bug
>>>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>>>> be worked on immediately. Everything else please retarget to an
>>>>>>> appropriate release.
>>>>>>>
>>>>>>> ==================
>>>>>>> But my bug isn't fixed?
>>>>>>> ==================
>>>>>>>
>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>> release unless the bug in question is a regression from the previous
>>>>>>> release. That being said, if there is something which is a regression
>>>>>>> that has not been correctly targeted please ping me or a committer to
>>>>>>> help target the issue.
>>>>>>>
>>>>>>
>
