[ https://issues.apache.org/jira/browse/SPARK-30296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012430#comment-17012430 ]
Dongjoon Hyun commented on SPARK-30296:
---------------------------------------

Hi, [~EnricoMi].

Please don't set `Fixed Version`. We set that when the committers merge the PRs.
Also, a `New Feature` issue should have the version of the `master` branch, 3.0.0 (as of today), because the Apache Spark community has a policy that allows backporting bug fixes only.
- https://spark.apache.org/contributing.html

> Dataset diffing transformation
> ------------------------------
>
>                 Key: SPARK-30296
>                 URL: https://issues.apache.org/jira/browse/SPARK-30296
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Enrico Minack
>            Priority: Major
>
> Evolving Spark code needs frequent regression testing to prove it still
> produces identical results, or if changes are expected, to investigate those
> changes. Diffing the Datasets of two code paths provides confidence.
> Diffing small schemata is easy, but with wide schemata the Spark query becomes
> laborious and error-prone. With a single proven and tested method, diffing
> becomes an easier and more reliable operation. As a Dataset transformation,
> you get this operation first hand with your Dataset API.
> This has proven to be useful for interactive Spark as well as deployed
> production code.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org