Eyal Allweil created DATAFU-159:
-----------------------------------
Summary: Add diff functionality to datafu-spark
Key: DATAFU-159
URL: https://issues.apache.org/jira/browse/DATAFU-159
Project: DataFu
Issue Type: New Feature
Reporter: Eyal Allweil
A useful feature when examining results is the ability to see clearly how two
datasets differ - for example, when running regression tests that compare
expected and actual results.
Spark provides _except_, but this is often not enough: it returns the rows
that appear in one dataset and not the other, without showing which columns
differ. For an example, see [this question on Stack
Overflow|https://stackoverflow.com/questions/44338412/how-to-compare-two-dataframe-and-print-columns-that-are-different-in-scala].
datafu-pig had a macro for doing this, and similar functionality could be a
useful addition to datafu-spark.
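
To illustrate the gap described above, here is a minimal sketch in plain Python (no Spark). It is not the datafu-pig macro or a proposed datafu-spark API: the names except_rows and keyed_diff, the "id" key column, and the sample rows are all hypothetical. It contrasts a set-difference, which only returns whole rows, with a keyed diff that reports which columns changed.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def except_rows(left: List[Row], right: List[Row]) -> List[Row]:
    """Rows of `left` that have no identical row in `right` (a set-difference)."""
    right_set = {tuple(sorted(r.items())) for r in right}
    return [r for r in left if tuple(sorted(r.items())) not in right_set]


def keyed_diff(expected: List[Row], actual: List[Row], key: str) -> List[Tuple[Any, str, List[str]]]:
    """Hypothetical keyed diff: one (key, status, changed_columns) entry per differing key."""
    exp_by_key = {r[key]: r for r in expected}
    act_by_key = {r[key]: r for r in actual}
    report = []
    for k in sorted(exp_by_key.keys() | act_by_key.keys()):
        e, a = exp_by_key.get(k), act_by_key.get(k)
        if e is None:
            report.append((k, "only in actual", []))
        elif a is None:
            report.append((k, "only in expected", []))
        else:
            # Compare column by column so the report names what changed.
            changed = sorted(c for c in e.keys() | a.keys() if e.get(c) != a.get(c))
            if changed:
                report.append((k, "changed", changed))
    return report


# Made-up sample data for illustration only.
expected = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 20},
    {"id": 3, "name": "c", "score": 30},
]
actual = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 25},
    {"id": 4, "name": "d", "score": 40},
]

if __name__ == "__main__":
    print(except_rows(expected, actual))
    print(keyed_diff(expected, actual, "id"))
```

The set-difference returns the full rows for ids 2 and 3 without saying why they differ, while the keyed diff distinguishes a changed column from a missing or extra row.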
--
This message was sent by Atlassian Jira
(v8.3.4#803005)