Eyal Allweil created DATAFU-159:
-----------------------------------

             Summary: Add diff functionality to datafu-spark
                 Key: DATAFU-159
                 URL: https://issues.apache.org/jira/browse/DATAFU-159
             Project: DataFu
          Issue Type: New Feature
            Reporter: Eyal Allweil


A useful feature when examining results is the ability to see clearly how two 
datasets differ - for example, when running regression checks that compare 
expected and actual results.

Spark provides the _except_ functionality, but this is often not enough: it 
returns the rows that differ without showing which columns changed. For 
example, see [this question on Stack 
Overflow|https://stackoverflow.com/questions/44338412/how-to-compare-two-dataframe-and-print-columns-that-are-different-in-scala].

Datafu-pig had a macro for doing this, and an equivalent could be a useful 
addition to datafu-spark.
