[ https://issues.apache.org/jira/browse/DATAFU-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927485#comment-17927485 ]
Eyal Allweil commented on DATAFU-159:
-------------------------------------

Thank you [~anuta]! We will look into it. Do you have an example of input/output so it will be easier to understand?

> Add diff functionality to datafu-spark
> --------------------------------------
>
>                 Key: DATAFU-159
>                 URL: https://issues.apache.org/jira/browse/DATAFU-159
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Eyal Allweil
>            Priority: Major
>
> A useful feature when examining results is the ability to clearly understand
> the differences between two datasets - for example, doing regressions between
> expected and actual results.
> Spark provides the _except_ functionality, but this is often not enough for
> this - for example, see
> [this question on Stack Overflow.|https://stackoverflow.com/questions/44338412/how-to-compare-two-dataframe-and-print-columns-that-are-different-in-scala]
> Datafu-pig had a macro for doing this, and this could be a useful addition to
> datafu-spark.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)