[ 
https://issues.apache.org/jira/browse/DATAFU-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eyal Allweil closed DATAFU-159.
-------------------------------
    Resolution: Won't Do

The spark-extension library already publishes artifacts for Spark 2.4 on 
Maven Central, so I see no reason to implement this in datafu-spark. Since 
there are no objections, I am closing this issue.

> Add diff functionality to datafu-spark
> --------------------------------------
>
>                 Key: DATAFU-159
>                 URL: https://issues.apache.org/jira/browse/DATAFU-159
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Eyal Allweil
>            Priority: Major
>
> A useful feature when examining results is the ability to see clearly how 
> two datasets differ - for example, when running regression tests that 
> compare expected and actual results.
> Spark provides the _except_ functionality, but this is often not enough - 
> for example, see [this question on Stack 
> Overflow.|https://stackoverflow.com/questions/44338412/how-to-compare-two-dataframe-and-print-columns-that-are-different-in-scala]
> datafu-pig had a macro for doing this, and it could be a useful addition to 
> datafu-spark.
>  
>  
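To illustrate the gap described above, here is a minimal plain-Python sketch (not Spark, datafu, or spark-extension code). A set difference in the style of _except_ returns whole rows that do not match, while a keyed diff reports which rows were added, missing, or changed, and in which columns. The helper name `diff_rows`, the `id` key column, and the sample rows are all hypothetical and exist only for this example.

```python
# Hypothetical helper for illustration only; not part of datafu-spark,
# datafu-pig, or spark-extension.
def diff_rows(expected, actual, key):
    """Compare two lists of dict rows by `key` and label each difference."""
    exp = {row[key]: row for row in expected}
    act = {row[key]: row for row in actual}
    report = []
    for k in sorted(exp.keys() | act.keys()):
        if k not in act:
            report.append(("missing", k, {}))   # only in expected
        elif k not in exp:
            report.append(("added", k, {}))     # only in actual
        elif exp[k] != act[k]:
            # Record (expected, actual) for each column that differs.
            changed = {
                col: (exp[k].get(col), act[k].get(col))
                for col in sorted(exp[k].keys() | act[k].keys())
                if exp[k].get(col) != act[k].get(col)
            }
            report.append(("changed", k, changed))
    return report


# Made-up sample data.
expected = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 20},
    {"id": 3, "name": "c", "score": 30},
]
actual = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 25},
    {"id": 4, "name": "d", "score": 40},
]

# An except-style set difference: whole rows, with no hint of what changed.
except_like = [row for row in expected if row not in actual]

# The keyed diff: one labeled entry per differing key.
report = diff_rows(expected, actual, "id")
```

Here `except_like` holds the full rows for ids 2 and 3, whereas `report` distinguishes a changed `score` on id 2 from a missing id 3 and an added id 4.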



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
