[ 
https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Styles updated SPARK-20463:
-----------------------------------
    Component/s:     (was: PySpark)
                 SQL
        Summary: Add support for IS [NOT] DISTINCT FROM to SPARK SQL  (was: 
Expose SPARK SQL <=> operator in PySpark)

> Add support for IS [NOT] DISTINCT FROM to SPARK SQL
> ---------------------------------------------------
>
>                 Key: SPARK-20463
>                 URL: https://issues.apache.org/jira/browse/SPARK-20463
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Michael Styles
>
> Expose the Spark SQL null-safe equality operator '<=>' in PySpark as a column 
> function called *isNotDistinctFrom*. For example:
> {panel}
> {noformat}
> data = [(10, 20), (30, 30), (40, None), (None, None)]
> df2 = sc.parallelize(data).toDF(["c1", "c2"])
> df2.where(df2["c1"].isNotDistinctFrom(df2["c2"])).collect()
> [Row(c1=30, c2=30), Row(c1=None, c2=None)]
> {noformat}
> {panel}
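
For readers unfamiliar with null-safe equality, the semantics requested in the
description can be sketched in plain Python, with None standing in for SQL
NULL. This is only an illustration: the helper name is_not_distinct_from is
hypothetical, is not part of Spark or PySpark, and says nothing about how the
feature would be implemented.

```python
from typing import Optional


def is_not_distinct_from(a: Optional[int], b: Optional[int]) -> bool:
    """Null-safe equality (hypothetical helper); None models SQL NULL."""
    if a is None or b is None:
        # Two NULLs are "not distinct"; NULL compared with a value is distinct.
        return a is None and b is None
    return a == b


# Same sample rows as in the issue description.
data = [(10, 20), (30, 30), (40, None), (None, None)]
matches = [(c1, c2) for c1, c2 in data if is_not_distinct_from(c1, c2)]
print(matches)  # [(30, 30), (None, None)]
```

Unlike ordinary SQL '=', which yields NULL when either operand is NULL, this
comparison always yields true or false, which is why the (None, None) row is
kept and the (40, None) row is dropped.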



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
