[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Styles updated SPARK-20463:
-----------------------------------
Description:
Add support for the SQL standard distinct predicate to Spark SQL.

{noformat}
<expression> IS [NOT] DISTINCT FROM <expression>
{noformat}

{noformat}
data = [(10, 20), (30, 30), (40, None), (None, None)]
df = sc.parallelize(data).toDF(["c1", "c2"])
df.createTempView("df")
spark.sql("select c1, c2 from df where c1 is not distinct from c2").collect()
[Row(c1=30, c2=30), Row(c1=None, c2=None)]
{noformat}

was:
Expose the Spark SQL '<=>' operator in PySpark as a column function called *isNotDistinctFrom*. For example:

{panel}
{noformat}
data = [(10, 20), (30, 30), (40, None), (None, None)]
df2 = sc.parallelize(data).toDF(["c1", "c2"])
df2.where(df2["c1"].isNotDistinctFrom(df2["c2"])).collect()
[Row(c1=30, c2=30), Row(c1=None, c2=None)]
{noformat}
{panel}

> Add support for IS [NOT] DISTINCT FROM to Spark SQL
> ---------------------------------------------------
>
>                 Key: SPARK-20463
>                 URL: https://issues.apache.org/jira/browse/SPARK-20463
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Michael Styles
>
> Add support for the SQL standard distinct predicate to Spark SQL.
> {noformat}
> <expression> IS [NOT] DISTINCT FROM <expression>
> {noformat}
> {noformat}
> data = [(10, 20), (30, 30), (40, None), (None, None)]
> df = sc.parallelize(data).toDF(["c1", "c2"])
> df.createTempView("df")
> spark.sql("select c1, c2 from df where c1 is not distinct from c2").collect()
> [Row(c1=30, c2=30), Row(c1=None, c2=None)]
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
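As background on the predicate requested above: IS NOT DISTINCT FROM is null-safe equality, i.e. it treats two NULLs as equal and a NULL against a non-NULL as unequal, whereas ordinary SQL `=` yields unknown whenever either side is NULL. A minimal plain-Python sketch of those semantics (illustrative only, not Spark code; the helper name `is_not_distinct_from` is invented for this sketch, with `None` standing in for SQL NULL):

{noformat}
def is_not_distinct_from(a, b):
    # NULL IS NOT DISTINCT FROM NULL -> true (unlike NULL = NULL, which is unknown)
    if a is None and b is None:
        return True
    # NULL against a non-NULL value -> false
    if a is None or b is None:
        return False
    # both non-NULL: ordinary equality
    return a == b

# Same rows as the JIRA example above
data = [(10, 20), (30, 30), (40, None), (None, None)]
matches = [(c1, c2) for c1, c2 in data if is_not_distinct_from(c1, c2)]
# matches == [(30, 30), (None, None)], mirroring the rows the query returns
{noformat}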