[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

Apache Spark (JIRA) Mon, 10 Oct 2016 23:14:18 -0700

    [ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564617#comment-15564617 ]


Apache Spark commented on SPARK-17867:
--------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/15427

> Dataset.dropDuplicates (i.e. distinct) should consider the columns with same 
> column name
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-17867
>                 URL: https://issues.apache.org/jira/browse/SPARK-17867
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Liang-Chi Hsieh
>
> Dataset.dropDuplicates looks up the given column name in the output and uses 
> only the first resolved attribute. When more than one column has the same 
> name, the other columns are put into the aggregation columns instead of the 
> grouping columns. We should fix this.
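
The behaviour described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not Spark's implementation: the helper `drop_duplicates`, its `all_matches` flag, and the sample data are hypothetical, and "keep the first row per group" stands in for aggregating the non-grouping columns with a first-value aggregate.

```python
from typing import List, Sequence, Tuple

Row = Tuple[object, ...]


def drop_duplicates(rows: Sequence[Row], names: Sequence[str],
                    subset: Sequence[str], all_matches: bool) -> List[Row]:
    """Keep the first row seen for each grouping key (hypothetical model).

    all_matches=False: the key uses only the first column carrying each
                       requested name (the behaviour the issue reports).
    all_matches=True:  the key uses every column whose name is requested
                       (the behaviour the issue asks for).
    """
    if all_matches:
        key_idx = [i for i, name in enumerate(names) if name in subset]
    else:
        key_idx = [names.index(name) for name in subset]

    seen = set()
    kept: List[Row] = []
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Two columns that share the name "a" (made-up sample data).
names = ["a", "a"]
rows = [(1, 1), (1, 2), (1, 1)]

first_only = drop_duplicates(rows, names, ["a"], all_matches=False)
every_match = drop_duplicates(rows, names, ["a"], all_matches=True)
```

With only the first "a" column in the key, the distinct row `(1, 2)` is collapsed into `(1, 1)`. Including both same-named columns in the key preserves it.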



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org