[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2017-05-18 Thread Mitesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016021#comment-16016021 ] Mitesh commented on SPARK-17867: Ah I see, thanks [~viirya]. The repartitionByColumns is just a short-cut

[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2017-05-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015852#comment-16015852 ] Liang-Chi Hsieh commented on SPARK-17867: The above example code can't compile with current

[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2017-05-18 Thread Mitesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015772#comment-16015772 ] Mitesh commented on SPARK-17867: I'm seeing a regression from this change: the last filter gets pushed

[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564617#comment-15564617 ] Apache Spark commented on SPARK-17867: User 'viirya' has created a pull request for this issue: