Reynold Xin created SPARK-7324:
----------------------------------

             Summary: Add DataFrame.dropDuplicates
                 Key: SPARK-7324
                 URL: https://issues.apache.org/jira/browse/SPARK-7324
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Reynold Xin

Similar to pandas' DataFrame.drop_duplicates:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html

We can implement this by rewriting it as groupBy(cols).agg(first(...)): group on the columns that define a duplicate, then take the first value of every column within each group.
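
To make the proposed rewrite concrete, here is a minimal sketch of the semantics in plain Python, not Spark code. The function name drop_duplicates, the Row alias, and the sample data are all hypothetical and only illustrate the idea: building a key from cols plays the role of groupBy(cols), and keeping the first row seen per key plays the role of agg(first(...)) applied to every column.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple

# Hypothetical row representation for this sketch: one dict per row.
Row = Dict[str, Any]


def drop_duplicates(rows: Iterable[Row], cols: Sequence[str]) -> List[Row]:
    """Keep the first row seen for each distinct combination of `cols`."""
    first_by_key: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        # Equivalent of groupBy(cols): the grouping key is the tuple of
        # values in the de-duplication columns.
        key = tuple(row[c] for c in cols)
        # Equivalent of agg(first(...)): only the first row per key is kept.
        if key not in first_by_key:
            first_by_key[key] = row
    # Dicts preserve insertion order, so groups come back in first-seen order.
    return list(first_by_key.values())


# Illustrative data, not taken from the issue.
rows = [
    {"name": "alice", "age": 30, "city": "SF"},
    {"name": "alice", "age": 30, "city": "NYC"},
    {"name": "bob", "age": 25, "city": "SF"},
]

deduped = drop_duplicates(rows, ["name", "age"])
```

This in-memory sketch has a well-defined notion of "first" because the input is an ordered list. On a distributed dataset without an explicit ordering, which row counts as first within a group may vary between runs, so the surviving values of the non-key columns should not be relied upon.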