[ https://issues.apache.org/jira/browse/SPARK-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060677#comment-14060677 ]
Sean Owen commented on SPARK-2278:
----------------------------------

Where are keys replicated? I don't get that point. Keys can be what you like, as can values. Values don't need to contain keys to sortByKey for example. You can sort by fields of a composite key without copying anything. A function like "employee => employee.name" does not copy the value employee in order to access the name for sorting purposes.

groupBy does not (necessarily) operate on key-value pair data. You can easily group any value by some function of the value. groupByKey does of course.

> groupBy & groupByKey should support custom comparator
> -----------------------------------------------------
>
>                 Key: SPARK-2278
>                 URL: https://issues.apache.org/jira/browse/SPARK-2278
>             Project: Spark
>          Issue Type: New Feature
>          Components: Java API
>    Affects Versions: 1.0.0
>            Reporter: Hans Uhlig
>
> To maintain parity with MapReduce you should be able to specify a custom key
> equality function in groupBy/groupByKey similar to sortByKey.
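
To make the point in the comment concrete, here is a minimal sketch in plain Java. It deliberately uses java.util.stream as a stand-in and is not Spark's RDD API. The Employee class, its fields, and the method names are hypothetical, chosen only for illustration. It shows that a key function such as "employee => employee.name" only reads a field of the value, and that grouping by a function of the value (here a lower-cased department) can act like a custom key equality.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class KeyFunctionSketch {
    // Hypothetical value type; it is not taken from the thread or from Spark.
    static final class Employee {
        final String name;
        final String dept;

        Employee(String name, String dept) {
            this.name = name;
            this.dept = dept;
        }
    }

    // Sort by a field of the value. The comparator only reads e.name;
    // the returned list holds the same Employee objects, not copies.
    static List<Employee> sortByName(List<Employee> in) {
        return in.stream()
                 .sorted(Comparator.comparing((Employee e) -> e.name))
                 .collect(Collectors.toList());
    }

    // Group by a function of the value. Lower-casing the department makes
    // "Eng" and "eng" fall into the same group, which behaves like a
    // case-insensitive key equality without a separate key field.
    static Map<String, List<Employee>> groupByDept(List<Employee> in) {
        return in.stream()
                 .collect(Collectors.groupingBy(e -> e.dept.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        List<Employee> staff = Arrays.asList(
            new Employee("bob", "Eng"),
            new Employee("alice", "eng"),
            new Employee("carol", "Ops"));

        for (Employee e : sortByName(staff)) {
            System.out.println(e.name);
        }
        groupByDept(staff).forEach(
            (dept, members) -> System.out.println(dept + ": " + members.size()));
    }
}
```

Whether the same normalization trick covers every use case from the issue description is a separate question; the sketch only illustrates that deriving a key from a value does not require storing or copying the key alongside it.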