[ https://issues.apache.org/jira/browse/SPARK-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-4640.
------------------------------
    Resolution: Won't Fix

It's not a bad idea, but given the lack of response I think this should be closed. It can of course be implemented outside Spark.

> FixedRangePartitioner for partitioning items with a known range
> ---------------------------------------------------------------
>
>                 Key: SPARK-4640
>                 URL: https://issues.apache.org/jira/browse/SPARK-4640
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Kevin Mader
>
> For the large datasets I work with, it is common to have light-weight keys
> and very heavy values (integers and large double arrays, for example). The
> key values are, however, known and unchanging. It would be nice if Spark
> had a built-in partitioner that could take advantage of this. A
> FixedRangePartitioner[T](keys: Seq[T], partitions: Int) would be ideal.
> Furthermore, this partitioner type could be extended to a
> PartitionerWithKnownKeys that had a getAllKeys function, allowing the list
> of keys to be obtained without querying the entire RDD.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
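Since the resolution suggests implementing this outside Spark, here is a minimal, hypothetical sketch of the proposed partitioner in plain Python. The class name, constructor shape (keys plus a partition count), and the `get_all_keys` method mirror the `FixedRangePartitioner[T](keys: Seq[T], partitions: Int)` and `PartitionerWithKnownKeys.getAllKeys` named in the issue; everything else (method names, the chunking strategy) is an assumption, not Spark's API. In PySpark, `get_partition` could be passed as the `partitionFunc` argument of `RDD.partitionBy`.

```python
class FixedRangePartitioner:
    """Sketch of a partitioner for a fixed, known set of keys.

    Because the keys are known up front, each key is assigned a stable
    partition index once, at construction time, and the full key list is
    available without scanning the data (the proposed getAllKeys idea).
    """

    def __init__(self, keys, partitions):
        if partitions <= 0:
            raise ValueError("partitions must be positive")
        self.partitions = partitions
        # Sort once so that contiguous key ranges map to the same partition.
        ordered = sorted(keys)
        chunk = -(-len(ordered) // partitions)  # ceiling division
        self._index = {k: i // chunk for i, k in enumerate(ordered)}

    @property
    def num_partitions(self):
        return self.partitions

    def get_partition(self, key):
        # Mirrors the role of Spark's Partitioner.getPartition(key):
        # map a key to a partition index in [0, partitions).
        return self._index[key]

    def get_all_keys(self):
        # The proposed PartitionerWithKnownKeys extension: the key set
        # is returned without touching the underlying dataset.
        return list(self._index)
```

For example, `FixedRangePartitioner(range(100), 4)` places keys 0–24 in partition 0, 25–49 in partition 1, and so on; a heavyweight value never needs to be inspected to decide where its key lands.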