The use case is to control the partitions as they come out of the HadoopRDD.

1. Have my own HadoopPartition that has fields specific to my application. These fields would then be used by other RDD operations (which I would also override). This is why I was looking to extend HadoopPartition.

2. Have my own getPartitions with slightly different partitioning logic. This can almost be solved by subclassing InputFormat and overriding its getSplits method, but I still need getPartitions to create MyHadoopPartition instead of HadoopPartition.
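For illustration, here is a rough sketch of the general shape of points 1 and 2. It is not a HadoopRDD subclass: it swaps in a wrapper RDD over an arbitrary parent RDD, using only the public RDD and Partition API, and so sidesteps the question of whether HadoopPartition can be extended. The names MyPartition, TaggedRDD and tagFor are hypothetical and do not exist in Spark, and the per-partition "tag" field merely stands in for whatever application-specific fields are needed.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical partition type: wraps the parent's partition and carries
// one application-specific field (here just a String tag).
class MyPartition(val parent: Partition, val tag: String) extends Partition {
  // Keep the parent's index so the one-to-one dependency lines up.
  override def index: Int = parent.index
}

// Hypothetical wrapper RDD: same data as `prev`, but its partitions are
// MyPartition instances built by the overridden getPartitions.
class TaggedRDD[T: ClassTag](prev: RDD[T], tagFor: Int => String)
  extends RDD[T](prev) {

  // Wrap each parent partition; custom partitioning logic would go here.
  override protected def getPartitions: Array[Partition] =
    prev.partitions.map(p => new MyPartition(p, tagFor(p.index)): Partition)

  // Recover the custom partition, then delegate to the parent's iterator.
  override def compute(split: Partition, context: TaskContext): Iterator[T] = {
    val mine = split.asInstanceOf[MyPartition]
    // mine.tag is available here for application-specific logic.
    prev.iterator(mine.parent, context)
  }
}
```

Assuming an existing RDD named hadoopRdd, something like new TaggedRDD(hadoopRdd, i => "part-" + i) would then expose MyPartition instances through its partitions. Changing how the input is actually split would still have to happen in the InputFormat's getSplits, as described in point 2.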
Ameet

On Fri, Feb 21, 2014 at 2:37 PM, Jey Kottalam <j...@cs.berkeley.edu> wrote:
> What's the motivation for subclassing HadoopRDD? I don't believe
> that's a supported use case. Is it not possible to do what you need
> with a Hadoop InputFormat?
>
> On Fri, Feb 21, 2014 at 11:16 AM, Ameet Kini <ameetk...@gmail.com> wrote:
> > I'm looking to subclass HadoopRDD and was hoping to subclass NextIterator
> > in compute().
> >
> > Thanks,
> > Ameet