Re: RDD getPartitions() size and HashPartitioner numPartitions

2016-12-04 Thread Manish Malhotra
It's a pretty nice question! I'm trying to understand the problem and see whether I can help further. When you say CustomRDD, I believe you will be using it in the transformation stage, once the data has been read from a source such as HDFS, Cassandra, or Kafka. Now the RDD.getPartitions() should return the

RDD getPartitions() size and HashPartitioner numPartitions

2016-12-02 Thread Amit Sela
This might be a silly question, but I wanted to make sure: when implementing my own RDD and using a HashPartitioner as the RDD's partitioner, the number of partitions returned by my implementation of getPartitions() has to match the number of partitions set in the HashPartitioner, correct?
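
To make the question concrete, here is a minimal sketch in plain Java (no Spark dependency) of why the two counts are expected to agree. It assumes the usual HashPartitioner behaviour: a null key goes to partition 0 and any other key goes to its hashCode modulo numPartitions, adjusted to be non-negative. The class name PartitionCountSketch and the helpers partitionFor and countsAgree are hypothetical names used only for illustration; they are not part of the Spark API.

```java
import java.util.TreeSet;

public class PartitionCountSketch {

    // Assumed to mirror HashPartitioner.getPartition: null keys map to 0,
    // other keys map to a non-negative (hashCode mod numPartitions).
    static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        int rawMod = key.hashCode() % numPartitions;
        // Java's % can return a negative value, so shift it into [0, numPartitions).
        return rawMod < 0 ? rawMod + numPartitions : rawMod;
    }

    // Hypothetical sanity check a custom RDD could run on itself:
    // the partitioner may emit any index in [0, numPartitions), so the array
    // returned by getPartitions() needs exactly that many entries.
    static boolean countsAgree(int partitionerNumPartitions, int getPartitionsLength) {
        return partitionerNumPartitions == getPartitionsLength;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        TreeSet<Integer> seen = new TreeSet<>();
        for (int key = -1000; key <= 1000; key++) {
            seen.add(partitionFor(key, numPartitions));
        }
        // Every index produced lies in [0, numPartitions).
        System.out.println("indices produced: " + seen);
        System.out.println("4 vs 4 agree: " + countsAgree(numPartitions, 4));
        System.out.println("4 vs 3 agree: " + countsAgree(numPartitions, 3));
    }
}
```

If getPartitions() returned fewer entries than numPartitions, some keys would hash to an index with no partition behind it; if it returned more, the extra partitions would never be addressed by the partitioner. That is the reasoning behind expecting the two numbers to match.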