Re: Why is RDD to PairRDDFunctions only via implicits?

Justin Pihony Fri, 22 May 2015 12:10:36 -0700

The (crude) proof of concept seems to work:

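// Stand-in for the base RDD type.
class RDD[V](value: List[V]){
  def doStuff = println("I'm doing stuff")
}

object RDD{
  // Crude placeholder: ignores x and always returns a fixed pair RDD.
  implicit def toPair[V](x:RDD[V]) = new PairRDD(List((1,2)))
}

// Pair-specific methods live on a subclass instead of behind an implicit.
class PairRDD[K,V](value: List[(K,V)]) extends RDD(value){
  def doPairs = println("I'm using pairs")
}

class Context{
  // A list of tuples resolves to the more specific overload and yields a PairRDD.
  def parallelize[K,V](x: List[(K,V)]) = new PairRDD(x)
  def parallelize[V](x: List[V]) = new RDD(x)
}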
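
To make the intent concrete, here is a slightly tightened sketch of the same idea together with a small driver. It is only an illustration under a few assumptions: the classes are toy stand-ins (not Spark's), the methods return strings instead of printing so they can be checked, the implicit is typed on RDD[(K, V)] rather than the fixed placeholder, and the Demo object is hypothetical. It assumes the compiler accepts two overloads that differ only in result type after erasure, as the proof of concept relies on.

```scala
import scala.language.implicitConversions

// Toy stand-ins for illustration only; these are not Spark's classes.
class RDD[V](val value: List[V]) {
  def doStuff: String = "I'm doing stuff"
}

object RDD {
  // Kept for compatibility: a plain RDD of tuples still gains the pair methods.
  implicit def toPair[K, V](x: RDD[(K, V)]): PairRDD[K, V] = new PairRDD(x.value)
}

class PairRDD[K, V](pairs: List[(K, V)]) extends RDD[(K, V)](pairs) {
  def doPairs: String = "I'm using pairs"
}

class Context {
  // Both overloads take a List after erasure; they differ in result type.
  def parallelize[K, V](x: List[(K, V)]): PairRDD[K, V] = new PairRDD(x)
  def parallelize[V](x: List[V]): RDD[V] = new RDD(x)
}

// Hypothetical driver showing which overload each call selects.
object Demo {
  def main(args: Array[String]): Unit = {
    val sc = new Context

    // A list of tuples picks the tuple overload, so doPairs is a real member.
    val pairs = sc.parallelize(List((1, "a"), (2, "b")))
    println(pairs.doPairs)
    println(pairs.doStuff)

    // Any other list only matches the generic overload.
    val plain = sc.parallelize(List(1, 2, 3))
    println(plain.doStuff)

    // Code that builds an RDD of tuples directly still works via the implicit.
    val legacy: RDD[(Int, String)] = new RDD(List((1, "a")))
    println(legacy.doPairs)
  }
}
```

With this shape, doPairs shows up directly on the static type returned by parallelize for tuple input, which is what would make it visible in the API docs and in tab completion, while the implicit keeps older call sites compiling.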

On Fri, May 22, 2015 at 2:44 PM, Reynold Xin <r...@databricks.com> wrote:

> I'm not sure if it is possible to overload the map function twice, once
> for just KV pairs, and another for K and V separately.
>
>
> On Fri, May 22, 2015 at 10:26 AM, Justin Pihony <justin.pih...@gmail.com>
> wrote:
>
>> This ticket <https://issues.apache.org/jira/browse/SPARK-4397> improved
>> the RDD API, but it could be even more discoverable if made available via
>> the API directly. I assume this was originally an omission that now needs
>> to be kept for backwards compatibility, but would any of the repo owners be
>> open to making this more discoverable to the point of API docs and tab
>> completion (while keeping both binary and source compatibility)?
>>
>>
>>     class PairRDD extends RDD{
>>       ....pair methods
>>     }
>>
>>     RDD{
>>       def map[K: ClassTag, V: ClassTag](f: T => (K,V)):PairRDD[K,V]
>>     }
>>
>> As long as the implicits remain, then compatibility remains, but now it
>> is explicit in the docs on how to get a PairRDD and in tab completion.
>>
>> Thoughts?
>>
>> Justin Pihony
>>
>
>
