Hi Ankur,

I have the following questions on IndexedRDD.

1. Does IndexedRDD support String keys? According to the current
documentation, it appears to support only Long keys.

2. Is IndexedRDD efficient when joined with another RDD? My use case is
that I need to create an IndexedRDD for a certain set of data and then get
the keys that are present in the IndexedRDD but not present in some other
RDD. How would IndexedRDD support such a use case efficiently?


Thanks,
Swetha

On Wed, Jul 15, 2015 at 2:46 AM, Jem Tucker <jem.tuc...@gmail.com> wrote:

> This is very interesting. Do you know whether this version will be backward
> compatible with older versions of Spark (1.2.0)?
>
> Thanks,
>
> Jem
>
>
> On Wed, Jul 15, 2015 at 10:04 AM Ankur Dave <ankurd...@gmail.com> wrote:
>
>> The latest version of IndexedRDD supports any key type with a defined
>> serializer
>> <https://github.com/amplab/spark-indexedrdd/blob/master/src/main/scala/edu/berkeley/cs/amplab/spark/indexedrdd/KeySerializer.scala>,
>> including Strings. It's not released yet, but you can use it from the
>> master branch if you're interested.
>>
>> Ankur <http://www.ankurdave.com/>
>>
>> On Wed, Jul 15, 2015 at 12:43 AM, Jem Tucker <jem.tuc...@gmail.com>
>> wrote:
>>
>>> With regard to indexed structures in Spark, are there any alternatives
>>> to IndexedRDD for more generic keys, including Strings?
>>>
>>> Thanks
>>>
>>> Jem
>>>
>>
