invertedIndexes: RDD[(Int, InvIndex)] =
    a.reduceByKey(generateInvertedIndex)

vectors.mapPartitions {
  iter =>
    val invIndex = invertedIndexes(samePartitionKey)
    iter.map(invIndex.calculateSimilarity(_))
}

How could I go about setting up the Partition such that the specific data
structure I need would be available for that partition, without duplicating
all values (which would happen if I were to make a broadcast variable)?

One thought I have been having is to store the objects in HDFS, but I'm not
sure if that would be a suboptimal solution (it seems like it could slow
down the process a lot).

Another thought I am currently exploring is whether there is some way I can
create a custom Partition or Partitioner that could hold the data structure
(although that might get too complicated and become problematic).

Any thoughts on how I could attack this issue would be highly appreciated.

Thank you for your help!

--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Sending-large-objects-to-specific-RDDs-tp25967.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
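[Editor's note: one possible sketch of the co-partitioning idea raised above. This is not from the thread; `InvIndex`, `calculateSimilarity`, and the stub similarity formula are stand-ins for the types in the question. The idea is to partition both RDDs with the same `HashPartitioner` so that each key's index and vectors land in the same partition, then walk matching partitions together with `zipPartitions` instead of broadcasting the indexes to every node.]

```scala
// Stand-in for the InvIndex type from the question; the similarity
// formula here is a stub purely so the logic can be exercised.
case class InvIndex(weight: Int) {
  def calculateSimilarity(v: Double): Double = v * weight
}

// Per-partition join logic: given this partition's vectors and this
// partition's (co-partitioned) inverted indexes, compute similarities
// locally, with no shared state across partitions.
def similarityPerPartition(
    vecIter: Iterator[(Int, Double)],
    idxIter: Iterator[(Int, InvIndex)]): Iterator[(Int, Double)] = {
  // Materialize only this partition's slice of the indexes.
  val indexByKey = idxIter.toMap
  vecIter.map { case (k, v) => (k, indexByKey(k).calculateSimilarity(v)) }
}

// With Spark, the same function plugs into zipPartitions after both
// RDDs are given the same partitioner (sketch, not runnable here):
//
//   val part = new org.apache.spark.HashPartitioner(n)
//   val idx  = invertedIndexes.partitionBy(part)
//   val vecs = vectors.partitionBy(part)
//   val sims = vecs.zipPartitions(idx)(similarityPerPartition)
```

`zipPartitions` requires the two RDDs to have the same number of partitions, which holds here because both were repartitioned with the same `HashPartitioner`; each partition of `sims` then only ever holds its own slice of the indexes.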