It did the job. Thanks. :)

On 19 August 2014 at 10:20, Sean Owen <so...@cloudera.com> wrote:
> In that case, why not collectAsMap() and have the whole result as a
> simple Map in memory? Then lookups are trivial. RDDs aren't
> distributed maps.
>
> On Tue, Aug 19, 2014 at 9:17 AM, Emmanuel Castanier
> <emmanuel.castan...@gmail.com> wrote:
>> Thanks for your answer.
>> In my case that's a shame, because we have only 60 entries in the final
>> RDD; I was thinking it would be fast to get the needed one.
>>
>> On 19 August 2014 at 09:58, Sean Owen <so...@cloudera.com> wrote:
>>
>>> You can use the function lookup() to accomplish this too; it may be a
>>> bit faster.
>>>
>>> It will never be as efficient as a database lookup, since it is
>>> implemented by scanning through all of the data. There is no index or
>>> anything.
>>>
>>> On Tue, Aug 19, 2014 at 8:43 AM, Emmanuel Castanier
>>> <emmanuel.castan...@gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I'm a total newbie with Spark, so my question may be a dumb one.
>>>> I tried Spark to compute values, and on that front everything works
>>>> perfectly (and it's fast :) ).
>>>>
>>>> At the end of the process, I have an RDD of Key (String) / Values
>>>> (Array of String), from which I want to get only one entry, like this:
>>>>
>>>> myRdd.filter(t => t._1.equals(param))
>>>>
>>>> If I do a collect to get the single "tuple", it takes about 12 seconds
>>>> to execute; I imagine that's because Spark is meant to be used
>>>> differently...
>>>>
>>>> Best regards,
>>>>
>>>> Emmanuel
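
For anyone landing on this thread later, here is a minimal, self-contained
sketch of the three approaches discussed above (filter + collect, lookup(),
and collectAsMap()). The object name, the local master setting, and the
sample data are illustrative assumptions, not from the thread:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // PairRDDFunctions implicits (Spark 1.x)

object LookupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("lookup-sketch").setMaster("local[*]"))

    // Stand-in for the real ~60-entry result RDD from the thread.
    val myRdd = sc.parallelize(Seq(
      "a" -> Array("1", "2"),
      "b" -> Array("3")
    ))
    val param = "a"

    // 1) filter + collect: runs a distributed job that scans every partition.
    val viaFilter: Array[(String, Array[String])] =
      myRdd.filter(t => t._1 == param).collect()

    // 2) lookup(): also scans the RDD, but returns only the values for the key.
    val viaLookup: Seq[Array[String]] = myRdd.lookup(param)

    // 3) collectAsMap(): pull everything to the driver once; afterwards each
    //    lookup is a plain in-memory Map access -- the suggestion that solved
    //    the problem here, reasonable only because the RDD is tiny.
    val asMap: scala.collection.Map[String, Array[String]] = myRdd.collectAsMap()
    val viaMap: Option[Array[String]] = asMap.get(param)

    println(viaMap.map(_.mkString(",")).getOrElse("not found"))
    sc.stop()
  }
}

The trade-off: the first two approaches stay distributed but pay a full scan
per query; the third pays one collect up front and then answers every query
from driver memory, which only makes sense when the result fits comfortably
on the driver, as it does with 60 entries.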