It did the job. Thanks. :)

On 19 August 2014 at 10:20, Sean Owen <so...@cloudera.com> wrote:
> In that case, why not collectAsMap() and have the whole result as a
> simple Map in memory? Then lookups are trivial. RDDs aren't
> distributed maps.
>
> On Tue, Aug 19, 2014 at 9:17 AM, Emmanuel Castanier
> <emmanuel.castan...@gmail.com> wrote:
>> Thanks for your answer.
>> In my case that's a shame, because we have only 60 entries in the final
>> RDD; I was thinking it would be fast to get the needed one.
>>
>> On 19 August 2014 at 09:58, Sean Owen <so...@cloudera.com> wrote:
>>
>>> You can use the function lookup() to accomplish this too; it may be a
>>> bit faster.
>>>
>>> It will never be as efficient as a database lookup, since it is
>>> implemented by scanning through all of the data. There is no index or
>>> anything.
>>>
>>> On Tue, Aug 19, 2014 at 8:43 AM, Emmanuel Castanier
>>> <emmanuel.castan...@gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I'm a total newbie with Spark, so my question may be a dumb one.
>>>> I tried Spark to compute values, and on that front everything works
>>>> perfectly (and it's fast :) ).
>>>>
>>>> At the end of the process, I have an RDD of Key (String) / Values
>>>> (Array of String), from which I want to get only one entry, like this:
>>>>
>>>> myRdd.filter(t => t._1.equals(param))
>>>>
>>>> If I do a collect to get the single "tuple", it takes about 12 seconds
>>>> to execute; I imagine that's because Spark is meant to be used
>>>> differently...
>>>>
>>>> Best regards,
>>>>
>>>> Emmanuel
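
For anyone landing on this thread later, here is a minimal, self-contained
sketch of the three approaches discussed above (filter + collect, lookup(),
and collectAsMap()). The object name, the local master setting, and the
sample data are illustrative assumptions, not from the thread:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // PairRDDFunctions implicits (Spark 1.x)

object LookupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("lookup-sketch").setMaster("local[*]"))

    // Stand-in for the real ~60-entry result RDD from the thread.
    val myRdd = sc.parallelize(Seq(
      "a" -> Array("1", "2"),
      "b" -> Array("3")
    ))
    val param = "a"

    // 1) filter + collect: runs a distributed job that scans every partition.
    val viaFilter: Array[(String, Array[String])] =
      myRdd.filter(t => t._1 == param).collect()

    // 2) lookup(): also scans the RDD, but returns only the values for the key.
    val viaLookup: Seq[Array[String]] = myRdd.lookup(param)

    // 3) collectAsMap(): pull everything to the driver once; afterwards each
    //    lookup is a plain in-memory Map access -- the suggestion that solved
    //    the problem here, reasonable only because the RDD is tiny.
    val asMap: scala.collection.Map[String, Array[String]] = myRdd.collectAsMap()
    val viaMap: Option[Array[String]] = asMap.get(param)

    println(viaMap.map(_.mkString(",")).getOrElse("not found"))
    sc.stop()
  }
}

The trade-off: the first two approaches stay distributed but pay a full scan
per query; the third pays one collect up front and then answers every query
from driver memory, which only makes sense when the result fits comfortably
on the driver, as it does with 60 entries.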