Same RDD means same SparkContext, which means the same workers.

Cache/persist the RDD so that each action reuses it instead of recomputing it.
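
In case it helps, here is a rough sketch of what that could look like from the driver side. It is written in Python only for illustration; the helper name run_actions_concurrently and the demo function are made up for this example and are not part of Spark. It assumes actions are submitted from separate driver threads against one cached RDD, as described further down the thread.

```python
from concurrent.futures import ThreadPoolExecutor


def run_actions_concurrently(rdd, actions):
    """Run each action (a callable taking the RDD) on a pool of driver threads.

    Hypothetical helper, not a Spark API. Results come back in the same
    order as `actions`.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(actions))) as pool:
        # Each submitted callable triggers its Spark action from a pool thread.
        futures = [pool.submit(action, rdd) for action in actions]
        # result() blocks until that action finishes and re-raises any failure.
        return [future.result() for future in futures]


def demo():
    """Hypothetical local demo; needs pyspark installed. Not called here."""
    from pyspark import SparkContext

    sc = SparkContext("local[4]", "concurrent-actions-demo")
    try:
        rdd = sc.parallelize(range(1000)).cache()
        rdd.count()  # materialize the cache before the concurrent actions
        return run_actions_concurrently(
            rdd,
            [
                lambda r: r.sum(),
                lambda r: r.filter(lambda x: x % 2 == 0).count(),
            ],
        )
    finally:
        sc.stop()
```

Calling demo() on a machine with pyspark installed should run both actions against the same cached RDD through one SparkContext. How the jobs actually share the executors depends on the scheduler settings, so treat this as a starting point rather than a recipe.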
On Jan 17, 2016 5:21 AM, "Mennour Rostom" <mennou...@gmail.com> wrote:

> Hi,
>
> Thank you all for your answers,
>
> If I understand correctly, actions (in my case foreach) can be run
> concurrently on the SAME RDD (which is logical because RDDs are
> read-only objects). However, I want to know whether the same workers are
> used for the concurrent analysis?
>
> Thank you
>
> 2016-01-15 21:11 GMT+01:00 Jakob Odersky <joder...@gmail.com>:
>
>> I stand corrected. How considerable are the benefits though? Will the
>> scheduler be able to dispatch jobs from both actions simultaneously (or on
>> a when-workers-become-available basis)?
>>
>> On 15 January 2016 at 11:44, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> we run multiple actions on the same (cached) rdd all the time, i guess
>>> in different threads indeed (it's in akka)
>>>
>>> On Fri, Jan 15, 2016 at 2:40 PM, Matei Zaharia <matei.zaha...@gmail.com>
>>> wrote:
>>>
>>>> RDDs actually are thread-safe, and quite a few applications use them
>>>> this way, e.g. the JDBC server.
>>>>
>>>> Matei
>>>>
>>>> On Jan 15, 2016, at 2:10 PM, Jakob Odersky <joder...@gmail.com> wrote:
>>>>
>>>> I don't think RDDs are threadsafe.
>>>> More fundamentally however, why would you want to run RDD actions in
>>>> parallel? The idea behind RDDs is to provide you with an abstraction for
>>>> computing parallel operations on distributed data. Even if you were to call
>>>> actions from several threads at once, the individual executors of your
>>>> spark environment would still have to perform operations sequentially.
>>>>
>>>> As an alternative, I would suggest restructuring your RDD
>>>> transformations to compute the required results in one single operation.
>>>>
>>>> On 15 January 2016 at 06:18, Jonathan Coveney <jcove...@gmail.com>
>>>> wrote:
>>>>
>>>>> Threads
>>>>>
>>>>>
>>>>> On Friday, January 15, 2016, Kira <mennou...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Can we run *simultaneous* actions on the *same RDD*? If yes, how can
>>>>>> this be done?
>>>>>>
>>>>>> Thank you,
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/simultaneous-actions-tp25977.html
>>>>
>>>>
>>>
>>
>