Subject: Re: is dataframe thread safe?
How about having one thread that updates and caches a DataFrame in memory
while other threads request this DataFrame? Is that thread-safe?
If you update the data, then you don't have the same DataFrame anymore. If
you don't do as Assaf did (caching and forcing evaluation of the DataFrame
before using it concurrently), you'll still get consistent and correct
results, but not necessarily efficiently. If
the
2017-02-13 9:02 GMT+01:00 Reynold Xin :
Yes, your use case should be fine. Multiple threads can transform the same
DataFrame in parallel, since they create different DataFrames.
To my understanding, all transformations are thread-safe because a DataFrame
is just an immutable description of the computation, so the case above is
all right. Just be careful with the actions.
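The point about immutability can be sketched with a toy model. This is not Spark's API: `Plan`, `transform`, `collect`, `keep_even`, and `double` are hypothetical names made up for illustration, and the only assumption carried over from the thread is that a transformation returns a new description instead of changing the shared one.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a DataFrame: an immutable list of steps."""
    source: Tuple[int, ...]
    steps: Tuple[Callable, ...] = ()

    def transform(self, step):
        # A "transformation" never mutates self; it returns a new Plan.
        return Plan(self.source, self.steps + (step,))

    def collect(self):
        # An "action" evaluates the recorded steps in order.
        rows = self.source
        for step in self.steps:
            rows = step(rows)
        return rows


def keep_even(rows):
    return tuple(r for r in rows if r % 2 == 0)


def double(rows):
    return tuple(r * 2 for r in rows)


base = Plan(source=tuple(range(6)))

# Two threads derive their own plans from the same shared base.
with ThreadPoolExecutor(max_workers=2) as pool:
    evens_future = pool.submit(lambda: base.transform(keep_even).collect())
    doubled_future = pool.submit(lambda: base.transform(double).collect())
    evens = evens_future.result()
    doubled = doubled_future.result()

print(evens)       # (0, 2, 4)
print(doubled)     # (0, 2, 4, 6, 8, 10)
print(base.steps)  # () : the shared base was never modified
```

Because neither thread can modify `base`, no locking is needed in this sketch. Operations that change shared state, such as the cache and unpersist calls discussed in this thread, are outside what this toy model covers.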
own overheads).
Therefore Sean’s answer is what I was looking for (and hoping for…)
Assaf
From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: Sunday, February 12, 2017 2:46 PM
To: Sean Owen
Cc: Mendelson, Assaf; user
Subject: Re: is dataframe thread safe?
I did not doubt that submitting several jobs from one application makes
sense. However, he wants to create threads
However, if I called f1 and f2 on different threads, then df2 could use
free resources f1 has not consumed, and the overall utilization would
improve.
Of course, I can do this only if the operations on the dataframe are
thread safe. For example, if I did a cache in f1 and an unpersist in f2, I
would get an inconsistent result. So my question is: what, if any, are the
legal operations to use on a dataframe so I could do the above?
Thanks,
Assaf.
From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: Sunday, February 12, 2017 10:39 AM
To: Mendelson, Assaf
Cc: user
Subject: Re: is dataframe thread safe?
I am not sure what you are trying to achieve here. Spark takes care of
executing the transformations in a distributed fashion. This means you must not
use threads; it does not make sense. Hence, you will not find documentation
about it.
> On 12 Feb 2017, at 09:06, Mendelson, Assaf
Hi,
I was wondering whether a DataFrame is considered thread-safe. I know the
SparkSession and SparkContext are thread-safe (and actually have tools to
manage jobs from different threads), but the question is: can I use the same
DataFrame in both threads?
The idea would be to create a dataframe in