Re: is dataframe thread safe?

2017-02-15 Thread vincent gromakowski
user <user@spark.apache.org>; "Mendelson, Assaf" <assaf.mendel...@rsa.com>; > *Subject:* Re: is dataframe thread safe? > > How about having a thread that update and cache a dataframe in-memory next > to other threads requesting this dataframe, is it thread saf

Re: is dataframe thread safe?

2017-02-15 Thread ??????????
ndelson, Assaf"<assaf.mendel...@rsa.com>; Subject: Re: is dataframe thread safe? How about having a thread that update and cache a dataframe in-memory next to other threads requesting this dataframe, is it thread safe ? 2017-02-13 9:02 GMT+01:00 Reynold Xin <r...@databricks.com>: Yes

Re: is dataframe thread safe?

2017-02-13 Thread Mark Hamstra
If you update the data, then you don't have the same DataFrame anymore. If you don't do as Assaf did (caching and forcing evaluation of the DataFrame before using that DataFrame concurrently), then you'll still get consistent and correct results, but not necessarily efficient ones. If the
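
To make the "correct but not necessarily efficient" point concrete, here is a minimal sketch in plain Python. It is only a toy model, not Spark: `ToyLazyFrame`, `cache_and_materialize`, and `run_concurrently` are hypothetical names invented for this illustration. The assumption is that an uncached lazy frame is recomputed by every thread that runs an action on it, while a frame that was cached and evaluated up front is computed once and then shared.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyLazyFrame:
    """Hypothetical stand-in for a lazily evaluated frame (NOT a Spark class)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._cached = None          # filled only by cache_and_materialize()
        self.evaluations = 0         # how many times the full computation ran
        self._lock = threading.Lock()

    def _compute(self):
        # Stand-in for an expensive job; the lock only protects the counter.
        with self._lock:
            self.evaluations += 1
        return [r * r for r in self._rows]

    def cache_and_materialize(self):
        # Toy analogue of caching and then forcing evaluation before sharing.
        self._cached = self._compute()
        return self

    def collect(self):
        # "Action": reuse the cached result if there is one, else recompute.
        if self._cached is not None:
            return list(self._cached)
        return self._compute()


def run_concurrently(frame, n_threads=4):
    # Run the same action on the same frame from several threads.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda _: frame.collect(), range(n_threads)))


uncached = ToyLazyFrame(range(4))
uncached_results = run_concurrently(uncached)

cached = ToyLazyFrame(range(4)).cache_and_materialize()
cached_results = run_concurrently(cached)
```

In this toy, both variants return the same rows to every thread; the difference is only in `evaluations`, which counts how often the underlying work was repeated.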

Re: is dataframe thread safe?

2017-02-13 Thread vincent gromakowski
How about having a thread that updates and caches a dataframe in memory alongside other threads requesting this dataframe: is that thread safe? 2017-02-13 9:02 GMT+01:00 Reynold Xin: > Yes your use case should be fine. Multiple threads can transform the same > data frame in

Re: is dataframe thread safe?

2017-02-13 Thread Reynold Xin
Yes, your use case should be fine. Multiple threads can transform the same data frame in parallel, since each transformation creates a different data frame. On Sun, Feb 12, 2017 at 9:07 AM Mendelson, Assaf wrote: > Hi, > > I was wondering if dataframe is considered thread safe. I know

Re: is dataframe thread safe?

2017-02-13 Thread 任弘迪
As I understand it, all transformations are thread-safe because a dataframe is just an immutable description of the calculation, so the case above is all right. Just be careful with the actions. On Sun, Feb 12, 2017 at 4:06 PM, Mendelson, Assaf wrote: > Hi, > > I
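
The "immutable description of the calculation" idea can be sketched without Spark at all. The following is a toy model in plain Python, assuming nothing about Spark's internals: `ToyFrame`, `transform`, `collect`, `f1`, and `f2` are hypothetical names made up for this example. Each transformation returns a new plan object and never modifies the shared one, which is why two threads can build on the same base without interfering.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical immutable plan: source rows plus a tuple of steps."""

    source: Tuple[int, ...]
    steps: Tuple[Callable[[int], int], ...] = ()

    def transform(self, fn):
        # "Transformation": builds a NEW plan, leaving self untouched.
        return ToyFrame(self.source, self.steps + (fn,))

    def collect(self):
        # "Action": actually runs the recorded steps over the source rows.
        out = []
        for x in self.source:
            for step in self.steps:
                x = step(x)
            out.append(x)
        return out


base = ToyFrame(tuple(range(5)))


def f1(df):
    # One thread derives its own frame from the shared base.
    return df.transform(lambda x: x * 2).collect()


def f2(df):
    # Another thread derives a different frame from the same base.
    return df.transform(lambda x: x + 100).collect()


with ThreadPoolExecutor(max_workers=2) as pool:
    fut1 = pool.submit(f1, base)
    fut2 = pool.submit(f2, base)
    doubled, shifted = fut1.result(), fut2.result()
```

After both threads finish, `base.steps` is still empty, because neither thread changed the shared object. The caution about actions still applies: this toy says nothing about operations that change shared state, such as the cache/unpersist pair raised in the original question.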

Re: is dataframe thread safe?

2017-02-12 Thread Timur Shenkao
own overheads). > > > > Therefore Sean’s answer is what I was looking for (and hoping for…) > > Assaf > > > > *From:* Jörn Franke [mailto:jornfra...@gmail.com] > *Sent:* Sunday, February 12, 2017 2:46 PM > *To:* Sean Owen > *Cc:* Mendelson, Assaf; user

RE: is dataframe thread safe?

2017-02-12 Thread Mendelson, Assaf
for…) Assaf From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: Sunday, February 12, 2017 2:46 PM To: Sean Owen Cc: Mendelson, Assaf; user Subject: Re: is dataframe thread safe? I did not doubt that submitting several jobs from one application makes sense. However, he wants to create threads

Re: is dataframe thread safe?

2017-02-12 Thread Jörn Franke
f1 and an unpersist in f2 I >> would get an inconsistent result. So my question is, what, if any are the >> legal operations to use on a dataframe so I could do the above. >> >> Thanks, >> Assaf. >> >> From: Jörn Franke [mailto:jornfra

Re: is dataframe thread safe?

2017-02-12 Thread Jörn Franke
>>> >>> >>> However, if I would call f1 and f2 on different threads, then df2 can use >>> free resources f1 has not consumed and the overall utilization would >>> improve. >>> >>> >>> >>> Of course, I ca

Re: is dataframe thread safe?

2017-02-12 Thread Yan Facai
>> Of course, I can do this only if the operations on the dataframe are >> thread safe. For example, if I would do a cache in f1 and an unpersist in >> f2 I would get an inconsistent result. So my question is, what, if any are >> the legal operations to use on a dataframe so I cou

Re: is dataframe thread safe?

2017-02-12 Thread Sean Owen
ent result. So my question is, what, if any are > the legal operations to use on a dataframe so I could do the above. > > > > Thanks, > > Assaf. > > > > *From:* Jörn Franke [mailto:jornfra...@gmail.com <jornfra...@gmail.com>] > *Sent:* Sund

Re: is dataframe thread safe?

2017-02-12 Thread Jörn Franke
sistent result. So my question is, what, if any are the legal > operations to use on a dataframe so I could do the above. > > Thanks, > Assaf. > > From: Jörn Franke [mailto:jornfra...@gmail.com] > Sent: Sunday, February 12, 2017 10:39 AM > To: Men

RE: is dataframe thread safe?

2017-02-12 Thread Mendelson, Assaf
To: Mendelson, Assaf Cc: user Subject: Re: is dataframe thread safe? I am not sure what you are trying to achieve here. Spark is taking care of executing the transformations in a distributed fashion. This means you must not use threads - it does not make sense. Hence, you do not find documentation about

Re: is dataframe thread safe?

2017-02-12 Thread Jörn Franke
I am not sure what you are trying to achieve here. Spark is taking care of executing the transformations in a distributed fashion. This means you must not use threads - it does not make sense. Hence, you do not find documentation about it. > On 12 Feb 2017, at 09:06, Mendelson, Assaf

is dataframe thread safe?

2017-02-12 Thread Mendelson, Assaf
Hi, I was wondering if dataframe is considered thread safe. I know the spark session and spark context are thread safe (and actually have tools to manage jobs from different threads), but the question is: can I use the same dataframe in both threads? The idea would be to create a dataframe in