Updating a dataframe returns a NEW dataframe, like an RDD, right?

---Original---
From: "vincent gromakowski"<vincent.gromakow...@gmail.com>
Date: 2017/2/14 01:15:35
To: "Reynold Xin"<r...@databricks.com>;
Cc: "user"<user@spark.apache.org>;"Mendelson, Assaf"<assaf.mendel...@rsa.com>;
Subject: Re: is dataframe thread safe?


How about having one thread that updates and caches a dataframe in memory
while other threads request this dataframe? Is that thread safe?

2017-02-13 9:02 GMT+01:00 Reynold Xin <r...@databricks.com>:
Yes, your use case should be fine. Multiple threads can transform the same data
frame in parallel since they create different data frames.
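
To illustrate the point above, here is a minimal sketch in plain Python. It is
not Spark code: the `Frame` class is a hypothetical immutable stand-in for a
dataframe, and `evens` and `squares` are made-up transformations. It only
shows the general idea that when every transformation returns a new object and
leaves its input untouched, two threads can safely work from the same
starting object.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical immutable stand-in for a dataframe (not a Spark class)."""
    rows: Tuple[int, ...]

    def filter(self, pred: Callable[[int], bool]) -> "Frame":
        # Builds and returns a new Frame; self is never modified.
        return Frame(tuple(r for r in self.rows if pred(r)))

    def map(self, fn: Callable[[int], int]) -> "Frame":
        # Same pattern: the result is a separate object.
        return Frame(tuple(fn(r) for r in self.rows))


# Created once in the main thread and shared with both workers.
base = Frame(tuple(range(10)))


def evens(df: Frame) -> Frame:
    return df.filter(lambda r: r % 2 == 0)


def squares(df: Frame) -> Frame:
    return df.map(lambda r: r * r)


# Each worker derives its own Frame from the shared, unchanged base.
with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(evens, base)
    f2 = pool.submit(squares, base)
    even_frame, square_frame = f1.result(), f2.result()
```

Neither worker can observe the other's result, because nothing writes to
`base`. Operations that change shared state (the original question mentions
unpersist and checkpointing as examples) fall outside this pattern.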



On Sun, Feb 12, 2017 at 9:07 AM Mendelson, Assaf <assaf.mendel...@rsa.com> 
wrote:

   
Hi,
 
I was wondering if a dataframe is considered thread safe. I know the Spark
session and Spark context are thread safe (and actually have tools to manage
jobs from different threads), but the question is: can I use the same dataframe
in both threads?
 
The idea would be to create a dataframe in the main thread and then in two sub 
threads do different transformations and actions on it.
 
I understand that some things might not be thread safe (e.g. if I unpersist in
one thread it would affect the other, and checkpointing would cause similar
issues). However, I can't find any documentation as to what operations (if
any) are thread safe.
 
 
Thanks,
 
 Assaf.
