What about having one thread that updates and caches a dataframe in memory while other threads request that same dataframe? Is that thread safe?
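
To make the question concrete, here is a rough sketch of the pattern in plain Python. It does not use Spark at all: the tuple is only a stand-in for an immutable cached dataframe, and `SnapshotHolder`, `updater` and `reader` are hypothetical names invented for illustration. It assumes the updater publishes a complete new object rather than mutating the one the readers already hold.

```python
import threading


class SnapshotHolder:
    """Holds a reference to the current immutable snapshot (stand-in for a cached dataframe)."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._current = initial

    def get(self):
        # Readers only copy the reference; they never modify the snapshot.
        with self._lock:
            return self._current

    def replace(self, new_snapshot):
        # Swap in the new snapshot and hand back the old one, so the caller
        # can decide when it is safe to release it.
        with self._lock:
            old = self._current
            self._current = new_snapshot
        return old


def updater(holder, versions):
    for v in range(1, versions + 1):
        # Build the complete new snapshot first, then publish it in one step.
        holder.replace(tuple(range(v)))


def reader(holder, seen):
    for _ in range(100):
        snap = holder.get()
        # Every snapshot a reader observes should be internally consistent.
        seen.append(snap == tuple(range(len(snap))))


holder = SnapshotHolder(())
seen = []
threads = [threading.Thread(target=updater, args=(holder, 50))]
threads += [threading.Thread(target=reader, args=(holder, seen)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The part this sketch leaves open is exactly what I am unsure about in Spark: what happens to a reader that still holds the old reference if the updater releases (for example unpersists) it.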

2017-02-13 9:02 GMT+01:00 Reynold Xin <r...@databricks.com>:

> Yes your use case should be fine. Multiple threads can transform the same
> data frame in parallel since they create different data frames.
>
>
> On Sun, Feb 12, 2017 at 9:07 AM Mendelson, Assaf <assaf.mendel...@rsa.com>
> wrote:
>
>> Hi,
>>
>> I was wondering if dataframe is considered thread safe. I know the spark
>> session and spark context are thread safe (and actually have tools to
>> manage jobs from different threads) but the question is, can I use the same
>> dataframe in both threads.
>>
>> The idea would be to create a dataframe in the main thread and then in
>> two sub threads do different transformations and actions on it.
>>
>> I understand that some things might not be thread safe (e.g. if I
>> unpersist in one thread it would affect the other. Checkpointing would
>> cause similar issues), however, I can’t find any documentation as to what
>> operations (if any) are thread safe.
>>
>>
>>
>> Thanks,
>>
>> Assaf.
>>
>