Yes, your use case should be fine. Multiple threads can transform the same DataFrame in parallel, because each transformation returns a new DataFrame rather than modifying the shared one.

On Sun, Feb 12, 2017 at 9:07 AM Mendelson, Assaf <assaf.mendel...@rsa.com> wrote:

> Hi,
>
> I was wondering if dataframe is considered thread safe. I know the spark
> session and spark context are thread safe (and actually have tools to
> manage jobs from different threads) but the question is, can I use the same
> dataframe in both threads.
>
> The idea would be to create a dataframe in the main thread and then in two
> sub threads do different transformations and actions on it.
>
> I understand that some things might not be thread safe (e.g. if I
> unpersist in one thread it would affect the other. Checkpointing would
> cause similar issues), however, I can’t find any documentation as to what
> operations (if any) are thread safe.
>
> Thanks,
>
> Assaf.