Are you using the same CSV twice?

Sent from iPhone
> On 7 Dec 2020, at 18:32, Amit Sharma <resolve...@gmail.com> wrote:
>
> Hi All,
>
> I am using caching in my code. I have DataFrames like this:
>
> val DF1 = read csv
> val DF2 = DF1.groupBy().agg().select(.....)
>
> val DF3 = read csv .join(DF1).join(DF2)
> DF3.save
>
> If I do not cache DF2 or DF1, it takes longer. But I am running only one
> action, so why do I need to cache?
>
> Thanks,
> Amit

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
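
For reference, here is a minimal runnable sketch of the pipeline described in the quoted message, showing why a single action can still benefit from caching. DF1 appears twice in DF3's lineage (once directly in the join and once through DF2), and Spark generally plans each branch independently, so without a cache the first CSV may be read and parsed once per branch. The file paths, the `key` column, the `count` aggregation, and the Parquet output are all hypothetical placeholders; the original message does not say what they are. The sketch also assumes the two CSVs share no column names other than `key`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()

    // Hypothetical input; cache() marks DF1 for reuse across both branches below.
    val df1 = spark.read.option("header", "true").csv("data/first.csv").cache()

    // Branch 1: an aggregate derived from DF1 (hypothetical aggregation).
    val df2 = df1.groupBy("key").agg(count("*").as("n")).select("key", "n")

    // Branch 2: DF1 is joined directly, so it is referenced twice in DF3's plan.
    val df3 = spark.read.option("header", "true").csv("data/second.csv")
      .join(df1, Seq("key"))
      .join(df2, Seq("key"))

    // Print the physical plan to see how DF1 is scanned in each branch.
    df3.explain()

    // The single action: both branches are evaluated here.
    df3.write.mode("overwrite").parquet("out/joined")

    df1.unpersist()
    spark.stop()
  }
}
```

Comparing the output of `df3.explain()` with and without the `.cache()` call is one way to check this: with the cache in place, the plan should show an in-memory scan for DF1 in both branches instead of two separate file scans.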