Are you using the same CSV twice?
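
If DF1 really does feed both the aggregation and the join, that would probably explain the timing. Even with a single action, each reference to DF1 in the plan can trigger its own scan of the CSV unless DF1 is cached. Below is a minimal sketch of that shape, not your actual job: the file contents, the column names (key, value, label), and the Parquet output are all assumptions made up for illustration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical inputs, written to a temp directory so the sketch runs on its own.
    val dir = Files.createTempDirectory("cache-sketch")
    val firstCsv = dir.resolve("first.csv")
    val secondCsv = dir.resolve("second.csv")
    Files.write(firstCsv, "key,value\n1,10\n1,5\n2,7\n".getBytes("UTF-8"))
    Files.write(secondCsv, "key,label\n1,a\n2,b\n".getBytes("UTF-8"))

    def readCsv(path: String) =
      spark.read.option("header", "true").option("inferSchema", "true").csv(path)

    // DF1 is referenced twice below (directly and through DF2), so mark it for caching.
    // cache() is lazy: the data is materialized the first time the action needs it.
    val df1 = readCsv(firstCsv.toString).cache()

    // Assumed aggregation standing in for groupBy().agg().select(.....).
    val df2 = df1.groupBy("key").agg(sum("value").as("total"))

    // Second CSV joined with both DF1 and DF2, as in the original message.
    val df3 = readCsv(secondCsv.toString).join(df1, "key").join(df2, "key")

    // The single action. Without the cache() above, DF1's scan may run once per reference.
    df3.write.mode("overwrite").parquet(dir.resolve("out").toString)

    df1.unpersist()
    spark.stop()
  }
}
```

Calling df3.explain() before the write, with and without the cache() call, should show whether the plan scans the first CSV more than once.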

Sent from iPhone

> On 7 Dec 2020, at 18:32, Amit Sharma <resolve...@gmail.com> wrote:
> 
> 
> Hi all, I am using caching in my code. I have DataFrames like
> val DF1 = read csv.
> val DF2 = DF1.groupBy().agg().select(.....)
> 
> val DF3 = read csv .join(DF1).join(DF2)
> DF3.save.
> 
> If I do not cache DF2 or DF1, the job takes longer. But I am performing only
> one action, so why do I need to cache?
> 
> Thanks
> Amit
> 
> 

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
