Hi,
Just need some advice.
   
   - When we have multiple Spark nodes running the code, under what conditions 
does a repartition make sense?
   - Can we repartition and cache the result, e.g. df = spark.sql("select from 
...").repartition(4).cache()? (see the sketch after this list)
   - If we choose repartition(4), will that repartition apply to all the nodes 
running the code, and how can one see that?

Thanks
