Hi,

Just a quick clarification question: from what I understand, the blocks in a batch 
together form a single RDD, which is partitioned (usually with the 
HashPartitioner) across multiple tasks. First, is this correct? Second, since the 
partitioner is invoked every time a new task is created, is the same RDD/batch 
re-partitioned each time a new task is instantiated?
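For context, here is a minimal Python sketch of the partitioning rule the question is about. Spark's actual HashPartitioner is Scala and computes nonNegativeMod(key.hashCode, numPartitions); Python's built-in hash() stands in for Java's hashCode here, so the exact partition indices will differ, but the point is that the mapping from key to partition is a pure function and therefore deterministic across re-evaluations:

```python
def hash_partition(key, num_partitions):
    # Mirrors the shape of Spark's HashPartitioner:
    # nonNegativeMod(key.hashCode, numPartitions).
    # hash() is a stand-in for Java's hashCode.
    h = hash(key)
    return ((h % num_partitions) + num_partitions) % num_partitions

# Each key maps to a fixed partition index in [0, num_partitions);
# calling the partitioner again for a new task yields the same index,
# so records are not moved between partitions.
for key in ["a", "b", "c"]:
    print(key, hash_partition(key, 4))
```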


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org