Hey, Guys!

I am using Spark for an NGS (next-generation sequencing) data application.

In my case, I have to broadcast a very large dataset to each task.

However, there are several tasks (say 48) running on the CPU cores (also 48)
of the same node. Tasks that run on the same node could share a single copy
of the dataset, but Spark broadcasts it 48 times (if I understand correctly).
Is there a way to broadcast just one copy to each node and have it shared by
all tasks running on that node?

Much appreciated!

Best,
huanglr
