[ https://issues.apache.org/jira/browse/SPARK-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605898#comment-15605898 ]

Sean Owen commented on SPARK-18098:
-----------------------------------

It shouldn't work that way: the value is loaded in a lazy val, at least. I can 
imagine cases where you'd end up with several copies per executor, but those 
aren't the normal use cases. Can you say more about what you're executing, or 
what you're seeing?
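The lazy-val point can be illustrated with a small thread-safety sketch. This is not Spark's actual TorrentBroadcast — `FakeBroadcast` is a hypothetical stand-in — but it shows why a payload held in a `lazy val` on one shared object is fetched once, no matter how many task threads read it:

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a broadcast wrapper. Scala guarantees a lazy val
// is initialized at most once per *object instance*, even under concurrent
// access, so all threads sharing this object see a single payload.
class FakeBroadcast(counter: AtomicInteger) {
  lazy val value: Array[Byte] = {
    counter.incrementAndGet() // count how many times we actually "fetch"
    new Array[Byte](1024)
  }
}

object LazyValDemo {
  def main(args: Array[String]): Unit = {
    val fetches = new AtomicInteger(0)
    val bc = new FakeBroadcast(fetches)

    // Seven "cores" hitting the same broadcast object at once.
    val threads = (1 to 7).map(_ => new Thread(() => { bc.value; () }))
    threads.foreach(_.start())
    threads.foreach(_.join())

    // One shared object => one initialization, regardless of thread count.
    println(fetches.get()) // 1
  }
}
```

The corollary is that multiple copies per executor would only appear if each task ended up with its *own* wrapper object rather than a shared one.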

> Broadcast creates 1 instance / core, not 1 instance / executor
> --------------------------------------------------------------
>
>                 Key: SPARK-18098
>                 URL: https://issues.apache.org/jira/browse/SPARK-18098
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.1
>            Reporter: Anthony Sciola
>
> I've created my Spark executors with $SPARK_HOME/sbin/start-slave.sh -c 7 -m 55g.
> When I run a job that broadcasts data, each *thread* appears to request and 
> receive its own copy of the broadcast object, not each *executor*. This means 
> I need 7x as much memory for the broadcast item, because I have 7 cores.
> The problem appears to be a lack of synchronization around requesting 
> broadcast items.
> The only workaround I've come up with is writing the data out to HDFS, 
> broadcasting the paths, and doing a synchronized load from HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
