[ https://issues.apache.org/jira/browse/SPARK-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177983#comment-14177983 ]
Apache Spark commented on SPARK-4031:
-------------------------------------

User 'shivaram' has created a pull request for this issue:
https://github.com/apache/spark/pull/2871

> Read broadcast variables on use
> -------------------------------
>
>                 Key: SPARK-4031
>                 URL: https://issues.apache.org/jira/browse/SPARK-4031
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager, Spark Core
>            Reporter: Shivaram Venkataraman
>            Assignee: Shivaram Venkataraman
>
> This is a proposal to change the broadcast variable implementations in Spark
> to only read values when they are used rather than on deserializing.
> This change will be very helpful (and in our use cases required) for complex
> applications which have a large number of broadcast variables. For example if
> broadcast variables are class members, they are captured in closures even
> when they are not used.
> We could also consider cleaning closures more aggressively, but that might be
> a more complex change.
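
The read-on-use idea in the description can be illustrated with a small standalone sketch. It is written in plain Python rather than Spark's Scala, and it is not the code from the pull request: `LazyBroadcast`, `BLOCK_STORE`, `fetch_block` and `Job` are hypothetical names invented for this illustration. It assumes a handle that serializes only an id and fetches its payload the first time `value` is accessed, so a broadcast that is captured along with its enclosing object but never used is never read.

```python
import pickle

# Hypothetical stand-in for a block store; this is not Spark's BlockManager.
BLOCK_STORE = {}
FETCH_LOG = []  # ids of the blocks that were actually read


def fetch_block(block_id):
    """Simulate an expensive read of a broadcast block."""
    FETCH_LOG.append(block_id)
    return BLOCK_STORE[block_id]


class LazyBroadcast:
    """Toy broadcast handle: serializes only its id, reads the value on first use."""

    def __init__(self, block_id, value):
        BLOCK_STORE[block_id] = value
        self.block_id = block_id
        self._value = value
        self._loaded = True

    def __getstate__(self):
        # Ship only the id, never the payload.
        return {"block_id": self.block_id}

    def __setstate__(self, state):
        # Deserializing does not touch the block store.
        self.block_id = state["block_id"]
        self._value = None
        self._loaded = False

    @property
    def value(self):
        # The read happens here, on first use.
        if not self._loaded:
            self._value = fetch_block(self.block_id)
            self._loaded = True
        return self._value


class Job:
    """Holds two broadcasts as members; the task uses only one of them."""

    def __init__(self):
        self.used = LazyBroadcast("b0", [1, 2, 3])
        self.unused = LazyBroadcast("b1", list(range(1000)))

    def task(self, x):
        return x + sum(self.used.value)


# Pickling the bound method drags the whole Job, and both broadcasts, along.
shipped = pickle.loads(pickle.dumps(Job().task))
fetched_after_deserialize = list(FETCH_LOG)

result = shipped(10)
fetched_after_use = list(FETCH_LOG)
```

In this sketch nothing is fetched when the task is deserialized, and running it reads only block `b0`; the unused member `b1` travels as an id and is never loaded.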