Is the OOM happening in the Driver JVM or in one of the Executor JVMs? How
much memory is each JVM configured with?

How large is the data you're trying to broadcast? If it's large enough, you
may want to consider just persisting the data to distributed storage (like
HDFS) and reading it back in through the normal RDD read methods like
sc.textFile().
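For example, a rough sketch of that approach (untested; the HDFS path and
the `largeArray` value are placeholders for whatever you were broadcasting):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AvoidBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("avoid-broadcast"))

    // Placeholder for the large collection you were trying to broadcast.
    val largeArray: Array[String] = ???

    // 1. Write the data to distributed storage once, from the driver,
    //    instead of holding the whole thing in every JVM via Broadcast.
    sc.parallelize(largeArray).saveAsTextFile("hdfs:///tmp/large-data")

    // 2. Each executor then reads only its own partitions back from HDFS,
    //    so no single JVM needs to fit the entire collection in memory.
    val data = sc.textFile("hdfs:///tmp/large-data")

    // ... join/cogroup against `data` as needed ...

    sc.stop()
  }
}
```

You can then join or cogroup your other RDDs against `data` rather than
looking values up in a broadcast variable on each node.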

Maybe someone else can comment on the largest collection sizes that are
recommended for use with Broadcast...



On Thu, Dec 11, 2014 at 10:14 AM, ll <duy.huynh....@gmail.com> wrote:

> hi.  i'm running into this OutOfMemory issue when i'm broadcasting a large
> array.  what is the best way to handle this?
>
> should i split the array into smaller arrays before broadcasting, and then
> combining them locally at each node?
>
> thanks!
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/broadcast-OutOfMemoryError-tp20633.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
