Re: Yarn Driver OOME (Java heap space) when executors request map output locations

2014-09-09 Thread Kostas Sakellis
Hey,

If you are interested in more details there is also a thread about this
issue here:
http://apache-spark-developers-list.1001551.n3.nabble.com/Eliminate-copy-while-sending-data-any-Akka-experts-here-td7127.html

Kostas

On Tue, Sep 9, 2014 at 3:01 PM, jbeynon wrote:

> Thanks Marcelo, that looks like the same thing. I'll follow the Jira ticket
> for updates.


Re: Yarn Driver OOME (Java heap space) when executors request map output locations

2014-09-09 Thread jbeynon
Thanks Marcelo, that looks like the same thing. I'll follow the Jira ticket
for updates.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Yarn-Driver-OOME-Java-heap-space-when-executors-request-map-output-locations-tp13827p13829.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Yarn Driver OOME (Java heap space) when executors request map output locations

2014-09-09 Thread Marcelo Vanzin
Hi,

Yes, this is a problem, and I'm not aware of any simple workarounds
(or complex ones, for that matter). There are people working to fix
this; you can follow progress here:
https://issues.apache.org/jira/browse/SPARK-1239

On Tue, Sep 9, 2014 at 2:54 PM, jbeynon wrote:
> I'm running on YARN with relatively small instances that have 4 GB of memory.
> I'm not caching any data, but when the map stage ends and shuffling begins, all
> of the executors request the map output locations at the same time, which seems
> to kill the driver when the number of executors is turned up.
>
> For example, the "size of output statuses" is about 10 MB, and with 500
> executors the driver appears to be making 500 copies of this data (about 5 GB
> in total) to send out, and it runs out of memory. When I turn down the number
> of executors, everything runs fine.
>
> Has anyone else run into this? Maybe I'm misunderstanding the underlying
> cause. I don't have a copy of the stack trace handy but can recreate it if
> necessary. It was somewhere in the <init> for HeapByteBuffer. Any advice
> would be helpful.



-- 
Marcelo




Yarn Driver OOME (Java heap space) when executors request map output locations

2014-09-09 Thread jbeynon
I'm running on YARN with relatively small instances that have 4 GB of memory.
I'm not caching any data, but when the map stage ends and shuffling begins, all
of the executors request the map output locations at the same time, which seems
to kill the driver when the number of executors is turned up.

For example, the "size of output statuses" is about 10 MB, and with 500
executors the driver appears to be making 500 copies of this data (about 5 GB
in total) to send out, and it runs out of memory. When I turn down the number
of executors, everything runs fine.
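
As a rough sanity check of that arithmetic, here is a minimal Python sketch. The
function names are hypothetical, and it assumes the driver holds one full private
copy of the serialized statuses per requesting executor, which is only how it
appears to behave from the outside. It also treats the whole 4 GB instance as
driver heap, which overstates what is really available.

```python
# Back-of-the-envelope estimate only; the sizes come from the message above.
MB = 1024 ** 2
GB = 1024 ** 3


def driver_copy_bytes(status_bytes: int, num_executors: int) -> int:
    """Bytes held if the driver keeps one private copy per requesting executor.

    Assumption: every executor asks at the same time and no copy is shared.
    """
    return status_bytes * num_executors


def max_concurrent_copies(heap_bytes: int, status_bytes: int) -> int:
    """Upper bound on how many copies fit in the heap, ignoring all other usage."""
    return heap_bytes // status_bytes


if __name__ == "__main__":
    total = driver_copy_bytes(10 * MB, 500)
    print(f"copies for 500 executors: {total / GB:.2f} GiB")
    print(f"copies that fit in 4 GiB: {max_concurrent_copies(4 * GB, 10 * MB)}")
```

Under these assumptions the copies alone exceed the instance's memory well before
500 executors, which would be consistent with the job running fine at lower
executor counts.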

Has anyone else run into this? Maybe I'm misunderstanding the underlying
cause. I don't have a copy of the stack trace handy but can recreate it if
necessary. It was somewhere in the <init> for HeapByteBuffer. Any advice
would be helpful.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Yarn-Driver-OOME-Java-heap-space-when-executors-request-map-output-locations-tp13827.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.