Hi Sandy,

Is there any resolution for the YARN failures? It's a blocker for running Spark on top of YARN.

Thanks.
Deb

On Tue, Aug 19, 2014 at 11:29 PM, Xiangrui Meng <men...@gmail.com> wrote:

> Hi Deb,
>
> I think this may be the same issue as described in
> https://issues.apache.org/jira/browse/SPARK-2121. We know that the
> container got killed by YARN because it used much more memory than it
> requested, but we haven't figured out the root cause yet.
>
> +Sandy
>
> Best,
> Xiangrui
>
> On Tue, Aug 19, 2014 at 8:51 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> > Hi,
> >
> > During the 4th ALS iteration, I am noticing that one of the executors gets
> > disconnected:
> >
> > 14/08/19 23:40:00 ERROR network.ConnectionManager: Corresponding SendingConnectionManagerId not found
> >
> > 14/08/19 23:40:00 INFO cluster.YarnClientSchedulerBackend: Executor 5 disconnected, so removing it
> >
> > 14/08/19 23:40:00 ERROR cluster.YarnClientClusterScheduler: Lost executor 5 on tblpmidn42adv-hdp.tdc.vzwcorp.com: remote Akka client disassociated
> >
> > 14/08/19 23:40:00 INFO scheduler.DAGScheduler: Executor lost: 5 (epoch 12)
> >
> > Any idea if this is a bug related to Akka on YARN?
> >
> > I am using master.
> >
> > Thanks.
> > Deb