Re: Application dies, Driver keeps on running
Ah, interesting. I stopped the Spark context and called System.exit() from the driver with supervise ON, and that seemed to restart the app when it gets killed.

On Mon, May 15, 2017 at 5:01 PM, map reduced wrote:
> Hi,
> I was looking in the wrong place for the logs; I do see some errors there:
>
> "Remote RPC client disassociated. Likely due to containers exceeding
> thresholds, or network issues. Check driver logs for WARN messages."
>
> logger="org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend",
> message="Disconnected from Spark cluster! Waiting for reconnection..."
>
> So what is the best way to deal with this situation? I would rather have
> the driver killed along with it. Is there a way to achieve that?
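A minimal sketch of that shutdown path, assuming a Spark Streaming app whose StreamingContext is `ssc` and a driver submitted with `--supervise` (the helper name is illustrative, not from this thread):

```scala
// Sketch only: requires spark-streaming on the classpath.
import org.apache.spark.streaming.StreamingContext

// Hypothetical helper: stop everything and exit non-zero so that a
// supervised driver is restarted by the standalone master.
def stopAndExit(ssc: StreamingContext): Unit = {
  // Stop the streaming context and the underlying SparkContext.
  ssc.stop(stopSparkContext = true, stopGracefully = false)
  // A non-zero exit code signals failure; with --supervise, the
  // master then relaunches the driver.
  System.exit(1)
}
```

This relies on `--supervise`, which only applies to drivers launched in cluster deploy mode on a standalone master.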
Re: Application dies, Driver keeps on running
Hi,
I was looking in the wrong place for the logs; I do see some errors there:

"Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages."

logger="org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend", message="Disconnected from Spark cluster! Waiting for reconnection..."

So what is the best way to deal with this situation? I would rather have the driver killed along with it. Is there a way to achieve that?

On Mon, May 15, 2017 at 3:05 PM, Shixiong(Ryan) Zhu wrote:
> So you are using `client` mode, right? If so, the Spark cluster doesn't
> manage the driver for you. Did you see any error logs in the driver?
Re: Application dies, Driver keeps on running
So you are using `client` mode, right? If so, the Spark cluster doesn't manage the driver for you. Did you see any error logs in the driver?

On Mon, May 15, 2017 at 3:01 PM, map reduced wrote:
> Hi,
>
> Setup: standalone cluster with 32 workers, 1 master.
> I am running a long-running Spark Streaming job (read from Kafka ->
> process -> send to an HTTP endpoint) which should ideally never stop.
>
> I have 2 questions:
> 1) I have seen that sometimes the driver is still running but the
> application is marked as *Finished*. *Any idea why this happens, or any
> way to debug it?* Sometimes the issue arises after running for, say, 2-3
> days (or 4-5 days; the timeframe is random), and I am not sure what is
> causing it. Nothing in the logs suggests failures or exceptions.
>
> 2) Is there a way for the driver to kill itself instead of keeping on
> running without any application to drive?
>
> Thanks,
> KP
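To have the standalone cluster manage (and restart) the driver rather than leaving it to the client, one option is to submit in cluster deploy mode with supervision. A sketch of the submit command (master host, class, and jar path are placeholders):

```shell
# Hypothetical submit command: --deploy-mode cluster runs the driver on a
# worker, and --supervise tells the master to restart it on non-zero exit.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.StreamingApp \
  /path/to/streaming-app.jar
```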
Application dies, Driver keeps on running
Hi,

Setup: standalone cluster with 32 workers, 1 master.
I am running a long-running Spark Streaming job (read from Kafka -> process -> send to an HTTP endpoint) which should ideally never stop.

I have 2 questions:
1) I have seen that sometimes the driver is still running but the application is marked as *Finished*. *Any idea why this happens, or any way to debug it?* Sometimes the issue arises after running for, say, 2-3 days (or 4-5 days; the timeframe is random), and I am not sure what is causing it. Nothing in the logs suggests failures or exceptions.

2) Is there a way for the driver to kill itself instead of keeping on running without any application to drive?

Thanks,
KP
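For question 2, one possible approach (a sketch under assumptions, not something confirmed in this thread) is to register a SparkListener on the driver and terminate the JVM when the application ends; note it is uncertain whether this event fires in the disassociation scenario described above:

```scala
// Sketch only: requires spark-core on the classpath; `sc` is the
// application's SparkContext.
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

// Hypothetical helper: make the driver exit when Spark reports the
// application as ended, instead of idling with nothing to drive.
def exitWhenApplicationEnds(sc: SparkContext): Unit = {
  sc.addSparkListener(new SparkListener {
    override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
      // Kill the driver JVM; non-zero so supervision (if any) restarts it.
      System.exit(1)
    }
  })
}
```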