Hi,

We need to monitor a Spark Streaming job, detect when it has been failing for
the last 5 minutes, and restart it accordingly. In most cases our Spark
Streaming job with the Kafka direct approach fails with leader-lost errors,
or offsets-not-found errors for a partition. What is the most effective way
to monitor and identify that the Streaming job has been failing with an
error? The default monitoring provided by Spark does not seem to cover
checking whether the job has been failing for a specific period of time, or
am I missing something and this feature is already available?
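For context, here is a minimal sketch of the kind of check we have in mind (the names are illustrative, not an existing API; in practice the failure/success signal would presumably come from something like a StreamingListener callback or by polling Spark's monitoring REST endpoints):

```python
import time


class FailureWindowMonitor:
    """Tracks batch outcomes and flags a restart only when failures
    have persisted continuously for a full window (e.g. 5 minutes)."""

    def __init__(self, window_secs=300, clock=time.time):
        self.window_secs = window_secs
        self.clock = clock                # injectable for testing
        self.first_failure = None         # start of the current failure run

    def record_success(self):
        # Any successful batch resets the failure window.
        self.first_failure = None

    def record_failure(self):
        # Remember only the start of an uninterrupted run of failures.
        if self.first_failure is None:
            self.first_failure = self.clock()

    def should_restart(self):
        # Restart only if failures have spanned the whole window
        # with no intervening success.
        return (self.first_failure is not None
                and self.clock() - self.first_failure >= self.window_secs)
```

The restart action itself would then be external to this logic, e.g. a supervisor script that kills and resubmits the job when `should_restart()` returns true.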

Thanks,
Swetha



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Effective-ways-monitor-and-identify-that-a-Streaming-job-has-been-failing-for-the-last-5-minutes-tp25536.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
