Akhil, I have checked the logs. There isn't any clue as to why the 5 receivers failed.
That's why I am assuming that receiver failures will be a common issue, and that we need a way to detect this kind of failure and fail over.

Thanks

On Mon, Mar 16, 2015 at 3:17 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> You need to figure out why the receivers failed in the first place. Look
> in your worker logs and see what really happened. When you run a streaming
> job continuously for a long period, there will usually be a lot of logs (you
> can enable log rotation etc.), and if you are doing groupBy, join, or similar
> operations, there will be a lot of shuffle data. So you need
> to check the worker logs and see what happened (whether the disk is full, etc.).
> We have streaming pipelines running for weeks without any issues.
>
> Thanks
> Best Regards
>
> On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang <yangjun...@gmail.com> wrote:
>
>> Guys,
>>
>> We have a project built on Spark Streaming.
>>
>> We use Kafka as the input stream, and create 5 receivers.
>>
>> After this application had run for around 90 hours, all 5 receivers failed
>> for unknown reasons.
>>
>> As I understand it, a Spark Streaming receiver is not guaranteed to
>> recover from faults automatically.
>>
>> So I want to figure out a way to do fault recovery to deal with
>> receiver failure.
>>
>> There is a JIRA comment that mentions using StreamingListener to monitor the
>> status of receivers:
>>
>> https://issues.apache.org/jira/browse/SPARK-2381?focusedCommentId=14056836&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14056836
>>
>> However, I haven't found any public documentation on how to do this.
>>
>> Has anyone met the same issue and dealt with it?
>>
>> Our environment:
>> Spark 1.3.0
>> Dual Master Configuration
>> Kafka 0.8.2
>>
>> Thanks
>>
>> --
>> yangjun...@gmail.com
>> http://hi.baidu.com/yjpro
>>
>
>

--
yangjun...@gmail.com
http://hi.baidu.com/yjpro
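
To make the "detect the failure and fail over" idea above concrete, here is a minimal sketch of the bookkeeping such a monitor might do. It is plain Python with no Spark dependency and is not taken from the JIRA comment. Spark's Scala StreamingListener trait does expose onReceiverStarted, onReceiverError and onReceiverStopped callbacks (registered through StreamingContext.addStreamingListener), and the assumption here is that each callback would forward into a tracker like this one. The names ReceiverMonitor, min_active and on_failover are hypothetical, and so is the policy of treating only "stopped" events as a receiver being down.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ReceiverMonitor:
    """Hypothetical tracker: counts live receivers and fires a callback
    once when fewer than `min_active` remain."""
    expected_receivers: int
    min_active: int
    on_failover: Callable[[str], None]
    active: Dict[int, str] = field(default_factory=dict)      # stream id -> location
    last_error: Dict[int, str] = field(default_factory=dict)  # stream id -> message
    failover_triggered: bool = False

    def receiver_started(self, stream_id: int, location: str) -> None:
        # A (re)started receiver is live again; forget its previous error.
        self.active[stream_id] = location
        self.last_error.pop(stream_id, None)
        if len(self.active) >= self.min_active:
            self.failover_triggered = False  # re-arm the alert

    def receiver_error(self, stream_id: int, message: str) -> None:
        # Assumption: an error report alone does not mean the receiver is
        # down, so it is only recorded for later diagnosis.
        self.last_error[stream_id] = message

    def receiver_stopped(self, stream_id: int) -> None:
        self.active.pop(stream_id, None)
        if len(self.active) < self.min_active and not self.failover_triggered:
            self.failover_triggered = True
            self.on_failover(
                f"only {len(self.active)} of {self.expected_receivers} receivers active"
            )


if __name__ == "__main__":
    alerts = []
    monitor = ReceiverMonitor(expected_receivers=5, min_active=3,
                              on_failover=alerts.append)
    for stream_id in range(5):
        monitor.receiver_started(stream_id, f"worker-{stream_id}")
    for stream_id in range(5):
        monitor.receiver_stopped(stream_id)
    print(alerts)
```

What on_failover actually does (restart the streaming context, page an operator, and so on) is left open on purpose, since the thread does not settle on a recovery strategy.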