Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-15 Thread legolasluk
From: "Tathagata Das" <t...@databricks.com> Date: Oct 14, 2015 1:28 PM Subject: Re: Spark 1.5 java.net.ConnectException: Connection refused To: "Spark Newbie" <sparknewbie1...@gmail.com> Cc: "user" <user@spark.apache.org>, "Shixiong (Ryan) Zhu"

Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-15 Thread Spark Newbie
What is the best way to fail the application when a job gets aborted? On Wed, Oct 14, 2015 at 1:27 PM, Tathagata Das wrote: > When a job gets aborted, it means that the internal tasks were retried a > number of times before the system gave up. You can control the number >
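One possible answer, sketched below under stated assumptions: register a SparkListener on the driver and stop the StreamingContext when any job ends in failure. This uses the `SparkListener`/`JobFailed` developer API that existed in Spark 1.5; `ssc` is a placeholder for an already-created StreamingContext, and whether you actually want to stop on the first failed job is an application decision, not something the thread settles.

```scala
import org.apache.spark.scheduler.{JobFailed, SparkListener, SparkListenerJobEnd}

// Sketch: fail fast when a job is aborted. `ssc` is assumed to be an
// existing org.apache.spark.streaming.StreamingContext.
ssc.sparkContext.addSparkListener(new SparkListener {
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = jobEnd.jobResult match {
    case JobFailed(e) =>
      // A job was aborted after exhausting its task retries.
      // Stop the context non-gracefully so the driver exits and an external
      // supervisor (e.g. --supervise, or your cluster manager) can restart it.
      ssc.stop(stopSparkContext = true, stopGracefully = false)
    case _ => // job succeeded; nothing to do
  }
})
```

Note that the listener callback runs on a listener-bus thread, so in a real application it may be safer to set a flag here and call `ssc.stop(...)` from the main thread instead of stopping from inside the callback.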

Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-14 Thread Spark Newbie
Is it slowing things down or blocking progress? >> I didn't see a slowdown in processing, but I do see jobs aborted consecutively for a period of 18 batches (5-minute batch intervals). So I am worried about what happened to the records that those jobs were processing. Also, one more thing to mention

Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-14 Thread Spark Newbie
I ran 2 different Spark 1.5 clusters that have been running for more than a day now. I do see jobs getting aborted because task retries max out (default 4) due to ConnectException. It seems like the executors die and get restarted, and I was unable to find the root cause (same app code and

Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-14 Thread Tathagata Das
When a job gets aborted, it means that the internal tasks were retried a number of times before the system gave up. You can control the number of retries (see Spark's configuration page). The job by default does not get resubmitted. You could try getting the logs of the failed executor to see what
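Concretely, the retry limit mentioned above is `spark.task.maxFailures` (default 4). A minimal sketch of both suggestions, assuming a YARN deployment (on a standalone cluster, executor stdout/stderr live under each worker's `work/` directory instead); the class name, jar, and application ID below are hypothetical:

```shell
# Raise the per-task failure limit before a job is aborted (default is 4)
spark-submit \
  --conf spark.task.maxFailures=8 \
  --class com.example.MyStreamingApp \
  my-streaming-app.jar

# After a failure, fetch the aggregated logs of the dead executors (YARN)
yarn logs -applicationId application_1444770000000_0001
```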

Spark 1.5 java.net.ConnectException: Connection refused

2015-10-13 Thread Spark Newbie
Hi Spark users, I'm seeing the below exception in my Spark Streaming application. It happens in the first stage, where the Kinesis receivers receive records and a flatMap operation is performed on the unioned DStream. A coalesce step also happens as part of that stage to optimize performance.
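The stage described above can be sketched roughly as follows, using the Spark 1.5 Kinesis integration API. The stream name, endpoint, shard count, partition count, and the `parse` function are all placeholders, since the original post does not include its code:

```scala
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.kinesis.KinesisUtils

// Assumption: one receiver per shard, `ssc` is an existing StreamingContext.
val numShards = 8
val kinesisStreams = (1 to numShards).map { _ =>
  KinesisUtils.createStream(
    ssc, "my-app", "my-stream", "https://kinesis.us-east-1.amazonaws.com",
    "us-east-1", InitialPositionInStream.LATEST, Seconds(10),
    StorageLevel.MEMORY_AND_DISK_2)
}

// Union the per-shard streams, expand each record, then coalesce partitions.
// DStream has no coalesce method, so it goes through transform on each RDD.
val records = ssc.union(kinesisStreams)
  .flatMap(bytes => parse(bytes))       // parse() is a hypothetical decoder
  .transform(rdd => rdd.coalesce(16))   // fewer, larger partitions per batch
```

The relevance to the exception: each `createStream` call starts a long-running receiver on an executor, so if that executor dies, the driver's attempts to reach it can surface as `java.net.ConnectException: Connection refused` while the receiver is rescheduled.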

Re: Spark 1.5 java.net.ConnectException: Connection refused

2015-10-13 Thread Tathagata Das
Is this happening too often? Is it slowing things down or blocking progress? Failures once in a while are part of the norm, and the system should take care of itself. On Tue, Oct 13, 2015 at 2:47 PM, Spark Newbie wrote: > Hi Spark users, > > I'm seeing the below