Which version of Spark are you running?
It could be related to this issue:
https://issues.apache.org/jira/browse/SPARK-3633
which was fixed in 1.1.1 and 1.2.0.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Fetch-Failure-tp20787p20811.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
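
For anyone wanting to check quickly whether a cluster is on a release that carries the SPARK-3633 fix, here is a minimal sketch in plain Python. The helper names `parse_version` and `has_spark_3633_fix` are made up for illustration, and the rule that every release after 1.2.0 also carries the fix is an assumption, not something stated in the reply above.

```python
import re


def parse_version(version):
    """Turn a version string such as '1.1.0' into a tuple of ints, e.g. (1, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def has_spark_3633_fix(version):
    """Return True if the release is 1.1.1 or later on the 1.1 line, or 1.2.0 or later.

    Assumption: releases after 1.2.0 keep the fix.
    """
    major, minor, patch = parse_version(version)
    if (major, minor) == (1, 1):
        # On the 1.1 maintenance line the fix landed in 1.1.1.
        return patch >= 1
    # Everything older than 1.1 fails this comparison; 1.2.0 and later pass.
    return (major, minor, patch) >= (1, 2, 0)


if __name__ == "__main__":
    for v in ("1.0.2", "1.1.0", "1.1.1", "1.2.0"):
        print(v, has_spark_3633_fix(v))
```

In PySpark the running version should be available as `sc.version` on the SparkContext, so `has_spark_3633_fix(sc.version)` would be the typical call.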
I have a job that runs fine on relatively small input datasets but then
reaches a threshold where I begin to consistently get Fetch failure for
the Failure Reason, late in the job, during a saveAsTextFile() operation.
The first error we are seeing on the Details for Stage page is
ExecutorLostFailure
to know what could be causing this.
On Fri, Dec 19, 2014 at 7:46 AM, bethesda swearinge...@mac.com wrote:
I have a job that runs fine on relatively small input datasets but then
reaches a threshold where I begin to consistently get Fetch failure for
the Failure Reason, late in the job, during
don't know how to interpret -- is there any kind of troubleshooting guide
beyond the Spark Configuration page? I don't know if I'm providing enough
info here.
Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Fetch-Failure-tp20787.html
My Shuffle Read is 3.3 GB
Hi all:
My job is CPU intensive, and its resource configuration is 400 workers
x 1 core x 3G. There are many fetch failures, like:
14-08-23 08:34:52 WARN [Result resolver thread-3] TaskSetManager: Loss was due to fetch failure from BlockManagerId(slave1:33500)
14-08-23 08:34:52 INFO [spark
run our application on a standalone 4-node Spark cluster?
14/06/30 19:30:16 WARN TaskSetManager: Lost TID 25036 (task 6.0:90)
14/06/30 19:30:16 WARN TaskSetManager: Loss was due to fetch failure from BlockManagerId(2, 192.168.222.164, 57185, 0)
14/06/30 19:30:18 WARN TaskSetManager: Lost TID 25310 (task 6.1:0)
14/06/30 19:30:18 WARN TaskSetManager: Loss was due to fetch failure from BlockManagerId(2, 192.168.222.164, 57185, 0)
14/06/30 19:30:19 WARN TaskSetManager: Lost
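
When the driver log is full of lines like the ones quoted above, it can help to tally the fetch failures by the BlockManagerId they point at, to see whether one executor or host accounts for most of them. The following is only a sketch: the function name `count_fetch_failures` is made up for illustration, and the pattern is based solely on the log excerpts in these messages, so other Spark versions may word the warning differently.

```python
import re
from collections import Counter

# Matches the warning text seen in the excerpts above. \s+ between words lets the
# pattern survive the hard line wraps that mail archives insert into long log lines.
FETCH_FAILURE = re.compile(
    r"Loss\s+was\s+due\s+to\s+fetch\s+failure\s+from\s+BlockManagerId\(([^)]*)\)"
)


def count_fetch_failures(log_text):
    """Count fetch-failure warnings per BlockManagerId argument string."""
    return Counter(m.group(1) for m in FETCH_FAILURE.finditer(log_text))


# Sample built from the log lines quoted in this thread, including a wrapped line.
SAMPLE_LOG = """\
14/06/30 19:30:16 WARN TaskSetManager: Lost TID 25036 (task 6.0:90)
14/06/30 19:30:16 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(2, 192.168.222.164, 57185, 0)
14/06/30 19:30:18 WARN TaskSetManager: Lost TID 25310 (task 6.1:0)
14/06/30 19:30:18 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(2, 192.168.222.164, 57185, 0)
14-08-23 08:34:52 WARN [Result resolver thread-3] TaskSetManager: Loss
was due to fetch failure from BlockManagerId(slave1:33500)
"""

if __name__ == "__main__":
    for block_manager, count in count_fetch_failures(SAMPLE_LOG).most_common():
        print(count, block_manager)
```

If nearly all of the failures name the same BlockManagerId, that executor (or its host) is the first place to look, for example in its own stderr log around the same timestamps.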