I ran into the same issue when the dataset is very large.

Marcelo from Cloudera found that it may be caused by SPARK-2711, so their
Spark 1.1 release reverted SPARK-2711, and the issue is gone. See
https://issues.apache.org/jira/browse/SPARK-3633 for details.

You can check out Cloudera's version here:
https://github.com/cloudera/spark/tree/cdh5-1.1.0_5.2.0

PS: I haven't tested it yet, but I will test it in the next couple of days
and report back.
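
PS2: independent of the Cloudera build, one thing that often reduces
shuffle fetch pressure in a case like this is to avoid groupByKey (which
ships every value for a key to one reducer) in favor of a map-side combine,
and to raise the shuffle partition count so each fetch is smaller. A rough
sketch only; the paths, key extraction, and partition count below are
placeholders I made up, not anything from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    object GroupByWorkaround {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("groupby-workaround"))

        // Hypothetical input; stands in for the real 500 GB dataset.
        val pairs = sc.textFile("hdfs:///path/to/input")
          .map(line => (line.split("\t")(0), 1L))

        // reduceByKey combines values map-side before the shuffle, so far
        // less data crosses the network than with groupByKey. The explicit
        // partition count (2000 is a guess; size it to your cluster) also
        // shrinks the amount each shuffle fetch has to pull.
        val counts = pairs.reduceByKey(_ + _, 2000)

        counts.saveAsTextFile("hdfs:///path/to/output")
        sc.stop()
      }
    }

Whether this avoids the bug entirely I can't say, but smaller fetches have
helped in similar situations.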


Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai

On Sat, Oct 18, 2014 at 6:22 PM, marylucy <qaz163wsx_...@hotmail.com> wrote:

> When doing a groupBy on big data (maybe 500 GB), some partition tasks
> succeed and some fail with FetchFailed errors. Spark retries the
> previous stage, but it always fails.
> 6 computers: 384 GB each
> Workers: 7 x 40 GB per computer
>
> Can anyone tell me why the fetch failed?
