[ 
https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziv Huang updated SPARK-3687:
-----------------------------
    Description: 
In my application, I read more than 100 sequence files into a JavaPairRDD, 
apply flatMap to get another JavaRDD, and then use takeOrdered to get the 
result.
Quite often (but not always), Spark hangs while executing some of the 
110th-130th tasks.
The job can hang for several hours, maybe forever (I have not been able to 
wait for it to complete).
When the Spark job hangs, I can't find any error message anywhere, and I 
can't kill the job from the web UI.

The current workaround is to use coalesce to reduce the number of partitions to 
be processed.
The job has never hung when the number of partitions to be processed is no 
greater than 80.

  was:In my application, I read more than 100 sequence files to a JavaPairRDD, 
perform flatmap to get another JavaRDD, and then use takeOrdered


> Spark hang while processing more than 100 sequence files
> --------------------------------------------------------
>
>                 Key: SPARK-3687
>                 URL: https://issues.apache.org/jira/browse/SPARK-3687
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Ziv Huang
>
> In my application, I read more than 100 sequence files into a JavaPairRDD, 
> apply flatMap to get another JavaRDD, and then use takeOrdered to get the 
> result.
> Quite often (but not always), Spark hangs while executing some of the 
> 110th-130th tasks.
> The job can hang for several hours, maybe forever (I have not been able to 
> wait for it to complete).
> When the Spark job hangs, I can't find any error message anywhere, and I 
> can't kill the job from the web UI.
> The current workaround is to use coalesce to reduce the number of partitions 
> to be processed.
> The job has never hung when the number of partitions to be processed is no 
> greater than 80.
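
For readers unfamiliar with the workaround: calling coalesce(80) on the 
JavaPairRDD asks Spark to map the input partitions onto at most 80 output 
partitions, so fewer tasks are scheduled. The plain-Java sketch below only 
illustrates that idea of grouping many partitions into a capped number of 
groups. It has no Spark dependency, the class and method names are 
hypothetical, and it is not Spark's actual coalesce algorithm (which also 
takes data locality into account). The figures 130 and 80 are taken from the 
report above.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this is NOT Spark's coalesce implementation.
class CoalesceSketch {

    // Hypothetical helper: assign input partition indices 0..numPartitions-1
    // to at most maxGroups contiguous groups of near-equal size.
    static List<List<Integer>> groupPartitions(int numPartitions, int maxGroups) {
        if (numPartitions < 0 || maxGroups <= 0) {
            throw new IllegalArgumentException(
                "numPartitions must be >= 0 and maxGroups must be > 0");
        }
        // Grouping never increases the partition count.
        int groups = Math.min(numPartitions, maxGroups);
        List<List<Integer>> result = new ArrayList<>();
        for (int g = 0; g < groups; g++) {
            // Integer arithmetic spreads the remainder across the groups.
            int start = (int) ((long) g * numPartitions / groups);
            int end = (int) ((long) (g + 1) * numPartitions / groups);
            List<Integer> group = new ArrayList<>();
            for (int p = start; p < end; p++) {
                group.add(p);
            }
            result.add(group);
        }
        return result;
    }

    public static void main(String[] args) {
        // 130 input partitions (one per task in the report) capped at 80 groups.
        List<List<Integer>> groups = groupPartitions(130, 80);
        System.out.println("tasks after grouping: " + groups.size());
    }
}
```

In the sketch, every input partition lands in exactly one group, so no data 
is dropped; only the number of units of work shrinks. Whether reducing the 
task count addresses the root cause of the hang is not established by the 
report.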



