[jira] [Commented] (SPARK-9429) TriangleCount: job aborted due to stage failure

Robin East (JIRA) Thu, 17 Sep 2015 16:03:27 -0700

    [ https://issues.apache.org/jira/browse/SPARK-9429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804678#comment-14804678 ]


Robin East commented on SPARK-9429:
-----------------------------------

The scaladoc for triangleCount states: 'Note that the input graph should have 
its edges in canonical direction (i.e. the sourceId less than destId). Also the 
graph must have been partitioned using 
org.apache.spark.graphx.Graph#partitionBy.' The code checks for this and fails 
an assertion when the conditions are not met, e.g. when the input contains an 
Edge(2L,1L,...), which is not in canonical direction.
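
To make the failure concrete, here is a minimal plain-Scala sketch (no Spark 
dependency). It is a simplified model, not the GraphX source: it assumes the 
doubled count for a vertex is the sum, over that vertex's edges, of the number 
of neighbours shared by the edge's two endpoints. The names 
TriangleCountModel, canonicalize and doubledCounts are made up for this 
illustration.

```scala
// Simplified model for illustration only; all names here are hypothetical.
object TriangleCountModel {
  type VertexId = Long
  type Edge = (VertexId, VertexId)

  /** Orient each edge so that srcId < dstId, drop self-loops, remove duplicates. */
  def canonicalize(edges: Seq[Edge]): Seq[Edge] =
    edges
      .filter { case (src, dst) => src != dst }
      .map { case (src, dst) => if (src < dst) (src, dst) else (dst, src) }
      .distinct

  /**
   * For every vertex, sum over its incident edges the number of neighbours
   * shared by the edge's two endpoints. If each undirected edge appears
   * exactly once, every triangle adds 2 to each of its three vertices.
   */
  def doubledCounts(edges: Seq[Edge]): Map[VertexId, Int] = {
    // Neighbour sets ignore edge direction and self-loops.
    val neighbours: Map[VertexId, Set[VertexId]] =
      edges
        .flatMap { case (src, dst) => Seq(src -> dst, dst -> src) }
        .filter { case (v, n) => v != n }
        .groupBy(_._1)
        .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

    edges
      .flatMap { case (src, dst) =>
        val srcNbrs = neighbours.getOrElse(src, Set.empty[VertexId])
        val dstNbrs = neighbours.getOrElse(dst, Set.empty[VertexId])
        // Shared neighbours, excluding the edge's own endpoints.
        val shared = (srcNbrs & dstNbrs).diff(Set(src, dst)).size
        Seq(src -> shared, dst -> shared)
      }
      .groupBy(_._1)
      .map { case (v, counts) => v -> counts.map(_._2).sum }
  }
}
```

Under this model, the failing input from the report, Seq(0L -> 1L, 1L -> 2L, 
2L -> 0L, 2L -> 1L), gives vertices 1 and 2 an odd doubled count, because the 
pair (1, 2) is present in both directions and is counted once per direction. 
After canonicalize, every vertex has a doubled count of 2. In GraphX itself, 
the analogous preparation would be to build the graph from the canonicalized 
edge list and call partitionBy before triangleCount.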

> TriangleCount: job aborted due to stage failure
> -----------------------------------------------
>
>                 Key: SPARK-9429
>                 URL: https://issues.apache.org/jira/browse/SPARK-9429
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>            Reporter: YangBaoxing
>
> Hi all,
> When I run the TriangleCount algorithm on my own data, an exception like "Job 
> aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent 
> failure: Lost task 0.0 in stage 4.0 (TID 8, localhost): 
> java.lang.AssertionError: assertion failed" occurs. I checked the source 
> code and found that the failing line is "assert((dblCount & 1) == 0)". I also 
> found that it runs successfully on Array(0L -> 1L, 1L -> 2L, 2L -> 0L) and 
> Array(0L -> 1L, 1L -> 2L, 2L -> 0L, 0L -> 2L, 2L -> 1L, 1L -> 0L), but fails 
> on Array(0L -> 1L, 1L -> 2L, 2L -> 0L, 2L -> 1L). It seems better suited to 
> graphs that are entirely unidirectional or entirely bidirectional. Is 
> TriangleCount suitable for a graph in which only some edges are 
> bidirectional? The complete exception is as follows:
> Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 4.0 (TID 8, localhost): 
> java.lang.AssertionError: assertion failed
>       at scala.Predef$.assert(Predef.scala:165)
>       at org.apache.spark.graphx.lib.TriangleCount$$anonfun$7.apply(TriangleCount.scala:90)
>       at org.apache.spark.graphx.lib.TriangleCount$$anonfun$7.apply(TriangleCount.scala:87)
>       at org.apache.spark.graphx.impl.VertexPartitionBaseOps.leftJoin(VertexPartitionBaseOps.scala:140)
>       at org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3.apply(VertexRDDImpl.scala:159)
>       at org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3.apply(VertexRDDImpl.scala:156)
>       at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at org.apache.spark.graphx.VertexRDD.compute(VertexRDD.scala:71)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>       at org.apache.spark.scheduler.Task.run(Task.scala:70)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)



