[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156499#comment-15156499 ]

Ovidiu Marcu commented on SPARK-3650:
-------------------------------------

Is it possible to apply this fix to a 1.5 version?

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Assignee: Robin East
>            Priority: Critical
>              Labels: releasenotes
>             Fix For: 2.0.0
>
>
> The triangle count implementation assumes that edges are aligned in a
> canonical direction. As stated in the documentation:
> bq. Note that the input graph should have its edges in canonical direction
> (i.e. the `sourceId` less than `destId`)
> However, the TriangleCount algorithm does not verify that this condition holds,
> and indeed even the unit tests exploit this functionality:
> {code:scala}
>     val triangles = Array(0L -> 1L, 1L -> 2L, 2L -> 0L) ++
>         Array(0L -> -1L, -1L -> -2L, -2L -> 0L)
>     val rawEdges = sc.parallelize(triangles, 2)
>     val graph = Graph.fromEdgeTuples(rawEdges, true).cache()
>     val triangleCount = graph.triangleCount()
>     val verts = triangleCount.vertices
>     verts.collect().foreach { case (vid, count) =>
>       if (vid == 0) {
>         assert(count === 4) // <-- Should be 2
>       } else {
>         assert(count === 2) // <-- Should be 1
>       }
>     }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
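The expected values in the quoted test (2 for vertex 0, 1 for every other vertex) can be checked independently of GraphX. The following is a minimal plain-Scala sketch, not the GraphX implementation: the object name `TriangleReference` and the method `countTriangles` are hypothetical names introduced for illustration, and the sketch assumes edges should be treated as undirected with self-loops ignored.

```scala
object TriangleReference {
  /** Per-vertex triangle counts, treating every edge as undirected. */
  def countTriangles(edges: Seq[(Long, Long)]): Map[Long, Int] = {
    // Build undirected neighbour sets; self-loops are ignored (assumption).
    val neighbours: Map[Long, Set[Long]] = edges
      .filter { case (a, b) => a != b }
      .flatMap { case (a, b) => Seq(a -> b, b -> a) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

    // Every edge between two neighbours of v closes one triangle at v.
    // Each such edge is seen once from each endpoint, hence the division by 2.
    neighbours.map { case (v, ns) =>
      val closed = ns.toSeq.map(n => (neighbours(n) & ns).size).sum
      v -> (closed / 2)
    }
  }

  // The edge list from the test quoted in the issue description.
  val triangles: Seq[(Long, Long)] =
    Seq(0L -> 1L, 1L -> 2L, 2L -> 0L) ++ Seq(0L -> -1L, -1L -> -2L, -2L -> 0L)
}
```

For the edge list in the issue, `TriangleReference.countTriangles(TriangleReference.triangles)` gives 2 for vertex 0 and 1 for each of the other four vertices, which agrees with the "Should be" comments in the quoted test.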
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155991#comment-15155991 ]

Apache Spark commented on SPARK-3650:
-------------------------------------

User 'insidedctm' has created a pull request for this issue:
https://github.com/apache/spark/pull/11290

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152327#comment-15152327 ]

Robin East commented on SPARK-3650:
-----------------------------------

I did ask if the PR could be revived but never followed up on it. If I get a moment I'll try to submit the PR myself; however, I have been a little busy with other GraphX things. By the way, there is a workaround for the issue: make sure your edges are in the canonical direction before calling triangleCount.

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
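The workaround described above amounts to reorienting each edge so that the source ID is less than the destination ID before the graph is built. A minimal sketch on a plain Scala collection follows; `CanonicalEdges` and `canonicalize` are hypothetical names, and dropping self-loops and duplicate edges is an assumption of this sketch rather than something stated in the thread.

```scala
object CanonicalEdges {
  /** Orient every edge so that src < dst, then drop duplicates. */
  def canonicalize(edges: Seq[(Long, Long)]): Seq[(Long, Long)] =
    edges
      // Self-loops cannot be put in canonical direction; drop them (assumption).
      .filter { case (src, dst) => src != dst }
      // Swap endpoints where needed so the smaller ID comes first.
      .map { case (src, dst) => if (src < dst) (src, dst) else (dst, src) }
      // An edge and its reverse now collapse to a single entry.
      .distinct
}
```

On an RDD of edge tuples the same `filter`, `map` and `distinct` calls are available, so the equivalent transformation could presumably be applied to `rawEdges` before passing it to `Graph.fromEdgeTuples`.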
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152239#comment-15152239 ]

Ovidiu Marcu commented on SPARK-3650:
-------------------------------------

I see interesting issues in GraphX that nobody is working on, maybe because they are low priority. Too bad.

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152075#comment-15152075 ]

Sean Owen commented on SPARK-3650:
----------------------------------

[~ovidiumarcu] What are you expecting here? You should try to revive the PR, as I mentioned above, if you're interested. I don't think anyone else is working on GraphX, though.

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152066#comment-15152066 ]

Ovidiu Marcu commented on SPARK-3650:
-------------------------------------

Can someone look over this issue?

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603016#comment-14603016 ]

Robin East commented on SPARK-3650:
-----------------------------------

What is the status of this issue? A user on the mailing list just ran into it. It looks like PR-2495 should fix the issue. Is there a version that is being targeted for the fix?

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603992#comment-14603992 ]

Sean Owen commented on SPARK-3650:
----------------------------------

[~robineast] looks like https://github.com/apache/spark/pull/2495 just never got merged for some reason. Dust it off and ping jegonzal and ankurdave for review (again)

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Critical
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289012#comment-14289012 ]

Apache Spark commented on SPARK-3650:
-------------------------------------

User 'Leolh' has created a pull request for this issue:
https://github.com/apache/spark/pull/4176

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Joseph E. Gonzalez
>            Priority: Blocker
[jira] [Commented] (SPARK-3650) Triangle Count handles reverse edges incorrectly
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143954#comment-14143954 ]

Apache Spark commented on SPARK-3650:
-------------------------------------

User 'jegonzal' has created a pull request for this issue:
https://github.com/apache/spark/pull/2495

> Triangle Count handles reverse edges incorrectly
> ------------------------------------------------
>
>                 Key: SPARK-3650
>                 URL: https://issues.apache.org/jira/browse/SPARK-3650
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.1.0
>            Reporter: Joseph E. Gonzalez