Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/14940#discussion_r205839792

--- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/LabelPropagation.scala ---
@@ -58,7 +58,7 @@ object LabelPropagation {
       }.toMap
     }
     def vertexProgram(vid: VertexId, attr: Long, message: Map[VertexId, Long]): VertexId = {
-      if (message.isEmpty) attr else message.maxBy(_._2)._1
+      (Map(attr -> 1L) ++ message).maxBy(m => (m._2, m._1))._1
--- End diff --

Nit, is `message :+ (attr -> 1L)` simpler for the first expression?

I don't know enough to evaluate the implications of this change. It sounds like the current behavior is on purpose or according to some paper, but I'm not sure. Is there a reference for this being the more correct thing to do?
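
To make the behavioral difference concrete, here is a minimal standalone sketch of the two expressions from the diff. It is an illustration only, not the GraphX code path: `LabelPropagationSketch`, `oldVertexProgram`, and `newVertexProgram` are hypothetical names, and `VertexId` is aliased locally to `Long` so the snippet runs without Spark on the classpath.

```scala
// Standalone sketch, no Spark dependency. All names here are illustrative.
object LabelPropagationSketch {
  // Assumption: mirrors GraphX, where VertexId is an alias for Long.
  type VertexId = Long

  // Expression removed by the diff: the vertex's own label (attr) is used
  // only when no messages arrive; ties on count are not broken explicitly.
  def oldVertexProgram(vid: VertexId, attr: Long, message: Map[VertexId, Long]): VertexId =
    if (message.isEmpty) attr else message.maxBy(_._2)._1

  // Expression added by the diff: the vertex's own label enters with count 1
  // unless a message already carries a count for that label (the right-hand
  // operand of ++ wins on duplicate keys). Ties on count go to the larger label.
  def newVertexProgram(vid: VertexId, attr: Long, message: Map[VertexId, Long]): VertexId =
    (Map(attr -> 1L) ++ message).maxBy(m => (m._2, m._1))._1

  def main(args: Array[String]): Unit = {
    val clearWinner = Map(2L -> 3L, 7L -> 1L)
    val tied = Map(2L -> 1L, 7L -> 1L)

    // Both versions agree when one label has a strictly higher count.
    println(oldVertexProgram(1L, 9L, clearWinner))
    println(newVertexProgram(1L, 9L, clearWinner))

    // With tied counts, the old result depends on Map iteration order,
    // while the new version picks the largest label among the tied ones,
    // which here includes the vertex's own label 9.
    println(oldVertexProgram(1L, 9L, tied))
    println(newVertexProgram(1L, 9L, tied))
  }
}
```

One thing the sketch brings out: operand order matters. `Map(attr -> 1L) ++ message` keeps a neighbor-reported count for `attr` when there is one, whereas `message + (attr -> 1L)` would overwrite that count with 1, so the two are not interchangeable.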