Github user bookling commented on the issue:

    https://github.com/apache/spark/pull/14880
  
    In the LabelPropagation implementation in the GraphX lib, each node is
    initialized with a unique label, and at every step each node adopts the
    label that most of its neighbors currently have, ignoring the label it
    currently has. I think this is unreasonable, because the label a node
    already has is also useful. When a node tends toward a stable label,
    there is an association between consecutive iterations, so a node is
    affected not only by its neighbors but also by its current label.
    So I changed the code to use both the labels of its neighbors and the
    node's own label.
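
    A minimal sketch of the counting change described above, in plain Scala
    collections rather than the GraphX Pregel API. The names SelfVoteSketch,
    neighborCounts and countsWithSelf are hypothetical, and giving the node's
    own label exactly one vote is an assumption; the comment does not state
    the weight.

```scala
object SelfVoteSketch {
  // Count how often each label occurs among the neighbors only.
  def neighborCounts(neighborLabels: Seq[Long]): Map[Long, Int] =
    neighborLabels.groupBy(identity).map { case (label, xs) => label -> xs.size }

  // Variant with a self vote: the node's current label is counted once
  // alongside the neighbor labels (the weight of one is an assumption).
  def countsWithSelf(ownLabel: Long, neighborLabels: Seq[Long]): Map[Long, Int] =
    neighborCounts(ownLabel +: neighborLabels)
}
```

    For example, a node labeled 1 with neighbors labeled 1, 2, 2 sees label 2
    in the majority when only neighbors are counted, while counting its own
    label gives both labels two votes.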
    
    
    Through this iterative process, densely connected groups of nodes form a
    consensus on a unique label and so form communities. But the communities
    found by LabelPropagation are often discontinuous. This is because
    several labels can tie for the most frequent among a node's neighbors,
    e.g. node "0" has 6 neighbors labeled {"1","1","2","2","3","3"}, and then
    one of them may be selected more or less at random. To get stable
    community labels and avoid this randomness, I choose the max label among
    the candidates.
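
    The tie-breaking rule can be sketched as follows, again in plain Scala
    and not as the actual patch. TieBreakSketch and pickLabel are
    hypothetical names, and reading "max label" as "the largest label among
    those with the highest count" is an assumption.

```scala
object TieBreakSketch {
  // Pick the most frequent label; if several labels share the highest
  // count, deterministically prefer the largest label value.
  def pickLabel(counts: Map[Long, Int]): Long = {
    require(counts.nonEmpty, "need at least one candidate label")
    val best = counts.values.max
    counts.collect { case (label, n) if n == best => label }.max
  }
}
```

    With the counts from the example above, Map(1L -> 2, 2L -> 2, 3L -> 2),
    this always returns 3, regardless of the iteration order of the map.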
    
    You can test with a graph that has these edges:
    {10L->11L, 11L->12L, 11L->14L, 12L->14L, 13L->14L, 13L->15L, 13L->16L,
     15L->16L, 15L->17L, 16L->17L};
    or with a dandelion shape {1L->2L, 2L->7L, 2L->3L, 2L->4L, 2L->5L, 2L->6L},
    etc.
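
    To try the dandelion graph without a Spark cluster, here is a small
    standalone sketch. It is not the GraphX code: DandelionSketch, step and
    run are hypothetical names, edges are treated as undirected, updates are
    synchronous, the own label counts as one vote, and ties go to the
    largest label. All of these are assumptions made for illustration.

```scala
object DandelionSketch {
  // Dandelion-shaped test graph from the comment, treated as undirected.
  val edges: Seq[(Long, Long)] =
    Seq((1L, 2L), (2L, 7L), (2L, 3L), (2L, 4L), (2L, 5L), (2L, 6L))

  // Adjacency list: every edge contributes a neighbor in both directions.
  val neighbors: Map[Long, Seq[Long]] =
    edges.flatMap { case (a, b) => Seq(a -> b, b -> a) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2) }

  // One synchronous step: each node counts its neighbors' labels plus its
  // own, then takes the most frequent label, preferring the largest on a tie.
  def step(labels: Map[Long, Long]): Map[Long, Long] =
    labels.map { case (v, own) =>
      val votes = (own +: neighbors(v).map(labels))
        .groupBy(identity)
        .map { case (label, xs) => label -> xs.size }
      val best = votes.values.max
      v -> votes.collect { case (label, n) if n == best => label }.max
    }

  // Start with each node labeled by its own id and run a fixed number of steps.
  def run(maxSteps: Int): Map[Long, Long] = {
    val init = neighbors.keys.map(v => v -> v).toMap
    (1 to maxSteps).foldLeft(init)((labels, _) => step(labels))
  }
}
```

    Under these assumptions, DandelionSketch.run(2) assigns label 7 to all
    seven nodes, and further steps leave the labels unchanged.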


