[ https://issues.apache.org/jira/browse/SPARK-10994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-10994. ------------------------------- Resolution: Won't Fix > Clustering coefficient computation in GraphX > -------------------------------------------- > > Key: SPARK-10994 > URL: https://issues.apache.org/jira/browse/SPARK-10994 > Project: Spark > Issue Type: New Feature > Components: GraphX > Reporter: Yang Yang > Original Estimate: 336h > Remaining Estimate: 336h > > The Clustering Coefficient (CC) is a fundamental measure in social (or other > type of) network analysis assessing the degree to which nodes tend to cluster > together [1][2]. Clustering coefficient, along with density, node degree, > path length, diameter, connectedness, and node centrality are seven most > important properties to characterise a network [3]. > We found that GraphX has already implemented connectedness, node centrality, > path length, but does not have a componenet for computing clustering > coefficient. This actually was the first intention for us to implement an > algorithm to compute clustering coefficient for each vertex of a given graph. > Clustering coefficient is very helpful to many real applications, such as > user behaviour prediction and structure prediction (like link prediction). We > did that before in a bunch of papers (e.g., [4-5]), and also found many other > publication papers using this metric in their work [6-8]. We are very > confident that this feature will benefit GraphX and attract a large number of > users. > References > [1] https://en.wikipedia.org/wiki/Clustering_coefficient > [2] Watts, Duncan J., and Steven H. Strogatz. "Collective dynamics of > ‘small-world’ networks." nature 393.6684 (1998): 440-442. (with 27266 > citations). > [3] https://en.wikipedia.org/wiki/Network_science > [4] Jing Zhang, Zhanpeng Fang, Wei Chen, and Jie Tang. Diffusion of > "Following" Links in Microblogging Networks. IEEE Transaction on Knowledge > and Data Engineering (TKDE), Volume 27, Issue 8, 2015, Pages 2093-2106. > [5] Yang Yang, Jie Tang, Jacklyne Keomany, Yanting Zhao, Ying Ding, Juanzi > Li, and Liangwei Wang. Mining Competitive Relationships by Learning across > Heterogeneous Networks. In Proceedings of the Twenty-First Conference on > Information and Knowledge Management (CIKM'12). pp. 1432-1441. > [6] Clauset, Aaron, Cristopher Moore, and Mark EJ Newman. Hierarchical > structure and the prediction of missing links in networks. Nature 453.7191 > (2008): 98-101. (with 973 citations) > [7] Adamic, Lada A., and Eytan Adar. Friends and neighbors on the web. Social > networks 25.3 (2003): 211-230. (1238 citations) > [8] Lichtenwalter, Ryan N., Jake T. Lussier, and Nitesh V. Chawla. New > perspectives and methods in link prediction. In KDD'10. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org