Hi Adam, > It seems like that I can intensify the tension. :) [...] I > examined the cluster nr 293812 and it looks like the same as the nr 293815 > but here I can calculate the authority score.
Well, welcome to the world of eigenvector calculations using ARPACK! :) This is how it usually works. As far as I know, ARPACK basically implements a fancier and more effective version of the so-called "power iteration" method when it comes to calculating the eigenvector corresponding to the dominant eigenvalue. The power iteration method works like this: 1. You choose an arbitrary starting vector and normalize it. 2. You multiply it with the matrix for which you need the dominant eigenvector and normalize the product again. 3. Repeat step 2 until convergence (i.e. until the vector does not change "too much" between iterations) or until you reach the maximum number of allowed iterations. (ARPACK does something more sophisticated that is called the "implicitly restarted Arnoldi method", but the general idea is the same as far as I know). Now, if you are lucky, the process converges quickly for a unique eigenvector for most of the chosen starting vectors (except if the chosen starting vector happens to be parallel to some _other_eigenvector, in which case it will converge to that one - but it is very unlikely to happen). However, if the dominant eigenvalue has multiplicity > 1, then basically any linear combination of the dominant eigenvectors is also a dominant eigenvector, so it might be the case that the vector obtained from consecutive iterations will not converge to a unique vector but keep on "rotating around" in the subspace spanned by the dominant eigenvectors. So, this is not an error in igraph, it just happens to be the case that there is no _unique_ solution of the authority score formula for star-shaped graphs. For instance, let us consider a star graph consisting of four vertices A, B, C and D such that A is the central vertex. For sake of simplicity, let us also assume that all the weights are equal. The authority and hub scores are defined in a recursive way, and the definitions roughly boil down to this: "The authority score of vertex is i equal to the sum of the hub scores of all its neighbors, divided by a constant C1 chosen in a way that the sum of all the authority scores is 1". Similarly, "The hub score of vertex is i equal to the sum of the authority scores of all its neighbors, divided by a constant C2 chosen in a way that the sum of all the hub scores is 1". Note that you cannot really calculate the authority scores alone - you need to calculate both the authority and the hub scores because they mutually refer to each other. Now, consider the 4-vertex undirected star graph and the following solutions: authority scores = [1, 0, 0, 0] hub scores = [0, 1/3, 1/3, 1/3] Is it a good solution? Yes, with C1 = 1 and C2 = 1/3. Now, consider this solution: authority scores = [0, 1/3, 1/3, 1/3] hub scores = [1, 0, 0, 0] Is it a good solution? Yes, it is _also_ a good solution, with C1 = 1/3 and C2 = 1. Which one yields the "real" authority score vector? The answer is that _both_ of these are valid authority scores, and any linear combination of these two vectors are also valid authority scores. The problem here is that you cannot simply answer questions like "who are the most authoritative people in this star-shaped network", because one of the solutions tell you that the central vertex is a good hub and the other three vertices are good authorities, and the other solution tells you otherwise. > Another thing: can be installing arpack-ng a solution for the arpack fails? > https://github.com/pv/arpack-ng No, because the authority score itself is ill-defined for your input. Replacing the eigensolver with a different eigensolver would not magically reformulate the original definition. This is an inherent limitation of the authority score measure; a more detailed analysis of this is to be found in the following paper: https://www.math.hmc.edu/~ward/paperpdfs/hitsheaderbw6Jan05.pdf (Naive implementations of the authority and hub score calculations often hide this problem because the most naive implementations are based on the power iteration and simply tend to stop the calculation after a given number of iterations even if convergence was not reached, giving you the false impression that there is a unique solution where in fact there isn't). I would probably try what I mentioned in my previous email: calculate the communities on the undirected network but calculate the centrality measures (eigenvector centrality, hub and authority scores) in the directed variant, which is less likely to suffer from these problems (but the chance is still non-negligible, see Fig 4.1 in the paper I linked above for an example). Best, T. _______________________________________________ igraph-help mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/igraph-help
