Hi Davide,

On 03.07.20 at 11:31, Davide Cittaro wrote:
> Hello,
> I'm testing the new Planted Partition model in graph-tool on my data, and
> indeed I'm finding interesting results. I have some questions/observations,
> though.
>
> - PPBlockState returns a relatively large number of partitions on large
> networks, which is fine and expected. When I use NSBM, instead, I make use of
> the hierarchy not only because I can "abstract" partitions up to a certain
> level, but also because the hierarchy has a meaning in my case. Is there (or
> will there be) a hierarchical formulation of the PPBlockState?

A hierarchical prior for the PP model is certainly feasible, and it is
something that could come up in the future, but I can't promise when.

> - I tried multiple initialisations of PPBlockState over my graph, I also
> tried to increase the iterations of the initial MCMC sweep, and I'd say I get
> very consistent results. Is this expected? I mean, is it known whether the
> PPBlockState converges to a stable solution faster and in a consistent way?

This depends a lot on the underlying data. If the model is a good fit, then
this consistency is expected; otherwise it's not. It will not necessarily
behave like this for every dataset.

> - Does the time required to converge scale with the number of edges as it
> does for SBM?

As in the SBM, the MCMC sweeps take time proportional to the number of
edges, but the multiplicative factor is smaller, since the model is simpler.

> - As far as I understand, if the assortativity is the dominant pattern the
> difference between PP and NSBM is negligible. I don't know how to quantify
> "negligible", as the differences in entropies are at least on the order of 1e2
> in the cases I tested (seems pretty large to me); I would be happy to switch
> to PP, also given the shorter runtime so far, but I'm a bit concerned about
> these differences.

I do not recommend simply switching to PP for every analysis. As described
in the paper, the SBM is still a more powerful model, capable of capturing
the network structure better in a wider variety of cases.

To answer your question, you can test whether the two models give similar
answers by comparing their partitions. You can use the partition_overlap()
function for that. Comparing the description lengths is useful for selecting
the best-fitting model, but not for telling whether the two models give
similar answers.

Best,
Tiago

-- 
Tiago de Paula Peixoto <ti...@skewed.de>
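
To make concrete what "comparing partitions" measures, here is a minimal, standard-library-only sketch of a maximum-overlap score between two labelings of the same nodes. It assumes overlap is defined as the fraction of nodes that agree under the best one-to-one matching of group labels. The helper name `max_overlap` and the toy label lists are hypothetical, and this brute-force version is not graph-tool's implementation; it is only practical for a handful of groups.

```python
from collections import Counter
from itertools import permutations


def max_overlap(x, y):
    """Fraction of nodes on which two labelings agree under the best
    one-to-one matching of group labels (brute force, small inputs only)."""
    if len(x) != len(y):
        raise ValueError("both partitions must label the same nodes")
    if not x:
        return 1.0
    # Contingency table: how many nodes carry label a in x and label b in y.
    counts = Counter(zip(x, y))
    xs = sorted(set(x))
    ys = sorted(set(y))
    # Pad the shorter label list with None so unmatched groups contribute 0.
    k = max(len(xs), len(ys))
    xs += [None] * (k - len(xs))
    ys += [None] * (k - len(ys))
    # Try every one-to-one matching of labels and keep the best agreement.
    best = max(
        sum(counts.get((a, b), 0) for a, b in zip(xs, perm))
        for perm in permutations(ys)
    )
    return best / len(x)


# Hypothetical toy labelings standing in for a PP fit and an SBM fit.
pp_labels = [0, 0, 0, 1, 1, 2]
sbm_labels = [1, 1, 1, 0, 0, 0]
print(max_overlap(pp_labels, sbm_labels))
```

A score near 1 means the two fits group the nodes in essentially the same way, regardless of how the groups are numbered or how far apart the description lengths are. On real data, graph-tool's own partition_overlap() applied to the two label arrays is the tool to use; the sketch above only illustrates the idea.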