Am 19.03.20 um 11:51 schrieb santirdnd:
> Hi everybody,
> 
> I'm not an expert on graph theory, so forgive me if I’m misunderstanding
> something. I have a dataset (V=2.5k; E=55k) representing biological entities
> and edges linking them based on a similarity measure. This dataset is very
> heterogeneous, with a giant component just shy of 2k nodes and, at the same
> time, about 200 singletons. To ease the process I’ve filtered out the connected
> components with fewer than 4 nodes, leaving only 2.2k nodes. Upon inspection
> the graph seems to reveal many quasi-cliques even in the giant component.
> Some of these “putative clusters” are mostly isolated while others have a
> lot of links outward, but usually each one has some unique biological
> properties.
> 
> My goal is to apply a more disciplined approach and, ideally, get to define
> the different communities found. The big communities can be found easily
> with any algorithm, but graph-tool has proved really useful as it has also
> detected a community of hub nodes that are instances wrongly entered into the
> dataset. However, I get some blocks with mixed results. In fact they are
> formed by mostly unconnected “sub-communities”, some of them even coming
> from different components of the original graph, with nothing in common
> except for their connectivity pattern. As these sub-communities have very
> few members (around a dozen nodes at most) I’m assuming that I’m hitting
> the resolution threshold even for nSBM. Is that correct? If that is the case,
> is there some way to improve the analysis?

It's wrong to think that different components should always belong to
different groups.

Think of a completely random Erdős–Rényi graph with an average degree
close to one, such that the network is formed by many components. The
correct SBM inference in this case is a model with a single group,
despite the many components. The reason for this is that this division
into components happens by chance, and the nodes that end up together
have no special affinity. If the generative process is run again, the
same nodes will not necessarily belong to the same component.
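
As a minimal sketch of this argument, the following Python snippet (standard
library only) samples two Erdős–Rényi graphs with average degree one and
compares which node pairs share a component in each run. The helper names
er_components and shared_pairs, the size n = 300 and the seeds are assumptions
chosen for illustration; nothing here uses graph-tool or fits an SBM.

```python
import random
from itertools import combinations


def er_components(n, avg_degree, seed):
    """Sample G(n, p) with p = avg_degree / (n - 1) and return one
    component label per node (computed with union-find)."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(n), 2):
        if rng.random() < p:          # each pair is linked independently
            parent[find(u)] = find(v)
    return [find(u) for u in range(n)]


def shared_pairs(labels):
    """Set of node pairs (u, v), u < v, that lie in the same component."""
    return {(u, v) for u, v in combinations(range(len(labels)), 2)
            if labels[u] == labels[v]}


n = 300                     # illustrative size (assumption)
labels_a = er_components(n, avg_degree=1.0, seed=1)
labels_b = er_components(n, avg_degree=1.0, seed=2)

pairs_a, pairs_b = shared_pairs(labels_a), shared_pairs(labels_b)
union = pairs_a | pairs_b
# Jaccard overlap of "same component" pairs between the two runs.
overlap = len(pairs_a & pairs_b) / len(union) if union else 1.0

print("components in run A:", len(set(labels_a)))
print("components in run B:", len(set(labels_b)))
print("overlap of co-component pairs:", round(overlap, 3))
```

If the argument holds, both runs should split into many components while the
overlap stays small, since component membership is decided by chance alone.
Checking the single-group claim itself would mean fitting such a graph with
graph-tool's minimize_blockmodel_dl, which is not done here.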

You should view your results in the same way: nodes end up being grouped
together unless there is clear evidence pointing to the contrary.

Best,
Tiago

-- 
Tiago de Paula Peixoto <ti...@skewed.de>


