One more thing: for large data sets, the packages flashClust and
fastcluster provide much faster hierarchical clustering that (at least
for flashClust, which I maintain) gives exactly the same results as the
standard hclust. Simply insert a

library(flashClust)

before you call the function and your code will run much faster.
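
As a rough illustration of the approach, here is a minimal sketch on a toy
matrix. The p-values are made up for this example, and the names pvals,
hclustFn and components are chosen here for illustration only. It assumes
that flashClust, if installed, can be used as a drop-in replacement for
hclust, and falls back to the standard function otherwise.

```r
# Toy p-value matrix for 5 nodes (made-up values, for illustration only).
# Nodes 1-2 and 2-3 are significantly correlated, as are nodes 4-5.
pvals <- matrix(1, nrow = 5, ncol = 5)
diag(pvals) <- 0
pvals[1, 2] <- pvals[2, 1] <- 0.01
pvals[2, 3] <- pvals[3, 2] <- 0.02
pvals[4, 5] <- pvals[5, 4] <- 0.03

# Optional speed-up: use flashClust's hclust() if the package is installed,
# otherwise fall back to stats::hclust().
hclustFn <- if (requireNamespace("flashClust", quietly = TRUE)) {
  flashClust::hclust
} else {
  stats::hclust
}

# Dissimilarity is 0 for connected pairs (p <= cutoff) and 1 otherwise.
cutoff <- 0.05
disconnected <- (pvals > cutoff) * 1
tree <- hclustFn(as.dist(disconnected), method = "single")

# Cutting anywhere strictly between 0 and 1 separates the connected groups.
clusters <- cutree(tree, h = 0.5)
components <- unname(split(seq_along(clusters), clusters))
print(components)
```

Nodes 1 and 3 are not directly connected here, but single linkage still
puts them in the same group because both are connected to node 2.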

Peter

On Thu, Jul 14, 2011 at 4:58 PM, Peter Langfelder
<peter.langfel...@gmail.com> wrote:
> Hi Paul,
>
> I assume you are using the argument cutoff to specify the p-value
> below which nodes are considered connected and above which they are
> not connected.
>
> I would use single linkage hierarchical clustering. If you have two
> groups of nodes and at least one node in the first group is connected
> to a node in the second (i.e. the pair has adjacency 1, or
> dissimilarity 0), then the groups have dissimilarity 0. If no node in
> one group is connected to any node in the other, the groups have
> dissimilarity 1. Thus you can use any tree cut height strictly between
> 0 and 1 to get the clusters that correspond to connected subgraphs. For
> large data you will need a computer with enough memory to hold your
> distance matrix, but you must have observed that already.
>
> subgraphs = function(mat, cut)
> {
>  # Dissimilarity 1 where the p-value exceeds the cutoff (not connected),
>  # 0 otherwise. Change the inequality if necessary.
>  disconnected = (mat > cut) * 1
>  tree = hclust(as.dist(disconnected), method = "single")
>  clusters = cutree(tree, h = 0.5)
>  # clusters is already the answer, but you want it in a different
>  # format, so we reformat it into a list of node indices per cluster.
>  nClusters = max(clusters)
>  connectedList = vector("list", nClusters)
>  for (cl in seq_len(nClusters))
>    connectedList[[cl]] = which(clusters == cl)
>  connectedList
> }
>
> Try it and see if this does what you want.
>
> HTH
>
> Peter
>
> On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul
> <hpaul.bento...@imperial.ac.uk> wrote:
>> Sorry bad example. My data is undirected. It's a correlation matrix so 
>> probably better to look at something like:
>>
>> foomat<-cor(matrix(rnorm(100), ncol=10))
>> foomat
>>
>> Mine are p-values from the correlation, but it's the same idea.
>



