[R] Jaccard dissimilarity matrix for PCA

2010-12-28 Thread Flabbergaster

Hi
I have a large dataset, containing a wide range of binary variables.
First of all I would like to compute a Jaccard matrix, then do a PCA on this
matrix, so that I can finally do a hierarchical clustering on the principal
components.
My problem is that I don't know how to compute the Jaccard dissimilarity
matrix in R. Which package should I use, and so on?
Can anybody help me?
Alternatively, I'm searching for another way to explore the clusters present in
my data.
Another problem is that I have cases with missing values on different
variables.

Jacob 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Jaccard-dissimilarity-matrix-for-PCA-tp3165982p3165982.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Jaccard dissimilarity matrix for PCA

2010-12-28 Thread Marcelo Luiz de Laia
Flabbergaster jlunding at gmail.com writes:
> My problem is, that I don't know how to compute the jaccard dissimilarity
> matrix in R? Which package to use, and so on...

http://rss.acs.unt.edu/Rdoc/library/arules/html/dissimilarity.html

http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/vegdist.html
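
A minimal sketch of what the two linked functions compute, using only base R so that it runs without extra packages. The toy matrix `x` and the helper `jaccard_pair` are made up for illustration; `dist(..., method = "binary")` is base R's name for the Jaccard distance between rows.

```r
# Toy 0/1 matrix (made-up data): rows are cases, columns are binary variables.
x <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("case1", "case2", "case3"),
                            c("V1", "V2", "V3", "V4")))

# Base R: method = "binary" gives the Jaccard distance between rows.
d <- dist(x, method = "binary")

# Hand-rolled check for one pair of rows (hypothetical helper):
# 1 - (number of variables present in both) / (number present in at least one)
jaccard_pair <- function(a, b) 1 - sum(a & b) / sum(a | b)

# case1 and case2 share 1 of the 3 variables present in either, so 1 - 1/3 = 2/3.
d12_by_hand <- jaccard_pair(x["case1", ], x["case2", ])
d12_by_dist <- as.matrix(d)["case1", "case2"]

print(round(as.matrix(d), 3))
```

With vegan installed, `vegdist(x, method = "jaccard", binary = TRUE)` should give the same distances for 0/1 data (not run here).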



Re: [R] Jaccard dissimilarity matrix for PCA

2010-12-28 Thread David L Lorenz
Jacob,
  You might have a look at the vegan package. It computes the Jaccard
distance (see vegdist) and has some other tools that you might be
interested in.
Dave






Re: [R] Jaccard dissimilarity matrix for PCA

2010-12-28 Thread Christian Hennig

jaccard in package prabclus computes a Jaccard matrix for you.

By the way, if you want to do hierarchical clustering, running PCA first
doesn't seem like a good idea to me. Why not cluster the dissimilarity
matrix directly, without the information loss from PCA? (I should not make
too general a statement on this, because how to cluster data always depends
on the aim of the clustering, the cluster concept you are interested in,
etc.)
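
Clustering the dissimilarity matrix directly, as suggested above, could look roughly like this in base R. The toy data, the average linkage, and the cut into two groups are assumptions for illustration, not a recommendation from this thread.

```r
# Made-up 0/1 data: three cases that mostly use V1-V3, three that use V4-V6.
x <- rbind(a1 = c(1, 1, 1, 0, 0, 0),
           a2 = c(1, 1, 0, 0, 0, 0),
           a3 = c(1, 0, 1, 0, 0, 0),
           b1 = c(0, 0, 0, 1, 1, 1),
           b2 = c(0, 0, 0, 1, 1, 0),
           b3 = c(0, 0, 0, 0, 1, 1))

# Jaccard distances between cases (rows); no PCA step in between.
d <- dist(x, method = "binary")

# Hierarchical clustering on the dissimilarities themselves.
# "average" linkage is an arbitrary choice here.
hc <- hclust(d, method = "average")

# Cut the tree into two groups and inspect the assignment.
groups <- cutree(hc, k = 2)
print(groups)

# plot(hc) would draw the dendrogram in an interactive session.
```

Which linkage and how many groups make sense depends on the aim of the clustering, as noted above.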


prabclus also contains clustering methods for such data; have a 
look at the functions prabclust and hprabclust (however, they are 
documented as functions for clustering species distribution ranges, so if 
your application is different, you may have to think about whether and how 
to adapt them).


Hope this helps,
Christian







*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



Re: [R] Jaccard dissimilarity matrix for PCA

2010-12-28 Thread Flabbergaster

This sounds like something I could use.
I'm fairly new to R, so I keep running into minor trouble.
Say I have a range of binary (0/1) variables X1 to Xn, with missing data for
different cases.
At the moment my data is a binary indicator matrix: rows represent the i
individuals or subjects, and columns represent presence (1) or absence (0) of
various characteristics.
I actually have 5 groups of variables (102 variables in total), describing
different aspects of the subjects I'm studying (people, i.e. refugees).
O - O1 to O43 
A - A1 to A38
R - R1 to R6
AP - AP1 to AP8
PT - PT1 to PT7
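
For a 0/1 matrix like the one described above, with some entries missing, one possible treatment (an assumption, not the only option) is to let base R's `dist` skip, for each pair of rows, the columns where either value is `NA`. The column names below only mimic the O/A/R naming; the values are made up.

```r
# Made-up indicator matrix: subjects in rows, 0/1 variables in columns,
# with one missing value for subject s1.
x <- rbind(s1 = c(O1 = 1, O2 = 1, A1 = 0, R1 = NA),
           s2 = c(O1 = 1, O2 = 0, A1 = 1, R1 = 1),
           s3 = c(O1 = 0, O2 = 0, A1 = 1, R1 = 1))

# dist() leaves out, pair by pair, any column where one of the two rows is NA.
# For s1 vs s2 only O1, O2 and A1 are compared: 1 shared out of 3 present,
# so the Jaccard distance is 1 - 1/3 = 2/3.
d <- dist(x, method = "binary")
print(round(as.matrix(d), 3))

# Alternative: drop every subject that has any missing value (loses s1 here).
x_complete <- x[complete.cases(x), , drop = FALSE]
d_complete <- dist(x_complete, method = "binary")
```

Pairwise deletion keeps more subjects, but each distance is then based on a different set of variables, which may or may not be acceptable for the data at hand.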

Can someone help me with computing a Jaccard matrix in prabclus (or
in any other package)? I think I'm having trouble defining the input object
for the function.
I get error messages like:
'x' must be an array of at least two dimensions
ERROR:  argument is not a matrix

Jacob


Christian Hennig wrote:
> 
> jaccard in package prabclus computes a Jaccard matrix for you.
> 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Jaccard-dissimilarity-matrix-for-PCA-tp3165982p3166205.html
Sent from the R help mailing list archive at Nabble.com.
