On Thu, 21 Aug 2008, [EMAIL PROTECTED] wrote:

Hi all,

I have a matrix of about 100,000 x 4 that I need to classify using
the Euclidean metric. For that I am using the dist or daisy functions,
but I am afraid that the message: Error in vector("double", length) :
vector size specified is too large, means too many rows.


Yes, your distance matrix will take dozens of gigabytes to store
(about 5e9 pairwise distances at 8 bytes each).
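
The size estimate can be checked with a little arithmetic. A minimal
sketch, assuming the distances are held as 8-byte doubles and only the
lower triangle is stored, as dist() does (the variable names are
illustrative):

```r
# Rough memory needed for a dist object on n observations.
n       <- 1e5                  # rows in the data matrix
n_pairs <- n * (n - 1) / 2      # entries in the lower triangle
bytes   <- n_pairs * 8          # 8 bytes per double
gb      <- bytes / 1e9          # decimal gigabytes
gib     <- bytes / 2^30         # binary gibibytes

cat(sprintf("%.0f pairs, %.1f GB (%.1f GiB)\n", n_pairs, gb, gib))
```

The number of columns (4 here) does not enter the calculation: the
distance matrix grows with the square of the number of rows.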


Can anyone suggest how I should analyse this matrix?

Try something other than 'hierarchical clustering'.

See
        http://cran.r-project.org/web/views/Cluster.html

for some suggestions.
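
One alternative that keeps the Euclidean metric is clara() from the
cluster package (the same package that provides daisy()). It clusters
around medoids using subsamples, so it never builds the full distance
matrix. A hedged sketch: the simulated data, k = 3 and the sampling
settings are assumptions for illustration, not values from the
question.

```r
library(cluster)

set.seed(1)
n <- 1e5

# Stand-in data: 100,000 x 4 matrix with three shifted groups.
shift <- sample(c(0, 5, 10), n, replace = TRUE)
x <- matrix(rnorm(n * 4), ncol = 4) + shift   # row i is shifted by shift[i]

# Partition around medoids on 10 subsamples of 500 rows each.
fit <- clara(x, k = 3, metric = "euclidean", samples = 10, sampsize = 500)

table(fit$clustering)   # cluster sizes
fit$medoids             # one representative row per cluster
```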

kmeans(), perhaps?
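
A minimal sketch of the kmeans() route, which works on the data matrix
directly and does not need the pairwise distances. The simulated data
and the choice of 3 clusters are assumptions for illustration; the
real number of clusters has to be chosen for the data at hand.

```r
set.seed(1)
n <- 1e5

# Stand-in data: 100,000 x 4 matrix with three shifted groups.
shift <- sample(c(0, 5, 10), n, replace = TRUE)
x <- matrix(rnorm(n * 4), ncol = 4) + shift   # row i is shifted by shift[i]

# Several random starts guard against a poor local optimum.
fit <- kmeans(x, centers = 3, nstart = 5, iter.max = 50)

table(fit$cluster)   # cluster sizes
fit$centers          # cluster means, one row per cluster
```

kmeans() minimises within-cluster sums of squares, which corresponds
to squared Euclidean distance, so it is in the spirit of the metric
asked about.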

HTH,

Chuck


Thanks in advance,

Diogo André Alagador
MNCN,CSIC, Madrid, Spain
ISA, Lisbon, Portugal
   

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
