[R] Subsample points for mclust

Mario Valle Tue, 21 Jul 2009 08:05:12 -0700

Hi all!

I have an ordered vector of values. The distribution of these values can be modeled by a sum of Gaussians, so I'm using the package 'mclust' to get the Gaussians' parameters for this 1D distribution. It works very well, but for input sizes above 100,000 values it starts taking practically forever. Unfortunately my dataset has around 4.6M values...

My question: is it correct to subsample my dataset, taking one value out of every N, to make mclust happy? Or do I have no alternative except using the complete dataset?
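
A minimal sketch of the subsampling described above, run on synthetic data since the real dataset is not shown here. The thinning factor N, the two-component synthetic mixture, and the range of components G are assumptions chosen for illustration only. Fitting both an every-Nth subsample and a random subsample of the same size is one way to check whether the estimated parameters are stable under subsampling; it is a sanity check, not a proof that subsampling is valid for a given dataset.

```r
library(mclust)  # assumed to be installed

set.seed(1)

# Synthetic stand-in for the real data: an ordered vector drawn from
# a two-component Gaussian mixture (assumption, not the real dataset)
x <- sort(c(rnorm(60000, mean = 0, sd = 1),
            rnorm(40000, mean = 5, sd = 0.5)))

N <- 100  # hypothetical thinning factor

# Every N-th value of the ordered vector
x_every_n <- x[seq(1, length(x), by = N)]

# Simple random subsample of the same size, for comparison
x_random <- sample(x, size = length(x_every_n))

# 1D mixtures with unequal variances ("V"); BIC chooses among 1 to 3 components
fit_every_n <- Mclust(x_every_n, G = 1:3, modelNames = "V")
fit_random  <- Mclust(x_random,  G = 1:3, modelNames = "V")

# Compare the estimated means, variances and mixing proportions
print(fit_every_n$parameters$mean)
print(fit_random$parameters$mean)
print(fit_every_n$parameters$variance$sigmasq)
print(fit_random$parameters$variance$sigmasq)
print(fit_every_n$parameters$pro)
print(fit_random$parameters$pro)
```

If the two fits agree closely, and keep agreeing as N is reduced, that suggests the subsample is large enough for the parameters of interest.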


Excuse my profound ignorance, and thanks for your help!

mario


--
Ing. Mario Valle
Data Analysis and Visualization Group            | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)      | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82

