RE: [R] memory problem with package mix

Ted Harding Tue, 15 Feb 2005 08:42:39 -0800

On 15-Feb-05 [EMAIL PROTECTED] wrote:
> Hello,
> I think we have a memory problem with  em.mix.
> 
> We have done:
> 
>>library(mix)
>>Manq <- read.table("C:/.../file.txt")
>>attach(Manq)
>>Manq
>>    V1 V2 V3 V4 .............V27
>> 1  1  1  1  1...........
>> 2  1 NA  3  6
>> 3  1  2  6  2
>> ...
>> ...
>> 300  2  NA  6  2...........
> 
>> Essaimanq <-prelim.mix(as.matrix(Manq),5)
>> test <- em.mix(Essaimanq)
>     error cannot allocated vector of size 535808 KB
>     in addition : warning message reached total allocation of 509MB


Hmm.

According to the above, it seems you might have 5 categorical
variables V1...V5 with at least 6 levels, so since your call to
em.mix does not specify any model restriction (for which you
need to call ecm.mix instead) you may have at least 6^5 = 7776
"cells" for the different combinations of levels. This will
require 7775 parameters for the probabilities of these cells.

For each cell, you have a separate vector of means for the
multivariate normal distribution to be fitted to the (27-5)=22
continuous variables. This requires 22*7776 = 171072 parameters.

Sub-total: 178847

Then, as a bit of sugar on the cake, you have the 22*23/2 = 253
parameters for the covariance matrix.

Sub-total: 179100

Since em.mix does quite complicated things, it is perhaps
not surprising that it demands more than 509MB (corresponding
to about 3000 bytes per parameter or, with 8 bytes per number,
about 370 numbers per parameter). Not to mention the 8100
numbers (about 65000 bytes) required for each working copy
of the representation of the data.

In any case, apparently you only have 300*27 = 8100 data,
quite inadequate for this unrestricted model!

Even if you could have allocated enough memory, you would
then have found that the EM fit would not get anywhere.
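
The counting argument above can be checked with a few lines of
base R. This is only a sketch under the assumptions made in this
reply (5 categorical variables with 6 levels each, 22 continuous
variables, 300 rows), none of which is confirmed by the actual
data. It takes 6^5 = 7776 cells and counts 22*23/2 = 253 free
entries in a symmetric 22 x 22 covariance matrix.

```r
# Assumptions taken from the reply, not from the poster's real data:
# 5 categorical variables with 6 levels each, 27 columns, 300 rows.
n_cat    <- 5
n_levels <- 6
n_cols   <- 27
n_rows   <- 300
n_cont   <- n_cols - n_cat             # 22 continuous variables

n_cells  <- n_levels^n_cat             # cells in the contingency table
n_probs  <- n_cells - 1                # cell probabilities sum to 1
n_means  <- n_cont * n_cells           # one mean vector per cell
n_cov    <- n_cont * (n_cont + 1) / 2  # symmetric covariance matrix
n_params <- n_probs + n_means + n_cov

n_data   <- n_rows * n_cols            # data values, NAs included

limit_bytes     <- 509 * 2^20          # the 509MB limit in the warning
bytes_per_param <- limit_bytes / n_params

cat("cells:                ", n_cells, "\n")
cat("parameters:           ", n_params, "\n")
cat("data values:          ", n_data, "\n")
cat("parameters per datum: ", round(n_params / n_data, 1), "\n")
cat("bytes per parameter:  ", round(bytes_per_param), "\n")
```

Changing n_levels or n_cat shows how quickly the unrestricted
model grows: the number of cells is exponential in the number
of categorical variables.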

Suggested solution: think about restricting the number of
parameters in the model, using the parameter "margins" to
ecm.mix to restrict the number of independent combinations
of categorical levels, and also "design" to specify a simpler
model for the dependence of the continuous variables
on the categoricals (e.g. the matrix corresponding to the
model "~ V1+V2+V3+V4+V5" only introduces 5*6*22=660 new
parameters, namely a simple additive effect of level of each Vi
on the mean of each of the 22 continuous variables).
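
As a rough sketch of what such a restricted model could look
like, the additive design matrix can be built in base R with
expand.grid and model.matrix. The 6 levels coded 1..6, the cell
ordering (first variable varying fastest) and the form of the
ecm.mix call are assumptions based on my reading of the mix
documentation, so check them against ?ecm.mix before relying on
this. With R's default treatment contrasts the formula gives an
intercept plus 5 columns per variable, i.e. 26 columns over
6^5 = 7776 cells and 26*22 = 572 coefficients, a little under
the rough 660 quoted above.

```r
# Assumption: V1..V5 each have exactly 6 levels, coded 1..6.
n_levels <- 6
n_cont   <- 22

# One row per cell of the 6^5 table; expand.grid varies the first
# factor fastest (assumed to match the cell order used by mix).
levs <- lapply(1:5, function(i) factor(seq_len(n_levels)))
names(levs) <- paste0("V", 1:5)
cells <- expand.grid(levs)

# Additive (main effects only) model for the cell means.
design <- model.matrix(~ V1 + V2 + V3 + V4 + V5, data = cells)
n_coef <- ncol(design) * n_cont   # regression coefficients

# Main-effects-only loglinear model for the categorical part:
# one margin per variable, margins separated by zeros (assumed
# convention for the "margins" argument).
margins <- c(1, 0, 2, 0, 3, 0, 4, 0, 5)

cat("design matrix:", nrow(design), "x", ncol(design), "\n")
cat("coefficients: ", n_coef, "\n")

# Hypothetical call, not run here (needs the mix package and the
# poster's data; argument names as I understand them):
# fit <- ecm.mix(Essaimanq, margins = margins, design = design)
```

Whether main effects alone are adequate is a modelling question
for the data at hand; interaction terms can be added to the
formula and to "margins" as the data allow.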

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 15-Feb-05                                       Time: 16:24:18
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
