On 16-Feb-05 [EMAIL PROTECTED] wrote:
> We use the mix package and we have a problem with the DA
> function. We aren't sure, but it's maybe a memory problem.
>
> We have done:
>> Ent<-read.table("C:/.../File.txt")
>> attach(Ent)
>> Ent
>     V1 V2 V3 V4 ... V16 V17
> 1    1  1  2  6      18  18
> 2    1  1  1 NA      14  17
> 3    1  1  2  1      16  14
> ....
> 199  2  1 NA  7      19  18
> 200  2  1  3  2      14  17
>
>> EntPrelim<-prelim.mix(as.matrix(Ent),9)
>> EntEM<-em.mix(EntPrelim,maxits=500)
>> rngseed(1234567)
>> EntDA<-da.mix(EntPrelim, EntEM, steps=100, showits=TRUE)
> Steps of data Augmentation:
> 1... Error in da.mix(EntPrelim, EntEM, steps=100, showits=TRUE):
> Improper posterior--empty cells

Dear Stéphanie,

This problem is closely related to the problem reported yesterday
by Delphine Gille from the same institution as you:

>> From: [EMAIL PROTECTED]
>> To: r-help@stat.math.ethz.ch
>> Subject: [R] memory problem with package mix
>> Date: Tue, 15 Feb 2005 15:23:08 +0100
>>
>> Hello,
>>
>> I think we have a memory problem with em.mix.
>>
>> We have done:
>>
>> >library(mix)
>> >Manq <- read.table("C:/.../file.txt")
>> >attach(Manq)
>> >Manq
>> >     V1 V2 V3 V4 .............V27
>> > 1    1  1  1  1 ...........
>> > 2    1 NA  3  6
>> > 3    1  2  6  2
>> > ...
>> > ...
>> > 300  2 NA  6  2 ...........
>>
>> > Essaimanq <-prelim.mix(as.matrix(Manq),5)
>> > test <- em.mix(Essaimanq)
>> error cannot allocated vector of size 535808 KB
>> in addition : warning message
>> reached total allocation of 509MB

The reason is almost certainly the same one that I pointed out in
my reply to Delphine: you have 9 categorical variables, each
necessarily with at least 2 levels (and in your case at least one
has >=3 levels and at least one has >=6 levels), so you have at
least (2^7)*3*6 = 2304 cells (possibly many more, depending on the
numbers of levels in the variables) in your unrestricted model for
the categorical variables (as implied by your usage of em.mix and
da.mix).

With only 200 rows of data, at least 2104 of those cells will be
empty (i.e. with no data falling in them), even if there are only
2304 cells in all. Therefore, given the improper Dirichlet prior
which da.mix uses by default, you will almost certainly end up with
an improper posterior distribution as a result of your many empty
cells, which is just what your error message is telling you.

With so few data, you need to severely restrict the level of
interaction allowed for the categorical variables (and use ecm.mix
instead of em.mix, and dabipf.mix instead of da.mix).

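To make the cell count concrete, here is a minimal R sketch using only
the minimum numbers of levels assumed above (seven variables at 2
levels, one at 3, one at 6). The names min_levels and n_rows are chosen
purely for illustration, and your real variables may well have more
levels than the excerpt shows, so treat the result as a lower bound.

```r
# Minimum numbers of levels for the 9 categorical variables, as read off
# the data excerpt (an assumption: the real data may have more levels).
min_levels <- c(rep(2, 7), 3, 6)

n_rows    <- 200               # rows in the data set
n_cells   <- prod(min_levels)  # cells in the unrestricted contingency table
min_empty <- n_cells - n_rows  # each row occupies at most one cell

cat("cells:", n_cells, "\n")                   # 2304
cat("empty cells, at least:", min_empty, "\n") # 2104
```

With the real data one could replace min_levels by the observed number
of distinct non-missing values in each of the first 9 columns.
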
In the best possible case (7 variables at 2 levels, one at 3, one
at 6) implied by your data excerpt above, you need 7 + 2 + 5 = 14
parameters at a minimum (no-interaction or complete-independence
model). If you admit the first-order (2-factor) interactions as
well, you need 94 parameters (the 14 main effects plus 80 two-factor
terms; I hope I have calculated this right!). Going to 2nd-order
(3-factor) will surely take you over your data size of 200 (I
haven't worked this one out: maybe there's a snappy R function for
this sort of thing!). But if your variables have more levels than
the minimum I have assumed (based on your data excerpt) then the
situation will rapidly get much worse.

Another approach might be to consider using an informative
("proper") prior distribution for the Dirichlet probabilities, but
unless you are very careful you risk adopting something which is
not realistic for your problem. You can do this both with da.mix
and em.mix (provided em.mix works with your sparse data, which it
didn't for Delphine) and with dabipf.mix and ecm.mix.

See also the explanations in "?da.mix" and "?dabipf.mix", section
"Details", which refer to just the kind of problem you are having.

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Feb-05                                       Time: 20:37:15
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
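
P.S. On the parameter counts discussed above: the following is a
minimal sketch in base R, not a function from the mix package. It
assumes the usual counting for hierarchical loglinear models, in which
a k-factor term contributes the product of (levels - 1) over the
variables involved, and it uses the minimum numbers of levels read off
the data excerpt. The names count_params and min_levels are made up for
this illustration.

```r
# Count the free parameters of a loglinear model containing all terms up
# to max_order factors. A k-factor term contributes prod(levels - 1) over
# its variables; the constant term is not counted.
count_params <- function(levels, max_order) {
  d <- levels - 1
  per_order <- sapply(seq_len(max_order), function(k) {
    # combn() enumerates every set of k variables; FUN gives that term's size
    sum(combn(length(d), k, FUN = function(idx) prod(d[idx])))
  })
  sum(per_order)
}

# Assumed minimum levels: seven variables at 2, one at 3, one at 6.
min_levels <- c(rep(2, 7), 3, 6)

# Totals for models with terms up to 1, 2 and 3 factors.
totals <- sapply(1:3, function(k) count_params(min_levels, k))
print(totals)
```

Under these assumptions the totals come to 14, 94 and 346, so the
3-factor model already needs more parameters than the 200 rows
available, and more levels per variable would only push the counts
higher.
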