[R] poLCA package inquiry
Dear R-users!

I am using the poLCA package to find latent classes in a dataset with binary outcomes. Several hundred persons can engage with hundreds of items. Assume 100 items being engaged by 500 persons; the resulting data frame looks as shown below (first 5 persons).

item1 item2 item3 item4 item5 . . . item97 item98 item99 item100
    1     0     1     0     1 . . .      1      1      0       1
    1     0     1     0     0 . . .      1      0      0       1
    0     1     0     0     0 . . .      0      1      1       0
    1     0     1     0     1 . . .      1      1      0       1
    0     0     1     1     1 . . .      0      0      0       1

If the number of items (columns) is less than 63, everything works well. But if the number of items (columns) is above 63, I get an error message saying

  "Error in model.matrix.default(formula, mframe) : model frame and formula mismatch in model.matrix()"

I cannot figure out why this happens. Could someone explain to me why this happens or how to overcome it?

Assume the data frame is called DATA. The macro I am running is as follows:

  library(poLCA)
  DATA.int = DATA + 1                 # poLCA can only analyze 1,2,...
  f <- as.formula(paste("cbind(", paste(colnames(DATA.int), collapse = ","), ")~1"))  # all items are dependent variables
  fit <- list()                       # collect fits
  Kmax = 5                            # maximum nr of classes
  bic = rep(0, Kmax)                  # vector of BIC values
  ll = rep(0, Kmax)                   # vector of loglikelihood values
  for (j in 1:Kmax) {                 # fits for #classes = 1,2,...,Kmax
    cat(j, "\n")                      # print current analysis number
    fit[[j]] <- poLCA(f, DATA.int, nclass = j, nrep = 20, verbose = FALSE)  # 20 random starts
    bic[j] <- fit[[j]]$bic            # collect BICs
    ll[j] <- fit[[j]]$llik            # collect logliks
  }

Thanks in advance,
Kadengye Trevor

--
View this message in context: http://r.789695.n4.nabble.com/poLCA-package-inquiry-tp4631698.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
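Since the real data are not posted, a small simulated stand-in can make the macro reproducible. The sketch below is only an illustration: the sizes, the seed, and the response probability are arbitrary assumptions, and it uses 10 items so that it stays well inside the range reported as working. The poLCA call uses the same arguments as the post and is skipped when the package is not installed.

```r
# Simulated stand-in for DATA: 500 persons by 10 binary (0/1) items.
# Sizes, seed and probability are arbitrary choices for illustration.
set.seed(1)
n_persons <- 500
n_items   <- 10
DATA <- as.data.frame(matrix(rbinom(n_persons * n_items, size = 1, prob = 0.5),
                             nrow = n_persons, ncol = n_items))
colnames(DATA) <- paste0("item", seq_len(n_items))

DATA.int <- DATA + 1   # recode 0/1 to 1/2, as in the post

# Same formula construction as in the post: all items on the left-hand side
f <- as.formula(paste("cbind(", paste(colnames(DATA.int), collapse = ","), ")~1"))

Kmax <- 3
bic  <- rep(NA_real_, Kmax)

# Fit only when poLCA is installed, so the data-generation part runs anywhere
if (requireNamespace("poLCA", quietly = TRUE)) {
  for (j in seq_len(Kmax)) {
    fit_j  <- poLCA::poLCA(f, DATA.int, nclass = j, nrep = 2, verbose = FALSE)
    bic[j] <- fit_j$bic
  }
  print(which.min(bic))   # number of classes with the smallest BIC
}
```

Raising n_items above 63 in this sketch is a quick way to check whether the error from the post still appears in a given R and poLCA version.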
[R] model frame and formula mismatch with latent class analysis poLCA
Dear R-users,

I keep getting an error saying

  "Error in model.matrix.default(formula, mframe) : model frame and formula mismatch in model.matrix()"

when I fit poLCA with more than 63 variables. Below are the details.

I am trying to do a latent class analysis using poLCA. My data set contains binary scores of, for instance, 200 students on 100 items. These numbers could grow further in due course. The data frame on which I want to perform LCA looks as shown below (first five persons). Each row corresponds to the scores of one person on the 100 items.

item1 item2 item3 item4 item5 item6 . . . item97 item98 item99 item100
    1     1     0     1     1     1 . . .      1      0      1       1
    0     0     0     0     1     1 . . .      1      0      1       0
    0     1     0     1     0     1 . . .      0      1      0       1
    1     1     0     1     1     1 . . .      1      0      1       1
    1     1     0     1     1     1 . . .      1      1      1       1

On this data frame (here named datax), I perform LCA as follows:

  library(poLCA)
  datax.int = datax + 1               ### poLCA can only analyze 1,2,...
  f <- as.formula(paste("cbind(", paste(colnames(datax.int), collapse = ","), ")~1"))  # all items are dependent variables
  fit <- list()                       # collect fits
  Kmax = 5                            # maximum nr of classes
  bic = rep(0, Kmax)                  # vector of BIC values
  ll = rep(0, Kmax)                   # vector of loglikelihood values
  for (j in 1:Kmax) {                 # fits for #classes = 1,2,...,Kmax
    cat(j, "\n")                      # print current analysis number
    fit[[j]] <- poLCA(f, datax.int, nclass = j, nrep = 20, verbose = FALSE)  # 20 random starts
    bic[j] <- fit[[j]]$bic            # collect BICs
    ll[j] <- fit[[j]]$llik            # collect logliks
  }

Then I get the error quoted above.

What confuses me is that the macro runs just fine when the number of items is restricted to 63 or fewer. I have checked this for 200 and 500 persons. If the number of columns (items) is 63 or fewer, I do not get an error. Mind you, my dataset can contain hundreds of items from thousands of persons.

I wonder where I am going wrong. Any ideas?

Thank you in advance!!
Trevor

--
View this message in context: http://r.789695.n4.nabble.com/model-frame-and-formula-mismatch-with-latent-class-analysis-poLCA-tp4631694.html
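One possible explanation for the threshold at 63, which is not confirmed anywhere in this thread: as far as I understand, model.matrix.default() in R versions of that period matched the deparsed terms of the formula against the model-frame column names, deparsing with a width cutoff of 500 characters. A cbind(...) response that deparses to more than 500 characters would then be split into several strings and no longer match. The sketch below only measures how long that term is for columns named item1, item2, ...; response_nchar is a hypothetical helper, not part of poLCA or base R.

```r
# Length of the response term "cbind(item1, item2, ...)" as R would deparse it
# (names separated by ", "), for columns named item1, item2, ..., item<n>.
response_nchar <- function(n_items) {
  items <- paste0("item", seq_len(n_items))
  nchar(paste0("cbind(", paste(items, collapse = ", "), ")"))
}

response_nchar(63)   # 500
response_nchar(64)   # 508
```

With these column names, 63 items give exactly 500 characters and 64 items exceed that, which lines up with the behaviour described in the post. If this reading is right, shorter column names would raise the limit rather than remove it.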
Re: [R] Extract estimates from each dataset: MI package
Thanks Joris,

This works best for me!!! :)

Thanks once more,
Trevor

--
View this message in context: http://r.789695.n4.nabble.com/Extract-estimates-from-each-dataset-MI-package-tp2259864p2260191.html
Re: [R] Extract estimates from each dataset: MI package
Thanks Joris, and pardon me for assuming too much. Let me add more information.

My data set is very large and it is nested, with repeated measurements. This is a sample of the dataset:

id sex lang sch      age chapt item length Resp
 1   1    0   8 27.02095     3    1      4    0
 1   1    0   8 27.02095     3    2     10    0
 1   1    0   8 27.02095     1    3     10    0
 1   1    0   8 27.02095     2    4     68    0
 1   1    0   8 27.02095     2    5     63   NA
 2   1    1   4 21.04946     3    1      4   NA
 2   1    1   4 21.04946     3    2     10    1
 2   1    1   4 21.04946     1    3     10    0
 2   1    1   4 21.04946     2    4     68   NA
 2   1    1   4 21.04946     2    5     63   NA
 3   1    0   1 29.69218     3    1      4   NA
 3   1    0   1 29.69218     3    2     10    1
 3   1    0   1 29.69218     1    3     10    1
 3   1    0   1 29.69218     2    4     68    1
 3   1    0   1 29.69218     2    5     63    1
 4   1    0   3 26.95328     3    1      4    0
 4   1    0   3 26.95328     3    2     10   NA
 4   1    0   3 26.95328     1    3     10    1
 4   1    0   3 26.95328     2    4     68    0
 4   1    0   3 26.95328     2    5     63   NA

The imputation model and the model I am fitting are as follows:

  imp <- mi(mydata, n.iter = 6, n.imp = 3, rand.imp.method = "bootstrap",
            preprocess = FALSE, run.past.convergence = FALSE,
            check.coef.convergence = TRUE, add.noise = FALSE, post.run = FALSE)
  model <- lmer.mi(Resp ~ 1 + sex + lang + age + length + (1|id) + (1|item) +
                   (1|sch) + (1|chapt), imp, family = binomial(link = "logit"))
  print(model)
  display(model)

After fitting a model, I can use display(model) to view the pooled estimates as well as the estimates for each imputed dataset. I can also view these by typing print(model). However, I would like to know how I can extract the estimates for a single imputed dataset. I have tried several commands for the first imputed dataset, like mi.pooled$coefficients[[1]], summary(model$analyses[[1]]), etc., but none of them work and I keep getting the error

  "object of type 'closure' is not subsettable"

I hope this makes my question clear.

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/Extract-estimates-from-each-dataset-MI-package-tp2259864p2260019.html
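A generic way to get at per-imputation estimates, independent of the mi package, is to fit the model once per completed dataset and keep the fits in a list. The sketch below is not the mi API and not the solution referred to in the thread: the completed datasets are simulated by a hypothetical make_completed() helper, and a plain glm() is swapped in for the multilevel model so that the example runs with base R only.

```r
set.seed(42)

# Hypothetical stand-in for one completed (imputed) dataset
make_completed <- function(n = 200) {
  d <- data.frame(sex = rbinom(n, 1, 0.5),
                  age = rnorm(n, mean = 25, sd = 3))
  d$Resp <- rbinom(n, 1, plogis(-0.5 + 0.4 * d$sex))
  d
}
completed <- replicate(3, make_completed(), simplify = FALSE)

# One fit per completed dataset, kept in a list
fits <- lapply(completed, function(d) {
  glm(Resp ~ sex + age, data = d, family = binomial(link = "logit"))
})

# Estimates of the first imputed dataset only
coef(fits[[1]])

# All imputations side by side: one column per imputed dataset
est <- sapply(fits, coef)
colnames(est) <- paste0("imp", seq_along(fits))
est
```

Keeping the fits in an ordinary list means that the usual extractors (coef, summary, vcov) work on each element without relying on the internal structure of a pooled object.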
[R] Extract estimates from each dataset: MI package
Dear All,

I am currently using the mi package (Su, Gelman, Hill and Yajima) to make multiple imputations of my dataset with missing values.

After fitting a model, I can use display(model) to view the pooled estimates as well as the estimates for each imputed dataset. I can also view these by typing print(model). However, I would like to know how I can extract the estimates for a single imputed dataset. I have tried several commands for the first imputed dataset, like mi.pooled$coefficients[[1]], summary(model$analyses[[1]]), etc., but none of them work and I keep getting the error

  "object of type 'closure' is not subsettable"

Anyone with an idea?

Trevor

--
View this message in context: http://r.789695.n4.nabble.com/Extract-estimates-from-each-dataset-MI-package-tp2259864p2259864.html
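The error message itself says that $ or [[ was applied to a function (a "closure") rather than to a fitted object. That would happen, for example, if a name such as mi.pooled refers to a function instead of a stored result; whether that is the case here is an assumption. A minimal base R illustration, using mean as a stand-in for any function:

```r
# 'mean' is a function (a closure), so subsetting it with $ fails
msg <- tryCatch(mean$coefficients, error = function(e) conditionMessage(e))
msg

# Checking what a name refers to before subsetting it
is.function(mean)   # TRUE
```

Running is.function() or class() on the object being subsetted is a quick way to tell whether the name points at a fit or at a function.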
[R] MICE Package and LMER
Hi R users,

I am estimating a multilevel model using lmer. My dataset has missing values and I am using the MICE package to make multiple imputations. Everything works well until I reach the pooling stage using the pool() function. I am able to get a summary of the pooled fixed effects but not of the random effects. No errors or warnings are given.

I checked the help file in R, and the developers of MICE note that "The function pools also estimates obtained with lme() and lmer(), BUT only the fixed part of the model."

Does anyone have any ideas on how I can get a summary of pooled random effects?

Below is my code:

  imp <- mice(mydata, m = 3, imputationMethod = c("", "", "", "logreg"), maxit = 2, pri = FALSE)
  model <- with(data = imp, lmer(miss ~ sex + age + (1|id) + (1|sch), family = binomial(link = "logit")))
  result <- pool(model)
  summary(result)

Thanks
Trevor

--
View this message in context: http://r.789695.n4.nabble.com/MICE-Package-and-LMER-tp2254504p2254504.html
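One ad hoc option, offered here as an assumption rather than as something pool() provides, is to pull the variance components out of each per-imputation fit and average them. The sketch assumes random intercepts only (one variance per grouping factor); extract_vc and pool_vc are hypothetical helper names, and the numbers in toy are made up purely to show the shape of the input and output.

```r
# Variance components of one fitted lmer/glmer model as a named vector.
# Needs lme4; assumes one random intercept per grouping factor.
extract_vc <- function(fit) {
  vc <- as.data.frame(lme4::VarCorr(fit))
  setNames(vc$vcov, vc$grp)
}

# Average a list of named vectors, one vector per imputed dataset
pool_vc <- function(vc_list) {
  rowMeans(do.call(cbind, vc_list))
}

# Made-up numbers, only to show the shape of the input and output
toy <- list(c(id = 1.0, sch = 0.2),
            c(id = 1.2, sch = 0.4),
            c(id = 1.1, sch = 0.3))
pool_vc(toy)

# With a mice fit as in the post, the call would look like:
# pool_vc(lapply(model$analyses, extract_vc))
```

This gives a pooled point estimate only. It says nothing about the uncertainty of the variance components, which would need separate treatment.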
Re: [R] score counts in an aggregate function
Thanks Ista,

Works well.

Trevor

--
View this message in context: http://n4.nabble.com/score-counts-in-an-aggregate-function-tp2007152p2011057.html
[R] score counts in an aggregate function
Dear R-Users,

I have a big data set "mydata" with repeated observations and some missing values. It has the format below:

userid sex item score1 score2
     1   0    1      1      1
     1   0    2      0      1
     1   0    3     NA      1
     1   0    4      1      0
     2   1    1      0      1
     2   1    2     NA      1
     2   1    3      1     NA
     2   1    4     NA      0
     3   0    1      1      0
     3   0    2      1     NA
     3   0    3      1      0
     3   0    4      0      0

I would like to summarise the dataset so that I get something in this format:

userid sumscore1 countscore1 meanscore1 sumscore2 countscore2 meanscore2
     1         2           3       0.67         3           4       0.75
     2         1           2       0.5          2           3       0.67
     3         3           4       0.75         0           3       0.00

I tried using

  means <- data.frame(aggregate(mydata[, 4:5], by = list(mydata$userid), FUN = "mean", na.rm = TRUE))

and

  sums <- data.frame(aggregate(mydata[, 4:5], by = list(mydata$userid), FUN = "sum", na.rm = TRUE))

so that I could merge the two data frames later. This works quite well, but I still cannot find a function that gives me a data frame of the counts (the number of non-missing scores per user). Something like this:

  counts <- data.frame(aggregate(mydata[, 4:5], by = list(mydata$userid), FUN = "count", na.rm = TRUE))

Any advice?

Trevor
Belgium

--
View this message in context: http://n4.nabble.com/score-counts-in-an-aggregate-function-tp2007152p2007152.html
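One way to get the counts with aggregate() alone is to pass an anonymous function that counts the non-missing values, sum(!is.na(x)). This is a sketch using the sample rows from the post, not necessarily the answer given in the thread; the object names by_user, scores and result are introduced here for illustration.

```r
# Sample data as given in the post
mydata <- data.frame(
  userid = rep(1:3, each = 4),
  sex    = rep(c(0, 1, 0), each = 4),
  item   = rep(1:4, times = 3),
  score1 = c(1, 0, NA, 1,   0, NA, 1, NA,   1, 1, 1, 0),
  score2 = c(1, 1, 1, 0,    1, 1, NA, 0,    0, NA, 0, 0)
)

by_user <- list(userid = mydata$userid)
scores  <- mydata[, c("score1", "score2")]

sums   <- aggregate(scores, by = by_user, FUN = sum,  na.rm = TRUE)
means  <- aggregate(scores, by = by_user, FUN = mean, na.rm = TRUE)
# Count of non-missing scores per user
counts <- aggregate(scores, by = by_user, FUN = function(x) sum(!is.na(x)))

# Assemble the summary in the column order requested in the post
result <- data.frame(
  userid      = sums$userid,
  sumscore1   = sums$score1,
  countscore1 = counts$score1,
  meanscore1  = round(means$score1, 2),
  sumscore2   = sums$score2,
  countscore2 = counts$score2,
  meanscore2  = round(means$score2, 2)
)
result
```

Because all three aggregate() calls group by the same variable, their rows come back in the same order, so the columns can be combined directly without a merge.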