On Sep 7, 2012, at 8:03 AM, Berg, Tobias van den wrote:

> Dear all,
>
> Probably I made a beginner's mistake. While importing an SPSS file I didn't
> specify that missing values should be NA (use.missings = TRUE). Thanks to Petr
> Pikal and Bert Gunter I now know how to check how many values are known
> within a variable.
>
> Although I can fit my logistic model on this dataset, unfortunately, I
> experience the same problem after bootstrapping the original dataset at hand.
>
> The R-code so far:
>
> bootstraps<-10
>
> subsets<-list()
> for (i in 1:bootstraps){
>   subsets[[i]]<-as.matrix(sample(1:length(dat$PatID), replace=TRUE))
> }
> subsets<-lapply (subsets, function (x) {subsets <- dat[x,]})
>
> fit.subsets <-lapply (subsets, function (x) {lrm(MRI_Diag_RC ~ factor(O4_1r)
> + N6_1r + leeftijd + LO1 + LO2, model=T, x=T, y=T, data=x)})
>
> Everything is fine till I run the last line. The following result shows in R:
> Error in catg(xi, name = nam, label = lab) : LO2 has <2 category levels
>
> I checked the simulated datasets how many values within LO2 are known, using:
> lapply (subsets, function (x) {str(x$LO2)})

Instead do:

lapply(subsets, function(x) {table(x$LO2)})

You cannot tell what distribution of values you are getting with str(). Just
because a factor has 2 levels does NOT mean it has two unique values
populating those levels.

-- 
David.

>
> The result:
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> [[1]]
> NULL
>
> [[2]]
> NULL
>
> [[3]]
> NULL
>
> [[4]]
> NULL
>
> [[5]]
> NULL
>
> [[6]]
> NULL
>
> [[7]]
> NULL
>
> [[8]]
> NULL
>
> [[9]]
> NULL
>
> [[10]]
> NULL
>
> It would be great to receive ideas, comments or questions about my challenge.
>
> Kind regards, Tobias
>
>
> -----Original Message-----
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 7 September 2012 16:22
> To: Berg, Tobias van den
> CC: r-help
> Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2
> category levels
>
> Hi
>
> It is good to cc the list. Somebody could have better insight.
>
>
>>
>> Dear Petr,
>>
>> Thank you for responding. It seems right what you say. The funny thing
>> however is that the 'LO2' variable in SPSS has 2 answer categories. If
>> I look at the same variable in R, again I see 2 different values.
>
> How do you know? Any command? You should provide at least the result of
>
> str(LO2)
>
> as we do not have access to your PC.
>
>>
>> I used your "sapply" code and guess that I retrieved (per variable) the
>> number of answer categories/possible values. LO2 scores a 3 in the
>> accompanying results. Do you know how I can change that?
>
> Hm. The result of this depends on what LO2 is. If it is numeric, you have 3
> unique values. If it is a factor you can have either 3 levels or 2 levels and
> NA values (again, the str result would be helpful so that we need not guess
> what your data look like). Well, let me guess:
>
> levels(dat$LO2) says you have 3 levels: 2 meaningful ones and one that
> probably comes out as the empty string "".
>
> It should be the first level, so
>
> levels(dat$LO2)[1] <- NA
>
> should drop this spurious level. Or maybe you can get rid of such
> unwanted levels by setting na.string to the empty string during import; however,
> my knowledge of SPSS is limited, approaching zero, so I could be completely
> wrong.
>
> If your values are factors, you can change the code to
>
> sapply(sapply(dat, levels), length)
>
> and you will get 0 for numeric variables and the number of levels for factor
> variables.
> More complete insight into your data can also be found with
>
> summary(dat)
>
> Regards
> Petr
>
>
>>
>> Kind regards, Tobias
>>
>>
>> -----Original Message-----
>> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
>> Sent: Friday, 7 September 2012 15:02
>> To: Berg, Tobias van den; r-help@r-project.org
>> Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has
>> <2 category levels
>>
>> Hi
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org
>>> [mailto:r-help-bounces@r-project.org] On Behalf Of Tvandenberg
>>> Sent: Friday, September 07, 2012 1:05 PM
>>> To: r-help@r-project.org
>>> Subject: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2
>>> category levels
>>>
>>> Dear R-users,
>>>
>>> During a fit procedure in a logistic prediction model I encounter
>>> the following problem:
>>>
>>> error: in catg (xi, name=nam, label=lab): X has <2 category levels
>>
>> I do not know lrm but the error seems to explain itself: some
>> variable has only one level and should have at least 2.
>>
>> sapply(sapply(dat, unique), length)
>>
>> should give you a value of 2 or more for each variable used.
>>
>> Regards
>> Petr
>>
>>
>>>
>>> The following code is used:
>>>
>>> fit <-lrm(MRI_Diag_RC ~ factor(O4_1r) + N6_1r + leeftijd + LO1 + LO2 +
>>> LO3+ LO4+ LO5+ LO6+ LO7+ LO8+ LO9+ LO10+ LO11+ LO12+ LO13 + LO14+ LO15+
>>> LO16+ LO17+ LO18+ LO19+ LO20+ LO21+ LO22+ LO23+ LO24 + LO26+ LO27 + LO29,
>>> model=T, x=T, y=T, data=dat)
>>>
>>> Most predictors are (dichotomous) nominal variables as is the
>>> problematic "LO2". Does anyone know what the problem is and how I can
>>> correct it?
>>>
>>> Kind regards,
>>>
>>> Tobias
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/error-in-catg-xi-name-nam-label-lab-LO2-has-2-category-levels-tp4642495.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA
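
For readers who want to try the table() check from this thread, here is a minimal sketch in base R. It is not the poster's data: `dat`, `PatID` and `LO2` only mirror the names used above, and the values (40 rows, a single "yes") are invented so that a resample can plausibly miss the rare level. The names `declared`, `observed`, `counts` and `usable` are hypothetical helpers introduced for illustration. The idea is that subsetting a data frame keeps unused factor levels, so str() and nlevels() still report 2 levels even when only one value actually occurs in a bootstrap sample.

```r
# Simulated stand-in for the real data; the names follow the thread,
# the values are invented for illustration only.
set.seed(1)
n <- 40
dat <- data.frame(
  PatID = seq_len(n),
  LO2   = factor(c(rep("no", n - 1), "yes"), levels = c("no", "yes"))
)

# Draw bootstrap resamples of the rows (row indices sampled with replacement).
bootstraps <- 10
subsets <- lapply(seq_len(bootstraps), function(i) {
  dat[sample(nrow(dat), replace = TRUE), , drop = FALSE]
})

# Declared levels: subsetting keeps unused factor levels, so this stays at 2.
declared <- sapply(subsets, function(x) nlevels(x$LO2))

# Observed levels: count only the levels that actually occur in each resample.
observed <- sapply(subsets, function(x) nlevels(droplevels(x$LO2)))

# Counts per level, including zero counts and any NA values.
counts <- lapply(subsets, function(x) table(x$LO2, useNA = "ifany"))

# Flag resamples in which LO2 shows at least 2 observed levels.
usable <- observed >= 2
print(data.frame(declared, observed, usable))
```

Resamples flagged as not usable are the ones where a model with LO2 as a categorical predictor would be expected to fail in the way described above. Whether to redraw or skip such resamples is a modelling decision that this sketch does not settle.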