>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of David Parkhurst
>Sent: Friday, March 14, 2003 9:35 AM
>To: [EMAIL PROTECTED]
>Subject: [R] length() misbehaving?
>
>
>I'm having a weird problem with length(), in R1.6.1 under
>windows2000. I have a dataframe called byyr, with ten
>columns, the first of which is named cnd95.
>summary(byyr) shows that byyr$cnd95 contains the factor level
>"tr" 66 times. Also, when I enter byyr$cnd95 at the command
>line, I can count 66 "tr" elements in the resulting vector.
>However, when I enter
>
>n95trt <- length(byyr$cnd95[byyr$cnd95=="tr"])
>n95trt
>
>the result is 68! Any ideas why this is happening, and how I
>can fix the miscount? (That column also contains 69 entries of
>"c", and (relevantly?) two NA's.)
>
>Thanks for any help.
>
>Dave Parkhurst
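
It is expected. Since NA represents a true unknown, the two NA's in
your vector 'may be' a "tr". Thus, the comparison byyr$cnd95 == "tr"
returns NA (not TRUE or FALSE) for those two elements. Indexing with
an NA in a logical vector returns an NA element, so the subset holds
the 66 "tr" entries plus two NA's, and length() reports 68.

Instead of taking length() of the subset, you might want to sum the
logical comparison itself:

  sum(byyr$cnd95 == "tr", na.rm = TRUE)

which will remove the two NA's and count only the TRUE values. Note
that sum() has to be applied to the comparison, not to the subsetted
factor, since sum() is not defined for factors.

See ?sum

HTH,

Marc Schwartz

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help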
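
For anyone who wants to reproduce the count, here is a minimal sketch. The vector cnd95 below is a made-up stand-in for byyr$cnd95, built only from the counts quoted in the question (66 "tr", 69 "c", two NA's); the names is_tr, n_sum and n_which are illustrative and not from the original post.

```r
# Hypothetical stand-in for byyr$cnd95: 66 "tr", 69 "c", and 2 NA values
cnd95 <- factor(c(rep("tr", 66), rep("c", 69), NA, NA))

# The comparison is three-valued: TRUE, FALSE, or NA for missing entries
is_tr <- cnd95 == "tr"
table(is_tr, useNA = "ifany")

# Indexing with an NA returns an NA element, so the subset keeps
# the 66 "tr" entries plus the 2 NA's
length(cnd95[is_tr])               # 68

# Two ways to count only the definite matches
n_sum   <- sum(is_tr, na.rm = TRUE)  # 66: sums TRUE values, ignoring NA
n_which <- length(which(is_tr))      # 66: which() drops NA indices
```

Either form should give 66 on data shaped like this; which() is handy when the matching row indices are needed afterwards as well as the count.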