>-----Original Message-----
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of David
>Parkhurst
>Sent: Friday, March 14, 2003 9:35 AM
>To: [EMAIL PROTECTED]
>Subject: [R] length() misbehaving?
>
>
>I'm having a weird problem with length(), in R1.6.1 under 
>windows2000.  I have a dataframe called byyr, with ten 
>columns, the first of which is named cnd95.
>summary(byyr) shows that byyr$cnd95 contains the factor level 
>"tr" 66 times.  Also, when I enter byyr$cnd95 at the command 
>line, I can count 66 "tr" elements in the resulting vector.  
>However, when I enter
>
>n95trt <- length(byyr$cnd95[byyr$cnd95=="tr"])
>n95trt
>
>the result is 68!  Any ideas why this is happening, and how I 
>can fix the miscount? (That column also contains 69 entries of 
>"c", and (relevantly?) two NA's.)
>
>Thanks for any help.
>
>Dave Parkhurst


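This is expected behavior.

Since NA represents a true unknown, either of the two NA's in your
vector 'may be' a "tr".  The comparison byyr$cnd95 == "tr" therefore
returns NA (not TRUE or FALSE) for those two elements, and subsetting
with an NA index returns an NA element.  So the subset holds the 66
"tr" entries plus two NA's, and length() reports 68.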

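Instead of taking length() of the subset, you might want to sum the
logical comparison directly:

sum(byyr$cnd95 == "tr", na.rm = TRUE)

which drops the two NA's and counts only the TRUE values.  (Note that
sum() has to be applied to the comparison itself, not to the subsetted
factor, since sum() is not defined for factors.)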

See ?sum
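
If it helps to see where the 68 comes from, here is a small sketch using
made-up data with the same counts as in your message (66 "tr", 69 "c",
two NA's).  The standalone vector cnd95 and the name is_tr are my own
stand-ins, not your actual data frame:

```r
# Hypothetical stand-in for byyr$cnd95: 66 "tr", 69 "c", and 2 NAs
cnd95 <- factor(c(rep("tr", 66), rep("c", 69), NA, NA))

# Logical vector: TRUE for "tr", FALSE for "c", NA where cnd95 is NA
is_tr <- cnd95 == "tr"

length(cnd95[is_tr])       # 68: each NA index returns an NA element
sum(is_tr, na.rm = TRUE)   # 66: counts only the TRUE values
length(which(is_tr))       # 66: which() drops NAs from the index
sum(cnd95 %in% "tr")       # 66: %in% returns FALSE, never NA
```

Any of the last three should give the count you expected; which one to
use is mostly a matter of taste.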

HTH,

Marc Schwartz

