Hi, I'm scratching my head as to why I can't use the subset() command to remove one line of data from a data frame.
There is just one row (out of 45840) that I'd like to remove, and it can be identified using:

> dim(raw.all.clean)
[1] 45840    10
> subset(raw.all.clean, Height.1 == 0 & Height.2 == 0)
      Sample.Name Well       SNP Allele.1 Allele.2 Size.1 Size.2 Height.1
47068      CA0153  O02 rs2106776                       NA     NA        0
      Height.2 Pool
47068        0    3

(Note that the row index of 47068 is higher than the row count reported by dim() simply because I have already removed a number of rows.)

So I want to remove this one instance where Height.1 == 0 & Height.2 == 0. I'd have thought that a logical expression where Height.1 != 0 & Height.2 != 0 would have achieved this, but it doesn't drop just this one observation; instead it's dropping far more observations:

> t <- subset(raw.all.clean, Height.1 != 0 & Height.2 != 0)
> dim(t)
[1] 38150    10

Thus 7690 rows have been removed. It seems that the '&' operator is being interpreted as an 'OR' (|), since:

> dim(subset(raw.all.clean, Height.1 != 0))
[1] 42152    10
> dim(subset(raw.all.clean, Height.2 != 0))
[1] 41837    10

...and...

> dim(raw.all.clean) - dim(subset(raw.all.clean, Height.1 != 0))
[1] 3688    0
> dim(raw.all.clean) - dim(subset(raw.all.clean, Height.2 != 0))
[1] 4003    0
> 3688 + 4003
[1] 7691

(This is one more than the number of rows being removed, but given that there is one sample where both Height.1 and Height.2 are 0, that's fine.)

I thought I understood how logical expressions are constructed, and have gone back and read the entries on operator precedence, but I can't work out why the above is happening. What's particularly perplexing (to me) is that the test for exact equality works, but the test for inequality does not. I feel like I'm missing something blatantly obvious, but can't work out what it is.
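For reference, the same pattern can be reproduced on a tiny made-up data frame. This is only a sketch: the `toy` data and its values are hypothetical and are not taken from raw.all.clean. It compares the inequality test with a negation of the whole equality test:

```r
# Toy data frame; the values are hypothetical and only illustrate the logic.
toy <- data.frame(
  Height.1 = c(0, 0, 5, 7),
  Height.2 = c(0, 3, 0, 9)
)

# The one row to drop: both heights are zero.
both_zero <- subset(toy, Height.1 == 0 & Height.2 == 0)

# Keeps only rows where neither height is zero, so rows with a
# single zero are dropped as well.
neither_zero <- subset(toy, Height.1 != 0 & Height.2 != 0)

# Negating the whole equality test keeps every row except the both-zero one.
not_both_zero <- subset(toy, !(Height.1 == 0 & Height.2 == 0))

# By De Morgan's law this is the same as joining the inequalities with |.
either_nonzero <- subset(toy, Height.1 != 0 | Height.2 != 0)

nrow(both_zero)                           # 1
nrow(neither_zero)                        # 1
nrow(not_both_zero)                       # 3
identical(not_both_zero, either_nonzero)  # TRUE
```

Note that subset() treats a condition that evaluates to NA as FALSE, so any rows with NA in the Height columns would be dropped by either form.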
Cheers,
Neil