Dear R-Helpers, 

I have a large data matrix (9707 rows, 60 columns), which contains missing
data. The matrix looks something like this: 

1) X X X X X X NA X X X X X X X X X
2) NA NA NA NA X NA NA NA X NA NA
3) NA NA X NA NA NA NA NA NA NA
5) NA X NA X X X NA X X X X NA X
...
9708) X NA NA X NA NA X X NA NA X

...and so on. Notice that every row has a varying number of entries: all rows
have at least one entry, but some rows have too much missing data. My goal
is to remove rows that have roughly 5 or more missing entries (the exact
cutoff is yet to be determined, but let's say it's 5) before I compute
Pearson's correlation between all pairs of rows. The order of the columns
does not matter here.
I think that I might need to test each row for a "data, at least one NA,
data" pattern?

Is there a way of doing this? I am at a loss for an easy way to
accomplish it. Any suggestions are most appreciated!
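
To make the question concrete, here is a small made-up stand-in for my data,
together with a rough sketch of the kind of filtering I imagine. The names
`m`, `max_na`, `na_per_row`, `kept` and `row_cor` are invented for this
example, the values are not my real data, and I am not sure this is the right
or idiomatic approach. It simply counts the NAs in each row rather than
looking for any particular pattern.

```r
# Toy stand-in for the real matrix (made-up values, 4 rows x 8 columns).
m <- rbind(
  c( 1,  2,  3,  4, NA,  6,  7,  8),
  c(NA, NA, NA, NA,  5, NA, NA,  1),
  c( 2, NA,  1,  5,  3, NA,  9,  4),
  c(NA, NA,  3, NA, NA, NA, NA, NA)
)

max_na <- 5                       # assumed cutoff, still to be decided

# Count the missing entries in each row.
na_per_row <- rowSums(is.na(m))

# Keep only rows with fewer than max_na missing entries;
# drop = FALSE keeps the result a matrix even if one row survives.
kept <- m[na_per_row < max_na, , drop = FALSE]

# cor() correlates columns, so transpose to correlate rows.
# "pairwise.complete.obs" uses, for each pair of rows, only the
# columns where both rows have data.
row_cor <- cor(t(kept), use = "pairwise.complete.obs", method = "pearson")
```

With the real data, `m` would be the 60-column matrix, and changing `<` to
`<=` would decide whether rows with exactly `max_na` missing entries are kept.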

John Morrow

 



______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
