This method worked perfectly! The rle() function was key and I was completely unfamiliar with it.

Thanks so much,
Max
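For anyone else who has not met rle() before, here is a minimal sketch of what it returns for a logical vector like the one used in the quoted reply below. The vector `ix` is typed in by hand here to mirror the NA pattern of the sample data, and `run_id` is just an illustrative name.

```r
# TRUE where a row holds data, FALSE where the row is NA
# (hand-typed to mirror the sample data in the original question)
ix <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)

r <- rle(ix)   # run-length encoding: consecutive equal values are collapsed
r$values       # TRUE FALSE TRUE FALSE TRUE  (one entry per run)
r$lengths      # 2 3 2 1 2                   (how long each run is)

# Counting the TRUE runs as they appear gives each run of data its own number
run_id <- cumsum(r$values)   # 1 1 2 2 3

# inverse.rle() expands the runs back into the original vector
identical(inverse.rle(r), ix)   # TRUE
```

Each TRUE run is one "leak", so the running count of TRUE runs is exactly the indicator being asked for.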
On Thu, May 24, 2012 at 8:52 AM, Rui Barradas <ruipbarra...@sapo.pt> wrote:

> Hello,
>
> Assuming that 'd' is your original data.frame and that you've set entire
> rows to NA, try this
>
>
> d$leak_num <- NA
> ix <- !is.na(d[, 1])  # any column will do, entire row is NA
> ## alternative, if other rows may have NAs, due to something else
> #ix <- apply(d, 1, function(x) all(!is.na(x)))
> r <- rle(ix)
> v <- cumsum(r$values)
> d$leak_num[ix] <- rep(v[r$values], r$lengths[r$values])
> d
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 24-05-2012 11:00, Max Brondfield <mbro...@post.harvard.edu> wrote:
>
>> Date: Wed, 23 May 2012 16:42:02 -0400
>> From: Max Brondfield <mbro...@post.harvard.edu>
>> To: r-help@r-project.org
>> Subject: [R] Using NA as a break point for indicator variable?
>> Message-ID: <CADu+jDpcJUHZTXxrsxyQvjaEmw_N0iLbL6ZJjHZC-rSBCMneiw@mail.gmail.com>
>> Content-Type: text/plain
>>
>>
>> Hi all,
>> I am working with a spatial data set for which I am only interested in
>> high concentration values ("leaks"). The low values (< 90th percentile)
>> have already been turned into NA's, leaving me with a matrix like this:
>>
>> < CH4_leak
>>
>>          lon      lat      CH4
>> 1  -71.11954 42.35068 2.595834
>> 2  -71.11954 42.35068 2.595688
>> 3         NA       NA       NA
>> 4         NA       NA       NA
>> 5         NA       NA       NA
>> 6  -71.11948 42.35068 2.435762
>> 7  -71.11948 42.35068 2.491003
>> 8         NA       NA       NA
>> 9  -71.11930 42.35068 2.464475
>> 10 -71.11932 42.35068 2.470865
>>
>> Every time an NA comes up, it means the "leak" is gone, and the next valid
>> value would represent a different leak (at a different location). My goal
>> is to tag all of the remaining values with an indicator variable to
>> spatially distinguish the leaks.
>> I am envisioning a simple numeric indicator such as:
>>
>>          lon      lat      CH4 leak_num
>> 1  -71.11954 42.35068 2.595834        1
>> 2  -71.11954 42.35068 2.595688        1
>> 3         NA       NA       NA       NA
>> 4         NA       NA       NA       NA
>> 5         NA       NA       NA       NA
>> 6  -71.11948 42.35068 2.435762        2
>> 7  -71.11948 42.35068 2.491003        2
>> 8         NA       NA       NA       NA
>> 9  -71.11930 42.35068 2.064475        3
>> 10 -71.11932 42.35068 2.070865        3
>>
>> Does anyone have any thoughts on how to code this, perhaps using the NA
>> values as a "break point"? The data set is far too large to do this
>> manually, and I must admit I'm completely at a loss. Any help would be
>> much appreciated! Best,
>>
>> Max
>>
>> [[alternative HTML version deleted]]
>>
>>

--
Max Brondfield
Research Assistant
Department of Geography & Environment
Boston University

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
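In the spirit of the posting guide, here is a self-contained sketch that rebuilds the ten sample rows from the question and applies the rle() approach from Rui's reply. The data frame is typed in from the first table in the thread; `starts` and `leak_alt` are illustrative names, and the second method (a running count of run starts) is only an assumed alternative added as a cross-check, not something proposed in the thread.

```r
# Sample data typed in from the first table in the question
d <- data.frame(
  lon = c(-71.11954, -71.11954, NA, NA, NA,
          -71.11948, -71.11948, NA, -71.11930, -71.11932),
  lat = c(42.35068, 42.35068, NA, NA, NA,
          42.35068, 42.35068, NA, 42.35068, 42.35068),
  CH4 = c(2.595834, 2.595688, NA, NA, NA,
          2.435762, 2.491003, NA, 2.464475, 2.470865)
)

# rle() approach, following the reply quoted above
ix <- !is.na(d$CH4)         # TRUE for rows that belong to a leak
r  <- rle(ix)               # runs of leak / non-leak rows
v  <- cumsum(r$values)      # running count of leak runs
d$leak_num <- NA_integer_   # start with every row untagged
d$leak_num[ix] <- rep(v[r$values], r$lengths[r$values])

# Assumed alternative, for comparison only: a leak starts wherever a
# data row follows an NA row (or is the very first row)
starts   <- ix & !c(FALSE, head(ix, -1))
leak_alt <- ifelse(ix, cumsum(starts), NA)

d$leak_num                       # 1 1 NA NA NA 2 2 NA 3 3
identical(d$leak_num, leak_alt)  # TRUE
```

Both versions assume that a leak row never has an NA in `CH4`; if NAs can occur for other reasons, the definition of `ix` would need to change accordingly.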