Re: [R] Drop non-integers
On Nov 17, 2010, at 6:57 PM, Sam Albers wrote:

> On Wed, Nov 17, 2010 at 3:49 PM, David Winsemius wrote:
>
>> On Nov 17, 2010, at 6:27 PM, Sam Albers wrote:
>>
>>> Hello all,
>>>
>>> I have a fairly simple data manipulation question. Say I have a
>>> dataframe like this:
>>>
>>> dat <- as.data.frame(runif(7, 3, 5))
>>> dat$cat <- factor(c("1","4","13","1","4","13","13A"))
>>>
>>> dat
>>>   runif(7, 3, 5) cat
>>> 1       3.880020   1
>>> 2       4.062800   4
>>> 3       4.828950  13
>>> 4       4.761850   1
>>> 5       4.716962   4
>>> 6       3.868348  13
>>> 7       3.420944 13A
>>>
>>> Under the dat$cat variable the 13A value is an analytical replicate.
>>> For my purposes I would like to drop all values that are not an
>>> integer (i.e. 13A) from the dataframe. Can anyone recommend a way to
>>> drop all rows where the cat value is a non-integer?
>>
>> dat[!is.na(as.numeric(as.character(dat$cat))), ]
>>
>> (You do get a warning about coercion to NA's but that is a good sign
>> since that is what we were trying to exclude in the first place.)
>
> Apologies. This worked fine but I didn't quite outline that I also
> wanted to drop the unused levels of the factor as well. drop=TRUE
> doesn't seem to work, so can anyone suggest a way to drop the factor
> levels in addition to the values?
>
> > sd <- dat[!is.na(as.numeric(as.character(dat$cat))), ]
> Warning message:
> In `[.data.frame`(dat, !is.na(as.numeric(as.character(dat$cat))),  :
>   NAs introduced by coercion
> > str(sd)
> 'data.frame':   6 obs. of  2 variables:
>  $ runif(7, 3, 5): num  3.88 4.06 4.83 4.76 4.72 ...
>  $ cat           : Factor w/ 4 levels "1","13","13A",..: 1 4 2 1 4 2

Right. Well, pretty simple actually:

dat2 <- dat[!is.na(as.numeric(as.character(dat$cat))), ]
dat2$cat <- factor(dat2$cat)  # removes the no-longer-existent levels

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
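For readers finding this thread later, here is a minimal sketch of the two steps above run end to end. The column name `value`, the `set.seed()` call, and the intermediate names `num` and `dat3` are my additions for readability and are not from the posts. The sketch also shows `droplevels()` as an alternative to re-calling `factor()`; as far as I know it was added in R 2.12.0, so it may not exist in older installations.

```r
# Example data, mirroring the original post (set.seed and the column
# name 'value' are illustrative additions)
set.seed(1)
dat <- data.frame(
  value = runif(7, 3, 5),
  cat   = factor(c("1", "4", "13", "1", "4", "13", "13A"))
)

# Convert factor -> character -> numeric; "13A" becomes NA.
# suppressWarnings() hides the expected "NAs introduced by coercion".
num  <- suppressWarnings(as.numeric(as.character(dat$cat)))
dat2 <- dat[!is.na(num), ]

# Option 1 (from the reply above): rebuild the factor from its current values
dat2$cat <- factor(dat2$cat)

# Option 2 (assumes R >= 2.12.0): drop unused levels of every factor column
dat3 <- droplevels(dat[!is.na(num), ])

levels(dat2$cat)
levels(dat3$cat)
```

Both options should leave only the levels that still occur in the remaining rows; which one to use is mostly a matter of taste and R version.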
Re: [R] Drop non-integers
On Wed, Nov 17, 2010 at 3:49 PM, David Winsemius wrote:

> On Nov 17, 2010, at 6:27 PM, Sam Albers wrote:
>
>> Hello all,
>>
>> I have a fairly simple data manipulation question. Say I have a
>> dataframe like this:
>>
>> dat <- as.data.frame(runif(7, 3, 5))
>> dat$cat <- factor(c("1","4","13","1","4","13","13A"))
>>
>> dat
>>   runif(7, 3, 5) cat
>> 1       3.880020   1
>> 2       4.062800   4
>> 3       4.828950  13
>> 4       4.761850   1
>> 5       4.716962   4
>> 6       3.868348  13
>> 7       3.420944 13A
>>
>> Under the dat$cat variable the 13A value is an analytical replicate.
>> For my purposes I would like to drop all values that are not an
>> integer (i.e. 13A) from the dataframe. Can anyone recommend a way to
>> drop all rows where the cat value is a non-integer?
>
> dat[!is.na(as.numeric(as.character(dat$cat))), ]
>
> (You do get a warning about coercion to NA's but that is a good sign
> since that is what we were trying to exclude in the first place.)

Apologies. This worked fine but I didn't quite outline that I also
wanted to drop the unused levels of the factor as well. drop=TRUE
doesn't seem to work, so can anyone suggest a way to drop the factor
levels in addition to the values?

> sd <- dat[!is.na(as.numeric(as.character(dat$cat))), ]
Warning message:
In `[.data.frame`(dat, !is.na(as.numeric(as.character(dat$cat))),  :
  NAs introduced by coercion
> str(sd)
'data.frame':   6 obs. of  2 variables:
 $ runif(7, 3, 5): num  3.88 4.06 4.83 4.76 4.72 ...
 $ cat           : Factor w/ 4 levels "1","13","13A",..: 1 4 2 1 4 2

>> Sorry for the simple question and thanks in advance.
>>
>> Sam

--
*
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*
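A possible explanation for why drop=TRUE appears to do nothing here, offered as a sketch rather than a diagnosis of the exact call that was tried: in data frame indexing, `drop` only controls whether the result is simplified to a lower dimension, whereas in factor indexing `drop = TRUE` discards unused levels. The data values and the names `keep`, `rows_kept` and `cat_only` below are made up for illustration.

```r
# Small stand-in for the data in this thread (the numeric values are made up)
dat <- data.frame(
  value = c(3.9, 4.1, 4.8, 4.8, 4.7, 3.9, 3.4),
  cat   = factor(c("1", "4", "13", "1", "4", "13", "13A"))
)
keep <- dat$cat != "13A"

# Data frame indexing: drop= is about simplifying dimensions,
# so the factor keeps all four of its original levels.
rows_kept <- dat[keep, , drop = TRUE]
nlevels(rows_kept$cat)

# Factor indexing: drop = TRUE removes levels that no longer occur.
cat_only <- dat$cat[keep, drop = TRUE]
nlevels(cat_only)
```

So the level-dropping form of `drop` applies to the factor column itself, not to the data frame that contains it.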
Re: [R] Drop non-integers
On 17/11/2010 6:27 PM, Sam Albers wrote:

> Hello all,
>
> I have a fairly simple data manipulation question. Say I have a
> dataframe like this:
>
> dat <- as.data.frame(runif(7, 3, 5))
> dat$cat <- factor(c("1","4","13","1","4","13","13A"))
>
> dat
>   runif(7, 3, 5) cat
> 1       3.880020   1
> 2       4.062800   4
> 3       4.828950  13
> 4       4.761850   1
> 5       4.716962   4
> 6       3.868348  13
> 7       3.420944 13A
>
> Under the dat$cat variable the 13A value is an analytical replicate.
> For my purposes I would like to drop all values that are not an
> integer (i.e. 13A) from the dataframe. Can anyone recommend a way to
> drop all rows where the cat value is a non-integer?

You can see if an entry is non-numeric using

nonnumeric <- is.na( as.numeric( as.character(dat$cat) ) )

With the data in your example, that test would be good enough. If you'd
also like to be able to rule out non-integers like 13.1, you could use
the lines:

value <- as.numeric( as.character(dat$cat) )   # get the numbers
noninteger <- value %% 1 != 0                  # see if there's a fractional part
noninteger <- noninteger | is.na(noninteger)   # get rid of the NA's from line 1

Once you have a logical vector indicating which rows to keep, use it to
index:

dat[!noninteger,]

Duncan Murdoch
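The three lines above can also be folded into a small helper. This is only a sketch: the function name `is_whole_number` and the test vector are hypothetical, and `suppressWarnings()` is my addition to silence the expected coercion warning.

```r
# Hypothetical helper: TRUE where an entry parses as a number with no
# fractional part, FALSE otherwise (including entries that become NA)
is_whole_number <- function(x) {
  value <- suppressWarnings(as.numeric(as.character(x)))
  !is.na(value) & value %% 1 == 0
}

# Illustrative values, including the 13.1 case mentioned above
cat_values <- factor(c("1", "4", "13", "13.1", "13A"))
is_whole_number(cat_values)
```

With a data frame like the one in this thread, the result would be used directly as the row index, for example `dat[is_whole_number(dat$cat), ]`. Note that anything `as.numeric()` accepts counts as a number here, so a label such as "1e3" would be kept.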
Re: [R] Drop non-integers
On Nov 17, 2010, at 6:27 PM, Sam Albers wrote:

> Hello all,
>
> I have a fairly simple data manipulation question. Say I have a
> dataframe like this:
>
> dat <- as.data.frame(runif(7, 3, 5))
> dat$cat <- factor(c("1","4","13","1","4","13","13A"))
>
> dat
>   runif(7, 3, 5) cat
> 1       3.880020   1
> 2       4.062800   4
> 3       4.828950  13
> 4       4.761850   1
> 5       4.716962   4
> 6       3.868348  13
> 7       3.420944 13A
>
> Under the dat$cat variable the 13A value is an analytical replicate.
> For my purposes I would like to drop all values that are not an
> integer (i.e. 13A) from the dataframe. Can anyone recommend a way to
> drop all rows where the cat value is a non-integer?

dat[!is.na(as.numeric(as.character(dat$cat))), ]

(You do get a warning about coercion to NA's but that is a good sign
since that is what we were trying to exclude in the first place.)

> Sorry for the simple question and thanks in advance.
>
> Sam

David Winsemius, MD
West Hartford, CT
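If the coercion warning is unwelcome, one alternative (not what the reply above proposes) is to test the labels against a pattern instead of converting them. A minimal sketch, assuming the labels of interest are plain non-negative integers written with digits only; the names `cat_values` and `keep` are illustrative.

```r
# Labels from the example in this thread
cat_values <- factor(c("1", "4", "13", "1", "4", "13", "13A"))

# TRUE where the label consists of digits only; no numeric conversion,
# so no "NAs introduced by coercion" warning
keep <- grepl("^[0-9]+$", as.character(cat_values))
keep
```

The resulting logical vector can be used as a row index in the same way, e.g. `dat[keep, ]`. Unlike the `as.numeric()` approach, this would also reject labels such as "-3" or "13.0", which may or may not be what is wanted.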
[R] Drop non-integers
Hello all,

I have a fairly simple data manipulation question. Say I have a
dataframe like this:

dat <- as.data.frame(runif(7, 3, 5))
dat$cat <- factor(c("1","4","13","1","4","13","13A"))

dat
  runif(7, 3, 5) cat
1       3.880020   1
2       4.062800   4
3       4.828950  13
4       4.761850   1
5       4.716962   4
6       3.868348  13
7       3.420944 13A

Under the dat$cat variable the 13A value is an analytical replicate. For
my purposes I would like to drop all values that are not an integer
(i.e. 13A) from the dataframe. Can anyone recommend a way to drop all
rows where the cat value is a non-integer?

Sorry for the simple question and thanks in advance.

Sam

--
*
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*