Re: [R] applying ifelse to dataframe
Thanks, the data frame is indeed clever at preserving its dimensions. I'll try your solution with the real data.

On Tue, Jun 22, 2010 at 12:23 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

> Hi
>
> r-help-boun...@r-project.org wrote on 22.06.2010 08:28:04:
>
> > The following data frame will illustrate the problem:
> >
> > DF <- data.frame(name=rep(1:5,each=2), x1=rep("A",10), x2=seq(10,19,by=1),
> >                  x3=rep(NA,10), x4=seq(20,29,by=1))
> > DF$x3[5] <- 50
> >
> > # We have a data frame. We are interested in the columns x2, x3, x4,
> > # which contain sparse values and many NA.
> > DF
> >    name x1 x2 x3 x4
> > 1     1  A 10 NA 20
> > 2     1  A 11 NA 21
> > 3     2  A 12 NA 22
> > 4     2  A 13 NA 23
> > 5     3  A 14 50 24
> > 6     3  A 15 NA 25
> > 7     4  A 16 NA 26
> > 8     4  A 17 NA 27
> > 9     5  A 18 NA 28
> > 10    5  A 19 NA 29
> >
> > # We have a list of target values that we want to search for in the data frame.
> > # If the value is in the data frame we want to keep it there, otherwise
> > # replace it with NA.
> > targets <- c(11,12,13,16,19,50,27,24,22,26)
> >
> > # So we apply a test by column to the last 3 columns using the %in% test.
> > # This gives us a mask of whether the data frame 'contains' elements in the
> > # target list.
> > mask <- apply(DF[,3:5], 2, "%in%", targets)
> > mask
> >          x2    x3    x4
> >  [1,] FALSE FALSE FALSE
> >  [2,]  TRUE FALSE FALSE
> >  [3,]  TRUE FALSE  TRUE
> >  [4,]  TRUE FALSE FALSE
> >  [5,] FALSE  TRUE  TRUE
> >  [6,] FALSE FALSE FALSE
> >  [7,]  TRUE FALSE  TRUE
> >  [8,] FALSE FALSE  TRUE
> >  [9,] FALSE FALSE FALSE
> > [10,]  TRUE FALSE FALSE
> >
> > # And so DF[2,3] is equal to 11, and 11 is in the target list, so the mask is TRUE.
> > # Now something like DF <- ifelse(mask==T, DF, NA) is CONCEPTUALLY what I want
>
> Data frames are quite clever in preserving their dimensions. I would do
>
> mask = data.frame(a=TRUE, b=TRUE, !mask)
>
> to add column 1 and 2 and
>
> DF[mask] <- NA
>
> Regards
> Petr
>
> > to do. In the end I'd like a result that looks like
> >
> >    name x1 x2 x3 x4
> > 1     1  A NA NA NA
> > 2     1  A 11 NA NA
> > 3     2  A 12 NA 22
> > 4     2  A 13 NA NA
> > 5     3  A NA 50 24
> > 6     3  A NA NA NA
> > 7     4  A 16 NA 26
> > 8     4  A NA NA 27
> > 9     5  A NA NA NA
> > 10    5  A 19 NA NA
> >
> > I've tried forcing the DF and the mask into vectors so that ifelse() would
> > work, and have tried apply using ifelse, without much luck. Any thoughts?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
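The ifelse() idea in the quoted question can be made to work if it is applied to one column at a time, since ifelse() operates on vectors rather than on a whole data frame. The following is a minimal sketch, not something posted in the thread: it assumes x1 holds the string "A" (the quotes appear to have been lost in the archive), and keep_targets is a hypothetical helper name.

```r
# Rebuild the example data from the post (x1 assumed to be the string "A").
DF <- data.frame(name = rep(1:5, each = 2),
                 x1   = rep("A", 10),
                 x2   = seq(10, 19, by = 1),
                 x3   = rep(NA, 10),
                 x4   = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# Hypothetical helper: keep a value if it is a target, otherwise return NA.
keep_targets <- function(col) ifelse(col %in% targets, col, NA)

# Apply the helper to columns 3 to 5 only; name and x1 are left untouched.
result <- DF
result[3:5] <- lapply(DF[3:5], keep_targets)
print(result)
```

Working column by column sidesteps the question of how a mask and a data frame line up, at the cost of not producing a reusable mask object.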
Re: [R] applying ifelse to dataframe
Hmm

DF <- data.frame(name=rep(1:5,each=2), x1=rep("A",10), x2=seq(10,19,by=1),
                 x3=rep(NA,10), x4=seq(20,29,by=1))
DF$x3[5] <- 50
mask <- apply(sample, 2, "%in%", target)
DF
   name x1 x2 x3 x4
1     1  A 10 NA 20
2     1  A 11 NA 21
3     2  A 12 NA 22
4     2  A 13 NA 23
5     3  A 14 50 24
6     3  A 15 NA 25
7     4  A 16 NA 26
8     4  A 17 NA 27
9     5  A 18 NA 28
10    5  A 19 NA 29
mask
      [,1]  [,2]  [,3]  [,4]  [,5]
[1,] FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE
[3,]  TRUE  TRUE FALSE  TRUE FALSE
[4,] FALSE FALSE FALSE FALSE FALSE
[5,]  TRUE FALSE FALSE FALSE FALSE
mask <- data.frame(a=TRUE, b=TRUE, !mask)
DF[mask] <- NA
Error in FUN(X[[1L]], ...) :
  only defined on a data frame with all numeric variables

DF2 <- data.frame(DF[,3:5])
mask <- apply(sample, 2, "%in%", target)
mask <- data.frame(!mask)
DF2[mask] <- NA
Error in FUN(X[[1L]], ...) :
  only defined on a data frame with all numeric variables
DF2
   x2 x3 x4
1  10 NA 20
2  11 NA 21
3  12 NA 22
4  13 NA 23
5  14 50 24
6  15 NA 25
7  16 NA 26
8  17 NA 27
9  18 NA 28
10 19 NA 29
mask <- apply(DF2, 2, "%in%", target)
mask <- data.frame(!mask)
DF2[mask] <- NA
Error in FUN(X[[1L]], ...) :
  only defined on a data frame with all numeric variables

On Tue, Jun 22, 2010 at 12:23 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

> Data frames are quite clever in preserving their dimensions. I would do
>
> mask = data.frame(a=TRUE, b=TRUE, !mask)
>
> to add column 1 and 2 and
>
> DF[mask] <- NA
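The 5 x 5 mask printed in this message does not match the 10-row data frame, so it may help to check the shape and type of a mask before using it as an index. The sketch below is an illustration under assumptions, not a diagnosis of the error: it assumes `target` was meant to be the `targets` vector from the original question, builds DF2 directly rather than from DF, and indexes with the logical matrix itself instead of wrapping it in data.frame().

```r
# The three numeric columns from the example, built directly.
DF2 <- data.frame(x2 = seq(10, 19, by = 1),
                  x3 = c(rep(NA, 4), 50, rep(NA, 5)),
                  x4 = seq(20, 29, by = 1))
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# Build the mask from the same object that will be indexed.
mask <- apply(DF2, 2, "%in%", targets)

# Sanity checks: a logical matrix with the same dimensions as the data.
stopifnot(is.matrix(mask), is.logical(mask), identical(dim(mask), dim(DF2)))

# Index with the logical matrix directly; TRUE in !mask marks cells to blank out.
DF2[!mask] <- NA
print(DF2)
```

If a mask has already been turned into a data frame, converting it back with as.matrix() before indexing is one way to stay on the logical-matrix path.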
Re: [R] applying ifelse to dataframe
OK, slight modification:

DF <- data.frame(name=rep(1:5,each=2), x1=rep("A",10), x2=seq(10,19,by=1),
                 x3=rep(NA,10), x4=seq(20,29,by=1))
DF$x3[5] <- 50
targets <- c(11,12,13,16,19,50,27,24,22,26)
mask <- apply(DF[,3:5], 2, "%in%", targets)
mask <- !mask
DF[,3:5][mask] <- NA
DF
   name x1 x2 x3 x4
1     1  A NA NA NA
2     1  A 11 NA NA
3     2  A 12 NA 22
4     2  A 13 NA NA
5     3  A NA 50 24
6     3  A NA NA NA
7     4  A 16 NA 26
8     4  A NA NA 27
9     5  A NA NA NA
10    5  A 19 NA NA

Regards
Petr

steven mosher mosherste...@gmail.com wrote on 22.06.2010 09:45:08:

> Hmm
>
> mask <- data.frame(a=TRUE, b=TRUE, !mask)
> DF[mask] <- NA
> Error in FUN(X[[1L]], ...) :
>   only defined on a data frame with all numeric variables
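The modification above can be wrapped in a small function so that the column selection and the target list are explicit arguments. This is only a sketch: mask_non_targets is a hypothetical name that does not appear in the thread, and the mask is built with vapply() rather than apply() so that the selected columns are not first converted to a single matrix.

```r
# Hypothetical helper: set every value in `cols` that is not in `targets` to NA.
mask_non_targets <- function(df, cols, targets) {
  # One logical column per selected variable: TRUE where the value is a target.
  keep <- vapply(df[cols], function(col) col %in% targets, logical(nrow(df)))
  # Force a matrix with one row per data frame row, whatever vapply() returned.
  keep <- matrix(keep, nrow = nrow(df))
  # Logical-matrix indexing on the selected columns, as in the message above.
  df[cols][!keep] <- NA
  df
}

DF <- data.frame(name = rep(1:5, each = 2),
                 x1   = rep("A", 10),
                 x2   = seq(10, 19, by = 1),
                 x3   = rep(NA, 10),
                 x4   = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

cleaned <- mask_non_targets(DF, c("x2", "x3", "x4"), targets)
print(cleaned)
```

Returning a modified copy leaves the original DF unchanged, which makes it easier to compare the result with the input.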
Re: [R] applying ifelse to dataframe
On 2010-06-22 1:45, steven mosher wrote:

> Hmm
>
> DF <- data.frame(name=rep(1:5,each=2), x1=rep("A",10), x2=seq(10,19,by=1),
>                  x3=rep(NA,10), x4=seq(20,29,by=1))
> DF$x3[5] <- 50
> mask <- apply(sample, 2, "%in%", target)

This is getting confusing. What's 'sample'? What's 'target'?
Probably what you originally called 'targets'.

> mask
>       [,1]  [,2]  [,3]  [,4]  [,5]
> [1,] FALSE FALSE FALSE FALSE FALSE
> [2,] FALSE FALSE FALSE FALSE FALSE
> [3,]  TRUE  TRUE FALSE  TRUE FALSE
> [4,] FALSE FALSE FALSE FALSE FALSE
> [5,]  TRUE FALSE FALSE FALSE FALSE

This suggests that 'sample' may be a matrix, not a data frame.
Anyway, try this on your original problem:

targets <- c(11,12,13,16,19,50,27,24,22,26)
mask <- apply(DF[,3:5], 2, "%in%", targets)
is.na(DF[3:5]) <- !mask

-Peter Ehlers
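The two suggestions in this thread, the is.na() replacement form and the logical-matrix indexing form, can be run side by side on the example data to check that they agree. This is a sketch for comparison only; the names by_is_na and by_index are made up for the illustration, and x1 is assumed to be the string "A".

```r
DF <- data.frame(name = rep(1:5, each = 2),
                 x1   = rep("A", 10),
                 x2   = seq(10, 19, by = 1),
                 x3   = rep(NA, 10),
                 x4   = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# TRUE where a value in columns 3 to 5 is one of the targets.
mask <- apply(DF[, 3:5], 2, "%in%", targets)

# Form 1: mark the non-target cells as missing with is.na<-.
by_is_na <- DF
is.na(by_is_na[3:5]) <- !mask

# Form 2: assign NA through a logical-matrix index.
by_index <- DF
by_index[, 3:5][!mask] <- NA

# Compare the two results.
print(all.equal(by_is_na, by_index))
```

Which form reads better is a matter of taste: the first states the intent (these cells are missing), the second shows the indexing mechanics.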
Re: [R] applying ifelse to dataframe
Thanks for the solution.

On Tue, Jun 22, 2010 at 1:02 AM, Peter Ehlers ehl...@ucalgary.ca wrote:

> Anyway, try this on your original problem:
>
> targets <- c(11,12,13,16,19,50,27,24,22,26)
> mask <- apply(DF[,3:5], 2, "%in%", targets)
> is.na(DF[3:5]) <- !mask
>
> -Peter Ehlers