Re: [R] data frame question
Thank you both. The assumption is in fact that a and b are always the same length. Both solutions work well for me; much appreciated.

Andras

On Sunday, August 6, 2017 12:14 PM, Ulrik Stervbo wrote:
> Hi Andras,
>
> assuming that the increment is always indicated by the same value (in
> your example 0), this could work:
>
> df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
> df
>
> HTH,
> Ulrik

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] data frame question
Hi Andras,

assuming that the increment is always indicated by the same value (in your example 0), this could work:

df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
df

HTH,
Ulrik

On Sun, 6 Aug 2017 at 18:06 Bert Gunter wrote:
> Your specification is a bit unclear to me, so I'm not sure the below
> is really what you want. For example, your example seems to imply that
> a and b must be of the same length, but I do not see that your
> description requires this.
> [...]
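The original question also asked for the switch value to be configurable. A minimal sketch of how the cumsum() one-liner could be wrapped for that, assuming b has no missing values; the function name relabel_by_switch is a hypothetical choice for illustration, not something from the thread:

```r
# Hypothetical helper: count how many times `switch_val` has been seen so far.
# Assumes `b` contains no NA values (cumsum() would propagate them).
relabel_by_switch <- function(b, switch_val = 0) {
  cumsum(b == switch_val)
}

df <- data.frame(a = c(1:5, 1:8), b = c(0:4, 0:7))
df$a <- relabel_by_switch(df$b)
df$a
```

With switch_val = 3 on the same data the counter is 0 for the rows before the first 3, so the labels would need shifting if they must start at 1.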
Re: [R] data frame question
Your specification is a bit unclear to me, so I'm not sure the below is really what you want. For example, your example seems to imply that a and b must be of the same length, but I do not see that your description requires this. So the following may not be what you want exactly, but one way to do this (there may be cleverer ones!) is to make use of ?rep. Everything else is just fussy detail. (Your example suggests that you should also learn about ?seq. Both of these should be covered in any good R tutorial, which you should probably spend time with if you haven't already.)

Anyway...

## WARNING: Not thoroughly tested! May (probably :-( ) contain bugs.

f <- function(x, y, switch_val = 0)
{
    wh <- which(y == switch_val)   # positions where the switch value occurs
    len <- length(wh)
    len_x <- length(x)
    if (!len) return(x)            # no switch value found: exit with x unchanged
    if (wh[1] == 1) {
        if (len == 1) return(rep(x[1], len_x))
        wh <- wh[-1]               # a switch in position 1 starts the first run
        len <- len - 1
    }
    count <- c(wh[1] - 1, diff(wh))   # run lengths between switch positions
    if (wh[len] == len_x) count <- c(count, 1)
    else count <- c(count, len_x - wh[len] + 1)
    rep(x[seq_along(count)], times = count)
}

> a <- c(1:5, 1:8)
> b <- c(0:4, 0:7)
> f(a, b)
 [1] 1 1 1 1 1 2 2 2 2 2 2 2 2

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sun, Aug 6, 2017 at 4:10 AM, Andras Farkas via R-help wrote:
> Dear All,
>
> wonder if you have thoughts on the following:
> [...]
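The ?rep idea can be sketched more compactly for the special case in the example, assuming the first element of b is itself a switch value (otherwise the rows before the first switch would be left out). The names starts and lens are illustrative only:

```r
b <- c(0:4, 0:7)
switch_val <- 0

starts <- which(b == switch_val)            # where each run begins
lens   <- diff(c(starts, length(b) + 1))    # length of each run
a_new  <- rep(seq_along(starts), times = lens)
a_new
```

Each run is labelled by its position among the switch points, which is what rep() with a vector times argument does.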
[R] data frame question
Dear All,

I wonder if you have thoughts on the following. Let us say we have:

df <- data.frame(a = c(1,2,3,4,5,1,2,3,4,5,6,7,8), b = c(0,1,2,3,4,0,1,2,3,4,5,6,7))

I would like to rewrite the values in column "a" based on the values in column "b", where a certain value in column "b" prompts the next value of column "a". In other words, I would like to have this as a result:

df <- data.frame(a = c(1,1,1,1,1,2,2,2,2,2,2,2,2), b = c(0,1,2,3,4,0,1,2,3,4,5,6,7))

where at the value of 0 in column 'b' the number in column 'a' changes from 1 to 2. From the first zero in column 'b' until the next zero in column 'b', the numbers in 'a' would not change, i.e. they are all 1 in my example. Then 'a' would change from 2 to 3 when 'b' has a zero again, and so on.

I would be grateful for a solution that would allow me to set the value (from 'b') that determines how the values get established in 'a' (i.e. let's say instead of 0 I would want 3 to be the value where 1 changes to 2 in 'a'), and that would be flexible enough to take into account that the number of rows and the number of times 0 shows up in column 'b' may vary.

Much appreciate your thoughts.

Andras
Re: [R] data frame question
Hi Andras,

here is another solution which also works if b contains missing values:

a <- seq(0, 10, by = 1)
b <- c(NA, 11:20)
f <- 16
# a[which.max(b[b

> If it's not homework, then I'm happy to provide more help:
> [...]
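The code line in the message above is cut off in the archive. As a hedged illustration only, and not a reconstruction of what the poster wrote, one way an NA-tolerant lookup along these lines could look (the name keep is an assumption):

```r
a <- seq(0, 10, by = 1)
b <- c(NA, 11:20)
f <- 16

# Keep only positions where b is known and below f, then take the
# value of a that sits next to the largest such b.
keep <- !is.na(b) & b < f
result <- a[keep][which.max(b[keep])]
result
```

Filtering a and b with the same logical index keeps the two vectors aligned, so which.max() indexes into the matching subset of a.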
Re: [R] data frame question
If it's not homework, then I'm happy to provide more help:

a <- seq(0, 10, by = 1)
b <- c(10:20)
d <- data.frame(a = a, b = b)
f <- 16

subset(d, b < f & b == max(b[b < f]))$a

# I'd turn it into a function
getVal <- function(d, f) {
    subset(d, b < f & b == max(b[b < f]))$a
}

Sarah

On Mon, Dec 9, 2013 at 3:50 PM, Andras Farkas wrote:
> Sarah,
>
> thank you, not homework though, I guess it just looks like it. I will
> look into subset()
>
> Andras
>
> On Monday, December 9, 2013 3:45 PM, Sarah Goslee wrote:
>> Thank you for providing a reproducible example.
>> [...]

--
Sarah Goslee
http://www.functionaldiversity.org
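When no value of b lies below f, max() is called on an empty vector and warns. A minimal sketch of a guarded variant, assuming b has no missing values; getValSafe and candidates are hypothetical names, not part of the reply above:

```r
d <- data.frame(a = seq(0, 10, by = 1), b = 10:20)

# Hypothetical variant of getVal(): returns NA when nothing in b is below f.
getValSafe <- function(d, f) {
  candidates <- d[d$b < f, , drop = FALSE]
  if (nrow(candidates) == 0) return(NA_real_)
  candidates$a[candidates$b == max(candidates$b)]
}

getValSafe(d, 16)
getValSafe(d, 10)
```

The first call picks the row with b equal to 15; the second has no candidate rows and falls through to the NA branch.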
Re: [R] data frame question
Thank you for providing a reproducible example. I tweaked it a little bit to make it actually a data frame problem.

There are lots of ways to do this; here's one approach.

On second thought, this looks a lot like homework, so perhaps instead I'll just suggest using subset() with more than one condition.

Sarah

On Mon, Dec 9, 2013 at 3:27 PM, Andras Farkas wrote:
> Dear All
>
> please help with the following:
> [...]

--
Sarah Goslee
http://www.functionaldiversity.org
[R] data frame question
Dear All,

please help with the following. I have:

a <- seq(0, 10, by = 1)
b <- c(10:20)
d <- cbind(a, b)
f <- 16

I would like to select the value in column a based on a value in column b, where the value in column b is the 1st value that is smaller than f. Thus I should end up with the number 5, because the 1st value that is below 16 would be 15, and in the same row column a has the number 5.

Appreciate your insights,

andras
Re: [R] Data frame question
Hi,

Not sure if this is what you wanted:

activity <- data.frame(Name = paste0("activity", LETTERS[1:5]), stringsAsFactors = FALSE)
dates1 <- data.frame(dat = as.Date(c("2013-02-01", "2013-02-04", "2013-02-05"), format = "%Y-%m-%d"))
merge(dates1, activity)
#          dat      Name
#1  2013-02-01 activityA
#2  2013-02-04 activityA
#3  2013-02-05 activityA
#4  2013-02-01 activityB
#5  2013-02-04 activityB
#6  2013-02-05 activityB
#7  2013-02-01 activityC
#8  2013-02-04 activityC
#9  2013-02-05 activityC
#10 2013-02-01 activityD
#11 2013-02-04 activityD
#12 2013-02-05 activityD
#13 2013-02-01 activityE
#14 2013-02-04 activityE
#15 2013-02-05 activityE

# or
expand.grid(dat = dates1[, 1], Name = activity[, 1])
#          dat      Name
#1  2013-02-01 activityA
#2  2013-02-04 activityA
#3  2013-02-05 activityA
#4  2013-02-01 activityB
#5  2013-02-04 activityB
#6  2013-02-05 activityB
#7  2013-02-01 activityC
#8  2013-02-04 activityC
#9  2013-02-05 activityC
#10 2013-02-01 activityD
#11 2013-02-04 activityD
#12 2013-02-05 activityD
#13 2013-02-01 activityE
#14 2013-02-04 activityE
#15 2013-02-05 activityE

A.K.

----- Original Message -----
From: ramoss
To: r-help@r-project.org
Cc:
Sent: Monday, April 1, 2013 11:54 AM
Subject: [R] Data frame question

Hello,

I have 2 data frames: activity and dates.
[...]
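The merge() call above works as a cross join because the two frames share no column names. A minimal sketch that makes this explicit with by = NULL and puts the columns in the order the question asked for; the result name activitydat is taken from the question, the rest follows the reply:

```r
activity <- data.frame(Name = paste0("activity", LETTERS[1:5]),
                       stringsAsFactors = FALSE)
dates1 <- data.frame(dat = as.Date(c("2013-02-01", "2013-02-04", "2013-02-05")))

# by = NULL requests the Cartesian product of the rows of both frames;
# the column index afterwards only reorders the columns.
activitydat <- merge(dates1, activity, by = NULL)[, c("Name", "dat")]
head(activitydat, 3)
```

Passing dates1 first keeps the dates varying fastest, so all dates for one activity appear on consecutive rows.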
Re: [R] Data frame question
That sounds like a job for merge().

If you provide an actual reproducible example using dput(), then you will likely get some actual runnable code.

Sarah

On Mon, Apr 1, 2013 at 11:54 AM, ramoss wrote:
> Hello,
>
> I have 2 data frames: activity and dates.
> [...]

--
Sarah Goslee
http://www.functionaldiversity.org
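For readers unfamiliar with dput(): it prints an object as R code that recreates it, which is why it is the usual way to share data on the list. A small sketch with made-up data (the dates frame here is illustrative, not the poster's):

```r
dates <- data.frame(dat = as.Date(c("2013-02-01", "2013-02-04")))

# Print a text representation that can be pasted into an email.
dput(dates)

# Round trip: parse the printed text back into an object.
txt <- paste(capture.output(dput(dates)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, dates)
```

A reader can paste the dput() output after an assignment arrow and obtain the same data frame, column classes included.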
[R] Data frame question
Hello,

I have 2 data frames: activity and dates. Activity contains a variable listing all activities: activityA, activityB, etc. The dates contain all the valid business dates. I need to combine the 2 so that I get a single data frame, activitydat, that contains the activity name along with every valid business date, such as:

Name          dat
activity A    2013-02-01
activity A    2013-02-04
activity A    2013-02-05
etc.

Any thoughts? Thanks ahead for your help.

--
View this message in context: http://r.789695.n4.nabble.com/Data-frame-question-tp4662967.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Data frame question
apjawor...@mmm.com wrote:
> Thanks for the quick reply.
>
> No, I did not run into any problems so far. I have been using the PLS
> package and the modelling functions seem to work just fine. In fact,
> even if I let the data.frame convert the x matrix to separate columns,
> the "y ~ x" modeling syntax still seems to work fine.

I don't see that behaviour:

library (pls)   # provides plsr ()
rm (x)          # make sure there is no leftover x in the workspace
mat <- matrix (1 : 9, 3)
df <- data.frame (y = 1 : 3, x = mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1))                 # error
coef (plsr (y ~ x.1 + x.2 + x.3, data = df, ncomp = 1))   # works
df$x <- I (-mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1))                 # works

Claudia

PS: May I be curious: what kind of data do you analyze with PLS?

> Thanks again,
>
> Andy
>
> __
> Andy Jaworski
> 518-1-01
> Process Laboratory
> 3M Corporate Research Laboratory
> -
> E-mail: apjawor...@mmm.com
> Tel: (651) 733-6092
> Fax: (651) 736-3122
>
> From: Claudia Beleites
> To: apjawor...@mmm.com
> Cc: r-help@r-project.org
> Date: 03/12/2010 02:13 PM
> Subject: Re: [R] Data frame question
>
> Andy,
>
> Did you run into any kind of trouble?
> [...]
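The difference between the two layouts can also be seen without the pls package. The sketch below uses lm() from base R as a stand-in for plsr(), which is an assumption made only to keep the example self-contained; the names wide and nested are illustrative:

```r
set.seed(1)
mat <- matrix(rnorm(30), nrow = 10)
y <- rnorm(10)

# data.frame() splits the matrix into separate columns named x.1, x.2, x.3,
# so there is no variable called x inside the data frame.
wide <- data.frame(y = y, x = mat)
names(wide)

# Assigning the matrix afterwards stores it as a single column called x,
# which a formula such as y ~ x can find in the data.
nested <- data.frame(y = y)
nested$x <- mat
fit <- lm(y ~ x, data = nested)
names(coef(fit))
```

With the wide layout, y ~ x can only succeed if an object named x happens to exist in the workspace, which may explain why the formula appeared to work for one poster and not the other.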
Re: [R] Data frame question
Andy,

Did you run into any kind of trouble? I'm asking because I'm maintaining a package for spectroscopic data that heavily uses "I (spectra.matrix)" ...

However, once you have the matrix safe inside the data.frame, you can delete the "AsIs":

> a <- matrix (1:9, 3)
> str (a)
 int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df <- data.frame (a = I (a))
> str (df)
'data.frame':   3 obs. of  1 variable:
 $ a: 'AsIs' int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df$a <- unclass (df$a)
> str (df)
'data.frame':   3 obs. of  1 variable:
 $ a: int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df$a
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> dim (df)
[1] 3 1

However, I don't know whether something can now trigger a conversion to data.frame that the AsIs would have stopped.

Cheers,

Claudia

apjawor...@mmm.com wrote:
> Hi,
>
> I have the following question about creating data frames. I want to
> create a data frame with 2 components: a vector and a matrix.
> [...]

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 0 40 5 58-37 68
email: cbelei...@units.it
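A short sketch of two routes to a plain matrix column like the one in yarn, using the x and y from the original question. The second route (assigning the matrix with $<- after the frame exists) is an addition not mentioned in the thread; dd1 and dd2 are illustrative names:

```r
set.seed(42)
y <- rnorm(10)
x <- matrix(rnorm(150), nrow = 10)

# Route 1: protect the matrix with I(), then drop the AsIs class afterwards.
dd1 <- data.frame(x = I(x), y = y)
dd1$x <- unclass(dd1$x)

# Route 2: build the frame first and assign the matrix as one column.
dd2 <- data.frame(y = y)
dd2$x <- x

str(dd2)
```

Both frames end up with two variables, one of which is a 10 x 15 numeric matrix without the AsIs class.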
[R] Data frame question
Hi, I have the following question about creating data frames. I want to create a data frame with 2 components: a vector and a matrix. Let me use a simple example: y <- rnorm(10) x <- matrix(rnorm(150), nrow=10) Now if I do dd <- data.frame(x=x, y=y) I get a data frame with 16 colums, but if, according to the documentation, I do dd <- data.frame(x=I(x), y=y) then str(dd) gives: 'data.frame': 10 obs. of 2 variables: $ x: AsIs [1:10, 1:15] 0.700073 -0.44371 -0.46625 0.977337 0.509786 ... $ y: num 0.4676 -1.4343 -0.3671 0.0637 -0.231 ... This looks and works OK. Now, there exists a CRAN package called pls. It has a yarn data set in it. > data(yarn) > str(yarn) 'data.frame': 28 obs. of 3 variables: $ NIR: num [1:28, 1:268] 3.07 3.07 3.08 3.08 3.1 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : NULL $ density: num 100 80.2 79.5 60.8 60 ... $ train : logi TRUE TRUE TRUE TRUE TRUE TRUE ... This looks almost the same, except the matrix component in my example has the AsIs instead of num. Is this just some older behavior of the data.frame function producing this difference? If not, how can I get my data frame (dd) to look like yarn? I read the help pages for data.frame and as.data.frame and found this paragraph If a list is supplied, each element is converted to a column in the data frame. Similarly, each column of a matrix is converted separately. This can be overridden if the object has a class which has a method for as.data.frame: two examples are matrices of class "model.matrix" (which are included as a single column) and list objects of class "POSIXlt" which are coerced to class "POSIXct". 
If I do

> methods(as.data.frame)
 [1] as.data.frame.aovproj*        as.data.frame.array
 [3] as.data.frame.AsIs            as.data.frame.character
 [5] as.data.frame.complex         as.data.frame.data.frame
 [7] as.data.frame.Date            as.data.frame.default
 [9] as.data.frame.difftime        as.data.frame.factor
[11] as.data.frame.ftable*         as.data.frame.integer
[13] as.data.frame.list            as.data.frame.logical
[15] as.data.frame.logLik*         as.data.frame.matrix
[17] as.data.frame.model.matrix    as.data.frame.numeric
[19] as.data.frame.numeric_version as.data.frame.ordered
[21] as.data.frame.POSIXct         as.data.frame.POSIXlt
[23] as.data.frame.raw             as.data.frame.table
[25] as.data.frame.ts              as.data.frame.vector

so it looks like there is a matrix method for as.data.frame. The question then is: how can I override the default behavior for the matrix object (converting columns separately)?

Any hint will be appreciated,

Andy

__
Andy Jaworski 518-1-01 Process Laboratory 3M Corporate Research Laboratory - E-mail: apjawor...@mmm.com Tel: (651) 733-6092 Fax: (651) 736-3122
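For what it is worth, here is a minimal sketch of one way that may give a yarn-like layout: build the data frame from the vector first, then assign the matrix as a single column with `$<-`. This is only an assumption about how such a data set can be put together, not a statement of how yarn itself was created; dd1 and dd2 are names used purely for this illustration.

```r
set.seed(42)                        # only to make the sketch reproducible
y <- rnorm(10)
x <- matrix(rnorm(150), nrow = 10)

# data.frame() with I() keeps the matrix together but tags it with class "AsIs"
dd1 <- data.frame(x = I(x), y = y)

# Assigning the matrix afterwards stores it as one column without the AsIs class
dd2 <- data.frame(y = y)
dd2$x <- x

str(dd2)   # x should appear as a numeric matrix column, y as a plain vector
dim(dd2)   # 10 rows, 2 columns
```

The column order differs from the original example (y comes first); reordering with dd2[c("x", "y")] should fix that if it matters.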
[R] data frame question
I have a data frame containing sequences, and I am interested in changing a few sequences in a window and then swapping the original sequences back after I have completed my analysis. The temporary data frame that I am creating, seq.in.window, does not like the way I am making my assignment. The variable seq.in.window gets numbers assigned to it instead of the data contained in sequence.data. Can someone point out the mistake I am making?

Thanks ../Murli

x<-c("a","t","g","c")

sequence.data<-structure(list(
  V1 = structure(c(4L, 1L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 4L, 1L, 4L, 3L, 1L), .Label = c("a", "c", "g", "t"), class = "factor"),
  V2 = structure(c(3L, 2L, 4L, 1L, 4L, 1L, 3L, 4L, 2L, 3L, 3L, 4L, 2L, 4L, 4L, 1L, 4L, 2L, 3L, 4L), .Label = c("a", "c", "g", "t"), class = "factor"),
  V3 = structure(c(3L, 3L, 3L, 3L, 3L, 1L, 4L, 2L, 1L, 3L, 1L, 1L, 1L, 2L, 1L, 4L, 1L, 4L, 1L, 4L), .Label = c("a", "c", "g", "t"), class = "factor"),
  V4 = structure(c(3L, 2L, 1L, 2L, 2L, 3L, 4L, 2L, 1L, 4L, 3L, 4L, 2L, 1L, 4L, 2L, 4L, 4L, 4L, 3L), .Label = c("a", "c", "g", "t"), class = "factor"),
  V5 = structure(c(3L, 3L, 3L, 2L, 1L, 1L, 2L, 4L, 2L, 3L, 4L, 2L, 1L, 1L, 4L, 4L, 4L, 2L, 1L, 4L), .Label = c("a", "c", "g", "t"), class = "factor"),
  V6 = structure(c(2L, 2L, 1L, 4L, 3L, 4L, 1L, 4L, 2L, 3L, 3L, 2L, 3L, 2L, 2L, 3L, 1L, 4L, 4L, 4L), .Label = c("a", "c", "g", "t"), class = "factor")),
  .Names = c("V1", "V2", "V3", "V4", "V5", "V6"), class = "data.frame",
  row.names = c("16", "4", "1", "9", "6", "2", "15", "19", "18", "12", "13", "91", "41", "21", "151", "14", "5", "8", "181", "121"))

seq.in.window<-data.frame(matrix(0,nrow=20,ncol=5)) # Creating an empty data frame

for(i in 1:20){
  seq.in.window[i,1:5]<- sequence.data[i,1:5] #Can I do this assignment?
  print(seq.in.window[i,1:5])
  rnd.seq =as.vector(sample(x,length(1:5), replace=TRUE))
  sequence.data[i,5] =t(rnd.seq)
  print(sequence.data[i,1:5])
  cat("\n")
}

for(i in 1:20){
  sequence.data[i,1:5]=seq.in.window[i,1:5]
}
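A possible explanation, offered as a guess rather than a diagnosis: the columns of sequence.data are factors, and copying factor values into a data frame that was pre-filled with numeric zeros most likely stores their underlying integer codes. A minimal sketch of one way around this is to copy the window as a data frame in one step, so the factor columns stay factors. The small data set below is a hypothetical stand-in for sequence.data, and `original` is a name used only to check the round trip.

```r
set.seed(1)
x <- c("a", "t", "g", "c")

# Hypothetical stand-in for sequence.data: 6 rows, 6 factor columns
cols <- lapply(1:6, function(j) factor(sample(x, 6, replace = TRUE), levels = sort(x)))
names(cols) <- paste0("V", 1:6)
sequence.data <- as.data.frame(cols)
original <- sequence.data            # kept only to verify the swap at the end

# Copy the window directly; no pre-allocated numeric data frame is needed
seq.in.window <- sequence.data[, 1:5]

# Temporarily overwrite the window with random bases, column by column
for (j in 1:5) {
  sequence.data[[j]] <- factor(sample(x, nrow(sequence.data), replace = TRUE),
                               levels = levels(seq.in.window[[j]]))
}

# ... the analysis on the modified data would go here ...

# Swap the original window back
sequence.data[, 1:5] <- seq.in.window
```

Replacing whole columns avoids the row-by-row loop; if row-wise changes are really needed, assigning one cell at a time with a value that is already a level of the factor should also work.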
Re: [R] data frame question
... or in one step

df <- transform(df, col1 = ifelse(col1 > 3, NA, col1))

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of K. Elo
Sent: Friday, 15 February 2008 4:29 PM
To: r-help@r-project.org
Subject: Re: [R] data frame question

Hi,

joseph wrote (15.2.2008):
> Thanks. I have another question:
> In the following data frame df, I want to replace all values in col1
> that are higher than 3 with NA.
> df= data.frame(col1=c(1:5, NA),col2= c(2,NA,4:7))

My suggestion:

x<-df$col1; x[ x>3 ]<-NA; df$col1<-x; rm(x)

-Kimmo
Re: [R] data frame question
Hi,

joseph wrote (15.2.2008):
> Thanks. I have another question:
> In the following data frame df, I want to replace all values in col1
> that are higher than 3 with NA.
> df= data.frame(col1=c(1:5, NA),col2= c(2,NA,4:7))

My suggestion:

x<-df$col1; x[ x>3 ]<-NA; df$col1<-x; rm(x)

-Kimmo
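A slightly shorter variant of the same idea, as a sketch: index the column in place and wrap the comparison in which(), so that the NA already present in col1 (for which NA > 3 is itself NA) is simply skipped. The data frame is the one from the question; nothing else is assumed.

```r
df <- data.frame(col1 = c(1:5, NA), col2 = c(2, NA, 4:7))

# which() drops the NA produced by NA > 3, so only real matches are indexed
df$col1[which(df$col1 > 3)] <- NA

df$col1   # values above 3 are now NA; the existing NA is untouched
```

col2 is left alone, which matters here because it also contains values above 3.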
Re: [R] data frame question
Thanks. I have another question:

In the following data frame df, I want to replace all values in col1 that are higher than 3 with NA.

df= data.frame(col1=c(1:5, NA),col2= c(2,NA,4:7))

----- Original Message ----
From: John Kane <[EMAIL PROTECTED]>
To: joseph <[EMAIL PROTECTED]>; r-help@r-project.org
Cc: r-help@r-project.org
Sent: Thursday, February 14, 2008 3:09:40 PM
Subject: Re: [R] data frame question

Create the new data.frame and do the multiplying on it?

df2 <- df1
df2[,1] <- df2[,1]*2

--- joseph <[EMAIL PROTECTED]> wrote:
> Hi
> I have a data frame df1 in which I would like to multiply col1 by 2.
> The way I did it does not allow me to keep the old data frame.
> How can I do this and be able to create a new data frame df2?
>
> > df1= data.frame(col1= c(3, 5, NA, 1), col2= c(4, NA,6, 2))
> > df1
>   col1 col2
> 1    3    4
> 2    5   NA
> 3   NA    6
> 4    1    2
> > df1$col1=df1$col1*2
> > df1
>   col1 col2
> 1    6    4
> 2   10   NA
> 3   NA    6
> 4    2    2
Re: [R] data frame question
Create the new data.frame and do the multiplying on it?

df2 <- df1
df2[,1] <- df2[,1]*2

--- joseph <[EMAIL PROTECTED]> wrote:
> Hi
> I have a data frame df1 in which I would like to multiply col1 by 2.
> The way I did it does not allow me to keep the old data frame.
> How can I do this and be able to create a new data frame df2?
>
> > df1= data.frame(col1= c(3, 5, NA, 1), col2= c(4, NA,6, 2))
> > df1
>   col1 col2
> 1    3    4
> 2    5   NA
> 3   NA    6
> 4    1    2
> > df1$col1=df1$col1*2
> > df1
>   col1 col2
> 1    6    4
> 2   10   NA
> 3   NA    6
> 4    2    2
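To spell the suggestion out as a small sketch: assigning df1 to a new name gives an independent copy as far as later modifications are concerned, so changing df2 leaves df1 as it was. The transform() line is an equivalent alternative; df3 is a name introduced only for this illustration.

```r
df1 <- data.frame(col1 = c(3, 5, NA, 1), col2 = c(4, NA, 6, 2))

# Copy first, then modify the copy; df1 keeps its original values
df2 <- df1
df2$col1 <- df2$col1 * 2

# Same result in one expression, again without touching df1
df3 <- transform(df1, col1 = col1 * 2)

df1$col1   # 3 5 NA 1
df2$col1   # 6 10 NA 2
```

Using a new column name inside transform() (say col3 = col1 * 2) would instead keep the original column and add the doubled one next to it.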
Re: [R] data frame question
On Thursday 14 February 2008 06:27:07 pm Stefan Grosse wrote:
SG> df$col3<-df1$col1*2

Oops, it should be

df1$col3<-df1$col1*2
Re: [R] data frame question
On Thursday 14 February 2008 06:12:23 pm joseph wrote:
jo> I have a data frame df1 in which I would like to multiply col1
jo> by 2.
jo> The way I did it does not allow me to keep the old data
jo> frame.
jo>
jo> How can I do this and be able to create a new data frame
jo> df2?
jo>
jo> > df1$col1=df1$col1*2

df$col3<-df1$col1*2

Stefan
Re: [R] data frame question
If I understand:

df2 <- transform(df1, col3=col1*2)

On 14/02/2008, joseph <[EMAIL PROTECTED]> wrote:
> Hi
> I have a data frame df1 in which I would like to multiply col1 by 2.
> The way I did it does not allow me to keep the old data frame.
> How can I do this and be able to create a new data frame df2?
>
> > df1= data.frame(col1= c(3, 5, NA, 1), col2= c(4, NA,6, 2))
> > df1
>   col1 col2
> 1    3    4
> 2    5   NA
> 3   NA    6
> 4    1    2
> > df1$col1=df1$col1*2
> > df1
>   col1 col2
> 1    6    4
> 2   10   NA
> 3   NA    6
> 4    2    2

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
Re: [R] data frame question
df2 = df ?

G.

On Thu, Feb 14, 2008 at 09:12:23AM -0800, joseph wrote:
> Hi
> I have a data frame df1 in which I would like to multiply col1 by 2.
> The way I did it does not allow me to keep the old data frame.
> How can I do this and be able to create a new data frame df2?
>
> > df1= data.frame(col1= c(3, 5, NA, 1), col2= c(4, NA,6, 2))
> > df1
>   col1 col2
> 1    3    4
> 2    5   NA
> 3   NA    6
> 4    1    2
> > df1$col1=df1$col1*2
> > df1
>   col1 col2
> 1    6    4
> 2   10   NA
> 3   NA    6
> 4    2    2

-- Csardi Gabor <[EMAIL PROTECTED]> UNIL DGM
[R] data frame question
Hi

I have a data frame df1 in which I would like to multiply col1 by 2. The way I did it does not allow me to keep the old data frame. How can I do this and be able to create a new data frame df2?

> df1= data.frame(col1= c(3, 5, NA, 1), col2= c(4, NA,6, 2))
> df1
  col1 col2
1    3    4
2    5   NA
3   NA    6
4    1    2
> df1$col1=df1$col1*2
> df1
  col1 col2
1    6    4
2   10   NA
3   NA    6
4    2    2
Re: [R] data frame question
joseph <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:

> I have 2 data frames df1 and df2. I would like to create a
> new data frame new_df which will contain only the common rows based
> on the first 2 columns (chrN and start). The column score in the new
> data frame should be replaced with a column containing the average
> score (average_score) from df1 and df2.
>
> df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2",
>                         "chr2", "chr2"),
>                 start= c(23, 82, 95, 108, 95, 108, 121),
>                 end= c(33, 92, 105, 118, 105, 118, 131),
>                 score= c(3, 6, 2, 4, 9, 2, 7))
>
> df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2" , "chr2"),
>                 start= c(23, 50, 95, 20, 121),
>                 end= c(33, 60, 105, 30, 131),
>                 score= c(9, 3, 7, 7, 3))

Clunky to be sure, but this worked for me:

df3 <- merge(df1,df2,by=c("chrN","start")) #non-match variables get auto-relabeled
df3$avg.scr <- with(df3, (score.x+score.y)/2) # or mean( )
df3 <- df3[,c("chrN","start","avg.scr")] #drops the variables not of interest
df3

  chrN start avg.scr
1 chr1    23       6
2 chr2   121       5
3 chr2    95       8

-- David Winsemius
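If the end column should survive as in the requested new_df, one option is to merge on all three position columns. This is only a sketch and rests on an assumption the thread does not state: rows that share chrN and start also share end. The explicit order() call is there because the row order merge() returns for multi-column keys is not relied upon here.

```r
df1 <- data.frame(chrN  = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                  start = c(23, 82, 95, 108, 95, 108, 121),
                  end   = c(33, 92, 105, 118, 105, 118, 131),
                  score = c(3, 6, 2, 4, 9, 2, 7))
df2 <- data.frame(chrN  = c("chr1", "chr2", "chr2", "chr2", "chr2"),
                  start = c(23, 50, 95, 20, 121),
                  end   = c(33, 60, 105, 30, 131),
                  score = c(9, 3, 7, 7, 3))

# Inner join on the position columns; the two score columns become score.x / score.y
m <- merge(df1, df2, by = c("chrN", "start", "end"))
m$average_score <- rowMeans(m[, c("score.x", "score.y")])

# Keep only the requested columns, sorted by chromosome and numeric start
new_df <- m[order(m$chrN, m$start), c("chrN", "start", "end", "average_score")]
rownames(new_df) <- NULL
new_df
```

With the example data this should leave the three common rows (chr1/23, chr2/95, chr2/121).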
Re: [R] data frame question
On 10/02/2008, joseph <[EMAIL PROTECTED]> wrote:
> Hello
> I have 2 data frames df1 and df2. I would like to create a
> new data frame new_df which will contain only the common rows based on the
> first 2 columns (chrN and start). The column score in the new data frame
> should be replaced with a column containing the average score
> (average_score) from df1 and df2.

Try this: (avoiding underscores)

new.df <- merge(df1, df2, by=c('chrN','start'))
new.df$average.score <- apply(new.df[,c('score.x','score.y')], 1, mean, na.rm=T)

As always, interested to see whether it can be done in one line...

-- Dr. Mark Wardle Specialist registrar, Neurology Cardiff, UK
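On the one-line question, a possible sketch: wrap the merge in transform(), which can refer to the score.x and score.y columns that merge() creates. The data frames are retyped from the original post; whether this counts as one line is a matter of taste.

```r
df1 <- data.frame(chrN  = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                  start = c(23, 82, 95, 108, 95, 108, 121),
                  end   = c(33, 92, 105, 118, 105, 118, 131),
                  score = c(3, 6, 2, 4, 9, 2, 7))
df2 <- data.frame(chrN  = c("chr1", "chr2", "chr2", "chr2", "chr2"),
                  start = c(23, 50, 95, 20, 121),
                  end   = c(33, 60, 105, 30, 131),
                  score = c(9, 3, 7, 7, 3))

# Merge and average in a single expression
new.df <- transform(merge(df1, df2, by = c("chrN", "start")),
                    average.score = (score.x + score.y) / 2)
new.df
```

Unlike the apply(..., mean, na.rm=T) version, a plain (score.x + score.y) / 2 returns NA whenever either score is missing, so the two are only equivalent when the scores are complete.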
[R] data frame question
Hello

I have 2 data frames df1 and df2. I would like to create a new data frame new_df which will contain only the common rows based on the first 2 columns (chrN and start). The column score in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2.

df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                start= c(23, 82, 95, 108, 95, 108, 121),
                end= c(33, 92, 105, 118, 105, 118, 131),
                score= c(3, 6, 2, 4, 9, 2, 7))

df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2" , "chr2"),
                start= c(23, 50, 95, 20, 121),
                end= c(33, 60, 105, 30, 131),
                score= c(9, 3, 7, 7, 3))

new_df= data.frame(chrN= c("chr1", "chr2", "chr2"),
                   start= c(23, 95, 121),
                   end= c(33, 105, 131),
                   average_score= c(6, 8, 5))

Thank you for your help

Joseph