[R] Vector indexing question
Suppose you have 4 related vectors:

    a.id <- c(1:25, 1:25, 1:25)
    a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
    a.id.levels <- c(1:25)
    a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels

What I would like to do is specify a rating from a.ratings (e.g. 'e'), get the vector of corresponding IDs from a.id.levels (via a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id to get the corresponding values from a.vals.

I think I can probably write a loop to construct a vector of ratings of the same length as a.id so that the ratings match the ID, and then go from there. Is there a better way? Perhaps using factors or levels or something?

Thanks,
    --Paul

--
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vector indexing question
Sounds like you have two different tables and are trying to mine one based on the other. Try

    ref <- data.frame( levels = 1:25, ratings = rep(letters[1:5], times=5) )
    db <- data.frame( vals=101:175, levels=c(1:25, 1:25, 1:25) )

    levels.of.interest <- ref$levels[ ref$ratings=="a" ]
    db$vals[ which(db$levels %in% levels.of.interest) ]
    [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171

OR a much more intuitive way is to merge both tables and proceed as

    out <- merge( db, ref, by="levels", all.x=TRUE )
    out <- out[ order(out$vals), ]   # little cleanup

    subset( out, ratings=="a" )   # ignore the rownames
       levels vals ratings
    1       1  101       a
    16      6  106       a
    31     11  111       a
    46     16  116       a
    61     21  121       a
    3       1  126       a
    17      6  131       a
    32     11  136       a
    47     16  141       a
    62     21  146       a
    2       1  151       a
    18      6  156       a
    33     11  161       a
    48     16  166       a
    63     21  171       a

Then you can do cool things using the apply() family like

    tapply( out$vals, out$ratings, mean )
      a   b   c   d   e
    136 137 138 139 140

Check out %in%, merge and apply.

Regards, Adai
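As a quick cross-check of the two routes above, here is a minimal sketch that runs both and compares the results. The names `ids`, `via.in` and `via.merge` are made up for this illustration and do not appear in the thread; the `which()` call is dropped because a logical index works directly.

```r
# Same two tables as in the reply above.
ref <- data.frame(levels = 1:25, ratings = rep(letters[1:5], times = 5))
db  <- data.frame(vals = 101:175, levels = c(1:25, 1:25, 1:25))

# Route 1: look up the IDs for one rating, then index with %in%.
ids    <- ref$levels[ref$ratings == "a"]
via.in <- db$vals[db$levels %in% ids]

# Route 2: merge the tables on the shared key, then filter on the rating.
out       <- merge(db, ref, by = "levels", all.x = TRUE)
via.merge <- sort(out$vals[out$ratings == "a"])

# Both routes should select the same 15 values.
all(via.in == via.merge)
```

The merge route costs a little more up front but leaves the rating attached to every row, which is what makes the later tapply() call a one-liner.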
Re: [R] Vector indexing question
On Thu, 2007-03-29 at 19:55 -0400, Paul Lynch wrote:
> Suppose you have 4 related vectors:
> [...]
> Is there a better way? Perhaps using factors or levels or something?

Is this what you want?

    DF <- data.frame(a.id, a.vals, a.id.levels, a.id.ratings)

    DF[DF$a.id.ratings == "e", "a.vals"]
    [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175

or

    subset(DF, a.id.ratings == "e", select = a.vals)
       a.vals
    5     105
    10    110
    15    115
    20    120
    25    125
    30    130
    35    135
    40    140
    45    145
    50    150
    55    155
    60    160
    65    165
    70    170
    75    175

See ?subset

HTH,

Marc Schwartz
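One thing worth spelling out about the data.frame() call above: the two 25-element vectors are recycled to fill 75 rows, so the ratings only line up with a.id because a.id happens to repeat 1:25 in order. A small sketch, using the vectors from the original post, that makes the assumption visible:

```r
a.id         <- c(1:25, 1:25, 1:25)
a.vals       <- 101:175
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# data.frame() recycles the shorter atomic vectors a whole number of times.
DF <- data.frame(a.id, a.vals, a.id.levels, a.id.ratings)
nrow(DF)

# The recycled level column matches a.id row by row only because
# a.id is exactly 1:25 repeated three times.
all(DF$a.id == DF$a.id.levels)
```

If a.id were in some other order, the second check would fail and the rating column would no longer describe the ID on the same row.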
Re: [R] Vector indexing question
On Thu, 29 Mar 2007, Paul Lynch wrote:

> Suppose you have 4 related vectors:
>
> a.id <- c(1:25, 1:25, 1:25)
> a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
> a.id.levels <- c(1:25)
> a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels
>
> What I would like to do is specify a rating from a.ratings (e.g. 'e'),
> get the vector of corresponding IDs from a.id.levels (via
> a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
> to get the corresponding values from a.vals.

see

    ?factor
    ?match  (in case a.id.levels does not actually index a.id.ratings)
    ?split

    a.ratings.factor <- factor( a.id.ratings[ match(a.id, a.id.levels) ])
    a.vals[ a.ratings.factor == 'e' ]
    [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175

    split( a.vals, a.ratings.factor ) # more generally
    $a
    [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171

    $b
    [1] 102 107 112 117 122 127 132 137 142 147 152 157 162 167 172

    [output truncated]

    lm( a.vals ~ a.ratings.factor - 1 ) # means of a.vals

    Call:
    lm(formula = a.vals ~ a.ratings.factor - 1)

    Coefficients:
    a.ratings.factora  a.ratings.factorb  a.ratings.factorc  a.ratings.factord  a.ratings.factore
                  136                137                138                139                140

> I think I can probably write a loop to construct a vector of ratings
> of the same length as a.id so that the ratings match the ID, and then
> go from there. Is there a better way? Perhaps using factors or levels
> or something?

A warning: using factor() in this way

    a.ratings.factor <- factor( a.id, levels=a.id.levels, labels=a.id.ratings )

will work in this case:

    a.vals[ a.ratings.factor == 'e' ]

but generally will get you into trouble as it creates a factor with 25 non-unique levels. So,

    split( a.vals, a.ratings.factor )

ends up giving a list of 25 (non-uniquely labelled) components.

HTH,

Chuck

> Thanks,
>     --Paul

Charles C. Berry                         (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                UC San Diego
http://biostat.ucsd.edu/~cberry/          La Jolla, San Diego 92093-0901
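The group means that the lm() call reports can also be read straight off the split() result. A minimal sketch, reusing the match()-based factor from the reply; `group.means` is a name introduced here for illustration only:

```r
a.id         <- c(1:25, 1:25, 1:25)
a.vals       <- 101:175
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# One rating per element of a.id, looked up through match().
a.ratings.factor <- factor(a.id.ratings[match(a.id, a.id.levels)])

# split() groups the values by rating; sapply() then averages each group.
group.means <- sapply(split(a.vals, a.ratings.factor), mean)
group.means
```

With this data the five means come out as 136 through 140 for ratings a through e, matching the coefficients shown above.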
Re: [R] Vector indexing question
Adai-- Thanks a lot! This is just what I was looking for. I was almost sure there had to be a neat way of doing this.

Bert-- Thanks for the tip.

Marc-- Not quite, although your solution works fine for the case I gave. What I had in mind for a.id was an arbitrary sequence of the numbers in the range [1,25], of length 75, though I was not savvy enough with R to express that succinctly. You spotted a shortcut that I hadn't realized I was introducing.

Thanks all for your help!
    --Paul

On 3/29/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
> Sounds like you have two different tables and are trying to mine one
> based on the other.
> [...]
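For the general case described in this follow-up, where a.id is an arbitrary sequence of IDs rather than 1:25 repeated, a hedged sketch might look like the following. The use of sample() with a fixed seed is only one assumed way to build such a sequence, and `ratings.per.val`, `e.vals` and `e.ids` are names invented for the example.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable

a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# An arbitrary length-75 sequence of IDs drawn from 1..25.
a.id   <- sample(a.id.levels, 75, replace = TRUE)
a.vals <- 101:175

# match() gives, for each element of a.id, its position in a.id.levels,
# which in turn indexes the rating for that ID. No loop is needed.
ratings.per.val <- a.id.ratings[match(a.id, a.id.levels)]
e.vals <- a.vals[ratings.per.val == "e"]

# Cross-check against the %in% route.
e.ids <- a.id.levels[a.id.ratings == "e"]
identical(e.vals, a.vals[a.id %in% e.ids])
```

Neither route depends on the order of a.id, which is the property the recycling-based data frame did not have.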