[R] Vector indexing question

2007-03-29 Thread Paul Lynch
Suppose you have 4 related vectors:

a.id <- c(1:25, 1:25, 1:25)
a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
a.id.levels <- c(1:25)
a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels

What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
get the vector of corresponding IDs from a.id.levels (via
a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
to get the corresponding values from a.vals.

I think I can probably write a loop to construct a vector of
ratings of the same length as a.id so that the ratings match the ID,
and then go from there.  Is there a better way?  Perhaps using factors
or levels or something?

Thanks,
  --Paul

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Vector indexing question

2007-03-29 Thread Adaikalavan Ramasamy
Sounds like you have two different tables and are trying to mine one 
based on the other. Try

ref <- data.frame( levels  = 1:25,
                   ratings = rep(letters[1:5], times=5) )

db <- data.frame( vals=101:175, levels=c(1:25, 1:25, 1:25) )

levels.of.interest <- ref$levels[ ref$ratings=="a" ]
db$vals[ which(db$levels %in% levels.of.interest) ]

  [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171


Or, a much more intuitive way is to merge both tables and proceed as follows:

out <- merge( db, ref, by="levels", all.x=TRUE )
out <- out[ order(out$vals), ] # little cleanup
subset( out, ratings=="a" )    # ignore the rownames

   levels vals ratings
1       1  101       a
16      6  106       a
31     11  111       a
46     16  116       a
61     21  121       a
3       1  126       a
17      6  131       a
32     11  136       a
47     16  141       a
62     21  146       a
2       1  151       a
18      6  156       a
33     11  161       a
48     16  166       a
63     21  171       a

Then you can do cool things using the apply() family, like

tapply( out$vals, out$ratings, mean )
  a   b   c   d   e
136 137 138 139 140

Check out %in%, merge and apply.
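
The %in% lookup above can be wrapped up so that it works for any rating. This is only a minimal sketch: the helper name vals_for_rating is made up for illustration, and stringsAsFactors = FALSE is added just to keep the ratings as plain character strings.

```r
# Lookup table: one rating per ID level (as in the thread).
ref <- data.frame(levels  = 1:25,
                  ratings = rep(letters[1:5], times = 5),
                  stringsAsFactors = FALSE)

# Observations: each value is tagged with an ID level (as in the thread).
db <- data.frame(vals = 101:175, levels = c(1:25, 1:25, 1:25))

# Hypothetical helper: return the values whose ID level carries `rating`.
vals_for_rating <- function(rating, db, ref) {
  wanted <- ref$levels[ref$ratings == rating]   # IDs that have this rating
  db$vals[db$levels %in% wanted]                # values recorded for those IDs
}

res_e <- vals_for_rating("e", db, ref)
print(res_e)
```

Because %in% only tests membership, the order of the IDs in db does not matter.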

Regards, Adai






Re: [R] Vector indexing question

2007-03-29 Thread Marc Schwartz
On Thu, 2007-03-29 at 19:55 -0400, Paul Lynch wrote:
> Suppose you have 4 related vectors:
>
> a.id <- c(1:25, 1:25, 1:25)
> a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
> a.id.levels <- c(1:25)
> a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels
>
> What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
> get the vector of corresponding IDs from a.id.levels (via
> a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
> to get the corresponding values from a.vals.
>
> I think I can probably write a loop to construct a vector of
> ratings of the same length as a.id so that the ratings match the ID,
> and then go from there.  Is there a better way?  Perhaps using factors
> or levels or something?
>
> Thanks,
>   --Paul

Is this what you want?

DF <- data.frame(a.id, a.vals, a.id.levels, a.id.ratings)

DF[DF$a.id.ratings == "e", "a.vals"]
 [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175

or

subset(DF, a.id.ratings == "e", select = a.vals)
   a.vals
5     105
10    110
15    115
20    120
25    125
30    130
35    135
40    140
45    145
50    150
55    155
60    160
65    165
70    170
75    175

See ?subset

HTH,

Marc Schwartz



Re: [R] Vector indexing question

2007-03-29 Thread Charles C. Berry
On Thu, 29 Mar 2007, Paul Lynch wrote:

> Suppose you have 4 related vectors:
>
> a.id <- c(1:25, 1:25, 1:25)
> a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
> a.id.levels <- c(1:25)
> a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels
>
> What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
> get the vector of corresponding IDs from a.id.levels (via
> a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
> to get the corresponding values from a.vals.

see

?factor
?match ( in case a.id.levels does not actually index a.id.ratings)
?split

a.ratings.factor <- factor( a.id.ratings[ match(a.id, a.id.levels) ])
a.vals[ a.ratings.factor == 'e' ]
  [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175

split( a.vals, a.ratings.factor ) # more generally
$a
  [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171

$b
  [1] 102 107 112 117 122 127 132 137 142 147 152 157 162 167 172
[output truncated]


lm( a.vals ~ a.ratings.factor - 1 ) # means of a.vals

Call:
lm(formula = a.vals ~ a.ratings.factor - 1)

Coefficients:
a.ratings.factora  a.ratings.factorb  a.ratings.factorc  a.ratings.factord  a.ratings.factore
              136                137                138                139                140
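
To illustrate the ?match remark above: when the IDs are arbitrary codes rather than the positions 1..n, match() turns each observed ID into a position in the lookup vector. The data below are invented purely for illustration (the names ids, vals, id.levels and id.ratings are not from the original post).

```r
# Hypothetical data: IDs are arbitrary codes, not the positions 1..n.
id.levels  <- c(310, 205, 480)        # one entry per distinct ID
id.ratings <- c("a", "b", "a")        # rating for each entry above
ids  <- c(480, 205, 310, 205, 480)    # observed IDs, in arbitrary order
vals <- c(10, 20, 30, 40, 50)         # value for each observation

# match() gives, for each observed ID, its position in id.levels ...
pos <- match(ids, id.levels)
# ... and that position indexes id.ratings, giving one rating per observation.
obs.ratings <- id.ratings[pos]

vals.a  <- vals[obs.ratings == "a"]   # values whose ID is rated "a"
by.rate <- split(vals, obs.ratings)   # all ratings at once
print(by.rate)
```

Indexing id.ratings directly by ids would fail here, since 310, 205 and 480 are not valid positions in a vector of length 3.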


> I think I can probably write a loop to construct a vector of
> ratings of the same length as a.id so that the ratings match the ID,
> and then go from there.  Is there a better way?  Perhaps using factors
> or levels or something?

A warning: using factor() in this way

a.ratings.factor <- factor( a.id, levels=a.id.levels,
                            labels=a.id.ratings )

will work in this case:

a.vals[ a.ratings.factor == 'e' ]

but generally will get you into trouble, as it creates a factor with 25
non-unique levels. So,

split( a.vals, a.ratings.factor )

ends up giving a list of 25 (non-uniquely labelled) components.

HTH,

Chuck




Charles C. Berry                        (858) 534-2098
                                        Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]              UC San Diego
http://biostat.ucsd.edu/~cberry/        La Jolla, San Diego 92093-0901



Re: [R] Vector indexing question

2007-03-29 Thread Paul Lynch
Adai-- Thanks a lot!  This is just what I was looking for.  I was
almost sure there had to be a neat way of doing this.

Bert--  Thanks for the tip.

Marc-- Not quite, although your solution works fine for the case I
gave.  What I had in mind for a.id was an arbitrary sequence of the
numbers in the range [1,25], of length 75, though I was not savvy
enough with R to express that succinctly.  You spotted a shortcut that
I hadn't realized I was introducing.
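
For the record, here is one way such an arbitrary a.id could be written, and why the lookup then has to go through the ID. This is a sketch only: sample() and the seed are my own choice for illustration, not something from the earlier posts.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
a.id         <- sample(1:25, 75, replace = TRUE)  # arbitrary IDs in [1, 25]
a.vals       <- 101:175
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# One rating per observation, looked up through the ID.
obs.ratings <- a.id.ratings[match(a.id, a.id.levels)]
vals.e <- a.vals[obs.ratings == "e"]

# Same answer via %in%, as a cross-check.
ids.e <- a.id.levels[a.id.ratings == "e"]
stopifnot(identical(vals.e, a.vals[a.id %in% ids.e]))

# Putting all four vectors in one data frame recycles a.id.ratings by
# position, so this selection ignores a.id and always picks rows 5, 10, ..., 75.
DF <- data.frame(a.id, a.vals, a.id.levels, a.id.ratings)
by.position <- DF[DF$a.id.ratings == "e", "a.vals"]
```

With the original a.id of c(1:25, 1:25, 1:25) the two selections happen to coincide, which is the shortcut mentioned above.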

Thanks all for your help!
  --Paul


-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)
