Re: [R] lookups and joins
Paul, You don't want to write you own function. merge() will do that for you very quickly and efficiently. Just to elaborate on Mike's reply, here is an example of how to use merge: > test <- merge(out, trt1m, by="gvkey") > names(out) [1] "gvkey""datadate" "Price""FV" "ER" "Rank" > names(trt1m) [1] "gvkey" "trt1m" > names(test) [1] "gvkey""datadate" "Price""FV" "ER" "Rank" "trt1m" > "gvkey" is the column that is joined on and the new data frame (test) has all the columns of both data frames. Now you can subscript as usual to get just the data you need. Read ?merge to see what all.x=TRUE does in case you need to use that. Thanks, Roger On Mon, 24 Jan 2005 22:41:54 -, michael watson (IAH-C) <[EMAIL PROTECTED]> wrote: > All together now: > > ?merge > > :-) > > > -Original Message- > From: [EMAIL PROTECTED] on behalf of Paul Sorenson > Sent: Mon 1/24/2005 10:34 PM > To: r-help@stat.math.ethz.ch > Cc: > Subject:[R] lookups and joins > I have some data coming from SQL sources that I wish to relate in various > ways. For reasons only known to our IT people, this can't be done in SQL at > present. > > I am looking for an R'ish technique for looking up new columns on a data > frame. As a simple, hardwired example I have tried the following: > > # This gives me two columns, one the lookup value and the second one > # the result column, ie my lookup table. > stcl = read.csv("stockclass.csv") > stockclass = as.vector(stcl$stock_class) > # This gives me what appears to be a dictionary or map > names(stockclass) = as.vector(stcl$stock_group) > > getstockclass = function(stock_group) { >try(stockclass[[stock_group]], TRUE) > } > csg$stk_class=factor(sapply(csg$stock_group, getstockclass)) > > I need the try since if there is a missing value I get an exception. > > I also tried something along the lines of (from memory): > getstockclass = function(stock_group) { >stcl[which(stcl$stock_group == stock_group),]$stock_class > } > > These work but I just wanted to check if there was an inbuilt way to do this > kind of thing in R? I searched on "join" without much luck. > > Really what I would like is a generic function that: >- Takes 2 data frames, >- Some kind of specification on which column(s) to join >- Outputs the joined frames, or perhaps a vector which is an index > vector that I can use on the second data frame. > > I don't really want to reinvent SQL and my data sets are not huge. > > cheers > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] lookups and joins
All together now: ?merge :-) -Original Message- From: [EMAIL PROTECTED] on behalf of Paul Sorenson Sent: Mon 1/24/2005 10:34 PM To: r-help@stat.math.ethz.ch Cc: Subject:[R] lookups and joins I have some data coming from SQL sources that I wish to relate in various ways. For reasons only known to our IT people, this can't be done in SQL at present. I am looking for an R'ish technique for looking up new columns on a data frame. As a simple, hardwired example I have tried the following: # This gives me two columns, one the lookup value and the second one # the result column, ie my lookup table. stcl = read.csv("stockclass.csv") stockclass = as.vector(stcl$stock_class) # This gives me what appears to be a dictionary or map names(stockclass) = as.vector(stcl$stock_group) getstockclass = function(stock_group) { try(stockclass[[stock_group]], TRUE) } csg$stk_class=factor(sapply(csg$stock_group, getstockclass)) I need the try since if there is a missing value I get an exception. I also tried something along the lines of (from memory): getstockclass = function(stock_group) { stcl[which(stcl$stock_group == stock_group),]$stock_class } These work but I just wanted to check if there was an inbuilt way to do this kind of thing in R? I searched on "join" without much luck. Really what I would like is a generic function that: - Takes 2 data frames, - Some kind of specification on which column(s) to join - Outputs the joined frames, or perhaps a vector which is an index vector that I can use on the second data frame. I don't really want to reinvent SQL and my data sets are not huge. cheers __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] lookups and joins
I have some data coming from SQL sources that I wish to relate in various ways. For reasons only known to our IT people, this can't be done in SQL at present. I am looking for an R'ish technique for looking up new columns on a data frame. As a simple, hardwired example I have tried the following: # This gives me two columns, one the lookup value and the second one # the result column, ie my lookup table. stcl = read.csv("stockclass.csv") stockclass = as.vector(stcl$stock_class) # This gives me what appears to be a dictionary or map names(stockclass) = as.vector(stcl$stock_group) getstockclass = function(stock_group) { try(stockclass[[stock_group]], TRUE) } csg$stk_class=factor(sapply(csg$stock_group, getstockclass)) I need the try since if there is a missing value I get an exception. I also tried something along the lines of (from memory): getstockclass = function(stock_group) { stcl[which(stcl$stock_group == stock_group),]$stock_class } These work but I just wanted to check if there was an inbuilt way to do this kind of thing in R? I searched on "join" without much luck. Really what I would like is a generic function that: - Takes 2 data frames, - Some kind of specification on which column(s) to join - Outputs the joined frames, or perhaps a vector which is an index vector that I can use on the second data frame. I don't really want to reinvent SQL and my data sets are not huge. cheers __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html