One way to speed up the merge is not to use merge at all.  You can use 'match' to
find the matching indices and then build the result manually.

Does this do what you want:

> ua <- read.table(text = '                  AName              rt_date
+ 2007-03-31 "14066.580078125" "2007-04-01"
+ 2007-06-30 "14717"           "2007-04-03"
+ 2007-09-30 "15528"           "2007-10-25"
+ 2007-12-31 "17609"           "2008-04-06"
+ 2008-03-31 "17168"           "2008-04-24"
+ 2008-06-30 "17681"           "2008-04-09"', header = TRUE, as.is = TRUE)
>
> dt <- c("2007-03-31", "2007-04-01", "2007-04-02", "2007-04-03", "2007-04-04",
+ "2007-04-05" ,"2007-04-06" ,"2007-04-07",
+ "2007-04-08", "2007-04-09")
>
> # find matching values in ua
> indx <- match(dt, ua$rt_date)
>
> # create new result matrix
> xx1 <- cbind(dt, ua[indx,])
> rownames(xx1) <- NULL  # delete funny names
> xx1
           dt    AName    rt_date
1  2007-03-31       NA       <NA>
2  2007-04-01 14066.58 2007-04-01
3  2007-04-02       NA       <NA>
4  2007-04-03 14717.00 2007-04-03
5  2007-04-04       NA       <NA>
6  2007-04-05       NA       <NA>
7  2007-04-06       NA       <NA>
8  2007-04-07       NA       <NA>
9  2007-04-08       NA       <NA>
10 2007-04-09       NA       <NA>
>
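
Your post says ua_rd is a character matrix rather than a data frame, so
here is a rough sketch of the same idea applied directly to a matrix.
'match_join' is just a name I made up for illustration, and the data is a
cut-down version of the example above.  It assumes rt_date has no duplicate
values: match() returns only the first hit, whereas merge() would return
one row for every duplicate.

```r
# Sketch only: match_join() is a hypothetical helper, not a base R function.
# Assumes ua_rd is a character matrix with columns "AName" and "rt_date",
# and that rt_date contains no duplicates (match() keeps the first hit only).
match_join <- function(dt, ua_rd) {
  indx <- match(dt, ua_rd[, "rt_date"])   # NA where dt has no partner
  data.frame(
    dt      = dt,
    AName   = as.numeric(ua_rd[indx, "AName"]),   # as.numeric drops names
    rt_date = unname(ua_rd[indx, "rt_date"]),     # drop the matrix row names
    stringsAsFactors = FALSE
  )
}

# Cut-down example data in the same shape as the original post
ua_rd <- matrix(
  c("14066.580078125", "2007-04-01",
    "14717",           "2007-04-03",
    "15528",           "2007-10-25"),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("2007-03-31", "2007-06-30", "2007-09-30"),
                  c("AName", "rt_date"))
)
dt <- c("2007-03-31", "2007-04-01", "2007-04-02", "2007-04-03")

res <- match_join(dt, ua_rd)
print(res)
```

Rows of dt with no partner in rt_date come back as NA, which is what
all.x=TRUE gives you in merge.  The result keeps the order of dt rather
than being sorted on the key, so check that against what the rest of your
loop expects.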


On Fri, Mar 2, 2012 at 5:24 AM, Ben quant <ccqu...@gmail.com> wrote:

> Hello,
>
> I have a nasty loop that I have to do 11877 times. The only thing that
> slows it down really is this merge:
>
> xx1 = merge(dt,ua_rd,by.x=1,by.y= 'rt_date',all.x=T)
>
> Any ideas on how to speed it up? The output can't change materially (it
> works), but I'd like it to go faster. I'm looking at getting around the
> loop (not shown), but I'm trying to speed up the merge first. I'll post
> regarding the loop if nothing comes of this post.
>
> Here is some information on what type of stuff is going into the merge:
>
> > class(ua_rd)
> [1] "matrix"
> > dim(ua_rd)
> [1] 20  2
> > head(ua_rd)
>                   AName              rt_date
> 2007-03-31 "14066.580078125" "2007-04-26"
> 2007-06-30 "14717"           "2007-07-19"
> 2007-09-30 "15528"           "2007-10-25"
> 2007-12-31 "17609"           "2008-01-24"
> 2008-03-31 "17168"           "2008-04-24"
> 2008-06-30 "17681"           "2008-07-17"
> > class(dt)
> [1] "character"
> > length(dt)
> [1] 1799
> > dt[1:10]
>  [1] "2007-03-31" "2007-04-01" "2007-04-02" "2007-04-03" "2007-04-04" "2007-04-05" "2007-04-06" "2007-04-07"
>  [9] "2007-04-08" "2007-04-09"
>
> thanks,
>
> Ben
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
