Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Answer to my own question:

library(data.table)
ush <- data.table(read.csv(...))
setkey(ush, product_id)
# J() takes parentheses, not brackets; it joins on the key
s1 <- ush[J(product.id)]



> user  system elapsed 
>   0.000   0.000   0.003 
> 

It seems like that's the method to use!  Amazing.
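
For anyone who wants to try the keyed lookup without my data, here is a minimal sketch on a made-up table. The column values and the absent id 99 are assumptions for illustration only; setkey(), key() and J() are the data.table calls used above.

```r
library(data.table)

# Synthetic stand-in for the real table (values are made up).
ush <- data.table(
  product_id = c(3L, 1L, 2L, 1L, 3L),
  qty        = c(10L, 20L, 30L, 40L, 50L)
)

setkey(ush, product_id)   # sorts by product_id and marks it as the key
key(ush)                  # "product_id"

product.id <- 1L
s1 <- ush[J(product.id)]  # binary search on the key instead of a full scan
s1

# An id that is not in the table returns one row of NAs by default;
# nomatch = 0L drops it instead.
ush[J(99L), nomatch = 0L]
```

The type of the value passed to J() should match the key column (integer here), otherwise the join may coerce or complain.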

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097576.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Update from email outside of this thread:

Justin Haynes writes:


> matrices will help, but two quick solutions:
> 
> if you are looking for single items to go in the some_value space, use ==
> instead of %in% and you'll notice speedups.  The second more involved
> option is to take a look at the package data.table.
> 
> it provides a wrapper for data.frame that has some impressive
> optimizations as well as allowing for SQL like indexing and syntax.
> 
> 
> hope that helps!
> 

Justin, thanks.  Here's the before and after clock:

Before: 

user  system elapsed 
  0.976   0.164   2.607 

After:

user  system elapsed 
  0.624   0.156   1.810 

You are correct.  I still need to get that elapsed down even more. 
Data.table sounds like a good option.  I'll look into it more.  Thank you.
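
One caveat with swapping %in% for ==, sketched on a tiny made-up data frame (the column names and values are assumptions): if the column contains NA, == yields NA rather than FALSE, so the subset picks up an all-NA row unless the comparison is wrapped in which().

```r
# Synthetic data; the column names and values are assumptions.
df <- data.frame(
  product_id = c(1L, 2L, NA, 2L),
  qty        = c(10L, 20L, 30L, 40L)
)
some_value <- 2L

by_in <- df[df$product_id %in% some_value, ]       # NA never matches
by_eq <- df[which(df$product_id == some_value), ]  # which() drops the NA

all.equal(by_in, by_eq)

# Without which(), the NA comparison adds an all-NA row:
nrow(df[df$product_id == some_value, ])  # 3
```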

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097261.html


Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
So, here is the resulting time from using the data.table package:



> user  system elapsed 
>   0.800   0.012   1.847
> 

Here are the methods that I am using:

ush <- data.table(read.csv(...))
setkey(ush, product_id)
s1 <- subset(ush, product_id == product.id)

Seems like a minor improvement but not enough to get those subsets in less
than ~2 seconds.  Am I doing something wrong I wonder...
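
A rough way to check whether the key is being used at all: as far as I understand, subset() with == builds a full logical vector (a scan over every row) regardless of the key, whereas a J() lookup does a binary search on it. The sketch below compares the two on synthetic data; the table size, the number of distinct ids and the id 42 are all assumptions, and timings will differ by machine.

```r
library(data.table)

set.seed(1)
n <- 1e6   # smaller than the real 2,547,290 rows, for a quick run
ush <- data.table(
  product_id = sample.int(5000L, n, replace = TRUE),
  qty        = runif(n)
)
setkey(ush, product_id)
product.id <- 42L

# Vector scan: compares every row against product.id
system.time(s_scan <- subset(ush, product_id == product.id))

# Keyed lookup: binary search on the sorted key column
system.time(s_key <- ush[J(product.id)])

# Both approaches should return the same rows
all.equal(s_scan$qty, s_key$qty)
```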


--
View this message in context: 
http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097441.html


Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Wow, these specs are fantastic:



> user  system elapsed
>   0.33    0.00    0.39 
> 

I wonder how much of that is because of the capacity of the box that you are
running R on.  Can you post pertinent specs?  This suggests to me that
hardware upgrades (RAM specifically) may also be in order.

Investigating data.table now.

Thanks!

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097278.html


[R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Hey All,

So - I promise to write a blog post on this topic and post it somewhere on
the internet once I get to the bottom of this.  Basically, the set-up to the
problem is like this:

1.  I have a data frame with dim (2547290, 4)
2.  I need to make SQL-like lookups on the data frame.  I have been using the
following sort of syntax:

a.dataframe[a.dataframe[[column_index]] %in% some_value, ]

3.  This process takes quite a lot of time (~2 seconds) on AWS m1.small
instances

So, I hope I can get that look-up/search logic quite a lot faster.  I have
heard that using matrices is the way to do it but I haven't found any
resources on performing that sort of operation specifically that have
yielded better results.  
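
For reference, this is roughly what I understand the matrix version of the lookup to look like. It is only a sketch on synthetic, all-numeric data (names, sizes and the value 42 are assumptions); as.matrix() coerces every column to one type, so it only makes sense when the columns already share a type, and the comparison itself is still a scan over every row.

```r
# Synthetic all-numeric data frame; names and sizes are assumptions.
set.seed(1)
n  <- 1e5
df <- data.frame(
  product_id = sample.int(500L, n, replace = TRUE),
  a = rnorm(n), b = rnorm(n), c = rnorm(n)
)
m <- as.matrix(df)   # coerces all columns to a single type (double here)

column_index <- 1L
some_value   <- 42

# Data frame lookup, as in step 2 above
from_df <- df[df[[column_index]] %in% some_value, ]

# Equivalent matrix lookup; drop = FALSE keeps a matrix even for one row
from_m <- m[m[, column_index] == some_value, , drop = FALSE]

system.time(df[df[[column_index]] %in% some_value, ])
system.time(m[m[, column_index] == some_value, , drop = FALSE])
```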

Thoughts, feelings and advice are more than welcome.

Best,
TMD

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4096906.html


Re: [R] HoltWinters in R 2.14.0

2011-11-05 Thread TimothyDalbey
You are 100% correct by my estimation; however, I suppose I am looking for the 
conditions in the data that might cause the optim() or optimize() functions to 
fail.  I took a brief tour of the HoltWinters source, but the code available 
(readable) online seemed outdated (judging by conflicting version 
descriptions).  I'll have another poke around the source, unless there 
is someone out there who can clearly state why optimize() fails within 
HoltWinters in R 2.14.0.
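
In the meantime, following the suggestion quoted below that starting values for the optimization help, here is a rough sketch of retrying the fit with different optim.start values. The built-in co2 series stands in for sales.ts, fit_hw() is a hypothetical helper, and the alternative starting values are arbitrary guesses rather than recommendations.

```r
# Hypothetical helper: try each set of starting values until one fits.
fit_hw <- function(x, starts) {
  for (s in starts) {
    fit <- tryCatch(
      HoltWinters(x, optim.start = s),
      error = function(e) NULL   # e.g. "optimization failure"
    )
    if (!is.null(fit)) return(fit)
  }
  stop("HoltWinters failed for every set of starting values")
}

starts <- list(
  c(alpha = 0.3, beta = 0.1,  gamma = 0.1),  # the documented defaults
  c(alpha = 0.5, beta = 0.2,  gamma = 0.3),  # arbitrary alternatives
  c(alpha = 0.1, beta = 0.05, gamma = 0.5)
)

# co2 is a built-in monthly series used here in place of sales.ts
fit <- fit_hw(co2, starts)
fit$alpha; fit$beta; fit$gamma
```

A fit that only succeeds after changing the starting values is still worth plotting against the data, since the same quoted reply points out that these failures tend to occur when the model is a bad fit.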

On Nov 4, 2011, at 8:38 PM, "Prof Brian Ripley [via R]" wrote:

> On Fri, 4 Nov 2011, R. Michael Weylandt wrote: 
> 
> > I believe there were some changes to Holt-Winters, specifically in re 
> > optimization that probably lead to your problem, but you'll have to 
> > provide more details. See the NEWS file for citations about the 
> > change. If you put example code/data others may be able to help you -- 
> > I haven't updated yet so I can't be of much help. 
> > 
> > Michael 
> > 
> > 
> > On Fri, Nov 4, 2011 at 2:55 PM, TimothyDalbey <[hidden email]> wrote: 
> >> Hey All, 
> >> 
> >> First time on these forums.  Thanks in advance. 
> >> 
> >> S...  I have a process that was functioning well before the 2.14 
> >> update. 
> >> Now the HoltWinters function is throwing an error whereby I get the 
> >> following: 
> >> 
> >> Error in HoltWinters(sales.ts) : optimization failure
> Most likely it was incorrect before.  You cannot assume that it was 
> actually 'functioning well': all the cases where we have seen this 
> message it was giving incorrect answers before and not detecting them. 
> And in all those cases the model was a bad fit and using starting 
> values for the optimization helped. 
> 
> >> I've been looking around to determine why this happens (see if I can test 
> >> the data beforehand) but I haven't come across anything. 
> >> 
> >> Any help appreciated! 
> 
> -- 
> Brian D. Ripley,  [hidden email] 
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self) 
> 1 South Parks Road, +44 1865 272866 (PA) 
> Oxford OX1 3TG, UK  Fax:  +44 1865 272595


--
View this message in context: 
http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3992497.html


[R] HoltWinters in R 2.14.0

2011-11-04 Thread TimothyDalbey
Hey All,

First time on these forums.  Thanks in advance.  

So...  I have a process that was functioning well before the 2.14 update. 
Now the HoltWinters function is throwing an error whereby I get the
following: 

Error in HoltWinters(sales.ts) : optimization failure

I've been looking around to determine why this happens (see if I can test
the data beforehand) but I haven't come across anything.  

Any help appreciated!
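
For what it's worth, these are the kind of basic checks I have in mind for testing the data beforehand. This is only a sketch: precheck_hw() is a hypothetical helper, the checks are my assumptions about what a seasonal fit needs, and passing them says nothing about whether the optimizer will converge.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
precheck_hw <- function(x) {
  problems <- character(0)
  if (!is.ts(x)) {
    problems <- c(problems, "not a ts object")
  }
  if (anyNA(x)) {
    problems <- c(problems, "contains NA values")
  }
  # A seasonal fit needs at least two full periods to initialise
  if (is.ts(x) && frequency(x) > 1 && length(x) < 2 * frequency(x)) {
    problems <- c(problems, "fewer than two full seasonal periods")
  }
  problems
}

precheck_hw(co2)                              # character(0)
precheck_hw(ts(c(1, NA, 3), frequency = 1))   # "contains NA values"
```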


--
View this message in context: 
http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3991247.html