Re: [R] Data Frame Search Slow
Answer to my own question:

    ush <- data.table(read.csv(...))
    setkey(ush, product_id)
    s1 <- ush[J(product.id)]

       user  system elapsed
      0.000   0.000   0.003

It seems like that's the method to use! Amazing.

--
View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097576.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
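For anyone reading the archive later, here is a minimal sketch of why the keyed lookup is so much faster than the subset() call tried earlier in the thread. It assumes the data.table package is installed; the generated table, its size, and the qty column are hypothetical stand-ins for the poster's CSV, and timings will differ by machine.

```r
library(data.table)

# Hypothetical data standing in for the poster's CSV (assumption: only the
# product_id column name comes from the thread).
set.seed(1)
n <- 200000L
ush <- data.table(
  product_id = sample.int(5000L, n, replace = TRUE),
  qty        = sample.int(10L, n, replace = TRUE)
)
product.id <- 42L

# setkey() sorts the table by product_id and marks that column as the key.
setkey(ush, product_id)

# Keyed lookup: J() wraps the value(s) to join against the key, so the
# matching rows are located by binary search on the sorted column.
t_key <- system.time(s_key <- ush[J(product.id)])

# subset() with == compares every element of the column (a vector scan),
# so the key is not used.
t_scan <- system.time(s_scan <- subset(ush, product_id == product.id))

cat("keyed:", t_key[["elapsed"]], "s;  scan:", t_scan[["elapsed"]], "s\n")
```

Both calls should return the same rows; only the way the rows are found differs. The gap grows with the number of rows and with the number of lookups performed.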
Re: [R] Data Frame Search Slow
Update from email outside of this thread:

Justin Haynes writes:
> matrices will help, but two quick solutions:
>
> if you are looking for single items to go in the some_value space, use ==
> instead of %in% and you'll notice speedups. The second more involved
> option is to take a look at the package data.table.
>
> it provides a wrapper for data.frame that has some impressive
> optimizations as well as allowing for SQL like indexing and syntax.
>
> hope that helps!

Justin, thanks. Here's the before and after clock:

Before:
       user  system elapsed
      0.976   0.164   2.607

After:
       user  system elapsed
      0.624   0.156   1.810

You are correct. I still need to get that elapsed time down even more. data.table sounds like a good option. I'll look into it more. Thank you.

--
View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097261.html
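A rough, base-R-only sketch of the == versus %in% comparison Justin describes. The data frame below is a hypothetical stand-in (smaller than the 2.5 million rows in the original question), and the measured times will vary by machine and R version.

```r
# Hypothetical stand-in for the poster's data frame (names and sizes are
# assumptions, not taken from the thread).
set.seed(1)
n <- 200000L
df <- data.frame(
  product_id = sample.int(5000L, n, replace = TRUE),
  qty        = sample.int(10L, n, replace = TRUE)
)
some_value <- 42L

# Set-membership test: general, works for one or many lookup values.
t_in <- system.time(r_in <- df[df[["product_id"]] %in% some_value, ])

# Direct equality: enough when looking up a single value.
t_eq <- system.time(r_eq <- df[df[["product_id"]] == some_value, ])

cat("%in%:", t_in[["elapsed"]], "s;  ==:", t_eq[["elapsed"]], "s\n")
```

One caveat: if the column can contain NA, == returns NA for those elements and the subset picks up all-NA rows, whereas %in% returns FALSE. Wrapping the comparison in which() avoids that.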
Re: [R] Data Frame Search Slow
So, here are the timings from using the data.table package:

       user  system elapsed
      0.800   0.012   1.847

Here are the methods that I am using:

    ush <- data.table(read.csv(...))
    setkey(ush, product_id)
    s1 <- subset(ush, product_id == product.id)

This seems like a minor improvement, but not enough to get those subsets in less than ~2 seconds. Am I doing something wrong, I wonder...

--
View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097441.html
Re: [R] Data Frame Search Slow
Wow, these specs are fantastic:

> user system elapsed
> 0.33 0.00 0.39

I wonder how much of that is because of the capacity of the box that you are running R on. Can you post pertinent specs? This suggests to me that hardware upgrades (RAM specifically) may also be in order.

Investigating data.table now. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4097278.html
[R] Data Frame Search Slow
Hey All,

So - I promise to write a blog post on this topic and post it somewhere on the internet once I get to the bottom of this. Basically, the set-up to the problem is like this:

1. I have a data frame with dim (2547290, 4).
2. I need to make SQL-like lookups on the data frame. I have been using the following sort of syntax:

       a.dataframe[a.dataframe[[column_index]] %in% some_value, ]

3. This process takes quite a lot of time (~2 seconds) on m1.small instances (AWS AMIs).

So, I hope I can make that look-up/search logic quite a lot faster. I have heard that using matrices is the way to do it, but I haven't found any resources on performing that sort of operation specifically that have yielded better results.

Thoughts, feelings and advice are more than welcome.

Best,
TMD

--
View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Search-Slow-tp4096906p4096906.html
Re: [R] HoltWinters in R 2.14.0
You are 100% correct by my estimation; however, I suppose I am looking for the conditions in the data that might cause the optim() or optimize() functions to fail. I took a brief tour of the HoltWinters source, but the code available (readable) online seemed outdated (judging by conflicting version descriptions). I'll have another poke around the source, unless there is someone out there who can clearly state why optimize() fails within the context of HoltWinters in R 2.14.0.

On Nov 4, 2011, at 8:38 PM, "Prof Brian Ripley [via R]" wrote:

> On Fri, 4 Nov 2011, R. Michael Weylandt wrote:
>
> > I believe there were some changes to Holt-Winters, specifically in re
> > optimization that probably lead to your problem, but you'll have to
> > provide more details. See the NEWS file for citations about the
> > change. If you put example code/data others may be able to help you --
> > I haven't updated yet so I can't be of much help.
> >
> > Michael
> >
> > On Fri, Nov 4, 2011 at 2:55 PM, TimothyDalbey <[hidden email]> wrote:
> >> Hey All,
> >>
> >> First time on these forums. Thanks in advance.
> >>
> >> S... I have a process that was functioning well before the 2.14
> >> update.
> >> Now the HoltWinters function is throwing an error whereby I get the
> >> following:
> >>
> >> Error in HoltWinters(sales.ts) : optimization failure
>
> Most likely it was incorrect before. You cannot assume that it was
> actually 'functioning well': all the cases where we have seen this
> message it was giving incorrect answers before and not detecting them.
> And in all those cases the model was a bad fit and using starting
> values for the optimization helped.
>
> >> I've been looking around to determine why this happens (see if I can test
> >> the data beforehand) but I haven't come across anything.
> >>
> >> Any help appreciated!
>
> --
> Brian D. Ripley, [hidden email]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

--
View this message in context: http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3992497.html
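Prof. Ripley's remark that starting values for the optimization helped can be sketched as below. This is only an illustration under assumptions: the built-in co2 series stands in for sales.ts (which was never posted), fit_hw is a hypothetical helper, and the alternative starting values are arbitrary. The optim.start argument itself is part of stats::HoltWinters.

```r
# Hypothetical helper: try HoltWinters() with several sets of starting
# values and return the first fit that does not raise an error.
fit_hw <- function(x, starts) {
  for (s in starts) {
    fit <- tryCatch(
      HoltWinters(x, optim.start = s),
      error = function(e) NULL  # note: this swallows every error, not only
                                # "optimization failure"
    )
    if (!is.null(fit)) return(fit)
  }
  stop("HoltWinters failed for every set of starting values tried")
}

starts <- list(
  c(alpha = 0.3, beta = 0.1,  gamma = 0.1),  # the documented defaults
  c(alpha = 0.5, beta = 0.2,  gamma = 0.2),  # arbitrary alternatives
  c(alpha = 0.1, beta = 0.05, gamma = 0.3)
)

# co2 is a built-in monthly series, used only as a stand-in for sales.ts.
fit <- fit_hw(co2, starts)
print(c(alpha = fit$alpha, beta = fit$beta, gamma = fit$gamma))
print(fit$SSE)
```

A fit that only converges from unusual starting values is, per the quoted advice, a hint that the model may be a poor match for the data, so it is worth plotting the fit against the series before trusting the forecasts.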
[R] HoltWinters in R 2.14.0
Hey All,

First time on these forums. Thanks in advance.

S... I have a process that was functioning well before the 2.14 update. Now the HoltWinters function is throwing an error whereby I get the following:

    Error in HoltWinters(sales.ts) : optimization failure

I've been looking around to determine why this happens (see if I can test the data beforehand) but I haven't come across anything.

Any help appreciated!

--
View this message in context: http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3991247.html