Re: [R] in continuation with the earlier R puzzle
Many thanks to all of you. The beer cleared the air of doubts!

Please look at the following lines of code, taken from the example in the tradesys documentation. When I run the example with the data.frame spx it works fine, but when I use another data.frame (here nifty) it fails. The column named "Last" has 3637 rows, and the 20-day and 50-day moving averages of it have 3618 and 3588 rows respectively. Why does expr.frame fail for one data.frame and not for the other? I have given str() for both below for your kind perusal.

> library(tradesys)
> library(TTR)
> x=nifty[,c("Open","Last")]
> d <- expr.frame(x, list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50
Error in data.frame(c(1000, 1001.53, 987.17, 976.28, 960.32, 951.93, 949.29, :
  arguments imply differing number of rows: 3637, 3618, 3588

> str(nifty)
'data.frame': 3637 obs. of 6 variables:
 $ Date..GMT.: Factor w/ 3637 levels "01/01/1996","01/01/1997",..: 321 687 807 929 1052 1172 1537 1650 1764 1886 ...
 $ Open      : num 1000 1002 987 976 960 ...
 $ High      : num 1000 1002 987 976 960 ...
 $ Low       : num 1000 989 977 963 952 ...
 $ Last      : num 1000 989 978 964 953 ...
 $ Date      : num 321 687 807 929 1052 ...
> str(spx)
'data.frame': 14940 obs. of 5 variables:
 $ Open  : num 16.7 16.9 16.9 17 17.1 ...
 $ High  : num 16.7 16.9 16.9 17 17.1 ...
 $ Low   : num 16.7 16.9 16.9 17 17.1 ...
 $ Close : num 16.7 16.9 16.9 17 17.1 ...
 $ Volume: num 126 189 255 201 252 216 263 297 333 146 ...

Thanks
Raghu

On Tue, Jul 13, 2010 at 12:01 PM, Petr PIKAL wrote:

> Hi
>
> r-help-boun...@r-project.org wrote on 12.07.2010 16:09:30:
>
> > When I just run a for loop it works. But if I am going to run a for loop
> > every time for large vectors I might as well use C or any other language.
> > The reason R is powerful is because it can handle large vectors without
> > each element being manipulated? Please let me know where I am wrong.
> >
> > for(i in 1:length(news1o)){
> > + if(news1o[i]>s2o[i])
> > + s[i] <- 1
> > + else
> > + s[i] <- -1
> > + }
>
> Think in R, not in C. Why use loops when you can use the whole object
> directly? It is like drinking beer from snifters: it is possible, but
> pints are preferable and more convenient.
>
> news1o>s2o
>
> gives you a logical vector of the same length, and you can use it directly
> for further selection or computation. You can treat FALSE as 0 and TRUE
> as 1 and use it as a numeric vector, so
>
> x<-runif(10)
> y<-runif(10)
>
> c(-1,1)[(x>y)+1]
>
> selects -1 when FALSE and 1 when TRUE.
>
> Or you can use it in a mathematical operation directly:
>
> (x>y)*2-1
>
> Regards
> Petr

-- 
'Raghu'

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
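Petr's two vectorised forms can be checked against the loop on a small example. This is only a sketch: the seed and the names gt, s_index, s_arith and s_loop are made up for illustration and do not come from the thread.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- runif(10)
y <- runif(10)

gt <- x > y                      # logical vector, same length as x and y

# Indexing form: FALSE + 1 is 1 (picks -1), TRUE + 1 is 2 (picks 1)
s_index <- c(-1, 1)[gt + 1]

# Arithmetic form: FALSE * 2 - 1 is -1, TRUE * 2 - 1 is 1
s_arith <- gt * 2 - 1

# Loop form, for comparison only
s_loop <- numeric(length(x))
for (i in seq_along(x)) {
  s_loop[i] <- if (x[i] > y[i]) 1 else -1
}

identical(s_index, s_arith)      # TRUE
identical(s_index, s_loop)       # TRUE
```

All three produce the same numeric vector of -1 and 1 values; the vectorised forms simply skip the explicit index variable.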
Re: [R] in continuation with the earlier R puzzle
Hi

I do not use any of the mentioned libraries, so I cannot answer directly. I would try debug(expr.frame) to see at which point the error is thrown. I have no idea why you got the error.

Try to evaluate the code in pieces, e.g. check the result of

list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50)))

and look for differences between the results from the spx data and the nifty data.

Regards
Petr
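The row counts in the error message fit a simple pattern: 3618 = 3637 - 19 and 3588 = 3637 - 49, which is what you would see if each n-period average came back without its first n - 1 values. Whether SMA() behaved that way in the poster's installation is an assumption; the sketch below does not use TTR or tradesys at all. sma_trimmed() and pad() are hypothetical base-R helpers that only reproduce the length mismatch and show one way to align the columns.

```r
# Hypothetical stand-in for a moving average that drops the first n - 1
# values (NOT TTR::SMA; base R only, for illustration)
sma_trimmed <- function(v, n) {
  cs <- cumsum(c(0, v))
  (cs[(n + 1):length(cs)] - cs[1:(length(v) - n + 1)]) / n
}

last <- as.numeric(1:100)          # made-up price series
maf  <- sma_trimmed(last, 20)      # length 100 - 19 = 81
mas  <- sma_trimmed(last, 50)      # length 100 - 49 = 51

# Columns of length 100, 81 and 51 cannot be combined into one data.frame
res <- tryCatch(data.frame(last, maf, mas),
                error = function(e) conditionMessage(e))

# Padding with leading NAs gives every column the same length
pad <- function(v, len) c(rep(NA_real_, len - length(v)), v)
d <- data.frame(Last = last,
                MAf  = pad(maf, length(last)),
                MAs  = pad(mas, length(last)))
```

If the real cause is the same, comparing length(SMA(nifty$Last, 20)) with nrow(nifty), and doing the same for spx, would be a quick way to confirm it.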
Re: [R] in continuation with the earlier R puzzle
I wanted to point out one thing that Ted said, about initializing the vectors ('s' in your example). This can make a dramatic speed difference if you are using a for loop (the difference is negligible with vectorized computations). Also, a lot of benchmarks have been flying around, each from a different system and using random numbers without identical seeds. So, to provide an overall comparison of all the methods I saw here, and to demonstrate the speed difference from initializing a vector (if you know its desired length in advance), I ran these benchmarks.

Notes:
I did not want to interfere with your objects, so I used different names. The equivalencies are: news1o = x; s2o = y; s = z.
system.time() automatically calculates the time difference from proc.time() between start and finish.

> ##R version info
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32
#snipped
>
> ##Some Sample Data
> set.seed(10)
> x <- rnorm(10^6)
> set.seed(15)
> y <- rnorm(10^6)
>
> ##Benchmark 1
> z.1 <- NULL
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.1[i] <- 1
+   } else {
+     z.1[i] <- -1}
+ }
+ )
   user  system elapsed
1303.83  174.24 1483.74
>
> ##Benchmark 2
> #initialize 'z' at length
> z.2 <- vector("numeric", length = 10^6)
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.2[i] <- 1
+   } else {
+     z.2[i] <- -1}
+ }
+ )
   user  system elapsed
   3.77    0.00    3.77
>
> ##Benchmark 3
>
> z.3 <- NULL
> system.time(z.3 <- ifelse(x > y, 1, -1))
   user  system elapsed
   0.38    0.00    0.38
>
> ##Benchmark 4
>
> z.4 <- vector("numeric", length = 10^6)
> system.time(z.4 <- ifelse(x > y, 1, -1))
   user  system elapsed
   0.31    0.00    0.31
>
> ##Benchmark 5
>
> system.time(z.5 <- 2*(x > y) - 1)
   user  system elapsed
   0.01    0.00    0.01
>
> ##Benchmark 6
>
> system.time(z.6 <- numeric(length(x))-1)
   user  system elapsed
      0       0       0
> system.time(z.6[x > y] <- 1)
   user  system elapsed
   0.03    0.00    0.03
>
> ##Show that all results are identical
>
> identical(z.1, z.2)
[1] TRUE
> identical(z.1, z.3)
[1] TRUE
> identical(z.1, z.4)
[1] TRUE
> identical(z.1, z.5)
[1] TRUE
> identical(z.1, z.6)
[1] TRUE

I have not replicated these on other systems, but tentatively it appears that loops are significantly slower than ifelse(), which in turn is slower than options 5 and 6. However, using the same test data and the same system, I did not find an appreciable speed difference between options 5 and 6.

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
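The comparison between Benchmark 1 (growing the vector) and Benchmark 2 (allocating it first) can be rerun at a smaller scale. This is a sketch only: grow() and prealloc() are names invented here, n is cut to 10^4 so it finishes quickly, and no timings are quoted because the size of the gap depends on the R version and the machine.

```r
n <- 1e4                         # much smaller than the 10^6 used above
set.seed(10); x <- rnorm(n)
set.seed(15); y <- rnorm(n)

# Result vector starts as NULL and is extended one element at a time
grow <- function(x, y) {
  z <- NULL
  for (i in seq_along(x)) z[i] <- if (x[i] > y[i]) 1 else -1
  z
}

# Result vector is allocated at full length before the loop
prealloc <- function(x, y) {
  z <- numeric(length(x))
  for (i in seq_along(x)) z[i] <- if (x[i] > y[i]) 1 else -1
  z
}

t_grow <- system.time(z1 <- grow(x, y))[["elapsed"]]
t_pre  <- system.time(z2 <- prealloc(x, y))[["elapsed"]]
z3     <- 2 * (x > y) - 1        # vectorised form, as in Benchmark 5

c(grow = t_grow, prealloc = t_pre)
identical(z1, z2) && identical(z1, z3)
```

Wrapping each variant in a function keeps the timed expression short and makes it easy to raise n once the small case behaves as expected.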
Re: [R] in continuation with the earlier R puzzle
Thanks to you all. I stand corrected, Ted and Manuela :) I am just an end user trying to pick things up from forums like this one. Many thanks, sirs.

-- 
'Raghu'
Re: [R] in continuation with the earlier R puzzle
Using Ted Harding's example:

news1o <- runif(100)
s2o    <- runif(100)

pt1 <- proc.time()
s <- numeric(length(news1o))-1  # Set all of s to -1
s[news1o>s2o] <- 1              # Change to 1 only those values of s
                                # for which news1o>s2o
pt2 <- proc.time()
pt2-pt1  # Takes even less time...
#  user  system elapsed
#  0.04    0.00    0.05

Manuela Huso
Consulting Statistician
201H Richardson Hall
Department of Forest Ecosystems and Society
Oregon State University
Corvallis, OR 97331
ph: 541-737-6232
fx: 541-737-1393
Re: [R] in continuation with the earlier R puzzle
On 12-Jul-10 14:09:30, Raghu wrote:
> When I just run a for loop it works. But if I am going to
> run a for loop every time for large vectors I might as well
> use C or any other language.
> The reason R is powerful is because it can handle large vectors
> without each element being manipulated? Please let me know where
> I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i] <- 1
> + else
> + s[i] <- -1
> + }
>
> -- 
> 'Raghu'

Many operations over the whole length of vectors can be done in "vectorised" form, in which an entire vector is changed in one operation based on the values of the separate elements of other vectors, all taken into account in a single operation. What happens "behind the scenes" is that the element-by-element operations are performed by a function in a precompiled (usually from C) library. Hence R already does what you are suggesting as a "might as well" alternative!

Below is an example, using long vectors. The first case is a copy of your R loop above (with some additional initialisation of the vectors). The second achieves the same result in the "vectorised" form.

  news1o <- runif(100)
  s2o    <- runif(100)
  s      <- numeric(length(news1o))

  proc.time()
  #    user  system elapsed
  #   1.728   0.680 450.257
  for(i in 1:length(news1o)){  ### Using a loop
    if(news1o[i]>s2o[i])
      s[i] <- 1
    else
      s[i] <- (-1)
  }
  proc.time()
  #    user  system elapsed
  #  11.184   0.756 460.340
  s2 <- 2*(news1o > s2o) - 1   ### Vectorised
  proc.time()
  #    user  system elapsed
  #  11.348   0.852 460.663

  sum(s2 != s)
  # [1] 0   ### Results identical

Result: The loop took (11.184 - 1.728) = 9.456 seconds; vectorised, it took (11.348 - 11.184) = 0.164 seconds.

  Loop/Vector = (11.184 - 1.728)/(11.348 - 11.184) = 57.65854

i.e. nearly 60 times as long.

Ted.

E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jul-10   Time: 17:36:07
-- XFMail --
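The subtraction Ted does by hand on the printed proc.time() values can also be done in code by storing each snapshot. A minimal sketch, assuming a vector length of 10^5 picked only for illustration; t0, t1, t2, loop_time and vec_time are names introduced here.

```r
news1o <- runif(1e5)             # illustrative length, not from the thread
s2o    <- runif(1e5)
s      <- numeric(length(news1o))

t0 <- proc.time()
for (i in seq_along(news1o)) {   # loop version
  if (news1o[i] > s2o[i]) s[i] <- 1 else s[i] <- -1
}
t1 <- proc.time()
s2 <- 2 * (news1o > s2o) - 1     # vectorised version
t2 <- proc.time()

# Differences of snapshots; "user.self" is the column printed as "user"
loop_time <- (t1 - t0)[["user.self"]]
vec_time  <- (t2 - t1)[["user.self"]]

sum(s2 != s)                     # 0 when both versions agree
```

On a fast machine vec_time can come out as 0 at this length, so a ratio such as loop_time / vec_time is only meaningful for vectors long enough to register.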
Re: [R] in continuation with the earlier R puzzle
On Jul 12, 2010, at 10:09 AM, Raghu wrote:

> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without
> each element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i] <- 1
> + else
> + s[i] <- -1
> + }

Perhaps:

s <- 2*( news1o > s2o[1:length(news1o)] ) - 1

...which I think will throw errors under pretty much the same conditions that would cause errors in that loop.

-- 
David Winsemius, MD
West Hartford, CT
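One condition worth checking is s2o being shorter than news1o. The sketch below uses made-up four- and three-element vectors; s_vec, s_loop and loop_msg are names introduced here. As far as I can tell, in this case the vectorised expression does not stop but returns NA in the affected position, while the loop stops at the first NA comparison.

```r
news1o <- c(0.9, 0.2, 0.7, 0.4)
s2o    <- c(0.5, 0.5, 0.5)       # one element short (illustrative data)

# Vectorised form: indexing past the end of s2o yields NA, which propagates
s_vec <- 2 * (news1o > s2o[1:length(news1o)]) - 1
s_vec                            # 1 -1 1 NA

# Loop form: if() cannot branch on NA, so the loop stops with an error
s_loop <- numeric(length(news1o))
loop_msg <- tryCatch({
  for (i in seq_along(news1o)) {
    if (news1o[i] > s2o[i]) s_loop[i] <- 1 else s_loop[i] <- -1
  }
  "no error"
}, error = function(e) conditionMessage(e))
loop_msg                         # the error message raised by if()
```

So the two forms fail on the same input but in different ways, and anyNA(s_vec) is a cheap check to add after the vectorised version.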
Re: [R] in continuation with the earlier R puzzle
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i] <- 1
> + else
> + s[i] <- -1
> + }

You might give ifelse() a shot here.

s <- ifelse(news1o > s2o, 1, -1)

Learning to think in vectors is important in R, just like thinking in sets is important for SQL, or thinking in rows and steps is important in SAS.

cur
-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
Re: [R] in continuation with the earlier R puzzle
I don't know what is wrong with your code, but I believe you should use ifelse instead of a for loop:

s <- ifelse(news1o > s2o, 1, -1)

Alain

On 12-Jul-10 16:09, Raghu wrote:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without
> each element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i] <- 1
> + else
> + s[i] <- -1
> + }

-- 
Alain Guillet
Statistician and Computer Scientist
SMCS - IMMAQ - Université catholique de Louvain
Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium
tel: +32 10 47 30 50
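A small sketch of how the ifelse() form treats the edge cases, using four made-up values rather than data from the thread: a tie counts as "not greater" and gives -1, and a missing value stays missing instead of stopping the computation.

```r
news1o <- c(0.9, 0.2, 0.5, NA)   # illustrative values, including a tie and an NA
s2o    <- c(0.5, 0.5, 0.5, 0.5)

s <- ifelse(news1o > s2o, 1, -1)
s                                # 1 -1 -1 NA
```

If a missing comparison should instead map to one of the two values, it needs to be handled explicitly, for example by filtering on !is.na(news1o) before the comparison.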