Short: get rid of the loops I use and optimize runtime

Dear all,

For each row, I want to calculate the amount from one month earlier. I use a matrix with 2100 rows and 22 columns (which is still a very small matrix; other matrices can easily have more than 100000 rows).

Table before

Year  month  quarter  yearmonth  Service  ...  Amount
2009  9      Q3       092009     A        ...  120
2009  9      Q3       092009     B        ...  80
2009  8      Q3       082009     A        ...  40
2009  7      Q3       072009     A        ...  50

The result I want

Year  month  quarter  yearmonth  Service  ...  Amount  amount_lastmonth
2009  9      Q3       092009     A        ...  120     40
2009  9      Q3       092009     B        ...  80      ...
2009  8      Q3       082009     A        ...  40      50
2009  7      Q3       072009     A        ...  50      ...

The table is not exactly the same as my data, but it gives a good idea of what I have and what I want.

The code I have written (see below) does what I want, but it is very, very slow. It takes 129 s for 400 rows, and the time roughly quadruples each time I double the number of rows, so the runtime is quadratic in the number of rows.

I'm new to programming in R, but I found that you can use Rprof and summaryRprof to analyse your code (output below). However, I don't really understand the output.

I guess I need code that runs in linear time, which means getting rid of the two for loops. Can someone help me, or tell me what else I can do to optimize my runtime?

I use R 2.9.2 on Windows XP Service Pack 3.

Thank you in advance.

Best regards,

Willems Ian

*****************************

dataset[,5]  = month
dataset[,3]  = year
dataset[,22] = amount
dataset[,14] = servicetype
dataset[,18] = department

[CODE]
# For each row j of the matrix, check whether each row i has ...
for (j in 1:Number_rows) {
  sum <- 0
  for (i in 1:Number_rows) {
    if (dataset[j,14] == dataset[i,14]) {            # ... the same service type
      if (dataset[j,18] == dataset[i,18]) {          # ... the same department
        if (dataset[j,5] == "1") {                   # if month = 1, month ago is 12 and year is -1
          if ("12" == dataset[i,5]) {
            if ((dataset[j,3]-1) == dataset[i,3]) {
              sum <- sum + dataset[i,22]
            }
          }
        } else {
          if ((dataset[j,5]-1) == dataset[i,5]) {    # if month != 1, month ago is month - 1
            if (dataset[j,3] == dataset[i,3]) {
              sum <- sum + dataset[i,22]
            }
          }
        }
      }
    }
  }
}
[/CODE]

> summaryRprof()
$by.self
               self.time self.pct total.time total.pct
[.data.frame       33.92     26.2      80.90      62.5
NextMethod         12.68      9.8      12.68       9.8
[.factor            8.60      6.6      18.36      14.2
Ops.factor          8.10      6.3      40.08      31.0
sort.int            6.82      5.3      13.70      10.6
[                   6.70      5.2      85.44      66.0
names               6.54      5.1       6.54       5.1
length              5.66      4.4       5.66       4.4
==                  5.04      3.9      44.92      34.7
levels              4.80      3.7       5.56       4.3
is.na               4.24      3.3       4.24       3.3
dim                 3.66      2.8       3.66       2.8
switch              3.60      2.8       3.80       2.9
vector              2.68      2.1       8.02       6.2
inherits            1.90      1.5       1.90       1.5
any                 1.68      1.3       1.68       1.3
noNA.levels         1.46      1.1       7.84       6.1
.Call               1.40      1.1       1.40       1.1
!                   1.26      1.0       1.26       1.0
attr<-              1.06      0.8       1.06       0.8
.subset             1.00      0.8       1.00       0.8
class<-             0.82      0.6       0.82       0.6
!=                  0.80      0.6       0.80       0.6
levels.default      0.68      0.5       0.76       0.6
all                 0.62      0.5       0.62       0.5
<                   0.54      0.4       0.54       0.4
-                   0.48      0.4       0.48       0.4
is.factor           0.44      0.3       2.34       1.8
.subset2            0.38      0.3       0.38       0.3
attr                0.36      0.3       0.36       0.3
is.character        0.28      0.2       0.28       0.2
is.null             0.28      0.2       0.28       0.2
|                   0.26      0.2       0.26       0.2
oldClass<-          0.20      0.2       0.20       0.2
is.atomic           0.16      0.1       0.16       0.1
nzchar              0.10      0.1       0.10       0.1
is.numeric          0.06      0.0       0.06       0.0
oldClass            0.06      0.0       0.06       0.0
(                   0.04      0.0       0.04       0.0
[.data              0.02      0.0       0.02       0.0

$by.total
               total.time total.pct self.time self.pct
[                   85.44      66.0      6.70      5.2
[.data.frame        80.90      62.5     33.92     26.2
==                  44.92      34.7      5.04      3.9
Ops.factor          40.08      31.0      8.10      6.3
[.factor            18.36      14.2      8.60      6.6
sort.int            13.70      10.6      6.82      5.3
NextMethod          12.68       9.8     12.68      9.8
vector               8.02       6.2      2.68      2.1
noNA.levels          7.84       6.1      1.46      1.1
names                6.54       5.1      6.54      5.1
length               5.66       4.4      5.66      4.4
levels               5.56       4.3      4.80      3.7
is.na                4.24       3.3      4.24      3.3
switch               3.80       2.9      3.60      2.8
dim                  3.66       2.8      3.66      2.8
is.factor            2.34       1.8      0.44      0.3
inherits             1.90       1.5      1.90      1.5
any                  1.68       1.3      1.68      1.3
.Call                1.40       1.1      1.40      1.1
!                    1.26       1.0      1.26      1.0
attr<-               1.06       0.8      1.06      0.8
.subset              1.00       0.8      1.00      0.8
class<-              0.82       0.6      0.82      0.6
!=                   0.80       0.6      0.80      0.6
levels.default       0.76       0.6      0.68      0.5
all                  0.62       0.5      0.62      0.5
<                    0.54       0.4      0.54      0.4
-                    0.48       0.4      0.48      0.4
.subset2             0.38       0.3      0.38      0.3
attr                 0.36       0.3      0.36      0.3
is.character         0.28       0.2      0.28      0.2
is.null              0.28       0.2      0.28      0.2
|                    0.26       0.2      0.26      0.2
oldClass<-           0.20       0.2      0.20      0.2
is.atomic            0.16       0.1      0.16      0.1
nzchar               0.10       0.1      0.10      0.1
is.numeric           0.06       0.0      0.06      0.0
oldClass             0.06       0.0      0.06      0.0
(                    0.04       0.0      0.04      0.0
[.data               0.02       0.0      0.02      0.0

$sampling.time
[1] 129.38

Warning message:
In readLines(filename, n = chunksize) :
  incomplete final line found on 'Rprof.out'
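
For illustration only, the previous-month lookup described above might be sketched without the nested loops roughly as follows. This is a minimal sketch under assumptions, not a tested solution for the real data: the toy data frame and its column names (year, month, service, department, amount) are hypothetical stand-ins for columns 3, 5, 14, 18 and 22, and the columns are assumed to be plain numeric or character vectors rather than factors. The idea is to turn year and month into one running month number, total the amounts per group once, and then look up each row's previous month with match().

```r
# Hypothetical toy data mirroring the example table (column names are assumptions).
dataset <- data.frame(
  year       = c(2009, 2009, 2009, 2009),
  month      = c(9, 9, 8, 7),
  service    = c("A", "B", "A", "A"),
  department = c("X", "X", "X", "X"),
  amount     = c(120, 80, 40, 50),
  stringsAsFactors = FALSE
)

# Running month number: January's predecessor is automatically December
# of the previous year, so no special case for month == 1 is needed.
month_index <- dataset$year * 12 + dataset$month

# One key per (service, department, month); assumes "|" does not occur in the values.
key      <- paste(dataset$service, dataset$department, month_index,     sep = "|")
prev_key <- paste(dataset$service, dataset$department, month_index - 1, sep = "|")

# Total amount per key, computed once for the whole table.
totals <- tapply(dataset$amount, key, sum)

# Look up each row's previous-month total; rows without a match get NA.
idx <- match(prev_key, names(totals))
dataset$amount_lastmonth <- as.vector(totals)[idx]

# The original loop starts each sum at 0, so use 0 where no previous month exists.
dataset$amount_lastmonth[is.na(dataset$amount_lastmonth)] <- 0

print(dataset)
```

If the real columns are factors (the Ops.factor and [.factor entries in the profile suggest some are), they would presumably need converting first, for example with as.numeric(as.character(x)), before the arithmetic on month and year makes sense.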