Forgot the last part of the question: > test <- structure(list(jul = structure(c(14655, 14655, 14655, 14655, + 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, + 14655, 14655, 14655), origin = structure(0, class = "Date")), + time = structure(c(1266258354, 1266258954, 1266259554, 1266260154, + 1266260754, 1266261354, 1266261954, 1266262554, 1266263154, + 1266263754, 1266264354, 1266264954, 1266265554, 1266266154, + 1266266754, 1266267354), class = c("POSIXct", "POSIXt"), tzone = + "GMT"), + act = c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0, + 34, 200, 200, 145), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 0, 0, 0, 0, 0, 0)), .Names = c("jul", "time", "act", "day" + ), class = "data.frame", row.names = c(510L, 512L, 514L, 516L, + 518L, 520L, 522L, 524L, 526L, 528L, 530L, 532L, 534L, 536L, 538L, + 540L)) > > # add key to separate data > test$key <- ifelse(test$act == 0 + , 1L # 0 + , ifelse(test$act == 200 + , 3L # 200 + , 2L # 1-199 + ) + ) > # mark changes in sequence > test$resChange <- cumsum(c(1L, abs(diff(test$key)))) > test$res <- ave(test$resChange, test$resChange, FUN = length) > > test$res2 <- ave(test$resChange, test$resChange, test$day, FUN = length) > > require(data.table) # use this for aggregation > test <- data.table(test) > testResume <- test[ + , list(maxres = max(res) + , minres = min(res) + , sumres = length(unique(resChange)) + ) + , keyby = c('day', 'key') + ] > # change 'key' > testResume$key <- c('0', '1-199', '200')[testResume$key] > testResume day key maxres minres sumres 1: 0 0 4 4 1 2: 0 1-199 1 1 2 3: 0 200 2 2 1 4: 1 0 4 4 1 5: 1 1-199 3 2 2 6: 1 200 3 3 1 >
On Mon, Apr 29, 2013 at 6:44 AM, zuzana zajkova <zuzu...@gmail.com> wrote: > Hi, > > I would appreciate if somebody could help me with following calculation. > I have a dataframe, by 10 minutes time, for mostly one year data. This is > small example: > > > dput(test) > structure(list(jul = structure(c(14655, 14655, 14655, 14655, > 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, > 14655, 14655, 14655), origin = structure(0, class = "Date")), > time = structure(c(1266258354, 1266258954, 1266259554, 1266260154, > 1266260754, 1266261354, 1266261954, 1266262554, 1266263154, > 1266263754, 1266264354, 1266264954, 1266265554, 1266266154, > 1266266754, 1266267354), class = c("POSIXct", "POSIXt"), tzone = > "GMT"), > act = c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0, > 34, 200, 200, 145), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 0, 0, 0, 0, 0, 0)), .Names = c("jul", "time", "act", "day" > ), class = "data.frame", row.names = c(510L, 512L, 514L, 516L, > 518L, 520L, 522L, 524L, 526L, 528L, 530L, 532L, 534L, 536L, 538L, > 540L)) > > Looks like this: > > > test > jul time act day > 510 14655 2010-02-15 18:25:54 130 1 > 512 14655 2010-02-15 18:35:54 23 1 > 514 14655 2010-02-15 18:45:54 45 1 > 516 14655 2010-02-15 18:55:54 200 1 > 518 14655 2010-02-15 19:05:54 200 1 > 520 14655 2010-02-15 19:15:54 200 1 > 522 14655 2010-02-15 19:25:54 199 1 > 524 14655 2010-02-15 19:35:54 150 1 > 526 14655 2010-02-15 19:45:54 0 1 > 528 14655 2010-02-15 19:55:54 0 1 > 530 14655 2010-02-15 20:05:54 0 0 > 532 14655 2010-02-15 20:15:54 0 0 > 534 14655 2010-02-15 20:25:54 34 0 > 536 14655 2010-02-15 20:35:54 200 0 > 538 14655 2010-02-15 20:45:54 200 0 > 540 14655 2010-02-15 20:55:54 145 0 > > > What I would like to calculate is the number of consecutive occurrences of > values 200, 0 and together values from 1 til 199 (in fact the values that > differ from 200 and 0) in column "act". > > I would like to get something like this (result$res) > > > result > jul time act day res res2 > 510 14655 2010-02-15 18:25:54 130 1 3 3 > 512 14655 2010-02-15 18:35:54 23 1 3 3 > 514 14655 2010-02-15 18:45:54 45 1 3 3 > 516 14655 2010-02-15 18:55:54 200 1 3 3 > 518 14655 2010-02-15 19:05:54 200 1 3 3 > 520 14655 2010-02-15 19:15:54 200 1 3 3 > 522 14655 2010-02-15 19:25:54 199 1 2 2 > 524 14655 2010-02-15 19:35:54 150 1 2 2 > 526 14655 2010-02-15 19:45:54 0 1 4 2 > 528 14655 2010-02-15 19:55:54 0 1 4 2 > 530 14655 2010-02-15 20:05:54 0 0 4 2 > 532 14655 2010-02-15 20:15:54 0 0 4 2 > 534 14655 2010-02-15 20:25:54 34 0 1 1 > 536 14655 2010-02-15 20:35:54 200 0 2 2 > 538 14655 2010-02-15 20:45:54 200 0 2 2 > 540 14655 2010-02-15 20:55:54 145 0 1 1 > > And if possible, distinguish among day==1 and day==0 (see the "act" values > of 0 for example), results as in result$res2. > > After it I would like to make a resume table per days (jul): > where maxres is max(result$res) for the "act" value > where minres is min(result$res) for the "act" value > where sumres is sum(result$res) for the "act" value (for example, if the > 200 value ocurrs in different times per day(jul) consecutively 3, 5, 1, 6 > and 7 times the sumres would be 3+5+1+6+7= 22) > > something like this (this are made up numbers): > > jul act maxres minres sumres > 14655 0 4 1 25 > 14655 200 3 2 48 > 14655 1-199 3 1 71 > 14656 0 8 2 38 > 14656 200 15 3 60 > 14656 1-199 11 4 46 > ... > (theoretically the sum of sumres per day(jul) should be 144) > > ____________ > > sessionInfo() > R version 2.15.2 (2012-10-26) > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) > ____________ > > I hope my explanation is sufficient. I appreciate any hint. > Thank you, > > Zuzana > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.