Re: [R] Counting number of consecutive occurrences per rows
Hi

I slightly modified Jim's code.

The first part is a function that splits the data frame test according to act, juln and day and computes the repetitions in each chunk.

fff <- function(x) {
  fac <- factor((x[, "act"] == 0) * 1 + (x[, "act"] == 200) * 2, levels = c(1, 0, 2))
  int <- interaction(x[, "juln"], x[, "day"], fac)
  res <- cumsum(c(1, abs(diff(as.numeric(int)))))
  res
}

test$fac <- fff(test)

The second part evaluates the length of each chunk.

test$res <- ave(test$fac, test$fac, FUN = length)

The last part computes the max (min, sum) of res in each distinct chunk.

fff2 <- function(x) {
  fac <- factor((x[, "act"] == 0) * 1 + (x[, "act"] == 200) * 2, levels = c(1, 0, 2),
                labels = c("0", "1-199", "200"))
  fac
}

aggregate(test$res, list(test$juln, test$day), max)
aggregate(test$res, list(test$juln, test$day, fff2(test)), max)

Is it what you want?

Petr

From: zuzana zajkova [mailto:zuzu...@gmail.com]
Sent: Friday, May 03, 2013 7:10 PM
To: PIKAL Petr; jholt...@gmail.com
Cc: r-help@r-project.org
Subject: Re: [R] Counting number of consecutive occurrences per rows

Hi,

I'm sorry that it has taken me so long to respond; yesterday I finally had time to try your suggestions. Thank you for them!

I tried both. They give the same results, but in both there are some things I still need to solve. I would appreciate your help. I include a slightly bigger data frame (test2, at the end of this email), with more variation in the variables, to better explain what I would like to calculate in addition.

Jim's code:
I needed to make some changes in assigning the key. Yours worked OK for the small "test" data, but when I tried it on my data frame, which has around 25000 rows, it didn't work properly.

test2$key[test2$act == 0] <- 1
test2$key[test2$act > 0 & test2$act < 200] <- 2
test2$key[test2$act == 200] <- 3

# this works ok
test2$resChange <- cumsum(c(1, abs(diff(test2$key))))
test2$res <- ave(test2$resChange, test2$resChange, FUN = length)

# I added new column by jul date
test2$resJ <- ave(test2$resChange, test2$resChange, test2$juln, FUN = length)

# this works fine as well, for dividing between day 0 and day 1
test2$resJD <- ave(test2$resChange, test2$resChange, test2$juln, test2$day, FUN = length)

# resume
test2Resume_day <- test2[
    , list(maxres = max(res)
         , minres = min(res)
         , sumres = length(unique(resChange)))
    , keyby = c('day', 'key')]
# change 'key'
test2Resume_day$key <- c('0', '1-199', '200')[test2Resume_day$key]
test2Resume_day
   day   key maxres minres sumres
1:   0     0      2      2      3
2:   0 1-199      3      1      9
3:   0   200      6      1      7
4:   1     0      1      1      1
5:   1 1-199     10      1      7
6:   1   200      6      1      6

# resume by juln
test2Resume_jul <- test2[
    , list(maxres = max(res)
         , minres = min(res)
         , sumres = length(unique(resChange)))
    , keyby = c('juln', 'key')] # by juln
# change 'key'
test2Resume_jul$key <- c('0', '1-199', '200')[test2Resume_jul$key]
test2Resume_jul
    juln   key maxres minres sumres
1: 15173     0      2      2      1
2: 15173 1-199      3      1      7
3: 15173   200      6      1      6
4: 15174     0      2      1      3
5: 15174 1-199     10      1      8
6: 15174   200      6      1      6

This is OK, but what I would like to get is a summary by juln and also by the variable day (0 and 1), like this:

 juln day   key maxres minres sumres
15173   0     0
15173   0 1-199
15173   0   200
15173   1     0
15173   1 1-199
15173   1   200
15174   0     0
15174   0 1-199
15174   0   200
15174   1     0
15174   1 1-199
15174   1   200
...

The other thing is that I would like "sumres" to be the sum of the occurrence counts for each "key". For example, if in the test2 data frame the res values for key 200 (juln 15173) are 1, 1, 2, 2, 1, 2, then sumres should be 9 (1+1+2+2+1+2), not 6 (which I suppose comes from the number of unique occurrences).
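
Both requests (a summary by juln, day and key, and sumres as the sum of the run lengths) could perhaps be handled together by first building a table with one row per run and then aggregating that. The following is only a sketch in base R under stated assumptions: the toy data frame d and the names runs, first, grp and len are made up for illustration, the key coding 1/2/3 follows the code above, and a run is taken to end whenever juln, day or key changes.

```r
# Toy data, made up for illustration; column names follow test2
d <- data.frame(
  juln = rep(15173, 8),
  day  = c(1, 1, 1, 1, 0, 0, 0, 0),
  act  = c(200, 200, 50, 200, 200, 0, 0, 200)
)

# key as above: 1 = 0, 2 = 1-199, 3 = 200
d$key <- ifelse(d$act == 0, 1L, ifelse(d$act == 200, 3L, 2L))

# one row per run; a run ends whenever juln, day or key changes
r     <- rle(paste(d$juln, d$day, d$key))
first <- cumsum(r$lengths) - r$lengths + 1   # index of the first row of each run
runs  <- data.frame(juln = d$juln[first], day = d$day[first],
                    key = d$key[first], len = r$lengths)

# summarise the run lengths by juln, day and key
grp <- list(juln = runs$juln, day = runs$day, key = runs$key)
res <- aggregate(runs$len, grp, max)
names(res)[names(res) == "x"] <- "maxres"
res$minres <- aggregate(runs$len, grp, min)$x
res$sumres <- aggregate(runs$len, grp, sum)$x   # sum of run lengths, not number of runs
res$key <- c("0", "1-199", "200")[res$key]
res
```

Because every row belongs to exactly one run, the sumres values add up to the number of rows, which gives a quick sanity check on real data (144 per full day of 10-minute records).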

Petr's code:
This works fine too. The thing is that for the aggregation I would need the intervals to be

[0, 1) [1, 199] (199, 200]

and I don't know if that is possible. I checked the help for cut, but I found that the intervals can only be closed on the right or on the left.

Thank you very much for your time and for sharing your knowledge!

Zuzana

## here is the bigger test2 dataframe
> dput(test2)
structure(list(daten = structure(c(15173, 15173, 15173, 15173, 15173, 15173, 15173, 1
Re: [R] Counting number of consecutive occurrences per rows
"win", "win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win"), night = structure(c(1310962792, 1310963392,
1310963992, 1310964592, 1310965192, 1310965792, 1310966392, 1310966992,
1310967592, 1310968192, 1310968792, 1310969392, 1310969992, 1310970592,
1310971192, 1310971792, 1310972392, 1310972992, 1310973592, 1310974192,
1310974792, 1310975392, 1311107991, 1311108591, 1311109191, 1311109791,
130391, 130991, 131591, 132191, 132791, 133391, 133991, 134591, 135191,
135791, 136391, 136991, 137591, 138191, 138791, 139391, 139991,
1311034191, 1311034791, 1311035391, 1311035991, 1311036591, 1311037191,
1311037791, 1311038391, 1311038991, 1311039591, 1311040191, 1311040791,
1311041391, 1311041991, 1311042591, 1311043191, 1311043791), class = c("POSIXct",
"POSIXt"), tzone = "GMT"), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
act = c(196, 200, 199, 200, 197, 198, 197, 200, 200, 197,
200, 200, 198, 200, 1, 1, 0, 0, 1, 2,
200, 200, 200, 200, 200, 200, 199, 61, 0, 194,
198, 198, 196, 193, 194, 193, 197, 198, 199, 200,
197, 199, 199, 200, 198, 200, 200, 198, 200, 34,
1, 1, 0, 0, 199, 200, 199, 7, 0, 0)), .Names = c("daten", "juln",
"fen", "night", "day", "act"), row.names = 9990:10049, class = "data.frame")

On 29 April 2013 14:35, PIKAL Petr wrote:
> [...]
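
On the question of the intervals [0, 1), [1, 199], (199, 200]: cut() does close every interval on the same side, but if act only takes whole numbers, the same three classes can be obtained by closing on the left and shifting the breaks. A minimal sketch, assuming act is integer-valued between 0 and 200; the example vector and the name cls are made up, and the labels are the ones used elsewhere in this thread.

```r
# Made-up example values covering the class boundaries
act <- c(0, 1, 34, 199, 200)

# right = FALSE gives [0,1), [1,200), [200,Inf); for whole numbers this is
# the same grouping as [0,1), [1,199], (199,200]
cls <- cut(act, breaks = c(0, 1, 200, Inf), right = FALSE,
           labels = c("0", "1-199", "200"))

table(cls)
```

If act can hold non-integer values between 199 and 200, this shortcut no longer matches the requested intervals, and an explicit ifelse() on act == 0 and act == 200 would be the safer choice.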
Re: [R] Counting number of consecutive occurrences per rows
Hi

rrr <- rle(as.numeric(cut(test$act, c(0, 1, 199, 200), include.lowest = TRUE)))
test$res <- rep(rrr$lengths, rrr$lengths)

If you put it in a function

fff <- function(x, limits = c(0, 1, 199, 200)) {
  rrr <- rle(as.numeric(cut(x, limits, include.lowest = TRUE)))
  res <- rep(rrr$lengths, rrr$lengths)
  res
}

you can use the split/lapply approach

test$res2 <- unlist(lapply(split(test$act, factor(test$day, levels = c(1, 0))), fff))

Beware of the ordering of days in the output. Without explicit levels in the factor, 0 precedes 1, and the values no longer line up with the rows of test.

And for the last part, aggregate is probably the way.

> aggregate(test$res, list(test$jul, cut(test$act, c(0,1,199,200), include.lowest=TRUE)), max)
  Group.1   Group.2 x
1   14655     [0,1] 4
2   14655   (1,199] 3
3   14655 (199,200] 3

> aggregate(test$res, list(test$jul, cut(test$act, c(0,1,199,200), include.lowest=TRUE)), min)
  Group.1   Group.2 x
1   14655     [0,1] 4
2   14655   (1,199] 1
3   14655 (199,200] 2

Regards
Petr

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of zuzana zajkova
> Sent: Monday, April 29, 2013 12:45 PM
> To: r-help@r-project.org
> Subject: [R] Counting number of consecutive occurrences per rows
>
> [...]
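
The ordering pitfall with split/unlist can also be avoided with ave(), which puts each group's result back into the original row positions. This is only a sketch of that alternative, not what was posted: run_len is a renamed copy of the rle-based function above, and act and day are the two columns of the small example typed in as plain vectors.

```r
# act and day from the small example data
act <- c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0, 34, 200, 200, 145)
day <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)

# length of the run that each observation belongs to
run_len <- function(x, limits = c(0, 1, 199, 200)) {
  r <- rle(as.numeric(cut(x, limits, include.lowest = TRUE)))
  rep(r$lengths, r$lengths)
}

res  <- run_len(act)                   # runs over the whole series
res2 <- ave(act, day, FUN = run_len)   # runs restarted per day, original row order kept

cbind(act, day, res, res2)
```

On year-long data all rows with the same day value would land in one group regardless of date, so the julian date would probably have to be passed to ave() as a further grouping variable.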
Re: [R] Counting number of consecutive occurrences per rows
try this:

> test <- structure(list(jul = structure(c(14655, 14655, 14655, 14655,
+ 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655,
+ 14655, 14655, 14655), origin = structure(0, class = "Date")),
+ time = structure(c(1266258354, 1266258954, 1266259554, 1266260154,
+ 1266260754, 1266261354, 1266261954, 1266262554, 1266263154,
+ 1266263754, 1266264354, 1266264954, 1266265554, 1266266154,
+ 1266266754, 1266267354), class = c("POSIXct", "POSIXt"), tzone =
+ "GMT"),
+ act = c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0,
+ 34, 200, 200, 145), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 0, 0, 0, 0, 0, 0)), .Names = c("jul", "time", "act", "day"
+ ), class = "data.frame", row.names = c(510L, 512L, 514L, 516L,
+ 518L, 520L, 522L, 524L, 526L, 528L, 530L, 532L, 534L, 536L, 538L,
+ 540L))
>
> # add key to separate data
> test$key <- ifelse(test$act == 0
+ , 1L # 0
+ , ifelse(test$act == 200
+ , 3L # 200
+ , 2L # 1-199
+ )
+ )
> # mark changes in sequence
> test$resChange <- cumsum(c(1L, abs(diff(test$key))))
> test$res <- ave(test$resChange, test$resChange, FUN = length)
>
> test$res2 <- ave(test$resChange, test$resChange, test$day, FUN = length)
>
> test
      jul                time act day key resChange res res2
510 14655 2010-02-15 18:25:54 130   1   2         1   3    3
512 14655 2010-02-15 18:35:54  23   1   2         1   3    3
514 14655 2010-02-15 18:45:54  45   1   2         1   3    3
516 14655 2010-02-15 18:55:54 200   1   3         2   3    3
518 14655 2010-02-15 19:05:54 200   1   3         2   3    3
520 14655 2010-02-15 19:15:54 200   1   3         2   3    3
522 14655 2010-02-15 19:25:54 199   1   2         3   2    2
524 14655 2010-02-15 19:35:54 150   1   2         3   2    2
526 14655 2010-02-15 19:45:54   0   1   1         4   4    2
528 14655 2010-02-15 19:55:54   0   1   1         4   4    2
530 14655 2010-02-15 20:05:54   0   0   1         4   4    2
532 14655 2010-02-15 20:15:54   0   0   1         4   4    2
534 14655 2010-02-15 20:25:54  34   0   2         5   1    1
536 14655 2010-02-15 20:35:54 200   0   3         6   2    2
538 14655 2010-02-15 20:45:54 200   0   3         6   2    2
540 14655 2010-02-15 20:55:54 145   0   2         7   1    1
>

On Mon, Apr 29, 2013 at 6:44 AM, zuzana zajkova wrote:
> [...]
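
The run-marking step may be easier to follow on a short vector. The sketch below uses a made-up key vector and two ways of numbering the runs: the abs(diff()) form from the code above, and a variant with diff() != 0 that is not from the thread. The names id_abs, id_seq, len_abs and len_seq are illustrative only.

```r
# Made-up key vector with a jump from 1 to 3
key <- c(2L, 2L, 3L, 3L, 1L, 1L, 3L)

# as in the code above: the id grows by |difference| at each change
id_abs <- cumsum(c(1L, abs(diff(key))))    # ids can skip numbers on jumps of 2

# variant: the id grows by exactly 1 at each change
id_seq <- cumsum(c(1L, diff(key) != 0))

# run length per observation is the same with either numbering
len_abs <- ave(id_abs, id_abs, FUN = length)
len_seq <- ave(id_seq, id_seq, FUN = length)

cbind(key, id_abs, id_seq, len_abs, len_seq)
```

Either numbering labels each run with its own id, so ave(..., FUN = length) and length(unique(...)) behave the same; the second form just keeps the ids consecutive.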
Re: [R] Counting number of consecutive occurrences per rows
Forgot the last part of the question:

> test <- structure(list(jul = structure(c(14655, 14655, 14655, 14655,
+ 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655,
+ 14655, 14655, 14655), origin = structure(0, class = "Date")),
+ time = structure(c(1266258354, 1266258954, 1266259554, 1266260154,
+ 1266260754, 1266261354, 1266261954, 1266262554, 1266263154,
+ 1266263754, 1266264354, 1266264954, 1266265554, 1266266154,
+ 1266266754, 1266267354), class = c("POSIXct", "POSIXt"), tzone =
+ "GMT"),
+ act = c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0,
+ 34, 200, 200, 145), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 0, 0, 0, 0, 0, 0)), .Names = c("jul", "time", "act", "day"
+ ), class = "data.frame", row.names = c(510L, 512L, 514L, 516L,
+ 518L, 520L, 522L, 524L, 526L, 528L, 530L, 532L, 534L, 536L, 538L,
+ 540L))
>
> # add key to separate data
> test$key <- ifelse(test$act == 0
+ , 1L # 0
+ , ifelse(test$act == 200
+ , 3L # 200
+ , 2L # 1-199
+ )
+ )
> # mark changes in sequence
> test$resChange <- cumsum(c(1L, abs(diff(test$key))))
> test$res <- ave(test$resChange, test$resChange, FUN = length)
>
> test$res2 <- ave(test$resChange, test$resChange, test$day, FUN = length)
>
> require(data.table) # use this for aggregation
> test <- data.table(test)
> testResume <- test[
+ , list(maxres = max(res)
+ , minres = min(res)
+ , sumres = length(unique(resChange))
+ )
+ , keyby = c('day', 'key')
+ ]
> # change 'key'
> testResume$key <- c('0', '1-199', '200')[testResume$key]
> testResume
   day   key maxres minres sumres
1:   0     0      4      4      1
2:   0 1-199      1      1      2
3:   0   200      2      2      1
4:   1     0      4      4      1
5:   1 1-199      3      2      2
6:   1   200      3      3      1
>

On Mon, Apr 29, 2013 at 6:44 AM, zuzana zajkova wrote:
> [...]
[R] Counting number of consecutive occurrences per rows
Hi,

I would appreciate it if somebody could help me with the following calculation. I have a data frame with one record every 10 minutes, covering roughly one year. This is a small example:

> dput(test)
structure(list(jul = structure(c(14655, 14655, 14655, 14655,
14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655, 14655,
14655, 14655, 14655), origin = structure(0, class = "Date")),
time = structure(c(1266258354, 1266258954, 1266259554, 1266260154,
1266260754, 1266261354, 1266261954, 1266262554, 1266263154,
1266263754, 1266264354, 1266264954, 1266265554, 1266266154,
1266266754, 1266267354), class = c("POSIXct", "POSIXt"), tzone =
"GMT"),
act = c(130, 23, 45, 200, 200, 200, 199, 150, 0, 0, 0, 0,
34, 200, 200, 145), day = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0)), .Names = c("jul", "time", "act", "day"
), class = "data.frame", row.names = c(510L, 512L, 514L, 516L,
518L, 520L, 522L, 524L, 526L, 528L, 530L, 532L, 534L, 536L, 538L,
540L))

It looks like this:

> test
      jul                time act day
510 14655 2010-02-15 18:25:54 130   1
512 14655 2010-02-15 18:35:54  23   1
514 14655 2010-02-15 18:45:54  45   1
516 14655 2010-02-15 18:55:54 200   1
518 14655 2010-02-15 19:05:54 200   1
520 14655 2010-02-15 19:15:54 200   1
522 14655 2010-02-15 19:25:54 199   1
524 14655 2010-02-15 19:35:54 150   1
526 14655 2010-02-15 19:45:54   0   1
528 14655 2010-02-15 19:55:54   0   1
530 14655 2010-02-15 20:05:54   0   0
532 14655 2010-02-15 20:15:54   0   0
534 14655 2010-02-15 20:25:54  34   0
536 14655 2010-02-15 20:35:54 200   0
538 14655 2010-02-15 20:45:54 200   0
540 14655 2010-02-15 20:55:54 145   0

What I would like to calculate is the number of consecutive occurrences of the value 200, of the value 0, and of the values from 1 to 199 taken together (that is, all values other than 200 and 0) in column "act".

I would like to get something like this (result$res):

> result
      jul                time act day res res2
510 14655 2010-02-15 18:25:54 130   1   3    3
512 14655 2010-02-15 18:35:54  23   1   3    3
514 14655 2010-02-15 18:45:54  45   1   3    3
516 14655 2010-02-15 18:55:54 200   1   3    3
518 14655 2010-02-15 19:05:54 200   1   3    3
520 14655 2010-02-15 19:15:54 200   1   3    3
522 14655 2010-02-15 19:25:54 199   1   2    2
524 14655 2010-02-15 19:35:54 150   1   2    2
526 14655 2010-02-15 19:45:54   0   1   4    2
528 14655 2010-02-15 19:55:54   0   1   4    2
530 14655 2010-02-15 20:05:54   0   0   4    2
532 14655 2010-02-15 20:15:54   0   0   4    2
534 14655 2010-02-15 20:25:54  34   0   1    1
536 14655 2010-02-15 20:35:54 200   0   2    2
538 14655 2010-02-15 20:45:54 200   0   2    2
540 14655 2010-02-15 20:55:54 145   0   1    1

And, if possible, distinguish between day==1 and day==0 (see the "act" values of 0, for example), with results as in result$res2.

After that I would like to make a summary table per day (jul), where
maxres is max(result$res) for the "act" value,
minres is min(result$res) for the "act" value,
sumres is sum(result$res) for the "act" value (for example, if the value 200 occurs at different times within a day (jul) in runs of 3, 5, 1, 6 and 7 consecutive records, sumres would be 3+5+1+6+7 = 22),

something like this (these are made-up numbers):

  jul   act maxres minres sumres
14655     0      4      1     25
14655   200      3      2     48
14655 1-199      3      1     71
14656     0      8      2     38
14656   200     15      3     60
14656 1-199     11      4     46
...

(theoretically the sum of sumres per day (jul) should be 144)

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

I hope my explanation is sufficient. I appreciate any hint.

Thank you,
Zuzana