[R] Counting consecutive events in R

2015-05-14 Thread Abhinaba Roy
Hi,

I have the following dataframe

structure(list(Type = c("QRS", "QRS", "QRS", "QRS", "QRS", "QRS",
"QRS", "QRS", "QRS", "QRS", "QRS", "QRS", "RR", "RR", "RR", "PP",
"PP", "PP", "PP", "PP", "PP", "PP", "PP", "PP", "QTc", "QTc",
"QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc", "QTc",
"QTc", "QTc", "QTc", "QTc"), Time_Point_Start = c("2015-04-01 14:57:15.0.0312",
"2015-04-01 14:57:15.0.7839", "2015-04-01 14:57:16.0.5343",
"2015-04-01 14:57:17.0.2573",
"2015-04-01 14:57:18.0.0234", "2015-04-01 14:57:18.0.7722",
"2015-04-01 14:57:19.0.5265",
"2015-04-01 14:57:24.0.0195", "2015-04-01 14:57:24.0.7839",
"2015-04-01 14:57:25.0.5343",
"2015-04-01 14:57:26.0.2768", "2015-04-01 14:57:27.0.0273",
"2015-04-01 14:58:03.0.0702",
"2015-04-01 14:58:03.0.8190", "2015-04-01 14:58:04.0.5694",
"2015-04-01 14:57:58.0.4134",
"2015-04-01 14:57:59.0.1637", "2015-04-01 14:57:59.0.9126",
"2015-04-01 14:58:00.0.6630",
"2015-04-01 14:58:01.0.4134", "2015-04-01 14:58:02.0.1637",
"2015-04-01 14:58:02.0.9126",
"2015-04-01 14:58:03.0.6630", "2015-04-01 14:58:04.0.4134",
"2015-04-01 14:57:07.0.4212",
"2015-04-01 14:57:08.0.1715", "2015-04-01 14:57:08.0.9204",
"2015-04-01 14:57:09.0.6864",
"2015-04-01 14:57:10.0.4368", "2015-04-01 14:57:11.0.1871",
"2015-04-01 14:57:11.0.9360",
"2015-04-01 14:57:12.0.6591", "2015-04-01 14:57:13.0.4251",
"2015-04-01 14:57:14.0.1754",
"2015-04-01 14:57:14.0.9243", "2015-04-01 14:57:15.0.6903",
"2015-04-01 14:57:16.0.4407",
"2015-04-01 14:57:17.0.1676", "2015-04-01 14:57:17.0.9321"),
Time_Point_End = c("2015-04-01 14:57:15.0.0858", "2015-04-01 14:57:15.0.8346",
"2015-04-01 14:57:16.0.6006", "2015-04-01 14:57:17.0.0351",
"2015-04-01 14:57:18.0.1403", "2015-04-01 14:57:18.0.8385",
"2015-04-01 14:57:19.0.5889", "2015-04-01 14:57:24.0.0858",
"2015-04-01 14:57:24.0.8346", "2015-04-01 14:57:25.0.5772",
"2015-04-01 14:57:26.0.3939", "2015-04-01 14:57:27.0.0936",
"2015-04-01 14:58:03.0.8190", "2015-04-01 14:58:04.0.5694",
"2015-04-01 14:58:05.0.3197", "2015-04-01 14:57:59.0.1637",
"2015-04-01 14:57:59.0.9126", "2015-04-01 14:58:00.0.6630",
"2015-04-01 14:58:01.0.4134", "2015-04-01 14:58:02.0.1637",
"2015-04-01 14:58:02.0.9126", "2015-04-01 14:58:03.0.6630",
"2015-04-01 14:58:04.0.4134", "2015-04-01 14:58:05.0.1793",
"2015-04-01 14:57:07.0.8775", "2015-04-01 14:57:08.0.6435",
"2015-04-01 14:57:09.0.3705", "2015-04-01 14:57:10.0.1209",
"2015-04-01 14:57:10.0.8697", "2015-04-01 14:57:11.0.6201",
"2015-04-01 14:57:12.0.3861", "2015-04-01 14:57:13.0.1364",
"2015-04-01 14:57:13.0.8853", "2015-04-01 14:57:14.0.6513",
"2015-04-01 14:57:15.0.4017", "2015-04-01 14:57:16.0.1248",
"2015-04-01 14:57:16.0.9165", "2015-04-01 14:57:17.0.6162",
"2015-04-01 14:57:18.0.3900"), Value = c(0.0546, 0.0507,
0.0663, 0.0936, 0.117, 0.0663, 0.0624, 0.0663, 0.0507, 0.0429,
0.117, 0.0663, 0.7488, 0.7488, 0.7488, 0.7488, 0.7488, 0.7488,
0.7488, 0.7488, 0.7488, 0.7488, 0.7488, 0.7644, 0.033103481,
0.034056449, 0.032367699, 0.031000613, 0.031405867, 0.031241866,
0.032367699, 0.034337907, 0.033125921, 0.034337907, 0.034337907,
0.031241866, 0.034337907, 0.032367699, 0.032930616), Score = c(0L,
0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L,
0L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Type_Desc = c(NA, NA, NA,
NA, 1L, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, 1L,
1L, 1L, 1L, 1L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L), Pat_id = c(4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L)), .Names = c("Type", "Time_Point_Start", "Time_Point_End",
"Value", "Score", "Type_Desc", "Pat_id"), class = "data.frame",
row.names = c(NA,
-39L))


For each unique value in column 'Type', I want to check whether there
are 5 consecutive rows (if any) with 'Score' > 0.

If there are five consecutive rows with Score > 0 and 'Type_Desc'
= 0, then we print <Type>_low; else if 'Type_Desc' = 1, we print
<Type>_high, where <Type> is the value of 'Type' for those rows. The
search should end once 5 consecutive rows have been found.

So, for this data frame we will have two statements as follows:

1. PP_high
(reason: 5 consecutive rows with Score > 0 and 'Type_Desc' = 1)

2. QTc_low
(reason: 5 consecutive rows with Score > 0 and 'Type_Desc' = 0)

How can this problem be tackled in R?

Thanks,

Abhinaba

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Counting consecutive events in R

2015-05-14 Thread Sarah Goslee
Assuming I understand the problem correctly, you want to check for
runs of at least length five where both Score and Type_Desc assume
particular values. You don't care where they are or what other data
are associated, you just want to know if at least one such run exists
in your data frame.
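
As a small illustration of the idea (a sketch on a made-up logical vector, not the poster's data), rle() collapses a vector into runs of equal values, so "is there a run of at least five TRUE values" becomes a one-line test on its output. The names flag and has_long_run are hypothetical.

```r
# Made-up indicator vector: TRUE marks a row that meets the condition.
flag <- c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)

# rle() returns the length and the value of each run of equal elements.
r <- rle(flag)
r$lengths   # run lengths, in order of appearance
r$values    # the value repeated in each run

# TRUE if any run of TRUE values is at least five elements long.
has_long_run <- any(r$lengths >= 5 & r$values)
has_long_run
```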

Here's a function that does that:


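checkruns <- function(testdata) {

  # 1 where Score > 0 and Type_Desc is 1; rows with NA Type_Desc become 0
  test1 <- ifelse(testdata$Score > 0 & testdata$Type_Desc == 1 &
                  !is.na(testdata$Type_Desc), 1, 0)
  # 1 where Score > 0 and Type_Desc is 0; rows with NA Type_Desc become 0
  test0 <- ifelse(testdata$Score > 0 & testdata$Type_Desc == 0 &
                  !is.na(testdata$Type_Desc), 1, 0)

  # run-length encode each indicator vector
  test1.rle <- rle(test1)
  test0.rle <- rle(test0)

  # report whether any run of 1s is at least five rows long
  if (any(test1.rle$lengths >= 5 & test1.rle$values == 1))
    cat("Type_high\n")
  if (any(test0.rle$lengths >= 5 & test0.rle$values == 1))
    cat("Type_low\n")

  invisible()
}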

Sarah
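
The function above scans the whole data frame at once and prints the literal strings Type_high and Type_low. The following is a minimal sketch, not part of the original reply, of one way the same rle() idea might be applied separately to each value of Type so that the label carries the type name. The helper names has_run and label_type are hypothetical, only the Type, Score and Type_Desc columns of the posted data are reproduced, and checking "high" before "low" is an assumption about precedence when both kinds of run exist.

```r
# Type, Score and Type_Desc transcribed from the posted data frame;
# the remaining columns are omitted because the check does not use them.
dat <- data.frame(
  Type = rep(c("QRS", "RR", "PP", "QTc"), times = c(12, 3, 9, 15)),
  Score = c(0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 3, 0,    # QRS
            0, 0, 0,                               # RR
            0, 0, 2, 2, 2, 2, 2, 0, 0,             # PP
            rep(3, 15)),                           # QTc
  Type_Desc = c(NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, 1, NA,
                NA, NA, NA,
                NA, NA, 1, 1, 1, 1, 1, NA, NA,
                rep(0, 15)),
  stringsAsFactors = FALSE
)

# TRUE if the logical vector contains a run of at least n TRUE values.
has_run <- function(flag, n = 5) {
  r <- rle(flag)
  any(r$lengths >= n & r$values)
}

# Label one Type's rows: "<Type>_high", "<Type>_low", or NA if no run.
label_type <- function(d) {
  ok <- !is.na(d$Type_Desc) & d$Score > 0
  if (has_run(ok & d$Type_Desc == 1)) return(paste0(d$Type[1], "_high"))
  if (has_run(ok & d$Type_Desc == 0)) return(paste0(d$Type[1], "_low"))
  NA_character_
}

# Split by Type, keeping the order in which the types first appear.
by_type <- split(dat, factor(dat$Type, levels = unique(dat$Type)))
labels <- vapply(by_type, label_type, character(1))
labels <- unname(labels[!is.na(labels)])
cat(labels, sep = "\n")
```

On the posted data this yields the two labels the original question expects, PP_high and QTc_low.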


On Thu, May 14, 2015 at 8:16 AM, Abhinaba Roy <abhinabaro...@gmail.com> wrote:
> [...]

Re: [R] Counting consecutive events in R

2015-05-14 Thread Johannes Huesing

I normally use rle() for these problems, see ?rle.

For instance,

k <- rbinom(999, 1, .5)
series <- function(run) {
  r <- rle(run)
  # positions in the rle() output of runs of 1s with length >= 5
  which(r$lengths >= 5 & r$values == 1)
}
series(k)

returns the indices (positions in the rle() output, not in k itself)
of the runs of 1s that have length 5 or longer.
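
Because those indices refer to runs rather than to elements, a further step is needed to find where each run sits in the original vector. A minimal sketch, assuming a fixed example vector x in place of the random k (the names x, hit, starts, ends and runs are hypothetical): the cumulative sum of the run lengths gives the last position of each run, and subtracting the length gives the first.

```r
# Fixed example vector so the result is reproducible.
x <- c(0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1)

r <- rle(x)

# Runs of 1s that are at least five elements long (indices into r).
hit <- which(r$lengths >= 5 & r$values == 1)

# Last and first position of every run in the original vector.
ends <- cumsum(r$lengths)
starts <- ends - r$lengths + 1

# Start and end positions of the qualifying runs only.
runs <- data.frame(start = starts[hit], end = ends[hit])
runs
```

Here the qualifying runs occupy positions 2 to 6 and 11 to 16 of x.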
 


Abhinaba Roy <abhinabaro...@gmail.com> [Thu, May 14, 2015 at 02:16:31PM CEST]:
> [...]