Re: [R] How to identify runs or clusters of events in time

Clint Bowman Fri, 01 Jul 2016 10:50:40 -0700

Mark,

I did something similar a couple of years ago by coding non-events as 0, positive events as +1, and negative events as -1, then summing the values through time. In my case the patterns showed up quite clearly, and I used other criteria to define the actual periods.
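
A minimal sketch of that coding-and-summing idea in base R, in case it helps as a starting point. The dates, the outcome labels, the daily grid, and the names checks, days, score and running are all assumptions made up for illustration; they are not the original data or code.

```r
# Hypothetical spot-check data: dates and outcomes are invented for illustration.
checks <- data.frame(
  date = as.Date("2016-06-01") + c(0, 3, 5, 9, 10, 11, 13, 14, 20, 24),
  outcome = c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos", "pos")
)

# Lay the events on a daily grid (an assumption) so that days without
# an event are coded 0.
days <- seq(min(checks$date), max(checks$date), by = "day")
score <- numeric(length(days))

# Code positive events as +1 and negative events as -1.
idx <- match(checks$date, days)
score[idx] <- ifelse(checks$outcome == "pos", 1, -1)

# Sum the values through time. A sustained fall in the running sum marks
# a stretch of negatives; a flat stretch marks a period with no events.
running <- cumsum(score)

# Day on which the running sum reaches its lowest point.
lowest_day <- days[which.min(running)]
```

With these made-up data the running sum drops from 3 to -2 between June 10 and June 15, which is the kind of pattern that shows up when the series is plotted, e.g. with plot(days, running, type = "s"). Deciding where a period actually starts and stops still needs separate criteria, which this sketch does not attempt.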


Clint

Clint Bowman                    INTERNET:       cl...@ecy.wa.gov
Air Quality Modeler             INTERNET:       cl...@math.utah.edu
Department of Ecology           VOICE:          (360) 407-6815
PO Box 47600                    FAX:            (360) 407-7534
Olympia, WA 98504-7600

        USPS:           PO Box 47600, Olympia, WA 98504-7600
        Parcels:        300 Desmond Drive, Lacey, WA 98503-1274

On Fri, 1 Jul 2016, Mark Shanks wrote:

Hi,


Imagine the following two problems:


1) You have an event that occurs repeatedly over time. You want to identify 
periods when the event occurs more frequently than the base rate of occurrence. 
Ideally, you don't want to have to specify the period (e.g., break into 
months), so the analysis can be sensitive to scenarios such as many events 
happening only between, e.g., June 10 and June 15 - even though the overall 
number of events for the month may not be much greater than usual. Similarly, 
there may be a cluster of events that occur from March 28 to April 3. Ideally, 
you want to pull out the base rate of occurrence and highlight only the periods 
when the frequency is less or greater than the base rate.


2) Events again occur repeatedly over time in an inconsistent way. However, 
this time, the event has positive or negative outcomes - such as a spot check 
of conformity to regulations. You again want to know whether there is a group 
of negative outcomes close together in time. This analysis should take into 
account the positive outcomes as well though. E.g., if from June 10 to June 15 
you get 5 negative outcomes and no positive outcomes it should be flagged. On 
the other hand, if from June 10 to June 15 you get 5 negative outcomes 
interspersed between many positive outcomes it should be ignored.


I'm guessing that there is some statistical approach designed to look at these 
types of issues. What is it called? What package in R implements it? I 
basically just need to know where to start.


Thanks,


Mark


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

