Here is another version, loopless, but still
a bit clumsy (can the call to which be removed?):

f <- function (x = walevel,
               startThreshhold = 5.45,
               stopThreshhold = 5.3, 
               plot = TRUE) 
{
    stopifnot(startThreshhold > stopThreshhold)
    isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
    isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
    isOverStart <- x >= startThreshhold
    isOverStop <- x >= stopThreshhold
    possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart)
    possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop)
    pts <- c(possibleStartPt, possibleStopPt)
    names(pts) <- rep(c("start", "stop"),
        c(length(possibleStartPt), length(possibleStopPt)))
    pts <- pts[order(pts)]
    tmp <- isFirstInRun(names(pts))
    start <- pts[tmp & names(pts) == "start"]
    stop <- pts[tmp & names(pts) == "stop"]
    if (length(stop) > length(start)) { 
        # i.e., when first downcrossing happens before
        # first upcrossing
        stop <- stop[-1]
    }
    if (plot) {
        plot(x, cex = 0.5)
        abline(h = c(startThreshhold, stopThreshhold))
        abline(v = start, col = "green")
        abline(v = stop, col = "red")
    }
    data.frame(start = start, stop = stop)
}

# define the data in the original message and call f().

The isFirstInRun and isLastInRun functions do the
first part of the calculation that rle does and
avoid the cumsum(diff()) roundtrip that
cumsum(rle()$lengths) involves and their names
make it clearer what I'm trying to do.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of 
> Hutchinson,David [PYR]
> Sent: Thursday, May 27, 2010 10:41 AM
> To: Tonja Krueger
> Cc: r-help@r-project.org
> Subject: Re: [R] Peak Over Threshold values
> 
> Perhaps not elegant, but does the job
> 
> threshold <- 5.45
> meetThreshold <- walevel >= threshold
> 
> d <- rle(meetThreshold)
> startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
> 
> abv <- which(d$values == TRUE)
> p.o.t <- data.frame()
> 
> for (i in seq( along = abv ) ) {
>   a <- startPos[abv[i]]
>   b <- a + (d$lengths[abv[i]] - 1)
>   e <- which.max(walevel[a:b])
>   p.o.t <- rbind( p.o.t, data.frame(
>                pos = a + e - 1,
>                val = walevel[a:b][e]
>            ) )
> }
> 
> plot (
>   day,
>   walevel, type = 'l'
> )
> 
> points(
>   p.o.t$pos,
>   p.o.t$val,
>   col = 'red'
> )
> 
> abline(h = threshold, lty = 2, col = 'red')
> 
> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger
> Sent: Thursday, May 27, 2010 1:47 AM
> To: Vito Muggeo (UniPa); Clint Bowman
> Cc: r-help@r-project.org
> Subject: [R] Peak Over Threshold values
> 
> 
>    I'm  sorry, but that's not exactly what I was looking for. 
> I obviously
>    didn't explain properly:
> 
>    Within  my dataframe (df) I would like to find POT values 
> that are not
>    linked. In my definition two maximum values are linked if 
> walevel does not
>    fall below a certain value (the lower threshold value).
> 
>    walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 
> 5.86, 5.91, 5.91,
>    5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 
> 6.11, 6.14, 6.12,
>    6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 
> 5.72, 5.70, 5.65,
>    5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 
> 5.19, 5.19, 5.13,
>    5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 
> 5.22, 5.32, 5.29,
>    5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 
> 5.66, 5.68, 5.72,
>    5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 
> 5.62, 5.62, 5.61,
>    5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 
> 5.45, 5.47, 5.50,
>    5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
> 
>    day <- c(1:100)
> 
>    df <- data.frame(day,walevel)
> 
>    For example in my dataframe  a threshold value = 5.45 and 
> an lower threshold
>    value = 5.3 should lead to two maximum values because 
> between the 2^nd and
>    3^rd peak walevel does not fall below the lower threshold value.
> 
>    With  "clusters (...)" from "evd package", I could find 
> out POT values but
>    even though I set a lower threshold value for my example 
> dataframe I would
>    get three maximum values instead of two.
> 
>    library(evd)
> 
>    clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)
> 
>    clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)
> 
>    Changing r to 30 (for example) connects the 2^nd and 3^rd 
> maximum events and
>    gives out the right 'end' for the first extreme event but 
> not for the second
>    event. (Also I'd rather not use a time interval to prove 
> that the events are
>    linked)
> 
>    clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)
> 
>    clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)
> 
>    What went wrong???
> 
>    Tonja
>    -----Ursprüngliche Nachricht-----
>    How about?
>    hi.rle<-rle(walevel>5.79)
>    lo.rle<-rle(walevel<5.36)
>    plot(walevel)
>    abline(h=5.8,col=2,lty=3)
>    abline(h=5.35,col=3,lty=3)
>    hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
>    abline(v=hi.lo.rle)
>    You can use the $values from the rle to sort things out. Probably
>    want to ignore the end point at length(walevel).
>    --
>    Clint Bowman INTERNET: cl...@ecy.wa.gov
>    Air Quality Modeler INTERNET: cl...@math.utah.edu
>    Department of Ecology VOICE: (360) 407-6815
>    PO Box 47600 FAX: (360) 407-7534
>    Olympia, WA 98504-7600
>    On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:
>    > dear Tonja,
>    > By plotting your data
>    >
>    > plot(df)
>    >
>    > it seems to me that you are looking for a piecewise 
> linear relationships.
>    If
>    >  this is the case, have a look to the package segmented. 
> You have to
>    specify
>    > or not the number and the starting values for the breakpoints
>    >
>    > library(segmented)
>    > olm<-lm(walevel~day)
>    >
>    > #specify number and starting values for the breakpoints..
>    > oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80))
>    > plot(oseg,add=TRUE,col=2)
>    > oseg$psi
>    >
>    > #or you can avoid to specify starting values for psi
>    > oseg<-segmented(olm, seg.Z=~day,
>    > psi=NA,control=seg.control(stop.if.error=FALSE,K=30))
>    >
>    > plot(oseg,add=TRUE,col=2)
>    > oseg$psi
>    >
>    >
>    > best,
>    > vito
>    >
>    >
>    > Tonja Krueger ha scritto:
>    >> Dear List
>    >>
>    >> I hope you can help me: I've got a dataframe (df) 
> within which I am
>    >> looking
>    >> for Peak Over Threshold values as well as the length of 
> the events. An
>    >> event
>    >> starts when walevel equals 5.8 and it should end when 
> walevel equals
>    >> the
>    >> lower threshold value (5.35).
>    >>
>    >> I tried "clusters (...)" from "evd package", and varied 
> r (see example)
>    >> but it
>    >> did not work for all events (again see example).
>    >>
>    >> walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 
> 5.86, 5.91,
>    >> 5.91,
>    >> 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 
> 6.11, 6.14, 6.12,
>    >> 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 
> 5.72, 5.70, 5.65,
>    >> 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 
> 5.19, 5.19, 5.13,
>    >> 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 
> 5.22, 5.32, 5.29,
>    >> 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 
> 5.66, 5.68, 5.72,
>    >> 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 
> 5.62, 5.62, 5.61,
>    >> 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 
> 5.45, 5.47, 5.50,
>    >> 5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
>    >>
>    >> day <- c(1:100)
>    >>
>    >> df <- data.frame(day,walevel)
>    >>
>    >> library(evd)
>    >> clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax 
> = T, plot = T)
>    >> clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, 
> cmax = T, plot = T)
>    >>
>    >> What have I done wrong?
>    >>
>    >> Tonja
>    >> ______________________________________________
>    >> R-help@r-project.org mailing list
>    >> [1]https://stat.ethz.ch/mailman/listinfo/r-help
>    >> PLEASE do read the posting guide
>    >> [2]http://www.R-project.org/posting-guide.html
>    >> and provide commented, minimal, self-contained, 
> reproducible code.
>    >>
>    >
>    >
> 
>    GRATIS: Movie-Flat mit über 300 Top-Videos. Für WEB.DE Nutzer
>    dauerhaft kostenlos! Jetzt freischalten unter 
> http://movieflat.web.de
> 
> References
> 
>    1. 
> file://localhost/../jump.htm?goto=https%3A%2F%2Fstat.ethz.ch%2
Fmailman%2Flistinfo%2Fr-help
>    2. 
> file://localhost/../jump.htm?goto=http%3A%2F%2Fwww.R-project.o
rg%2Fposting-guide.html
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to