[R] Peak Over Threshold values

2010-05-26 Thread Tonja Krueger

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day <- c(1:100)

   df <- data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Peak Over Threshold values

2010-05-27 Thread Tonja Krueger

   I’m  sorry, but that‘s not exactly what I was looking for. I obviously
   didn’t explain properly:

   Within  my dataframe (df) I would like to find POT values that are not
   linked. In my definition two maximum values are linked if walevel does not
   fall below a certain value (the lower threshold value).

   walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day <- c(1:100)

   df <- data.frame(day,walevel)

   For example in my dataframe  a threshold value = 5.45 and an lower threshold
   value = 5.3 should lead to two maximum values because between the 2^nd and
   3^rd peak walevel does not fall below the lower threshold value.

   With  “clusters (…)” from “evd package”, I could find out POT values but
   even though I set a lower threshold value for my example dataframe I would
   get three maximum values instead of two.

   library(evd)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)

   Changing r to 30 (for example) connects the 2^nd and 3^rd maximum events and
   gives out the right ‘end’ for the first extreme event but not for the second
   event. (Also I’d rather not use a time interval to prove that the events are
   linked)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)

   What went wrong???

   Tonja
   -Ursprüngliche Nachricht-
   How about?
   hi.rle<-rle(walevel>5.79)
   lo.rle<-rle(walevel<5.36)
   plot(walevel)
   abline(h=5.8,col=2,lty=3)
   abline(h=5.35,col=3,lty=3)
   hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
   abline(v=hi.lo.rle)
   You can use the $values from the rle to sort things out. Probably
   want to ignore the end point at length(walevel).
   --
   Clint Bowman INTERNET: cl...@ecy.wa.gov
   Air Quality Modeler INTERNET: cl...@math.utah.edu
   Department of Ecology VOICE: (360) 407-6815
   PO Box 47600 FAX: (360) 407-7534
   Olympia, WA 98504-7600
   On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:
   > dear Tonja,
   > By plotting your data
   >
   > plot(df)
   >
   > it seems to me that you are looking for a piecewise linear relationships.
   If
   >  this is the case, have a look to the package segmented. You have to
   specify
   > or not the number and the starting values for the breakpoints
   >
   > library(segmented)
   > olm<-lm(walevel~day)
   >
   > #specify number and starting values for the breakpoints..
   > oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80))
   > plot(oseg,add=TRUE,col=2)
   > oseg$psi
   >
   > #or you can avoid to specify starting values for psi
   > oseg<-segmented(olm, seg.Z=~day,
   > psi=NA,control=seg.control(stop.if.error=FALSE,K=30))
   >
   > plot(oseg,add=TRUE,col=2)
   > oseg$psi
   >
   >
   > best,
   > vito
   >
   >
   > Tonja Krueger ha scritto:
   >> Dear List
   >>
   >> I hope you can help me: I’ve got a dataframe (df) within which I am
   >> looking
   >> for Peak Over Threshold values as well as the length of the events. An
   >> event
   >> starts when walevel equals 5.8 and it should end when walevel equals
   >> the
   >> lower threshold value (5.35).
   >>
   >> I tried “clusters (…)” from “evd package”, and varied r (see example)
   >> but it
   >> did not work for all events (again see example).
   >>
   >> walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91,
   >> 5.91,
   >> 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   >> 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   >> 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   >> 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   >> 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   >> 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   >> 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   >> 5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
   >>
   >> day <- c(1:100)
   >>
   >> df <- data.frame(day,walevel)
   >>
   >> library(evd)
   >> clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   >> clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)
   >>
   >> What have I done wrong?
   >>
   >> Tonja
   >> __
   >> R-help@r-project.org mailing list
   >> [

Re: [R] Peak Over Threshold values

2010-05-26 Thread Vito Muggeo (UniPa)

dear Tonja,
By plotting your data

plot(df)

it seems to me that you are looking for a piecewise linear 
relationships. If this is the case, have a look to the package 
segmented. You have to specify or not the number and the starting values 
for the breakpoints


library(segmented)
olm<-lm(walevel~day)

#specify number and starting values for the breakpoints..
oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi

#or you can avoid to specify starting values for psi
oseg<-segmented(olm, seg.Z=~day, 
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))


plot(oseg,add=TRUE,col=2)
oseg$psi


best,
vito


Tonja Krueger ha scritto:

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day <- c(1:100)

   df <- data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Peak Over Threshold values

2010-05-26 Thread Clint Bowman

How about?

hi.rle<-rle(walevel>5.79)
lo.rle<-rle(walevel<5.36)
plot(walevel)
abline(h=5.8,col=2,lty=3)
abline(h=5.35,col=3,lty=3)
hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
abline(v=hi.lo.rle)

You can use the $values from the rle to sort things out.  Probably 
want to ignore the end point at length(walevel).


--
Clint BowmanINTERNET:   cl...@ecy.wa.gov
Air Quality Modeler INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:  (360) 407-6815
PO Box 47600FAX:(360) 407-7534
Olympia, WA 98504-7600

On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:


dear Tonja,
By plotting your data

plot(df)

it seems to me that you are looking for a piecewise linear relationships. If 
this is the case, have a look to the package segmented. You have to specify 
or not the number and the starting values for the breakpoints


library(segmented)
olm<-lm(walevel~day)

#specify number and starting values for the breakpoints..
oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi

#or you can avoid to specify starting values for psi
oseg<-segmented(olm, seg.Z=~day, 
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))


plot(oseg,add=TRUE,col=2)
oseg$psi


best,
vito


Tonja Krueger ha scritto:

Dear List

I hope you can help me: I’ve got a dataframe (df) within which I am
looking
for Peak Over Threshold values as well as the length of the events. An
event
starts when walevel equals 5.8 and it should end when walevel equals
the
lower threshold value (5.35).

I tried “clusters (…)” from “evd package”, and varied r (see example)
but it
did not work for all events (again see example).

walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91,
5.91,
5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

day <- c(1:100)

df <- data.frame(day,walevel)

library(evd)
clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

What have I done wrong?

Tonja
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Peak Over Threshold values

2010-05-27 Thread Hutchinson,David [PYR]
Perhaps not elegant, but does the job

threshold <- 5.45
meetThreshold <- walevel >= threshold

d <- rle(meetThreshold)
startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))

abv <- which(d$values == TRUE)
p.o.t <- data.frame()

for (i in seq( along = abv ) ) {
  a <- startPos[abv[i]]
  b <- a + (d$lengths[abv[i]] - 1)
  e <- which.max(walevel[a:b])
  p.o.t <- rbind( p.o.t, data.frame(
   pos = a + e - 1,
   val = walevel[a:b][e]
   ) )
}

plot (
  day,
  walevel, type = 'l'
)

points(
  p.o.t$pos,
  p.o.t$val,
  col = 'red'
)

abline(h = threshold, lty = 2, col = 'red')

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tonja Krueger
Sent: Thursday, May 27, 2010 1:47 AM
To: Vito Muggeo (UniPa); Clint Bowman
Cc: r-help@r-project.org
Subject: [R] Peak Over Threshold values


   I'm  sorry, but that's not exactly what I was looking for. I obviously
   didn't explain properly:

   Within  my dataframe (df) I would like to find POT values that are not
   linked. In my definition two maximum values are linked if walevel does not
   fall below a certain value (the lower threshold value).

   walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day <- c(1:100)

   df <- data.frame(day,walevel)

   For example in my dataframe  a threshold value = 5.45 and an lower threshold
   value = 5.3 should lead to two maximum values because between the 2^nd and
   3^rd peak walevel does not fall below the lower threshold value.

   With  "clusters (...)" from "evd package", I could find out POT values but
   even though I set a lower threshold value for my example dataframe I would
   get three maximum values instead of two.

   library(evd)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)

   Changing r to 30 (for example) connects the 2^nd and 3^rd maximum events and
   gives out the right 'end' for the first extreme event but not for the second
   event. (Also I'd rather not use a time interval to prove that the events are
   linked)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)

   What went wrong???

   Tonja
   -Ursprüngliche Nachricht-
   How about?
   hi.rle<-rle(walevel>5.79)
   lo.rle<-rle(walevel<5.36)
   plot(walevel)
   abline(h=5.8,col=2,lty=3)
   abline(h=5.35,col=3,lty=3)
   hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
   abline(v=hi.lo.rle)
   You can use the $values from the rle to sort things out. Probably
   want to ignore the end point at length(walevel).
   --
   Clint Bowman INTERNET: cl...@ecy.wa.gov
   Air Quality Modeler INTERNET: cl...@math.utah.edu
   Department of Ecology VOICE: (360) 407-6815
   PO Box 47600 FAX: (360) 407-7534
   Olympia, WA 98504-7600
   On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:
   > dear Tonja,
   > By plotting your data
   >
   > plot(df)
   >
   > it seems to me that you are looking for a piecewise linear relationships.
   If
   >  this is the case, have a look to the package segmented. You have to
   specify
   > or not the number and the starting values for the breakpoints
   >
   > library(segmented)
   > olm<-lm(walevel~day)
   >
   > #specify number and starting values for the breakpoints..
   > oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80))
   > plot(oseg,add=TRUE,col=2)
   > oseg$psi
   >
   > #or you can avoid to specify starting values for psi
   > oseg<-segmented(olm, seg.Z=~day,
   > psi=NA,control=seg.control(stop.if.error=FALSE,K=30))
   >
   > plot(oseg,add=TRUE,col=2)
   > oseg$psi
   >
   >
   > best,
   > vito
   >
   >
   > Tonja Krueger ha scritto:
   >> Dear List
   >>
   >> I hope you can help me: I've got a dataframe (df) within which I am
   >> looking
   >> for Peak Over Threshold values as well as the length of the events. An
   >> event
   >> starts when walevel equals 5.8 and it should end when walevel equals
   >> the
   >> lower threshold value (5.35).
   >>
   >> I tried "clusters (

Re: [R] Peak Over Threshold values

2010-05-27 Thread William Dunlap
Here is another version, loopless, but still
a bit clumsy (can the call to which be removed?):

f <- function (x = walevel,
   startThreshhold = 5.45,
   stopThreshhold = 5.3, 
   plot = TRUE) 
{
stopifnot(startThreshhold > stopThreshhold)
isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
isOverStart <- x >= startThreshhold
isOverStop <- x >= stopThreshhold
possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart)
possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop)
pts <- c(possibleStartPt, possibleStopPt)
names(pts) <- rep(c("start", "stop"),
c(length(possibleStartPt), length(possibleStopPt)))
pts <- pts[order(pts)]
tmp <- isFirstInRun(names(pts))
start <- pts[tmp & names(pts) == "start"]
stop <- pts[tmp & names(pts) == "stop"]
if (length(stop) > length(start)) { 
# i.e., when first downcrossing happens before
# first upcrossing
stop <- stop[-1]
}
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = "green")
abline(v = stop, col = "red")
}
data.frame(start = start, stop = stop)
}

# define the data in the original message and call f().

The isFirstInRun and isLastInRun functions do the
first part of the calculation that rle does and
avoid the cumsum(diff()) roundtrip that
cumsum(rle()$lengths) involves and their names
make it clearer what I'm trying to do.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of 
> Hutchinson,David [PYR]
> Sent: Thursday, May 27, 2010 10:41 AM
> To: Tonja Krueger
> Cc: r-help@r-project.org
> Subject: Re: [R] Peak Over Threshold values
> 
> Perhaps not elegant, but does the job
> 
> threshold <- 5.45
> meetThreshold <- walevel >= threshold
> 
> d <- rle(meetThreshold)
> startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
> 
> abv <- which(d$values == TRUE)
> p.o.t <- data.frame()
> 
> for (i in seq( along = abv ) ) {
>   a <- startPos[abv[i]]
>   b <- a + (d$lengths[abv[i]] - 1)
>   e <- which.max(walevel[a:b])
>   p.o.t <- rbind( p.o.t, data.frame(
>pos = a + e - 1,
>val = walevel[a:b][e]
>) )
> }
> 
> plot (
>   day,
>   walevel, type = 'l'
> )
> 
> points(
>   p.o.t$pos,
>   p.o.t$val,
>   col = 'red'
> )
> 
> abline(h = threshold, lty = 2, col = 'red')
> 
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger
> Sent: Thursday, May 27, 2010 1:47 AM
> To: Vito Muggeo (UniPa); Clint Bowman
> Cc: r-help@r-project.org
> Subject: [R] Peak Over Threshold values
> 
> 
>I'm  sorry, but that's not exactly what I was looking for. 
> I obviously
>didn't explain properly:
> 
>Within  my dataframe (df) I would like to find POT values 
> that are not
>linked. In my definition two maximum values are linked if 
> walevel does not
>fall below a certain value (the lower threshold value).
> 
>walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 
> 5.86, 5.91, 5.91,
>5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 
> 6.11, 6.14, 6.12,
>6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 
> 5.72, 5.70, 5.65,
>5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 
> 5.19, 5.19, 5.13,
>5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 
> 5.22, 5.32, 5.29,
>5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 
> 5.66, 5.68, 5.72,
>5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 
> 5.62, 5.62, 5.61,
>5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 
> 5.45, 5.47, 5.50,
>5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
> 
>day <- c(1:100)
> 
>df <- data.frame(day,walevel)
> 
>For example in my dataframe  a threshold value = 5.45 and 
> an lower threshold
>value = 5.3 should lead to two maximum values because 
> between the 2^nd and
>3^rd peak walevel does not fall below the lower threshold value.
> 
>With  "clusters (...)" from "evd package", I could find 
> out POT values but
>even though I set a lower threshold value for my example 
> dataframe I would
>get three maximum values instead of two.
> 
>library(evd)
>

Re: [R] Peak Over Threshold values

2010-05-27 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
> Sent: Thursday, May 27, 2010 12:24 PM
> To: Hutchinson,David [PYR]; Tonja Krueger
> Cc: r-help@r-project.org
> Subject: Re: [R] Peak Over Threshold values
> 
> Here is another version, loopless, but still
> a bit clumsy (can the call to which be removed?):

Note that avoiding loops is mostly for
the aesthetic effect.  On my aging laptop
the following loopy and vector-growing version
takes 3 seconds on a million-long vector
while the non-loopy one takes 0.22 seconds
(timings done with plot=FALSE).  That is
a nice ratio but not much of a difference.

f0 <- function (x = walevel,
startThreshhold = 5.45,
stopThreshhold = 5.3, 
plot = TRUE) 
{
stopifnot(startThreshhold > stopThreshhold)
inRun <- FALSE
start <- integer()
stop <- integer()
for (i in seq_along(x)) {
if (inRun) {
if (x[i] < stopThreshhold) {
stop[length(stop) + 1] <- i - 1
inRun <- FALSE
}
}
else {
if (x[i] > startThreshhold) {
start[length(start) + 1] <- i
inRun <- TRUE
}
}
}
if (inRun) 
stop[length(stop) + 1] <- length(x)
if (length(stop) > length(start)) 
stop <- stop[-1]
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = "green")
abline(v = stop, col = "red")
}
data.frame(start = start, stop = stop)
}

> 
> f <- function (x = walevel,
>startThreshhold = 5.45,
>stopThreshhold = 5.3, 
>plot = TRUE) 
> {
> stopifnot(startThreshhold > stopThreshhold)
> isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
> isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
> isOverStart <- x >= startThreshhold
> isOverStop <- x >= stopThreshhold
> possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart)
> possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop)
> pts <- c(possibleStartPt, possibleStopPt)
> names(pts) <- rep(c("start", "stop"),
> c(length(possibleStartPt), length(possibleStopPt)))
> pts <- pts[order(pts)]
> tmp <- isFirstInRun(names(pts))
> start <- pts[tmp & names(pts) == "start"]
> stop <- pts[tmp & names(pts) == "stop"]
> if (length(stop) > length(start)) { 
> # i.e., when first downcrossing happens before
> # first upcrossing
> stop <- stop[-1]
> }
> if (plot) {
> plot(x, cex = 0.5)
> abline(h = c(startThreshhold, stopThreshhold))
> abline(v = start, col = "green")
> abline(v = stop, col = "red")
> }
> data.frame(start = start, stop = stop)
> }
> 
> # define the data in the original message and call f().
> 
> The isFirstInRun and isLastInRun functions do the
> first part of the calculation that rle does and
> avoid the cumsum(diff()) roundtrip that
> cumsum(rle()$lengths) involves and their names
> make it clearer what I'm trying to do.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org 
> > [mailto:r-help-boun...@r-project.org] On Behalf Of 
> > Hutchinson,David [PYR]
> > Sent: Thursday, May 27, 2010 10:41 AM
> > To: Tonja Krueger
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Peak Over Threshold values
> > 
> > Perhaps not elegant, but does the job
> > 
> > threshold <- 5.45
> > meetThreshold <- walevel >= threshold
> > 
> > d <- rle(meetThreshold)
> > startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
> > 
> > abv <- which(d$values == TRUE)
> > p.o.t <- data.frame()
> > 
> > for (i in seq( along = abv ) ) {
> >   a <- startPos[abv[i]]
> >   b <- a + (d$lengths[abv[i]] - 1)
> >   e <- which.max(walevel[a:b])
> >   p.o.t <- rbind( p.o.t, data.frame(
> >        pos = a + e - 1,
> >val = walevel[a:b][e]
> >) )
> > }
> > 
> > plot (
> >   day,
> >   walevel, type = 'l'
> > )
> > 
> > points(
> >   p.o.t$pos,
> >   p.o.t$val,
> >   col = 'red'
> > )
> > 
> > abline(h = threshold, lty = 2, col = 'red')
> >

Re: [R] Peak Over Threshold values

2010-05-31 Thread Tonja Krueger
Thanks a lot for your help. That’s the time period I was looking for. 

I’ve got one more question: for further analyses I need the respective maximum 
values
within these time periods (between the green and red lines). Preferably in 
combination
with the date the maximum event happened. 

Thank you
in advance, 

Tonja



-Ursprüngliche Nachricht-
Von: William Dunlap 
Gesendet: 27.05.2010 22:13:21
An: "Hutchinson,David [PYR]" ,Tonja Krueger 

Betreff: RE: [R] Peak Over Threshold values

>> -Original Message-
>> From: r-help-boun...@r-project.org 
>> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
>> Sent: Thursday, May 27, 2010 12:24 PM
>> To: Hutchinson,David [PYR]; Tonja Krueger
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Peak Over Threshold values
>> 
>> Here is another version, loopless, but still
>> a bit clumsy (can the call to which be removed?):
>
>Note that avoiding loops is mostly for
>the aesthetic effect.  On my aging laptop
>the following loopy and vector-growing version
>takes 3 seconds on a million-long vector
>while the non-loopy one takes 0.22 seconds
>(timings done with plot=FALSE).  That is
>a nice ratio but not much of a difference.
>
>f0 <- function (x = walevel,
>startThreshhold = 5.45,
>stopThreshhold = 5.3, 
>plot = TRUE) 
>{
>stopifnot(startThreshhold > stopThreshhold)
>inRun <- FALSE
>start <- integer()
>stop <- integer()
>for (i in seq_along(x)) {
>if (inRun) {
>if (x[i] < stopThreshhold) {
>stop[length(stop) + 1] <- i - 1
>inRun <- FALSE
>}
>}
>else {
>if (x[i] > startThreshhold) {
>start[length(start) + 1] <- i
>inRun <- TRUE
>}
>}
>}
>if (inRun) 
>stop[length(stop) + 1] <- length(x)
>if (length(stop) > length(start)) 
>stop <- stop[-1]
>if (plot) {
>plot(x, cex = 0.5)
>abline(h = c(startThreshhold, stopThreshhold))
>abline(v = start, col = "green")
>abline(v = stop, col = "red")
>}
>data.frame(start = start, stop = stop)
>}
>
>> 
>> f <- function (x = walevel,
>>startThreshhold = 5.45,
>>stopThreshhold = 5.3, 
>>plot = TRUE) 
>> {
>> stopifnot(startThreshhold > stopThreshhold)
>> isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
>> isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
>> isOverStart <- x >= startThreshhold
>> isOverStop <- x >= stopThreshhold
>> possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart)
>> possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop)
>> pts <- c(possibleStartPt, possibleStopPt)
>> names(pts) <- rep(c("start", "stop"),
>> c(length(possibleStartPt), length(possibleStopPt)))
>> pts <- pts[order(pts)]
>> tmp <- isFirstInRun(names(pts))
>> start <- pts[tmp & names(pts) == "start"]
>> stop <- pts[tmp & names(pts) == "stop"]
>> if (length(stop) > length(start)) { 
>> # i.e., when first downcrossing happens before
>> # first upcrossing
>> stop <- stop[-1]
>> }
>> if (plot) {
>> plot(x, cex = 0.5)
>> abline(h = c(startThreshhold, stopThreshhold))
>> abline(v = start, col = "green")
>> abline(v = stop, col = "red")
>> }
>> data.frame(start = start, stop = stop)
>> }
>> 
>> # define the data in the original message and call f().
>> 
>> The isFirstInRun and isLastInRun functions do the
>> first part of the calculation that rle does and
>> avoid the cumsum(diff()) roundtrip that
>> cumsum(rle()$lengths) involves and their names
>> make it clearer what I'm trying to do.
>> 
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com  
>> 
>> > -Original Message-
>> > From: r-help-boun...@r-project.org 
>> > [mailto:r-help-boun...@r-project.org] On Behalf Of 
>> > Hutchinson,David [PYR]
>> > Sent: Thursday, May 27, 2010 10:41 AM
>> > To: Tonja Krueger
>> > Cc: r-help@r-project.org
>> > Subject: Re: [R] Peak Over Threshold values
>> > 
>> > Perhaps not elegant, but does the job
&g

Re: [R] Peak Over Threshold values

2010-06-02 Thread William Dunlap
> -Original Message-
> From: Tonja Krueger [mailto:tonja.krue...@web.de] 
> Sent: Monday, May 31, 2010 2:03 AM
> To: William Dunlap
> Cc: r-help@r-project.org
> Subject: RE: [R] Peak Over Threshold values
> 
> Thanks a lot for your help. That's the time period I was looking for. 
> 
> I've got one more question: for further analyses I need the 
> respective maximum values
> within these time periods (between the green and red lines). 
> Preferably in combination
> with the date the maximum event happened. 

Looping over the intervals found by f() (or f0()) is
simple to do and probably quick enough for you.  E.g.,
to get the position of the maximum of each interval use
something like

  positionOfMaxInEachInterval <- function (x, intervals) 
  {
  with(intervals, sapply(seq_along(start), function(i) start[i] + 
  which.max(x[start[i]:stop[i]]) - 1))
  }

The maxima themselve would be gotten by
  intervals <- f(x, startThreshold, stopThreshold, plot=TRUE)
  imax <- postionOfMaxInEachInterval(x, intervals)
  # abline(v=imax) if you had plot=TRUE above
  maxs <- x[imax]

By the way, the looping form of f() can be made much faster
than the f0 I showed earlier, in the case where there are
many intervals (e.g., f(sin(1:1e6),.1,0)), by preallocating
the output.  You may prefer to edit that version to get the
positions of the interval maxima as you compute the intervals.

f0a <-
function (x = walevel, startThreshhold = 5.45, stopThreshhold = 5.3, 
plot = TRUE) 
{
stopifnot(startThreshhold > stopThreshhold)
inRun <- FALSE
start <- integer(length(x))
stop <- integer(length(x))
nStart <- 0
nStop <- 0
for (i in seq_along(x)) {
if (inRun) {
if (x[i] < stopThreshhold) {
nStop <- nStop + 1
stop[nStop] <- i - 1L
inRun <- FALSE
}
}
else {
if (x[i] > startThreshhold) {
nStart <- nStart + 1
start[nStart] <- i
inRun <- TRUE
}
}
}
if (inRun) {
nStop <- nStop + 1
stop[nStop] <- length(x)
}
if (nStop > nStart) {
stop <- stop[-1]
nStop <- nStop - 1
}
length(stop) <- nStop
length(start) <- nStart
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = "green")
abline(v = stop, col = "red")
}
data.frame(start = start, stop = stop)
}

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> Thank you
> in advance, 
> 
> Tonja
> 
> 
> 
> -----Ursprüngliche Nachricht-
> Von: William Dunlap 
> Gesendet: 27.05.2010 22:13:21
> An: "Hutchinson,David [PYR]" 
> ,Tonja Krueger 
> Betreff: RE: [R] Peak Over Threshold values
> 
> >> -Original Message-
> >> From: r-help-boun...@r-project.org 
> >> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
> >> Sent: Thursday, May 27, 2010 12:24 PM
> >> To: Hutchinson,David [PYR]; Tonja Krueger
> >> Cc: r-help@r-project.org
> >> Subject: Re: [R] Peak Over Threshold values
> >> 
> >> Here is another version, loopless, but still
> >> a bit clumsy (can the call to which be removed?):
> >
> >Note that avoiding loops is mostly for
> >the aesthetic effect.  On my aging laptop
> >the following loopy and vector-growing version
> >takes 3 seconds on a million-long vector
> >while the non-loopy one takes 0.22 seconds
> >(timings done with plot=FALSE).  That is
> >a nice ratio but not much of a difference.
> >
> >f0 <- function (x = walevel,
> >startThreshhold = 5.45,
> >stopThreshhold = 5.3, 
> >plot = TRUE) 
> >{
> >stopifnot(startThreshhold > stopThreshhold)
> >inRun <- FALSE
> >start <- integer()
> >stop <- integer()
> >for (i in seq_along(x)) {
> >if (inRun) {
> >if (x[i] < stopThreshhold) {
> >stop[length(stop) + 1] <- i - 1
> >inRun <- FALSE
> >}
> >}
> >else {
> >if (x[i] > startThreshhold) {
> >start[length(start) + 1] <- i
> >inRun <- TRUE
> >}
> >}
> >}
> >if (inRun) 
> >stop[length(stop) + 1] <- length(x)
> >if (length(stop) > length(start)) 
> >stop <- stop[-1]
> >if (plot) {
> >plot(x, cex = 0.5)
> >abline(h =

[R] Peak Over Threshold values/ length of the events

2010-05-25 Thread Tonja Krueger

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day <- c(1:100)

   df <- data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.