Re: [R] Peak Over Threshold values

2010-06-02 Thread William Dunlap
 -Original Message-
 From: Tonja Krueger [mailto:tonja.krue...@web.de] 
 Sent: Monday, May 31, 2010 2:03 AM
 To: William Dunlap
 Cc: r-help@r-project.org
 Subject: RE: [R] Peak Over Threshold values
 
 Thanks a lot for your help. That's the time period I was looking for. 
 
 I've got one more question: for further analyses I need the 
 respective maximum values
 within these time periods (between the green and red lines). 
 Preferably in combination
 with the date the maximum event happened. 

Looping over the intervals found by f() (or f0()) is
simple to do and probably quick enough for you.  E.g.,
to get the position of the maximum of each interval use
something like

  positionOfMaxInEachInterval - function (x, intervals) 
  {
  with(intervals, sapply(seq_along(start), function(i) start[i] + 
  which.max(x[start[i]:stop[i]]) - 1))
  }

The maxima themselve would be gotten by
  intervals - f(x, startThreshold, stopThreshold, plot=TRUE)
  imax - postionOfMaxInEachInterval(x, intervals)
  # abline(v=imax) if you had plot=TRUE above
  maxs - x[imax]

By the way, the looping form of f() can be made much faster
than the f0 I showed earlier, in the case where there are
many intervals (e.g., f(sin(1:1e6),.1,0)), by preallocating
the output.  You may prefer to edit that version to get the
positions of the interval maxima as you compute the intervals.

f0a -
function (x = walevel, startThreshhold = 5.45, stopThreshhold = 5.3, 
plot = TRUE) 
{
stopifnot(startThreshhold  stopThreshhold)
inRun - FALSE
start - integer(length(x))
stop - integer(length(x))
nStart - 0
nStop - 0
for (i in seq_along(x)) {
if (inRun) {
if (x[i]  stopThreshhold) {
nStop - nStop + 1
stop[nStop] - i - 1L
inRun - FALSE
}
}
else {
if (x[i]  startThreshhold) {
nStart - nStart + 1
start[nStart] - i
inRun - TRUE
}
}
}
if (inRun) {
nStop - nStop + 1
stop[nStop] - length(x)
}
if (nStop  nStart) {
stop - stop[-1]
nStop - nStop - 1
}
length(stop) - nStop
length(start) - nStart
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = green)
abline(v = stop, col = red)
}
data.frame(start = start, stop = stop)
}

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 Thank you
 in advance, 
 
 Tonja
 
 
 
 -Ursprüngliche Nachricht-
 Von: William Dunlap wdun...@tibco.com
 Gesendet: 27.05.2010 22:13:21
 An: Hutchinson,David [PYR] 
 david.hutchin...@ec.gc.ca,Tonja Krueger tonja.krue...@web.de
 Betreff: RE: [R] Peak Over Threshold values
 
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
  Sent: Thursday, May 27, 2010 12:24 PM
  To: Hutchinson,David [PYR]; Tonja Krueger
  Cc: r-help@r-project.org
  Subject: Re: [R] Peak Over Threshold values
  
  Here is another version, loopless, but still
  a bit clumsy (can the call to which be removed?):
 
 Note that avoiding loops is mostly for
 the aesthetic effect.  On my aging laptop
 the following loopy and vector-growing version
 takes 3 seconds on a million-long vector
 while the non-loopy one takes 0.22 seconds
 (timings done with plot=FALSE).  That is
 a nice ratio but not much of a difference.
 
 f0 - function (x = walevel,
 startThreshhold = 5.45,
 stopThreshhold = 5.3, 
 plot = TRUE) 
 {
 stopifnot(startThreshhold  stopThreshhold)
 inRun - FALSE
 start - integer()
 stop - integer()
 for (i in seq_along(x)) {
 if (inRun) {
 if (x[i]  stopThreshhold) {
 stop[length(stop) + 1] - i - 1
 inRun - FALSE
 }
 }
 else {
 if (x[i]  startThreshhold) {
 start[length(start) + 1] - i
 inRun - TRUE
 }
 }
 }
 if (inRun) 
 stop[length(stop) + 1] - length(x)
 if (length(stop)  length(start)) 
 stop - stop[-1]
 if (plot) {
 plot(x, cex = 0.5)
 abline(h = c(startThreshhold, stopThreshhold))
 abline(v = start, col = green)
 abline(v = stop, col = red)
 }
 data.frame(start = start, stop = stop)
 }
 
  
  f - function (x = walevel,
 startThreshhold = 5.45,
 stopThreshhold = 5.3, 
 plot = TRUE) 
  {
  stopifnot(startThreshhold  stopThreshhold)
  isFirstInRun - function(x) c(TRUE, x[-1] != x[-length(x)])
  isLastInRun - function(x) c(x[-1] != x[-length(x)], TRUE)
  isOverStart - x = startThreshhold
  isOverStop - x = stopThreshhold
  possibleStartPt - which(isFirstInRun(isOverStart)  
 isOverStart)
  possibleStopPt - which(isLastInRun(isOverStop

Re: [R] Peak Over Threshold values

2010-05-31 Thread Tonja Krueger
Thanks a lot for your help. That’s the time period I was looking for. 

I’ve got one more question: for further analyses I need the respective maximum 
values
within these time periods (between the green and red lines). Preferably in 
combination
with the date the maximum event happened. 

Thank you
in advance, 

Tonja



-Ursprüngliche Nachricht-
Von: William Dunlap wdun...@tibco.com
Gesendet: 27.05.2010 22:13:21
An: Hutchinson,David [PYR] david.hutchin...@ec.gc.ca,Tonja Krueger 
tonja.krue...@web.de
Betreff: RE: [R] Peak Over Threshold values

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
 Sent: Thursday, May 27, 2010 12:24 PM
 To: Hutchinson,David [PYR]; Tonja Krueger
 Cc: r-help@r-project.org
 Subject: Re: [R] Peak Over Threshold values
 
 Here is another version, loopless, but still
 a bit clumsy (can the call to which be removed?):

Note that avoiding loops is mostly for
the aesthetic effect.  On my aging laptop
the following loopy and vector-growing version
takes 3 seconds on a million-long vector
while the non-loopy one takes 0.22 seconds
(timings done with plot=FALSE).  That is
a nice ratio but not much of a difference.

f0 - function (x = walevel,
startThreshhold = 5.45,
stopThreshhold = 5.3, 
plot = TRUE) 
{
stopifnot(startThreshhold  stopThreshhold)
inRun - FALSE
start - integer()
stop - integer()
for (i in seq_along(x)) {
if (inRun) {
if (x[i]  stopThreshhold) {
stop[length(stop) + 1] - i - 1
inRun - FALSE
}
}
else {
if (x[i]  startThreshhold) {
start[length(start) + 1] - i
inRun - TRUE
}
}
}
if (inRun) 
stop[length(stop) + 1] - length(x)
if (length(stop)  length(start)) 
stop - stop[-1]
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = green)
abline(v = stop, col = red)
}
data.frame(start = start, stop = stop)
}

 
 f - function (x = walevel,
startThreshhold = 5.45,
stopThreshhold = 5.3, 
plot = TRUE) 
 {
 stopifnot(startThreshhold  stopThreshhold)
 isFirstInRun - function(x) c(TRUE, x[-1] != x[-length(x)])
 isLastInRun - function(x) c(x[-1] != x[-length(x)], TRUE)
 isOverStart - x = startThreshhold
 isOverStop - x = stopThreshhold
 possibleStartPt - which(isFirstInRun(isOverStart)  isOverStart)
 possibleStopPt - which(isLastInRun(isOverStop)  isOverStop)
 pts - c(possibleStartPt, possibleStopPt)
 names(pts) - rep(c(start, stop),
 c(length(possibleStartPt), length(possibleStopPt)))
 pts - pts[order(pts)]
 tmp - isFirstInRun(names(pts))
 start - pts[tmp  names(pts) == start]
 stop - pts[tmp  names(pts) == stop]
 if (length(stop)  length(start)) { 
 # i.e., when first downcrossing happens before
 # first upcrossing
 stop - stop[-1]
 }
 if (plot) {
 plot(x, cex = 0.5)
 abline(h = c(startThreshhold, stopThreshhold))
 abline(v = start, col = green)
 abline(v = stop, col = red)
 }
 data.frame(start = start, stop = stop)
 }
 
 # define the data in the original message and call f().
 
 The isFirstInRun and isLastInRun functions do the
 first part of the calculation that rle does and
 avoid the cumsum(diff()) roundtrip that
 cumsum(rle()$lengths) involves and their names
 make it clearer what I'm trying to do.
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com  
 
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of 
  Hutchinson,David [PYR]
  Sent: Thursday, May 27, 2010 10:41 AM
  To: Tonja Krueger
  Cc: r-help@r-project.org
  Subject: Re: [R] Peak Over Threshold values
  
  Perhaps not elegant, but does the job
  
  threshold - 5.45
  meetThreshold - walevel = threshold
  
  d - rle(meetThreshold)
  startPos - c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
  
  abv - which(d$values == TRUE)
  p.o.t - data.frame()
  
  for (i in seq( along = abv ) ) {
a - startPos[abv[i]]
b - a + (d$lengths[abv[i]] - 1)
e - which.max(walevel[a:b])
p.o.t - rbind( p.o.t, data.frame(
 pos = a + e - 1,
 val = walevel[a:b][e]
 ) )
  }
  
  plot (
day,
walevel, type = 'l'
  )
  
  points(
p.o.t$pos,
p.o.t$val,
col = 'red'
  )
  
  abline(h = threshold, lty = 2, col = 'red')
  
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger
  Sent: Thursday, May 27, 2010 1:47 AM
  To: Vito Muggeo (UniPa); Clint Bowman
  Cc: r-help@r-project.org
  Subject: [R] Peak Over Threshold values
  
  
 I'm  sorry, but that's not exactly

[R] Peak Over Threshold values

2010-05-27 Thread Tonja Krueger

   I’m  sorry, but that‘s not exactly what I was looking for. I obviously
   didn’t explain properly:

   Within  my dataframe (df) I would like to find POT values that are not
   linked. In my definition two maximum values are linked if walevel does not
   fall below a certain value (the lower threshold value).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   For example in my dataframe  a threshold value = 5.45 and an lower threshold
   value = 5.3 should lead to two maximum values because between the 2^nd and
   3^rd peak walevel does not fall below the lower threshold value.

   With  “clusters (…)” from “evd package”, I could find out POT values but
   even though I set a lower threshold value for my example dataframe I would
   get three maximum values instead of two.

   library(evd)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)

   Changing r to 30 (for example) connects the 2^nd and 3^rd maximum events and
   gives out the right ‘end’ for the first extreme event but not for the second
   event. (Also I’d rather not use a time interval to prove that the events are
   linked)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)

   What went wrong???

   Tonja
   -Ursprüngliche Nachricht-
   How about?
   hi.rle-rle(walevel5.79)
   lo.rle-rle(walevel5.36)
   plot(walevel)
   abline(h=5.8,col=2,lty=3)
   abline(h=5.35,col=3,lty=3)
   hi.lo.rle-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
   abline(v=hi.lo.rle)
   You can use the $values from the rle to sort things out. Probably
   want to ignore the end point at length(walevel).
   --
   Clint Bowman INTERNET: cl...@ecy.wa.gov
   Air Quality Modeler INTERNET: cl...@math.utah.edu
   Department of Ecology VOICE: (360) 407-6815
   PO Box 47600 FAX: (360) 407-7534
   Olympia, WA 98504-7600
   On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:
dear Tonja,
By plotting your data
   
plot(df)
   
it seems to me that you are looking for a piecewise linear relationships.
   If
 this is the case, have a look to the package segmented. You have to
   specify
or not the number and the starting values for the breakpoints
   
library(segmented)
olm-lm(walevel~day)
   
#specify number and starting values for the breakpoints..
oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi
   
#or you can avoid to specify starting values for psi
oseg-segmented(olm, seg.Z=~day,
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))
   
plot(oseg,add=TRUE,col=2)
oseg$psi
   
   
best,
vito
   
   
Tonja Krueger ha scritto:
Dear List
   
I hope you can help me: I’ve got a dataframe (df) within which I am
looking
for Peak Over Threshold values as well as the length of the events. An
event
starts when walevel equals 5.8 and it should end when walevel equals
the
lower threshold value (5.35).
   
I tried “clusters (…)” from “evd package”, and varied r (see example)
but it
did not work for all events (again see example).
   
walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91,
5.91,
5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
   
day - c(1:100)
   
df - data.frame(day,walevel)
   
library(evd)
clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)
   
What have I done wrong?
   
Tonja
__
R-help@r-project.org mailing list
[1]https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

Re: [R] Peak Over Threshold values

2010-05-27 Thread Hutchinson,David [PYR]
Perhaps not elegant, but does the job

threshold - 5.45
meetThreshold - walevel = threshold

d - rle(meetThreshold)
startPos - c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))

abv - which(d$values == TRUE)
p.o.t - data.frame()

for (i in seq( along = abv ) ) {
  a - startPos[abv[i]]
  b - a + (d$lengths[abv[i]] - 1)
  e - which.max(walevel[a:b])
  p.o.t - rbind( p.o.t, data.frame(
   pos = a + e - 1,
   val = walevel[a:b][e]
   ) )
}

plot (
  day,
  walevel, type = 'l'
)

points(
  p.o.t$pos,
  p.o.t$val,
  col = 'red'
)

abline(h = threshold, lty = 2, col = 'red')

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tonja Krueger
Sent: Thursday, May 27, 2010 1:47 AM
To: Vito Muggeo (UniPa); Clint Bowman
Cc: r-help@r-project.org
Subject: [R] Peak Over Threshold values


   I'm  sorry, but that's not exactly what I was looking for. I obviously
   didn't explain properly:

   Within  my dataframe (df) I would like to find POT values that are not
   linked. In my definition two maximum values are linked if walevel does not
   fall below a certain value (the lower threshold value).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   For example in my dataframe  a threshold value = 5.45 and an lower threshold
   value = 5.3 should lead to two maximum values because between the 2^nd and
   3^rd peak walevel does not fall below the lower threshold value.

   With  clusters (...) from evd package, I could find out POT values but
   even though I set a lower threshold value for my example dataframe I would
   get three maximum values instead of two.

   library(evd)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)

   Changing r to 30 (for example) connects the 2^nd and 3^rd maximum events and
   gives out the right 'end' for the first extreme event but not for the second
   event. (Also I'd rather not use a time interval to prove that the events are
   linked)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)

   clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)

   What went wrong???

   Tonja
   -Ursprüngliche Nachricht-
   How about?
   hi.rle-rle(walevel5.79)
   lo.rle-rle(walevel5.36)
   plot(walevel)
   abline(h=5.8,col=2,lty=3)
   abline(h=5.35,col=3,lty=3)
   hi.lo.rle-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
   abline(v=hi.lo.rle)
   You can use the $values from the rle to sort things out. Probably
   want to ignore the end point at length(walevel).
   --
   Clint Bowman INTERNET: cl...@ecy.wa.gov
   Air Quality Modeler INTERNET: cl...@math.utah.edu
   Department of Ecology VOICE: (360) 407-6815
   PO Box 47600 FAX: (360) 407-7534
   Olympia, WA 98504-7600
   On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:
dear Tonja,
By plotting your data
   
plot(df)
   
it seems to me that you are looking for a piecewise linear relationships.
   If
 this is the case, have a look to the package segmented. You have to
   specify
or not the number and the starting values for the breakpoints
   
library(segmented)
olm-lm(walevel~day)
   
#specify number and starting values for the breakpoints..
oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi
   
#or you can avoid to specify starting values for psi
oseg-segmented(olm, seg.Z=~day,
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))
   
plot(oseg,add=TRUE,col=2)
oseg$psi
   
   
best,
vito
   
   
Tonja Krueger ha scritto:
Dear List
   
I hope you can help me: I've got a dataframe (df) within which I am
looking
for Peak Over Threshold values as well as the length of the events. An
event
starts when walevel equals 5.8 and it should end when walevel equals
the
lower threshold value (5.35).
   
I tried clusters (...) from evd package, and varied r (see example)
but it
did not work for all events (again see example).
   
walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91,
5.91,
5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
5.59, 5.51, 5.43, 5.33, 5.30, 5.25

Re: [R] Peak Over Threshold values

2010-05-27 Thread William Dunlap
Here is another version, loopless, but still
a bit clumsy (can the call to which be removed?):

f - function (x = walevel,
   startThreshhold = 5.45,
   stopThreshhold = 5.3, 
   plot = TRUE) 
{
stopifnot(startThreshhold  stopThreshhold)
isFirstInRun - function(x) c(TRUE, x[-1] != x[-length(x)])
isLastInRun - function(x) c(x[-1] != x[-length(x)], TRUE)
isOverStart - x = startThreshhold
isOverStop - x = stopThreshhold
possibleStartPt - which(isFirstInRun(isOverStart)  isOverStart)
possibleStopPt - which(isLastInRun(isOverStop)  isOverStop)
pts - c(possibleStartPt, possibleStopPt)
names(pts) - rep(c(start, stop),
c(length(possibleStartPt), length(possibleStopPt)))
pts - pts[order(pts)]
tmp - isFirstInRun(names(pts))
start - pts[tmp  names(pts) == start]
stop - pts[tmp  names(pts) == stop]
if (length(stop)  length(start)) { 
# i.e., when first downcrossing happens before
# first upcrossing
stop - stop[-1]
}
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = green)
abline(v = stop, col = red)
}
data.frame(start = start, stop = stop)
}

# define the data in the original message and call f().

The isFirstInRun and isLastInRun functions do the
first part of the calculation that rle does and
avoid the cumsum(diff()) roundtrip that
cumsum(rle()$lengths) involves and their names
make it clearer what I'm trying to do.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of 
 Hutchinson,David [PYR]
 Sent: Thursday, May 27, 2010 10:41 AM
 To: Tonja Krueger
 Cc: r-help@r-project.org
 Subject: Re: [R] Peak Over Threshold values
 
 Perhaps not elegant, but does the job
 
 threshold - 5.45
 meetThreshold - walevel = threshold
 
 d - rle(meetThreshold)
 startPos - c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
 
 abv - which(d$values == TRUE)
 p.o.t - data.frame()
 
 for (i in seq( along = abv ) ) {
   a - startPos[abv[i]]
   b - a + (d$lengths[abv[i]] - 1)
   e - which.max(walevel[a:b])
   p.o.t - rbind( p.o.t, data.frame(
pos = a + e - 1,
val = walevel[a:b][e]
) )
 }
 
 plot (
   day,
   walevel, type = 'l'
 )
 
 points(
   p.o.t$pos,
   p.o.t$val,
   col = 'red'
 )
 
 abline(h = threshold, lty = 2, col = 'red')
 
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger
 Sent: Thursday, May 27, 2010 1:47 AM
 To: Vito Muggeo (UniPa); Clint Bowman
 Cc: r-help@r-project.org
 Subject: [R] Peak Over Threshold values
 
 
I'm  sorry, but that's not exactly what I was looking for. 
 I obviously
didn't explain properly:
 
Within  my dataframe (df) I would like to find POT values 
 that are not
linked. In my definition two maximum values are linked if 
 walevel does not
fall below a certain value (the lower threshold value).
 
walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 
 5.86, 5.91, 5.91,
5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 
 6.11, 6.14, 6.12,
6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 
 5.72, 5.70, 5.65,
5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 
 5.19, 5.19, 5.13,
5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 
 5.22, 5.32, 5.29,
5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 
 5.66, 5.68, 5.72,
5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 
 5.62, 5.62, 5.61,
5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 
 5.45, 5.47, 5.50,
5.50, 5.49, 5.43, 5.39, 5.33, 5.26)
 
day - c(1:100)
 
df - data.frame(day,walevel)
 
For example in my dataframe  a threshold value = 5.45 and 
 an lower threshold
value = 5.3 should lead to two maximum values because 
 between the 2^nd and
3^rd peak walevel does not fall below the lower threshold value.
 
With  clusters (...) from evd package, I could find 
 out POT values but
even though I set a lower threshold value for my example 
 dataframe I would
get three maximum values instead of two.
 
library(evd)
 
clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T)
 
clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T)
 
Changing r to 30 (for example) connects the 2^nd and 3^rd 
 maximum events and
gives out the right 'end' for the first extreme event but 
 not for the second
event. (Also I'd rather not use a time interval to prove 
 that the events are
linked)
 
clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T)
 
clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T)
 
What went wrong???
 
Tonja
-Ursprüngliche Nachricht-
How about?
hi.rle-rle(walevel5.79)
lo.rle-rle(walevel5.36)
plot(walevel)
abline(h

Re: [R] Peak Over Threshold values

2010-05-27 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
 Sent: Thursday, May 27, 2010 12:24 PM
 To: Hutchinson,David [PYR]; Tonja Krueger
 Cc: r-help@r-project.org
 Subject: Re: [R] Peak Over Threshold values
 
 Here is another version, loopless, but still
 a bit clumsy (can the call to which be removed?):

Note that avoiding loops is mostly for
the aesthetic effect.  On my aging laptop
the following loopy and vector-growing version
takes 3 seconds on a million-long vector
while the non-loopy one takes 0.22 seconds
(timings done with plot=FALSE).  That is
a nice ratio but not much of a difference.

f0 - function (x = walevel,
startThreshhold = 5.45,
stopThreshhold = 5.3, 
plot = TRUE) 
{
stopifnot(startThreshhold  stopThreshhold)
inRun - FALSE
start - integer()
stop - integer()
for (i in seq_along(x)) {
if (inRun) {
if (x[i]  stopThreshhold) {
stop[length(stop) + 1] - i - 1
inRun - FALSE
}
}
else {
if (x[i]  startThreshhold) {
start[length(start) + 1] - i
inRun - TRUE
}
}
}
if (inRun) 
stop[length(stop) + 1] - length(x)
if (length(stop)  length(start)) 
stop - stop[-1]
if (plot) {
plot(x, cex = 0.5)
abline(h = c(startThreshhold, stopThreshhold))
abline(v = start, col = green)
abline(v = stop, col = red)
}
data.frame(start = start, stop = stop)
}

 
 f - function (x = walevel,
startThreshhold = 5.45,
stopThreshhold = 5.3, 
plot = TRUE) 
 {
 stopifnot(startThreshhold  stopThreshhold)
 isFirstInRun - function(x) c(TRUE, x[-1] != x[-length(x)])
 isLastInRun - function(x) c(x[-1] != x[-length(x)], TRUE)
 isOverStart - x = startThreshhold
 isOverStop - x = stopThreshhold
 possibleStartPt - which(isFirstInRun(isOverStart)  isOverStart)
 possibleStopPt - which(isLastInRun(isOverStop)  isOverStop)
 pts - c(possibleStartPt, possibleStopPt)
 names(pts) - rep(c(start, stop),
 c(length(possibleStartPt), length(possibleStopPt)))
 pts - pts[order(pts)]
 tmp - isFirstInRun(names(pts))
 start - pts[tmp  names(pts) == start]
 stop - pts[tmp  names(pts) == stop]
 if (length(stop)  length(start)) { 
 # i.e., when first downcrossing happens before
 # first upcrossing
 stop - stop[-1]
 }
 if (plot) {
 plot(x, cex = 0.5)
 abline(h = c(startThreshhold, stopThreshhold))
 abline(v = start, col = green)
 abline(v = stop, col = red)
 }
 data.frame(start = start, stop = stop)
 }
 
 # define the data in the original message and call f().
 
 The isFirstInRun and isLastInRun functions do the
 first part of the calculation that rle does and
 avoid the cumsum(diff()) roundtrip that
 cumsum(rle()$lengths) involves and their names
 make it clearer what I'm trying to do.
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com  
 
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of 
  Hutchinson,David [PYR]
  Sent: Thursday, May 27, 2010 10:41 AM
  To: Tonja Krueger
  Cc: r-help@r-project.org
  Subject: Re: [R] Peak Over Threshold values
  
  Perhaps not elegant, but does the job
  
  threshold - 5.45
  meetThreshold - walevel = threshold
  
  d - rle(meetThreshold)
  startPos - c(1, 1 + cumsum(d$lengths[-length(d$lengths)]))
  
  abv - which(d$values == TRUE)
  p.o.t - data.frame()
  
  for (i in seq( along = abv ) ) {
a - startPos[abv[i]]
b - a + (d$lengths[abv[i]] - 1)
e - which.max(walevel[a:b])
p.o.t - rbind( p.o.t, data.frame(
 pos = a + e - 1,
 val = walevel[a:b][e]
 ) )
  }
  
  plot (
day,
walevel, type = 'l'
  )
  
  points(
p.o.t$pos,
p.o.t$val,
col = 'red'
  )
  
  abline(h = threshold, lty = 2, col = 'red')
  
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger
  Sent: Thursday, May 27, 2010 1:47 AM
  To: Vito Muggeo (UniPa); Clint Bowman
  Cc: r-help@r-project.org
  Subject: [R] Peak Over Threshold values
  
  
 I'm  sorry, but that's not exactly what I was looking for. 
  I obviously
 didn't explain properly:
  
 Within  my dataframe (df) I would like to find POT values 
  that are not
 linked. In my definition two maximum values are linked if 
  walevel does not
 fall below a certain value (the lower threshold value).
  
 walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 
  5.86, 5.91, 5.91,
 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 
  6.11, 6.14, 6.12,
 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 
  5.72, 5.70, 5.65,
 5.59, 5.51, 5.43

[R] Peak Over Threshold values

2010-05-26 Thread Tonja Krueger

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Peak Over Threshold values

2010-05-26 Thread Vito Muggeo (UniPa)

dear Tonja,
By plotting your data

plot(df)

it seems to me that you are looking for a piecewise linear 
relationships. If this is the case, have a look to the package 
segmented. You have to specify or not the number and the starting values 
for the breakpoints


library(segmented)
olm-lm(walevel~day)

#specify number and starting values for the breakpoints..
oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi

#or you can avoid to specify starting values for psi
oseg-segmented(olm, seg.Z=~day, 
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))


plot(oseg,add=TRUE,col=2)
oseg$psi


best,
vito


Tonja Krueger ha scritto:

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Peak Over Threshold values

2010-05-26 Thread Clint Bowman

How about?

hi.rle-rle(walevel5.79)
lo.rle-rle(walevel5.36)
plot(walevel)
abline(h=5.8,col=2,lty=3)
abline(h=5.35,col=3,lty=3)
hi.lo.rle-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths)))
abline(v=hi.lo.rle)

You can use the $values from the rle to sort things out.  Probably 
want to ignore the end point at length(walevel).


--
Clint BowmanINTERNET:   cl...@ecy.wa.gov
Air Quality Modeler INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:  (360) 407-6815
PO Box 47600FAX:(360) 407-7534
Olympia, WA 98504-7600

On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote:


dear Tonja,
By plotting your data

plot(df)

it seems to me that you are looking for a piecewise linear relationships. If 
this is the case, have a look to the package segmented. You have to specify 
or not the number and the starting values for the breakpoints


library(segmented)
olm-lm(walevel~day)

#specify number and starting values for the breakpoints..
oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi

#or you can avoid to specify starting values for psi
oseg-segmented(olm, seg.Z=~day, 
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))


plot(oseg,add=TRUE,col=2)
oseg$psi


best,
vito


Tonja Krueger ha scritto:

Dear List

I hope you can help me: I’ve got a dataframe (df) within which I am
looking
for Peak Over Threshold values as well as the length of the events. An
event
starts when walevel equals 5.8 and it should end when walevel equals
the
lower threshold value (5.35).

I tried “clusters (…)” from “evd package”, and varied r (see example)
but it
did not work for all events (again see example).

walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91,
5.91,
5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

day - c(1:100)

df - data.frame(day,walevel)

library(evd)
clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

What have I done wrong?

Tonja
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Peak Over Threshold values/ length of the events

2010-05-25 Thread Tonja Krueger

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.