Re: [R] Population abundance, change point

2010-11-24 Thread Gavin Simpson
On Wed, 2010-11-17 at 09:17 -0500, Jonathan P Daily wrote:
 Indeed I have looked into various non-standard changepoint analysis 
 methods. I figured the OP was more interested in traditional methods since 
 you have to spend less time justifying your methodology. Wavelets are one 
 potential nontraditional method, as is Significant Zero Crossings (R 
 package SiZer), which fits arbitrary-degree smoothing splines over a range 
 of bandwidth parameters and looks for changes.

...By looking to see if the derivative of the fitted curve is different
from 0 (given a suitable confidence interval on the derivative). My
problem with all of this is that these data are time series and SiZer
doesn't take this into account (AFAICS) when computing the confidence
intervals - they are certainly too narrow for examples I have run.
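
One way to check whether independence is a plausible assumption is to look at the autocorrelation of the residuals from a smooth fit. The sketch below is only illustrative: it uses the Nile series that ships with R as a stand-in for the OP's data, and both the smoothing spline and df = 10 are arbitrary assumptions, not what SiZer does internally.

```r
# Sketch: check residuals from a smooth fit for serial correlation.
# Nile ships with base R (datasets); df = 10 is an arbitrary choice.
yr   <- as.numeric(time(Nile))
flow <- as.numeric(Nile)

fit <- smooth.spline(yr, flow, df = 10)
res <- flow - predict(fit, x = yr)$y      # residuals at the observed years

# Lag-1 autocorrelation of the residuals ($acf[1] is lag 0)
r1 <- acf(res, plot = FALSE)$acf[2]

# Approximate 95% bound for the autocorrelation of white noise
bound <- qnorm(0.975) / sqrt(length(res))

cat("lag-1 autocorrelation:", round(r1, 3),
    " white-noise bound: +/-", round(bound, 3), "\n")
```

If the lag-1 value sits well outside the bound, intervals computed under independence are suspect.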

Also, if these things are using splines, aren't we already assuming that
the underlying function is smooth and not a discontinuity? So which
technique the OP chooses will depend on how they think about the type of
change taking place at the changepoint - a point I think you made
earlier Jonathan.

Don't mean to be too negative, this has been a very useful discussion
that I am coming to late after a spot of time in the field.

All the best,

G

  With large communities of 
 abundance counts, another approach that is gaining popularity is the 
 community-level Threshold Indicator Taxa Analysis (TITAN), though that is 
 not useful to the OP. 
 --
 Jonathan P. Daily
 Technician - USGS Leetown Science Center
 11649 Leetown Road
 Kearneysville WV, 25430
 (304) 724-4480
 Is the room still a room when its empty? Does the room,
  the thing itself have purpose? Or do we, what's the word... imbue it.
  - Jubal Early, Firefly
 
 Mike Marchywka marchy...@hotmail.com wrote on 11/17/2010 09:11:11 AM:
 
  If you are open to newer ideas,
  have you looked at wavelets at all? These come up on Google along with R.
  Also, with only a few points, even 20-30, you could consider exhaustively
  fitting slopes to all 2^n subsets and plowing through the histograms
  looking for anything that may be publishable or illuminating about your data.
  Fitting to your own model or null hypotheses would make interesting
  contrasts of course: populations remained the same after an atrazine spill
  or asteroid hit, etc.
  
  

Re: [R] Population abundance, change point

2010-11-24 Thread Jonathan P Daily
I agree that SiZer is not the ultimate answer to all changepoint analysis, 
but that is why there are so many changepoint detection methods used. I 
will clarify, though, that my understanding of SiZer (which may be wrong) 
was that the smoothing splines are just a vessel for finding the 
changepoints, and made no assumptions about the continuity of the 
changepoint itself.

One thing that would certainly help, especially with the confidence 
intervals about 0, is some bandwidth selection standard, though choosing 
that standard would be a difficult process to say the least.

Re: [R] Population abundance, change point

2010-11-24 Thread Mike Marchywka








 To: gavin.simp...@ucl.ac.uk
 CC: carus...@gmail.com; marchy...@hotmail.com; r-help@r-project.org; 
 r-help-boun...@r-project.org
 Subject: Re: [R] Population abundance, change point
 From: jda...@usgs.gov
 Date: Wed, 24 Nov 2010 10:59:17 -0500

 I agree that SiZer is not the ultimate answer to all changepoint analysis,
 but that is why there are so many changepoint detection methods used. I
 will clarify, though, that my understanding of SiZer (which may be wrong)
 was that the smoothing splines are just a vessel for finding the
 changepoints, and made no assumptions about the continuity of the
 changepoint itself.

 One thing that would certainly help, especially with the confidence
 intervals about 0, is some bandwidth selection standard, though choosing
 that standard would be a difficult process to say the least.

I had a rather long draft response to an earlier post on this thread,
quoting Einstein and Homer Simpson, but it seemed too long on philosophy
and was becoming a rant. But in response to this: standards are good,
so good in fact that everyone should have their own, LOL.
If you find yourself agonizing over the selection of arbitrary parameters,
like cutoffs of various types, say p < .05, it may simply be that
there is no sacred answer and the only answer is to get more data
or at least discuss the implications of various choices.
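
Discussing the implications of a choice can be as simple as repeating the analysis over a range of values and reporting how the answer moves. A rough sketch, assuming the Nile series from base R as stand-in data and smoothing-spline df as the arbitrary parameter (both are my assumptions, chosen only for illustration):

```r
# Sketch: how does the "year of steepest decline" move as an arbitrary
# smoothing parameter changes? Nile (base R) stands in for real data.
yr   <- as.numeric(time(Nile))
flow <- as.numeric(Nile)
dfs  <- c(4, 6, 8, 10, 15, 20)     # arbitrary candidate values

steepest <- sapply(dfs, function(d) {
  fit   <- smooth.spline(yr, flow, df = d)
  slope <- predict(fit, x = yr, deriv = 1)$y   # first derivative at each year
  yr[which.min(slope)]                          # year with most negative slope
})

print(data.frame(df = dfs, steepest_decline = steepest))
```

If the reported year is stable across the table, the choice matters little; if it jumps around, that is worth saying in the write-up.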

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Population abundance, change point

2010-11-24 Thread Gavin Simpson
On Wed, 2010-11-24 at 10:59 -0500, Jonathan P Daily wrote:
 I agree that SiZer is not the ultimate answer to all changepoint analysis, 
 but that is why there are so many changepoint detection methods used. I 
 will clarify, though, that my understanding of SiZer (which may be wrong) 
 was that the smoothing splines are just a vessel for finding the 
 changepoints, and made no assumptions about the continuity of the 
 changepoint itself.

But by the nature of using splines, they (you) are positing a model for
what the changepoint looks like. If you model the classic Nile data with
a spline model (say using gam()) then it will suggest that there was a
smooth transition from high flow before the dam at Aswan was built to
lower flow afterwards. Other techniques that posit a discontinuity as a
changepoint would fit two, effectively flat [slope 0], linear lines
either side of the point when the dam was built.

If you didn't know a dam was built and all you fitted was a spline to
the data, that would likely influence how you interpreted the change.

 One thing that would certainly help, especially with the confidence 
 intervals about 0, is some bandwidth selection standard, though choosing 
 that standard would be a difficult process to say the least.

That won't help with the lack of independence in the residuals, which
will result in deflated CI on the derivatives, hence more wiggles seem
significant... Oh the joy.

Also, there has to be a point where with SiZer you are just looking at
patterns in noise at the small bandwidth end of things...
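
A quick way to see this is to smooth pure noise at several levels of flexibility and count how often the fitted slope changes sign. This is only a sketch: a smoothing spline with varying df stands in for SiZer's range of bandwidths, the seed and df values are arbitrary, and it counts raw sign changes rather than significant ones.

```r
# Sketch: wiggles found in pure noise as the smoother gets more flexible.
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- 1:100
y <- rnorm(100)             # white noise: no real change anywhere

count_turns <- function(df) {
  fit   <- smooth.spline(x, y, df = df)
  slope <- predict(fit, x = x, deriv = 1)$y   # fitted first derivative
  s <- sign(slope)
  s <- s[s != 0]
  sum(diff(s) != 0)                           # number of sign changes
}

turns <- sapply(c(3, 10, 40), count_turns)    # stiff, moderate, very flexible
print(turns)
```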

I'll get off my soap box about now! ;-)

Cheers,

G


Re: [R] Population abundance, change point

2010-11-24 Thread Jonathan P Daily
I understand that smoothing splines produce continuous models, however the 
end product of SiZer is not a single model that is then used in any 
predictive manner. Rather, the end product is a map of potential 
changepoint locations along a gradient. Are you suggesting that SiZer 
would not find a changepoint at the Aswan Dam? If so, I would argue 
against that conclusion. If your point is that pulling one underlying 
spline model used to create a SiZer map out and trying to draw conclusions 
from that model is ineffective, you are absolutely right. I tend to think 
of SiZer outputs more akin to tree outputs than piecewise regression 
outputs.

Re: [R] Population abundance, change point

2010-11-24 Thread Gavin Simpson
On Wed, 2010-11-24 at 13:00 -0500, Jonathan P Daily wrote:
 I understand that smoothing splines produce continuous models, however the 
 end product of SiZer is not a single model that is then used in any 
 predictive manner. Rather, the end product is a map of potential 
 changepoint locations along a gradient. Are you suggesting that SiZer 
 would not find a changepoint at the Aswan Dam?

No, not at all. What I am arguing is that SiZer will find a series of
models of diminishing bandwidth that detects change points that are
smoothly varying between states either side of the change point. In fact
SiZer is fitting a series of smoothly varying models and showing you
where the derivatives of those smoothly varying functions are
significantly different from 0.

require(SiZer)
mod <- SiZer(time(Nile), Nile, x.grid = 40) ## slowish

All those models show us that something changed around 1900, but the
nature of that change is one that is smoothly varying.

My point is that SiZer posits a particular form for the nature of the
change - smoothly varying. If all you use is SiZer then these are the
sorts of changes you will find. Did the flow at Aswan change smoothly
over a period of many years? If all you used was SiZer, then you'd
conclude it did. It might look something like:

mod2 <- smooth.spline(time(Nile), Nile, df = 10)
pred <- predict(mod2, x = seq(min(time(Nile)), max(time(Nile)),
                              length = 200))
plot(Nile)
lines(pred, col = "red", lwd = 2)

But if we used a different change point detection routine that focussed
on a discontinuity, then that is the sort of change point we would
detect. Here is a tree version, for example:

require(rpart)
mod3 <- rpart(Nile ~ time(Nile))
## 1898 change point
plot(Nile)
lines(pred, col = "red", lwd = 2)
## mean flow either side of the split
before <- mean(Nile[time(Nile) < 1898])
after <- mean(Nile[time(Nile) >= 1898])
lines(1871:1897, y = rep(before, 27), col = "blue", lwd = 2)
lines(1898:1970, y = rep(after, 73), col = "blue", lwd = 2)

So, in short, my point was you need to understand what form of change
point a given technique is looking to identify otherwise you could be
misled as to what is going on.

G


Re: [R] Population abundance, change point

2010-11-17 Thread vito.muggeo
dear Nick,
I do not know your data; however, it seems to me that the pattern of relative
abundance of salamanders should not exhibit a sudden change, but rather a
gradual change.

If this is the case, have a look at the segmented package and the references
therein. In particular, have a look at the relevant R News paper:

Segmented: an R package to fit regression models with broken-line
relationships. R News, 8, 1: 20-25

at http://cran.r-project.org/doc/Rnews/Rnews_2008-1.pdf

Hope this helps you,
best,
vito

On Tue, 16 Nov 2010 17:30:49 -0500, Nicholas M. Caruso wrote
 I am trying to understand my population abundance data and am looking into
 analyses of change point to try and determine at approximately what point
 populations begin to change (either declining or increasing).
 
 Can anyone offer suggestions on ways to go about this?
 
 I have looked into bcp and strucchange packages but am not completely
 convinced that these are appropriate for my data.
 
 Here is an example of what type of data I have:
 Year of survey (continuous variable), 1960 - 2009 (there are gaps in the
 surveys, e.g., there were no surveys from 2002-2004)
 Relative abundance of salamanders during the survey periods
 
 Thanks for your help, Nick
 
 -- 
 Nicholas M Caruso
 Graduate Student
 CLFS-Biology
 4219 Biology-Psychology Building
 University of Maryland, College Park, MD 20742-5815
 
 --
 I learned something of myself in the woods today,
 and walked out pleased for having made the acquaintance.
 


Re: [R] Population abundance, change point

2010-11-17 Thread Jonathan P Daily
There are really no set ways to determine a changepoint, since a 
changepoint depends completely on what you decide. Recursive partitioning 
will fit a best changepoint, but it will pretty much always fit one. This 
function can be found in the package rpart:

 library(rpart)
 # maxdepth = 1 allows a single split, i.e. one candidate changepoint
 fit <- rpart(count ~ year, control = list(maxdepth = 1))
 summary(fit)
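
To illustrate why a single split will "pretty much always" be found, here is a rough base-R sketch of what a depth-1 regression tree does. The function `best_split` and the simulated data are hypothetical, and the sketch ignores rpart's own safeguards (minimum bucket size, complexity parameter), so it is not a reimplementation of rpart.

```r
# Sketch of a depth-1 regression tree: try every split point on x and keep
# the one that minimises the total within-group sum of squares.
best_split <- function(x, y) {
  o <- order(x); x <- x[o]; y <- y[o]
  lo <- head(x, -1); hi <- tail(x, -1)
  cuts <- unique(((lo + hi) / 2)[lo != hi])   # midpoints between neighbours
  sse <- sapply(cuts, function(cc) {
    left <- y[x < cc]; right <- y[x >= cc]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  cuts[which.min(sse)]
}

set.seed(42)                                   # arbitrary seed
year  <- 1960:2009
noise <- rnorm(length(year))                   # no changepoint at all
step  <- c(rep(10, 25), rep(4, 25)) + rnorm(50, sd = 0.5)  # drop after 1984

best_split(year, noise)   # a split is still returned for pure noise
best_split(year, step)    # expected to fall between 1984 and 1985
```

The first call is the point: some split always minimises the error, whether or not anything changed.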

However this measure offers no level of confidence. This is where packages 
like strucchange and party come into use, as they provide measures of 
confidence. Alternatively, you could look into regression-based methods 
where the changepoint is some parameter. Piecewise regression, for 
instance, is as simple as fitting a spline of degree 1 and changepoint X:

 library(splines)
 fit <- lm(count ~ bs(year, knots = X, degree = 1))
 plot(year, count)
 lines(year, fitted(fit))

Then you can fit a regression at each year and compare. Alternatively, 
since count data is often noisy, you could easily substitute quantile 
regression for linear regression to much of the same effect (assuming 
whatever tau you decide, I used 0.8 but this is arbitrary):

 library(splines)
 library(quantreg)
 fit <- rq(count ~ bs(year, knots = X, degree = 1), tau = 0.8)
 plot(year, count)
 lines(year, fitted(fit))
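
The "fit a regression at each year and compare" step could be sketched roughly as below. The data are simulated (a hypothetical series that is flat until 1985 and then declines), and comparing fits by residual sum of squares is my assumption; other criteria would work too.

```r
library(splines)   # ships with R

set.seed(1)        # arbitrary seed
year  <- 1960:2009
# Hypothetical counts: flat until 1985, then a linear decline, plus noise
count <- 50 - 1.5 * pmax(year - 1985, 0) + rnorm(length(year), sd = 2)

# Candidate knots: interior years only, so each segment has some data
candidates <- year[6:(length(year) - 5)]

# Residual sum of squares of the piecewise-linear fit for each candidate knot
rss <- sapply(candidates, function(X) {
  fit <- lm(count ~ bs(year, knots = X, degree = 1))
  sum(resid(fit)^2)
})

best_X <- candidates[which.min(rss)]   # knot with the smallest RSS
best_X
```

Plotting `rss` against `candidates` shows how sharply the minimum is defined, which gives an informal sense of how well the changepoint is located.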
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 11/16/2010 05:30:49 PM:

 [R] Population abundance, change point
 Nicholas M. Caruso to: r-help, 11/16/2010 05:32 PM
 Sent by: r-help-boun...@r-project.org
 
 I am trying to understand my population abundance data and am looking into
 analyses of change point to try and determine, at approximately what point
 do populations begin to change (either declining or increasing).
 
 Can anyone offer suggestions on ways to go about this?
 
 I have looked into bcp and strucchange packages but am not completely
 convinced that these are appropriate for my data.
 
 Here is an example of what type of data I have
 Year of survey (continuous variable) 1960 - 2009 (there are gaps in the
 surveys, e.g., there were no surveys from 2002-2004)
 Relative abundance of salamanders during the survey periods
 
 
 Thanks for your help, Nick
 
 -- 
 Nicholas M Caruso
 Graduate Student
 CLFS-Biology
 4219 Biology-Psychology Building
 University of Maryland, College Park, MD 20742-5815
 
 
 
 
 --
 I learned something of myself in the woods today,
 and walked out pleased for having made the acquaintance.
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Population abundance, change point

2010-11-17 Thread Mike Marchywka




 To: carus...@gmail.com
 From: jda...@usgs.gov
 Date: Wed, 17 Nov 2010 08:45:01 -0500
 CC: r-help@r-project.org; r-help-boun...@r-project.org
 Subject: Re: [R] Population abundance, change point

 There are really no set ways to determine a changepoint, since a
 changepoint depends completely on what you decide. Recursive partitioning
 will fit a best changepoint, but it will pretty much always fit one. This [...]

If you are open to newer ideas, have you looked at wavelets at all? These
come up on Google along with R. Also, with only a few points, even 20-30,
you could consider exhaustively fitting slopes to all 2^n subsets and
plowing through the histograms, looking for anything that may be publishable
or illuminating about your data. Fitting to your own model or null
hypotheses would make interesting contrasts, of course: populations
remained the same after an atrazine spill or asteroid hit, etc.
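
A rough sketch of the exhaustive-subsets idea, for a deliberately small n. Everything here is an assumption for illustration: the data are simulated, the helper slope() is hypothetical, and subsets with fewer than 3 points are skipped (my choice). The number of subsets doubles with every extra point, so for 20-30 surveys random sampling of subsets may be the more practical route.

```r
# Simulated stand-in data (an assumption): 12 surveys with a mild decline
set.seed(7)
year  <- seq(1960, 2004, by = 4)
count <- 25 - 0.1 * (year - 1960) + rnorm(length(year), sd = 2)
n     <- length(year)

# Least-squares slope of y on x
slope <- function(x, y) cov(x, y) / var(x)

# Enumerate all 2^n subsets as bit masks; keep those with at least 3 points
slopes <- unlist(lapply(0:(2^n - 1), function(mask) {
  keep <- (mask %/% 2^(0:(n - 1))) %% 2 == 1   # bit j set => keep point j
  if (sum(keep) >= 3) slope(year[keep], count[keep]) else NULL
}))

length(slopes)
hist(slopes, breaks = 50, main = "Slopes over all subsets of >= 3 surveys")
```

A histogram with two clear modes would suggest that different stretches of the series have different trends; a single tight mode suggests one trend throughout.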




Re: [R] Population abundance, change point

2010-11-17 Thread Jonathan P Daily
Indeed I have looked into various non-standard changepoint analysis 
methods. I figured the OP was more interested in traditional methods since 
you have to spend less time justifying your methodology. Wavelets are one 
potential nontraditional method, as is Significant Zero Crossings (R 
package SiZer), which fits arbitrary-degree smoothing splines over a range 
of bandwidth parameters and looks for changes. With large communities of 
abundance counts, another approach that is gaining popularity is the 
community-level indicator taxa analysis (TITAN), though that is not useful 
to the OP. 
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly

Mike Marchywka marchy...@hotmail.com wrote on 11/17/2010 09:11:11 AM:

 If you are open to newer ideas, have you looked at wavelets at all? These
 come up on Google along with R. Also, with only a few points, even 20-30,
 you could consider exhaustively fitting slopes to all 2^n subsets and
 plowing through the histograms, looking for anything that may be publishable
 or illuminating about your data. Fitting to your own model or null
 hypotheses would make interesting contrasts, of course: populations
 remained the same after an atrazine spill or asteroid hit, etc.
 
 

Re: [R] Population abundance, change point

2010-11-16 Thread Michael Bedward
Hi Nick,

I've used MCMC to fit change point regressions to a variety of
ecological data and prefer this approach to strucchange and similar
because I feel I have more control over the model, i.e. I find it
easier to tailor the form of the model to biological / demographic
processes. I also find imputing missing response data more
straightforward with MCMC, at least when the missing values are some
distance from potential changepoint(s).

Most recently I've used JAGS via the rjags package to do such models.

As an example, here is a toy model that looks for a single
change-point after which there is a change in slope (time trend) in
the data:

model {
  # y[i] is population data
  # N is number of (regular) observations
  for (i in 1:N) {
    y[i] ~ dnorm(mu[i], tau)
    # step(i - cp) is 0 before the change point and 1 from it onwards
    mu[i] <- b0 + i*(b1 + b1cp*step(i - cp))
  }

  # intercept
  b0 ~ dnorm(0, 1.0e-6)

  # pre-change slope
  b1 ~ dnorm(0, 1.0e-6)

  # change in slope after the change point
  b1cp ~ dnorm(0, 1.0e-6)

  # change point location (time)
  cp ~ dunif(2, N-1)

  # residual sd (assumed equal either side of change)
  sd ~ dunif(0.01, 100)
  tau <- pow(sd, -2)
}

The step() function returns 1 when its argument is zero or positive, and 0
otherwise. As a result, the slope is b1 before the change (time = cp)
and b1 + b1cp after the change.
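
For anyone wanting to try the model from R, a minimal sketch of a driver script follows. It is an illustration under assumptions, not a tested recipe: the data are simulated from the model as described (slope b1 before cp, b1 + b1cp after), the true values, chain count and iteration counts are arbitrary, and the JAGS part only runs if rjags and the JAGS library are installed.

```r
# Simulated stand-in data (an assumption): N = 50 regular observations,
# slope 0.5 before time 30 and 0.5 - 0.2 = 0.3 from time 30 onwards
set.seed(123)
N       <- 50
i       <- 1:N
cp_true <- 30
mu      <- 10 + i * (0.5 - 0.2 * (i >= cp_true))
y       <- rnorm(N, mean = mu, sd = 1)

# The toy model as a string, so the script is self-contained
model_string <- "
model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- b0 + i*(b1 + b1cp*step(i - cp))
  }
  b0 ~ dnorm(0, 1.0e-6)
  b1 ~ dnorm(0, 1.0e-6)
  b1cp ~ dnorm(0, 1.0e-6)
  cp ~ dunif(2, N-1)
  sd ~ dunif(0.01, 100)
  tau <- pow(sd, -2)
}"

# Fit only if rjags (and the JAGS library it needs) can be loaded
if (requireNamespace("rjags", quietly = TRUE)) {
  jm <- rjags::jags.model(textConnection(model_string),
                          data = list(y = y, N = N),
                          n.chains = 3, quiet = TRUE)
  update(jm, 2000)                                   # burn-in
  post <- rjags::coda.samples(jm, c("cp", "b1", "b1cp"), n.iter = 5000)
  print(summary(post))                               # posterior summaries
}
```

Because the slope multiplies i itself, this mean has a jump at the change point as well as a change in trend; the posterior for cp is worth plotting, since it is often far from symmetric.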

You can easily modify this model to, for instance...
- assume 0 slope but different intercepts either side of the change
- allow for change in variance; or variance proportional to mean etc.
- impute missing response data

Michael


On 17 November 2010 09:30, Nicholas M. Caruso carus...@gmail.com wrote:
 I am trying to understand my population abundance data and am looking into
 analyses of change point to try and determine, at approximately what point
 do populations begin to change (either declining or increasing).

 Can anyone offer suggestions on ways to go about this?

 I have looked into bcp and strucchange packages but am not completely
 convinced that these are appropriate for my data.

 Here is an example of what type of data I have
 Year of survey (continuous variable) 1960 - 2009 (there are gaps in the
 surveys, e.g., there were no surveys from 2002-2004)
 Relative abundance of salamanders during the survey periods


 Thanks for your help, Nick
