Re: [R] breakpoints and nonlinear regression
Dear Julian,

On 18/01/2012 14:36, crimsonengineer87 wrote:
> Thanks for the comments. Yes, I also had segmented and then I went away from that. I can't remember. I've tried using it but I get some sort of strange error. Here's some code ...

It is difficult for me to help you without knowing which error you obtain. If you are referring to "maximum number of iterations", that is a warning, not an error. See the discussion in the R News paper (the one Achim suggested).

The following code is expected to work:

    pavlu.glm <- lm(Na ~ yield, data=pavludata)
    pavlu.seg <- segmented(pavlu.glm, seg.Z=~yield, psi=1000)
    with(pavludata, plot(yield, Na))
    plot(pavlu.seg, add=TRUE)

See ?segmented and ?plot.segmented for additional examples, and contact me off list if you have additional questions.

Best,
vito

--
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
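Since pavludata is not shown in the thread, the following is only a minimal sketch of the same segmented() workflow on simulated data. The data frame `sim`, its true breakpoint at yield = 1000, and the starting value psi = 800 are all assumptions made for illustration; the calls themselves (segmented(), confint(), slope(), plot() with add=TRUE) are the ones documented in ?segmented and ?plot.segmented.

```r
library(segmented)

set.seed(1)
# Hypothetical stand-in for pavludata: slope changes at yield = 1000
sim <- data.frame(yield = seq(100, 2000, length.out = 80))
sim$Na <- 5 + 0.010 * sim$yield - 0.008 * pmax(sim$yield - 1000, 0) +
  rnorm(nrow(sim), sd = 0.5)

# Fit the ordinary linear model first, then let segmented() add one breakpoint.
# psi is only a starting guess for the breakpoint location.
fit.lm  <- lm(Na ~ yield, data = sim)
fit.seg <- segmented(fit.lm, seg.Z = ~yield, psi = 800)

summary(fit.seg)   # coefficients and the estimated breakpoint
confint(fit.seg)   # confidence interval for the breakpoint
slope(fit.seg)     # slope of each segment

# Data first, then the fitted broken line on top
with(sim, plot(yield, Na))
plot(fit.seg, add = TRUE, col = 2, lwd = 2)

# Estimated breakpoint location
bp.est <- fit.seg$psi[, "Est."]
```

If the fit stops with a "maximum number of iterations" warning, trying a different starting value for psi is usually the first thing to check.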
Re: [R] breakpoints and nonlinear regression
Thanks for the comments. Yes, I also had segmented and then I went away from that. I can't remember. I've tried using it but I get some sort of strange error. Here's some code ...

    pavlu.glm <- glm(Na ~ yield, data=pavludata, family=gaussian)
    pavlu.seg <- segmented(pavlu.glm, seg.Z=~yield, psi=1000,
                           control=seg.control(display=FALSE))

    plot.series <- function() {
      plot(pavlu.seg)
      plot(pavlu.seg, add=TRUE, linkinv=TRUE, lwd=2, col=2:3, lty=c(1,3))
      lines(pavlu.seg, col=2, pch=19, bottom=FALSE, lwd=2)
    }

    jpeg("pavlu-cuttingsystem-segmented.jpg", width = 1000, height = 700, units = "px")
    plot.series()
    ## Turn off device driver (to flush output to JPG)
    dev.off()

1. I don't think I'm doing my plotting right. I'm just not sure how that works with segmented.
2. My error is something about an error in do.call(lines) and that the maximum number of iterations has been reached.

Am I missing something with glm or lm? Thanks again.

--
View this message in context: http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4306657.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] breakpoints and nonlinear regression
On Tue, 17 Jan 2012, crimsonengineer87 wrote:

> Thanks for the comments everyone. I was hoping to not have to find someone in the stats department ... well, we'll see.
>
> So in response to Z's comment ... I have tried breakpoints(Na ~ yield) and I did expect to get something continuous.

You won't. The result may be close to continuous (depending on your data) but there is no inherent continuity restriction. That's what "segmented" does (that Rolf already pointed to). See

Vito M. R. Muggeo (2008). segmented: an R Package to Fit Regression Models with Broken-Line Relationships. R News, 8/1, 20-25. URL http://CRAN.R-project.org/doc/Rnews/.

> The idea was to get two or three linear functions making up the curve. And then from there, get a CI from these lines. Of course, it wouldn't be good. (This is coming from a non-stats guy ... I'm a civil engineer by degree and am now learning to be a modeler as a grad student!).
>
> Do you know of any more examples of breakpoints? The examples in the references are great, but I can't seem to get it right. Thanks again.
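To illustrate the point about continuity, here is a small sketch of breakpoints(Na ~ yield) from strucchange on simulated data, with fitted() used to draw the piecewise fit. The data frame `sim` and its jump at yield = 1000 are assumptions, not the poster's data. One detail worth noting: breakpoints() splits the observations in the order they are given, so the sketch sorts by yield first.

```r
library(strucchange)

set.seed(2)
# Hypothetical data, sorted by yield; two regimes with a jump at yield = 1000
sim <- data.frame(yield = sort(runif(100, 100, 2000)))
sim$Na <- ifelse(sim$yield < 1000,
                 5 + 0.010 * sim$yield,
                 12 + 0.002 * sim$yield) + rnorm(100, sd = 0.5)

# Allow at most one break; h is the minimal segment size as a fraction of n
bp <- breakpoints(Na ~ yield, data = sim, h = 0.15, breaks = 1)

# Index of the last observation in the first segment, and its yield value
idx <- breakpoints(bp, breaks = 1)$breakpoints
sim$yield[idx]

# Separate least-squares line in each segment; nothing forces the two
# lines to meet at the break, so a jump is normally visible there
plot(Na ~ yield, data = sim)
lines(sim$yield, fitted(bp, breaks = 1), col = 2, lwd = 2)
```

confint(bp, breaks = 1) gives an interval for the break position; the segmented package is the one that constrains the pieces to join.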
Re: [R] breakpoints and nonlinear regression
In respect of fitting piecewise linear regressions, have you looked at the "segmented" package?

cheers,

Rolf Turner

On 18/01/12 04:30, crimsonengineer87 wrote:
> Dear Forum,
>
> I have been wracking my head over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused.
>
> 'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:
>
>     bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
>     plot(bp.pavlu)
>     breakpoints(bp.pavlu)
>
> But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have?
>
> Is there an easier way to calculate the confidence interval for a non-linear regression?
>
> I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance.
Re: [R] breakpoints and nonlinear regression
Thanks for the comments everyone. I was hoping to not have to find someone in the stats department ... well, we'll see.

So in response to Z's comment ... I have tried breakpoints(Na ~ yield) and I did expect to get something continuous. The idea was to get two or three linear functions making up the curve. And then from there, get a CI from these lines. Of course, it wouldn't be good. (This is coming from a non-stats guy ... I'm a civil engineer by degree and am now learning to be a modeler as a grad student!).

Do you know of any more examples of breakpoints? The examples in the references are great, but I can't seem to get it right. Thanks again.

--
View this message in context: http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4305000.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] breakpoints and nonlinear regression
On Tue, 17 Jan 2012, Bert Gunter wrote:

> On Tue, Jan 17, 2012 at 8:06 AM, Kenneth Frost wrote:
>> Sorry, that wasn't too helpful...I see that the intervals and se.fit argument are currently ignored.
>
> Yes, because the fitted values are nonlinear in the parameters, which makes finding exact confidence regions impossible. I think the "usual" approach (subject to correction by experts) is to use a delta method approximation for the fitted variances from the varcov matrix of the parameters at the converged optimum (itself an approximation) and then a standard t-interval based on that. However, this approximation can be quite bad, because "degrees of freedom" don't mean much for nonlinear models -- in fact, that's the essential (and huge!) difference between linear and nonlinear models -- and the likelihood surface may not be close enough to quadratic. So one may do better with, e.g. a bootstrap approximation, although this can be problematic, too, due to convergence and other issues.
>
> What I think can be said with some certainty is that the idea of approximating by a segmented regression and then using CI's for each linear part in the "usual" way is a particularly bad one -- the CI's will be underestimated because they don't take into account the uncertainty in the location of the fitted breakpoints, which are nonlinear **and** non-smooth functions of the data.
>
> So if confidence intervals for the fitted values are really important, I suggest that Julian work with his local statistician to come up with the best approach for his particular situation. It's tricky.

I fully agree with Bert that, in this case, segmented regression does not seem to be a fruitful approach and that it's best to consult a local statistician.

However, I just wanted to clarify a theoretical detail about what breakpoints() does. The breakpoint estimates converge at the faster rate of "n", while the regression coefficient estimates converge only at rate "sqrt(n)". This is why, in principle, it is possible to get "the usual" inference from segmented regressions. The price for this is to assume that the true model is in fact a segmented regression (with only breakpoints/coefficients unknown).

Hence, segmented regression will be "useful" (in the Tukey sense) if there are few relatively abrupt changes in a regression relationship. On the other hand, for approximating smooth changes there are typically better techniques available.

Best,
Z

> Cheers,
> Bert
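One technique for smooth changes that is suggested elsewhere in this thread is a GAM via gam() in package mgcv. The sketch below fits a smooth of yield to simulated data and draws an approximate pointwise band of plus or minus two standard errors from predict(..., se.fit=TRUE). The data frame `sim` and the saturating curve used to generate it are assumptions for illustration only, and the band is an approximation rather than an exact confidence region.

```r
library(mgcv)

set.seed(3)
# Hypothetical smooth, saturating relationship between yield and Na
sim <- data.frame(yield = seq(100, 2000, length.out = 100))
sim$Na <- 20 * (1 - exp(-sim$yield / 600)) + rnorm(100, sd = 0.5)

# Penalized regression spline in yield; smoothness is chosen by gam()
fit.gam <- gam(Na ~ s(yield), data = sim)

# Predictions and standard errors on a regular grid
grid <- data.frame(yield = seq(100, 2000, length.out = 200))
pr   <- predict(fit.gam, newdata = grid, se.fit = TRUE)
grid$fit   <- as.numeric(pr$fit)
grid$lower <- grid$fit - 2 * as.numeric(pr$se.fit)
grid$upper <- grid$fit + 2 * as.numeric(pr$se.fit)

# Data, fitted curve, and approximate pointwise band
plot(Na ~ yield, data = sim)
lines(fit ~ yield, data = grid, lwd = 2)
lines(lower ~ yield, data = grid, lty = 2)
lines(upper ~ yield, data = grid, lty = 2)
```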
Re: [R] breakpoints and nonlinear regression
On Tue, 17 Jan 2012, crimsonengineer87 wrote:

> Dear Forum,
>
> I have been wracking my head over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused.
>
> 'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:
>
>     bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
>     plot(bp.pavlu)
>     breakpoints(bp.pavlu)
>
> But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have?

breakpoints() currently can just handle linear (in parameters) regressions. So unless f(., a, b) is either known or can be written as a linear predictor, breakpoints() cannot estimate breaks in the model of interest.

If you want to approximate f(., a, b) by a piecewise linear function, then you would use breakpoints(Na ~ yield). The result however will typically not be continuous. To see the result, fitted() can be used. See the references in ?breakpoints for some examples.

However, I doubt that this is a route worth pursuing given your problem description...

> Is there an easier way to calculate the confidence interval for a non-linear regression?

If you want to use nls(), you could use simulation techniques to obtain confidence intervals.

Another possible alternative would be to use a GAM formulation. See e.g. gam() in package "mgcv".

hth,
Z

> I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance.
>
> Julian
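One reading of "simulation techniques" for nls() is a case-resampling bootstrap; the thread does not say which scheme was meant, so this is only a sketch under that assumption. The model function f, the data frame `sim`, and the number of replicates B are all hypothetical, since neither f(yield, a, b) nor pavludata is given in the thread.

```r
set.seed(4)
# Hypothetical model and data standing in for f(yield, a, b) and pavludata
f <- function(x, a, b) a * (1 - exp(-x / b))
sim <- data.frame(yield = seq(100, 2000, length.out = 60))
sim$Na <- f(sim$yield, 20, 600) + rnorm(60, sd = 0.5)

fit  <- nls(Na ~ f(yield, a, b), data = sim, start = list(a = 15, b = 500))
grid <- data.frame(yield = seq(100, 2000, length.out = 100))

# Refit the model on B resampled data sets and predict on the grid each time.
# Fits that fail to converge are recorded as NA rather than stopping the loop.
B <- 500
boot.fits <- replicate(B, {
  d <- sim[sample(nrow(sim), replace = TRUE), ]
  b.fit <- try(nls(Na ~ f(yield, a, b), data = d,
                   start = as.list(coef(fit))), silent = TRUE)
  if (inherits(b.fit, "try-error")) rep(NA_real_, nrow(grid))
  else predict(b.fit, newdata = grid)
})

# Pointwise 95% percentile band across the bootstrap curves
band <- apply(boot.fits, 1, quantile, probs = c(0.025, 0.975), na.rm = TRUE)

plot(Na ~ yield, data = sim)
lines(grid$yield, predict(fit, newdata = grid), lwd = 2)
lines(grid$yield, band[1, ], lty = 2)
lines(grid$yield, band[2, ], lty = 2)
```

The band is pointwise, not simultaneous, and a large share of failed refits would be a sign that the bootstrap itself is unreliable for the model at hand.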
Re: [R] breakpoints and nonlinear regression
On Tue, Jan 17, 2012 at 8:06 AM, Kenneth Frost wrote:
> Sorry, that wasn't too helpful...I see that the intervals and se.fit argument are currently ignored.

Yes, because the fitted values are nonlinear in the parameters, which makes finding exact confidence regions impossible. I think the "usual" approach (subject to correction by experts) is to use a delta method approximation for the fitted variances from the varcov matrix of the parameters at the converged optimum (itself an approximation) and then a standard t-interval based on that. However, this approximation can be quite bad, because "degrees of freedom" don't mean much for nonlinear models -- in fact, that's the essential (and huge!) difference between linear and nonlinear models -- and the likelihood surface may not be close enough to quadratic. So one may do better with, e.g. a bootstrap approximation, although this can be problematic, too, due to convergence and other issues.

What I think can be said with some certainty is that the idea of approximating by a segmented regression and then using CI's for each linear part in the "usual" way is a particularly bad one -- the CI's will be underestimated because they don't take into account the uncertainty in the location of the fitted breakpoints, which are nonlinear **and** non-smooth functions of the data.

So if confidence intervals for the fitted values are really important, I suggest that Julian work with his local statistician to come up with the best approach for his particular situation. It's tricky.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
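The delta-method approach described above can be sketched as follows. The model function f, its hand-derived gradient, and the simulated data are assumptions for illustration; only nls(), coef(), vcov(), df.residual() and qt() are standard R. As the message warns, the resulting interval is an approximation that can be poor for strongly nonlinear models.

```r
set.seed(5)
# Hypothetical model and data; the thread does not give f or pavludata
f <- function(x, a, b) a * (1 - exp(-x / b))
sim <- data.frame(yield = seq(100, 2000, length.out = 60))
sim$Na <- f(sim$yield, 20, 600) + rnorm(60, sd = 0.5)
fit <- nls(Na ~ f(yield, a, b), data = sim, start = list(a = 15, b = 500))

x   <- seq(100, 2000, length.out = 100)
est <- unname(coef(fit))          # est[1] = a, est[2] = b
V   <- vcov(fit)                  # approximate covariance of (a, b)

# Gradient of f with respect to (a, b), evaluated at the estimates
G <- cbind(a = 1 - exp(-x / est[2]),
           b = -est[1] * x * exp(-x / est[2]) / est[2]^2)

# Delta method: Var(f(x)) is approximately g(x)' V g(x) for each x
se <- sqrt(rowSums((G %*% V) * G))

# Standard t-interval based on the residual degrees of freedom
mu <- f(x, est[1], est[2])
tq <- qt(0.975, df.residual(fit))
ci <- cbind(fit = mu, lower = mu - tq * se, upper = mu + tq * se)

plot(Na ~ yield, data = sim)
matlines(x, ci, lty = c(1, 2, 2), col = 1)
```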
Re: [R] breakpoints and nonlinear regression
Hi Ken,

Thanks for that advice. I took a brief look at it. I already have my curve, drawn with the curve() function using the parameters a and b given by nls. Would se.fit and interval have computed the CI?

Maybe where I'm confused is how I can break up my curve into pieces of linear regressions, and then do CIs from there?

Thanks.

Julian

--
View this message in context: http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4303763.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] breakpoints and nonlinear regression
Sorry, that wasn't too helpful... I see that the interval and se.fit arguments are currently ignored.
Re: [R] breakpoints and nonlinear regression
Hi, Julian-

I'm not sure if this will be what you want but you could start by taking a look at:

    ?predict.nls

Ken
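A quick way to see what predict() returns for an nls fit is to call it with se.fit and interval set and inspect the result. The sketch below uses a hypothetical model function f and simulated data, since the original model and data are not shown. ?predict.nls documents that se.fit and interval are ignored at present, so the expectation is a plain vector of fitted values; that may differ in an R version where the documentation says otherwise.

```r
set.seed(6)
# Hypothetical model and data standing in for the poster's nls fit
f <- function(x, a, b) a * (1 - exp(-x / b))
sim <- data.frame(yield = seq(100, 2000, length.out = 60))
sim$Na <- f(sim$yield, 20, 600) + rnorm(60, sd = 0.5)
fit <- nls(Na ~ f(yield, a, b), data = sim, start = list(a = 15, b = 500))

# Ask for standard errors and a confidence interval anyway
new <- data.frame(yield = c(500, 1000, 1500))
p <- predict(fit, newdata = new, se.fit = TRUE, interval = "confidence")

# Inspect the structure: with the arguments ignored there are
# no standard errors and no lower/upper columns, only predictions
str(p)
```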