Re: [R] nls, convergence and starting values

2009-03-30 Thread Christian Ritz
Hi Patrick,

there is specialized functionality in R that offers both automated calculation of
starting values and relatively robust optimization. It can be used successfully in many
common cases of nonlinear regression, including your data:

library(drc)  # on CRAN

## Fitting 3-parameter logistic model
## (slightly different parameterization from SSlogis())
bdd.m1 <- drm(pourcma ~ transat, weights = sqrt(nbfeces), data = bdd, fct = L.3())

plot(bdd.m1, broken=TRUE, conLevel=0.0001)

summary(bdd.m1)


Of course, the standard errors are huge, as the data do not really support this
model (as already pointed out in other replies to this post).
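
On the "slightly different parameterization" mentioned in the code comment: a minimal sketch, assuming (please check ?L.3 in drc) that L.3() has the form d/(1 + exp(b*(x - e))), whereas SSlogis() uses Asym/(1 + exp((xmid - x)/scal)). Under that assumption the two describe the same curve with d = Asym, e = xmid and b = -1/scal. The function names logis_ss and logis_drc are made up for illustration, and only base R is used:

```r
## SSlogis()-style parameterization (the formula documented in ?SSlogis)
logis_ss <- function(x, Asym, xmid, scal) Asym / (1 + exp((xmid - x) / scal))

## L.3()-style parameterization (assumed form; see ?L.3 in drc)
logis_drc <- function(x, b, d, e) d / (1 + exp(b * (x - e)))

x <- seq(0, 0.5, by = 0.05)

## same curve under the mapping d = Asym, e = xmid, b = -1/scal
y_ss  <- logis_ss(x, Asym = 30, xmid = 0.07, scal = 0.02)
y_drc <- logis_drc(x, b = -1 / 0.02, d = 30, e = 0.07)

all.equal(y_ss, y_drc)
```

So a negative b reported by drm() would correspond to an increasing curve in SSlogis() terms.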


Christian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls, convergence and starting values

2009-03-28 Thread Patrick Giraudoux

Patrick Burns wrote:
> Patrick Giraudoux wrote:
>> Bert Gunter wrote:
>>> Based on a simple scatterplot of pourcma vs transat, a 4 parameter
>>> logistic looks like wild overfitting, and that may be the source of
>>> your problems.
>>>
>>> Given the huge scatter, a straight line is about as much as would seem
>>> sensible. I think this falls into the "Why ever would you want to do
>>> such a thing?" category.
>>>
>>> -- Bert
>>
>> Right, well, the general idea was just to show that the straight
>> line was indeed the best model (in the other data sets, with model
>> comparison, the logistic one was clearly shown to be the best...).
>> Can the fact that convergence cannot be obtained be an acceptable and
>> sufficient reason to select the null model (the straight line)?
>
> It is my experience that convergence problems are
> often encountered when the model makes little sense.
> I'm not so sure that non-convergence on its own is
> a good reason to reject the model.  That is, to answer
> your specific question, I think it is acceptable but not
> sufficient.
>
> Patrick Burns
> patr...@burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of The R Inferno and A Guide for the Unwilling S User)



OK. Thanks for this opinion. Actually I shared it intuitively but,
facing such a situation for the first time, I was quite uncomfortable
making a decision (and I still am). We are touching on epistemology... and
maybe straying a bit far from purely technical matters, and thus from the
R list's remit.


Thanks again, anyway,

Patrick



Re: [R] nls, convergence and starting values

2009-03-28 Thread Patrick Burns

Patrick Giraudoux wrote:
> Bert Gunter wrote:
>> Based on a simple scatterplot of pourcma vs transat, a 4 parameter
>> logistic looks like wild overfitting, and that may be the source of
>> your problems.
>>
>> Given the huge scatter, a straight line is about as much as would seem
>> sensible. I think this falls into the "Why ever would you want to do
>> such a thing?" category.
>>
>> -- Bert
>
> Right, well, the general idea was just to show that the straight
> line was indeed the best model (in the other data sets, with model
> comparison, the logistic one was clearly shown to be the best...).
> Can the fact that convergence cannot be obtained be an acceptable and
> sufficient reason to select the null model (the straight line)?


It is my experience that convergence problems are
often encountered when the model makes little sense.
I'm not so sure that non-convergence on its own is
a good reason to reject  the model.  That is, to answer
your specific question, I think it is acceptable but not
sufficient.

Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of The R Inferno and A Guide for the Unwilling S User)




Re: [R] nls, convergence and starting values

2009-03-28 Thread Ben Bolker
Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> writes:

> Patrick Burns wrote:
>> Patrick Giraudoux wrote:
>>> Bert Gunter wrote:
>>>> Based on a simple scatterplot of pourcma vs transat, a 4 parameter
>>>> logistic looks like wild overfitting, and that may be the source of
>>>> your problems.
>>>> Given the huge scatter, a straight line is about as much as would seem
>>>> sensible. I think this falls into the "Why ever would you want to do
>>>> such a thing?" category.
>>>>
>>>> -- Bert
>>>
>>> Right, well, the general idea was just to show that the straight
>>> line was indeed the best model (in the other data sets, with model
>>> comparison, the logistic one was clearly shown to be the best...).
>>> Can the fact that convergence cannot be obtained be an acceptable and
>>> sufficient reason to select the null model (the straight line)?
>>
>> It is my experience that convergence problems are
>> often encountered when the model makes little sense.
>> I'm not so sure that non-convergence on its own is
>> a good reason to reject the model.  That is, to answer
>> your specific question, I think it is acceptable but not
>> sufficient.
>>
>> Patrick Burns
>> patrick at burns-stat.com
>> +44 (0)20 8525 0696
>> http://www.burns-stat.com
>> (home of The R Inferno and A Guide for the Unwilling S User)
>
> OK. Thanks for this opinion. Actually I shared it intuitively but,
> facing such a situation for the first time, I was quite uncomfortable
> making a decision (and I still am). We are touching on epistemology... and
> maybe straying a bit far from purely technical matters, and thus from the
> R list's remit.
 

  A technical solution to this particular problem:


with(bdd, plot(pourcma ~ transat))

## starting values, with the implied curve overlaid as a visual check
stval <- list(Asym = 30, xmid = 0.07, scal = 0.02)
with(stval, curve(Asym/(1 + exp((xmid - x)/scal)), add = TRUE))

nls(pourcma ~ SSlogis(transat, Asym, xmid, scal), start = c(Asym = 30,
xmid = 0.07, scal = 0.02), data = bdd, weights = sqrt(nbfeces), trace = TRUE,
algorithm = "plinear")

library(bbmle)
m1 <- mle2(pourcma ~ dnorm(mean = Asym/(1 + exp((xmid - transat)/scal)), sd = sd),
   start = c(stval, list(sd = 0.1)), method = "Nelder-Mead",
   data = bdd)

## overlay the fitted curve in red
with(as.list(coef(m1)), curve(Asym/(1 + exp((xmid - x)/scal)), add = TRUE, col = 2))


  It happens to be able to find the flat-line solution (although it
should really complain about lack of convergence, since the scale parameter
should go to infinity and the midpoint parameter should be arbitrary
in this case -- only Asym and the standard deviation are well
defined).
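
To illustrate the identifiability point, here is a small base-R sketch (the function name logistic and all parameter values are chosen purely for illustration, they are not estimates from any fit in this thread). With a very large scal, the logistic curve is essentially flat over the observed range of transat (roughly 0 to 0.5), and moving xmid makes practically no difference:

```r
## logistic curve in the SSlogis() parameterization
logistic <- function(x, Asym, xmid, scal) Asym / (1 + exp((xmid - x) / scal))

x <- seq(0, 0.5, length.out = 50)

## very large scal: the curve is flat over the data range;
## with a finite xmid the flat level sits at about Asym/2 (here 30)
flat_a <- logistic(x, Asym = 60, xmid = 0.1, scal = 1e4)
flat_b <- logistic(x, Asym = 60, xmid = 0.4, scal = 1e4)

range(flat_a)              # both ends very close to 30
max(abs(flat_a - flat_b))  # changing xmid barely moves the curve
```

Any optimizer heading towards such a flat solution has no information with which to pin down xmid or scal, which is consistent with the convergence trouble reported for nls().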

  Ben Bolker



[R] nls, convergence and starting values

2009-03-27 Thread Patrick Giraudoux

"In nonlinear modelling, finding appropriate starting values is
something like an art"... (maybe from somewhere in Crawley, 2007). Here
a colleague and I just want to compare different response models to a
null model. This has worked OK for almost all the other data sets except
this one (dumped below). Whatever we try, whichever algorithm we use, and
even when subsetting the data (to check whether some singular point was
causing the mess), we do not reach convergence... or we run into singular
gradients (?) etc.

e.g.:

nls(pourcma ~ SSlogis(transat, Asym, xmid, scal), start = c(Asym = 30,
xmid = 0.07, scal = 0.02), data = bdd, weights = sqrt(nbfeces), trace = TRUE,
algorithm = "plinear")

Does anyone have a hint about an alternative approach to fitting the model? Or
an idea for how to get evidence that such a model cannot be fitted to the data?
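
One possible way to approach this question, sketched on simulated data rather than on bdd (the data frame sim, its columns and the noise level are invented for illustration): fit the straight line with lm(), attempt the logistic with the self-starting SSlogis() inside tryCatch() so that a failed fit does not stop the script, and compare by AIC only if the nonlinear fit succeeded. This is a sketch under those assumptions, not a verdict on the actual data.

```r
set.seed(1)

## simulated, nearly flat and very noisy response (illustrative only)
sim <- data.frame(x = seq(0, 0.5, length.out = 40))
sim$y <- 30 + 10 * sim$x + rnorm(40, sd = 15)

## null model: straight line
fit_lin <- lm(y ~ x, data = sim)

## logistic model with self-starting values; return NULL if the fit errors out
fit_log <- tryCatch(
  nls(y ~ SSlogis(x, Asym, xmid, scal), data = sim),
  error = function(e) {
    message("nls failed: ", conditionMessage(e))
    NULL
  }
)

## compare only when both fits exist
if (!is.null(fit_log)) {
  print(AIC(fit_lin, fit_log))
} else {
  print(AIC(fit_lin))
}
```

A failed fit here only says that the optimizer could not find a logistic curve from those starting values. It is a hint that the data do not support the model, not a formal test of it.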


bdd <-
structure(list(transat = c(0.0697, 0.13079, 0.314265, 0.241613,
0.039319, 0, 0, 0, 0, 0, 0.0805, 0.41, 0.30585, 0.27465, 0.06085,
0.09114, 0.05766, 0.036983, 0.093186, 0.046624, 0, 0, 0, 0, 0.000616,
0, 0.0025, 0.0325, 0.03125, 0.04599, 0.38398, 0.524505, 0.450337,
0.061831, 0.133926, 0.091806, 0.00928, 0.25114, 0.3074, 0.431056,
0.026158), transma = c(0.04141, 0.01599, 0.101803, 0.002378,
0.039319, 0.00472459016393443, 0.0031016393442623, 0.000178524590163934,
0.00255704918032787, 0.000346229508196721, 0.0665, 0.012, 0.0553,
0.0045, 0.0056, 0.00155, 0.00124, 0.011966, 0.001736, 0.004712,
3.62903225806452e-05, 9.79838709677419e-05, 2.20161290322581e-05,
0.00462, 0.01006444, 0.00213, 0.046, 0.005,
0.01195, 0.07154, 0.08468, 0.141182, 0.086578, 0.027959, 0.003159,
0.003081, 0.13862, 0.00754, 0.078648, 0.068324, 0.025288), nbfeces = c(22L,
26L, 43L, 30L, 35L, 25L, 21L, 36L, 34L, 37L, 23L, 32L, 40L, 35L,
30L, 16L, 25L, 37L, 37L, 34L, 31L, 35L, 41L, 31L, 34L, 39L, 5L,
14L, 31L, 13L, 21L, 34L, 32L, 36L, 36L, 40L, 31L, 35L, 39L, 29L,
32L), pourcma = c(50, 34.6153846153846, 27.9069767441860, 43.3,
65.7142857142857, 32, 28.5714285714286, 22.2, 50,
10.8108108108108, 26.0869565217391, 40.625, 12.5, 22.8571428571429,
43.3, 6.25, 4, 10.8108108108108, 16.2162162162162,
23.5294117647059, 25.8064516129032, 45.7142857142857, 39.0243902439024,
25.8064516129032, 41.7, 27.5, 20, 14.2857142857143,
22.5806451612903, 15.3846153846154, 38.0952380952381, 17.6470588235294,
78.125, 61.1, 25, 37.5, 22.5806451612903, 40, 17.9487179487179,
41.3793103448276, 50), pourcat = c(22.7272727272727, 30.7692307692308,
41.8604651162791, 56.7, 5.71428571428571, 0, 0, 0,
0, 0, 30.4347826086957, 15.625, 45, 74.2857142857143, 13.3,
50, 12, 18.9189189189189, 27.0270270270270, 20.5882352941176,
0, 0, 0, 0, 0, 5, 40, 0, 0, 7.69230769230769, 9.52380952380952,
38.2352941176471, 59.375, 5.56, 41.7,
42.5, 9.67741935483871, 14.2857142857143, 51.2820512820513,
79.3103448275862,
6.25)), .Names = c("transat", "transma", "nbfeces", "pourcma",
"pourcat"), class = "data.frame", row.names = c(NA, -41L))



Re: [R] nls, convergence and starting values

2009-03-27 Thread Bert Gunter
Based on a simple scatterplot of pourcma vs transat, a 4 parameter logistic
looks like wild overfitting, and that may be the source of your problems.
Given the huge scatter, a straight line is about as much as would seem
sensible. I think this falls into the "Why ever would you want to do such a
thing?" category.

-- Bert


Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Patrick Giraudoux
Sent: Friday, March 27, 2009 12:39 PM
To: r-h...@stat.math.ethz.ch
Cc: Francis Raoul
Subject: [R] nls, convergence and starting values

"In nonlinear modelling, finding appropriate starting values is
something like an art"... (maybe from somewhere in Crawley, 2007). Here
a colleague and I just want to compare different response models to a
null model. This has worked OK for almost all the other data sets except
this one (dumped below). Whatever we try, whichever algorithm we use, and
even when subsetting the data (to check whether some singular point was
causing the mess), we do not reach convergence... or we run into singular
gradients (?) etc.

e.g.:

nls(pourcma ~ SSlogis(transat, Asym, xmid, scal), start = c(Asym = 30,
xmid = 0.07, scal = 0.02), data = bdd, weights = sqrt(nbfeces), trace = TRUE,
algorithm = "plinear")

Does anyone have a hint about an alternative approach to fitting the model? Or
an idea for how to get evidence that such a model cannot be fitted to the data?


bdd <-
structure(list(transat = c(0.0697, 0.13079, 0.314265, 0.241613,
0.039319, 0, 0, 0, 0, 0, 0.0805, 0.41, 0.30585, 0.27465, 0.06085,
0.09114, 0.05766, 0.036983, 0.093186, 0.046624, 0, 0, 0, 0, 0.000616,
0, 0.0025, 0.0325, 0.03125, 0.04599, 0.38398, 0.524505, 0.450337,
0.061831, 0.133926, 0.091806, 0.00928, 0.25114, 0.3074, 0.431056,
0.026158), transma = c(0.04141, 0.01599, 0.101803, 0.002378,
0.039319, 0.00472459016393443, 0.0031016393442623, 0.000178524590163934,
0.00255704918032787, 0.000346229508196721, 0.0665, 0.012, 0.0553,
0.0045, 0.0056, 0.00155, 0.00124, 0.011966, 0.001736, 0.004712,
3.62903225806452e-05, 9.79838709677419e-05, 2.20161290322581e-05,
0.00462, 0.01006444, 0.00213, 0.046, 0.005,
0.01195, 0.07154, 0.08468, 0.141182, 0.086578, 0.027959, 0.003159,
0.003081, 0.13862, 0.00754, 0.078648, 0.068324, 0.025288), nbfeces = c(22L,
26L, 43L, 30L, 35L, 25L, 21L, 36L, 34L, 37L, 23L, 32L, 40L, 35L,
30L, 16L, 25L, 37L, 37L, 34L, 31L, 35L, 41L, 31L, 34L, 39L, 5L,
14L, 31L, 13L, 21L, 34L, 32L, 36L, 36L, 40L, 31L, 35L, 39L, 29L,
32L), pourcma = c(50, 34.6153846153846, 27.9069767441860, 43.3,
65.7142857142857, 32, 28.5714285714286, 22.2, 50,
10.8108108108108, 26.0869565217391, 40.625, 12.5, 22.8571428571429,
43.3, 6.25, 4, 10.8108108108108, 16.2162162162162,
23.5294117647059, 25.8064516129032, 45.7142857142857, 39.0243902439024,
25.8064516129032, 41.7, 27.5, 20, 14.2857142857143,
22.5806451612903, 15.3846153846154, 38.0952380952381, 17.6470588235294,
78.125, 61.1, 25, 37.5, 22.5806451612903, 40, 17.9487179487179,
41.3793103448276, 50), pourcat = c(22.7272727272727, 30.7692307692308,
41.8604651162791, 56.7, 5.71428571428571, 0, 0, 0,
0, 0, 30.4347826086957, 15.625, 45, 74.2857142857143, 13.3,
50, 12, 18.9189189189189, 27.0270270270270, 20.5882352941176,
0, 0, 0, 0, 0, 5, 40, 0, 0, 7.69230769230769, 9.52380952380952,
38.2352941176471, 59.375, 5.56, 41.7,
42.5, 9.67741935483871, 14.2857142857143, 51.2820512820513,
79.3103448275862,
6.25)), .Names = c("transat", "transma", "nbfeces", "pourcma",
"pourcat"), class = "data.frame", row.names = c(NA, -41L))



Re: [R] nls, convergence and starting values

2009-03-27 Thread Patrick Giraudoux

Bert Gunter wrote:
> Based on a simple scatterplot of pourcma vs transat, a 4 parameter logistic
> looks like wild overfitting, and that may be the source of your problems.
> Given the huge scatter, a straight line is about as much as would seem
> sensible. I think this falls into the "Why ever would you want to do such a
> thing?" category.
>
> -- Bert

Right, well, the general idea was just to show that the straight line
was indeed the best model (in the other data sets, with model
comparison, the logistic one was clearly shown to be the best...). Can
the fact that convergence cannot be obtained be an acceptable and
sufficient reason to select the null model (the straight line)?


Patrick
