[R] strange `nls' behaviour
I initially thought, this should better be posted to r-devel but alas! no response. so I try it here. sory for the lengthy explanation but it seems unavoidable. to quickly see the problem simply copy the litte example below and execute f(n=5) which crashes. called with n != 5 (and of course n>3 since there are 3 parameters in the model...) everything is as it should be. in detail: I stumbled over the follwing _very_ strange behaviour/error when using `nls' which I'm tempted (despite the implied "dangers") to call a bug: I've written a driver for `nls' which allows specifying the model and the data vectors using arbitrary symbols. these are internally mapped to consistent names, which poses a slight complication when using `deriv' to provide analytic derivatives. the following fragment gives the idea: #- f <- function(n = 4) { x <- seq(0, 5, length = n) y <- 2 * exp(-1*x) + 2; y <- rnorm(y,y, 0.01*y) model <- y ~ a * exp (-b*x) + c fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) #"standard" call of nls: res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) call.fitfunc <- c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x")) call.fitfunc <- as.call(call.fitfunc) frml <- as.formula("y ~ eval(call.fitfunc)") #"computed" call of nls: res2 <- nls(frml, start = c(a=1, b=1, c=1)) list(res1 = res1, res2 = res2) } #- the argument `n' defines the number of (simulated) data points x/y which are going to be fitted by some model ( here y ~ a*exp(-b*x)+c ) the first call to `nls' is the standard way of calling `nls' when knowing all the variable and parameter names. the second call (yielding `res2') uses a constructed formula in `frml' (which in this example is of course not necessary, but in the general case 'a,b,c,x,y' are not a priori known names). now, here is the problem: the call f(4) runs fine/consistently, as does every call with n > 5. BUT: for n = 5 (i.e. issuing f(5)) the second fit leads to the error message: "Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid type (language) for variable 'call.fitfunc'" I cornered this to a spot in `nls' where a model frame is constructed in variable `mf'. the parsing/constructing here seems simply to be messed up for n = 5: `call.fitfunc' is interpreted as variable. I, moreover, empirically noted that the problem occurs when the total number of parameters plus dependent/independent variables equals the number of data points (in the present example a,b,c,x,y). so it is not the 'magic' number of 5 but rather the identity of data vector length and number of parameters+variables in the model which leads to the problem. this is with 2.5.0 (which hopefully is not considered ancient) and MacOSX 10.4.10. any ideas? thanks joerg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] strange `nls' behaviour
I can confirm this behavior on R-2.6.0 but don't have time to look into it further at the moment. On Mon, 12 Nov 2007, Joerg van den Hoff wrote: > > I initially thought, this should better be posted to r-devel > but alas! no response. so I try it here. sory for the > lengthy explanation but it seems unavoidable. to quickly see > the problem simply copy the litte example below and execute > > f(n=5) > > which crashes. called with n != 5 (and of course n>3 since > there are 3 parameters in the model...) everything is as it > should be. > > in detail: > I stumbled over the follwing _very_ strange behaviour/error > when using `nls' which I'm tempted (despite the implied > "dangers") to call a bug: > > I've written a driver for `nls' which allows specifying the > model and the data vectors using arbitrary symbols. these > are internally mapped to consistent names, which poses a > slight complication when using `deriv' to provide analytic > derivatives. the following fragment gives the idea: > > #- > f <- function(n = 4) { > >x <- seq(0, 5, length = n) > >y <- 2 * exp(-1*x) + 2; >y <- rnorm(y,y, 0.01*y) > >model <- y ~ a * exp (-b*x) + c > >fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) > >#"standard" call of nls: >res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) > >call.fitfunc <- >c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x")) >call.fitfunc <- as.call(call.fitfunc) >frml <- as.formula("y ~ eval(call.fitfunc)") > >#"computed" call of nls: >res2 <- nls(frml, start = c(a=1, b=1, c=1)) > >list(res1 = res1, res2 = res2) > } > #- > > the argument `n' defines the number of (simulated) data > points x/y which are going to be fitted by some model ( here > y ~ a*exp(-b*x)+c ) > > the first call to `nls' is the standard way of calling `nls' > when knowing all the variable and parameter names. > > the second call (yielding `res2') uses a constructed formula > in `frml' (which in this example is of course not necessary, > but in the general case 'a,b,c,x,y' are not a priori known > names). > > now, here is the problem: the call > > f(4) > > runs fine/consistently, as does every call with n > 5. > > BUT: for n = 5 (i.e. issuing f(5)) > > the second fit leads to the error message: > > "Error in model.frame(formula, rownames, variables, varnames, extras, > extranames, : > invalid type (language) for variable 'call.fitfunc'" > > I cornered this to a spot in `nls' where a model frame is > constructed in variable `mf'. the parsing/constructing here > seems simply to be messed up for n = 5: `call.fitfunc' is > interpreted as variable. > > I, moreover, empirically noted that the problem occurs when > the total number of parameters plus dependent/independent > variables equals the number of data points (in the present > example a,b,c,x,y). > > so it is not the 'magic' number of 5 but rather the identity > of data vector length and number of parameters+variables in > the model which leads to the problem. > > this is with 2.5.0 (which hopefully is not considered > ancient) and MacOSX 10.4.10. > > any ideas? > > thanks > > joerg > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] strange `nls' behaviour
On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > I initially thought, this should better be posted to r-devel > but alas! no response. I think the reason there was no response is that your example is too complicated. You're doing a lot of strange things (fitfunc as a result of deriv, using as.name, as.call, as.formula, etc.) You should simplify it down to isolate the bug. Thats a lot of work, but you're the one in the best position to do it. I'd say there's at least an even chance that the bug is in your code rather than in nls(). And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it turns out to be an R bug. Duncan Murdoch so I try it here. sory for the > lengthy explanation but it seems unavoidable. to quickly see > the problem simply copy the litte example below and execute > > f(n=5) > > which crashes. called with n != 5 (and of course n>3 since > there are 3 parameters in the model...) everything is as it > should be. > > in detail: > I stumbled over the follwing _very_ strange behaviour/error > when using `nls' which I'm tempted (despite the implied > "dangers") to call a bug: > > I've written a driver for `nls' which allows specifying the > model and the data vectors using arbitrary symbols. these > are internally mapped to consistent names, which poses a > slight complication when using `deriv' to provide analytic > derivatives. the following fragment gives the idea: > > #- > f <- function(n = 4) { > >x <- seq(0, 5, length = n) > >y <- 2 * exp(-1*x) + 2; >y <- rnorm(y,y, 0.01*y) > >model <- y ~ a * exp (-b*x) + c > >fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) > >#"standard" call of nls: >res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) > >call.fitfunc <- >c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x")) >call.fitfunc <- as.call(call.fitfunc) >frml <- as.formula("y ~ eval(call.fitfunc)") > >#"computed" call of nls: >res2 <- nls(frml, start = c(a=1, b=1, c=1)) > >list(res1 = res1, res2 = res2) > } > #- > > the argument `n' defines the number of (simulated) data > points x/y which are going to be fitted by some model ( here > y ~ a*exp(-b*x)+c ) > > the first call to `nls' is the standard way of calling `nls' > when knowing all the variable and parameter names. > > the second call (yielding `res2') uses a constructed formula > in `frml' (which in this example is of course not necessary, > but in the general case 'a,b,c,x,y' are not a priori known > names). > > now, here is the problem: the call > > f(4) > > runs fine/consistently, as does every call with n > 5. > > BUT: for n = 5 (i.e. issuing f(5)) > > the second fit leads to the error message: > > "Error in model.frame(formula, rownames, variables, varnames, extras, > extranames, : > invalid type (language) for variable 'call.fitfunc'" > > I cornered this to a spot in `nls' where a model frame is > constructed in variable `mf'. the parsing/constructing here > seems simply to be messed up for n = 5: `call.fitfunc' is > interpreted as variable. > > I, moreover, empirically noted that the problem occurs when > the total number of parameters plus dependent/independent > variables equals the number of data points (in the present > example a,b,c,x,y). > > so it is not the 'magic' number of 5 but rather the identity > of data vector length and number of parameters+variables in > the model which leads to the problem. > > this is with 2.5.0 (which hopefully is not considered > ancient) and MacOSX 10.4.10. > > any ideas? > > thanks > > joerg > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] strange `nls' behaviour
On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote: > On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > >I initially thought, this should better be posted to r-devel > >but alas! no response. > > I think the reason there was no response is that your example is too > complicated. You're doing a lot of strange things (fitfunc as a result > of deriv, using as.name, as.call, as.formula, etc.) You should simplify thanks for the feedback. concerning "lot of strange things": OK. I thought the context might be important ("why, for heaven's sake do you want to do this!?"), but, then, maybe not. so the easiest way to trigger a similar (not the identical) behaviour is something like f <- function (n) { #- #define n data points for a (hardcoded) model: #--- x <- seq(0, 5, length = n) y <- 2 * exp(-1*x) + 2; y <- rnorm(y,y, 0.01*y) #the model (same as the above hardcoded one): model <- y ~ a * exp (-b*x) + c #call nls as usual: res1 <- try(nls(model, start = c(a=2, b=1, c=2))) #call it a bit differently: res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2)) list(res1 = res1, res2 = res2) #- } this is without all the overhead of my first example and now (since not quite the same) the problem arises at n == 3 where the fit can't really procede (there are also 3 parameters -- the first example was more relevant in this respect) but anyway the second nls-call does not procede beyond the parsing phase of `model.frame'. so, I can't see room for a real bug in my code, but very well a chance that I misuse `nls' (i.e. not understanding what really is tolerable for the `model' argument of `nls'). but if the latter is not the case, I would think there is a bug in `nls' (or, actually, rather in `model.frame' or whatever) when parsing the nls call. > it down to isolate the bug. Thats a lot of work, but you're the one in > the best position to do it. I'd say there's at least an even chance > that the bug is in your code rather than in nls(). > > And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it > turns out to be an R bug. if need be, I'll do that (if I get it compiled under macosX). but assuming that you have R-patched installed anyway, I would appreciate if you would copy the new example above or the old one below to your R- prompt and see, whether it crashes with the same error message if called with the special number of data points (3 for the new, 5 for the old example)? and if so, maybe bring this to the attention of d. bates? j. van den hoff > > Duncan Murdoch > > > > > so I try it here. sory for the > >lengthy explanation but it seems unavoidable. to quickly see > >the problem simply copy the litte example below and execute > > > >f(n=5) > > > >which crashes. called with n != 5 (and of course n>3 since > >there are 3 parameters in the model...) everything is as it > >should be. > > > >in detail: > >I stumbled over the follwing _very_ strange behaviour/error > >when using `nls' which I'm tempted (despite the implied > >"dangers") to call a bug: > > > >I've written a driver for `nls' which allows specifying the > >model and the data vectors using arbitrary symbols. these > >are internally mapped to consistent names, which poses a > >slight complication when using `deriv' to provide analytic > >derivatives. the following fragment gives the idea: > > > >#- > >f <- function(n = 4) { > > > > x <- seq(0, 5, length = n) > > > > y <- 2 * exp(-1*x) + 2; > > y <- rnorm(y,y, 0.01*y) > > > > model <- y ~ a * exp (-b*x) + c > > > > fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) > > > > #"standard" call of nls: > > res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) > > > > call.fitfunc <- > > c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x")) > > call.fitfunc <- as.call(call.fitfunc) > > frml <- as.formula("y ~ eval(call.fitfunc)") > > > > #"computed" call of nls: > > res2 <- nls(frml, start = c(a=1, b=1, c=1)) > > > > list(res1 = res1, res2 = res2) > >} > >#- > > > >the argument `n' defines the number of (simulated) data > >points x/y which are going to be fitted by some model ( here > >y ~ a*exp(-b*x)+c ) > > > >the first call to `nls' is the standard way of calling `nls' > >when knowing all the variable and parameter names. > > > >the second call (yielding `res2') uses a constructed formula > >in `frml' (which in this example is of course not necessary, > >but in the general case 'a,b,c,x,y' are not a priori known > >names). > > > >now, here is the problem: the call > > > >f(4) > > > >runs fine/consistently, as does every
Re: [R] strange `nls' behaviour
> "DM" == Duncan Murdoch <[EMAIL PROTECTED]> > on Mon, 12 Nov 2007 07:36:34 -0500 writes: DM> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: >> I initially thought, this should better be posted to r-devel >> but alas! no response. DM> I think the reason there was no response is that your example is too DM> complicated. You're doing a lot of strange things (fitfunc as a result DM> of deriv, using as.name, as.call, as.formula, etc.) You should simplify DM> it down to isolate the bug. Thats a lot of work, but you're the one in DM> the best position to do it. I'd say there's at least an even chance DM> that the bug is in your code rather than in nls(). yes.. and.. no : - His code is quite peculiar, but I think only slightly too complicated - one could argue that the bug is in Joerg's thinking that something like nls(y ~ eval(fitfunc), ) should be working at all. But then he had found by experiment that it (accidentally I d'say) does work in many cases. DM> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it DM> turns out to be an R bug. You are right, but indeed (as has Kate just said) it *does* exist in current R versions. I agree that the behavior of nls() here is sub-optimal. It *should* be consistent, i.e. work the same for n=4,5,6,.. I had spent about an hour after Joerg's R-devel posting, and found to be too busy with more urgent matters -- unfortunately forgetting to give *some* feedback about my findings. It may well be that we find that nls() should give an (intelligible) error message in such eval() cases - rather than only in one case... Martin Maechler DM> Duncan Murdoch DM> so I try it here. sory for the >> lengthy explanation but it seems unavoidable. to quickly see >> the problem simply copy the litte example below and execute >> >> f(n=5) >> >> which crashes. called with n != 5 (and of course n>3 since >> there are 3 parameters in the model...) everything is as it >> should be. >> >> in detail: >> I stumbled over the follwing _very_ strange behaviour/error >> when using `nls' which I'm tempted (despite the implied >> "dangers") to call a bug: >> >> I've written a driver for `nls' which allows specifying the >> model and the data vectors using arbitrary symbols. these >> are internally mapped to consistent names, which poses a >> slight complication when using `deriv' to provide analytic >> derivatives. the following fragment gives the idea: >> >> #- >> f <- function(n = 4) { >> >> x <- seq(0, 5, length = n) >> >> y <- 2 * exp(-1*x) + 2; >> y <- rnorm(y,y, 0.01*y) >> >> model <- y ~ a * exp (-b*x) + c >> >> fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) >> >> #"standard" call of nls: >> res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) >> >> call.fitfunc <- >> c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x")) >> call.fitfunc <- as.call(call.fitfunc) >> frml <- as.formula("y ~ eval(call.fitfunc)") >> >> #"computed" call of nls: >> res2 <- nls(frml, start = c(a=1, b=1, c=1)) >> >> list(res1 = res1, res2 = res2) >> } >> #- >> >> the argument `n' defines the number of (simulated) data >> points x/y which are going to be fitted by some model ( here >> y ~ a*exp(-b*x)+c ) >> >> the first call to `nls' is the standard way of calling `nls' >> when knowing all the variable and parameter names. >> >> the second call (yielding `res2') uses a constructed formula >> in `frml' (which in this example is of course not necessary, >> but in the general case 'a,b,c,x,y' are not a priori known >> names). >> >> now, here is the problem: the call >> >> f(4) >> >> runs fine/consistently, as does every call with n > 5. >> >> BUT: for n = 5 (i.e. issuing f(5)) >> >> the second fit leads to the error message: >> >> "Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : >> invalid type (language) for variable 'call.fitfunc'" >> >> I cornered this to a spot in `nls' where a model frame is >> constructed in variable `mf'. the parsing/constructing here >> seems simply to be messed up for n = 5: `call.fitfunc' is >> interpreted as variable. >> >> I, moreover, empirically noted that the problem occurs when >> the total number of parameters plus dependent/independent >> variables equals the number of data points (in the present >> example a,b,c,x,y). >> >> so it is not the 'magic' number of 5 but rather the identity
Re: [R] strange `nls' behaviour
On Mon, Nov 12, 2007 at 03:25:38PM +0100, Martin Maechler wrote: > > "DM" == Duncan Murdoch <[EMAIL PROTECTED]> > > on Mon, 12 Nov 2007 07:36:34 -0500 writes: > > DM> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > >> I initially thought, this should better be posted to r-devel > >> but alas! no response. > > DM> I think the reason there was no response is that your example is too > DM> complicated. You're doing a lot of strange things (fitfunc as a > result > DM> of deriv, using as.name, as.call, as.formula, etc.) You should > simplify > DM> it down to isolate the bug. Thats a lot of work, but you're the one > in > DM> the best position to do it. I'd say there's at least an even chance > DM> that the bug is in your code rather than in nls(). > > yes.. and.. no : > - His code is quite peculiar, but I think only slightly too complicated thank's for this response. I just tried to give an easier example in response to duncan's mail. > > - one could argue that the bug is in Joerg's thinking that > something like > nls(y ~ eval(fitfunc), ) > > should be working at all. > But then he had found by experiment that it (accidentally I d'say) > does work in many cases. > > DM> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if > it > DM> turns out to be an R bug. > > You are right, but indeed (as has Kate just said) it *does* > exist in current R versions. > > I agree that the behavior of nls() here is sub-optimal. > It *should* be consistent, i.e. work the same for n=4,5,6,.. > > I had spent about an hour after Joerg's R-devel posting, > and found to be too busy with more urgent matters -- > unfortunately forgetting to give *some* feedback about my findings. > > It may well be that we find that nls() should give an > (intelligible) error message in such eval() cases - rather than > only in one case... well, if at all possible, it would _really_ be nice to have the opportunity to pass `eval' constructs throught the `model' argument to `nls'. I'm far from understanding what actually is going on in `model.frame', but if it parses the thing correctly ("accidentally" or not) except in one special case, is'nt there a clean way to "repair" the special case? in my application I found the ability to use `eval' constructs here really helpful. at least I did not find another way to be able to use `deriv' information _and_ allow the user to use arbitrary symbols in specifying the model formula. at least I'd prefer the present situation of "works nearly always" to a consistent "works never". joerg > > Martin Maechler > > > DM> Duncan Murdoch > > > > > DM> so I try it here. sory for the > >> lengthy explanation but it seems unavoidable. to quickly see > >> the problem simply copy the litte example below and execute > >> > >> f(n=5) > >> > >> which crashes. called with n != 5 (and of course n>3 since > >> there are 3 parameters in the model...) everything is as it > >> should be. > >> > >> in detail: > >> I stumbled over the follwing _very_ strange behaviour/error > >> when using `nls' which I'm tempted (despite the implied > >> "dangers") to call a bug: > >> > >> I've written a driver for `nls' which allows specifying the > >> model and the data vectors using arbitrary symbols. these > >> are internally mapped to consistent names, which poses a > >> slight complication when using `deriv' to provide analytic > >> derivatives. the following fragment gives the idea: > >> > >> #- > >> f <- function(n = 4) { > >> > >> x <- seq(0, 5, length = n) > >> > >> y <- 2 * exp(-1*x) + 2; > >> y <- rnorm(y,y, 0.01*y) > >> > >> model <- y ~ a * exp (-b*x) + c > >> > >> fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) > >> > >> #"standard" call of nls: > >> res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) > >> > >> call.fitfunc <- > >> c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), > as.name("x")) > >> call.fitfunc <- as.call(call.fitfunc) > >> frml <- as.formula("y ~ eval(call.fitfunc)") > >> > >> #"computed" call of nls: > >> res2 <- nls(frml, start = c(a=1, b=1, c=1)) > >> > >> list(res1 = res1, res2 = res2) > >> } > >> #- > >> > >> the argument `n' defines the number of (simulated) data > >> points x/y which are going to be fitted by some model ( here > >> y ~ a*exp(-b*x)+c ) > >> > >> the first call to `nls' is the standard way of calling `nls' > >> when knowing all the variable and parameter names. > >> > >> the second call (yielding `res2') uses a constructed formula >
Re: [R] strange `nls' behaviour
On about line 547 in nls.R there is mf$formula <- # replace by one-sided linear model formula as.formula(paste("~", paste(varNames[varIndex], collapse = "+")), env = environment(formula)) If this is replaced with mf$formula <- # replace by one-sided linear model formula as.formula(paste("~", paste(varNames[1], collapse = "+")), env = environment(formula)) then the behviour seems to be OK at least for Joerg's examples. It remains to be determined whether this is a _good_ solution (that is, if it's a general solution to get the desired behavior). On Mon, 12 Nov 2007, Martin Maechler wrote: > > "DM" == Duncan Murdoch <[EMAIL PROTECTED]> > > on Mon, 12 Nov 2007 07:36:34 -0500 writes: > > DM> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > >> I initially thought, this should better be posted to r-devel > >> but alas! no response. > > DM> I think the reason there was no response is that your example is too > DM> complicated. You're doing a lot of strange things (fitfunc as a > result > DM> of deriv, using as.name, as.call, as.formula, etc.) You should > simplify > DM> it down to isolate the bug. Thats a lot of work, but you're the one > in > DM> the best position to do it. I'd say there's at least an even chance > DM> that the bug is in your code rather than in nls(). > > yes.. and.. no : > - His code is quite peculiar, but I think only slightly too complicated > > - one could argue that the bug is in Joerg's thinking that > something like > nls(y ~ eval(fitfunc), ) > > should be working at all. > But then he had found by experiment that it (accidentally I d'say) > does work in many cases. > > DM> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if > it > DM> turns out to be an R bug. > > You are right, but indeed (as has Kate just said) it *does* > exist in current R versions. > > I agree that the behavior of nls() here is sub-optimal. > It *should* be consistent, i.e. work the same for n=4,5,6,.. > > I had spent about an hour after Joerg's R-devel posting, > and found to be too busy with more urgent matters -- > unfortunately forgetting to give *some* feedback about my findings. > > It may well be that we find that nls() should give an > (intelligible) error message in such eval() cases - rather than > only in one case... > > Martin Maechler > > > DM> Duncan Murdoch > > > > > DM> so I try it here. sory for the > >> lengthy explanation but it seems unavoidable. to quickly see > >> the problem simply copy the litte example below and execute > >> > >> f(n=5) > >> > >> which crashes. called with n != 5 (and of course n>3 since > >> there are 3 parameters in the model...) everything is as it > >> should be. > >> > >> in detail: > >> I stumbled over the follwing _very_ strange behaviour/error > >> when using `nls' which I'm tempted (despite the implied > >> "dangers") to call a bug: > >> > >> I've written a driver for `nls' which allows specifying the > >> model and the data vectors using arbitrary symbols. these > >> are internally mapped to consistent names, which poses a > >> slight complication when using `deriv' to provide analytic > >> derivatives. the following fragment gives the idea: > >> > >> #- > >> f <- function(n = 4) { > >> > >> x <- seq(0, 5, length = n) > >> > >> y <- 2 * exp(-1*x) + 2; > >> y <- rnorm(y,y, 0.01*y) > >> > >> model <- y ~ a * exp (-b*x) + c > >> > >> fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x")) > >> > >> #"standard" call of nls: > >> res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1)) > >> > >> call.fitfunc <- > >> c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), > as.name("x")) > >> call.fitfunc <- as.call(call.fitfunc) > >> frml <- as.formula("y ~ eval(call.fitfunc)") > >> > >> #"computed" call of nls: > >> res2 <- nls(frml, start = c(a=1, b=1, c=1)) > >> > >> list(res1 = res1, res2 = res2) > >> } > >> #- > >> > >> the argument `n' defines the number of (simulated) data > >> points x/y which are going to be fitted by some model ( here > >> y ~ a*exp(-b*x)+c ) > >> > >> the first call to `nls' is the standard way of calling `nls' > >> when knowing all the variable and parameter names. > >> > >> the second call (yielding `res2') uses a constructed formula > >> in `frml' (which in this example is of course not necessary, > >> but in the general case 'a,b,c,x,y' are not a priori known > >> names). > >> > >> now, here is the problem: the call > >> > >> f(4) > >> > >> runs fine/consistently, as doe
Re: [R] strange `nls' behaviour
On 11/12/2007 9:14 AM, Joerg van den Hoff wrote: > On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote: >> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: >> >I initially thought, this should better be posted to r-devel >> >but alas! no response. >> >> I think the reason there was no response is that your example is too >> complicated. You're doing a lot of strange things (fitfunc as a result >> of deriv, using as.name, as.call, as.formula, etc.) You should simplify > > thanks for the feedback. > > concerning "lot of strange things": OK. I thought the > context might be important ("why, for heaven's sake do you > want to do this!?"), but, then, maybe not. so the easiest > way to trigger a similar (not the identical) behaviour is > something like > > f <- function (n) { >#- >#define n data points for a (hardcoded) model: >#--- >x <- seq(0, 5, length = n) >y <- 2 * exp(-1*x) + 2; >y <- rnorm(y,y, 0.01*y) > >#the model (same as the above hardcoded one): >model <- y ~ a * exp (-b*x) + c > >#call nls as usual: >res1 <- try(nls(model, start = c(a=2, b=1, c=2))) > >#call it a bit differently: >res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2)) > >list(res1 = res1, res2 = res2) >#- > } I'd say the problem is relying on the default for the envir parameter to eval. It's reasonable to expect nls to set things up so that terms in the model formula are taken from the right place, but when your model formula is playing with evaluation, you should be very careful to make sure the evaluation takes place in the right context. The default for envir is "parent.frame()", and that can't be right: that will see local variables in whatever function called it, so if one of them has a variable called "model", you'll subscript it. If I were trying to do what you're doing, I would construct the formula before the call to nls, and pass that. I.e. something like the following silly code: model2 <- model model2[[3]] <- model[[3]] # The eval() is implicit res2 <- nls(model2, start = c(a=2, b=1, c=2)) If you really want to put eval() in a formula, then I can't see how you could avoid an explicit specification of the envir parameter. So I'd call this a bug in your code. Duncan Murdoch > this is without all the overhead of my first example and now > (since not quite the same) the problem arises at n == 3 > where the fit can't really procede (there are also 3 > parameters -- the first example was more relevant in this > respect) but anyway the second nls-call does not procede > beyond the parsing phase of `model.frame'. > > so, I can't see room for a real bug in my code, but very > well a chance that I misuse `nls' (i.e. not understanding > what really is tolerable for the `model' argument of `nls'). > but if the latter is not the case, I would think there is a > bug in `nls' (or, actually, rather in `model.frame' or > whatever) when parsing the nls call. > > >> it down to isolate the bug. Thats a lot of work, but you're the one in >> the best position to do it. I'd say there's at least an even chance >> that the bug is in your code rather than in nls(). >> >> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it >> turns out to be an R bug. > > if need be, I'll do that (if I get it compiled under > macosX). but assuming that you have R-patched installed > anyway, I would appreciate if you would copy the new example > above or the old one below to your R- prompt and see, > whether it crashes with the same error message if called > with the special number of data points (3 for the new, 5 for > the old example)? and if so, maybe bring this to the > attention of d. bates? > > > j. van den hoff >> >> Duncan Murdoch >> > >> >> >> >> so I try it here. sory for the >> >lengthy explanation but it seems unavoidable. to quickly see >> >the problem simply copy the litte example below and execute >> > >> >f(n=5) >> > >> >which crashes. called with n != 5 (and of course n>3 since >> >there are 3 parameters in the model...) everything is as it >> >should be. >> > >> >in detail: >> >I stumbled over the follwing _very_ strange behaviour/error >> >when using `nls' which I'm tempted (despite the implied >> >"dangers") to call a bug: >> > >> >I've written a driver for `nls' which allows specifying the >> >model and the data vectors using arbitrary symbols. these >> >are internally mapped to consistent names, which poses a >> >slight complication when using `deriv' to provide analytic >> >derivatives. the following fragment gives the idea: >> > >> >#- >> >f <- function(n = 4) { >> > >> > x <- seq(0, 5, length = n) >> > >> > y <- 2 * exp(-1
Re: [R] strange `nls' behaviour
On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote: > On 11/12/2007 9:14 AM, Joerg van den Hoff wrote: > >On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote: > >>On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > >>>I initially thought, this should better be posted to r-devel > >>>but alas! no response. > >> > >>I think the reason there was no response is that your example is too > >>complicated. You're doing a lot of strange things (fitfunc as a result > >>of deriv, using as.name, as.call, as.formula, etc.) You should simplify > > > >thanks for the feedback. > > > >concerning "lot of strange things": OK. I thought the > >context might be important ("why, for heaven's sake do you > >want to do this!?"), but, then, maybe not. so the easiest > >way to trigger a similar (not the identical) behaviour is > >something like > > > >f <- function (n) { > > #- > > #define n data points for a (hardcoded) model: > > #--- > > x <- seq(0, 5, length = n) > > y <- 2 * exp(-1*x) + 2; > > y <- rnorm(y,y, 0.01*y) > > > > #the model (same as the above hardcoded one): > > model <- y ~ a * exp (-b*x) + c > > > > #call nls as usual: > > res1 <- try(nls(model, start = c(a=2, b=1, c=2))) > > > > #call it a bit differently: > > res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2)) > > > > list(res1 = res1, res2 = res2) > > #- > >} > > I'd say the problem is relying on the default for the envir parameter to > eval. It's reasonable to expect nls to set things up so that terms in > the model formula are taken from the right place, but when your model > formula is playing with evaluation, you should be very careful to make > sure the evaluation takes place in the right context. agreed. > > The default for envir is "parent.frame()", and that can't be right: that > will see local variables in whatever function called it, so if one of > them has a variable called "model", you'll subscript it. for one, in my actual application, I _do_ specify envir (and remember the above was really not the original situation where the subsetting of `model' is actually done only in the `deriv' call). second, if I'm not missing something, doing the evaluation in parent.frame() is fine in the above example and does not explain the error, whose triggering solely depends on the number of data points used. > > If I were trying to do what you're doing, I would construct the formula > before the call to nls, and pass that. I.e. something like the > following silly code: > > model2 <- model > model2[[3]] <- model[[3]] # The eval() is implicit > res2 <- nls(model2, start = c(a=2, b=1, c=2)) I was trying that (more or less) in my original example, I think, but I will reexamine my application and see, whether I can bypass the problem somehow along these lines. > > If you really want to put eval() in a formula, then I can't see how you > could avoid an explicit specification of the envir parameter. So I'd > call this a bug in your code. as I said, this seems right in principle (still, if the call does happen at some well defined place such as a small function local to a user visible one, the eval without envir might be quite OK) but not w.r.t. explaining the nls crash. katherine mullen seems to have located the exact spot in `nls' where something goes wrong. so, for now I think martin might be right ("bug in my thinking" by assuming I'm allowed to use eval in this place), but otherwise it is a property (or bug) of `nls'. joerg van den hoff > > Duncan Murdoch > > > >this is without all the overhead of my first example and now > >(since not quite the same) the problem arises at n == 3 > >where the fit can't really procede (there are also 3 > >parameters -- the first example was more relevant in this > >respect) but anyway the second nls-call does not procede > >beyond the parsing phase of `model.frame'. > > > >so, I can't see room for a real bug in my code, but very > >well a chance that I misuse `nls' (i.e. not understanding > >what really is tolerable for the `model' argument of `nls'). > >but if the latter is not the case, I would think there is a > >bug in `nls' (or, actually, rather in `model.frame' or > >whatever) when parsing the nls call. > > > > > > > >>it down to isolate the bug. Thats a lot of work, but you're the one in > >>the best position to do it. I'd say there's at least an even chance > >>that the bug is in your code rather than in nls(). > >> > >>And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it > >>turns out to be an R bug. > > > >if need be, I'll do that (if I get it compiled under > >macosX). but assuming that you have R-patched installed > >anyway, I would appreciate if you would copy the new example > >ab
Re: [R] strange `nls' behaviour
On 12/11/2007 2:56 PM, Joerg van den Hoff wrote: > On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote: >> On 11/12/2007 9:14 AM, Joerg van den Hoff wrote: >>> On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote: On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > I initially thought, this should better be posted to r-devel > but alas! no response. I think the reason there was no response is that your example is too complicated. You're doing a lot of strange things (fitfunc as a result of deriv, using as.name, as.call, as.formula, etc.) You should simplify >>> thanks for the feedback. >>> >>> concerning "lot of strange things": OK. I thought the >>> context might be important ("why, for heaven's sake do you >>> want to do this!?"), but, then, maybe not. so the easiest >>> way to trigger a similar (not the identical) behaviour is >>> something like >>> >>> f <- function (n) { >>> #- >>> #define n data points for a (hardcoded) model: >>> #--- >>> x <- seq(0, 5, length = n) >>> y <- 2 * exp(-1*x) + 2; >>> y <- rnorm(y,y, 0.01*y) >>> >>> #the model (same as the above hardcoded one): >>> model <- y ~ a * exp (-b*x) + c >>> >>> #call nls as usual: >>> res1 <- try(nls(model, start = c(a=2, b=1, c=2))) >>> >>> #call it a bit differently: >>> res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2)) >>> >>> list(res1 = res1, res2 = res2) >>> #- >>> } >> I'd say the problem is relying on the default for the envir parameter to >> eval. It's reasonable to expect nls to set things up so that terms in >> the model formula are taken from the right place, but when your model >> formula is playing with evaluation, you should be very careful to make >> sure the evaluation takes place in the right context. > > agreed. > >> The default for envir is "parent.frame()", and that can't be right: that >> will see local variables in whatever function called it, so if one of >> them has a variable called "model", you'll subscript it. > > for one, in my actual application, I _do_ specify envir (and > remember the above was really not the original situation > where the subsetting of `model' is actually done only in the > `deriv' call). second, if I'm not missing something, doing > the evaluation in parent.frame() is fine in the above > example and does not explain the error, whose triggering > solely depends on the number of data points used. You are missing something. parent.frame() will be evaluated by eval(), so it refers to whatever function calls eval(). You don't know what function that will be, because it's buried deep within nls(). Perhaps you think that the formula y ~ eval(model[[3]]) is the same as the original one? It's not. Try printing it: > y ~ eval(model[[3]]) y ~ eval(model[[3]]) The eval doesn't get called at this point, it gets called for every step of the least squares minimization. Who knows what parent.frame() means then? >> If I were trying to do what you're doing, I would construct the formula >> before the call to nls, and pass that. I.e. something like the >> following silly code: >> >> model2 <- model >> model2[[3]] <- model[[3]] # The eval() is implicit >> res2 <- nls(model2, start = c(a=2, b=1, c=2)) > > > I was trying that (more or less) in my original example, > I think, but I will reexamine my application and see, whether I can bypass > the problem somehow along these lines. > >> If you really want to put eval() in a formula, then I can't see how you >> could avoid an explicit specification of the envir parameter. So I'd >> call this a bug in your code. > > > as I said, this seems right in principle (still, if the call > does happen at some well defined place such as a small > function local to a user visible one, the eval without envir > might be quite OK) but not w.r.t. explaining the nls crash. No, it's not okay. Your formula is looking in places it shouldn't. It makes absolutely no sense for your formula to depend on what function evaluates it. It should only depend on the data that is available in the data argument to nls() or visible in the current environment. > katherine mullen seems to have located the exact spot in > `nls' where something goes wrong. so, for now I think martin > might be right ("bug in my thinking" by assuming I'm allowed > to use eval in this place), but otherwise it is a property > (or bug) of `nls'. She may have discovered where the error call was triggered, but things went wrong in this example when you mis-used eval(). If you think your original code used it properly, then simplify that down to a minimal example that triggers the "bug". Don't put things in the example unless they are necessary to trigger the error. For example: - drop the wrapper function f. - f
Re: [R] strange `nls' behaviour
On Mon, Nov 12, 2007 at 06:59:56PM -0500, Duncan Murdoch wrote: > On 12/11/2007 2:56 PM, Joerg van den Hoff wrote: > >On Mon, Nov 12, 2007 at 11:09:21AM -0500, Duncan Murdoch wrote: > >>On 11/12/2007 9:14 AM, Joerg van den Hoff wrote: > >>>On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote: > On 11/12/2007 6:51 AM, Joerg van den Hoff wrote: > >I initially thought, this should better be posted to r-devel > >but alas! no response. > I think the reason there was no response is that your example is too > complicated. You're doing a lot of strange things (fitfunc as a result > of deriv, using as.name, as.call, as.formula, etc.) You should > simplify > >>>thanks for the feedback. > >>> > >>>concerning "lot of strange things": OK. I thought the > >>>context might be important ("why, for heaven's sake do you > >>>want to do this!?"), but, then, maybe not. so the easiest > >>>way to trigger a similar (not the identical) behaviour is > >>>something like > >>> > >>>f <- function (n) { > >>> #- > >>> #define n data points for a (hardcoded) model: > >>> #--- > >>> x <- seq(0, 5, length = n) > >>> y <- 2 * exp(-1*x) + 2; > >>> y <- rnorm(y,y, 0.01*y) > >>> > >>> #the model (same as the above hardcoded one): > >>> model <- y ~ a * exp (-b*x) + c > >>> > >>> #call nls as usual: > >>> res1 <- try(nls(model, start = c(a=2, b=1, c=2))) > >>> > >>> #call it a bit differently: > >>> res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2)) > >>> > >>> list(res1 = res1, res2 = res2) > >>> #- > >>>} > >>I'd say the problem is relying on the default for the envir parameter to > >>eval. It's reasonable to expect nls to set things up so that terms in > >>the model formula are taken from the right place, but when your model > >>formula is playing with evaluation, you should be very careful to make > >>sure the evaluation takes place in the right context. > > > >agreed. > > > >>The default for envir is "parent.frame()", and that can't be right: that > >>will see local variables in whatever function called it, so if one of > >>them has a variable called "model", you'll subscript it. > > > >for one, in my actual application, I _do_ specify envir (and > >remember the above was really not the original situation > >where the subsetting of `model' is actually done only in the > >`deriv' call). second, if I'm not missing something, doing > >the evaluation in parent.frame() is fine in the above > >example and does not explain the error, whose triggering > >solely depends on the number of data points used. > > You are missing something. parent.frame() will be evaluated by eval(), > so it refers to whatever function calls eval(). You don't know what > function that will be, because it's buried deep within nls(). thanks for taking the time. I see. will remember to be even more careful with `eval' in the future. > > Perhaps you think that the formula > > y ~ eval(model[[3]]) > > is the same as the original one? It's not. Try printing it: no, I understand roughly what lazy evaluation means :-) > > > y ~ eval(model[[3]]) > y ~ eval(model[[3]]) > > The eval doesn't get called at this point, it gets called for every step > of the least squares minimization. Who knows what parent.frame() means > then? > > >>If I were trying to do what you're doing, I would construct the formula > >>before the call to nls, and pass that. I.e. something like the > >>following silly code: > >> > >>model2 <- model > >>model2[[3]] <- model[[3]] # The eval() is implicit > >>res2 <- nls(model2, start = c(a=2, b=1, c=2)) > > > > > >I was trying that (more or less) in my original example, > >I think, but I will reexamine my application and see, whether I can bypass > >the problem somehow along these lines. > > > >>If you really want to put eval() in a formula, then I can't see how you > >>could avoid an explicit specification of the envir parameter. So I'd > >>call this a bug in your code. > > > > > >as I said, this seems right in principle (still, if the call > >does happen at some well defined place such as a small > >function local to a user visible one, the eval without envir > >might be quite OK) but not w.r.t. explaining the nls crash. > > No, it's not okay. Your formula is looking in places it shouldn't. It > makes absolutely no sense for your formula to depend on what function > evaluates it. It should only depend on the data that is available in > the data argument to nls() or visible in the current environment. yes, accepted, as above. > > >katherine mullen seems to have located the exact spot in > >`nls' where something goes wrong. so, for now I think martin > >might be right ("bug in my thinking" by assuming I'm allowed > >to use eval in this place), but otherwise it is a property > >(or