On Fri, 7 Dec 2007, Duncan Murdoch wrote:

> On 12/7/2007 8:10 AM, Peter Dalgaard wrote:
>> Ben Bolker wrote:
>>> At this point I'd just like to advertise the "bbmle" package
>>> (on CRAN) for those who respectfully disagree, as I do, with Peter over
>>> this issue. I have added a data= argument to my version
>>> of the function that allows other variables to be passed
>>> to the objective function. It seems to me that this is perfectly
>>> in line with the way that other modeling functions in R
>>> behave.
>>>
>> This is at least cleaner than abusing the "fixed" argument. As you know,
>> I have reservations, one of which is that it is not a given that I want
>> it to behave just like other modeling functions, e.g. a likelihood
>> function might refer to more than one data set, and/or data that are not
>> structured in the traditional data frame format. The design needs more
>> thought than just adding arguments.
>
> We should allow more general things to be passed as data arguments in
> cases where it makes sense. For example a list with names or an
> environment would be a reasonable way to pass data that doesn't fit into
> a data frame.
>
>> I still prefer a design based a plain likelihood function. Then we can
>> discuss how to construct such a function so that the data are
>> incorporated in a flexible way. There are many ways to do this, I've
>> shown one, here's another:
>>
>>> f <- function(lambda) -sum(dpois(x, lambda, log=T))
>>> d <- data.frame(x=rpois(10000, 12.34))
>>> environment(f)<-evalq(environment(),d)
>
> We really need to expand as.environment, so that it can convert data
> frames into environments. You should be able to say:
>
> environment(f) <- as.environment(d)
>
> and get the same result as
>
> environment(f)<-evalq(environment(),d)
>
> But I'd prefer to avoid the necessity for users to manipulate the
> environment of a function. I think the pattern
>
> model( f, data=d )

For working at the level of a general likelihood I think it is better to
encourage the approach of defining likelihood constructor functions. The
problem with using f, data is that you need to match the names used in f
and in data, so either you have to explicitly write out f with the names
you have in data or you have to modify data to use the names f likes --
in the running example think

    f <- function(lambda) -sum(dpois(x, lambda, log=TRUE))
    d <- data.frame(y=rpois(10000, 12.34))

somebody has to connect up the x in f with the y in d. With a negative
log likelihood constructor defined, for example, as

    makePoissonNegLogLikelihood <- function(x)
        function(lambda) -sum(dpois(x, lambda, log=TRUE))

this happens naturally with

    makePoissonNegLogLikelihood(d$y)

>
> being implemented internally as
>
> environment(f) <- as.environment(d, parent = environment(f))
>
> is very nice and general. It makes things like cross-validation,
> bootstrapping, etc. conceptually cleaner: keep the same
> formula/function f, but manipulate the data and see what happens.
> It does have problems when d is an environment that already has a
> parent, but I think a reasonable meaning in that case would be to copy
> its contents into a new environment with the new parent set.

Both (simple) bootstrapping and (simple leave-one-out) cross-validation
require a data structure with a notion of cases, which is much more
restrictive than the context in which mle can be used. A more generic
approach to bootstrapping that might fit closer to the level of
generality of mle might be parameterized in terms of a negative log
likelihood constructor, a starting value constructor, and a resampling
function, with a single iteration implemented something like

    mleboot1 <- function(nllmaker, start, resample) {
        newdata <- resample()                 # list of arguments for the constructors
        newstart <- do.call(start, newdata)
        nllfun <- do.call(nllmaker, newdata)
        mle(nllfun, start = newstart)
    }

This would leave decisions on the resampling method and data structure
up to the user.
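
A minimal runnable sketch of that single bootstrap iteration, applied to the Poisson running example, might look as follows. The helper names poissonStart and makeResampler, the sample size, and the number of replicates are assumptions made purely for illustration; only mle() from stats4 and base R functions are relied on.

```r
library(stats4)  # provides mle()

# Negative log likelihood constructor: the data are captured in the closure,
# so no name matching between a function body and a data frame is needed.
makePoissonNegLogLikelihood <- function(x)
    function(lambda) -sum(dpois(x, lambda, log = TRUE))

# Starting value constructor (hypothetical helper): use the sample mean.
poissonStart <- function(x) list(lambda = mean(x))

# Resampling function constructor (hypothetical helper): a simple
# case-resampling bootstrap. The returned function yields a named list
# whose names match the arguments of the two constructors above.
makeResampler <- function(x)
    function() list(x = sample(x, replace = TRUE))

# One bootstrap iteration, parameterized as described in the text.
mleboot1 <- function(nllmaker, start, resample) {
    newdata <- resample()
    newstart <- do.call(start, newdata)
    nllfun <- do.call(nllmaker, newdata)
    mle(nllfun, start = newstart)
}

set.seed(1)
d <- data.frame(y = rpois(1000, 12.34))
resampler <- makeResampler(d$y)

# Repeat the iteration a few times and collect the estimates of lambda.
fits <- replicate(20,
    coef(mleboot1(makePoissonNegLogLikelihood, poissonStart, resampler)))
sd(fits)  # rough bootstrap standard error of the estimate
```

Replacing makeResampler with, say, a parametric or block resampler would change the bootstrap scheme without touching mleboot1.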
Something similar could be done with K-fold CV.

luke

>
> Duncan Murdoch
>
>
>>> mle(f, start=list(lambda=10))
>>
>> Call:
>> mle(minuslogl = f, start = list(lambda = 10))
>>
>> Coefficients:
>>  lambda
>> 12.3402
>>
>> It is not at all an unlikely design to have mle() as a generic function
>> which works on many kinds of objects, the default method being
>> function(object,...) mle(minuslogl(obj)) and minuslogl is an extractor
>> function returning (tada!) the negative log likelihood function.
>>> (My version also has a cool formula interface and other
>>> bells and whistles, and I would love to get feedback from other
>>> useRs about it.)
>>>
>>> cheers
>>> Ben Bolker
>>>
>>>
>>
>>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu