On Fri, 7 Dec 2007, Gabor Grothendieck wrote:

On Dec 7, 2007 8:10 AM, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
Ben Bolker wrote:
  At this point I'd just like to advertise the "bbmle" package
(on CRAN) for those who respectfully disagree, as I do, with Peter over
this issue.  I have added a data= argument to my version
of the function that allows other variables to be passed
to the objective function.  It seems to me that this is perfectly
in line with the way that other modeling functions in R
behave.

This is at least cleaner than abusing the "fixed" argument. As you know,
I have reservations, one of which is that it is not a given that I want
it to behave just like other modeling functions, e.g. a likelihood
function might refer to more than one data set, and/or data that are not
structured in the traditional data frame format. The design needs more
thought than just adding arguments.

I still prefer a design based a plain likelihood function. Then we can
discuss how to construct such a function so that  the data are
incorporated in a flexible way.  There are many ways to do this, I've
shown one, here's another:

f <- function(lambda) -sum(dpois(x, lambda, log=T))
d <- data.frame(x=rpois(10000, 12.34))
environment(f)<-evalq(environment(),d)
mle(f, start=list(lambda=10))

Call:
mle(minuslogl = f, start = list(lambda = 10))

Coefficients:
 lambda
12.3402


The explicit environment manipulation is what I was referring to but

I make extensive use of lexical scoping in my programming and I NEVER
use explicit environment manipulaiton--for me that is unreadable, and it
is not amenable to checking using things like codetools.  In the
example above all that is needed is to define x directly, e.g.

    > f <- function(lambda) -sum(dpois(x, lambda, log=T))
    > x <- rpois(10000, 12.34)
    > mle(f, start=list(lambda=10))

    Call:
    mle(minuslogl = f, start = list(lambda = 10))

    Coefficients:
    lambda
    12.337

It isn't necessary to go through the data frame or environment
munging.  If you want to be able to work with likelihoods for several
data sets at once then you can either use diferent names for the
variables, like

    x1 <- rpois(10000, 12.34)
    f1 <- function(lambda) -sum(dpois(x1, lambda, log=T))
    x2 <- rpois(10000, 12.34)
    f2 <- function(lambda) -sum(dpois(x2, lambda, log=T))

If you are concerned that x, x1, x2 might have been redefined if you
come back to f1, f2 later (not an issue with typical useage inside a
function but can be an issue at top level) then you can create a
closure that cpatures the particular data set you are using.  The
clean way to do this is with a function that creates the negative log
likelihood, e.g.

    makePoisonNegLogLikelihood <- function(x)
        function(lambda) -sum(dpois(x, lambda, log=T))

Then you can do

    f <- makePoisonNegLogLikelihood(rpois(10000, 12.34))
    mle(f, start=list(lambda=10))

which I find much cleaner and easier to understand than environment
munging.  Once you are defining a likelihood constructor you can think
about things like making it a bit more efficient by calculating
sufficient statistics once, for example

    makePoisonNegLogLikelihood <- function(x) {
        sumX <- sum(x)
        n <- length(x)
        function(lambda) -dpois(sumX, n * lambda, log=T)
    }

Best,

luke

we can simplify it using proto.  Create a proto object to hold
f and x then pass the f in the proto object (rather than the
original f) to mle.  That works because proto automatically resets
the environment of f when its added to avoiding the evalq.

set.seed(1)
library(proto)
f <- function(lambda) -sum(dpois(x, lambda, log=TRUE))
p <- proto(f = f, x = rpois(100, 12.34))
mle(p[["f"]], start = list(lambda = 10))

Call:
mle(minuslogl = p[["f"]], start = list(lambda = 10))

Coefficients:
 lambda
12.46000

It is not at all an unlikely design to have mle() as a generic function
which works on many kinds of objects, the default method being
function(object,...) mle(minuslogl(obj)) and minuslogl is an extractor
function returning (tada!) the negative log likelihood function.
  (My version also has a cool formula interface and other
bells and whistles, and I would love to get feedback from other
useRs about it.)

   cheers
    Ben Bolker




--

  O__  ---- Peter Dalgaard             ?ster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:      [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to