Hi the list, I would need some advice on something that looks like a FAQ: the possibility of providing vectors to optim() function.
Here is a stupid and short example summarizing the problem: -------------------------------- example 1 ------------ 8< ---------------------- library(stats4) data <- rnorm(100,0,1) lik1 <- function(m, v, data) { N <- length(data) lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T) lik.var <- dchisq(N*var(data)/v, N-1, log=T) return(-lik.mean - lik.var) } ml.result <- mle(lik1, start=list(m=2, v=2), fixed=list(data=data)) summary(ml.result) --------------------------------------------------------------------------------------- This works perfectly (except that the default algorithm sometimes tries some negative variance parameters). However, if the parameters (m and v) are considered as a vector of parameters, the result is very disappointing: -------------------------------- example 2 ------------ 8< ---------------------- lik2 <- function(param, data) { N <- length(data) lik.mean <- dnorm(mean(data), param["m"], sqrt(param["v"]/N), log=T) lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T) return(-lik.mean - lik.var) } ml.result <- mle(lik2, start=list(param=c(m=2, v=2)), fixed=list(data=data)) --------------------------------------------------------------------------------------- "Error in dnorm(mean(data), param["m"], sqrt(param["v"]/N), log = T) : argument "param" is missing, with no default" One could trust the error message and provide default values, but unfortunately, -------------------------------- example 2b ------------ 8< ---------------------- lik2b <- function(param=c(m=1, v=1), data) { N <- length(data) lik.mean <- dnorm(mean(data), param["m"], sqrt(param["v"]/N), log=T) lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T) return(-lik.mean - lik.var) } ml.result <- mle(lik2b, start=list(param=c(m=2, v=2)), fixed=list(data=data)) --------------------------------------------------------------------------------------- "Error in validObject(.Object) : invalid class "mle" object: invalid object for slot "fullcoef" in class "mle": got class "list", should be or extend class "numeric"" The example above is very stupid, but there are cases where the estimate of vectors of parameters can hardly be avoided. In particular, I'm working with time series models with random effects that has to be estimated at each time point (the parameters underlying the distribution of these random effects are "real" parameters, but the effect itself is a nuisance parameter). Of course, the number of variables to estimate will depend on the size of the data set, and the function is supposed to work for any dataset (convergence is another issue). The archives of r-help are quite clear on the fact that mle (and optim) are not vectorized, dot. I tried to trick mle by defining a likelihood function with a "..." parameter, and converting the c(...) into named parameters afterwards, but it does not work: mle checks the list of parameters against "fullcoef <- formals(minuslogl)". I derived my own mle function to force the call of optim() without the "fool-proof" tests, but it was probably very naive because it crashes as well. The fact is that the code of mle(), as well as the R part of optim(), is full of (not always clean) tricks to ensure that the parameters are exactly as they are supposed to be. I guess the aim of this code is not only to slow down the function and to annoy the user, so I would like to get some feedback before trying to modify mle() and optim() to force them, in one way or another, to turn lists of vectors into lists of numeric values. I'm a bit afraid that the problem goes down to the .Internal optim function, and I will probably give up if it is the case. Thanks for any comment, Arnaud. PS: By the way, the "paranoid parameter tests" in mle / optim lead to an annoying bug: the "minuslogl" function cannot have "computed" default parameters, because all of them must be specified either as "fixed" or "start". -------------------------------- example 3 ------------ 8< ---------------------- lik1b <- function(m, v=var(data), data) { N <- length(data) lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T) lik.var <- dchisq(N*var(data)/v, N-1, log=T) return(-lik.mean - lik.var) } ml.result.1 <- mle(lik1b, start=list(m=2, v=2), fixed=list(data=data)) ml.result.2 <- mle(lik1b, start=list(m=2), fixed=list(data=data)) --------------------------------------------------------------------------------------- The second call to mle crashes "Error in validObject(.Object) : invalid class "mle" object: invalid object for slot "fullcoef" in class "mle": got class "list", should be or extend class "numeric"" I don't see any reason for that: a call to lik1b(m=0, data=data) returns the expected value. The bug can be bypassed by using a default such as v=NULL, and then if(is.null(v)) v <- var(data) , but it does not seem very elegant... ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.