Dear Ravi, Thank you so much for the help. I switched to using the optimx function but I continue to use the spg method (for the most part) because I found that only spg consistently converges give different datasets. I also decided to use AIC rather that a likelihood ratio test.
I have a new question. I would like to construct 95% confidence intervals for the parameter estimates from the best model. From a previous R Help thread, you said that it was a bad idea to use the Hessian matrix computed from optim/optimx or hessian() [numDeriv package] when the optimization is constrained and parameter estimates are on the boundary because the MLE is likely at a local minimum: http://tolstoy.newcastle.edu.au/R/e15/help/11/09/6673.html In the same thread, you suggest using the Hessian matrix from augmented Lagrangian optimization with auglag() [alabama package] (with some caveats). I would like to construct 95% CI with auglag, but I don't understand how to write the inequality constraints (hin) function. Could you please help me write the hin function? Below is R code for my MLE problem, using data that results in parameter estimates on the boundary, and my unsuccessful attempt at auglag() optimization. Note: I have gradient functions for NLL1 and NLL2 but they're very large and don't seem to improve optimization, so they are not included here. I can supply them if needed for the auglag() function. Also, I am running R 2.15.2 on a Windows 7 64-bit machine. ########################################################################## library(optimx) library(alabama) # define multinomial distribution dmnom2 <- function(x,prob,log=FALSE) { r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1)) if (log) r else exp(r) } # data frame y y <- structure(list(t = c(0.167, 0.5, 1, 12, 18, 24, 36), n1 = c(1L, 1L, 1L, 8L, 9L, 10L, 12L), n2 = c(0L, 1L, 2L, 6L, 5L, 3L, 2L), n3 = c(13L, 12L, 11L, 0L, 0L, 1L, 0L)), .Names = c("t", "n1", "n2", "n3"), class = "data.frame", row.names = 36:42) # Negative log-likelihood functions NLL1 <- function(par, y) { p1 <- par[1] p2 <- p1 mu1 <- par[2] mu2 <- mu1 t <- y$t n1 <- y$n1 n2 <- y$n2 n3 <- y$n3 P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) + mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu2* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))) P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 + mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu1* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))) P3 <- 1 - P1 - P2 p.all <- c(P1, P2, P3) if(all(p.all > 0 & p.all < 1)) -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE)) else 1e07 } NLL2 <- function(par, y) { p1 <- par[1] p2 <- par[2] mu1 <- par[3] mu2 <- par[4] t <- y$t n1 <- y$n1 n2 <- y$n2 n3 <- y$n3 P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) + mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu2* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))) P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 + mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu1* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))) P3 <- 1 - P1 - P2 p.all <- c(P1, P2, P3) if(all(p.all > 0 & p.all < 1)) -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE)) else 1e07 } # Optimization with optimx() function op1 <- optimx(par = c(0.1, 0.1), fn = NLL1, y = y, method = c("spg"), lower = rep(0.0001, 2), control = list(all.methods=FALSE, ftol = 1e-20, maxit = 5000)) op2 <- optimx(par = c(0.1, 0.1, 0.1, 0.1), fn = NLL2, y = y, method = c("spg"), lower = rep(0.0001, 4), control = list(all.methods=FALSE, maxit = 5000, ftol = 1e-20)) # AIC corrected for small sample sizes aicc <- function(k, L, n){ # k is a vector of parameter estimates, L is the negative log-likelihood, and n is the sample size kl <- length(k) aic <- 2*kl + 2*L aicc <- aic + ((2*kl*(kl + 1))/(n - kl - 1)) aicc } # AICc values for NLL models 1 and 2 aic1 <- aicc(op1$par[[1]], op1$fvalues[[1]], sum(y$n1[1], y$n2[1], y$n3[1])) aic2 <- aicc(op2$par[[1]], op2$fvalues[[1]], sum(y$n1[1], y$n2[1], y$n3[1])) # Model 2 is best, but one parameter is on the boundary (mu1 = 0.0001) # Augmented Lagrangian optimization hin <- function(x){ h <- rep(NA, 4) h[1] <- 0.0001 + x[1] h[2] <- 0.0001 + x[2] h[3] <- 0.0001 + x[3] h[4] <- 0.0001 + x[4] h } lag.op <- auglag(par = c(0.1, 0.1, 0.1, 0.1), fn = NLL2, hin = hin, y = y) ######################################################################## Any suggestions would be much appreciated. Thanks in advance for any help. Adam Zeilinger On 10/11/2012 2:36 PM, Ravi Varadhan wrote: > Adam, > > See the attached R code that solves your problem and beyond. One important > issue is that you are enforcing constraints only indirectly. You need to > make sure that P1, P2, and P3 (which are functions of original parameters and > time) are all between 0 and 1. It is not enough to impose constraints on > original parameters p1, p2, mu1 and mu2. > > I also show you how to do a likelihood ratio test with the results from spg. > You can also do the same for nlminb or Rvmmin. > > Finally, I also show you how to use optimx to compare different algorithms. > This shows you that in addition to spg, you also get very good results (as > seen from objective function values and KKT conditions) with a number of > other algorithms (e.g., Rvmmin, nlminb), many of which are much faster than > spg. > > This example illustrates the utility of optimx(). I am always surprised as > to why more R users doing optimization are not using optimx. This is a very > powerful for benchmarking unconstrained and box-constrained optimization > problems. It deserves to be used widely, in my biased, but correct, opinion. > > Ravi > > Ravi Varadhan, Ph.D. > Assistant Professor > The Center on Aging and Health > Division of Geriatric Medicine & Gerontology > Johns Hopkins University > rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> > 410-502-2619 > -- Adam Zeilinger Post Doctoral Scholar Department of Entomology University of California Riverside www.linkedin.com/in/adamzeilinger [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.