Adam,

Getting the variance of the MLE when the true parameter is on the boundary is 
a very difficult problem.  It is known that the standard bootstrap does not 
work.  There are some subsampling approaches (see the Springer book on 
subsampling by Politis, Romano, and Wolf), but I am not an expert on this.
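
Just to sketch the idea (this is my own rough illustration, not taken from 
that book): for a scalar estimator est() of the parameter of interest, 
m-out-of-n subsampling recomputes the estimate on many subsamples of size 
b << n drawn without replacement, and uses the subsample distribution of 
sqrt(b)*(theta.b - theta.hat) in place of the bootstrap distribution:

subsample.ci <- function(dat, est, b, B = 500, level = 0.95) {
  # dat: data frame; est: function returning a scalar estimate;
  # b: subsample size (b << n); B: number of subsamples
  n <- nrow(dat)
  theta.hat <- est(dat)
  z <- replicate(B, {
    idx <- sample(n, b)   # subsample drawn without replacement
    sqrt(b) * (est(dat[idx, , drop = FALSE]) - theta.hat)
  })
  q <- quantile(z, c((1 + level)/2, (1 - level)/2))
  theta.hat - q / sqrt(n)   # lower and upper CI endpoints
}

The choice of b and of the sqrt(b) scaling is delicate, especially on the 
boundary, so please treat this only as a starting point and consult the book 
for the theory.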

However, an important question is how we know that the truth really is on the 
boundary.  If you can assume that the truth is in the interior, then you can 
use the standard approaches, the observed Hessian or the bootstrap, to get 
standard errors.
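
For example, assuming the truth is interior and par.hat is a placeholder for 
the estimates your optimizer returns for the 4-parameter model, a standard 
Wald-type interval would look something like:

library(numDeriv)
H <- hessian(NLL2, par.hat, y = y)   # observed information (NLL2 is already negated)
se <- sqrt(diag(solve(H)))           # assumes H is positive definite at the solution
cbind(par.hat - 1.96 * se, par.hat + 1.96 * se)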

Best,
Ravi

From: Adam Zeilinger [mailto:ad...@ucr.edu]
Sent: Sunday, December 02, 2012 5:04 PM
To: Ravi Varadhan
Cc: Adam Zeilinger (zeil0...@umn.edu); r-help@r-project.org
Subject: Re: [R] model selection with spg and AIC (or, convert list to fitted model object)

Dear Ravi,

Thank you so much for the help.  I switched to the optimx function, but I 
continue to use the spg method (for the most part) because I found that only 
spg consistently converges given different datasets.  I also decided to use 
AIC rather than a likelihood ratio test.

I have a new question.  I would like to construct 95% confidence intervals for 
the parameter estimates from the best model.  In a previous R-help thread, 
you said that it is a bad idea to use the Hessian matrix computed by 
optim/optimx or by hessian() [numDeriv package] when the optimization is 
constrained and parameter estimates fall on the boundary, since the solution 
is then not an interior local minimum where the usual quadratic approximation 
holds:

http://tolstoy.newcastle.edu.au/R/e15/help/11/09/6673.html

In the same thread, you suggested using the Hessian matrix from augmented 
Lagrangian optimization with auglag() [alabama package] (with some caveats).  I 
would like to construct 95% CIs with auglag(), but I don't understand how to 
write the inequality constraint (hin) function.  Could you please help me 
write the hin function?

Below is R code for my MLE problem, using data that produce parameter 
estimates on the boundary, along with my so-far unsuccessful attempt at 
auglag() optimization.  Note: I have gradient functions for NLL1 and NLL2, but 
they are very long and do not seem to improve the optimization, so they are 
not included here; I can supply them if they are needed for auglag().  Also, I 
am running R 2.15.2 on a Windows 7 64-bit machine.

##########################################################################

library(optimx)
library(alabama)

# define multinomial distribution
dmnom2 <- function(x, prob, log = FALSE) {
  # log-density: lgamma-based version of the multinomial mass function
  r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
  if (log) r else exp(r)
}
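
# Sanity check (illustrative): for a single draw with probabilities summing
# to one, dmnom2() should agree with stats::dmultinom():
# dmnom2(c(1, 2, 3), prob = c(0.2, 0.3, 0.5), log = TRUE)
# dmultinom(c(1, 2, 3), prob = c(0.2, 0.3, 0.5), log = TRUE)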

# data frame y
y <- structure(list(t = c(0.167, 0.5, 1, 12, 18, 24, 36),
                    n1 = c(1L, 1L, 1L, 8L, 9L, 10L, 12L),
                    n2 = c(0L, 1L, 2L, 6L, 5L, 3L, 2L),
                    n3 = c(13L, 12L, 11L, 0L, 0L, 1L, 0L)),
               .Names = c("t", "n1", "n2", "n3"),
               class = "data.frame", row.names = 36:42)

# Negative log-likelihood functions
NLL1 <- function(par, y) {
  p1 <- par[1]
  p2 <- p1
  mu1 <- par[2]
  mu2 <- mu1
  t <- y$t
  n1 <- y$n1
  n2 <- y$n2
  n3 <- y$n3
  P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) +
        mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))) -
        exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)*
        mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) +
        2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu2*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/
        exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))
  P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 +
        mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))) -
        exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)*
        mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) +
        2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu1*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/
        exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))
  P3 <- 1 - P1 - P2
  p.all <- c(P1, P2, P3)
  # penalize parameter values that give probabilities outside (0, 1)
  if (all(p.all > 0 & p.all < 1))
    -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE))
  else 1e07
}

NLL2 <- function(par, y) {
  p1 <- par[1]
  p2 <- par[2]
  mu1 <- par[3]
  mu2 <- par[4]
  t <- y$t
  n1 <- y$n1
  n2 <- y$n2
  n3 <- y$n3
  P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) +
        mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))) -
        exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)*
        mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) +
        2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu2*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/
        exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))
  P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 +
        mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))) -
        exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)*
        mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) +
        2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)*mu1*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))))/
        exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 -
        4*(mu2*p1 + mu1*(mu2 + p2))))*t)/(2*(mu2*p1 + mu1*(mu2 + p2))*
        sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))))
  P3 <- 1 - P1 - P2
  p.all <- c(P1, P2, P3)
  # penalize parameter values that give probabilities outside (0, 1)
  if (all(p.all > 0 & p.all < 1))
    -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE))
  else 1e07
}

# Optimization with optimx() function
op1 <- optimx(par = c(0.1, 0.1),
              fn = NLL1, y = y,
              method = "spg",
              lower = rep(0.0001, 2),
              control = list(all.methods = FALSE,
                             ftol = 1e-20,
                             maxit = 5000))

op2 <- optimx(par = c(0.1, 0.1, 0.1, 0.1),
              fn = NLL2, y = y,
              method = "spg",
              lower = rep(0.0001, 4),
              control = list(all.methods = FALSE,
                             maxit = 5000,
                             ftol = 1e-20))

# AIC corrected for small sample sizes
aicc <- function(k, L, n){
  # k: vector of parameter estimates, L: minimized negative log-likelihood,
  # n: sample size
  kl <- length(k)
  aic <- 2*kl + 2*L
  aicc <- aic + ((2*kl*(kl + 1))/(n - kl - 1))
  aicc
}

# AICc values for NLL models 1 and 2
aic1 <- aicc(op1$par[[1]], op1$fvalues[[1]], sum(y$n1[1], y$n2[1], y$n3[1]))
aic2 <- aicc(op2$par[[1]], op2$fvalues[[1]], sum(y$n1[1], y$n2[1], y$n3[1]))
# Model 2 is best, but one parameter is on the boundary (mu1 = 0.0001)

# Augmented Lagrangian optimization
hin <- function(x){
  # auglag() requires hin(x) >= 0 at feasible points; to match the lower
  # bounds used above (each parameter >= 0.0001) the constraints are:
  x - 0.0001
}

lag.op <- auglag(par = c(0.1, 0.1, 0.1, 0.1),
                 fn = NLL2, hin = hin, y = y)
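
# If auglag() converges, my plan for (tentative) 95% Wald intervals is the
# following; this assumes lag.op$hessian is the Hessian returned by auglag()
# and that it is positive definite, both of which need checking:
# se <- sqrt(diag(solve(lag.op$hessian)))
# cbind(lag.op$par - 1.96 * se, lag.op$par + 1.96 * se)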

########################################################################

Any suggestions would be much appreciated; thank you in advance.

Adam Zeilinger


On 10/11/2012 2:36 PM, Ravi Varadhan wrote:

Adam,



See the attached R code, which solves your problem and more.  One important 
issue is that you are enforcing the constraints only indirectly.  You need to 
make sure that P1, P2, and P3 (which are functions of the original parameters 
and of time) all lie between 0 and 1; it is not enough to impose constraints 
on the original parameters p1, p2, mu1, and mu2.



I also show you how to do a likelihood ratio test with the results from spg.  
You can also do the same for nlminb or Rvmmin.
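
In outline, since model 1 (common p and mu) is nested within model 2, the 
statistic is twice the difference in minimized negative log-likelihoods, 
referred to a chi-squared distribution with 2 degrees of freedom.  A sketch, 
with nll1.min and nll2.min standing for the minimized values from spg:

lrt <- 2 * (nll1.min - nll2.min)
pchisq(lrt, df = 2, lower.tail = FALSE)   # p-value

The usual caveat applies: this chi-squared reference is not valid when 
estimates fall on the boundary.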



Finally, I also show you how to use optimx to compare different algorithms.  
This shows that, in addition to spg, you also get very good results (as judged 
by the objective function values and the KKT conditions) from a number of 
other algorithms (e.g., Rvmmin, nlminb), many of which are much faster than 
spg.
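
For instance, a single call along these lines (a sketch; reuse the starting 
values and bounds from your script) runs every applicable solver and reports 
objective values, KKT checks, and timings side by side:

optimx(par = c(0.1, 0.1, 0.1, 0.1), fn = NLL2, y = y,
       lower = rep(0.0001, 4),
       control = list(all.methods = TRUE, maxit = 5000))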



This example illustrates the utility of optimx().  I am always surprised that 
more R users doing optimization do not use optimx.  It is a very powerful tool 
for benchmarking unconstrained and box-constrained optimization problems.  It 
deserves to be used widely, in my biased, but correct, opinion.



Ravi



Ravi Varadhan, Ph.D.
Assistant Professor
The Center on Aging and Health
Division of Geriatric Medicine & Gerontology
Johns Hopkins University
rvarad...@jhmi.edu
410-502-2619





--

Adam Zeilinger
Post Doctoral Scholar
Department of Entomology
University of California Riverside
www.linkedin.com/in/adamzeilinger


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
