Have you considered estimating ln.m1, ln.m2, and ln.b, which makes the negative log likelihood something like the following:

l.ln<- function(ln.m1,ln.m2,ln.b){
m1 <- exp(ln.m1); m2 <- exp(ln.m2); b <- exp(ln.b)
lglk <- d*( ln.m1 + ln.m2 + log1p(-exp(-(b+m2)*t)
+ (m1/b-d)*log(m2+b*exp(-b+m2)*t))
+ m1*t - m1/b*log(b+m2) )


  (-sum(lglk))
}
# NOT TESTED

I don't know if I have this correct, but you should get the idea. Parameterizing in terms of logarithms automatically eliminates the constraints that m1, m2, and b must be positive.

I also prefer to play with the function until I'm reasonably confident it will never produce NAs, and I use a few tricks to preserve numerical precision where I can. For example, log(b+m2) = log(b) + log1p(m2/b) = log(m2) + log1p(b/m2). If you use the first form when b is larger and the second when m1 is larger, you should get more accurate answers. Using, e.g.:

          log.apb <- function(log.a, log.b){
                  min.ab <- pmin(log.a, log.b)
                  max.ab <- pmax(log.a, log.b)
                  max.ab + log1p(exp(min.ab-max.ab))
          }
          # NOT TESTED

If log.a and log.b are both really large, a and b could be Inf when log.a and log.b are finite. Computing log(a+b) like this eliminates that problem. The same problem occurs when log.a and log.b are so far negative that a and b are both numerically 0, even though log.a and log.b are very finite. This function eliminates that problem.

Also, have you tried plotting your "l" vs. m1 with m2 and b constant, and then vs. m2 with m2 and b constant and vs. b with m1 and m2 constant? Or (better) make contour plots of "l" vs. any 2 of these parameters with the other held constant. When I've done this in crudely similar situations, I've typically found that the log(likelihood) was more nearly parabolic in terms of ln.m1, ln.m2, and ln.b than in terms of the untransformed variables. This means that the traditional Wald confidence region procedures are more accurate, as they assume that the log(likelihood) is parabolic in the parameters estimated.

hope this helps. spencer graves
p.s. I suggest you avoid using "t" as a variable name: That's the name of the function for the transpose of a matrix. R and usually though not always tell from the context what you want. However, it's best to avoid that ambiguity. I often test at a command prompt variable names I want to use. If the response is "object not found", then I feel like I can use it.


boshao zhang wrote:

Hi, everyone

I am trying to estimate 3 parameters for my survival
function. It's very complicated. The negative
loglikelihood function is:

l<- function(m1,m2,b)  -sum(    d*( log(m1) + log(m2)
+ log(1- exp(-(b + m2)*t)) ) + (m1/b - d)*log(m2 +
b*exp(-(b + m2)*t) ) + m1*t - m1/b*log(b+m2)      )

here d and t are given, "sum" means sum over these
two vairables. the parameters are assumed small, m1, m2 in
thousandth, m2 in millionth.


I used the function "nlm" to estimate m1,m2,b. But the
result is very bad. you can get more than 50 warnings,
most of them are about "negative infinity"in log. And
the results are initial value dependent, or you will
get nothing when you choose some values.

So I tried brutal force, i.e. evaluate the values of
grid point. It works well. Also, you can get the
correct answer of log(1e-12).

My questions are:
What is the precision of a variable in R?
How to specify the constraint interval of parameters
in nlm? I tried lower, upper, it doesn't work.
any advice on MLE is appreciated.

Thank you.

Boshao

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to