l.ln <- function(ln.m1, ln.m2, ln.b){
    m1 <- exp(ln.m1); m2 <- exp(ln.m2); b <- exp(ln.b)
    lglk <- ( d*( ln.m1 + ln.m2 + log1p(-exp(-(b + m2)*t)) )
              + (m1/b - d)*log(m2 + b*exp(-(b + m2)*t))
              + m1*t - m1/b*log(b + m2) )
    -sum(lglk)
} # NOT TESTED
I don't know if I have this correct, but you should get the idea. Parameterizing in terms of logarithms automatically eliminates the constraints that m1, m2, and b must be positive.
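To make the idea concrete, here is a sketch of how the log-scale parameterization plugs into an unconstrained optimizer: minimize over (ln.m1, ln.m2, ln.b) with no bounds, then exp() the answer to recover m1, m2, b. The data d and t are assumed to be in your workspace; the synthetic d and t below are made up purely so the snippet runs on its own, and the starting values are illustrative, not recommendations.

```r
set.seed(1)
t <- rexp(50, rate = 0.01)      # illustrative event times (fake data)
d <- rbinom(50, 1, 0.7)         # illustrative event indicators (fake data)

# Negative log-likelihood as a function of one parameter vector
# p = c(ln.m1, ln.m2, ln.b), which is what optim()/nlm() expect.
neg.lglk <- function(p){
    m1 <- exp(p[1]); m2 <- exp(p[2]); b <- exp(p[3])
    lglk <- ( d*( p[1] + p[2] + log1p(-exp(-(b + m2)*t)) )
              + (m1/b - d)*log(m2 + b*exp(-(b + m2)*t))
              + m1*t - m1/b*log(b + m2) )
    -sum(lglk)
}

fit <- optim(log(c(0.001, 0.001, 1e-6)), neg.lglk)
exp(fit$par)                    # back-transform to m1, m2, b
```

Because every parameter is exp()'d inside the function, the optimizer can wander over all of R^3 and the likelihood still only ever sees positive m1, m2, b — which is exactly why no lower/upper bounds are needed.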
I also prefer to play with the function until I'm reasonably confident it will never produce NAs, and I use a few tricks to preserve numerical precision where I can. For example, log(b+m2) = log(b) + log1p(m2/b) = log(m2) + log1p(b/m2). If you use the first form when b is larger and the second when m2 is larger, you should get more accurate answers. Using, e.g.:
log.apb <- function(log.a, log.b){
    min.ab <- pmin(log.a, log.b)
    max.ab <- pmax(log.a, log.b)
    max.ab + log1p(exp(min.ab - max.ab))
} # NOT TESTED
If log.a and log.b are both very large, a and b can overflow to Inf even though log.a and log.b are finite. The same problem occurs in the other direction, when log.a and log.b are so far negative that a and b both underflow to numerical 0, even though log.a and log.b are perfectly finite. Computing log(a+b) this way avoids both problems.
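A quick check of log.apb on values where the naive computation fails in both directions (the function is restated here so the snippet runs on its own):

```r
log.apb <- function(log.a, log.b){
    min.ab <- pmin(log.a, log.b)
    max.ab <- pmax(log.a, log.b)
    max.ab + log1p(exp(min.ab - max.ab))
}

log(exp(710) + exp(709))     # Inf: exp(710) overflows double precision
log.apb(710, 709)            # finite: 710 + log1p(exp(-1)), about 710.31

log(exp(-800) + exp(-801))   # -Inf: both terms underflow to 0
log.apb(-800, -801)          # finite: -800 + log1p(exp(-1))
```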
Also, have you tried plotting your "l" vs. m1 with m2 and b constant, then vs. m2 with m1 and b constant, and vs. b with m1 and m2 constant? Or (better) make contour plots of "l" vs. any 2 of these parameters with the third held constant. When I've done this in crudely similar situations, I've typically found that the log(likelihood) was more nearly parabolic in terms of ln.m1, ln.m2, and ln.b than in terms of the untransformed variables. This means that the traditional Wald confidence region procedures are more accurate, as they assume that the log(likelihood) is parabolic in the parameters estimated.
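One way to make such a contour plot over (ln.m1, ln.m2) with ln.b held fixed, using outer() and contour(). The grid limits and the fixed ln.b below are hypothetical, and the fake d and t are only there so the snippet runs on its own; substitute your real data and a grid that brackets your brute-force optimum.

```r
set.seed(1)
t <- rexp(50, rate = 0.01)      # illustrative event times (fake data)
d <- rbinom(50, 1, 0.7)         # illustrative event indicators (fake data)

# Restating l.ln from above so this snippet is self-contained:
l.ln <- function(ln.m1, ln.m2, ln.b){
    m1 <- exp(ln.m1); m2 <- exp(ln.m2); b <- exp(ln.b)
    -sum( d*( ln.m1 + ln.m2 + log1p(-exp(-(b + m2)*t)) )
          + (m1/b - d)*log(m2 + b*exp(-(b + m2)*t))
          + m1*t - m1/b*log(b + m2) )
}

ln.m1.g <- seq(log(1e-4), log(1e-2), length = 31)   # hypothetical limits
ln.m2.g <- seq(log(1e-4), log(1e-2), length = 31)
z <- outer(ln.m1.g, ln.m2.g,
           Vectorize(function(a, b2) l.ln(a, b2, log(1e-6))))
contour(ln.m1.g, ln.m2.g, z, xlab = "ln(m1)", ylab = "ln(m2)")
```

If the contours look like nested ellipses, the log-scale parabolic approximation (and hence Wald intervals on the log scale) is reasonable; banana-shaped contours are a warning sign.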
hope this helps. spencer graves
p.s. I suggest you avoid using "t" as a variable name: that's the name of the function for the transpose of a matrix. R can usually, though not always, tell from the context what you want, but it's best to avoid the ambiguity. I often test variable names I want to use at a command prompt; if the response is "object not found", then I feel like I can use it.
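An equivalent programmatic check is exists(), which reports whether a name is already bound somewhere on the search path. The name "surv.t" below is just a made-up example of an unused name:

```r
exists("t")       # TRUE: base R already defines t() (matrix transpose)
exists("surv.t")  # FALSE, assuming you haven't defined it yourself
```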
boshao zhang wrote:
Hi, everyone
I am trying to estimate 3 parameters for my survival function. It's very complicated. The negative loglikelihood function is:
l <- function(m1, m2, b)
    -sum( d*( log(m1) + log(m2) + log(1 - exp(-(b + m2)*t)) )
          + (m1/b - d)*log(m2 + b*exp(-(b + m2)*t))
          + m1*t - m1/b*log(b + m2) )
here d and t are given; "sum" means sum over these two variables. The parameters are assumed to be small: m1 and m2 on the order of thousandths, b on the order of millionths.
I used the function "nlm" to estimate m1, m2, and b, but the results are very bad: you get more than 50 warnings, most of them about negative infinity in log. The results are also dependent on the initial values, and for some starting values you get nothing at all.
So I tried brute force, i.e., evaluating the function at grid points. That works well; you also get the correct answer for log(1e-12).
My questions are: what is the precision of a variable in R? How do I specify constraint intervals for the parameters in nlm? I tried lower and upper, but they don't work. Any advice on MLE is appreciated.
Thank you.
Boshao
______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html