John,
thanks for the comments, very useful.
Just three short additions specific to the ZINB case:
1. zeroinfl() uses BFGS (by default) and not Newton-Raphson
because we have analytical gradients but not an analytical Hessian.
2. If the default starting values do not work, zeroinfl() offers EM
estimation of the starting values (which is typically followed by
a single iteration of BFGS only). EM is usually much more robust
but slower, hence it's not the default in zeroinfl().
3. I pointed the original poster to this information but he still
insisted on Newton-Raphson for no obvious reason. As I didn't
want to WTFM again on the list, I stopped using up bandwidth.
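
For the archives, a minimal sketch of points 1 and 2. It assumes the pscl package is installed and uses its bundled bioChemists data purely for illustration (not the original poster's data or model); the formula is likewise just an example.

```r
# Sketch only: assumes pscl is installed. bioChemists ships with pscl and
# stands in for the real data; the model formula is illustrative.
if (requireNamespace("pscl", quietly = TRUE)) {
  data("bioChemists", package = "pscl")

  # Default: BFGS via optim(), started from the default starting values.
  fit_bfgs <- pscl::zeroinfl(art ~ fem + mar + kid5 + phd + ment | 1,
                             data = bioChemists, dist = "negbin")

  # Fallback if the defaults fail: EM estimation of the starting values,
  # followed by BFGS. Slower, but usually more robust.
  fit_em <- pscl::zeroinfl(art ~ fem + mar + kid5 + phd + ment | 1,
                           data = bioChemists, dist = "negbin",
                           control = pscl::zeroinfl.control(method = "BFGS",
                                                            EM = TRUE))

  # Compare the maximized log-likelihoods of the two fits.
  print(c(default = as.numeric(logLik(fit_bfgs)),
          EM      = as.numeric(logLik(fit_em))))
}
```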
thx,
Z
On Tue, 22 Jun 2010, Prof. John C Nash wrote:
I have not included the previous postings because they came out very
strangely on my mail reader. However, the question concerned the choice of
minimizer for the zeroinfl() function, which apparently allows any of the
current 6 methods of optim() for this purpose. The original poster wanted to
use Newton-Raphson.
Newton-Raphson (or just Newton for simplicity) is commonly thought to be the
"best" way to approach optimization problems. I've had several people ask me
why the optimx() package (see OptimizeR project on r-forge -- probably soon
on CRAN, we're just tidying up) does not have such a choice. Since the
question comes up fairly frequently, here is a response. I caution that it is
based on my experience and others may get different mileage.
My reasons for being cautious about Newton are as follows:
1) Newton's method needs a number of safeguards to avoid singular or
indefinite Hessian issues. These can be tricky to implement well and to do so
in a way that does not hinder the progress of the optimization.
2) One needs both gradient and Hessian information, and it needs to be
accurate. Numerical approximations are slow and often inadequate for tough
problems.
3) Far from a solution, Newton is often not very good, likely because the
function is not like a nice quadratic over the whole space, so the local
Hessian is a poor guide to where the minimum lies.
Newton does VERY well at converging when it has a "close enough" start. If
you can find an operationally useful way to generate such starts, you deserve
awards like the Fields.
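
A toy illustration of the "close enough" point, as a sketch rather than anything from optimx: unsafeguarded Newton on the one-dimensional function f(x) = sqrt(1 + x^2), whose minimum is at x = 0. The function and the names grad, hess and newton are chosen here for illustration only. For this f the Newton update simplifies algebraically to x <- -x^3, so the iteration converges from |x0| < 1 and diverges from |x0| > 1, while BFGS from optim() copes with the far start.

```r
# Pure (unsafeguarded) Newton for minimizing f(x) = sqrt(1 + x^2).
f    <- function(x) sqrt(1 + x^2)
grad <- function(x) x / sqrt(1 + x^2)      # f'(x)
hess <- function(x) (1 + x^2)^(-3/2)       # f''(x), always positive

newton <- function(x0, iters = 5) {
  x <- x0
  for (i in seq_len(iters)) {
    x <- x - grad(x) / hess(x)             # algebraically equal to -x^3
  }
  x
}

near <- newton(0.5)   # start inside the region of convergence
far  <- newton(1.5)   # start outside it: the iterates blow up

# BFGS from the same "far" start, using the same analytic gradient.
bfgs <- optim(1.5, f, gr = grad, method = "BFGS")

print(c(newton_near = near, newton_far = far, bfgs_far = bfgs$par))
```

The Hessian here is positive everywhere, so the failure is not a singular or indefinite Hessian; the quadratic model is simply a poor description of the function away from the minimum.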
We have in our optimx work (Ravi Varadhan and I) developed a prototype
safeguarded Newton. As yet we have not included it in optimx(), but probably
will do so in a later version after we figure out what advice to give on
where it is appropriate to apply it.
In the meantime, I would suggest that BFGS or L-BFGS-B are the closest
options in optim() and generally perform quite well. There are updates to
BFGS and CG on CRAN in the form of Rvmmin and Rcgmin which are all-R
implementations with box constraints too. UCMINF is a very similar
implementation of the unconstrained algorithm that seems to have the details
done rather well -- though BFGS in optim() is based on my work, I actually
find UCMINF often does better. There's also nlm and nlminb.
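
A minimal sketch of trying the base R options mentioned above on the same problem. The Rosenbrock test function and its gradient are used only as a stand-in objective, and the names fr, grr and start are illustrative.

```r
# Rosenbrock test function and its analytic gradient (minimum at c(1, 1)).
fr  <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                      200 * (x[2] - x[1]^2))
start <- c(-1.2, 1)

# Same start, same objective, four different minimizers from base R.
fits <- list(
  BFGS       = optim(start, fr, grr, method = "BFGS")$par,
  `L-BFGS-B` = optim(start, fr, grr, method = "L-BFGS-B")$par,
  nlminb     = nlminb(start, fr, gradient = grr)$par,
  nlm        = nlm(fr, start)$estimate   # nlm() uses numerical derivatives here
)

# One row of parameter estimates per method.
print(do.call(rbind, fits))
```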
Via optimx() one can call these, and also some other minimizers, or even
"all.methods", though that is meant for learning about methods rather than
solving individual problems.
JN
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.