


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On 
> Behalf Of [EMAIL PROTECTED]
> Sent: 30 August 2005 22:58
> To: Douglas Theobald
> Cc: [email protected]
> Subject: Re: [ccp4bb]: maximum likelihood question
 
> Yes, but given an expectation value and a variance, Maximum Entropy
> methods say that the only unbiased error distribution is Gaussian.
> So, although it's never explicitly stated in Least-Squares, the fact
> that you consider only an expectation and a variance (and the lack
> of correlation), in a sense, actually already nails down the error
> model (if you want to remain unbiased, that is). If the errors were
> non-Normal then you can still apply the method and get good results,
> but I'm not sure that they are optimal: they don't have the highest
> likelihood and are certainly not the most probable.

I couldn't let this pass without commenting that the last sentence needs
to be qualified: it's certainly true if the Gauss-Markov conditions
(i.e. fixed weight matrix proportional to inverse of var-covar matrix of
observations) are satisfied (and we're assuming throughout that the
errors are non-normal).  However there's nothing in the G-M theorem
which says that the G-M conditions have to apply!  It just means that if
they don't then the resultant parameters don't have the properties of
unbiasedness and minimum variance (that is, minimum w.r.t. variations in
the weights) that are guaranteed by the theorem if the conditions do
apply.  There seems to be a widespread belief that you can't vary the
weights in LS: you can; it's just that you then don't get unbiased,
minimum-variance results, but those properties are not necessarily
desirable anyway (e.g. see the article on bias in Wikipedia:
http://en.wikipedia.org/wiki/Bias_%28statistics%29#The_sometimes-good_kind ).
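To make the minimum-variance part of the G-M theorem concrete, here is a
minimal simulation sketch (my own illustration, with arbitrary made-up
variances, not anyone's refinement code): with known heteroscedastic
errors, inverse-variance weights give a slope estimator with smaller
variance than uniform weights, while both estimators stay unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 2000
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])       # intercept + slope design
sigma = np.linspace(0.1, 2.0, n)           # known, heteroscedastic sd's

def wls_slope(y, w):
    """Solve the weighted LS problem by scaling rows by sqrt(w)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return beta[1]

slopes_uniform, slopes_gm = [], []
for _ in range(reps):
    y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)     # true slope = 2
    slopes_uniform.append(wls_slope(y, np.ones(n)))     # ordinary LS
    slopes_gm.append(wls_slope(y, 1.0 / sigma**2))      # G-M weights
```

Both sets of slopes scatter around the true value of 2, but the
inverse-variance-weighted estimates scatter much less, which is exactly
what the theorem guarantees.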

Under specified conditions in which the weights used in a LS iteration
are functionally dependent on the parameters from the previous iteration
(unless the errors are normal, in which case there is no dependence and
the weights remain fixed), and under fairly general conditions on the
form of the error distribution (which you have to know), the LS
iterations in fact converge to the ML solution.  So in that case LS
certainly does give the highest likelihood (i.e. identical to ML).  This
is the basis of the IRLS (Iteratively Reweighted Least Squares)
algorithm widely used in GLM (Generalised Linear Modelling).  Strictly
speaking this is really ML, because you have to know the form of the
error distribution; it's just that the mechanics of forming and solving
the equations for the parameter updates are pure LS (or Newton-Raphson,
to be precise).
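A toy sketch of the idea (my own illustration, using Laplace errors as
the assumed non-normal distribution, not any particular GLM package):
the ML solution for Laplace errors minimises the sum of absolute
residuals, and IRLS reaches it by solving a weighted LS problem at each
step with weights 1/|r_i| computed from the previous iterate's
residuals, exactly the parameter-dependent weighting described above.

```python
import numpy as np

def irls_laplace(X, y, n_iter=100, eps=1e-8):
    """IRLS for regression with Laplace (double-exponential) errors.

    Each iteration is an ordinary weighted LS solve; only the weights
    change, recomputed from the residuals of the previous iterate.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # ordinary LS start
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)         # reweight from last fit
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return beta

# 1-D location example: the Laplace-ML estimate is the median, so the
# iterations are pulled to 3 rather than the outlier-inflated mean (22).
X = np.ones((5, 1))
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
b = irls_laplace(X, y)
```

Note that if the errors really were normal the optimal weights would be
constant, the reweighting step would do nothing, and this would collapse
back to ordinary fixed-weight LS, as stated above.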

Also it seems to me that nowhere here is there any assumption (implicit
or otherwise) of normal errors, in fact quite the contrary: there's an
explicit assumption of non-normal errors!

I raised this point originally because I was trying to answer the
question: what happens if you know for sure that the errors are
non-normal but you don't know the form of the PDF sufficiently
accurately to do ML?  It seemed to me that to say that LS assumes a
normal distribution is not helpful in such a situation.  This is not a
theoretical question: e.g. I know for sure that B factors don't have a
normal distribution (if the measured B is 10, the probability that the
true B is <=0 is precisely zero, but Pr(B>=20) is clearly finite, so the
distribution is strongly skewed & can't be normal).  However I don't
have the faintest idea what the error distribution is!
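The skewness argument is easy to check numerically. A small sketch (the
sd of 5 is an arbitrary illustrative value, not a claim about real
B-factor errors): a normal model centred on a measured B of 10
necessarily assigns a finite, symmetric probability to B <= 0 and
B >= 20, whereas physically Pr(B <= 0) must be exactly zero.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Standard closed-form normal CDF via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sd = 10.0, 5.0                 # measured B = 10, illustrative sd
p_le_0 = normal_cdf(0.0, mu, sd)           # ~0.023: finite, but should be 0
p_ge_20 = 1.0 - normal_cdf(20.0, mu, sd)   # equal by symmetry of the normal
```

Any normal model forces these two tail probabilities to be equal, which
is precisely what the physical constraint B > 0 rules out; hence the
true error distribution must be skewed.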

-- Ian
