Richard Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> On 9 Apr 2004 13:11:55 -0700, [EMAIL PROTECTED] (Roger Levy) wrote:
> 
> > Richard Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> > > How many do you have in your smaller group?  If you have only
> > > (say) 5 cases, you may be lucky to find anything with *one*  
> > > variable, even though your total N  is 300.  - And, once the cases
> > > are 'predicted' adequately, there is little for your extra variables
> > > to do that won't show up as artifacts of over-fitting.
> > > If you reach perfect prediction, then your likelihood surface
> > > has a hole in it -- Not allowable.  Or, effectively, your predictors
> > > can become collinear, when any predictor can substitute for
> > > some other one:  That makes a flat likelihood surface, where
> > > the SEs  become large because they are measured by the 
> > > steepness of the slope.
>  RL > 
> > By 'cases' I presume you mean distinct covariate vectors?  Sorry, I
> > should have mentioned this -- the number of covariate vectors is on
> > the order of the sample size (i.e., in the hundreds).  So I'm pretty
> > sure that overfitting and collinearity are not really issues here
> > (since I'm not including any interaction terms in the model).
> >
> 
> Now you have confused me, a lot.
> By 'cases in the smaller group', I am using the common metaphor 
> of logistic regression, where the prediction is being made between
> cases and non-cases.

Ah, I think I misunderstood you.  I'm not familiar with the
cases/non-cases terminology of logistic regression -- could you
explain this usage?

> 
> With Ordinary Least Squares (OLS) linear regression, you can 
> have almost as many Predictors as you have total cases, before you 
> get into 'numerical' trouble.  Numerical trouble starts  
> earlier for ML logistic regression -- You do not have much *power*
> for either problem when there is not much 'information' because
> the criterion is a dichotomy with only a few in one group.  But
> you do not get as much warning with present ML  programs for 
> logistic, and they fail earlier.
> 
> Further:  What do you mean by distinct covariate vectors? - which
> are as numerous as the sample size (which I call, Number of total
> cases).  Are you saying, you have as many predictors as the sample N?
> THAT  would be overfitting.

By a "distinct covariate vector" I mean the following: with n
covariates (i.e., predictors) X_1,...,X_n, a covariate vector is a
value [x_1,...,x_n] for a given data point.  So, for example, if I
have a half-dozen binary covariates, there are 2^6=64 logically
possible covariate vectors.
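
To make that concrete, here is a toy sketch in Python (entirely
made-up data, just to illustrate the counting):

    import numpy as np

    # Made-up data: 300 observations on six binary covariates, so at
    # most 2**6 = 64 covariate vectors are logically possible.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(300, 6))

    distinct = {tuple(row) for row in X}
    print(len(distinct), "distinct covariate vectors out of", 2**6, "possible")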

Each of my covariates is three-valued.  The situation where ML and
exact logistic regression gave me substantially different results
involved a half-dozen covariates, i.e. 3^6 = 729 possible covariate
vectors, and only 300 datapoints, so the covariate space was sparsely
populated and in most cases each datapoint had a unique set of
predictor values.  I was not including any interaction terms, so the
model had only seven parameters (six coefficients plus an intercept),
and overfitting is almost certainly not an issue.
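
In case it helps, here is roughly what that setup looks like as a
sketch (Python/statsmodels, with simulated data and made-up
coefficients standing in for my real data):

    import numpy as np
    import statsmodels.api as sm

    # Simulated stand-in for my data: 300 observations on six
    # three-valued covariates (coded 0/1/2), scattered over a
    # 3**6 = 729 cell space.
    rng = np.random.default_rng(1)
    X = rng.integers(0, 3, size=(300, 6)).astype(float)
    beta = np.array([0.8, -0.5, 0.3, 0.0, 0.4, -0.2])  # made-up values
    p = 1.0 / (1.0 + np.exp(-(X @ beta - 1.0)))
    y = rng.binomial(1, p)

    # Ordinary ML fit with no interactions: seven parameters in all
    # (six slopes plus an intercept).
    fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
    print(fit.params)   # ML estimates
    print(fit.bse)      # asymptotic (Wald) standard errors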

So, to restate my confusion: what I don't understand is the technical
reason why the asymptotic ML confidence intervals and p-values for
the parameters should be unreliable in such a situation, given that
the sample size is fairly large in absolute terms.
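
For what it's worth, this is the kind of crude simulation I had in
mind to probe the question myself (again Python/statsmodels, made-up
numbers): make the smaller outcome group tiny even though N = 300,
and see how the nominal 95% Wald intervals behave.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    true_beta, n_sims, results = 1.0, 200, []

    for _ in range(n_sims):
        x = rng.integers(0, 3, size=(300, 1)).astype(float)
        # Intercept of -5 keeps the number of "cases" (y = 1) very small.
        p = 1.0 / (1.0 + np.exp(-(true_beta * x[:, 0] - 5.0)))
        y = rng.binomial(1, p)
        try:
            fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
            lo, hi = fit.conf_int()[1]     # 95% Wald interval for the slope
            results.append(bool(lo <= true_beta <= hi))
        except Exception:
            pass                           # separation / non-convergence

    print("usable fits:", len(results), "of", n_sims)
    print("empirical coverage of nominal 95% Wald CI:", np.mean(results))

If the coverage comes out well below 95%, I take it that would be the
asymptotics failing despite the large total N -- but please correct me
if I'm mis-framing the issue.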

Many thanks for the help.

Best,

Roger