Richard Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> On 9 Apr 2004 13:11:55 -0700, [EMAIL PROTECTED] (Roger Levy) wrote:
>
> > Richard Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> > > How many do you have in your smaller group?  If you have only
> > > (say) 5 cases, you may be lucky to find anything with *one*
> > > variable, even though your total N is 300. - And, once the cases
> > > are 'predicted' adequately, there is little for your extra variables
> > > to do that won't show up as artifacts of over-fitting.
> > > If you reach perfect prediction, then your likelihood surface
> > > has a hole in it -- Not allowable.  Or, effectively, your predictors
> > > can become collinear, when any predictor can substitute for
> > > some other one: That makes a flat likelihood surface, where
> > > the SEs become large because they are measured by the
> > > steepness of the slope.
> RL >
> > By 'cases' I presume you mean distinct covariate vectors?  Sorry, I
> > should have mentioned this -- the number of covariate vectors is on
> > the order of the sample size (i.e., in the hundreds).  So I'm pretty
> > sure that overfitting and collinearity are not really issues here
> > (since I'm not including any interaction terms in the model).
> >
>
> Now you have confused me, a lot.
> By 'cases in the smaller group', I am using the common metaphor
> of logistic regression, where the prediction is being made between
> cases and non-cases.
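(A small aside on the flat-likelihood point quoted further up, mainly to
check my own understanding.  Below is a tiny numerical sketch in Python
with completely made-up, perfectly separated one-predictor data -- nothing
to do with my real dataset.  As the coefficient grows, the log-likelihood
keeps climbing toward zero with no interior maximum, and the observed
information shrinks, so the Wald SE = 1/sqrt(information) grows without
bound.)

import numpy as np

# Made-up one-predictor data with complete separation:
# every positive x is a "case" (y = 1), every negative x a non-case (y = 0).
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def loglik_and_info(beta):
    # Log-likelihood and observed information for the one-parameter
    # (no-intercept) logistic model P(y = 1 | x) = 1 / (1 + exp(-beta * x)).
    lp = beta * x                                    # linear predictor
    loglik = np.sum(y * lp - np.logaddexp(0.0, lp))  # numerically stable form
    p = 1.0 / (1.0 + np.exp(-lp))
    info = np.sum(x**2 * p * (1.0 - p))              # minus d^2(loglik)/d(beta)^2
    return loglik, info

for beta in (1.0, 5.0, 10.0, 20.0):
    ll, info = loglik_and_info(beta)
    print(f"beta = {beta:5.1f}   loglik = {ll:9.4f}   Wald SE = {1.0 / np.sqrt(info):10.2f}")
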
Ah, I think I misunderstood you.  I'm not familiar with the
cases/non-cases terminology of logistic regression -- could you
explain this usage?

> With Ordinary Least Squares (OLS) linear regression, you can
> have almost as many Predictors as you have total cases, before you
> get into 'numerical' trouble.  Numerical trouble starts
> earlier for ML logistic regression -- You do not have much *power*
> for either problem when there is not much 'information' because
> the criterion is a dichotomy with only a few in one group.  But
> you do not get as much warning with present ML programs for
> logistic, and they fail earlier.
>
> Further: What do you mean by distinct covariate vectors? - which
> are as numerous as the sample size (which I call, Number of total
> cases).  Are you saying you have as many predictors as the sample N?
> THAT would be overfitting.

By a "distinct covariate vector" I mean the following: with n
covariates (i.e., predictors) X_1,...,X_n, a covariate vector is a
value [x_1,...,x_n] for a given data point.  So, for example, if I
had a half-dozen binary covariates, there would be 2^6 = 64 logically
possible covariate vectors.

In my actual data each covariate is three-valued.  So the situation
for which ML and exact logistic regression were giving me
substantially different results was one with a half-dozen covariates,
i.e. 3^6 = 729 possible covariate vectors, and 300 datapoints; the
covariate space was therefore sparsely populated, and in most cases
each datapoint had a unique set of predictor values.  I was not
including any interaction terms, so there were only seven parameters
in my model, and overfitting is almost certainly not an issue.

So, to restate my confusion: what I don't understand is the technical
reason why asymptotic ML estimates of parameter confidence intervals
and p-values would be unreliable in such a situation, given that the
sample size is relatively large in absolute terms.

Many thanks for the help.

Best,

Roger
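
P.S.  In case it helps make the question concrete, here is a rough sketch
of how one might check the asymptotic (Wald) intervals by simulation for a
design like mine: six three-valued covariates, 300 datapoints, and a
fairly small 'case' group.  It uses Python with numpy and statsmodels, and
the "true" coefficients in it are invented purely for illustration -- this
is not my actual analysis, just the kind of coverage check I have in mind.

import warnings
import numpy as np
import statsmodels.api as sm

warnings.filterwarnings("ignore")          # silence convergence warnings in the loop
rng = np.random.default_rng(0)

n, k = 300, 6                              # 300 datapoints, six three-valued covariates
beta_true = np.r_[-5.0, 0.5 * np.ones(k)]  # invented intercept + slopes (smallish 'case' group)
z = 1.96                                   # normal quantile for a nominal 95% Wald interval
n_rep = 500

covered = np.zeros(k + 1)
n_ok = 0
for _ in range(n_rep):
    X = sm.add_constant(rng.integers(0, 3, size=(n, k)).astype(float))  # covariates coded 0/1/2
    p = 1.0 / (1.0 + np.exp(-X @ beta_true))
    y = rng.binomial(1, p)
    try:
        res = sm.Logit(y, X).fit(disp=0)
    except Exception:                      # fit failed outright (e.g. separation)
        continue
    if not np.all(np.isfinite(res.bse)) or res.bse.max() > 50:
        continue                           # (quasi-)separation: Wald SEs meaningless
    lo, hi = res.params - z * res.bse, res.params + z * res.bse
    covered += (lo <= beta_true) & (beta_true <= hi)
    n_ok += 1

print(f"usable replicates: {n_ok} of {n_rep}")
print("empirical coverage of the nominal 95% Wald intervals, per parameter:")
print(np.round(covered / n_ok, 3))

If the empirical coverage of the nominal 95% intervals drifts well away
from 95%, that would be a concrete sign that the asymptotics are breaking
down even with N = 300.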
