On 12 Dec 2002 21:52:18 -0800, [EMAIL PROTECTED] (Sapsi) wrote:

> Hi,
> I have a sample of 40,000 observations grouped into 2 response groups
> (responders = 1000). Number of explanatory variables = 55.
> While using PROC LOGISTIC, I get possible quasi-complete separation.
> When I drop to a subset of 50 this problem goes; again, a different set
> of 50 brings back this problem!
> 0) Why does this problem happen?

Given the number of variables, and the number of cases to classify
(1000), your successes don't leave any cases classified wrong --
whether through good prediction, or through capitalizing on chance.

> 1) Is there a way to detect which variables cause this problem?

You can run the analogous multiple regression problem and use its
diagnostics for some purposes -- but here, you just have too much
success; unfortunately, it can screw up your prediction equation.

But -- why do you have 50 variables? If a few are near perfect,
why not use them in a SIMPLE, ARBITRARY prediction, and then
predict the residuals? For "simple", I am thinking of something
like, "How many do you have, of the 6 warning signs for heart
disease?"

What is the problem? Either you are achieving fine prediction,
and you just need to simplify your scoring; or you have a terrible
excess of variables, and you need to simplify the logic by
narrowing the candidates. For instance, you might discard all the
variables that are not among the 10 best candidates [previously
in the literature].

And see what I posted in sci.stat.consult earlier today, under
"Re: Stepwise Procedure in SAS ---- any demerits ?"

> 2) How can i get rid of it?
>

See above. See my stats-FAQ for comments and references on
stepwise. Stepwise is so often a bad idea.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
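
As a rough illustration of two ideas from the reply above, here is a
minimal Python sketch (not SAS, and not anything PROC LOGISTIC does
internally). The first function checks whether a single predictor, on
its own, separates the two response groups completely or
quasi-completely; the second builds a simple "how many warning signs"
count. The function names, the column names, and the cutoffs are all
hypothetical, chosen only for the example. Note the limit of the
screen: with 55 predictors, separation may come from a linear
combination of several variables, and a one-variable-at-a-time check
will not find that.

```python
from typing import Dict, Sequence


def separation_status(x: Sequence[float], y: Sequence[int]) -> str:
    """Classify how one predictor x splits a 0/1 outcome y.

    "complete"       -- the two groups' ranges do not touch at all
    "quasi-complete" -- the ranges touch only at one boundary value
    "none"           -- the ranges overlap
    A constant column is reported as "quasi-complete"; such columns
    carry no information and are best dropped beforehand.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome groups must be non-empty")
    # Gap between the groups, trying both orderings (0 below 1, 1 below 0).
    gap = max(min(x1) - max(x0), min(x0) - max(x1))
    if gap > 0:
        return "complete"
    if gap == 0:
        return "quasi-complete"
    return "none"


def screen_predictors(columns: Dict[str, Sequence[float]],
                      y: Sequence[int]) -> Dict[str, str]:
    """Return only the predictors that separate y on their own."""
    flagged = {}
    for name, values in columns.items():
        status = separation_status(values, y)
        if status != "none":
            flagged[name] = status
    return flagged


def warning_sign_count(row: Dict[str, float],
                       cutoffs: Dict[str, float]) -> int:
    """Simple, arbitrary score: how many variables exceed their cutoff."""
    return sum(1 for name, cut in cutoffs.items() if row[name] > cut)


if __name__ == "__main__":
    # Tiny made-up data set, for illustration only.
    y = [0, 0, 0, 1, 1]
    columns = {
        "a": [1, 2, 3, 10, 11],   # groups do not touch
        "b": [1, 2, 3, 3, 4],     # groups touch at the value 3
        "c": [1, 5, 2, 4, 3],     # groups overlap
    }
    print(screen_predictors(columns, y))

    cutoffs = {"a": 5, "b": 2, "c": 2}   # hypothetical thresholds
    row = {"a": 10, "b": 3, "c": 1}
    print(warning_sign_count(row, cutoffs))
```

Variables flagged by the screen are candidates for the "simple,
arbitrary" score rather than for the logistic model itself; the count
can then be used as a single predictor, or as a first-stage rule whose
residuals are modelled afterwards.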
