[R] Prediction with multiple zeros in the dependent variable

John Sorkin Wed, 07 Sep 2005 21:06:48 -0700

I have a batch of data in which each line contains three values:
calcium score, age, and sex. I would like to predict calcium score as a
function of age and sex, i.e. calcium=f(age,sex). Unfortunately, the
calcium scores have a very "ugly" distribution. There are many zeros
and many values between 300 and 600, but no values between zero and
300. Needless to say, the calcium scores are not normally distributed;
however, the values between 300 and 600 follow a log-normal
distribution. As you might imagine, the residuals from an ordinary
linear regression of calcium on age and sex are not normally
distributed, which violates a basic assumption of regression analysis.
Does anyone have a suggestion for a method (or a transformation) that
will allow me to predict calcium from age and sex without violating the
assumptions of the model?
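
For concreteness, data of this shape, together with one candidate
analysis, can be sketched in R as follows. This is only an illustration
under stated assumptions: the data are simulated, and the sample size,
coefficients, and object names (dat, fit_zero, fit_pos) are invented for
the example. The two-part model shown (a logistic regression for zero
versus nonzero, then a linear regression on the log scale for the
nonzero scores) is one possible approach, not a recommendation taken
from this post.

```r
# Simulated stand-in for the real data (all numbers are assumptions).
set.seed(1)
n   <- 500
age <- runif(n, 40, 80)
sex <- factor(sample(c("F", "M"), n, replace = TRUE))

# Assumed mechanism: the chance of a nonzero score depends on age and sex.
p_pos    <- plogis(-4 + 0.06 * age + 0.5 * (sex == "M"))
positive <- rbinom(n, 1, p_pos) == 1

# Nonzero scores are log-normal and centred near 400; they are not
# truncated to the 300-600 range, so a few values may fall outside it.
calcium <- numeric(n)
calcium[positive] <- exp(rnorm(sum(positive),
                               mean = log(400) + 0.002 * (age[positive] - 60),
                               sd   = 0.15))
dat <- data.frame(calcium, age, sex)

# Part 1: logistic regression for zero versus nonzero score.
fit_zero <- glm(I(calcium > 0) ~ age + sex, family = binomial, data = dat)

# Part 2: linear regression of log(score), nonzero scores only.
fit_pos <- lm(log(calcium) ~ age + sex, data = subset(dat, calcium > 0))

# Combine the parts for a new subject:
# E[score] = P(score > 0) * E[score | score > 0],
# using the log-normal mean exp(mu + sigma^2 / 2) for the second factor.
new      <- data.frame(age = 65, sex = factor("M", levels = levels(dat$sex)))
p_hat    <- predict(fit_zero, newdata = new, type = "response")
mu_hat   <- predict(fit_pos, newdata = new)
sigma2   <- summary(fit_pos)$sigma^2
expected <- p_hat * exp(mu_hat + sigma2 / 2)
```

Each part is then checked against its own assumptions: the binomial
model for the zero/nonzero split, and normality of the residuals of
fit_pos on the log scale.
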
Thanks,
John
 
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
 
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
 
410-605-7119 
- NOTE NEW EMAIL ADDRESS:
[EMAIL PROTECTED]


______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
