AJ wrote in news:[EMAIL PROTECTED]:
> Thanks for your suggestions folks. I made some progress.
>
> In my dataset PRODUCT:
> Y - dependent variable (NUMBER_OF_MONTHS- it can be any
> positive integer)
> SEGMENT - the categorical independent variable (takes values
> 01, 02, ..., 60)
> STATUS - the indicator for censoring: 1 = censored; 0 = uncensored.
> PURCHASE_DT - the date when customer purchased the product.
> CANCEL_DT - the date when customer canceled the product and
> missing-value if customer has not canceled yet.
> CANCEL - same as CANCEL_DT if we have it; if CANCEL_DT is missing
> we give it today's date.
> *The censored obs are all RIGHT-CENSORED*
>
>
> /* My sas code */
> DATA PRODUCT;
> SET PRODUCT;
> IF CANCEL_DT EQ . THEN CANCEL=TODAY(); ELSE CANCEL = CANCEL_DT;
> IF CANCEL_DT EQ . THEN STATUS=1; ELSE STATUS=0; /* Status=0 means
> CANCELLED, 1 means NOT CANCELLED*/
> Y = INTCK('MONTH' , PURCHASE_DT, CANCEL);
> FORMAT CANCEL DATE9. ;
> if Y ge 3;
> RUN;
>
> PROC LIFEREG data=PRODUCT;
> CLASS SEGMENT;
> MODEL Y*STATUS(1)=SEGMENT / DIST= weibull;
> OUTPUT OUT=PROD_OUT P=PREDICTED_Y;
> RUN;
> QUIT;
> /**********/
>
> My questions are:
>
> 1. How to decide which distribution to use. I tried exponential,
> weibull, normal etc.
>
Do some research on diagnostic plots for survival analysis. There are
worked SAS examples here:
http://www.ats.ucla.edu/stat/sas/examples/asa/asa8.htm
I found that site just by searching on "lifereg p=".
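
As one possible starting point for those diagnostic plots, here is a minimal
sketch using PROC LIFETEST on the PRODUCT dataset described in the quoted
post. It assumes Y and STATUS are coded as stated there (STATUS=1 is
censored); the choice of plots is my suggestion, not something taken from the
UCLA page.

```sas
/* Sketch only: graphical checks of distributional form.            */
/* Assumes PRODUCT has Y (months) and STATUS (1 = censored) as      */
/* described in the quoted post.                                    */
PROC LIFETEST DATA=PRODUCT PLOTS=(S, LS, LLS) NOTABLE;
   TIME Y*STATUS(1);   /* same censoring convention as the LIFEREG call */
RUN;
```

The LS plot shows -log(S(t)) against t; if it is roughly a straight line
through the origin, an exponential model is plausible. The LLS plot shows
log(-log(S(t))) against log(t); if it is roughly straight, a Weibull model is
plausible. NOTABLE just suppresses the very long product-limit table you
would otherwise get with 110,000 obs.
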
> 2. About 75% of my obs. are censored (there are in total 110,000 obs
> in my dataset-PRODUCT)
>
> 3. The PREDICTED_Y values are really huge numbers like 170, 200 etc.,
> well above what I expected. I suspect this is due to the large number
> of right-censored obs in my dataset. I have heard that heavy
> censoring can lead to highly extrapolated predictions. Is there a way
> to handle such censoring problems? Also, is it really a problem, or
> is it OK to have this kind of situation?
>
With an N of 110,000, why would a predicted value of 200 surprise you?
Under a constant hazard that is an event rate of only about 0.35-0.5%/month
(ln 2/200 if the prediction is a median, 1/200 if it is a mean), and you
said that only 25% of your obs have cancelled. You ought to be doing
preliminary descriptive runs so that you can compare the predictions to
the actual numbers.
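
A minimal sketch of what such descriptive runs might look like, assuming the
PRODUCT dataset and the Y / STATUS coding from the quoted post (STATUS=0 is a
cancellation). The output column names are made up for illustration.

```sas
/* Censoring breakdown: cancelled (STATUS=0) vs still active (STATUS=1) */
PROC FREQ DATA=PRODUCT;
   TABLES STATUS;
RUN;

/* Observed months by status, for comparison against PREDICTED_Y */
PROC MEANS DATA=PRODUCT N MEAN MEDIAN MAX;
   CLASS STATUS;
   VAR Y;
RUN;

/* Crude cancellation rate: events per customer-month of follow-up. */
/* Under a constant hazard, customer-months per event is the        */
/* implied mean time to cancellation.                               */
PROC SQL;
   SELECT SUM(STATUS = 0)          AS N_CANCELLED,
          SUM(Y)                   AS CUSTOMER_MONTHS,
          SUM(STATUS = 0) / SUM(Y) AS CRUDE_RATE,
          SUM(Y) / SUM(STATUS = 0) AS IMPLIED_MEAN_MONTHS
   FROM PRODUCT;
QUIT;
```

If IMPLIED_MEAN_MONTHS comes out in the same range as the PREDICTED_Y values,
the model is only telling you what the raw counts already say.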
--
David Winsemius
If the statistics are boring, then you've got the wrong numbers.
-Edward Tufte