AJ wrote in news:[EMAIL PROTECTED]:

> Thanks for your suggestions folks.  I made some progress.  
> 
> In my dataset PRODUCT:
> Y           - dependent variable (NUMBER_OF_MONTHS; it can be any
>               positive integer)
> SEGMENT     - the categorical independent variable (takes values
>               01, 02, ..., 60)
> STATUS      - the censoring indicator: 1 = censored; 0 = uncensored.
> PURCHASE_DT - the date when the customer purchased the product.
> CANCEL_DT   - the date when the customer canceled the product;
>               missing if the customer has not canceled yet.
> CANCEL      - same as CANCEL_DT when it is present; if it is missing,
>               we give it today's date.
> *The censored obs are all RIGHT-CENSORED* 
> 
> 
> /* My sas code */
> DATA PRODUCT;
>   SET PRODUCT;
>   IF CANCEL_DT EQ . THEN CANCEL=TODAY(); ELSE CANCEL = CANCEL_DT;
>   /* STATUS=1 means NOT CANCELLED (censored); STATUS=0 means CANCELLED */
>   IF CANCEL_DT EQ . THEN STATUS=1; ELSE STATUS=0;
>   Y = INTCK('MONTH' , PURCHASE_DT, CANCEL);
>   FORMAT CANCEL DATE9. ;
>   if Y ge 3;
> RUN;
> 
> PROC LIFEREG data=PRODUCT;
>   CLASS SEGMENT;
>   MODEL Y*STATUS(1)=SEGMENT / DIST= weibull;
>   OUTPUT OUT=PROD_OUT P=PREDICTED_Y;
> RUN;
> QUIT;
> /**********/
> 
> My questions are:
> 
> 1. How do I decide which distribution to use? I tried exponential,
> Weibull, normal, etc.
> 
Do some research on diagnostic plots for survival analysis. There are 
worked SAS examples here:
http://www.ats.ucla.edu/stat/sas/examples/asa/asa8.htm

I found that site just by searching on "lifereg p=".
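
As a rough sketch of what such diagnostic plots might look like for the 
PRODUCT dataset described above (this is my own illustration, not taken 
from the UCLA page, and it deliberately ignores SEGMENT to keep the plots 
readable): PROC LIFETEST can plot -log S(t) against t, which should be 
roughly a straight line through the origin if the exponential fits, and 
log(-log S(t)) against log t, which should be roughly straight if the 
Weibull fits. The intercept-only PROC LIFEREG run with PROBPLOT is an 
assumed simplification of your model, used only to eyeball the fit.

```sas
/* Kaplan-Meier based diagnostic plots, pooled over all segments.      */
/* S   = survival curve                                                 */
/* LS  = -log S(t) vs t            (straight line => exponential)       */
/* LLS = log(-log S(t)) vs log t   (straight line => Weibull)           */
PROC LIFETEST DATA=PRODUCT METHOD=KM PLOTS=(S, LS, LLS);
  TIME Y*STATUS(1);          /* STATUS=1 marks censored observations */
RUN;

/* Intercept-only Weibull fit with a probability plot.                  */
/* Dropping SEGMENT here is an assumption made for illustration only.   */
PROC LIFEREG DATA=PRODUCT;
  MODEL Y*STATUS(1) = / DIST=WEIBULL;
  PROBPLOT;
RUN;
```

If the points bend away from the reference line, change DIST= (for 
example to LNORMAL or LLOGISTIC) and compare the plots.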

> 2. About 75% of my obs. are censored (there are in total 110,000 obs
> in my dataset-PRODUCT)
> 
> 3. The PREDICTED_Y values are really huge numbers like 170, 200 etc.,
> which are above what I expected. I suspect this is due to the large
> number of right-censored obs in my dataset. I have heard that heavy
> censoring can lead to highly extrapolated predictions. Is there a way
> to handle such censoring problems? Also, is it really a problem, or
> is it OK to have this kind of situation?
> 
With an N of 110,000, why would a predicted value of 200 surprise you? 
That corresponds to an event rate of only about 0.5%/month (1/200), and 
you said that only 25% of your observations are cancellations. You ought 
to be doing preliminary descriptive runs so that you can compare the 
predictions to actual numbers.

-- 
David Winsemius

If the statistics are boring, then you've got the wrong numbers. 
                          -Edward Tufte
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
