Hi,

With my limited understanding of the subject, I can quickly relate this problem to survival analysis. My experience is mainly in reliability analysis. I guess it involves two main tasks:

1. Identifying the significant segments or categories, i.e. which distinct categories you have depending on the different predictors. You can use either a decision tree approach or a stepwise regression approach.

2. After identifying the different clusters, doing the survival analysis within each cluster to get the average life span of a customer for each category.
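As a rough illustration of task 2, here is a minimal sketch of a Kaplan-Meier survival estimate computed separately for each category. It is only one possible way to do the per-cluster survival analysis. The function names, the record layout (segment, months, cancelled) and the use of the median as the "typical" life span are my own assumptions, and grouping by a ready-made segment label stands in for whatever clusters task 1 produces.

```python
from collections import defaultdict


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival curve.

    durations: months each customer has been observed
    observed:  True if the customer cancelled at that time,
               False if still subscribed (right-censored)
    Returns a list of (month, survival probability) pairs, one per
    month in which at least one cancellation occurred.
    """
    events = defaultdict(int)   # cancellations at each month
    leaving = defaultdict(int)  # customers leaving the risk set at each month
    for t, cancelled in zip(durations, observed):
        leaving[t] += 1
        if cancelled:
            events[t] += 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving month t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Cancelled and censored customers both drop out of the risk set
        at_risk -= leaving[t]
    return curve


def median_lifetime(curve):
    """First month at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def curves_by_segment(records):
    """records: iterable of (segment, months, cancelled) tuples (hypothetical layout)."""
    grouped = defaultdict(lambda: ([], []))
    for segment, months, cancelled in records:
        grouped[segment][0].append(months)
        grouped[segment][1].append(cancelled)
    return {seg: kaplan_meier(d, o) for seg, (d, o) in grouped.items()}
```

Customers who are still subscribed count toward the risk set up to the month they were last observed, but never count as cancellations. If more than half of a segment is still subscribed at the end of the data, the median cannot be estimated and `median_lifetime` returns None.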
Sharad Gupta
GE-Capital

[EMAIL PROTECTED] (AJ) wrote in message news:<[EMAIL PROTECTED]>...
> I was wondering if anyone could help me with an interesting problem.
> I am trying to forecast customer life span for a set of data.
>
> Basically, we have 8 years of data and thousands of rows regarding a
> subscription service. The three raw variables are as follows.
>
> a) Starting date of subscription
> b) Cancellation date of subscription
> c) Demographic segment that a customer belongs to. We have 66
> categorical values such as 01, 02, etc. These segments are given to
> us by an outside firm that basically appends a segment to a customer's
> data based on variables such as what kind of car a customer drives,
> how educated she is, or how much she earns, etc.
>
> I am interested in predicting the number of months a customer would
> stay with the product. I was thinking I could use the following
> variables in my regression model.
>
> Dependent Variable: NumberOfMonths (derived from taking the
> difference between the starting and ending date of subscription for
> both cancelled customers and customers who are still with us)
> Independent Variables
> a) Status (whether a customer has cancelled (0) or is still with us (1))
> b) Demographic Segment
>
> Questions:
> Q1) Is it ok to calculate the "NumberOfMonths" variable from the starting
> and ending date of subscription? The reason I ask this is that for
> customers who have not cancelled subscription yet, it will only result
> in a number that will be the same whether they are still with us. Of
> course this information (cancellation of subscription) will
> simultaneously be captured in the "Status" independent variable (0 or 1).
>
> Q2) I don't know how to use the "Demographic Segment" independent
> variable since there are 66 different numeric codes for these
> segments. Should I use 65 (=66-1) dummy variables?
> Because if I do use 65 dummy variables, my regression equation may not
> only be extremely long, but also potentially meaningless (dealing with
> so many variables).
>
> Q3) What extra information do you think I may need in order to create
> this model?
>
> Q4) Should I use the starting year as well in my model?
>
> Forecasting customer life span for a subscription service seems to be
> a common business problem and I was wondering if anyone had any
> canned solutions or could provide me with pointers.
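On Q1 in the quoted message: for a survival analysis the duration and the cancellation indicator are normally kept as a pair, with customers who are still subscribed measured up to the end of the data and flagged as censored. The sketch below is one possible way to derive that pair from the two dates. The function names, the `cutoff` parameter (the last date covered by the data) and the whole-month rounding rule are assumptions of mine, not something stated in the original question. Note also that the flag here is True for cancelled, the reverse of the 0/1 coding of "Status" above.

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        # The last month is not yet complete
        months -= 1
    return months


def to_survival_row(start, cancelled_on, cutoff):
    """Return (months, cancelled) for one customer.

    start:        subscription start date
    cancelled_on: cancellation date, or None if still subscribed
    cutoff:       last date covered by the data (assumed known)
    """
    if cancelled_on is not None and cancelled_on <= cutoff:
        return months_between(start, cancelled_on), True
    # Still subscribed at the cutoff: the true lifetime is at least this long
    return months_between(start, cutoff), False
```

For example, `to_survival_row(date(2000, 1, 15), None, date(2001, 1, 15))` gives `(12, False)`: twelve months observed so far, with no cancellation yet.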
