On  5-May-2004, [EMAIL PROTECTED] (AJ) wrote:

> I am trying to forecast customer life span for a set of data.
>
> Basically, we have 8 years data and thousands of rows regarding a
> subscription service. Three raw variables are as follows.
>
> a) Starting Date of subscription
> b) Cancellation Date of subscription
> c) Demographic Segments that a customer belongs to. We have 66
> categorical values such as 01, 02..etc. These segments are given to
> us by an outside firm that basically appends a segment to a customer
> data based on variables such as what kind of car a customer drives,
> how much she is educated, or how much she earns etc.
>
> I am interested in predicting the number of months a customer would
> stay with the product. I was thinking I could use the following
> variables in my regression model.

This is a good example of a data mining problem that could be handled well
by a decision tree (regression tree).  Unlike classical (numeric function)
regression, where your categorical variables have to be recast as multiple
binary (0/1) variables, decision trees handle categorical variables in a
natural way.  I would just dump all of the data with all of the variables
into the analysis and let it pick out which variables are significant and
look for interactions.  Unless there is something unusual about your data, I
believe the entire setup and analysis run could be done in a half hour.
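
To illustrate what "handling categorical variables in a natural way" can
look like, here is a minimal Python sketch of how a regression tree might
choose a binary split on a categorical predictor such as the segment code.
It orders the categories by their mean target value and tries each cut
point in that ordering, which is the standard shortcut for squared-error
regression trees.  The function name and the data are made up for
illustration; this is not DTREG's actual algorithm.

```python
from collections import defaultdict


def best_categorical_split(segments, months):
    """Find the binary split of a categorical variable that minimises the
    summed squared error of the target (months of tenure).

    Hypothetical helper for illustration only.
    """
    # Group the target values by category.
    groups = defaultdict(list)
    for seg, m in zip(segments, months):
        groups[seg].append(m)

    # Order categories by mean target. For a squared-error criterion the
    # best two-way partition is a cut point somewhere in this ordering,
    # so no 0/1 dummy coding is needed.
    ordered = sorted(groups, key=lambda s: sum(groups[s]) / len(groups[s]))

    def sse(values):
        # Sum of squared deviations from the mean of the values.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for cut in range(1, len(ordered)):
        left_cats, right_cats = ordered[:cut], ordered[cut:]
        left = [v for s in left_cats for v in groups[s]]
        right = [v for s in right_cats for v in groups[s]]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, set(left_cats), set(right_cats))
    return best


# Made-up data: segment code and months subscribed for eight customers.
segments = ["01", "01", "02", "02", "03", "03", "04", "04"]
months = [3, 5, 24, 30, 4, 6, 28, 26]

score, left, right = best_categorical_split(segments, months)
print("left segments:", sorted(left))
print("right segments:", sorted(right))
print("summed squared error:", score)
```

A full tree would apply this search to every predictor at every node and
recurse on the two resulting groups.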

I recommend first developing a single-tree model, which is excellent for
getting a visual picture of the model and looking for significant variables
and interactions.  Then, for significantly increased accuracy, I would build
a TreeBoost model consisting of a series of boosted trees.  TreeBoost
typically has comparable accuracy to neural networks.
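
For readers who have not seen boosting before, the idea of "a series of
boosted trees" can be sketched in a few lines of Python.  The sketch below
is generic gradient boosting with squared-error loss and one-split trees
(stumps), where each new tree is fitted to the residuals of the trees
before it.  It is not the TreeBoost implementation itself; the function
names, the two predictors (start year and a 0/1 flag) and the data are all
assumptions made up for illustration.

```python
def fit_stump(X, residuals):
    """Fit a one-split regression tree (stump) to the residuals."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, threshold, lmean, rmean)
    if best is None:
        # No usable split: fall back to predicting the mean residual.
        mean = sum(residuals) / len(residuals)
        return lambda row: mean
    _, j, threshold, lmean, rmean = best
    return lambda row: lmean if row[j] <= threshold else rmean


def fit_boosted(X, y, n_trees=50, learning_rate=0.1):
    """Fit a series of stumps, each to the residuals of the model so far."""
    base = sum(y) / len(y)          # start from the overall mean
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(row) for p, row in zip(preds, X)]
    return lambda row: base + learning_rate * sum(s(row) for s in stumps)


def mse(predict, X, y):
    """Mean squared error of a prediction function on (X, y)."""
    return sum((predict(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Made-up data: [start year, 0/1 flag for some segment] and months subscribed.
X = [[1996, 0], [1997, 0], [1998, 1], [1999, 1],
     [2000, 0], [2001, 1], [2002, 0], [2003, 1]]
y = [40, 35, 60, 55, 20, 30, 8, 12]

model = fit_boosted(X, y)
mean_y = sum(y) / len(y)
base_mse = mse(lambda row: mean_y, X, y)
boosted_mse = mse(model, X, y)
print("training MSE, mean only:", base_mse)
print("training MSE, boosted stumps:", boosted_mse)
```

The errors printed here are on the training data only; a real comparison
between a single tree and a boosted series should use held-out data or
cross-validation.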

-- 
Phil Sherrod
(phil.sherrod 'at' sandh.com)
http://www.dtreg.com  (decision tree modeling)
http://www.nlreg.com  (nonlinear regression)