Is it fair to say that a high degree of correlation between a pair of
time series is preliminary evidence of temporal determinism? We know that
correlation is NOT causation, so it cannot establish determinism by
itself; but does it indicate that the process MAY be deterministic if the
coefficient is large enough? The ordinary correlation coefficient is more
or less the lag-0, normalized value of the cross-covariance of the two
series, and it does not preserve the phase information present in the
original series. What conditions are necessary to infer causality given
only the joint distribution across lags (say of X(t+tau1) and Y(t+tau2),
where tau1 and tau2 are time delays that vary from 0 to infinity)? If,
say, X and Y are correlated at 0.95, what additional tests would one have
to perform to infer causality?
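
To make the lagged quantity concrete, here is a minimal Python sketch of
the correlation between X(t) and Y(t+lag) over a range of lags. The
function name lagged_corr and the synthetic data are my own illustrative
assumptions, not taken from any package or from real measurements. A peak
at a nonzero lag shows only that one series tends to precede the other; it
is not by itself evidence of causation.

```python
import numpy as np

def lagged_corr(x, y, max_lag):
    """Pearson correlation of x(t) with y(t + lag), for lag in [-max_lag, max_lag]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # pair x(t) with y(t + lag)
            a, b = x[:len(x) - lag], y[lag:]
        else:
            # negative lag: pair x(t) with an earlier value of y
            a, b = x[-lag:], y[:len(y) + lag]
        out[lag] = float(np.corrcoef(a, b)[0, 1])
    return out

# Synthetic example (assumed data): y is x delayed by 3 samples plus noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.roll(x, 3) + 0.1 * rng.standard_normal(500)

corrs = lagged_corr(x, y, max_lag=10)
best_lag = max(corrs, key=lambda k: abs(corrs[k]))
print(best_lag, round(corrs[best_lag], 3))
```

In this example the lag-0 coefficient is near zero even though y is almost
a delayed copy of x, which is why a single coefficient, however large or
small, says little about the temporal structure.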

Also, can someone please explain the idea of Granger causality? The jargon
in the literature is a bit complex.

thanks,
p


----- Original Message ----- 
From: "Donald Burrill" <[EMAIL PROTECTED]>
To: "Phil Sherrod" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Friday, May 07, 2004 10:00 AM
Subject: Re: [edstat] Forecasting customer life span


> Phil,  I think you have misunderstood the problem.  See below.  -- Don.
>
> On Fri, 7 May 2004, Phil Sherrod wrote in part:
>
> > On  6-May-2004, Richard Ulrich <[EMAIL PROTECTED]> wrote:
> >
> > > Phil, are you asserting, implicitly, that your decision-tree analysis
> > > has a built-in facility for handling Survival analysis in a life-table
> > > manner?
> >
> > No I am not.  Please reread the problem statement that was posted:
> >
> > > > > Basically, we have 8 years data and thousands of rows regarding a
> > > > > subscription service. Three raw variables are as follows.
> > > > >
> > > > > a) Starting Date of subscription
> > > > > b) Cancellation Date of subscription
> >
> > Every entry has a starting date and a cancellation date; there is no
> > truncated survival period.  So why do you think survival analysis is
> > required for this?
>
> Mostly because AJ (the OP) said so, explicitly:
>
> >>  Dependent Variables: NumberOfMonths (derived from taking the
> >> difference between the starting and ending date of subscription for
> >> both cancelled customers and customer who are still with us)
> >>  Independent Variables
> >> a) Status (whether a customer has cancelled (0) or still with us (1))
> >> b) Demograhpic Segment
>
> Which refers both to "cancelled customers" and to "customers who are
> still with us".  One may have an ending date for the current
> subscription;  but until the customer decides to renew (or to cancel)
> one does not know whether the subscription will in fact end on that
> date.  Sounds like survival analysis to me.
>
> By way of confirmation:  in the next paragraph, AJ asked:
>
> >> Questions:
> >>  Q1) Is it ok to calculate "NumberOfMonths" variable from starting
> >> and ending date of subscription? The reason I ask this is that for
> >> customers who have not cancelled subscription yet, it will only
> >> result in a number that will be the same whether they are still with
> >> us [or not -- DFB]. Of course this information (cancellation of
> >> subscription) will simultaneously be captured in the "status"
> >> independent variable (0 or 1).
>
>   <snip>
>
> [Rich Ulrich:]
> > > Since there is only one predictor variable, with 66 levels, I don't
> > > see why the analysis should take more than 3 seconds....
> >
> > Why do you think there is only one predictor variable with 66 levels?
> > Here is the statement:
> >
> >> c) Demograhpic Segments that a customer belongs to. We have 66
> >> categorical values such as 01, 02..etc. These segments are given to
> >> us by an outside firm that basically appends a segment to a customer
> >> data based on variables such as what kind of car a customer drives,
> >> how much she is educated, or how much she earns etc.
> >
> > Note: "_variables_ such as..." (1) kind of car, (2) education, (3)
> > income...  There are 66 variables with multiple levels.
>
> No.  AJ explicitly writes "66 categorical VALUES" [emphasis added].
> These segments (which I take to mean the applicable one of the 66 values
> {1,2,3,...,66} for each customer) are appended to the customer's data,
> and are BASED ON an unspecified number of variables (of which three
> exemplars are named).  They do not COMPRISE those variables.
>  [And (not that it matters) we have no idea whether the number of those
> variables is 66, or more, or (more likely, in my opinion) fewer.
> (There are probably more of them than the three named examples.)]
>
>  <snip, the rest>
>
>  ------------------------------------------------------------
>  Donald F. Burrill                              [EMAIL PROTECTED]
>  56 Sebbins Pond Drive, Bedford, NH 03110      (603) 626-0816
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================
>
