William and Gus:

        Please excuse a lurker for stepping in here.  I've been following this
thread for several days and would like to add something, though it may
have been said already.
        Proving causality from a correlation algorithm would be wonderful, but
I don't think it is there.  My reason is that all statistics has to work
with is the numbers, and the numbers are not the system.
Finding the correlations and the significant differences can lead to
understanding the system (often assuming for the nonce that some
correlation is a causal relationship), and that understanding hopefully
leads to further experiments.  A properly designed experiment can prove
causality.  
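        A minimal sketch of this point, in Python.  Everything in it is an
assumption made for illustration and comes from no paper in this thread:
the hidden common cause z, the noise levels, and the function names are
hypothetical.  In the observational data x and y are strongly correlated
only because both depend on z.  In the "experiment" x is assigned at
random, so it is independent of z, and the correlation with y vanishes
because x never had any effect on y.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_confounding(n=20000, seed=0):
    """Return (observational r, experimental r) for a toy system.

    Assumed toy system: z drives both x and y; x has no effect on y.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]

    # Observational data: x and y are both noisy copies of z.
    x_observed = [zi + rng.gauss(0, 0.5) for zi in z]
    y = [zi + rng.gauss(0, 0.5) for zi in z]

    # Designed experiment: the experimenter assigns x at random,
    # which cuts the link from z to x.  y is unchanged because
    # x does not cause y in this toy system.
    x_assigned = [rng.gauss(0, 1) for _ in range(n)]

    return pearson(x_observed, y), pearson(x_assigned, y)


if __name__ == "__main__":
    r_obs, r_exp = simulate_confounding()
    print("observational r:", round(r_obs, 3))
    print("experimental r: ", round(r_exp, 3))
```

The numbers alone cannot tell the two situations apart; the difference
lies in how x came to have its values, which is knowledge about the system.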
        As a scientist, I often feel that each of us is too specialized, by
necessity.  There is so much material (data, papers, texts) to absorb. 
I find myself wanting the understanding that a multivariate statistician
has, without the time to get it.  I feel the same about certain
specialties in biology and chemistry.  The danger of this is that we may
try to push forward when we don't have the leverage.  I just read a
paper by a fellow, knowledgeable in complexity theory and computers, who
applied his knowledge to evolution and thought he had found an error, as
well as a theory to correct the error.  It was all smoke and mirrors
because he didn't know how DNA works.  He treated the genetic code like
the perfect ones and zeros that a computer deals with.  The A, T, G, and
C of DNA operate in the real world, and the real world has few
integers.
        I suppose that what I'm trying to get to is that interaction with the
scientist dealing with the system is more likely to lead to a proof of
causality than chewing on the data more.
        William/Bill, if you are still bothering to read this, I would like a
copy of the paper you offered to Gus.  As I say, I'm not convinced, but
I am willing to be.  Address and eddress given below.  
 
> Bill said earlier:
> 
> >> Yes, we do this so that we will have examples of all combinations of x1
> >> and x2, as we would do when using a factorial anova design.  But such
> >> uniform sampling does not make the variables into causes,  Adding x1 to
> >> x2 causes y,
> >
> Gus responded:
> 
> >Here you are using a very different notion of causality than I am
> >willing to accept. If you are serious about this notion, then I concede the
> >argument. In your sense, of course y is caused by x1 and x2. For me, that
> >is then simply a "so what?".
> 
> Bill responded,
> 
> In so far as we use numbers to model causes, the causes should also
> demonstrate the properties of those numbers, If not we are being deceptive
> to use numbers at all, This is point that Michell makes, as referenced in my
> most recent paper, We learn more about the phenomenon by what we know about
> numbers,  This is the whole point of statistics,   What do you use numbers
> for in causal modeling?
> 
> You concede the argument in a manner that suggests I am trying to get away
> with something unusual, I most definitely am not. The use of operations to
> model causes is implicit through the literature, For example, why do you
> think the word "nonadditive" is used to describe interactions in ANOVA?
> 
> Bill said:
> >> We do not infer that x1 and x2 are the causes because they are uniformly
> >> sampled.  We infer they are the causes because their correlations
> >> polarize across the ranges of the dependent variable y,
> 
> >Gus responded:
> 
> >I have to agree with Gottfried Helms and Jerry Dallal and others that
> >this is exactly the same thing. Uniform sampling of x1 and x2 _causes_ (in
> >my sense of the term) y to have a triangular distribution. If y has a
> >triangular distribution, then the correlations polarize, by definition.
> 
> Bill responded:
> 
> First of all, none of you have given an explanation for your beliefs, It
> would not matter if every famous statistician in the world made the same
> claim, Without an explanation that stands up to the tests of logic, your
> claims are not scholarly,
> 
> By your definition, if we sample all the variables uniformly, then CR will
> not know what to make of the data, In fact, CR will work just as well, I
> gave the example of the ranked data below, to which you responded:
> 
> Gus said:
> >Of course! Ranks are uniformly distributed. In fact, if you apply
> >ranking, then you don't have to use uniform data from the beginning.
> >
> Bill responds:
> 
> I am ranking them AFTER the causes are first generated using interval or
> ratio data,  Of course we could use the ranks of the independent variables
> in the actual causal generation but their sums would still be triangular,  I
> do not deny this, only I say it is not enough to warrant causal inference
> because other things could cause the triangularity of some variable.  I am
> saying that if AFTER we get the triangular sums (Y) of the uniform interval,
> ratio or ordinal causes, we rank Y, then CR still works..even though the
> math is being done on THREE uniform variables, x1, x2 and Y.
> 
> You are insisting that the presto is in the distributions, Please explain
> why.  How does having different distributions allow us to infer causation?
> It does not,  We could have two uniform variables and a third triangular
> variable that is NOT the effect of the two uniform variables,
> 
> >
> >> >Of course the Y you generate by adding them will then be triangular. Of
> >> >course the correlations will come out the way you want them to. But does
> >> >that prove causality? Of course not. Look at your model in the opposite
> >> >direction: Y is caused by x1 and x2, but I want to prove it isn't, that
> >> >the causality effect is y, x2 => x1. What do I do? I follow your
> >> >recommendations and select the y uniformly and presto: causality goes
> >> >the other way.
> >> >
> >> No it does not,  You do not infer that the uniformly sampled variable is
> >> the cause,  You sample the variables you think may be the causes
> >> uniformly and then see if you get the polarization effect across the
> >> ranges of any other variables, whether they are uniform or triangular,
> >
> >In other words, you do have reasons other than purely statistical ones
> >for suspecting a causation.
> 
> Bill responded:
> 
> No. We could simply sample all possible models using uniform distributions
> on the current hypothesized causes and the method would still work, The
> inference is based completely on the data.  Try generating a very large data
> set and taking a subsample in which the effect is uniform,  See what
> happens.
> 
> Gus said:
> >You then change the data (by insisting on a
> >uniform sample) to give you the polarization of the correlations that
> >you want. That still looks like circular reasoning to me.
> 
> Bill responded:
> 
> Not if you are free to sample every variable under consideration uniformly,
> whether you have extra mathematical hunches or not,  Thus there is no
> circularity,
> Furthermore, would you claim that a researcher who samples equally across
> the levels of the factors of his anova as loading the experiment?  Such
> uniform sampling does not cause interactions or main effects, it just allows
> for every possibility to be expressed by the data,
> 
> >
> >> You are being misled I think by Gottfried's speculations about
> >> distributions, But Gottfried and I have long had a friendly disagreement
> >> about this,  He sees the presto in the distributions, I do not, My point
> >> is supported by the fact that you could have two variables that are
> >> uniformly distributed and a third that is triangular and (according to
> >> both reality and corresponding correlations/regressions) there be no
> >> causal relationship between the variables.
> 
> Gus said:
> >
> >Exactly. I agree with Gottfried's presto. If you add two uniformly
> >distributed variables, the result will be triangular, and the correlations
> >will polarize. End of story.
> 
> Bill responded:
> 
> This is not even close to the end of the story because when doing research
> in the wild you can get triangular variables that are correlated with
> uniform variables without any causal relationship. This is because not all
> triangular variables will be generated from x1 and x2 (the putative causes).
> You are not thinking like an experimenter but are doing what cognitive
> scientists call "satisficing."  You are jumping to conclusions,  Think about
> how you would test the assumption that the presto is in the distributions?
> You would look for exceptions to the hypothesized rule.  If you did so, you
> would find the possibility of correlations between uniform variables and
> triangular variables without any causal relationship.  Would that be
> consistent with your belief that the story ends there.
> 
> Gus asked:
> >What would you say to a model in which
> >x1 = Annual observations on the number of storks
> >y  = Annual observations on the number of births (of human babies)
> >
> >If you throw out a few data points so that x1 is nearly uniform, then
> >you will see the polarization of correlations. Does that translate into
> >causation in your book?
> 
> Bill responded:
> 
> You need to go to a spread sheet and test your understanding of CR/CC
> because you are saying things that just are not true, The storks and the
> babies represent noncausal correlation,  They may both be the function of
> something else, such as the number of houses (families) in stork land. But
> there are plenty of examples in which we have correlated variables without
> causation directly between them. If you read my papers, you will see that CR
> and CC do not fall for this illusion,   By refusing to consider the
> distinction between the distribution of the effect and the polarization
> phenomenon, you set your self up for faulty conclusions.  At least try them
> on the computer before claiming you know what will happen, I have been
> working on this stuff since the mid 1980's and have done the simulations,
> When you say things like changing the distribution of storks will cause the
> artifact of causation, you are not speaking accurately about corresponding
> regressions/correlations. Would you like a copy of my most recent paper?
> 
> Bill said:
> >> Let's focus this conversation, What do you think about the polarization
> >> effect, assuming for the moment that it is wise to sample factors
> >> uniformly, in the way experimenters do in ANOVA designs?
> >
> 
> Gus said:
> >So far I am not impressed, I'm sorry to say.
> 
> Bill responded:
> I do not think you understand enough yet to be impressed one way or the
> other.
> Perhaps you would let me know what conditions must be met in order to
> impress you. Please be specific.
> 
> Bill
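Two distributional claims in the exchange above can be checked directly:
that the sum of two uniform variables is triangular, and that a triangular
variable can exist alongside x1 and x2 without being their effect.  The
Python sketch below is only an illustration under assumed settings (sample
size, seed, unit ranges, and all function names are hypothetical).  It does
not implement CR or CC, and it says nothing about whether the polarization
effect holds.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def histogram(values, low, high, bins):
    """Count values falling into equal-width bins on [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return counts


def simulate_sums(n=20000, seed=1):
    """Generate x1, x2 uniform on [0, 1), their sum y, and an unrelated t."""
    rng = random.Random(seed)
    x1 = [rng.random() for _ in range(n)]
    x2 = [rng.random() for _ in range(n)]
    # y really is generated from x1 and x2, so it is triangular on [0, 2].
    y = [a + b for a, b in zip(x1, x2)]
    # t has the same triangular shape but is drawn independently,
    # so x1 and x2 play no part in producing it.
    t = [rng.triangular(0.0, 2.0, 1.0) for _ in range(n)]
    return x1, x2, y, t


if __name__ == "__main__":
    x1, x2, y, t = simulate_sums()
    print("histogram of y:", histogram(y, 0.0, 2.0, 10))
    print("histogram of t:", histogram(t, 0.0, 2.0, 10))
    print("r(x1, y):", round(pearson(x1, y), 3))
    print("r(x1, t):", round(pearson(x1, t), 3))
```

Both histograms peak in the middle and thin out toward 0 and 2, yet only y
is correlated with x1.  The triangular shape by itself therefore does not
identify which variable was built from x1 and x2.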

-- 
                ~DBH

Technical writing, literature search, and data analysis at the interface
of chemistry and biology. 

        [EMAIL PROTECTED]

        David B. Hedrick
        P.O. Box 16082
        Knoxville, TN 37996

