Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-06 Thread Digby Millikan
Comparisons of the sills of relative variograms may indicate whether the
proportional effect is present between the low and high grade zones, so a
test on the correlation coefficients could be relevant.

Digby
www.users.on.net/~digbym

* By using the ai-geostats mailing list you agree to follow its rules 
( see http://www.ai-geostats.org/help_ai-geostats.htm )

* To unsubscribe to ai-geostats, send the following in the subject or in the 
body (plain text format) of an email message to [EMAIL PROTECTED]

Signoff ai-geostats

Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-06 Thread Chaosheng Zhang
Isobel,

Good idea, and that's a step forward. Any references or is it still an idea?

Cheers,

Chaosheng

- Original Message - 
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "AI Geostats mailing list" <[EMAIL PROTECTED]>
Sent: Monday, December 06, 2004 1:07 PM
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p


> Dear all
>
> I am having difficulty understanding why none of you
> want to try a spatial approach to statistics. Everyone
> is trying to make the 'independent' statistical tests
> work on spatial data. Try turning this around and look
> at the spatial aspect first.
>
> (1) Testing variances: the sill on the semi-variogram
> (total height of model) is theoretically a good
> estimate for the sample variance when auto-correlation
> or spatial dependence is present. Do your F test on
> that. Yes, you still have degrees of freedom problems,
> but with thousands of samples the 'infinity column'
> should be sufficient.
>
> (2) Testing means: the classic t-test in the presence
> of 'equal variances' requires the 'standard error' of
> each mean. For independent samples, this is s/sqrt(n).
> For spatially dependent samples, this is the kriging
> standard error for the global mean. Your only problem
> then is getting a global standard error.
>
> Isobel
> http://geoecosse.bizland.com/whatsnew.htm
>
>







Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-06 Thread Isobel Clark
Dear all 

I am having difficulty understanding why none of you
want to try a spatial approach to statistics. Everyone
is trying to make the 'independent' statistical tests
work on spatial data. Try turning this around and look
at the spatial aspect first.

(1) Testing variances: the sill on the semi-variogram
(total height of model) is theoretically a good
estimate for the sample variance when auto-correlation
or spatial dependence is present. Do your F test on
that. Yes, you still have degrees of freedom problems,
but with thousands of samples the 'infinity column'
should be sufficient.

(2) Testing means: the classic t-test in the presence
of 'equal variances' requires the 'standard error' of
each mean. For independent samples, this is s/sqrt(n).
For spatially dependent samples, this is the kriging
standard error for the global mean. Your only problem
then is getting a global standard error.
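
A minimal sketch of point (2), assuming the global mean of each group and its kriging standard error have already been obtained elsewhere. The function name and the numbers are hypothetical, and treating the two global estimates as independent of each other is an extra assumption, not something stated above.

```python
import math

def compare_global_means(mean_a, se_a, mean_b, se_b):
    """Large-sample z-test for two means, given their standard errors.

    se_a and se_b are assumed to be global kriging standard errors,
    not s/sqrt(n); the two estimates are assumed mutually independent.
    """
    z = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical mean grades and global standard errors for two zones.
z, p = compare_global_means(5.2, 0.30, 4.6, 0.25)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

The only change from the classical test is where the standard errors come from; the test statistic itself keeps its usual form.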

Isobel 
http://geoecosse.bizland.com/whatsnew.htm


Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-06 Thread Chaosheng Zhang



Besides the discussion of the theory, I think we need a practical solution
to Colin Badenhorst's initial problem (which is not his problem alone). He
wants to compare two sets of spatial data with several thousand samples.

Spatial autocorrelation (or lack of independence) is a basic feature of
spatial data, so we cannot make spatial data behave well enough to satisfy
the statistical requirements. If your spatial data set lacks spatial
autocorrelation, you may be asked to go back and take more samples. The
ideal way is perhaps to develop a t-test (or whatever test) for spatial
data, something like a "spatially weighted test". If such a test is not
available, we have no choice but to use existing methods. They may not be
exactly suited to spatial data, but they are better than nothing.

For the time being, the best way to solve the problem is still to use
statistical methods, but to explain the results carefully and
appropriately. We have to acknowledge the discrepancies between the basic
feature of spatial data and the statistical requirements. Meanwhile, when
the sample size (going back to my initial concern) is large, you will
always get the result of rejecting the null hypothesis for REAL data,
whether or not there is spatial dependence. In this case, what does such a
result mean? I would say this result is not very meaningful, as it just
proves the power of statistical tests. Simple graphs (e.g., histogram,
box-plot) and percentiles may be more helpful for comparison.

Therefore, for Colin's initial problem, the solution is to explain the
results properly, and maybe to try some other methods if available.
 
Cheers,
 
Chaosheng
--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm
 

Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Digby Millikan
In every resource model I have done, I subdivide the populations into
those of equal mean and variance, so that stationarity is obeyed. Is this
the correct procedure? I haven't read Mining Geostatistics in detail yet,
but understood that this was a basic requirement for geostatistical
modelling procedures.

http://www.users.on.net/~digbym/about_consulting.htm
Digby


Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Digby Millikan
Colin,
Isn't it a basic rule of geostatistics that all populations must follow the
intrinsic hypothesis, i.e. stationarity, constant mean and variance, so you
should split any populations that do not have the same mean and variance?
This is introduced on pp. 33 of Mining Geostatistics, A. G. Journel &
Ch. J. Huijbregts.
Regards Digby
- Original Message - 
From: "Colin Badenhorst" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 1:28 AM
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p


Hi Ted,
Thanks for your reply. I suspect my original query was too vague, so I will
illustrate it with a practical example here.
I have an ore horizon that splits into two separate horizons. One of these
split horizons has a lower average grade, and the other has a higher average
grade. I need to determine whether I should treat these two horizons as
separate entities during grade estimation. My geological observations tell
me that these two horizons derive from the same source, and on the face of
it are not different from one another in terms of mineral content and
genesis. I aim to back it up by proving, or attempting to prove, that
statistically these two horizons are the same, and can be treated as such as
far as grade estimation goes. Because the mean grades vary between the two,
I suspect that the T-test might fail, but I also suspect that the variance
in grade between the two might be very similar, and thus the F-test will
pass. Now I have a problem: a T-test tells me the populations differ
statistically, but the F-test tells me they don't.
The confidence limit I refer to in (2) by the way is the Alpha value used to
determine the confidence level for the test - I am using Excel to do the
test.
Colin
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: 03 December 2004 14:15
To: Colin Badenhorst
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p
On 03-Dec-04 Colin Badenhorst wrote:
> Hello everyone,
> I have two groups of several thousand samples analysed
> for various elements, and wish to determine if these
> samples are drawn from the same statistical population
> for later variography studies. I propose to test the two
> groups by using a F-test to test the sample variances,
> and a T-test to test the group means, at a given confidence limit.
> Before I do this, I wonder how I would interpret the results
> of the test if, for example:
> 1. The F-test suggests no significant statistical difference
> between the variances at a 90% confidence limit, BUT
> 2. The T-test suggests a significant statistical difference
> between the means at the same, or lower confidence limit.
> Has anyone come across this scenario before and how are they
> interpreted?
On the face of it, the scenario you describe corresponds to
a standard t-test (which involves an assumption that the
variances of the two populations do not differ), though I'm
not sure what you mean in (2) by significant "at the same,
or lower confidence limit." (Do I take it that in (1) you
mean that the P-value for the F test is 0.1 or less?)
However, if you get significant difference between the variances
in (1), then it may not be very good to use the standard
t test (depending on how different they are). A modified
version, such as the Welch test, should be used instead.
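
For readers unfamiliar with the Welch test mentioned above, here is a minimal sketch of its statistic and the Welch-Satterthwaite degrees of freedom, using only the Python standard library. The function name and the sample values are hypothetical, and the sketch still assumes independent samples, which is the very point disputed elsewhere in this thread.

```python
import math
from statistics import fmean, variance

def welch_t(x, y):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (fmean(x) - fmean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical grade values for two small groups.
group_a = [4.1, 5.0, 4.7, 5.3, 4.9, 5.6]
group_b = [3.9, 4.2, 4.0, 4.4, 4.1, 4.3]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Unlike the standard t-test, the variances are not pooled, so the degrees of freedom fall somewhere between the smaller group size minus one and the pooled value.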
There is an issue with interpreting the results where the
samples have initially been screened by one test, before
another one is applied, since the sampling distribution
of the second test, conditional on the outcome of the
first, may not be the same as the sampling distribution of
the second test on its own. However, I feel inclined to
guess that this may not make any important difference
in your case.
Hoping this helps,
Ted.

E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 03-Dec-04   Time: 14:15:09
-- XFMail --





RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Mat (University Account)
 
Sorry if this is somewhat off subject - but I'd like to discuss (and invite
further comments on) Colin's comments regarding the effects of independence
on standard statistical tests.

He mentioned that a lack of independence "typically removes a large part of
the usability of basic tests unless corrected for spatial variables".
The standard argument goes something like: 
'Spatial autocorrelation means that the sampled values are not independent, 
so you have less information than you think (i.e. your estimated degrees of
freedom are too large). 
Consequently, the variance is underestimated and confidence intervals are
too small (or the type I error is under-reported)'.

My understanding is that this argument is quite valid when you are inferring
beyond the area from which you have sampled (or inferring about the
stochastic process generating the sample data). 
However, it's probably worth mentioning that if you are simply looking to
compare the parameters of specified areas (or volumes) and you have used a
sensible design-based sampling method (e.g. SRS), then autocorrelation poses
no problem.

i.e. if you have randomly sampled some regionalized variable in volume X and
volume Y, and simply wish to determine if, say, the population means of
these volumes are different -- then the sample points will be independent
(relative to the area of inference). In this scenario, classical statistical
tests can be used to compare the realization parameters of the different
areas.

The question that often fails to be asked is: what inference space are we
interested in? Do we wish to discuss the process that generated the data,
or simply make inferences about the actual physical realization?
Geostatistics avoids many complications with autocorrelation by typically
restricting inference to the actual data, rather than the stochastic
process.

In your particular case I would expect that statistically showing that: 
(a) two horizons exhibit the same mineral content/spatial structure and 
(b) two horizons derive from the same process
are very different problems.

Certainly within biology, the difference between these situations does not
seem to be well understood
 - I am curious if geostatisticians distinguish between them as a matter of
course?

regards,
Matthew Pawley


 --- Colin Daly <[EMAIL PROTECTED]> wrote: 
> 
> 
> Hi
> 
> Sorry to repeat myself - but the samples are not independent.
> Independence is a fundamental assumption of these types of tests - and
> you cannot interpret the tests if this assumption is violated.
> In the situation where spatial correlation exists, the true standard
> error is nothing like as small as the (s/sqrt(n)) that Chaosheng
> discusses - because the sqrt(n) depends on independence.
>
> Again, as I said before, if the data has any type of trend in it, then
> it is completely meaningless to try and use these tests - and with no
> trend but some 'ordinary' correlation, you must find a means of taking
> the data redundancy into account or risk getting hopelessly pessimistic
> results (in the sense of rejecting the null hypothesis of equal means
> far too often).
>
> Consider a trivial example: a one-dimensional random function which
> takes constant values over intervals of length one - so, it takes the
> value a_0 in the interval [0,1[, then the value a_1 in the interval
> [1,2[ and so on (let us suppose that each a_n term is drawn at random
> from a Gaussian distribution with the same mean and variance, for
> example). Next suppose you are given samples on the interval [0,2].
> You spot that there seems to be a jump between [0,1[ and [1,2[ - so
> you test for the difference in the means. If you apply an F test you
> will easily find that the mean differs (and more convincingly the more
> samples you have drawn!). However, by construction of the random
> function, the mean is not different. We have been lulled into the
> false conclusion of differing means by assuming that all our data are
> independent.
> 
> Regards
> 
> Colin Daly
> 




RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Pierre Goovaerts
Hello,

I am currently principal investigator on a major NIH grant
that aims to develop software for hypothesis testing
using alternate hypotheses, specified by the user, that
differ from the omnibus "spatial independence";
we called them "spatial neutral models".
For example, you can test for clusters of cancer rates
"above and beyond" a regional background in exposure.
The p-values are computed using randomization and I applied
geostatistical simulation to generate multiple realizations
that are then used to derive the empirical distribution of
the test statistic.
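
As an illustration of the randomization idea only (this is not the software described above), a one-sided Monte Carlo p-value can be sketched as follows. The function name and the simulated statistics are hypothetical; in practice the simulated values would come from geostatistical realizations generated under the chosen neutral model.

```python
def empirical_p_value(observed, simulated):
    """One-sided Monte Carlo p-value: share of simulated statistics >= observed.

    The +1 terms count the observed value as one of the realizations,
    a common convention that keeps the p-value above zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)

# Hypothetical test statistics from 9 simulated realizations.
sims = [0.2, 1.1, 0.7, 2.5, 0.4, 1.8, 0.9, 0.3, 1.4]
print(empirical_p_value(2.0, sims))
```

The spatial dependence enters through the realizations themselves, so no independence assumption is needed to read the resulting p-value.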

I presented an example during the last GeoEnv conference
and I put a PDF copy of the paper, which is in press for
the moment, on my website.

Cheers,

Pierre

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Dr. Pierre Goovaerts
President of PGeostat, LLC
Chief Scientist with Biomedware Inc.
710 Ridgemont Lane
Ann Arbor, Michigan, 48103-1535, U.S.A.

E-mail:  [EMAIL PROTECTED]
Phone:   (734) 668-9900
Fax: (734) 668-7788
http://alumni.engin.umich.edu/~goovaert/

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

On Sun, 5 Dec 2004, Colin Daly wrote:

>
>
> Hi
>
> Sorry to repeat myself - but the samples are not independent.  Independence 
> is a fundamental assumption of these types of tests - and you cannot 
> interpret the tests if this assumption is violated.  In the situation where 
> spatial correlation exists, the true standard error is nothing like as small 
> as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on 
> independence.
>
> Again, as I said before, if the data has any type of trend in it, then it is 
> completely meaningless to try and use these tests - and with no trend but 
> some 'ordinary' correlation, you must find a means of taking the data 
> redundancy into account or risk getting hopelessly pessimistic results (in the 
> sense of rejecting the null hypothesis of equal means far too often).
>
> Consider a trivial example: a one-dimensional random function which takes 
> constant values over intervals of length one - so, it takes the value a_0 in 
> the interval [0,1[, then the value a_1 in the interval [1,2[ and so on (let 
> us suppose that each a_n term is drawn at random from a Gaussian distribution 
> with the same mean and variance, for example).  Next suppose you are given 
> samples on the interval [0,2]. You spot that there seems to be a jump between 
> [0,1[ and [1,2[ - so you test for the difference in the means. If you apply 
> an F test you will easily find that the mean differs (and more convincingly 
> the more samples you have drawn!). However, by construction of the random 
> function, the mean is not different.  We have been lulled into the false 
> conclusion of differing means by assuming that all our data are independent.
>
> Regards
>
> Colin Daly
>
>
> -Original Message-
> From: Chaosheng Zhang [mailto:[EMAIL PROTECTED]
> Sent: Sun 12/5/2004 11:42 AM
> To:   [EMAIL PROTECTED]
> Cc:   Colin Badenhorst; Isobel Clark; Donald E. Myers
> Subject:  Re: [ai-geostats] F and T-test for samples drawn from the same p
> Dear all,
>
>
>
> I'm wondering if sample size (number of samples, n) is playing a role here.
>
>
>
> Since Colin is using Excel to analyse several thousand samples, I have 
> checked the functions of t-tests in Excel. In the Data Analysis Tools help, a 
> function is provided for "t-Test: Two-Sample Assuming Unequal Variances 
> analysis". This function is the same as those from many text books (There are 
> other forms of the function). Unfortunately, I cannot find the function for 
> "assuming equal variances" in Excel, but I assume they are similar, and 
> should be the same as those from some text books.
>
>
>
> From the function, you can find that when the sample size is large you always 
> get a large t value. When sample size is large enough, even slight 
> differences between the mean values of two data sets (x bar and y bar) can be 
> detected, and this will result in rejection of the null hypothesis. This is 
> in fact quite reasonable. When the sample size is large, you are confident 
> with the mean values (Central Limit Theorem), with a very small stand er

RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Colin Daly







Hi

Sorry to repeat myself - but the samples are not independent.  Independence is a fundamental assumption of these types of tests - and you cannot interpret the tests if this assumption is violated.  In the situation where spatial correlation exists, the true standard error is nothing like as small as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on independence.

Again, as I said before, if the data has any type of trend in it, then it is completely meaningless to try and use these tests - and with no trend but some 'ordinary' correlation, you must find a means of taking the data redundancy into account or risk getting hopelessly pessimistic results (in the sense of rejecting the null hypothesis of equal means far too often).

Consider a trivial example: a one-dimensional random function which takes constant values over intervals of length one - so, it takes the value a_0 in the interval [0,1[, then the value a_1 in the interval [1,2[ and so on (let us suppose that each a_n term is drawn at random from a Gaussian distribution with the same mean and variance, for example).  Next suppose you are given samples on the interval [0,2]. You spot that there seems to be a jump between [0,1[ and [1,2[ - so you test for the difference in the means. If you apply an F test you will easily find that the mean differs (and more convincingly the more samples you have drawn!). However, by construction of the random function, the mean is not different.  We have been lulled into the false conclusion of differing means by assuming that all our data are independent.
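
The trivial example can be tried numerically. The sketch below is one possible reading of it, using only the Python standard library: the small measurement noise is an added assumption (without it the within-interval variance would be exactly zero), a large-sample two-sample test on the means stands in for the test named in the text, and the function name and all parameter values are hypothetical.

```python
import math
import random
from statistics import fmean, variance

def false_rejection_rate(n_per_interval=200, noise_sd=0.05, trials=200, seed=1):
    """Share of realizations where a naive two-sample test rejects equal means."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # One realization: a_0 on [0,1[ and a_1 on [1,2[, both from the same N(0, 1) law.
        a0, a1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Small noise (an assumption, not part of the example) avoids zero variance.
        left = [a0 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        right = [a1 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
        t = (fmean(left) - fmean(right)) / se
        if abs(t) > 1.96:  # nominal 5% two-sided level, treating samples as independent
            rejections += 1
    return rejections / trials

print(false_rejection_rate())
```

Under these assumptions the rejection rate comes out far above the nominal 5%, even though the mean of the generating process is the same on both intervals.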

Regards

Colin Daly


-Original Message-
From:   Chaosheng Zhang [mailto:[EMAIL PROTECTED]]
Sent:   Sun 12/5/2004 11:42 AM
To: [EMAIL PROTECTED]
Cc: Colin Badenhorst; Isobel Clark; Donald E. Myers
Subject:        Re: [ai-geostats] F and T-test for samples drawn from the same p
Dear all,



I'm wondering if sample size (number of samples, n) is playing a role here.



Since Colin is using Excel to analyse several thousand samples, I have checked the functions of t-tests in Excel. In the Data Analysis Tools help, a function is provided for "t-Test: Two-Sample Assuming Unequal Variances analysis". This function is the same as those from many textbooks (there are other forms of the function). Unfortunately, I cannot find the function for "assuming equal variances" in Excel, but I assume they are similar, and should be the same as those from some textbooks.



From the function, you can see that when the sample size is large you always get a large t value. When the sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident in the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you are confident in detecting the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are "significantly" different.



If you remember, some time ago we had a discussion on the large sample size problem for tests for normality. When the sample size is large enough, the result can always be expected (for real data sets), that is, rejection of the null hypothesis.



Cheers,



Chaosheng

--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm

- Original Message -
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "Donald E. Myers" <[EMAIL PROTECTED]>
Cc: "Colin Badenhorst" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [ai-geostats] F and T-test for samples drawn from the same p

> Don
>
> Thank you for the extended clarification of F and t
> hypothesis test. For those unfamiliar with the
> concept, it is worth noting that the F test for
> multiple means may be more familiar under the title
> "Analysis of variance".
>
> My own brief answer was in the context of Colin's
> question, where it was quite clear that he was talking
> about the simplest F variance-ratio and t comparison of
> means test.
>
> Isobel

Re: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-05 Thread Chaosheng Zhang



Dear all,
 
I'm wondering if sample size (number of samples, n) is playing a role here.

Since Colin is using Excel to analyse several thousand samples, I have
checked the functions of t-tests in Excel. In the Data Analysis Tools help,
a function is provided for "t-Test: Two-Sample Assuming Unequal Variances
analysis". This function is the same as those from many textbooks (there
are other forms of the function). Unfortunately, I cannot find the function
for "assuming equal variances" in Excel, but I assume they are similar, and
should be the same as those from some textbooks.

From the function, you can see that when the sample size is large you
always get a large t value. When the sample size is large enough, even
slight differences between the mean values of two data sets (x bar and y
bar) can be detected, and this will result in rejection of the null
hypothesis. This is in fact quite reasonable. When the sample size is large,
you are confident in the mean values (Central Limit Theorem), with a very
small standard error (s/sqrt(n)). Therefore, you are confident in detecting
the differences between the two data sets. Even though there is only a
slight difference, you can still say, yes, they are "significantly"
different.

If you remember, some time ago we had a discussion on the large sample size
problem for tests for normality. When the sample size is large enough, the
result can always be expected (for real data sets), that is, rejection of
the null hypothesis.
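
The sample-size effect described above can be made concrete with a short sketch. It assumes two groups of equal size n, a common standard deviation and independent samples; the function name and the numbers are hypothetical.

```python
import math

def two_sample_t(diff, sd, n):
    """t statistic for two groups of size n with common standard deviation sd."""
    return diff / (sd * math.sqrt(2.0 / n))

# Hypothetical: means differ by 0.05 units, standard deviation 1.0.
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(two_sample_t(0.05, 1.0, n), 2))
```

For a fixed difference the statistic grows in proportion to sqrt(n), so a difference that is negligible in practice eventually crosses any fixed critical value.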
 
Cheers,
 
Chaosheng
--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm
 
 
- Original Message -
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "Donald E. Myers" <[EMAIL PROTECTED]>
Cc: "Colin Badenhorst" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [ai-geostats] F and T-test for samples drawn from the same p

> Don
>
> Thank you for the extended clarification of F and t
> hypothesis test. For those unfamiliar with the
> concept, it is worth noting that the F test for
> multiple means may be more familiar under the title
> "Analysis of variance".
>
> My own brief answer was in the context of Colin's
> question, where it was quite clear that he was talking
> about the simplest F variance-ratio and t comparison of
> means test.
>
> Isobel




[ai-geostats] F and T-test for samples drawn from the same p

2004-12-04 Thread Isobel Clark
Don

Thank you for the extended clarification of F and t
hypothesis test. For those unfamiliar with the
concept, it is worth noting that the F test for
multiple means may be more familiar under the title
"Analysis of variance".

My own brief answer was in the context of Colin's
question, where it was quite clear that he was talking
about the simplest F variance-ratio and t comparison of
means test.

Isobel


RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-03 Thread Din Chen








Colin (Daly) is exactly correct. Spatial dependence is the main issue when
you use the t-test for spatial data. You might be able to transform your
data for normality or even homogeneity, but the dependence is still there.

In this case, you need to incorporate the spatial dependence (described by
the variogram) into the t-test. Try generalized least squares for a
likelihood approach.
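
One way to read the generalized least squares suggestion is sketched below for a constant mean on one-dimensional coordinates. The exponential covariance model, the function name and all numbers are assumptions made for illustration; in practice the sill and range parameter would come from a fitted variogram.

```python
import numpy as np

def gls_mean(coords, values, sill, corr_range):
    """GLS estimate of a constant mean and its standard error (1-D coordinates).

    Assumes an exponential covariance C(h) = sill * exp(-h / corr_range)
    with no nugget; both parameters would come from a fitted variogram.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    h = np.abs(coords[:, None] - coords[None, :])   # pairwise distances
    cov = sill * np.exp(-h / corr_range)
    ones = np.ones_like(values)
    w = np.linalg.solve(cov, ones)                  # C^{-1} 1
    mean = (w @ values) / (w @ ones)
    std_err = np.sqrt(1.0 / (w @ ones))
    return mean, std_err

# Hypothetical samples every 10 m along a drive, range parameter 50 m.
x = np.arange(0.0, 200.0, 10.0)
z = 5.0 + 0.5 * np.sin(x / 30.0)
m, se = gls_mean(x, z, sill=0.25, corr_range=50.0)
print(m, se, np.sqrt(0.25 / len(x)))  # last value: naive sqrt(sill / n)
```

With positively correlated samples the GLS standard error is larger than the naive figure, and that larger value is the one a t-type comparison of two zones would need.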

 

Din Chen

 









From: Colin Daly [mailto:[EMAIL PROTECTED]
Sent: Friday, December 03, 2004 8:16 AM
To: Glover, Tim; Colin Badenhorst; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p



 

 

There is one other very important assumption behind these standard
statistical tests - namely that the samples are independent. This typically
removes a large part of the usability of basic tests unless corrected for
spatial variables. It is most likely the case that your samples within each
horizon are not independent (unless the variogram has zero range) - so your
typical tests cannot be used. They will tend to give pessimistic results -
in other words you will tend to find differences in means when none exist.
So, these types of tests don't apply directly.

I don't know if there has been much work on trying to provide 'rigorous'
methods (but given that it is impossible to give a statistical test that
shows whether a random function is stationary or not (Matheron - 'Estimating
and choosing'), I guess the results would not be completely rigorous). You
may be able to get an intuitive feel for the likely difference in means by
trying to see how many quasi-independent points you have got. You could
guess-timate this by assuming that points separated by more than a variogram
range are independent, seeing how many such 'range units' you have got, and
using this as the number of 'samples' (actually - you may be better off
working with an integral range). But if you have any trends in the data then
you will not get reliable estimates of the two means and so cannot 'prove'
that the samples come from the same random function - even if they do.
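
The 'range units' guess-timate can be written down in a few lines. This is only a sketch for samples along a line: the function name and the numbers are hypothetical, and for an area one would presumably count range-sized cells rather than lengths.

```python
import math

def effective_standard_error(sd, extent, variogram_range, n_samples):
    """Standard error of the mean using 'range units' as the sample count.

    Points more than one variogram range apart are treated as independent,
    so the effective count is roughly extent / range (never more than n).
    """
    n_eff = max(1.0, min(float(n_samples), extent / variogram_range))
    return sd / math.sqrt(n_eff), n_eff

# Hypothetical: 3000 samples along 1200 m of strike, variogram range 60 m.
se, n_eff = effective_standard_error(sd=2.0, extent=1200.0, variogram_range=60.0, n_samples=3000)
print(f"n_eff = {n_eff:.0f}, se = {se:.3f} (naive: {2.0 / math.sqrt(3000):.3f})")
```

With these made-up figures the several thousand samples shrink to a couple of dozen effective ones, which widens the standard error by roughly an order of magnitude.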

Regards

Colin Daly



-Original Message-
From:   Glover, Tim [mailto:[EMAIL PROTECTED]]
Sent:   Fri 12/3/2004 3:15 PM
To: Colin Badenhorst; [EMAIL PROTECTED]
Cc:     [EMAIL PROTECTED]
Subject:        RE: [ai-geostats] F and
T-test for samples drawn from the same p
Standard t-tests make two assumptions: 1. both data sets are normally
distributed; 2. they have approximately equal variance.  Test these
assumptions before applying a t-test. Violate these assumptions at your
own risk.  If you fail either assumption, you need to consider your
options, but probably should not use a plain-vanilla t-test.  You could
possibly use a data transform to "fix" the first assumption. 
You might
have to use a modified t-test (such as Satterthwaite's modification) Or
you might consider a non-parametric approach, such as Mann-Whitney
U-test. 


Tim Glover
Senior Environmental Scientist - Geochemistry
Geoenvironmental Department
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia,
USA
Office 770-421-3310
Fax 770-421-3486
Email [EMAIL PROTECTED]
Web www.mactec.com

-Original Message-
From: Colin Badenhorst [mailto:[EMAIL PROTECTED]]
Sent: Friday, December 03, 2004 9:59 AM
To: '[EMAIL PROTECTED]'
Cc: '[EMAIL PROTECTED]'
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same
p

Hi Ted,

Thanks for your reply. I suspect my original query was too vague, so I
will
illustrate it with a practical example here.

I have an ore horizon that splits into two separate horizons. One of
these
split horizons has a lower average grade, and the other has a higher
average
grade. I need to determine whether I should treat these two horizons as
separate entities during grade estimation. My geological observations
tell
me that these two horizons derive from the same source, and on the face
of
it are not different from one another in terms of mineral content and
genesis. I aim to back it up by proving, or attempting to prove, that
statistically these two horizons are the same, and can be treated as
such as
far as grade estimation goes. Because the mean grades vary between the
two,
I suspect that the T-test might fail, but I also suspect that the
variance
in grade between the two might be very similar, and thus the F-test will
pass. Now I have a problem : a T-test tells me the populations differ
statistically, and but the F-test tells me they don't.

The confidence limit I refer to in (2) by the way is the Alpha value
used to
determine the confidence level for the test - I am using Excel to do the
test.

Thanks,
Colin


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: 03 December 2004 14:15
To: Colin Badenhorst

RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-03 Thread Colin Daly

There is one other very important assumption about these standard statistical tests - namely that the samples are independent. This typically removes a large part of the usability of basic tests unless corrected for spatial variables. It is most likely the case that your samples within each horizon are not independent (unless the variogram has zero range) - so your typical tests cannot be used. They will tend to give pessimistic results - in other words, you will tend to find differences in means when none exist. So these types of tests don't apply directly.

I don't know if there has been much work on trying to provide 'rigorous' methods (but given that it is impossible to give a statistical test that shows whether a random function is stationary or not (Matheron - 'Estimating and Choosing'), I guess the results would not be completely rigorous). You may be able to get an intuitive feel for the likely difference in means by trying to see how many quasi-independent points you have got. You could guess-timate this by assuming that points separated by more than a variogram range are independent, seeing how many such 'range units' you have got, and using this as the number of 'samples' (actually, you may do better by working with an integral range). But if you have any trends in the data then you will not have reliable estimates of the two means and so cannot 'prove' that the samples come from the same random function - even if they do.

Regards

Colin Daly
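
The 'range units' guess-timate above can be sketched as follows. This is a rough illustration under stated assumptions, not a rigorous method: it treats the sampled area as a rectangle, counts how many variogram ranges fit along each side, and uses that count in place of the raw sample count. The function names and every number are made up for the example.

```python
import math

def effective_n(n, extent_x, extent_y, vrange):
    """Rough count of quasi-independent samples in a rectangular 2-D area."""
    units = max(1.0, extent_x / vrange) * max(1.0, extent_y / vrange)
    return min(float(n), units)  # never more than the real sample count

def mean_diff_z(mean1, var1, n1, mean2, var2, n2):
    """Difference in means divided by its standard error."""
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

# Hypothetical horizons: 3000 samples each over 1000 m x 500 m, range 100 m.
n_raw = 3000
n_eff = effective_n(n_raw, extent_x=1000.0, extent_y=500.0, vrange=100.0)

z_naive = mean_diff_z(5.2, 4.0, n_raw, 5.0, 4.0, n_raw)  # treats all samples as independent
z_eff = mean_diff_z(5.2, 4.0, n_eff, 5.0, 4.0, n_eff)    # uses 'range units' instead
print(n_eff, round(z_naive, 2), round(z_eff, 2))
```

In this made-up case the 3000 samples reduce to 50 range units, and the statistic drops from about 3.9 (apparently significant) to 0.5 (not significant), which illustrates why the independent-sample test finds differences too easily.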




RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-03 Thread Glover, Tim
Standard t-tests make two assumptions: 1. both data sets are normally
distributed; 2. they have approximately equal variance.  Test these
assumptions before applying a t-test. Violate these assumptions at your
own risk.  If you fail either assumption, you need to consider your
options, but probably should not use a plain-vanilla t-test.  You could
possibly use a data transform to "fix" the first assumption. You might
have to use a modified t-test (such as Satterthwaite's modification), or
you might consider a non-parametric approach, such as the Mann-Whitney
U-test.


Tim Glover
Senior Environmental Scientist - Geochemistry 
Geoenvironmental Department
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia, USA
Office 770-421-3310
Fax 770-421-3486
Email [EMAIL PROTECTED] 
Web www.mactec.com
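
A minimal sketch of the two alternatives mentioned above, using only the Python standard library. The function names and data are hypothetical. The first function computes the unequal-variance t statistic with Satterthwaite's approximate degrees of freedom; the second computes the Mann-Whitney U statistic by direct pair counting, with a normal-approximation z that omits the tie correction. Both still assume independent samples.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Unequal-variance t statistic with Satterthwaite's approximate df."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

def mann_whitney_u(x, y):
    """U for x (pairs with x > y; ties count one half) and an approximate z."""
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    return u, (u - mu) / sd

# Made-up grades for a low-grade and a high-grade group.
low = [1.0, 2.0, 3.0, 4.0, 5.0]
high = [2.0, 4.0, 6.0, 8.0, 10.0]
t, df = welch_t(low, high)
u, z = mann_whitney_u(low, high)
print(t, df, u, z)
```

The pair-counting form of U is quadratic in the sample size, which is acceptable for a sketch but slow for very large groups; a rank-sum implementation would be the usual choice there.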


RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-03 Thread Colin Badenhorst
Hi Ted,

Thanks for your reply. I suspect my original query was too vague, so I will
illustrate it with a practical example here.

I have an ore horizon that splits into two separate horizons. One of these
split horizons has a lower average grade, and the other has a higher average
grade. I need to determine whether I should treat these two horizons as
separate entities during grade estimation. My geological observations tell
me that these two horizons derive from the same source, and on the face of
it are not different from one another in terms of mineral content and
genesis. I aim to back it up by proving, or attempting to prove, that
statistically these two horizons are the same, and can be treated as such as
far as grade estimation goes. Because the mean grades vary between the two,
I suspect that the T-test might fail, but I also suspect that the variance
in grade between the two might be very similar, and thus the F-test will
pass. Now I have a problem: a T-test tells me the populations differ
statistically, but the F-test tells me they don't.

The confidence limit I refer to in (2) by the way is the Alpha value used to
determine the confidence level for the test - I am using Excel to do the
test.

Thanks,
Colin
 


RE: [ai-geostats] F and T-test for samples drawn from the same p

2004-12-03 Thread Ted Harding
On 03-Dec-04 Colin Badenhorst wrote:
> Hello everyone,
>  
> I have two groups of several thousand samples analysed
> for various elements, and wish to determine if these
> samples are drawn from the same statistical population
> for later variography studies. I propose to test the two
> groups by using a F-test to test the sample variances,
> and a T-test to test the group means, at a given confidence limit.
>  
> Before I do this, I wonder how I would interpret the results
> of the test if, for example:
>  
> 1. The F-test suggests no significant statistical difference
> between the variances at a 90% confidence limit, BUT
> 2. The T-test suggests a significant statistical difference
> between the means at the same, or lower confidence limit.
>  
> Has anyone come across this scenario before and how are they
> interpreted?

On the face of it, the scenario you describe corresponds to
a standard t-test (which involves an assumption that the
variances of the two populations do not differ), though I'm
not sure what you mean in (2) by significant "at the same,
or lower confidence limit." (Do I take it that in (1) you
mean that the P-value for the F test is greater than 0.1?)

However, if you get significant difference between the variances
in (1), then it may not be appropriate to use the standard
t-test (depending on how different they are). A modified
version, such as the Welch test, should be used instead.

There is an issue with interpreting the results where the
samples have initially been screened by one test, before
another one is applied, since the sampling distribution
of the second test, conditional on the outcome of the
first, may not be the same as the sampling distribution of
the second test on its own. However, I feel inclined to
guess that this may not make any important difference
in your case.

Hoping this helps,
Ted.



E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 03-Dec-04   Time: 14:15:09
-- XFMail --
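
A minimal sketch of the two tests discussed above, assuming independent samples and using only the Python standard library. The function names and data are hypothetical. It computes the variance ratio for the F test and the equal-variance (pooled) t statistic; the normal-distribution p-value is only a large-sample stand-in for the t distribution, reasonable when each group has several thousand samples.

```python
import math
from statistics import mean, variance

def f_ratio(x, y):
    """Variance ratio s_x^2 / s_y^2 with its two degrees of freedom."""
    return variance(x) / variance(y), len(x) - 1, len(y) - 1

def pooled_t(x, y):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return t, df

def two_sided_p_normal(z):
    """Two-sided p-value from the normal distribution (large samples only)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Made-up groups with identical spread but shifted means.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [3.0, 4.0, 5.0, 6.0, 7.0]
f, df1, df2 = f_ratio(a, b)
t, df = pooled_t(a, b)
print(f, df1, df2, t, df)
```

Here the variance ratio is 1 while the t statistic is clearly non-zero: the F test compares spread and the t-test compares location, so "same variance, different mean" is not a contradiction. Equal variances are in fact what the pooled t-test assumes.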
