RE: [ai-geostats] Sum of Estimates

2005-08-31 Thread Reid, David W



Hi Colin,

Have you checked the Fe content of your sphalerite and other mineralogy in the problem area?

But I guess from your statement "Thus, there is no combination of Zn, Pb and Fe in the estimation database that totals more than 100% total sulphide" that you have calculated the total percent for your samples using the same formula as for the estimates, so this should not be an issue.

Cheers
David Reid

-Original Message-
From: Colin Badenhorst [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 31 August 2005 8:58 PM
To: ai-geostats@unil.ch
Subject: [ai-geostats] Sum of Estimates

  Dear List,
   
  I have a rather interesting problem with my Kriged estimates for a base
  metal mine. I am estimating Zn, Pb and Fe, as percentages, the sum of which
  should total no more than 100% total sulphides.
   
  All Zn comes from sphalerite (ZnS) at 0.671 proportion. Sphalerite SG = 3.80
  All Pb comes from galena (PbS) at 0.866 proportion. Galena SG = 7.40
  All Fe comes from pyrite (FeS2) at 0.466 proportion. Pyrite SG = 4.80
  Total Sulphides = (Zn estimate x 1.4903) + (Pb estimate x 1.1547) + (Fe estimate x 2.1459)
   
  What I have discovered is that I have areas in which the total sulphides
  are greater than 100% - with very few exceptions, the total is no more than
  105% total sulphide.
   
  My estimation domains for Zn and Pb are well constrained and validated, and
  the variogram models and estimation parameters are robust, and have been
  tested and validated to ensure they match the geological expectations. My
  domains for Fe are less well constrained, but the variogram models are
  robust, as are the estimation parameters, and these also match the
  geological expectation. So, at the time of the estimation, there was very
  little I could do to improve on these. The estimation database (composited
  drillhole samples) has upper data value limits (or cut-offs, if you wish to
  use that terminology) imposed on it such that:
   
  Zn > 40% is never used to estimate a block
  Pb > 10% is never used to estimate a block
  Fe > 46.6% is never used to estimate a block
   
  Thus, there is no combination of Zn, Pb and Fe in the estimation database
  that totals more than 100% total sulphide.
   
  The areas with the anomalous (erroneous?) total sulphide summation all
  correlate, without fail, to areas of thick ore with very dominant pyrite
  content - although there are individual blocks scattered across the mine
  that buck this trend. This leads me to suspect that the Fe estimates may be
  erroneous or, simply speaking, that the Fe content is being overestimated,
  hence the total sulphide count exceeds the theoretical limit.
   
  The only solution to this problem is modifying the Fe variograms and
  estimation parameters, but currently, in my judgment, there is nothing I
  can modify that would lead to better variograms or estimation parameters.
  Of course there may be blocks where the total sulphide is actually
  underestimated, but that is impossible to determine, so the overestimates
  may balance the underestimates, in which case there is no bias, but that
  needs to be tested.
   
  Has anyone heard of similar issues on other base metal mines? In the
  absence of revisiting the estimation parameters, is there anything I can do
  to realistically address this issue?
   
  Regards,
  Colin



This message and any attached files may contain information that is confidential and/or subject of legal privilege intended only for use by the intended recipient. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, be advised that you have received this message in error and that any dissemination, copying or use of this message or attachment is strictly forbidden, as is the disclosure of the information therein. If you have received this message in error please notify the sender immediately and delete the message.
* By using the ai-geostats mailing list you agree to follow its rules 
( see http://www.ai-geostats.org/help_ai-geostats.htm )

* To unsubscribe to ai-geostats, send the following in the subject or in the 
body (plain text format) of an email message to [EMAIL PROTECTED]

Signoff ai-geostats

[ai-geostats] Pareto vs Lognormal distribution

2005-08-31 Thread Beatrice Mare-Jones
Hello list

I am a PhD student looking at developing a statistical model to predict 
the size-distribution of an area's oil and gas fields.

It is clear that previous investigators prefer either a Pareto power law 
or a lognormal distribution to approximate field-size distributions. 

The data I am using does not look like it comes from a Pareto distribution, 
which I explain as a result of undersampling; previous investigators have 
reported that such undersampling occurs because the small fields are not 
sampled or recorded.  However, by using basin-modelling software to simulate 
oil and gas fields (for the same basin that my discovered empirical data 
comes from), I notice that this sample is also undersampled - that is, fields 
under a certain size are not being simulated - which is probably due to the 
resolution of my input data. What is interesting is that the undersampling 
actually occurs throughout all the size ranges, including the medium to 
larger sizes, which I would not have expected.  Like the discovery dataset 
(n = 25), the simulated dataset (n = 140) looks like it comes from a 
lognormal distribution rather than a Pareto distribution.

My conclusion is that, without being able to say that a Pareto is better 
than a lognormal or vice versa, it appears only logical to use both 
distributions.

Geologically there does not seem to be a reason why a modal field size 
(greater than what is detectable by exploration methods) should exist - 
which would be the case if the data were from a lognormal distribution - 
except if the distribution is highly right-skewed (at the small field 
sizes) and the mode is actually just below the detection limit.

Geologically there does seem to be a reason for fields to become very small 
and still be entities (that trap oil and gas) - and this relationship may 
be better approximated by a Pareto.


The Pareto and lognormal forms are similar, but maybe one is better at 
approximating field sizes than the other.
My question is: do you think a Pareto distribution better approximates an 
oil and gas field-size distribution than a lognormal (or vice versa), and 
if so, why?


I am currently working on goodness-of-fit tests to throw some more light on 
this - but if anyone has anything to say, I'd appreciate some comments.
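
A minimal sketch of one such goodness-of-fit comparison (not part of the
original post), assuming the field sizes sit in a NumPy array; the values
below are synthetic stand-ins. It fits both candidates by maximum likelihood
with scipy.stats and compares log-likelihood, AIC and a Kolmogorov-Smirnov
statistic:

# Fit lognormal and Pareto to (stand-in) field sizes and compare the fits.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sizes = rng.lognormal(mean=3.0, sigma=1.2, size=25)   # placeholder for the discovery data

fits = {
    "lognormal": (stats.lognorm, stats.lognorm.fit(sizes, floc=0)),
    "pareto":    (stats.pareto,  stats.pareto.fit(sizes, floc=0)),
}

for name, (dist, params) in fits.items():
    loglik = np.sum(dist.logpdf(sizes, *params))
    k = len(params) - 1                      # loc was fixed at zero
    aic = 2 * k - 2 * loglik
    ks = stats.kstest(sizes, dist.cdf, args=params)
    print(f"{name:9s}  logL={loglik:8.2f}  AIC={aic:8.2f}  "
          f"KS={ks.statistic:.3f} (p={ks.pvalue:.3f})")

With only 25 fields, and with the small-field truncation described above,
such a comparison is of course only indicative.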

Thank you,

Kind regards

Beatrice 

Geological and Nuclear Sciences
New Zealand
www.gns.cri.nz


[ai-geostats] Re: Why degree of freedom is n-1

2005-08-31 Thread Isobel Clark
Hi Eric
 
What complications! You should find, in any basic statistical inference text, that the correlation is divided by (n-1) and has (n-2) degrees of freedom.
 
The logic behind this is that the correlation is actually calculated as the covariance divided by the two standard deviations. 
 
The covariance is calculated from n PAIRS of samples, not 2n individual observations and has (n-1) degrees of freedom because it uses the pair of means (m1,m2) as its centroid. 
 
Dividing by the pair (s1,s2) loses you the other degree of freedom. Tests on the correlation have (n-2) degrees of freedom.
 
If you use (say) a regression relationship with 'k' coefficients including the constraint of the means, you lose k degrees of freedom. Any book which deals with 'Analysis of variance' will explain this for you. We use exactly this approach for testing a trend surface (see free tutorial at http://geoecosse.bizland.com/softwares or download my SNARK (1977) paper from http://uk.geocities.com/drisobelclark/resume). 
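 
A small numerical illustration of the (n-1) divisor described above (not from
the original post; the bivariate-normal parameters are arbitrary stand-ins):

# Dividing the centred cross-product sum by (n - 1) gives an (essentially)
# unbiased covariance estimate; dividing by n biases it low by (n - 1)/n.
import numpy as np

rng = np.random.default_rng(1)
true_cov = 2.0
n, trials = 10, 200_000
cov_xy = np.array([[4.0, true_cov], [true_cov, 1.0]])

est_n, est_nm1 = [], []
for _ in range(trials):
    x, y = rng.multivariate_normal([0.0, 0.0], cov_xy, size=n).T
    s = np.sum((x - x.mean()) * (y - y.mean()))
    est_n.append(s / n)
    est_nm1.append(s / (n - 1))

print("true covariance          :", true_cov)
print("mean of sum/n estimates  :", round(np.mean(est_n), 3))    # ~ true * (n-1)/n
print("mean of sum/(n-1)        :", round(np.mean(est_nm1), 3))  # ~ true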
 
Hope this helps.
Isobel

[EMAIL PROTECTED] wrote:

This follow-up is slightly aside from the subject of the mailing list, but as a geologist, this is the only statistically-flavoured one I am subscribed to. Therefore:

Federico Pardo <[EMAIL PROTECTED]> said:
> Having N samples, and then n degrees of freedom.
> One degree of freedom is used (or taken) by the mean calculation.
> Then when you calculate the variance or the standard deviation, you only have left n-1 degrees of freedom.

Apart from the rigorous calculation I am aware of that, in this very case (cf. Peter Bossew's contribution on the same thread, which details it), gives a proof for this rule of thumb, what more or less rigorous statistical developments give consistency to it?

I mean, for the empirical correlation coefficient,
  rhoXiYi = SUM_i=1..N( (x_i - mx).(y_i - my) / sx / sy ) / WHAT_NUMBER
What must WHAT_NUMBER be, for a kind of unbiased estimate ("a kind of" meaning "with some eventual Fisher z-transform"...):
 * N for simplicity,
 * N-2, as I have most frequently seen in books that dare give this formula (N points, minus 1 for position and 1 for dispersion?),
 * or 2N-4 -- 2N for the (x_i,y_i), minus 4 for {mx,my,sx,sy} -- as a strict application of the rule of thumb seems to suggest?

And what about, when fitting for instance a 3-parameter non-linear function, reducing the number of degrees of freedom to N-3 (number of points, minus one for each function parameter)? I have never read any kind of explanation to support it, though it seems widely used.

Thanks in advance for enlightenment, or simply tracks to other resources of explanations.
-- Éric L.

Re: [ai-geostats] Why degree of freedom is n-1

2005-08-31 Thread Eric.Lewin
This follow-up is slightly aside from the subject of the mailing list, but
as a geologist, this is the only statistically-flavoured one I am
subscribed to. Therefore:

Federico Pardo <[EMAIL PROTECTED]> said:
> Having N samples, and then n degrees of freedom.
> One degree of freedom is used (or taken)  by the mean calculation.
> Then when you calculate the variance or the standard deviation, you only
> have left n-1 degrees of freedom.

Apart from the rigorous calculation I am aware of that, in this very case
(cf. Peter Bossew's contribution on the same thread, which details it),
gives a proof for this rule of thumb, what more or less rigorous statistical
developments give consistency to it?

I mean, for the empirical correlation coefficient,
  rhoXiYi = SUM_i=1..N( (x_i - mx).(y_i - my) / sx / sy ) / WHAT_NUMBER
What must WHAT_NUMBER be, for a kind of unbiased estimate ("a kind of" meaning
"with some eventual Fisher z-transform"...):
 * N for simplicity,
 * N-2 as I have most frequently seen in books that dare give this formula
(N points, minus 1 for position and 1 for dispersion ?),
 * or 2N-4 -- 2N for the (x_i,y_i), minus 4 for {mx,my,sx,sy} -- as a
strict application of the rule-of-thumb seems to suggest ?

And what about, when fitting for instance a 3-parameter non-linear
function, reducing the number of degrees of freedom to N-3 (number of
points, minus one for each function parameter)? I have never read any kind
of explanation to support it, though it seems widely used.

Thanks in advance for enlightenment, or simply tracks to other resources of
explanations.
-- Éric L.


[Fwd: Re: [ai-geostats] natural neighbor applied to indicator transforms]

2005-08-31 Thread Nicolas Gilardi
I'm also forwarding this answer from Dr Samy Bengio who hasn't 
subscribed to ai-geostats. His e-mail address is available at the end of 
his e-mail.


Best regards

--
Nicolas Gilardi

Particle Physics Experiment group
University of Edinburgh, JCMB
Edinburgh EH9 3JZ, United Kingdom

tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
e-mail: [EMAIL PROTECTED] ; web: http://baikal-bangkok.org/~nicolas
--- Begin Message ---

Hello,

My own contribution to the following question:


I recently attended a presentation about the mapping of soil properties.
Kriging was applied and I was wondering why a regression technique was
used instead of a classification algorithm.


It is always possible to use a regression technique to solve a classification
task, while the converse is in general much harder (although never impossible).

Now, why one should use one technique instead of another is a much wider
question. First, one has to think of the criterion that is optimized by the
underlying technique and compare it to the criterion that is sought in the
problem at hand. The better these criteria fit each other, the better suited
the technique will be. For instance, using a mean-squared-error criterion
when solving a classification task is not optimal, although it opens the door
to many possible techniques. For classification tasks, it is better to have
a criterion that minimizes the number of errors (if this is what is expected),
possibly while maximizing the distance between the classes in the feature
space (the so-called margin). Hence, SVMs are a good choice for classification.

However, some regression techniques, while not minimizing the best criterion,
offer other advantages that may prove interesting for the problem at hand,
such as smoothness, stochastic training, etc.


So where is the borderline? When are we facing a classification problem
and when is it a regression problem? I am not sure the borderline is
always that obvious.


The border between problems is in general obvious: is the target of
your task in N (discrete) or in R (continuous)? And if in N, are the elements
ordered or not? These two simple questions decide whether it is a regression
or a classification task (although you might also have other types of tasks,
such as density estimation or ranking).



-Original Message-
From: seba [mailto:[EMAIL PROTECTED]
Sent: 30 August 2005 18:17
To: ai-geostats@unil.ch
Subject: [ai-geostats] natural neighbor applied to indicator transforms


Dear list members

I would like to have some comments, suggestions or criticisms about the
following topic:
building a (preliminary) local uncertainty model of the spatial
distribution of discrete (categorical) variables by means of natural
neighbor interpolation method applied to indicator transforms.


From my perspective, interpolating indicator variables (well, in the end an
indicator variable is the probability of occurrence of a given class) by
means of a method like natural neighbor is an easy and quick way to build a
(preliminary) model of local uncertainty of the studied properties, avoiding
problems of order relation violations.
In my specific case I apply natural neighbor interpolation to indicator
transforms representing lithological classes in the same way in which
direct indicator kriging is applied. In this way, looking at the spatial
distribution of the probability of occurrence of lithologies (or at the
distribution of the lithological classes, if some classification
algorithm is applied) I can have a first idea of the spatial
distribution of lithologies. Clearly this method is utilized only as an
exploratory and preliminary data analysis tool.

Thank you in advance for your replies.

S. Trevisani






Samy Bengio
Senior Researcher in Machine Learning.
IDIAP, CP 592, rue du Simplon 4, 1920 Martigny, Switzerland.
tel: +41 27 721 77 39, fax: +41 27 721 77 12.
mailto:[EMAIL PROTECTED], http://www.idiap.ch/~bengio
--- End Message ---

Re: [ai-geostats] natural neighbor applied to indicator transforms

2005-08-31 Thread Nicolas Gilardi
To answer Gregoire's question: for some comparisons between SVM and 
Indicator Kriging, here is a very basic paper (from 1999):


http://baikal-bangkok.org/~nicolas/publi/acai99-svm.pdf

and a thesis chapter (chapter 6), perhaps more interesting (from 2002):

http://baikal-bangkok.org/~nicolas/cartann/these_gilardi.pdf

My personal feeling about the distinction between using a 
classification algorithm or a regression one is the importance you put 
on the boundaries.
If you look for smooth boundaries, with uncertainty estimates, etc., 
then a regression algorithm (like indicator kriging) is certainly a good 
approach.
Now, if you don't care much about how the categories mix together at the 
interface, or if you want clear decision boundaries, then a real 
classification algorithm (like SVM) is certainly a better choice.


However, it is true that many algorithms can be used in either case, 
often with little or no modification. The best examples are the 
algorithms for density estimation (RBF, Parzen windows...).
Algorithms in the SVM category (i.e. large-margin classifiers) are 
interesting for classification because they concentrate on finding 
a separation between classes, not on finding the "centre" of classes. In my 
opinion, the interest of this technique for regression isn't obvious...
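
A toy sketch of the two routes contrasted above (not from the original mail),
on synthetic stand-in coordinates: classify indicator-coded samples directly
with a large-margin classifier (scikit-learn's SVC), or regress the indicator
(SVR) and threshold the result afterwards:

# Classification route vs regression-then-threshold route on indicator data.
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(2)
xy = rng.uniform(0, 100, size=(60, 2))             # sample coordinates
labels = (xy[:, 0] + xy[:, 1] > 100).astype(int)   # 0/1 indicator of a class

clf = SVC(kernel="rbf", C=10.0).fit(xy, labels)                  # classification
reg = SVR(kernel="rbf", C=10.0).fit(xy, labels.astype(float))    # regression

grid = np.array([[25.0, 25.0], [50.0, 55.0], [80.0, 90.0]])
print("SVC class       :", clf.predict(grid))
print("SVR value       :", np.round(reg.predict(grid), 2))
print("SVR thresholded :", (reg.predict(grid) > 0.5).astype(int))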


Best regards,

Nico

Gregoire Dubois wrote:
I recently attended a presentation about the mapping of soil properties. 
Kriging was applied and I was wondering why a regression technique was 
used instead of a classification algorithm.
Delineating soil properties seemed to be, at first sight, more a 
classification problem than a regression case. This was at first sight 
and we didn't debate much on this issue, unfortunately.
Indicator kriging (IK) is somehow a bridge between these two issues 
(regression versus classification) and its simplicity in use and concept 
makes it very attractive to solve many problems.
Now I wonder (again) if there are some fundamental papers comparing IK 
to classification algorithms (e.g. Support Vector Machine, SVM). In the 
same way, SVM used for regression seems to be not that uncommon as well. 
So where is the borderline? When are we facing a classification problem 
and when is it a regression problem? I am not sure the borderline is 
always that obvious.
 
I am not answering Sebastiano's mail here but would be curious to see on 
this list a debate on "regression versus classification"... I presume 
there may be some material there as well regarding the issue discussed below.
 
Best regards,
 
Gregoire


--
Nicolas Gilardi

Particle Physics Experiment group
University of Edinburgh, JCMB
Edinburgh EH9 3JZ, United Kingdom

tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
e-mail: [EMAIL PROTECTED] ; web: http://baikal-bangkok.org/~nicolas


Re: [ai-geostats] Sum of Estimates

2005-08-31 Thread Marco Alfaro S.






Dear Colin:
 
The solution to your problem is co-kriging.
M. Rigidel has another solution; this solution and cokriging are discussed in:
"Cas de simplification du cokrigeage" by Georges Matheron, Paris School of Mines.
 
Regards,
 
Marco 
 
-Original Message-
 
From: Colin Badenhorst
Date: 08/31/05 09:00:13
To: ai-geostats@unil.ch
Subject: [ai-geostats] Sum of Estimates
 
Dear List,
 
I have a rather interesting problem with my Kriged estimates for a base metal mine. I am estimating Zn, Pb and Fe, as percentage, the sum of which should total to no more than 100% total sulphides.
 
All Zn comes from sphalerite (ZnS) at 0.671 proportion. Sphalerite SG = 3.80
All Pb comes from galena (PbS) at 0.866 proportion. Galena SG = 7.40
All Fe comes from pyrite (FeS2) at 0.466 proportion. Pyrite SG = 4.80
Total Sulphides = (Zn estimate x 1.4903) + (Pb estimate x 1.1547) + (Fe x 2.1459)
 
What I have discovered is that I have areas in which the total sulphides are greater than 100% - with very few exceptions, the total is no more than 105% total sulphide.
 
My estimation domains for Zn and Pb are well constrained and validated, and the variogram models and estimation parameters are robust, and have been tested and validated to ensure they match the geological expectations. My domains for Fe are less well constrained but the variogram models are robust, as are the estimation parameters, and these also match the geological expectation. So, at the time of the estimation, there was very little I could do to improve on these. The estimation database (composited drillhole samples) have upper data value limits (or cut-offs if you wish to use that terminology) imposed on them such that :
 
Zn > 40% is never used to estimate a block
Pb > 10% is never used to estimate a block
Fe > 46.6% is never used to estimate a block
 
Thus, there is no combination of Zn, Pb and Fe in the estimation database that totals more than 100% total sulphide
 
The areas with the anomalous (erroneous?) total sulphide summation all correlate, without fail, to areas of thick ore with very dominant pyrite content - there are individual blocks scattered across the mine that buck this trend. This leads me to suspect that the Fe estimates may be erroneous, or simply speaking, the Fe content is being overestimated, hence the total sulphide count exceeds the theoretical limit.
 
The only solution to this problem is modifying the Fe variograms and estimation parameters, but currently, in my judgment, there is nothing I can modify that would lead to better variograms or estimation parameters. Of course there may be blocks where the total sulphide is actually underestimated, but that is impossible to determine, so the overestimates may balance the underestimates in which case there is no bias, but that needs to be tested.
 
Has anyone heard of similar issues on other base metal mines? In the absence of revisiting the estimation parameters, is there anything I can do to realistically address this issue? 
 
Regards,
Colin 
 
 
 








[ai-geostats] Sum of Estimates

2005-08-31 Thread Colin Badenhorst



Dear List,
 
I have a rather interesting problem with my Kriged estimates for a base metal
mine. I am estimating Zn, Pb and Fe, as percentages, the sum of which should
total no more than 100% total sulphides.
 
All Zn comes from sphalerite (ZnS) at 0.671 proportion. Sphalerite SG = 3.80
All Pb comes from galena (PbS) at 0.866 proportion. Galena SG = 7.40
All Fe comes from pyrite (FeS2) at 0.466 proportion. Pyrite SG = 4.80
Total Sulphides = (Zn estimate x 1.4903) + (Pb estimate x 1.1547) + (Fe estimate x 2.1459)
 
What I have discovered is that I have areas in which the total sulphides are
greater than 100% - with very few exceptions, the total is no more than 105%
total sulphide.
 
My estimation domains for Zn and Pb are well constrained and validated, and
the variogram models and estimation parameters are robust, and have been
tested and validated to ensure they match the geological expectations. My
domains for Fe are less well constrained, but the variogram models are robust,
as are the estimation parameters, and these also match the geological
expectation. So, at the time of the estimation, there was very little I could
do to improve on these. The estimation database (composited drillhole samples)
has upper data value limits (or cut-offs, if you wish to use that terminology)
imposed on it such that:
 
Zn > 40% is never used to estimate a block
Pb > 10% is never used to estimate a block
Fe > 46.6% is never used to estimate a block
 
Thus, there is no combination of Zn, Pb and Fe in the estimation database that
totals more than 100% total sulphide.
 
The areas with the anomalous (erroneous?) total sulphide summation all
correlate, without fail, to areas of thick ore with very dominant pyrite
content - although there are individual blocks scattered across the mine that
buck this trend. This leads me to suspect that the Fe estimates may be
erroneous or, simply speaking, that the Fe content is being overestimated,
hence the total sulphide count exceeds the theoretical limit.
 
The only solution to this problem is modifying the Fe variograms and
estimation parameters, but currently, in my judgment, there is nothing I can
modify that would lead to better variograms or estimation parameters. Of
course there may be blocks where the total sulphide is actually
underestimated, but that is impossible to determine, so the overestimates may
balance the underestimates, in which case there is no bias, but that needs to
be tested.
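 
One way to run that test (not part of the original post) is simply to
recompute the total-sulphide sum over the block model and summarise the
overshoot; a minimal sketch with synthetic stand-in block values, where in
practice the kriged Zn/Pb/Fe block estimates would be read in instead:

# Flag blocks whose total sulphide exceeds 100%.
# Factors are 1/0.671, 1/0.866 and 1/0.466 as in the formula above.
import numpy as np

rng = np.random.default_rng(4)
zn = rng.uniform(0, 40, size=10_000)      # Zn % block estimates (capped at 40%)
pb = rng.uniform(0, 10, size=10_000)      # Pb % block estimates (capped at 10%)
fe = rng.uniform(0, 46.6, size=10_000)    # Fe % block estimates (capped at 46.6%)

total = zn * 1.4903 + pb * 1.1547 + fe * 2.1459   # total sulphides, %

over = total > 100.0
print(f"blocks over 100%: {over.sum()} of {total.size} "
      f"({100 * over.mean():.1f}%), worst = {total.max():.1f}%")
print(f"mean total sulphide: {total.mean():.1f}%")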
 
Has anyone heard of similar issues on other base metal mines? In the absence
of revisiting the estimation parameters, is there anything I can do to
realistically address this issue?
 
Regards,
Colin 
 
 

RE: [ai-geostats] natural neighbor applied to indicator transforms

2005-08-31 Thread seba


Hi Gregoire
Well, I think that classification could be viewed as a way of coding
the information in the sampled areas. In particular, for soil
properties, continuous or fuzzy classification seems to work properly.
Then, leaving aside the non-convexity of kriging, we can
interpolate before or after performing classification. But after all,
classification algorithms are also a regression problem...
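
A quick sketch of this indicator-interpolation idea (not from the original
mail), on synthetic stand-in samples. SciPy has no natural-neighbour
interpolator, so linear barycentric interpolation via
scipy.interpolate.griddata stands in for it here; the workflow is the same:

# Indicator-transform each lithology class and interpolate it to get a rough
# probability-of-occurrence map, then take the most likely class per node.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(3)
pts = rng.uniform(0, 100, size=(80, 2))    # sample locations
litho = rng.integers(0, 3, size=80)        # three lithological classes

gx, gy = np.mgrid[0:100:50j, 0:100:50j]    # interpolation grid

probs = []
for k in range(3):
    indicator = (litho == k).astype(float)                      # 0/1 indicator
    probs.append(griddata(pts, indicator, (gx, gy), method="linear"))
probs = np.array(probs)                                         # (3, 50, 50)

total = probs.sum(axis=0)
valid = np.isfinite(total) & (total > 0)   # nodes inside the data hull
most_likely = np.full(total.shape, -1)     # -1 = not interpolated
most_likely[valid] = np.argmax(probs[:, valid], axis=0)
print("classified grid nodes:", int(valid.sum()), "of", total.size)
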
Bye
Sebastiano
At 11.25 31/08/2005, Gregoire Dubois wrote:

I recently attended a
presentation about the mapping of soil properties. Kriging was applied
and I was wondering why a regression technique was used instead of a
classification algorithm. 
Delineating soil properties seemed to be, at first sight, more a
classification problem than a regression case. This was at first sight
and we didn't debate much on this issue, unfortunately.
Indicator kriging (IK) is somehow a bridge between these two issues
(regression versus classification) and its simplicity in use and concept
makes it very attractive to solve many problems. 
Now I wonder (again) if there are some fundamental papers comparing IK to
classification algorithms (e.g. Support Vector Machine, SVM). In the same
way, SVM used for regression seems to be not that uncommon as well. So
where is the borderline? When are we facing a classification problem and
when is it a regression problem? I am not sure the borderline is always
that obvious.
 
I am not answering Sebastiano's
mail here but would be curious to see on this list a debate on
"regression versus classification"... I presume there may be
some material there as well regarding the issue discussed below.
 
Best regards,
 
Gregoire


-Original Message-

From: seba
[
mailto:[EMAIL PROTECTED]] 

Sent: 30 August 2005 18:17

To: ai-geostats@unil.ch

Subject: [ai-geostats] natural neighbor applied to indicator
transforms 

Dear list members

I would like to have some comments, suggestions or criticisms about the
following topic:

building a (preliminary) local uncertainty model of the spatial
distribution of discrete (categorical) variables by means of natural
neighbor interpolation method applied to indicator transforms.

From my perspective, interpolating indicator variables (well,
in the end an indicator variable is the probability of occurrence of a
given class) by means of a method like natural neighbor is an easy and
quick way to build a (preliminary) model of local uncertainty of the
studied properties, avoiding problems of order relation violations.

In my specific case I apply natural neighbor interpolation to
indicator transforms representing lithological classes in the same way in
which direct indicator kriging is applied. In this way, looking at the
spatial distribution of the probability of occurrence of lithologies (or
at the distribution of the lithological classes, if some classification
algorithm is applied) I can have a first idea of the spatial distribution
of lithologies. Clearly this method is utilized only as an explorative
and preliminary data analysis tool.

Thank you in advance for your replies.

 

S. Trevisani




RE: [ai-geostats] natural neighbor applied to indicator transforms

2005-08-31 Thread Gregoire Dubois



I recently attended a presentation about the mapping of soil properties.
Kriging was applied and I was wondering why a regression technique was used
instead of a classification algorithm.
Delineating soil properties seemed to be, at first sight, more a
classification problem than a regression case. This was at first sight and we
didn't debate much on this issue, unfortunately.
Indicator kriging (IK) is somehow a bridge between these two issues
(regression versus classification) and its simplicity in use and concept makes
it very attractive for solving many problems.
Now I wonder (again) if there are some fundamental papers comparing IK to
classification algorithms (e.g. Support Vector Machine, SVM). In the same way,
SVM used for regression seems to be not that uncommon as well. So where is the
borderline? When are we facing a classification problem and when is it a
regression problem? I am not sure the borderline is always that obvious.
 
I am not answering Sebastiano's mail here but would be curious to see on this
list a debate on "regression versus classification"... I presume there may be
some material there as well regarding the issue discussed below.
 
Best regards,
 
Gregoire

  
  -Original Message-
  From: seba [mailto:[EMAIL PROTECTED]]
  Sent: 30 August 2005 18:17
  To: ai-geostats@unil.ch
  Subject: [ai-geostats] natural neighbor applied to indicator transforms

  Dear list members

  I would like to have some comments, suggestions or criticisms about the
  following topic: building a (preliminary) local uncertainty model of the
  spatial distribution of discrete (categorical) variables by means of
  natural neighbor interpolation applied to indicator transforms.

  From my perspective, interpolating indicator variables (well, in the end an
  indicator variable is the probability of occurrence of a given class) by
  means of a method like natural neighbor is an easy and quick way to build a
  (preliminary) model of local uncertainty of the studied properties,
  avoiding problems of order relation violations.

  In my specific case I apply natural neighbor interpolation to indicator
  transforms representing lithological classes in the same way in which
  direct indicator kriging is applied. In this way, looking at the spatial
  distribution of the probability of occurrence of lithologies (or at the
  distribution of the lithological classes, if some classification algorithm
  is applied) I can have a first idea of the spatial distribution of
  lithologies. Clearly this method is utilized only as an exploratory and
  preliminary data analysis tool.

  Thank you in advance for your replies.

  S. Trevisani