RE: [ai-geostats] Sum of Estimates
Hi Colin,

Have you checked the Fe content of your sphalerite and the other mineralogy in the problem area? But I guess from your statement "Thus, there is no combination of Zn, Pb and Fe in the estimation database that totals more than 100% total sulphide" that you have calculated the total percentage for your samples using the same formula as for the estimates, so this should not be an issue.

Cheers,
David Reid

-----Original Message-----
From: Colin Badenhorst [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 31 August 2005 8:58 PM
To: ai-geostats@unil.ch
Subject: [ai-geostats] Sum of Estimates

[Colin's message, quoted here in full in the source, appears in its own right later in this digest.]

This message and any attached files may contain information that is confidential and/or subject of legal privilege intended only for use by the intended recipient. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, be advised that you have received this message in error and that any dissemination, copying or use of this message or attachment is strictly forbidden, as is the disclosure of the information therein.
If you have received this message in error please notify the sender immediately and delete the message.

* By using the ai-geostats mailing list you agree to follow its rules ( see http://www.ai-geostats.org/help_ai-geostats.htm )
* To unsubscribe to ai-geostats, send the following in the subject or in the body (plain text format) of an email message to [EMAIL PROTECTED] Signoff ai-geostats
[ai-geostats] Pareto vs Lognormal distribution
Hello list,

I am a PhD student looking at developing a statistical model to predict the size distribution of an area's oil and gas fields. It is clear that previous investigators prefer either a Pareto power law or a lognormal distribution to approximate field-size distributions.

The data I am using does not look like it comes from a Pareto distribution, which I explain as being a result of undersampling; as previous investigators have reported, undersampling occurs because the small fields are not sampled or recorded. However, by using basin-modelling software to simulate oil and gas fields (for the same basin that my discovered empirical data comes from), I notice that this sample is also undersampled; that is, fields under a certain size are not being simulated, which is probably due to the resolution of my input data. What is interesting is that the undersampling actually occurs throughout all the size ranges, including the medium to larger sizes, which I would not have expected. Like the discovery dataset (n = 25), the simulated dataset (n = 140) looks like it comes more from a lognormal distribution than a Pareto distribution. My conclusion is that, without being able to say that a Pareto is better than a lognormal or vice versa, it appears only logical to use both distributions.

Geologically, there does not seem to be a reason why a modal field size (greater than what is detectable by exploration methods) should exist, which would be the case if the data were from a lognormal distribution, except if the distribution is highly right-skewed (at the small field sizes) and the mode is actually just below the detection limit. Geologically, there does seem to be reason for fields to become ever smaller while remaining entities (that trap oil and gas), and this relationship may be better approximated by a Pareto. The Pareto and lognormal forms are similar, but maybe one is better than the other at approximating field sizes.
My question is: do you think a Pareto distribution better approximates an oil and gas field-size distribution than a lognormal (or vice versa), and if so, why? I am currently working on goodness-of-fit tests to throw some more light on this, but if anyone has anything to say, I'd appreciate some comments.

Thank you, kind regards,
Beatrice
Geological and Nuclear Sciences, New Zealand
www.gns.cri.nz
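The comparison Beatrice describes can be sketched numerically: fit both candidate models by maximum likelihood and compare log-likelihoods. This is only a sketch; the field sizes below are hypothetical stand-ins for her n = 25 discovery dataset, and the Pareto threshold is fixed at the smallest observed field, a common convention when the detection threshold is unknown.

```python
import math
import random

# Hypothetical field sizes (arbitrary units) standing in for the n = 25
# discovery dataset; real data would be loaded here instead.
random.seed(7)
sizes = [math.exp(random.gauss(3.0, 1.2)) for _ in range(25)]
n = len(sizes)

# Lognormal MLE: mean and standard deviation of the log sizes.
logs = [math.log(s) for s in sizes]
mu = sum(logs) / n
sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
ll_lognorm = sum(-math.log(s * sigma * math.sqrt(2 * math.pi))
                 - (math.log(s) - mu) ** 2 / (2 * sigma ** 2)
                 for s in sizes)

# Pareto MLE with threshold x_m fixed at the smallest observed field:
# alpha_hat = n / sum(log(x_i / x_m)).
xm = min(sizes)
alpha = n / sum(math.log(s / xm) for s in sizes)
# Pareto log-density: log f(x) = log(alpha) + alpha*log(x_m) - (alpha+1)*log(x)
ll_pareto = sum(math.log(alpha) + alpha * math.log(xm)
                - (alpha + 1) * math.log(s) for s in sizes)

print(f"log-likelihood lognormal: {ll_lognorm:.1f}, Pareto: {ll_pareto:.1f}")
```

With only 25 observations the comparison is fragile either way, which is one argument for carrying both models forward, as the post suggests.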
[ai-geostats] Re: Why degree of freedom is n-1
Hi Eric,

What complications! You should find, in any basic text on statistical inference, that the correlation is divided by (n-1) and has (n-2) degrees of freedom. The logic behind this is that the correlation is actually calculated as the covariance divided by the two standard deviations. The covariance is calculated from n PAIRS of samples, not 2n individual observations, and has (n-1) degrees of freedom because it uses the pair of means (m1, m2) as its centroid. Dividing by the pair (s1, s2) loses you the other degree of freedom. Tests on the correlation have (n-2) degrees of freedom.

If you use (say) a regression relationship with 'k' coefficients including the constraint of the means, you lose k degrees of freedom. Any book which deals with 'analysis of variance' will explain this for you. We use exactly this approach for testing a trend surface (see the free tutorial at http://geoecosse.bizland.com/softwares or download my SNARK (1977) paper from http://uk.geocities.com/drisobelclark/resume).

Hope this helps,
Isobel

[EMAIL PROTECTED] wrote:
> [Éric's message, quoted here in full in the source, appears next in this digest.]
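Federico's rule of thumb (one degree of freedom lost to the mean) can be checked numerically. A minimal stdlib-Python simulation, with all values assumed purely for illustration: averaged over many small samples from a population of known variance, dividing the sum of squares by n underestimates the variance by the factor (n-1)/n, while dividing by (n-1) is unbiased.

```python
import random

random.seed(42)
TRUE_VAR = 1.0       # population variance
N = 5                # sample size per trial
TRIALS = 100_000     # number of repeated samples

sum_div_n, sum_div_n1 = 0.0, 0.0
for _ in range(TRIALS):
    xs = [random.gauss(0.0, TRUE_VAR ** 0.5) for _ in range(N)]
    m = sum(xs) / N
    ss = sum((x - m) ** 2 for x in xs)   # sum of squares about the sample mean
    sum_div_n += ss / N                  # biased: expectation is (N-1)/N * var
    sum_div_n1 += ss / (N - 1)           # unbiased: expectation is var

print(round(sum_div_n / TRIALS, 3))      # close to (N-1)/N * 1.0 = 0.8
print(round(sum_div_n1 / TRIALS, 3))     # close to 1.0
```

The bias appears precisely because the deviations are measured from the sample mean rather than the true mean, which is the "degree of freedom used by the mean calculation" in Federico's phrasing.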
Re: [ai-geostats] Why degree of freedom is n-1
This follow-up is slightly aside the subject line of the mailing list, but as a geologist, this is the only statistically-flavoured one I am subscribed to. Therefore:

Federico Pardo <[EMAIL PROTECTED]> said:
> Having N samples, and then n degrees of freedom.
> One degree of freedom is used (or taken) by the mean calculation.
> Then when you calculate the variance or the standard deviation, you only
> have left n-1 degrees of freedom.

Apart from a rigorous calculation I am aware of that gives a proof of this rule of thumb in this very case (cf. Peter Bossew's contribution on the same thread, which details it), what more or less rigorous statistical developments give consistency to it? I mean, for the empirical correlation coefficient,

rhoXiYi = SUM_i=1..N( (x_i - mx).(y_i - my) / sx / sy ) / WHAT_NUMBER

what must WHAT_NUMBER be, for a kind of unbiased estimate ("a kind of" meaning "with some eventual Fisher z-transform"...):

* N, for simplicity;
* N-2, as I have most frequently seen in books that dare give this formula (N points, minus 1 for position and 1 for dispersion?);
* or 2N-4 -- 2N for the (x_i, y_i), minus 4 for {mx, my, sx, sy} -- as a strict application of the rule of thumb seems to suggest?

And what about, when fitting for instance a 3-parameter non-linear function, reducing the number of degrees of freedom to N-3 (number of points, minus one for each function parameter)? I have never read any kind of explanation to support it, though it seems widely used.

Thanks in advance for enlightenment, or simply pointers to other sources of explanation.

-- Éric L.
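Isobel's answer (WHAT_NUMBER = n-1) can be checked directly: when the covariance and both standard deviations are all computed with an (n-1) divisor, the divisors cancel and the result equals the divisor-free product-moment formula. A small stdlib sketch with assumed toy data:

```python
import math
import random

# Toy paired data: y is x plus noise, so the correlation is strongly positive.
random.seed(0)
n = 30
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x + random.gauss(0, 0.5) for x in xs]

mx, my = sum(xs) / n, sum(ys) / n
# Standard deviations with the (n-1) divisor.
sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))

# Éric's formula with WHAT_NUMBER = n-1.
r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sx / sy / (n - 1)

# The same value from the divisor-free product-moment form.
r2 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      / math.sqrt(sum((x - mx) ** 2 for x in xs)
                  * sum((y - my) ** 2 for y in ys)))

print(round(r, 6), round(r2, 6))   # identical up to rounding
```

This shows why N and 2N-4 cannot be right if the standard deviations already carry the (n-1) divisor: any other choice would rescale r away from the product-moment value, which is the quantity whose tests carry (n-2) degrees of freedom.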
[Fwd: Re: [ai-geostats] natural neighbor applied to indicator transforms]
I'm also forwarding this answer from Dr Samy Bengio, who hasn't subscribed to ai-geostats. His e-mail address is available at the end of his e-mail.

Best regards

--
Nicolas Gilardi
Particle Physics Experiment group
University of Edinburgh, JCMB
Edinburgh EH9 3JZ, United Kingdom
tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
e-mail: [EMAIL PROTECTED] ; web: http://baikal-bangkok.org/~nicolas

--- Begin Message ---

Hello,

My own contribution to the following question:

> I recently attended a presentation about the mapping of soil properties. Kriging was applied and I was wondering why a regression technique was used instead of a classification algorithm.

It is always possible to use a regression technique to solve a classification task, while the converse is in general much harder (although never impossible). Now, why one should use one technique instead of another is a much wider question. First, one has to think of the criterion that is optimized by the underlying technique and compare it to the criterion that is sought in the problem at hand. The better these criteria fit each other, the better suited the technique will be. For instance, using a mean-squared-error criterion when solving a classification task is not optimal, although it opens the door to many possible techniques. For classification tasks, it is better to have a criterion that minimizes the number of errors (if this is what is expected), possibly while maximizing the distance between the classes in the feature space (the so-called margin). Hence, SVMs are a good choice for classification. However, some regression techniques, while not minimizing the best criterion, offer other advantages that may prove interesting for the problem at hand, such as smoothness, stochastic training, etc.

> So where is the borderline? When are we facing a classification problem and when is it a regression problem? I am not sure the borderline is always that obvious.
The border between problems is in general obvious: is the target of your task in N or in R? And if in N, are the elements ordered or not? These two simple questions decide whether it is a regression or a classification task (although you might also have other types of task, such as density estimation or ranking).

-----Original Message-----
From: seba [mailto:[EMAIL PROTECTED]]
Sent: 30 August 2005 18:17
To: ai-geostats@unil.ch
Subject: [ai-geostats] natural neighbor applied to indicator transforms

Dear list members,

I would like to have some comments, suggestions or critiques about the following topic: building a (preliminary) local uncertainty model of the spatial distribution of discrete (categorical) variables by means of the natural neighbor interpolation method applied to indicator transforms.

From my perspective, interpolating indicator variables (well, in the end an indicator variable is the probability of occurrence of a given class) by means of a method like natural neighbor is an easy and quick way to build a (preliminary) model of local uncertainty of the studied properties, avoiding problems of order-relation violations. In my specific case, I apply natural neighbor interpolation to indicator transforms representing lithological classes, in the same way in which direct indicator kriging is applied. In this way, looking at the spatial distribution of the probability of occurrence of lithologies (or at the distribution of the lithological classes, if some classification algorithm is applied), I can have a first idea of the spatial distribution of lithologies. Clearly, this method is utilized only as an explorative and preliminary data-analysis tool.

Thank you in advance for your replies.

S. Trevisani

Samy Bengio
Senior Researcher in Machine Learning.
IDIAP, CP 592, rue du Simplon 4, 1920 Martigny, Switzerland.
tel: +41 27 721 77 39, fax: +41 27 721 77 12.
mailto:[EMAIL PROTECTED], http://www.idiap.ch/~bengio

--- End Message ---
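Dr Bengio's first point, that a regression technique can always solve a classification task, can be illustrated with a minimal sketch: ordinary least squares fitted to +/-1 class labels, then thresholded at zero. The 1-D two-class data below are assumed toy values, not anything from the thread.

```python
import random

# Toy two-class data: class -1 centred at -2, class +1 centred at +2.
random.seed(1)
data = [(random.gauss(-2, 1), -1) for _ in range(100)] + \
       [(random.gauss(+2, 1), +1) for _ in range(100)]
n = len(data)

# Closed-form least squares for y = a*x + b, treating the labels as
# regression targets (the suboptimal MSE criterion Bengio mentions).
sx = sum(x for x, _ in data)
sy = sum(y for _, y in data)
sxx = sum(x * x for x, _ in data)
sxy = sum(x * y for x, y in data)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

# Classify by the sign of the regression output.
errors = sum(1 for x, y in data
             if (1 if a * x + b >= 0 else -1) != y)
print(f"training errors: {errors} / {n}")
```

The thresholded regression separates the classes well here, even though MSE is not the criterion one actually cares about; a margin-based classifier such as an SVM optimizes the classification criterion more directly, which is the distinction the message draws.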
Re: [ai-geostats] natural neighbor applied to indicator transforms
To answer Gregoire's question, for some comparisons between SVM and indicator kriging, here is a very basic paper (from 1999): http://baikal-bangkok.org/~nicolas/publi/acai99-svm.pdf and a thesis chapter (chapter 6), perhaps more interesting (from 2002): http://baikal-bangkok.org/~nicolas/cartann/these_gilardi.pdf

My personal feeling about the distinction between using a classification algorithm and a regression one is the importance you put on the boundaries. If you look for smooth boundaries, with uncertainty estimates, etc., then a regression algorithm (like indicator kriging) is certainly a good approach. Now, if you don't care much about how the categories mix together at the interface, or if you want clear decision boundaries, then a real classification algorithm (like SVM) is certainly a better choice. However, it is true that many algorithms can be used in either case, often with small or no modification. The best examples are the algorithms for density estimation (RBF, Parzen windows...).

Algorithms of the SVM category (i.e. large-margin classifiers) are interesting for classification because they concentrate on finding a separation between classes, not on finding the "centre" of classes. In my opinion, the interest of this technique for regression isn't obvious...

Best regards,
Nico

Gregoire Dubois wrote:
> [Gregoire's message, quoted here in full in the source, appears later in this digest.]

--
Nicolas Gilardi
Particle Physics Experiment group
University of Edinburgh, JCMB
Edinburgh EH9 3JZ, United Kingdom
tel: +44 (0)131 650 5300 ; fax: +44 (0)131 650 7189
e-mail: [EMAIL PROTECTED] ; web: http://baikal-bangkok.org/~nicolas
Re: [ai-geostats] Sum of Estimates
Dear Colin:

The solution to your problem is cokriging. M. Rigidel has another solution; this solution and the cokriging one are discussed in "Cas de simplification du cokrigeage" by Georges Matheron, Paris School of Mines.

Regards,
Marco

-----Original Message-----
From: Colin Badenhorst
Date: 08/31/05 09:00:13
To: ai-geostats@unil.ch
Subject: [ai-geostats] Sum of Estimates

[Colin's message, quoted here in full in the source, appears next in this digest.]
[ai-geostats] Sum of Estimates
Dear List,

I have a rather interesting problem with my kriged estimates for a base metal mine. I am estimating Zn, Pb and Fe, as percentages, the sum of which should total no more than 100% total sulphides.

All Zn comes from sphalerite (ZnS) at 0.671 proportion. Sphalerite SG = 3.80
All Pb comes from galena (PbS) at 0.866 proportion. Galena SG = 7.40
All Fe comes from pyrite (FeS2) at 0.466 proportion. Pyrite SG = 4.80

Total Sulphides = (Zn estimate x 1.4903) + (Pb estimate x 1.1547) + (Fe estimate x 2.1459)

What I have discovered is that I have areas in which the total sulphides are greater than 100%; with very few exceptions, the total is no more than 105% total sulphide.

My estimation domains for Zn and Pb are well constrained and validated, and the variogram models and estimation parameters are robust, and have been tested and validated to ensure they match the geological expectations. My domains for Fe are less well constrained, but the variogram models are robust, as are the estimation parameters, and these also match the geological expectation. So, at the time of the estimation, there was very little I could do to improve on these.

The estimation database (composited drillhole samples) has upper data value limits (or cut-offs, if you wish to use that terminology) imposed on it such that:

Zn > 40% is never used to estimate a block
Pb > 10% is never used to estimate a block
Fe > 46.6% is never used to estimate a block

Thus, there is no combination of Zn, Pb and Fe in the estimation database that totals more than 100% total sulphide.

The areas with the anomalous (erroneous?) total sulphide summation all correlate, without fail, to areas of thick ore with very dominant pyrite content; there are individual blocks scattered across the mine that buck this trend. This leads me to suspect that the Fe estimates may be erroneous, or simply speaking, the Fe content is being overestimated, hence the total sulphide count exceeds the theoretical limit.
The only solution to this problem is modifying the Fe variograms and estimation parameters, but currently, in my judgment, there is nothing I can modify that would lead to better variograms or estimation parameters. Of course, there may be blocks where the total sulphide is actually underestimated, but that is impossible to determine, so the overestimates may balance the underestimates, in which case there is no bias; but that needs to be tested.

Has anyone heard of similar issues on other base metal mines? In the absence of revisiting the estimation parameters, is there anything I can do to realistically address this issue?

Regards,
Colin
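The grade-to-sulphide conversion Colin describes can be sketched as a small check script. The conversion factors are simply the reciprocals of the stated metal proportions in each mineral; the block values below are hypothetical, chosen to show how close the sample cut-offs sit to the 100% ceiling.

```python
# Reciprocals of the metal proportions in each sulphide mineral.
ZN_TO_SPHALERITE = 1 / 0.671   # ~1.4903
PB_TO_GALENA     = 1 / 0.866   # ~1.1547
FE_TO_PYRITE     = 1 / 0.466   # ~2.1459

def total_sulphides(zn_pct, pb_pct, fe_pct):
    """Convert metal grades (%) to an implied total sulphide percentage."""
    return (zn_pct * ZN_TO_SPHALERITE
            + pb_pct * PB_TO_GALENA
            + fe_pct * FE_TO_PYRITE)

def flag_overshoot(blocks, limit=100.0):
    """Return blocks whose implied sulphide total exceeds the limit."""
    return [b for b in blocks
            if total_sulphides(b["zn"], b["pb"], b["fe"]) > limit]

# The Fe cut-off alone already implies 100% pyrite:
print(round(total_sulphides(0.0, 0.0, 46.6), 2))  # -> 100.0

# Hypothetical block estimates (not Colin's data):
blocks = [{"zn": 40.0, "pb": 10.0, "fe": 46.6},
          {"zn": 30.0, "pb": 8.0, "fe": 46.0}]
for b in flag_overshoot(blocks):
    print(b, round(total_sulphides(b["zn"], b["pb"], b["fe"]), 2))
```

Note that the cut-offs constrain each variable individually, not their sum: a block kriged toward the Zn, Pb and Fe cut-offs simultaneously can still overshoot 100%, because the three metals are estimated independently, which is why replies in this thread point toward jointly constrained approaches such as cokriging.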
RE: [ai-geostats] natural neighbor applied to indicator transforms
Hi Gregoire,

Well, I think that classification can be viewed as a way of coding the information in sampled areas. In particular, for soil properties, continuous or fuzzy classification seems to work properly. Then, leaving aside the non-convexity of kriging, we can interpolate before or after performing classification. But after all, classification algorithms are also a regression problem...

Bye,
Sebastiano

At 11.25 31/08/2005, Gregoire Dubois wrote:
> [Gregoire's message, quoted here in full in the source, appears later in this digest.]
RE: [ai-geostats] natural neighbor applied to indicator transforms
I recently attended a presentation about the mapping of soil properties. Kriging was applied, and I was wondering why a regression technique was used instead of a classification algorithm. Delineating soil properties seemed to be, at first sight, more a classification problem than a regression case. This was at first sight, and we didn't debate much on this issue, unfortunately.

Indicator kriging (IK) is somehow a bridge between these two issues (regression versus classification), and its simplicity in use and concept makes it very attractive for solving many problems. Now I wonder (again) if there are some fundamental papers comparing IK to classification algorithms (e.g. Support Vector Machines, SVM). In the same way, SVM used for regression seems to be not that uncommon either. So where is the borderline? When are we facing a classification problem, and when is it a regression problem? I am not sure the borderline is always that obvious.

I am not answering Sebastiano's mail here, but I would be curious to see on this list a debate on "regression versus classification"... I presume there may be some material as well regarding the issue discussed below.
Best regards,
Gregoire

-----Original Message-----
From: seba [mailto:[EMAIL PROTECTED]]
Sent: 30 August 2005 18:17
To: ai-geostats@unil.ch
Subject: [ai-geostats] natural neighbor applied to indicator transforms

[S. Trevisani's message, quoted here in full in the source, appears earlier in this digest.]
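The indicator approach S. Trevisani describes can be sketched in a few lines: one 0/1 indicator per lithology class, interpolated to give class-occurrence probabilities at unsampled locations. True natural neighbour weights require a Voronoi construction, so this sketch substitutes inverse-distance weights purely for illustration; the sample locations and classes are assumed toy data.

```python
import math

# Toy categorical samples: (location, lithology class).
samples = [((0.0, 0.0), "sand"), ((1.0, 0.0), "clay"),
           ((0.0, 1.0), "sand"), ((1.0, 1.0), "silt")]
classes = sorted({c for _, c in samples})

def class_probabilities(target, power=2.0):
    """Interpolate the indicator of each class at a target location.

    Inverse-distance weights stand in for natural neighbour weights;
    any convex (non-negative, sum-to-one) weighting scheme behaves the
    same way with respect to order relations.
    """
    weights = []
    for (x, y), _ in samples:
        d = math.hypot(target[0] - x, target[1] - y)
        weights.append(1.0 / (d ** power + 1e-12))  # guard against d = 0
    wsum = sum(weights)
    probs = {}
    for cls in classes:
        indicator = [1.0 if c == cls else 0.0 for _, c in samples]
        probs[cls] = sum(w * i for w, i in zip(weights, indicator)) / wsum
    return probs

p = class_probabilities((0.25, 0.25))
print({k: round(v, 3) for k, v in p.items()})
```

Because the weights are non-negative and sum to one, the interpolated probabilities are automatically non-negative and sum to one, which is exactly the "no order-relation violations" property the post highlights as an advantage over raw indicator kriging.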