Dear members of the list,

Below is the summary of all answers to the sparse data problem (sorry for the delay, I was away from my email for a while). Thank you all for the interesting and meaningful answers. I'll let you know about further developments in solving the problem.
Happy Holidays and season's greetings to everybody.

With much appreciation,
Gali

From: "Isobel Clark" <[EMAIL PROTECTED]>
Subject: AI-GEOSTATS: Re: sparse data problem
To: "Marcel_Vallée" <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]

Everybody (especially Gali!)

Just to put the base case in perspective. Many half-billion dollar projects in Southern Africa have been evaluated and floated on the stock exchange on the basis of 5 or 6 holes. When a sample costs a couple of million dollars to acquire, there is little point in hoping for more.

We use an extremely well sampled case in our (free) tutorial analyses. Look for the GASA data, which has 27 samples. An embarrassment of riches in the mid-1980s, I can assure you.

Isobel Clark
http://geoecosse.bizland.com/softwares

Date: Fri, 05 Dec 2003 13:20:07 -0700
From: "Donald E. Myers" <[EMAIL PROTECTED]>
To: "Gali Sirkis" <[EMAIL PROTECTED]>
Subject: Re: AI-GEOSTATS: sparse data problem

Gali

For your information: there is no difference between RBF and kriging; the multiquadric is simply a particular choice of a generalized covariance. In the geostatistics literature, the RBF would be called "dual kriging".

Donald E. Myers
http://www.u.arizona.edu/~donaldm

Date: Fri, 05 Dec 2003 14:11:42 -0500
From: "Marcel_Vallée" <[EMAIL PROTECTED]>
To: "Gali Sirkis" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: AI-GEOSTATS: sparse data problem

Gali

Sorry for not responding earlier to your request. Your explanatory comment to Monica does not convince me as an exploration and mining geologist. I think her comments are wise and should be considered.

A 20x30 km area is a large one even when dealing with very uniform geology. Even in such conditions, different properties may be encountered, such as faults, vein or fracture systems, small intrusive bodies, mineral showings or deposits, pollution zones, etc.
Such a small sample set as you have ["few (5-6) original data points + interpolated external data, that covering whole study area"] does not allow you to really appraise the validity and/or the geological cause of this "outlier." (There might also be a sampling or assaying cause.) In such a case, it should be shown as an anomaly, not averaged out or kriged out. Excluding sampling/analytical problems, the outlier only has a "detection" value, meaning that the geology is not as uniform as expected and that additional geological observations and sampling in the vicinity are required to elucidate this problem.

We should view geostatistics as an ancillary tool to understand a two- or three-dimensional "geological universe." Whenever data are as sparse as in your example, kriged values should not replace and/or eliminate the potential meaning of sparse field observations.

Sincerely
Marcel Vallée

========================
Marcel Vallée Eng., Geo.
Géoconseil Marcel Vallée Inc.
706 Routhier St
Québec, Québec, Canada G1X 3J9
Tel: (1) 418, 652, 3497
Email: [EMAIL PROTECTED]

Date: Thu, 04 Dec 2003 18:52:47 +0100
From: "Umberto Fracassi" <[EMAIL PROTECTED]>
To: "Gali Sirkis" <[EMAIL PROTECTED]>
Subject: Re: AI-GEOSTATS: sparse data problem

Hi Gali..

I got the info by accessing the algorithm description in the Surfer 7.0 help. That's the best reference I can offer:

CARLSON R.E. and FOLEY T.A., 1991, Radial Basis Interpolation Methods on Track Data, Lawrence Livermore National Laboratory, UCRL-JC-1074238

I found it by launching a search on Google...

Hope it helps!

Ciao,
Umberto

Date: Wed, 03 Dec 2003 13:47:37 -0500
From: "Yetta Jager" <[EMAIL PROTECTED]>
Subject: Re: AI-GEOSTATS: sparse data problem
To: "Gali Sirkis" <[EMAIL PROTECTED]>

Hi Gali,

I'd say 5 points isn't enough even for kriging with an external drift, as one would need more than that for a regression. If you can get more data, say 25 points or so, that would be a feasible solution.
However, since the more common data is already interpolated, it's not clear why a kriging model would be substituted for it -- just use your regression directly to estimate the sparse variable.

Don't shoot the messenger!

Yetta

From: "Monica Palaseanu-Lovejoy" <[EMAIL PROTECTED]>
To: "Gali Sirkis" <[EMAIL PROTECTED]>
Date: Wed, 3 Dec 2003 18:39:30 -0000
Subject: Re: AI-GEOSTATS: sparse data problem

Hi Gali,

Now I have even more questions ;-)

If the dataset from which you have the interpolated data and your own data set represent the same phenomenon, then why don't you add your data to the "original" data which was already kriged (but not the interpolated values), and use this new data set for kriging? Of course, if you don't know these "original data" then ..... maybe you also have the kriging standard deviation data. You can probably safely hope that the points for which these kriging errors are minimal are your "original" points, or very close to the original ones. Now I guess you need to do some "digging" in the literature to be sure this is a feasible idea.

Aside from that, you have to take into consideration the fact that, no matter which method of kriging you use, the extrapolated data have higher errors (usually) than the interpolated ones. In fact, if simple kriging was used, the extrapolated data at distances greater than the range will tend to the distribution mean, while for ordinary kriging they will tend to the local neighbourhood mean. If you used universal kriging then you may get very unrealistic results for extrapolated data, because they depend heavily on the local trend modelled for that neighbourhood. So ... in any case it is not a happy situation.
If I were you and had time on my hands, I would use the first set of data (the interpolated one) and try, to the best of my knowledge, to extrapolate it over the area where you have your 6 values, and afterwards I would look at the difference between the inferred data and the "real" ones. I am not sure how I would interpret that now, but I am sure it might be very useful to see what type of errors you may introduce. After that I would "build" a new data set with the "real" data you have and the "original" data from the interpolated data (again, not the interpolated data itself) and do a kriging on that, after which I would do a cross-validation for the sparse "real" data you have and see what you come up with.

In either case I would do as much research as I can into the nature of your outlier, to have some physical basis on which you can decide whether to include it in your data, or to consider it a member of a different distribution, or whatever.

Monica

=========================================
Gali Sirkis wrote:
> Hi Monica,
>
> thanks for the quick reply. The interpolated data is a
> different data set which by its nature (speaking
> about geological properties) should be correlated with
> the sparse one.
> This is geological data over a not huge area - around
> 20x30 kilometers. It should have at least some spatial
> correlation. The variogram is not of striking beauty
> :) but it is not a pure nugget effect, though.
> The only other way to meaningfully interpolate between
> those sparse points, it seems, is to use a simple linear
> regression between those two datasets.
> The literature about kriging/interpolating for very
> sparse data would definitely help; if anybody knows
> of any, please let me know.
>
> Thanks,
>
> Gali
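Monica's suggestion to cross-validate the sparse "real" data, and the simple linear regression between the two datasets that Gali mentions, can be combined into a quick leave-one-out check. What follows is only a minimal sketch in Python with NumPy, not anyone's actual workflow from this thread. It assumes the external (already interpolated) variable has been read off at the sparse sample locations. The function name loo_regression_residuals and all numbers are hypothetical and chosen purely for illustration.

```python
import numpy as np

def loo_regression_residuals(external, sparse):
    """Leave-one-out residuals of a straight-line fit sparse ~ external."""
    external = np.asarray(external, dtype=float)
    sparse = np.asarray(sparse, dtype=float)
    residuals = np.empty(len(sparse))
    for i in range(len(sparse)):
        keep = np.arange(len(sparse)) != i  # drop point i from the fit
        slope, intercept = np.polyfit(external[keep], sparse[keep], 1)
        # Residual = observed value minus prediction from the other points
        residuals[i] = sparse[i] - (slope * external[i] + intercept)
    return residuals

# Hypothetical values, NOT data from this thread: six sparse samples and the
# external variable at the same locations.
external = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
sparse = 2.0 * external + 1.0
sparse[3] += 10.0  # one point sitting well off the main correlation line

residuals = loo_regression_residuals(external, sparse)
suspect = int(np.argmax(np.abs(residuals)))
print("leave-one-out residuals:", np.round(residuals, 2))
print("largest misfit at index:", suspect)
```

With only 5 or 6 points, each fit rests on 4 or 5 samples, so a large residual can only flag a point for a closer geological look. It cannot tell a sampling error from a different population.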
From: "Monica Palaseanu-Lovejoy" <[EMAIL PROTECTED]>
To: "Gali Sirkis" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Date: Wed, 3 Dec 2003 17:56:06 -0000
Subject: Re: AI-GEOSTATS: sparse data problem

Hi,

I am not sure I understood your question correctly. First of all, did the interpolated data come from interpolating your sparse data? What method of interpolation did you use in this case? According to Burrough and McDonnell, 2000, you need at least 50 points to get reliable results through kriging. Certainly you can do it on less data, but until now I have never seen a study considering this problem in depth (maybe there is literature out there, and if there is and anybody knows about it - I would like to know about it too ;-))

Secondly, if you know the outlier is not an error, but you interpret it as representing a different combination of properties than the rest of your data - I am not very sure it is wise to use it together with the rest of your data in any interpolation exercise. The outlier may represent a different population, and in this case I cannot see any "physical" reason to treat all your data together if parts of the data represent different things. At least this is my opinion.

Besides, if your data are not only sparse (5 or 6 data points .... it is really very sparse I think) but also far apart in space, they can be at distances greater than the spatial correlation range, and in this case I really don't think you can use kriging .... you will have either a pure nugget effect or a very high nugget value and not much spatial correlation.

Monica

--
Date: Wed, 03 Dec 2003 18:35:33 +0100
From: "Umberto Fracassi" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Re: AI-GEOSTATS: sparse data problem

Hi Gali,

why not try the Radial Basis Function (Multiquadric) method instead of kriging? It's meant to be an exact interpolator, although sometimes it doesn't fully honor your data.
However, it's based on the concept of track data, which seems to me to suit the issue you mention. I employ RBF with macroseismic effects of historical earthquakes. Since these data are sparse (and scarce and scattered..!) by definition, this algorithm effectively pursues aligned patterns in the dataset.

Hope this may help...

Ciao and best regards,
Umberto

Gali Sirkis wrote:
>Dear list members,
>
>Please advise what to do in the following case:
>The sparse dataset for kriging includes only few
>(5-6) original data points + interpolated external
>data, that covering whole study area.
>One of the original data points seems completely not to
>fit the main correlation line between original and
>external data; however, most probably it is not an
>error, but might represent a different combination of
>data properties.
>Is there any chance to use this outlying point?
>Does it sound feasible to you, as specialists in
>statistical analysis, to use the kriging method in this
>case?
>
>Many thanks in advance for your help,
>
>Gali Sirkis

--
* To post a message to the list, send it to [EMAIL PROTECTED]
* As a general service to the users, please remember to post a summary of any useful responses to your questions.
* To unsubscribe, send an email to [EMAIL PROTECTED] with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
* Support to the list is provided at http://www.ai-geostats.org