Jim, > I've been doing some work to try and identify the actual costs > associated with an index scan with some limited sucess. What's been run > so far can be seen at http://stats.distributed.net/~decibel. But there's > a couple problems. First, I can't use the box exclusively for this > testing, which results in some result inconsistencies.
I can get you access to boxes. Chat on IRC? > Second, I've been > using a dataset that I can't make public, which means no one else can > run these tests on different hardware. Then use one of the DBT databases. > In the > case of testing index scans, we need to be able to vary correlation, > which so far I've been doing by ordering by different columns. I suspect > it will also be important to test with different tuple sizes. There's > also the question of whether or not the cache should be flushed for each > run or not. > > Does this sound like a good way to determine actual costs for index > scans (and hopefully other access methods in the future)? If so, what > would be a good way to implement this? Well, the problem is that what we need to index scans is a formula, rather than a graph. The usefulness of benchmarking index scan cost is so that we can test our formula for accuracy and precision. However, such a formula *does* need to take into account concurrent activity, updates, etc ... that is, it needs to approximately estimate the relative cost on a live database, not a test one. This is also going to be a moving target because Tom's in-memory-bitmapping changes relative cost equations. I think a first step would be, in fact, to develop a tool that allows us to put EXPLAIN ANALYZE results in a database table. Without that, there is no possibility of statistical-scale analysis. -- Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly