Re: Optimising for many rows and returned records (de-coupling query time to record set size for range queries)

Nick Hill Sun, 23 Apr 2006 14:41:14 -0700

Alexey Polyakov wrote:

On 4/23/06, Nick Hill <[EMAIL PROTECTED]> wrote:
I've noticed a couple things.
1) Right now you're emulating spatial index.
2) In future, you're going to emulate partitioning.


Why do you think that doing this stuff manually is better than using
builtin capabilities?


1) I am ignorant about any performance advantages of spatial indexes.
2) I am ignorant about built-in partitioning capabilities.


Selects against a table use b-trees too. Splitting data into lot of
tables won't help with selects at all (well, it may help on scans with
concurrent large data sets if data will be spread across different
physical drives, but not with regular range lookups that you're
doing). It will only help with inserts.

Assuming even distribution, selecting a table amongst 1000 will onlytake a few ms while 2^log10(1000) gives an 8 fold improvement in selectperformance. But then, I may be running inefficient queries aspostulated by Adam Wolff.

Given that there is such a strong relationship between the number of
records returned, and query time, I conclude that the whole index tree
is matched for every given number of root x records returned. If all
records we are matching are under a single node or under a small number
of nodes in the index tree, perhaps there is some way of telling the
database engine to ignore the rest of the index tree.



What is a 'root record'? Are you speaking about internal
representation of b-tree?

Yes. I am suggesting that a lower node in the B-tree may have below itall records the select query is looking for, thereby providing a short-cut.

Could this work, or am I misunderstanding how the index tree works? Are
there existing optimisations which can de-couple the relationship
between number of records and query time where the records I am
selecting are within a small range?



For studying select query performance issues it's better think about
index as simply about a sorted array with random-access, where each
random access costs O(lgN) and accesses to adjanced data cost O(1).
If your points are spread uniformly in space, cost of select query
you've shown is O(N*lgN)


I am unfamiliar with this representation. I am not sure I understand.

-Nick Hill.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: Optimising for many rows and returned records (de-coupling query time to record set size for range queries)

Reply via email to