On 4/17/08, J. Andrew Rogers <[EMAIL PROTECTED]> wrote:
>
> No, you are not correct about this.  All good database engines use a
> combination of clever adaptive cache replacement algorithms (read: keep
> the stuff you are most likely to access next in RAM) and cost-based
> optimization (read: adaptively select query execution algorithms based on
> measured resource access costs) to optimize performance across a broad
> range of use cases.  For highly regular access patterns
> (read: similar query types and complexity), the engine will converge on very
> efficient access patterns and resource management that match this usage.
> For irregular access patterns, it will attempt to dynamically select the
> best options given recent access history and resource cost statistics -- not
> always the best result (on occasion hand optimization could do better), but
> more likely to produce good results than simpler rule-based optimization on
> average.
>
> Note that by "good database engine" I am talking engines that actually
> support these kinds of tightly integrated and adaptive management features:
> Oracle, DB2, PostgreSQL, et al.  This does *not* include MySQL, which is a
> naive and relatively non-adaptive engine, and which scales much worse and is
> generally slower than PostgreSQL anyway if you are looking for a free open
> source solution.
>
>
> I would also point out that different engines are optimized for different
> use cases.  For example, while Oracle and PostgreSQL share the same
> transaction model, Oracle's design decisions were optimized for massive
> numbers of small concurrent update transactions, while PostgreSQL's were
> optimized for massive numbers of small concurrent insert/delete
> transactions.
>  Databases based on other transaction models, such as IBM's DB2, sacrifice
> extreme write concurrency for superior read-only performance.  There are
> unavoidable tradeoffs with such things, so the market has a diverse ecology
> of engines that have chosen a different set of tradeoffs and buyers should
> be aware of what these tradeoffs are if scalable performance is a criterion.


Thanks for the info -- I studied database systems almost a decade ago,
so I can hardly remember the details =)

ARC (Adaptive Replacement Cache) seems to be one of the most popular
methods; it works by tracking both "recently used" and "frequently used"
pages and balancing between the two.  Unfortunately, for AGI / inference
purposes, those may not be the right optimization objectives.
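
To make that concrete, here is a toy Python sketch of the two-list idea
at ARC's core.  This is hypothetical illustration code, not ARC proper:
real ARC also keeps "ghost" lists of recently evicted keys and adapts
the split between the two lists, both of which are omitted here.

from collections import OrderedDict

class TwoListCache:
    """Simplified sketch of ARC's core idea: one list for pages seen
    once recently (recency) and one for pages seen more than once
    (frequency).  The split between the lists is fixed here, whereas
    real ARC adapts it to the workload."""

    def __init__(self, capacity):
        self.cap = max(capacity // 2, 1)   # per-list capacity (fixed in this sketch)
        self.t1 = OrderedDict()            # recency list: key -> value
        self.t2 = OrderedDict()            # frequency list: key -> value

    def get(self, key):
        if key in self.t1:                 # second touch: promote to frequency list
            value = self.t1.pop(key)
            self._put(self.t2, key, value)
            return value
        if key in self.t2:                 # repeat touch: refresh position in T2
            self.t2.move_to_end(key)
            return self.t2[key]
        return None                        # miss: caller must load from disk

    def put(self, key, value):
        if key in self.t2:                 # already "frequent": update in place
            self.t2[key] = value
            self.t2.move_to_end(key)
        else:                              # first touch goes to the recency list
            self.t1.pop(key, None)
            self._put(self.t1, key, value)

    def _put(self, lst, key, value):
        lst[key] = value
        lst.move_to_end(key)
        if len(lst) > self.cap:
            lst.popitem(last=False)        # evict the least recently used entry

Note that both lists only pay off on re-access, which is exactly why
this kind of policy doesn't help a workload that touches each node once.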

What inference requires is access to a lot of *different* nodes, where
the same node may not be needed more than once -- so neither recency nor
frequency is a good predictor of the next access.  Perhaps what we need
instead is to *bundle* up nodes that are associated with each other, so
that one disk access reads in a whole block of related nodes.  This
requires a rather special type of storage organization -- it seems that
existing DBMSs don't have it =(
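
Here is a crude Python sketch of the kind of bundling I mean (toy code
again -- the graph representation, page size, and file layout are all
made-up stand-ins, not a facility of any existing DBMS):

import pickle
from collections import deque

PAGE_SIZE = 64   # nodes per page -- a made-up figure; in practice
                 # you would match it to the disk block size

def bundle_by_association(graph, roots):
    """Group associated nodes onto the same page by walking the link
    structure breadth-first, so that neighbours land together.  `graph`
    maps node -> list of associated nodes (a stand-in for the real
    node store)."""
    pages, current, seen = [], [], set()
    frontier = deque(roots)
    while frontier:
        node = frontier.popleft()
        if node in seen:
            continue
        seen.add(node)
        current.append(node)
        if len(current) == PAGE_SIZE:      # page is full: start a new one
            pages.append(current)
            current = []
        frontier.extend(graph.get(node, ()))
    if current:
        pages.append(current)
    return pages

def write_pages(pages, path):
    # One pickled record per page: fetching any node then costs a single
    # sequential read that also brings in its whole bundle of associates.
    with open(path, "wb") as f:
        for page in pages:
            pickle.dump(page, f)

The payoff is that a cache miss on one node fills the cache with that
node's neighbourhood, which is exactly what the next inference step is
most likely to ask for.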

YKY
