On 01/04/2014 12:08 AM, Ann Harrison wrote:
> Obviously,
> at least to me, if you've got to sort the data, you have to read all
> of it.
> Furthermore, despite the everyone's instinct, it's a good deal faster in
> the general case to read a table in an optimal order and sort the data
> in memory that to read the data in index order - random order relative
> to storage.
>
Ann, from server POV you are definitely right. But when talking about
ways to get first record from server as fast as possible we care about
not server-only performance, but about performance of overall
client-server system. Let me illustrate with a sample.
We have very simple table:
create table MOVIES
(NAME varchar(255) primary key,
COMMENTS varchar(8192),
ISO_IMAGE blob);
And want to run such a query:
select * from MOVIES where COMMENTS like '%yacht%' order by NAME;
Two plans are possible:
PLAN SORT (MOVIES NATURAL)
PLAN (MOVIES ORDER RDB$PRIMARY1)
Cause we need to scan all the table first plan appears to be better - it
requires less disk operations even taking into an account sort after
scanning all the table. But overall throughput depends upon what are we
going to do with ISO_IMAGE at client side. Imagine we want to burn DVDs
with all images (and have the set of DVDs sorted after burning). In that
case we should better start burning images as soon as possible, and it's
likely that second plan, being non-optimal for server, will fit better
this requirement and make it possible to complete the whole operation a
bit faster. Even with a lot of extra reads from disk server will sooner
of all find second record faster that DVD writer finishes with the first
image.
Certainly this does not mean that for all queries, containing blobs,
natural scan should not be used :-)
Even with this example we can find a lot of scenarios at the client side
when first plan will work better. And here we get probably main result -
it's absolutely impossible to predict at server side how returned data
will be used by client. Therefore I agree with Dmitry - such a hint to
optimizer is required part of SELECT statement if we want to have
optimal performance for all the system, not for server only. (Telling
true I do not understand why _THIS_ kind of hint is not part of SQL
standard - may be people who deal with standard look at the world from
server-only POV?)
------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT
organizations don't have a clear picture of how application performance
affects their revenue. With AppDynamics, you get 100% visibility into your
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel