On 01/06/2014 12:41 AM, Ann Harrison wrote:
Alex,

    > Furthermore, despite the everyone's instinct, it's a good deal
    faster in
    > the general case to read a table in an optimal order and sort
    the data
    > in memory that to read the data in index order - random order
    relative
    > to storage.
    >

    Ann, from server POV you are definitely right. But when talking about
    ways to get first record from server as fast as possible we care about
    not server-only performance, but about performance of overall
client-server system.


When the server is bogged down running queries inefficiently, the overall
system performance does suffer.  However, if all you want is the first few
records, reading hundreds of thousands and sorting them is wasteful - and
hard on the client. That's why Firebird's optimizer handles a restriction on
the number of rows to be returned as an optimizer hint to navigate through
the index, rather than doing a natural read and sorting the results.


It's very bad idea to rely on the current optimizer behavior. I can easily imagine that in a case of constant for number of returned rows future optimizer will take it into an account and if (for example) it's greater than total number of records in MOVIES it can use optimal from server POV plan - SORT instead ORDER. It's like using "tbl.fld + 0 = constant" to avoid use of index, and this is not what may be called stable and portable way of providing hints to optimizer.


    We have very simple table:

    create table MOVIES
       (NAME varchar(255) primary key,
        COMMENTS varchar(8192),
        ISO_IMAGE blob);


(A good example of the problems with natural primary keys.)

Yes certainly - I just did not plan to provide ideal style. Certainly here we can have explicit index instead PK.


    And want to run such a query:

    select * from MOVIES where COMMENTS like '%yacht%' order by NAME;

    Two plans are possible:
    PLAN SORT (MOVIES NATURAL)
    PLAN (MOVIES ORDER RDB$PRIMARY1)


Right.  And if you want to give the optimizer a hint that it should choose
the second plan, change the query like this:

select first 1000000 * from MOVIES where COMMENTS like '%yacht% order by NAME;

If you think your client may want more than a million rows, increase that number. As far as I know, Firebird's optimizer doesn't do anything clever like trying to guess what part of the table the "first <n>" represents - ask it for the first 1 or the first ten million, it will still choose the navigational path. So ask it for ten times as many records as you expect to find. Probably a bad idea to exceed the size of an int64.

Returning to my sample we can provide a number of available empty writable DVDs in DVD changer :) But this does not change the fact that using SQL operators, originally designed for some other purpose, is not a best way to provide optimizer hints.



    Cause we need to scan all the table first plan appears to be
    better - it
    requires less disk operations even taking into an account sort after
    scanning all the table. But overall throughput depends upon what
    are we
    going to do with ISO_IMAGE at client side. Imagine we want to burn
    DVDs
    with all images (and have the set of DVDs sorted after burning).


Overall throughput also depends on what else is going on with the system -
if you've got lots of read queries that are hitting the disk hard, making your
burn program wait a few seconds for its first results could improve the
performance of the system overall.

Certainly. This is what I call a detail which can't be predicted by any optimizer taking into an account server-only aspects.

But that's moot.  You have the ability
to ask for navigational access through the index, if that's what you think
you need.

If I understand you correctly this is FIRST clause? But this is what I want to avoid...


    Certainly this does not mean that for all queries, containing blobs,
    natural scan should not be used :-)


And again, for the naive, the blobs aren't accessed until the records are
read, qualified by the comment containing 'yacht' - in this case, the non-
standard CONTAINING would be better because it is not case sensitive.

Certainly - but this is not important fact for a problem we discuss. Ideally here should be used comparison which takes into an account sound produced by word "yacht".

    Therefore I agree with Dmitry - such a hint to
    optimizer is required part of SELECT statement if we want to have
    optimal performance for all the system, not for server only.

And I guess I agree also, but think that the hint is already part of
Firebird - as either FIRST or LIMIT or ROWS or whatever other silly
syntax exists for restricting the number of records returned.  Remember,
the hint is needed only when the query is sorted and there is an index
with the same keys in the same order and direction as the sort and
that index can be used.

If this is the only question I want to repeat - in the future the way how first/limit affects optimizer behavior may change, and therefore in a case when we are not going to just limit a number of returned rows but make first row be returned faster it's better to have explicit way to do it instead of using very big values in first.

------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT 
organizations don't have a clear picture of how application performance 
affects their revenue. With AppDynamics, you get 100% visibility into your 
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to