Hi again,

Very mixed news to report...

Recap:


I'd reported:
> Despite numerous efforts, we're unable to solve a severe performance 
>limitation between Pg 7.3.2
> and Pg 8.1.4.
>
> The query and 'explain analyze' plan below, runs in 
>       26.20 msec on Pg 7.3.2, and 
>       2463.968 ms on Pg 8.1.4, 
>


Tom Lane responded:
>You're not getting the indexscan optimization of the LIKE clause, which
>is most likely due to having initdb'd the 8.1 installation in something
>other than C locale.  You can either redo the initdb in C locale (which
>might be a good move to fix other inconsistencies from the 7.3 behavior
>you're used to) or create a varchar_pattern_ops index on the column(s)
>you're using LIKE with.


Steinar H. Gunderson suggested:
>You could always try
>
>  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);


I'd responded:
>>You could always try
>>
>>  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);
>
>WOW!  we're now at runtime 0.367ms on Pg8
>
>Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).
>
>Thanks again - will report back soon.


Alvaro Herrera pointed out:
>> Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).
>>
>That's alternative to the pattern_ops index; it won't help you obtain a
>plan faster than this one.



Tom concurred:
>>> Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).
>>
>> That's alternative to the pattern_ops index; it won't help you obtain
>> a plan faster than this one.
>
>No, but since their old DB was evidently running in C locale, this
>seems like a prudent thing to do to avoid other surprising 
>changes in behavior.

=====

We reconfigured the server, as follows:  

initdb -D /var/lib/pgsql/data --encoding=UTF8 --locale=C

 -----I'm wondering if this was incorrect (?). our Pg7 servers encode SQL_ASCII 
-----


NEXT, loaded db, and the good news is the query showed:
        Total runtime:  0.372 ms


As mentioned in original post, this query is just part of a longer procedure.
reminder:  
                 The longer procedure was taking >10 *hours* to run on Pg8.1.4
                    This same longer procedure runs in ~22 minutes on Pg7.3.2 
server.


=====

Before redoing the initdb with C-locale, I did a CREATE INDEX on the 8.1.4
server, which resulted not only in much faster query times, but in a drastic
improvement in the time of the overall/longer procedure (<11mins).

With the initdb locale C Pg8.1.4 server, it ran for 6 hours before I killed it 
(and output
file was <.25 expected  end size).

======


I'm perplexed we're not seeing better performance on Pg8.1.4 server given the 
22 minutes runtime we're seeing on the Pg7.3.2 servers (on older hardware and 
OS).


So, while initdb  C locale helped the initial query, it seems to have had no 
positive affect
on the longer procedure.


Is there some other difference between 7.3.2 and 8.1.4  we're missing?


Thanks for any help.
Regards,
Susan Russo


=======
I enclose the db calls (selects) contained in the 'overall procedure' referred 
to above (taken directly
from a perl script):  THOUGH THIS RUNS IN 22 mins on Pg7.3.2, and >10 hours on 
Pg8.1.4...

        my $aq = $dbh->prepare(sprintf("SELECT * from dbxref dx, db where 
accession = '%s' and dx.db_id = db.db_id and db.name = 'GB_protein'",$rec));


        my $pq = $dbh->prepare(sprintf("SELECT o.genus, o.species, 
f.feature_id, f.uniquename, f.name, accession, is_current from feature f, 
feature_dbxref fd, dbxref d, cvterm cvt, organism o where accession = '%s' and 
d.dbxref_id = fd.dbxref_id and fd.feature_id = f.feature_id and f.uniquename 
like '%s' and f.organism_id = o.organism_id and f.type_id = cvt.cvterm_id and 
cvt.name = 'gene'",$rec,$fbgnwc));


                my $uq = $dbh2->prepare(sprintf("SELECT db.name, accession, 
version, is_current from feature_dbxref fd, dbxref dx, db where fd.feature_id = 
%d and fd.dbxref_id = dx.dbxref_id and dx.db_id = db.db_id and db.name = 
'%s'",$pr{feature_id},$uds{$uh{$rec}{stflag}}));



       my $cq = $dbh2->prepare(sprintf("SELECT f.uniquename, f.name, cvt.name 
as ntype, dx.db_id, dx.accession, fd.is_current from dbxref dx, feature f, 
feature_dbxref fd, cvte
rm cvt where accession like '%s' and dx.dbxref_id = fd.dbxref_id and 
fd.feature_id = f.feature_id and f.type_id = cvt.cvterm_id and cvt.name not in 
('gene','protein','natural_transposable_element','chromosome_structure_variation','chromosome_arm','repeat_region')",$nacc));




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to