Re: [SQL] User-defined SQL function has slower query on 7.3.3 than

2003-08-14 Thread Andrew Droffner
Mr. Lane:

QUERY
-
SELECT ZIPCODE
FROM LOCATIONS
WHERE
COUNTRY = 'USA' AND ZIP_DIST_MI('07306', ZIPCODE) <= 25;

I found that the 7.1.3 server performed QUERY very slowly after a VACUUM
ANALYZE. (I can't just ANALYZE in this version, right?) It's performance
was comparable to the 7.3.3 server for awhile. Then, it improved.

I don't know how to prove that an SPI query uses an index. I do know that
this SQL: select latitude, longitude from geo_zipdata where zip = $1
uses the index through PSQL.

I use an elog(NOTICE, ...) to print when the SQL get prepared, and it is
just once. geo_zipdata is never changed for the life of the database.

db=> explain
db-> select latitude, longitude from geo_zipdata where zip = '07306';

   QUERY PLAN  
  
-
 Index Scan using geo_zipdata_zip_idx on geo_zipdata  (cost=0.00..17.07
rows=5 width=16)
   Index Cond: (zip = '07306'::character varying)
(2 rows)

I expect QUERY to need a single full table scan for each ZIPCODE. I just
think that 7500 rows should never take over a minute. PG 7.3.3 takes 9
minutes (the one time we waited for it to finish).

How many data pages could 7500 rows need? With 2 or 3 page reads, it can't
take up much memory or I/O to do that.

- Andrew

On Wed, 6 Aug 2003, Tom Lane wrote:

> Andrew Droffner <[EMAIL PROTECTED]> writes:
> > I have this QUERY (below) that PostgreSQL 7.3.X servers run MUCH slower
> > than
> > the 7.1.3 server does.
> 
> I know of no reason for that to happen.  Have you vacuum analyzed the
> 7.3 database?
> 
> > It finds the ZIPs locations with a prepared
> > (and saved) SPI query, which uses an index:
> > "select latitude, longitude from geo_zipdata where zip = $1"
> 
> How do you know it's using the index?
> 
>   regards, tom lane
> 

-- 
[ Andrew Droffner
[ Advance Publications Internet
[
[ [EMAIL PROTECTED]


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[SQL] User-defined SQL function has slower query on 7.3.3 than 7.1.3

2003-08-14 Thread Andrew Droffner
I have this QUERY (below) that PostgreSQL 7.3.X servers run MUCH slower
than
the 7.1.3 server does. It makes sense that both servers have to do a
sequential scan over the ZIPCODE column. There are over 7,500 rows in the
LOCATIONS table.
Does anyone know what changed in the planner or optimizer? Can I change
the
postgresql.conf file to improve 7.3.3 performance?
Situation
-
Here is the situation...
PG 7.1.3 returns QUERY in about 4 seconds. The EXPLAIN plan says it uses
the index on country.
PG 7.3.3 simply does not return QUERY. I've waited over 3 minutes. With
the extra condition WHERE STATE = 'NJ' it takes almost 5 seconds. Other
states are much worse.
QUERY
-
SELECT ZIPCODE
FROM LOCATIONS
WHERE
COUNTRY = 'USA' AND ZIP_DIST_MI('07306', ZIPCODE) <= 25;
The function is written in C, using SPI. Given two US ZIP codes, it
returns the distance in miles. For example, it is 78 miles from Jersey
City to Philadelphia:
db=> select ZIP_DIST_MI('07306', '19130');
  zip_dist_mi   
-
78.801595557406
(1 row)

ZIP_DIST_MI() uses a geo_zipdata table to get the latitude and longitude.
Using those, it can calculate the "great circle distance" between ZIPs
with C double arithmetic. It finds the ZIPs locations with a prepared
(and saved) SPI query, which uses an index:
"select latitude, longitude from geo_zipdata where zip = $1"

FUNCTION

CREATE FUNCTION ZIP_DIST_MI(TEXT, TEXT)
RETURNS DOUBLE PRECISION...
ZIP DATA TABLE
--
CREATE TABLE GEO_ZIPDATA (
ZIP  VARCHAR(5) NOT NULL,
STATEVARCHAR(2) NOT NULL,
CITY VARCHAR(64) NOT NULL,
COUNTY   VARCHAR(64) NOT NULL,
LATITUDE FLOAT   NOT NULL,
LONGITUDE FLOAT  NOT NULL,
FIPS NUMERIC(10) NOT NULL
);
CREATE INDEX GEO_ZIPDATA_ZIP_IDX ON GEO_ZIPDATA (ZIP);


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match