Hi
testdb3=# \d csischema.dim_company;
Table "csischema.dim_company"
Column | Type | Modifiers
-----------------+-----------------------------+-----------
company_id | integer | not null
company_name | character varying(100) |
city | character varying(100) |
state | character varying(100) |
postal_code | character varying(100) |
country | character varying(100) |
latitude | double precision |
longitude | double precision |
update_datetime | timestamp without time zone |
company_source | character varying(1) |
Indexes:
"dim_company_pkey" PRIMARY KEY, btree (company_id)
testdb3=# \d woc.dim_company;
Table "woc.dim_company"
Column | Type | Modifiers
-----------------+-----------------------------+-----------
company_id | integer | not null
company_name | character varying(100) |
city | character varying(100) |
state | character varying(100) |
postal_code | character varying(100) |
country | character varying(100) |
latitude | double precision |
longitude | double precision |
update_datetime | timestamp without time zone |
company_source | character varying(1) |
Indexes:
"dim_company_pkey" PRIMARY KEY, btree (company_id)
testdb3=# select count(*) from csischema.dim_company;
count
---------
1786376
(1 row)
testdb3=# select count(*) from woc.dim_company;
count
-------
18980
(1 row)
woc.dim_company is a subset of csischema.dim_company meaning all company_id
from woc.dim_company are in csischema.dim_company
Ratio is around 1%
SELECT a.company_id FROM csischema.dim_company a, woc.dim_company b
WHERE a.company_id = b.company_id;
testdb3=# explain analyze SELECT a.company_id FROM csischema.dim_company a,
woc.dim_company b
testdb3-# WHERE a.company_id = b.company_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=711.05..54938.35 rows=18980 width=4) (actual
time=34.067..1118.603 rows=18980 loops=1)
Hash Cond: (a.company_id = b.company_id)
-> Seq Scan on dim_company a (cost=0.00..47097.82 rows=1850582 width=4)
(actual time=0.013..523.249 rows=1786376 loops=1)
-> Hash (cost=473.80..473.80 rows=18980 width=4) (actual
time=20.203..20.203 rows=18980 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 924kB
-> Seq Scan on dim_company b (cost=0.00..473.80 rows=18980 width=4)
(actual time=0.007..10.076 rows=18980 loops=1)
Planning time: 0.511 ms
Execution time: 1121.068 ms
(8 rows)
I was expecting at least the PK of csischema.dim_company to be used . In
another DBMS that was the case. The larger table , csischema.dim_company used
the PK.
Any hints, thoughts what am I not seing ?
Thank you
Armand
--
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general