I have a very simple table set:

Transactions:
transaction_key PRIMARY KEY
client TEXT
time TIMESTAMP

LineItems
transaction_key INT
amount INT

A query to print the contents of transactions with a sum of the line item amounts provides a very suboptimal result. The problem seems to be the GROUP BY clause as it doesn't use the primary index. Rewriting the query to only group on the transaction_key and returning the max of the other transaction fields results in a query of <1ms. (see queries below)

Can anyone shed any light here, I would have expected the queries to take roughly the same time?

Out of interest, since we are grouping by transaction_key which is unique, surely the other Transaction fields in the group by could be ignored by the planner?

Thanks
Stuart

(running postgresql 8.2.5 on NetBSD 3)


>> Slow query

EXPLAIN SELECT t.transaction_key,t.cashier,t.time,SUM(l.amount) FROM Transactions t JOIN LineItems l USING (transaction_key) GROUP BY t.transaction_key,t.cashier,t.time ORDER BY t.transaction_key;
                                       QUERY PLAN
-------------------------------------------------------------------------------------------
Sort  (cost=449.16..454.16 rows=2000 width=32)
  Sort Key: t.transaction_key
  ->  HashAggregate  (cost=314.50..339.50 rows=2000 width=32)
        ->  Hash Join  (cost=66.00..262.07 rows=5243 width=32)
              Hash Cond: (l.transaction_key = t.transaction_key)
              ->  Seq Scan on lineitems l  (cost=0.00..117.43 rows=5243 
width=16)
              ->  Hash  (cost=41.00..41.00 rows=2000 width=24)
                    ->  Seq Scan on transactions t  (cost=0.00..41.00 rows=2000 
width=24)
(8 rows)


Fast query

EXPLAIN SELECT t.transaction_key,MAX(t.cashier),MAX(t.time),SUM(l.amount) FROM Transactions t JOIN LineItems l USING (transaction_key) GROUP BY t.transaction_key ORDER BY t.transaction_key;

                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
GroupAggregate  (cost=0.00..459.11 rows=2000 width=32)
  ->  Merge Join  (cost=0.00..371.68 rows=5243 width=32)
        Merge Cond: (t.transaction_key = l.transaction_key)
        ->  Index Scan using transactions_pkey on transactions t  
(cost=0.00..86.25 rows=2000 width=24)
        ->  Index Scan using lineitems_transaction_index on lineitems l  
(cost=0.00..214.90 rows=5243 width=16)
(5 rows)




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Reply via email to