> At the planner level that would be entirely the wrong way to go about
> it, because that's forcing the equivalent of a nestloop join, which is
> very unlikely to be faster for the numbers of rows that we're talking
> about here.  The reason it looks faster to you is that the benefits of
> updating the document_file rows in ctid order outweigh the 
> costs of the
> dumb join strategy ... but what we want to achieve here is to 
> have both
> benefits, or at least to give the planner the opportunity to make a
> cost-driven decision about what to do.

Ok.

Here are some more data points, using a smaller table, v8.2.6:


Seq Scan on document_file df  (cost=0.00..208480.85 rows=25101 width=662) 
(actual time=0.239..773.834 rows=25149 loops=1)
  SubPlan
    ->  Index Scan using pk_document_id on document d  (cost=0.00..8.27 rows=1 
width=4) (actual time=0.011..0.015 rows=1 loops=25149)
          Index Cond: (id = $0)
Total runtime: 4492.363 ms



vs


Hash Join  (cost=1048.85..6539.32 rows=25149 width=666) (actual 
time=575.079..1408.363 rows=25149 loops=1)
  Hash Cond: (df.document_id = d.id)
  ->  Seq Scan on document_file df  (cost=0.00..4987.49 rows=25149 width=662) 
(actual time=60.724..824.195 rows=25149 loops=1)
  ->  Hash  (cost=734.49..734.49 rows=25149 width=8) (actual 
time=40.271..40.271 rows=25149 loops=1)
        ->  Seq Scan on document d  (cost=0.00..734.49 rows=25149 width=8) 
(actual time=0.055..22.559 rows=25149 loops=1)
Total runtime: 34961.504 ms


These are fairly repeatable for me after doing a vacuum full analyze of the two 
tables.


Have I simply not tuned postgres so that it knows it has everything on a single 
old IDE drive, not split over a few sets of raided SSD drives, hence 
random_page_cost should perhaps be larger than 4.0? Would that make the second 
estimate larger than the first estimate?

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer 
focus, and courage. This email with any attachments is confidential and may be 
subject to legal privilege.  If it is not intended for you please advise by 
reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
  This email has been scanned by the DMZGlobal Business Quality 
              Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate

Reply via email to