Re: [SQL] Using bitmap index scans-more efficient

Kyle Bateman Tue, 15 Aug 2006 15:00:08 -0700

Tom Lane wrote:

Kyle Bateman <[EMAIL PROTECTED]> writes:
But I'm assuming that using an interval-encoded project tree, I wouldhave to do something like the following to get a progency group:
select * from ledger l, proj p where p.proj_id = l.proj and p.left >1234 and p.right < 2345;
btree has no idea about the constraint (that I imagine exists) that left
<= right.  If you're just doing a simple index on (left, right) then the
above query requires scanning all index entries with left > 1234.  It
would probably help to say

select * from ledger l, proj p where p.proj_id = l.proj and
 p.left > 1234 and p.left < 2345 and p.right < 2345;

so that you can constrain the range of "left" values scanned.

Thanks for the replies, Tom and Florian.

My problem is not that it is difficult (or costly) to determine theprogeny of a given project. I can determine this in about 90 msecregardless of whether I use an adjacency model, interval-encoding, ormaterialized path (current implementation). The problem is, when I tryto extract the ledger entries belonging to that progeny from a set of amillion records, it seems to need to process all million records ratherthan being able to index right into the ones I want.

I'm not very good at reading explain output, but I tried to set up thequery Tom suggests by creating an interval-encoded project table(proj_int) and then joining it to my ledger like so:

select l.* from ledger l, proj_int i where l.proj = i.proj_id and i.lft>= 5283 and i.lft < 5300 and i.rgt <= 5300;

On my mini-test-ledger of 100,000 entries, this takes the longest time(5 seconds) with the following explain output:


Hash Join  (cost=19018.46..23411.52 rows=14 width=85)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)

Merge Cond: (("outer".vendid = "inner".vendid) AND(("outer".invnum)::text = "inner"."?column10?"))-> Index Scan using apinv_hdr_pkey on apinv_hdr h(cost=0.00..3148.16 rows=51016 width=21)-> Sort (cost=18935.35..19235.45 rows=120041width=55)

                          Sort Key: i.vendid, (i.invnum)::text

-> Seq Scan on apinv_items i(cost=0.00..4152.99 rows=120041 width=55)Filter: ((status = 'en'::bpchar) OR(status = 'cl'::bpchar) OR (status = 'pd'::bpchar))

              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)

-> Seq Scan on vend_org v (cost=0.00..50.99rows=1799 width=26)

        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=24.06..24.06 rows=10 width=4)

-> Bitmap Heap Scan on proj_int i (cost=2.26..24.06 rows=10width=4)Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <=5300))-> Bitmap Index Scan on i_proj_int_lft_rgt(cost=0.00..2.26 rows=10 width=0)Index Cond: ((lft >= 5283) AND (lft < 5300) AND(rgt <= 5300))


That is roughly equivalent to my materialized path method:

select l.* from ledger l where l.proj in (select proj_id from proj_vwhere 4737 = any(ppath));

And is quite slow compared to 150 msec when enumerating the progenyprojects like so:

select l.* from ledger l where l.proj in(4737,4789,4892,4893,4894,4895,4933,4934,4935);


Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)
  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
        ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)

-> Bitmap Heap Scan on apinv_items i(cost=19.73..1185.71 rows=487 width=55)Recheck Cond: ((proj = 4737) OR (proj = 4789) OR(proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR(proj = 4933) OR (proj = 4934) OR (proj = 4935))Filter: ((status = 'en'::bpchar) OR (status ='cl'::bpchar) OR (status = 'pd'::bpchar))

                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)