[HACKERS] two index bitmap scan of a big table & hash_seq_search

Sergey E. Koposov Fri, 19 Aug 2011 11:24:43 -0700

Hello Hackers,

I've recently noticed a particularly strange behaviour of one of my simplequeries:

SELECT ra,dec FROM sdssdr7.photoobjall WHERE ra BETWEEN 175 and 190 AND
   dec BETWEEN 17 And 23 ;

The table is quite large (1.1Tb, 10^9 rows, and has Btree indexes on"ra","dec" columns).

The plan is a combination of two bitmap scans:
wsdb=# explain SELECT ra,dec FROM sdssdr7.photoobjall WHERE ra BETWEEN 175 and 
190 AND dec BETWEEN 17 And 23 ;
                                                                        QUERY 
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on photoobjall  (cost=1854918.62..13401835.50 rows=3266290 
width=16)
   Recheck Cond: ((ra >= 175::double precision) AND (ra <= 190::double precision) AND ("dec" 
>= 17::double precision) AND ("dec" <= 23::double precision))
   ->  BitmapAnd  (cost=1854918.62..1854918.62 rows=3266290 width=0)
         ->  Bitmap Index Scan on sdssdr7_photoobjall_ra_idx  
(cost=0.00..607217.21 rows=30530306 width=0)
               Index Cond: ((ra >= 175::double precision) AND (ra <= 
190::double precision))
         ->  Bitmap Index Scan on sdssdr7_photoobjall_dec_idx  
(cost=0.00..1246068.01 rows=62654186 width=0)
               Index Cond: (("dec" >= 17::double precision) AND ("dec" <= 
23::double precision))

But the funny thing I noticed is that the query after running a certainamount of time doing I/O, starts to use 100%CPU and spend 99% the time inhash_seq_search. Here is the oprofile of PG during that period:

--------
CPU: Intel Core/i7, speed 2.268e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 100000
samples  %        symbol name
303404   99.3562  hash_seq_search
1163      0.3808  tbm_lossify
639       0.2093  hash_search_with_hash_value
....
--------

During that it very rarely tries to actually do any I/O. Every second orso it does issue an read of the main relation(not the index). Here ishow strace looks like during that time:

read(455, "\0\0\0\0\0\0\0\0\1\0\0\0t\6\200\6\360\37\4 \0\0\0\0\200\206 \0\340\237 
\0\320"..., 8192) = 8192
... repeated 100 or more  times(issued every second or so) ..
lseek(455, 603635712, SEEK_SET)         = 603635712
read(455, "\0\0\0\0\0\0\0\0\1\0\0\0t\6\200\6\360\37\4 \0\0\0\0\200\206 \0\340\237 
\0\320"..., 8192) = 8192
... repeated a lot of times...
...etc...

So it seems that it actually reads the same thing all over again andagain.

Another funny thing is that I have the exactly the same table, but withsignificantly smaller number of columns (so it occupies 180 gbinstead) but has the same indexes. Also all the rows in the table areexactly in the same order (both tables were clustered using the samething). For that table the query succeeds without problems in 20 secondsor so:

wsdb=> explain analyze SELECT ra,dec FROM sdssdr7.phototag WHERE ra BETWEEN 175 
and 190 AND dec BETWEEN 17 And 23 ;
                                                                             
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on phototag  (cost=1824731.39..10750473.21 rows=3204629 
width=16) (actual time=21146.520..23136.512 rows=2973205 loops=1)
   Recheck Cond: ((ra >= 175::double precision) AND (ra <= 190::double precision) AND ("dec" 
>= 17::double precision) AND ("dec" <= 23::double precision))
   ->  BitmapAnd  (cost=1824731.39..1824731.39 rows=3204629 width=0) (actual 
time=21093.906..21093.906 rows=0 loops=1)
         ->  Bitmap Index Scan on sdssdr7_phototag2_ra_idx  
(cost=0.00..613910.25 rows=30866810 width=0) (actual time=6502.589..6502.589 
rows=30418322 loops=1)
               Index Cond: ((ra >= 175::double precision) AND (ra <= 
190::double precision))
         ->  Bitmap Index Scan on sdssdr7_phototag2_dec_idx  
(cost=0.00..1209218.57 rows=60801242 width=0) (actual time=14201.455..14201.455 
rows=63031352 loops=1)
               Index Cond: (("dec" >= 17::double precision) AND ("dec" <= 
23::double precision))
 Total runtime: 23295.384 ms
(8 rows)

While for "wider" table it is still running the query for more than anhour with 100%CPU an almost no I/O


Additional info:
PG version 8.4.8,
OS: debian 5.0
hardware: 2xXeon E5520, 12GB RAM, RAID50 with BBU

The tables are completely static(e.g. no write activity on them), and havebeen vacuum analyzed.

The system is not busy at all (e.g. not much if any concurrent queries).

The tables have large number of columns of real/int/bigint/doubleprecision time.


Some config parameters:
 effective_cache_size            | 6GB
 work_mem                        | 1GB
 effective_io_concurrency        | 0
 shared_buffers                  | 4GB

Any ideas what can be wrong? Any info I can provide ?

Thanks in advance,
        Sergey

*******************************************************************
Sergey E. Koposov, PhD
Institute for Astronomy, Cambridge UK /Sternberg Astronomical Institute
Web: http://lnfm1.sai.msu.ru/~math
E-mail: m...@sai.msu.ru

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] two index bitmap scan of a big table & hash_seq_search

Reply via email to