From: Amit Khandekar [mailto:amit.khande...@enterprisedb.com]
> On 1 November 2013 16:32, Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> wrote:
>> From: Fujii Masao [mailto:masao.fu...@gmail.com]

>>> I'm not sure if it's good idea to show the number of the fetches because it
>>> seems difficult to tune work_mem from that number. How can we calculate how
>>> much to increase work_mem to avoid lossy bitmap from the number of the 
>>> fetches
>>> in EXPLAIN output?

>> We can calculate that from the following equation in tbm_create():

>>   nbuckets = maxbytes /
>>     (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>>     + sizeof(Pointer) + sizeof(Pointer)),

>> where maxbytes is the size of memory used for the hashtable in a TIDBitmap,
>> designated by work_mem, and nbuckets is the estimated number of hashtable
>> entries we can have within maxbytes.  From this, the size of work_mem within
>> which we can have every hashtable entry as an exact bitmap is calculated as
>> follows:

>>   work_mem = (the number of exact pages + the number of lossy pages) *
>>     (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>>     + sizeof(Pointer) + sizeof(Pointer)) /
>>     (1024 * 1024).

> I am yet to give more thought on the above formula 
> (particularly exact_pages + lossy_pages), but  I was also wondering if the 
> user 
> would indeed be able to figure out the above way to estimate the memory, or 
> the 
> explain itself should show the estimated memory  required for the bitmap. For 
> hash joins we do show the memory taken by the hash table in show_hash_info(). 
> We 
> can show the memory requirement in addition to the number of exact/lossy 
> pages. 

Thank you for the review!

Reconsidering that, I wish to know your opinion.  The patch shows the number of 
exact/lossy pages that has been fetched in a bitmap heap scan.  But the number 
varies with the fraction of tuples to be retrieved like the following.

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=2187.35..101419.96 rows=102919 width=42) 
(actual time=23.684..1302.382 rows=99803 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double 
precision))
   Rows Removed by Index Recheck: 6279502
   Heap Blocks: exact=1990 lossy=59593
   ->  Bitmap Index Scan on demo_col2_idx  (cost=0.00..2161.62 rows=102919 
width=0) (actual time=23.330..23.330 rows=99803 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 
0.02::double precision))
 Total runtime: 1311.949 ms
(7 rows)

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02 
LIMIT 5000;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2187.35..7008.26 rows=5000 width=42) (actual time=23.543..86.093 
rows=5000 loops=1)
   ->  Bitmap Heap Scan on demo  (cost=2187.35..101419.96 rows=102919 width=42) 
(actual time=23.542..85.196 rows=5000 loops=1)
         Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 
0.02::double precision))
         Rows Removed by Index Recheck: 312179
         Heap Blocks: exact=99 lossy=2963
         ->  Bitmap Index Scan on demo_col2_idx  (cost=0.00..2161.62 
rows=102919 width=0) (actual time=23.189..23.189 rows=99803 loops=1)
               Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 
0.02::double precision))
 Total runtime: 86.626 ms
(8 rows)

So, my question is, we should show the number of exact/lossy pages in a 
TIDBitmap, not the number of these pages that has been fetched in the bitmap 
heap scan?

Thanks,

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to