2015-08-19 20:12 GMT+09:00 Simon Riggs <si...@2ndquadrant.com>:
> On 12 June 2015 at 00:29, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
>
>>
>> I see two ways to fix this:
>>
>> (1) enforce the 1GB limit (probably better for back-patching, if that's
>>     necessary)
>>
>> (2) make it work with hash tables over 1GB
>>
>> I'm in favor of (2) if there's a good way to do that. It seems a bit
>> stupid not to be able to use fast hash table because there's some artificial
>> limit. Are there any fundamental reasons not to use the
>> MemoryContextAllocHuge fix, proposed by KaiGai-san?
>
>
> If there are no objections, I will apply the patch for 2) to HEAD and
> backpatch to 9.5.
>
Please don't rush. :-)

It is not difficult to replace palloc() with MemoryContextAllocHuge() (the kind
of mechanical change sketched below); however, it may lead to another problem
once the planner gives us a crazy estimation.
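
For clarity, here is a sketch of the change I mean. It is only an illustration
of the idea, not the actual patch; the variable names and the memory context
are written from memory.

    /* before: plain palloc(), capped at MaxAllocSize (just under 1GB) */
    hashtable->buckets = (HashJoinTuple *)
        palloc(nbuckets * sizeof(HashJoinTuple));

    /* after: the "huge" variant, which is allowed to exceed 1GB */
    hashtable->buckets = (HashJoinTuple *)
        MemoryContextAllocHuge(CurrentMemoryContext,
                               nbuckets * sizeof(HashJoinTuple));
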
Below is my comment from the other thread.

==========
Also, we may need to pay attention to the reliability of the planner's
scale estimation.
Even though the plan below says the Join generates 60521928028 rows,
it actually generates 776157676 rows (roughly 1.3% of the estimate).


tpcds100=# EXPLAIN ANALYZE select
ws1.ws_order_number,ws1.ws_warehouse_sk wh1,ws2.ws_warehouse_sk wh2
 from web_sales ws1,web_sales ws2
 where ws1.ws_order_number = ws2.ws_order_number
   and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=25374644.08..1160509591.61 rows=60521928028 width=24) (actual time=138347.979..491889.343 rows=776157676 loops=1)
   Merge Cond: (ws1.ws_order_number = ws2.ws_order_number)
   Join Filter: (ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
   Rows Removed by Join Filter: 127853313
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=73252.300..79017.420 rows=72001237 loops=1)
         Sort Key: ws1.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws1  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.023..39951.201 rows=72001237 loops=1)
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=65095.655..128885.811 rows=904010978 loops=1)
         Sort Key: ws2.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws2  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.014..31046.888 rows=72001237 loops=1)
 Planning time: 0.232 ms
 Execution time: 530176.521 ms
(14 rows)
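
Just to put the gap in perspective, here is a back-of-the-envelope calculation
(my own illustration only, assuming one 8-byte bucket pointer per row; the real
sizing logic in nodeHash.c also involves NTUP_PER_BUCKET and work_mem):

    #include <stdio.h>

    int main(void)
    {
        double estimated = 60521928028.0;   /* planner's row estimate */
        double actual    = 776157676.0;     /* rows actually produced */
        double ptr_size  = 8.0;             /* one bucket pointer per row */

        /* bucket array size in GB for estimated vs. actual row counts */
        printf("estimated: %.1f GB\n", estimated * ptr_size / 1073741824.0);
        printf("actual:    %.1f GB\n", actual * ptr_size / 1073741824.0);
        return 0;
    }

Trusting the estimate blindly would ask for roughly 450GB of bucket pointers,
while the rows actually produced need only about 6GB.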


So, even if we allow nodeHash.c to allocate hash buckets larger than 1GB,
the initial size should still be determined carefully.
Probably, 1GB is a good starting point even if it gets expanded later.
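
As a rough sketch of that idea (not a patch; INITIAL_NBUCKETS_LIMIT and the
surrounding code are made up here for illustration), we could clamp the initial
bucket array at about 1GB regardless of the planner's estimate and let the
runtime resize path grow it afterwards with repalloc_huge():

    /* cap the initial bucket array at roughly 1GB: 2^27 8-byte pointers */
    #define INITIAL_NBUCKETS_LIMIT   (1 << 27)

    if (nbuckets > INITIAL_NBUCKETS_LIMIT)
        nbuckets = INITIAL_NBUCKETS_LIMIT;      /* still a power of two */

    hashtable->buckets = (HashJoinTuple *)
        MemoryContextAllocHuge(CurrentMemoryContext,
                               nbuckets * sizeof(HashJoinTuple));

    /* ... later, once the real tuple count justifies more buckets ... */
    hashtable->buckets = (HashJoinTuple *)
        repalloc_huge(hashtable->buckets,
                      hashtable->nbuckets * sizeof(HashJoinTuple));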

-- 
KaiGai Kohei <kai...@kaigai.gr.jp>

