Re: Out of Memory errors are frustrating as heck!

Gunther Tue, 23 Apr 2019 16:10:21 -0700

On 4/23/2019 16:43, Justin Pryzby wrote:

It wrote 40GB tempfiles - perhaps you can increase work_mem now to improve the
query time.

I now upped my shared_buffers back from 1 to 2GB and work_mem from 4 to16MB. Need to set vm.overcommit_ratio from 50 to 75 (percent, withvm.overcommit_memory = 2 as it is.)

We didn't address it yet, but your issue was partially caused by a misestimate.
It's almost certainly because these conditions are correlated, or maybe
redundant.

That may be so, but mis-estimates happen. And I can still massivelyimprove this query logically I am sure. In fact it sticks out like asore thumb, logically it makes no sense to churn over 100 million rowshere, but the point is that hopefully PostgreSQL runs stable in suchoutlier situations, comes back and presents you with 2 hours of worktime, 40 GB temp space, or whatever, and then we users can figure outhow to make it work better. The frustrating thing it to get out ofmemory and we not knowing what we can possibly do about it.

From my previous attempt with this tmp_r and tmp_q table, I also knowthat the Sort/Uniqe step is taking a lot of extra time. I can cut thatout too by addressing the causes of the "repeated result" rows. Butagain, that is all secondary optimizations.

Merge Cond: (((documentinformationsubject.documentinternalid)::text = 
(documentinformationsubject_1.documentinternalid)::text) AND 
((documentinformationsubject.documentid)::text = 
(documentinformationsubject_1.documentid)::text) AND 
((documentinformationsubject.actinternalid)::text = 
(documentinformationsubject_1.actinternalid)::text))

If they're completely redundant and you can get the same result after dropping
one or two of those conditions, then you should.

I understand. You are saying by reducing the amount of columns in thejoin condition, somehow you might be able to reduce the size of the hashtemporary table?

Alternately, if they're correlated but not redundant, you can use PG10
"dependency" statistics (CREATE STATISTICS) on the correlated columns (and
ANALYZE).

I think documentId and documentInternalId is 1:1 they are both primary /alternate keys. So I could go with only one of them, but since I end upneeding both elsewhere inside the query I like to throw them all intothe natural join key, so that I don't have to deal with the duplicateresult columns.


Now running:

integrator=# set enable_nestloop to off; SET integrator=# explainanalyze select * from reports.v_BusinessOperation; WARNING:ExecHashIncreaseNumBatches: nbatch=8 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=16 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=32 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=64 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=128 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=256 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=512 spaceAllowed=16777216 WARNING:ExecHashIncreaseNumBatches: nbatch=1024 spaceAllowed=25165824 WARNING:ExecHashIncreaseNumBatches: nbatch=2048 spaceAllowed=50331648 WARNING:ExecHashIncreaseNumBatches: nbatch=4096 spaceAllowed=100663296 WARNING:ExecHashIncreaseNumBatches: nbatch=8192 spaceAllowed=201326592 WARNING:ExecHashIncreaseNumBatches: nbatch=16384 spaceAllowed=402653184 WARNING:ExecHashIncreaseNumBatches: nbatch=32768 spaceAllowed=805306368 WARNING:ExecHashIncreaseNumBatches: nbatch=65536 spaceAllowed=1610612736

I am waiting now, probably for that Sort/Unique to finish I think thatthe vast majority of the time spent is in this sort

Unique (cost=5551524.36..5554207.33 rows=34619 width=1197) (actualtime=6150303.060..6895451.210 rows=435274 loops=1) -> Sort(cost=5551524.36..5551610.91 rows=34619 width=1197) (actualtime=6150303.058..6801372.192 rows=113478386 loops=1) Sort Key:documentinformationsubject.documentinternalid,documentinformationsubject.is_current,documentinformationsubject.documentid,documentinformationsubject.documenttypecode,documentinformationsubject.subjectroleinternalid,documentinformationsubject.subjectentityinternalid,documentinformationsubject.subjectentityid,documentinformationsubject.subjectentityidroot,documentinformationsubject.subjectentityname,documentinformationsubject.subjectentitytel,documentinformationsubject.subjectentityemail,documentinformationsubject.otherentityinternalid,documentinformationsubject.confidentialitycode,documentinformationsubject.actinternalid,documentinformationsubject.code_code,documentinformationsubject.code_displayname, q.code_code,q.code_displayname, an.extension, an.root,documentinformationsubject_2.subjectentitycode,documentinformationsubject_2.subjectentitycodesystem,documentinformationsubject_2.effectivetime_low,documentinformationsubject_2.effectivetime_high,documentinformationsubject_2.statuscode,documentinformationsubject_2.code_code, agencyid.extension,agencyname.trivialname, documentinformationsubject_1.subjectentitycode,documentinformationsubject_1.subjectentityinternalid Sort Method:external merge Disk: 40726720kB -> Hash Right Join(cost=4255031.53..5530808.71 rows=34619 width=1197) (actualtime=325240.679..1044194.775 rows=113478386 loops=1)


isn't it?

Unique/Sort actual time   6,150,303.060 ms = 6,150 s <~ 2 h.
Hash Right Join actual time 325,240.679 ms.

So really all time is wasted in that sort, no need for you guys to worryabout anything else with these 2 hours. Tomas just stated the same thing.

Right. Chances are that with a bettwe estimate the optimizer would pick
merge join instead. I wonder if that would be significantly faster.

The prospect of a merge join is interesting here to consider: with theSort/Unique step taking so long, it seems the Merge Join might also takea lot of time? I see my disks are churning for the most time in this way:

avg-cpu: %user %nice %system %iowait %steal %idle 7.50 0.00 2.50 89.500.00 0.50 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-szawait r_await w_await svctm %util nvme1n1 0.00 0.00 253.00 131.00 30.1532.20 332.50 2.01 8.40 8.41 8.37 2.59 99.60 nvme1n1p24 0.00 0.00 253.00131.00 30.15 32.20 332.50 2.01 8.40 8.41 8.37 2.59 99.60

I.e. 400 IOPS at 60 MB/s half of it read, half of it write. During theprevious steps, the hash join presumably, throughput was a lot higher,like 2000 IOPS with 120 MB/s read or write.

But even if the Merge Join would have taken about the same or a littlemore time than the Hash Join, I wonder, if one could not use that tocollapse the Sort/Unique step into that? Like it seems that after theSort/Merge has completed, one should be able to read Uniqe recordswithout any further sorting? In that case the Merge would be a greatadvantage.

What I like about the situation now is that with that 4x biggerwork_mem, the overall memory situation remains the same. I.e., we arescraping just below 1GB for this process and we see oscillation, growthand shrinkage occurring. So I consider this case closed for me. Thatdoesn't mean I wouldn't be available if you guys want to try anythingelse about it.


OK, now here is the result with the 16 MB work_mem:

Unique (cost=5462874.86..5465557.83 rows=34619 width=1197) (actualtime=6283539.282..7003311.451 rows=435274 loops=1) -> Sort(cost=5462874.86..5462961.41 rows=34619 width=1197) (actualtime=6283539.280..6908879.456 rows=113478386 loops=1) Sort Key:documentinformationsubject.documentinternalid,documentinformationsubject.is_current,documentinformationsubject.documentid,documentinformationsubject.documenttypecode,documentinformationsubject.subjectroleinternalid, documentinformationsubject.subjectentityinternalid,documentinformationsubject.subjectentityid,documentinformationsubject.subjectentityidroot,documentinformationsubject.subjectentityname,documentinformationsubject.subjectentitytel,documentinformationsubject.subjectenti tyemail,documentinformationsubject.otherentityinternalid,documentinformationsubject.confidentialitycode,documentinformationsubject.actinternalid,documentinformationsubject.code_code,documentinformationsubject.code_displayname, q.code_code, q.code_displayname, an.extension, an.root,documentinformationsubject_2.subjectentitycode,documentinformationsubject_2.subjectentitycodesystem,documentinformationsubject_2.effectivetime_low,documentinformationsubject_2.effectivetime_high,documentinformationsubjec t_2.statuscode,documentinformationsubject_2.code_code, agencyid.extension,agencyname.trivialname, documentinformationsubject_1.subjectentitycode,documentinformationsubject_1.subjectentityinternalid Sort Method:external merge Disk: 40726872kB -> Hash Right Join(cost=4168174.03..5442159.21 rows=34619 width=1197) (actualtime=337057.290..1695675.896 rows=113478386 loops=1) Hash Cond:(((q.documentinternalid)::text =(documentinformationsubject.documentinternalid)::text) AND((r.targetinternalid)::text =(documentinformationsubject.actinternalid)::text)) -> Hash Right Join(cost=1339751.37..2608552.36 rows=13 width=341) (actualtime=84109.143..84109.238 rows=236 loops=1) Hash Cond:(((documentinformationsubject_2.documentinternalid)::text =(q.documentinternalid)::text) AND((documentinformationsubject_2.actinternalid)::text =(q.actinternalid)::text)) -> Gather (cost=29501.54..1298302.52 rows=1width=219) (actual time=43932.534..43936.888 rows=0 loops=1) WorkersPlanned: 2 Workers Launched: 2 -> Parallel Hash Left Join(cost=28501.54..1297302.42 rows=1 width=219) (actualtime=43925.304..43925.304 rows=0 loops=3) ... -> Hash(cost=1310249.63..1310249.63 rows=13 width=233) (actualtime=40176.581..40176.581 rows=236 loops=1) Buckets: 1024 Batches: 1Memory Usage: 70kB -> Hash Right Join (cost=829388.20..1310249.63rows=13 width=233) (actual time=35925.031..40176.447 rows=236 loops=1)Hash Cond: ((an.actinternalid)::text = (q.actinternalid)::text) -> SeqScan on act_id an (cost=0.00..425941.04 rows=14645404 width=134) (actualtime=1.609..7687.986 rows=14676871 loops=1) -> Hash(cost=829388.19..829388.19 rows=1 width=136) (actualtime=30106.123..30106.123 rows=236 loops=1) Buckets: 1024 Batches: 1Memory Usage: 63kB -> Gather (cost=381928.46..829388.19 rows=1width=136) (actual time=24786.510..30105.983 rows=236 loops=1) ... ->Hash (cost=2823846.37..2823846.37 rows=34619 width=930) (actualtime=252946.367..252946.367 rows=113478127 loops=1) Buckets: 32768(originally 32768) Batches: 65536 (originally 4) Memory Usage: 1204250kB-> Gather Merge (cost=2807073.90..2823846.37 rows=34619 width=930)(actual time=83891.069..153380.040 rows=113478127 loops=1) WorkersPlanned: 2 Workers Launched: 2 -> Merge Left Join(cost=2806073.87..2818850.46 rows=14425 width=930) (actualtime=83861.921..108022.671 rows=37826042 loops=3) Merge Cond:(((documentinformationsubject.documentinternalid)::text =(documentinformationsubject_1.documentinternalid)::text) AND((documentinformationsubject.documentid)::text =(documentinformationsubject_1.documentid):: text) AND((documentinformationsubject.actinternalid)::text =(documentinformationsubject_1.actinternalid)::text)) -> Sort(cost=1295969.26..1296005.32 rows=14425 width=882) (actualtime=44814.114..45535.398 rows=231207 loops=3) Sort Key:documentinformationsubject.documentinternalid,documentinformationsubject.docum... Workers Planned: 2 Workers Launched:2 -> Merge Left Join (cost=2806073.87..2818850.46 rows=14425 width=930)(actual time=83861.921..108022.671 rows=37826042 loops=3) Merge Cond:(((documentinformationsubject.documentinternalid)::text =(documentinformationsubject_1.documentinternalid)::text) AND((documentinformationsubject.documentid)::text =(documentinformationsubject_1.documentid):: text) AND((documentinformationsubject.actinternalid)::text =(documentinformationsubject_1.actinternalid)::text)) -> Sort(cost=1295969.26..1296005.32 rows=14425 width=882) (actualtime=44814.114..45535.398rows=231207 loops=3) ... Planning Time: 2.953ms Execution Time: 7004340.091 ms (70 rows)


There isn't really any big news here. But what matters is that it works.

thanks & regards,
-Gunther Schadow

Re: Out of Memory errors are frustrating as heck!

Reply via email to