On 10/5/15 10:50 AM, Tom Lane wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
>> Andrew Dunstan wrote:
>>> FWIW, (a) and (b) but not (c) is probably the right description for my
>>> client who has been seeing problems here.
>
>> I think the fact that long IN lists are fingerprinted differently
>> according to the number of elements in the list makes the scenario
>> rather very likely -- not particularly narrow.
>
> That's certainly something worth looking at, but I think it's probably
> more complicated than that.  If you just write "WHERE x IN (1,2,3,4)",
> that gets folded to a ScalarArrayOp with a single array constant, which
> the existing code would deal with just fine.  We need to identify
> situations where that's not the case but yet we shouldn't distinguish.
>
> In any case, that's just a marginal tweak for one class of query.
> I suspect the right fix for the core problem is the one Peter mentioned
> in passing earlier, namely make it possible to do garbage collection
> without having to slurp the entire file into memory at once.  It'd be
> slower, without a doubt, but we could continue to use the existing code
> path unless the file gets really large.
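
For illustration only, the idea of garbage collecting without reading
the whole file at once can be sketched roughly as below. This is a
minimal Python sketch, not the PostgreSQL code and not any proposed
patch. It assumes the file is a flat sequence of texts and that the
live entries are already known as (offset, length) pairs; the names
compact_texts, live_entries and chunk_size are hypothetical.

```python
import io


def compact_texts(src, dst, live_entries, chunk_size=64 * 1024):
    """Copy only the live (offset, length) spans from src to dst.

    Returns the new (offset, length) of each entry, in input order.
    At most chunk_size bytes are read into memory per step.
    """
    new_entries = []
    for offset, length in live_entries:
        new_offset = dst.tell()
        src.seek(offset)
        remaining = length
        while remaining > 0:
            # Read a bounded chunk instead of the whole file.
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError("entry extends past end of file")
            dst.write(chunk)
            remaining -= len(chunk)
        new_entries.append((new_offset, length))
    return new_entries


if __name__ == "__main__":
    # Assumed layout: two live texts with one dead text between them.
    old = io.BytesIO(b"SELECT 1;DEAD TEXT;SELECT 2;")
    new = io.BytesIO()
    print(compact_texts(old, new, [(0, 9), (19, 9)], chunk_size=4))
    print(new.getvalue())
```

The trade-off is the one described above: more seeks and small reads
make this slower than a single large read, but peak memory stays
bounded by chunk_size regardless of how large the file grows.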

To address what Peter raised up-thread: according to my client, the
process doing this was generating 10,000 inserts per transaction and
sending them all as a single statement. They tried cutting it to 1,000
inserts and it still had the problem. Each overall command string could
have been megabytes in size. Perhaps it's not worth supporting that, but
if that's the decision then there at least needs to be better error
reporting around this.
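
As a rough way to see how large such a command string gets, here is a
small Python sketch. The table name, column names and the 200-byte
payload per row are made up for illustration; they are not the client's
actual workload, and batch_statement is a hypothetical helper.

```python
def batch_statement(n_rows, payload_len=200):
    """Build one command string containing n_rows INSERT statements."""
    # Hypothetical table and columns; payload_len is an assumed row width.
    row = "INSERT INTO t (id, payload) VALUES ({}, '{}');"
    return "\n".join(row.format(i, "x" * payload_len) for i in range(n_rows))


if __name__ == "__main__":
    for n in (1_000, 10_000):
        size = len(batch_statement(n).encode("utf-8"))
        print(f"{n} inserts: about {size / 1_000_000:.2f} MB")
```

With wider rows the totals scale accordingly, so the per-row width is
the number to replace with a real measurement.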

I'll try to test Tom's patch next week to see what effect it has on this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

