I wrote:
> Michael Paquier <mich...@paquier.xyz> writes:
>> A customer has reported a regression with the parsing of rather large
>> XML data, introduced by the set of backpatches done with f68d6aabb7e2
>> & friends.
> Bleah. The supplied test case hides important details in the error message. If you get rid of the exception block so that the error is reported in full, what you see is

regression=# CREATE TEMP TABLE xmldata (id BIGINT PRIMARY KEY, message XML );
CREATE TABLE
regression=# DO $$ DECLARE size_40mb TEXT := repeat('X', 40000000);
regression$# BEGIN
regression$# INSERT INTO xmldata (id, message) VALUES
regression$# ( 1, (('<Root><Item><Name>Test40MB</Name><Content>' || size_40mb || '</Content></Item></Root>')::xml) );
regression$# END $$;
ERROR: invalid XML content
DETAIL: line 1: internal error: Huge input lookup
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
^
CONTEXT: SQL statement "INSERT INTO xmldata (id, message) VALUES
( 1, (('<Root><Item><Name>Test40MB</Name><Content>' || size_40mb || '</Content></Item></Root>')::xml) )"
PL/pgSQL function inline_code_block line 3 at SQL statement
regression=#

That is, what we are hitting is libxml2's internal protections against
processing "too large" input.

I am not really sure why the other coding failed to hit this same thing,
but I wonder if we shouldn't leave well enough alone. See commits
2197d0622 and f2743a7d7, where we tried to enable such cases and then
decided it was too risky. I'm afraid now that our prior coding might have
allowed billion-laugh-like cases to be reachable.

regards, tom lane