Erik Wienhold <e...@ewie.name> writes:
> I'm leaning towards Michael's proposal of adding a libxml2 version check
> in the stable branches before REL_18_STABLE and parsing the content with
> xmlParseBalancedChunkMemory on versions up to 2.12.x.

I spent some time investigating this.  It appears that even when using
our old code with xmlParseBalancedChunkMemory (I tested against our
commit 6082b3d5d^), libxml2 versions 2.12.x and 2.13.x will throw
a resource-exhaustion error; but earlier and later releases do not.
Drilling down with "git bisect", the first libxml2 commit that throws
an error is

commit 834b8123efc4a4c369671cad2f1b0eb744ae67e9
Author: Nick Wellnhofer <wellnho...@aevum.de>
Date:   Tue Aug 8 15:21:28 2023 +0200

    parser: Stream data when reading from memory
    
    Don't create a copy of the whole input buffer. Read the data chunk by
    chunk to save memory.

and then the first commit that restores the non-throwing behavior is

commit 7148b778209aaaf684c156c5e2e40d6e477f13f7
Author: Nick Wellnhofer <wellnho...@aevum.de>
Date:   Sun Jul 7 16:11:08 2024 +0200

    parser: Optimize memory buffer I/O
    
    Reenable zero-copy IO for zero-terminated static memory buffers.
    
    Don't stream zero-terminated dynamic memory buffers on top of creating
    a copy.

So apparently the "resource exhaustion" complaint is not about
anything deeper than not wanting to make a copy of a many-megabyte
input string, and the fact that it appeared and disappeared is an
artifact of libxml2 implementation choices that we have no control
over (and, IMO, no responsibility for working around).

Based on this, I'm okay with reverting to using
xmlParseBalancedChunkMemory: while that won't help users of 2.12.x or
2.13.x, it will help users of earlier and later libxml2.  But I think
we should just do that across the board, not insert libxml2 version
tests nor do it differently in different PG versions.

I've not looked at the details of the proposed patches, but will
do so now that the direction to go in is apparent.

                        regards, tom lane

