On Thu, Sep 01, 2022 at 03:35:52PM -0400, Tom Lane wrote: > Spyridon Dimitrios Agathos <spyridon.dimitrios.agat...@gmail.com> writes: > > this is to verify that the .patch proposed here: > > https://www.postgresql.org/message-id/flat/2318797.1638558730%40sss.pgh.pa.us > > fixes the issue. > > > Looking forward to the next steps. > > That's been committed into HEAD and v15, without pushback so far. > So the complained-of case is no longer reachable in those branches. > > I think we should reject Aleksander's patch, on the grounds that > it's now unnecessary --- or if you want to argue that it's still > necessary, then it's woefully inadequate, because there are surely > a bunch of other text-processing functions that will also misbehave > on wrongly-encoded data. But our general policy for years has been > that we check incoming text for encoding validity and then presume > that it is valid in manipulation operations.
pg_upgrade carries forward invalid text. A presumption of encoding validity won't be justified any sooner than a presumption of not finding HEAP_MOVED_OFF flags. Hence, I think there should exist another policy that text-processing functions prevent severe misbehavior when processing invalid text. Out-of-bounds memory access qualifies as severe. > I'm leaning to the idea that we should not back-patch, because > this issue has been there for years with few complaints; it's > not clear that closing the hole is worth creating a compatibility > hazard in minor releases. I would not back-patch. > On the other hand, you could argue > that we should back-patch so that back-branch charin() will > understand the strings that can now be emitted by v15 charout(). > Failing to do so will result in a different sort of compatibility > problem. If concerned, I'd back-patch enough of the read side only, not the output side. I wouldn't bother, though.