On Mon, Feb 6, 2023 at 8:57 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> > On 2023-Feb-03, Tom Lane wrote:
> >> Fix edge-case data corruption in shared tuplestores (Dmitry Astapov)
>
> > I think this sounds really scary, because people are going to think that
> > their stored data can get corrupted -- they don't necessarily know what
> > a "shared tuplestore" is. Maybe "Avoid query failures in parallel hash
> > joins" as headline?
>
> Hmmm ... are we sure it *can't* lead to corruption of stored data,
> if this happens during an INSERT or UPDATE plan? I'll grant that
> such a case seems pretty unlikely though, as the bogus data
> retrieved from the tuplestore would have to not cause a failure
> within the query before it can get written out.
Agreed. I think you have to be quite unlucky to hit this in the first
place (very large tuples with very particular alignment), and then
you'd be highly likely to fail in some way due to corrupted tuple
size, making permanent corruption extremely unlikely.

> Also, aren't shared tuplestores used in more places than just
> parallel hash join? I mentioned that as an example, not an
> exhaustive list of trouble spots.

Shared file sets (= a directory of temp files with automatic cleanup)
are used by more things, but shared tuplestores (= a shared file set
with a tuple-oriented interface on top, to support partial scan) are
currently only used by PHJ.