On 1/11/21 10:00 PM, Anastasia Lubennikova wrote:
On 11.01.2021 01:35, Tomas Vondra wrote:
Hi,
I started looking at this patch again, hoping to get it committed in
this CF, but I think there's a regression in handling TOAST tables
(compared to the v3 patch submitted by Pavan in February 2019).
The test I'm running a very simple test (see test.sql):
1) start a transaction
2) create a table with a text column
3) copy freeze data into it
4) use pg_visibility to see how many blocks are all_visible both in the
main table and it's TOAST table
For v3 patch (applied on top of 278584b526 and
s/hi_options/ti_options) I get this:
pages NOT all_visible
------------------------------------------
main 637 0
toast 50001 3
There was some discussion about relcache invalidations causing a
couple TOAST pages not be marked as all_visible, etc.
However, for this patch on master I get this
pages NOT all_visible
------------------------------------------
main 637 0
toast 50001 50001
So no pages in TOAST are marked as all_visible. I haven't investigated
what's causing this, but IMO that needs fixing to make ths patch RFC.
Attached is the test script I'm using, and a v5 of the patch - rebased
on current master, with some minor tweaks to comments etc.
Thank you for attaching the test script. I reproduced the problem. This
regression occurs because TOAST internally uses heap_insert().
You have asked upthread about adding this optimization to heap_insert().
I wrote a quick fix, see the attached patch 0002. The TOAST test passes
now, but I haven't tested performance or any other use-cases yet.
I'm going to test it properly in a couple of days and share results.
Thanks. I think it's important to make this work for TOAST tables - it
often stores most of the data, and it was working in v3 of the patch. I
haven't looked into the details, but if it's really just due to TOAST
using heap_insert, I'd say it just confirms the importance of tweaking
heap_insert too.
With this change a lot of new code is repeated in heap_insert() and
heap_multi_insert(). I think it's fine, because these functions already
have a lot in common.
Understood. IMHO a bit of redundancy is not a big issue, but I haven't
looked at the code yet. Let's get it working first, then we can decide
if some refactoring is appropriate.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company