Probably the result of vacuum freeze manual execution will give some more
information:
Table size:
appdbname3=# \dt+ appschemaname.applications
                              List of relations
    Schema     |     Name     | Type  |     Owner     |   Size   |
Description
---------------+--------------+-------+---------------+----------+-------------
 appschemaname | applications | table | appschemaname | 10012 MB |

Each index size:
  tablename   |                         indexname
 | index_size | unique |
--------------+------------------------------------------------------------+------------+--------+
 applications | chk_invariant_1_apps2
 | 8192 bytes | Y      |
 applications | applications2_pkey
| 1645 MB    | Y      |
 applications | idx_applications2_agent_id
| 2982 MB    | N      |
 applications | idx_applications2_context_id
| 2708 MB    | N      |
 applications | idx_applications2_context_tenant_id
 | 1560 MB    | N      |
 applications | idx_applications2_context_type
| 4963 MB    | N      |
 applications | idx_applications2_deleted_at
| 3264 MB    | N      |
 applications | idx_applications2_enabled
 | 4975 MB    | N      |
 applications | idx_applications2_euc_index
 | 1684 MB    | N      |
 applications | idx_applications2_policy_id
 | 3131 MB    | N      |
 applications | idx_applications2_policy_type
 | 16 GB      | N      |
 applications | idx_applications2_root_policy_id
| 1574 MB    | N      |
 applications | idx_applications2_status
| 5391 MB    | N      |
 applications | idx_applications2_status_by_policy_fields_context_tenant_i
| 13 MB      | N      |
 applications | idx_applications2_status_by_policy_fields_tenant_id
 | 15 MB      | N      |
 applications | idx_applications2_tenant_id
 | 1670 MB    | N      |
 applications | idx_applications_created_at
 | 2855 MB    | N      |
 applications | tmp_idx_applications2_updated_at2
 | 3505 MB    | N      |
 applications | uq_policy_id_context2
 | 1398 MB    | Y      |

And vacuum freeze output:
appdbname3=# VACUUM (FREEZE, VERBOSE) appschemaname.applications;
INFO:  aggressively vacuuming "appschemaname.applications"
INFO:  scanned index "applications2_pkey" to remove 10488 row versions
DETAIL:  CPU: user: 2.45 s, system: 1.07 s, elapsed: 5.16 s
INFO:  scanned index "tmp_idx_applications2_updated_at2" to remove 10488
row versions
DETAIL:  CPU: user: 2.53 s, system: 2.86 s, elapsed: 11.56 s
INFO:  scanned index "idx_applications2_agent_id" to remove 10488 row
versions
DETAIL:  CPU: user: 3.16 s, system: 2.36 s, elapsed: 11.04 s
INFO:  scanned index "idx_applications2_context_id" to remove 10488 row
versions
DETAIL:  CPU: user: 2.22 s, system: 1.95 s, elapsed: 8.06 s
INFO:  scanned index "idx_applications2_context_tenant_id" to remove 10488
row versions
DETAIL:  CPU: user: 1.86 s, system: 1.16 s, elapsed: 5.10 s
INFO:  scanned index "idx_applications2_context_type" to remove 10488 row
versions
DETAIL:  CPU: user: 190.09 s, system: 6.48 s, elapsed: 220.88 s
INFO:  scanned index "idx_applications2_deleted_at" to remove 10488 row
versions
DETAIL:  CPU: user: 67.93 s, system: 3.41 s, elapsed: 82.75 s
INFO:  scanned index "idx_applications2_enabled" to remove 10488 row
versions
DETAIL:  CPU: user: 78.22 s, system: 4.74 s, elapsed: 96.96 s
INFO:  scanned index "idx_applications2_policy_id" to remove 10488 row
versions
DETAIL:  CPU: user: 2.99 s, system: 1.65 s, elapsed: 5.32 s
INFO:  scanned index "idx_applications2_policy_type" to remove 10488 row
versions
DETAIL:  CPU: user: 300.45 s, system: 16.05 s, elapsed: 365.80 s
INFO:  scanned index "idx_applications2_root_policy_id" to remove 10488 row
versions
DETAIL:  CPU: user: 2.13 s, system: 1.20 s, elapsed: 6.16 s
INFO:  scanned index "idx_applications2_status" to remove 10488 row versions
DETAIL:  CPU: user: 153.79 s, system: 6.57 s, elapsed: 185.76 s
INFO:  scanned index
"idx_applications2_status_by_policy_fields_context_tenant_i" to remove
10488 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.01 s, elapsed: 0.06 s
INFO:  scanned index "idx_applications2_status_by_policy_fields_tenant_id"
to remove 10488 row versions
DETAIL:  CPU: user: 0.03 s, system: 0.00 s, elapsed: 0.05 s
INFO:  scanned index "idx_applications2_tenant_id" to remove 10488 row
versions
DETAIL:  CPU: user: 2.11 s, system: 1.23 s, elapsed: 6.19 s
INFO:  scanned index "idx_applications2_euc_index" to remove 10488 row
versions
DETAIL:  CPU: user: 2.05 s, system: 1.27 s, elapsed: 6.05 s
INFO:  scanned index "chk_invariant_1_apps2" to remove 10488 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  scanned index "uq_policy_id_context2" to remove 10488 row versions
DETAIL:  CPU: user: 1.24 s, system: 1.00 s, elapsed: 4.40 s
INFO:  scanned index "idx_applications_created_at" to remove 10488 row
versions
DETAIL:  CPU: user: 2.32 s, system: 1.97 s, elapsed: 8.46 s
INFO:  "applications": removed 10488 row versions in 7580 pages
DETAIL:  CPU: user: 0.86 s, system: 0.08 s, elapsed: 1.03 s
INFO:  index "applications2_pkey" now contains 17803283 row versions in
210561 pages
DETAIL:  0 index row versions were removed.
59 index pages have been deleted, 59 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "tmp_idx_applications2_updated_at2" now contains 17803283 row
versions in 448580 pages
DETAIL:  0 index row versions were removed.
19542 index pages have been deleted, 19542 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_agent_id" now contains 17803283 row
versions in 381741 pages
DETAIL:  0 index row versions were removed.
61237 index pages have been deleted, 28239 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_context_id" now contains 17803283 row
versions in 346594 pages
DETAIL:  0 index row versions were removed.
57308 index pages have been deleted, 24105 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_context_tenant_id" now contains 17803283
row versions in 199743 pages
DETAIL:  0 index row versions were removed.
32830 index pages have been deleted, 14767 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_context_type" now contains 17803283 row
versions in 635317 pages
DETAIL:  0 index row versions were removed.
43331 index pages have been deleted, 41233 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_deleted_at" now contains 17803283 row
versions in 417770 pages
DETAIL:  0 index row versions were removed.
15695 index pages have been deleted, 15381 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_enabled" now contains 17803283 row versions
in 636750 pages
DETAIL:  0 index row versions were removed.
26332 index pages have been deleted, 26037 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_policy_id" now contains 17803283 row
versions in 400718 pages
DETAIL:  0 index row versions were removed.
55007 index pages have been deleted, 29671 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_policy_type" now contains 17803283 row
versions in 2084261 pages
DETAIL:  7990 index row versions were removed.
56012 index pages have been deleted, 55365 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_root_policy_id" now contains 5015730 row
versions in 201441 pages
DETAIL:  5807 index row versions were removed.
22964 index pages have been deleted, 17915 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_status" now contains 17803283 row versions
in 690063 pages
DETAIL:  10488 index row versions were removed.
31401 index pages have been deleted, 31113 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_status_by_policy_fields_context_tenant_i"
now contains 155059 row versions in 1707 pages
DETAIL:  223 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_status_by_policy_fields_tenant_id" now
contains 155059 row versions in 1883 pages
DETAIL:  223 index row versions were removed.
1 index pages have been deleted, 1 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_tenant_id" now contains 17803283 row
versions in 213798 pages
DETAIL:  10488 index row versions were removed.
33323 index pages have been deleted, 16730 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications2_euc_index" now contains 17803283 row
versions in 215570 pages
DETAIL:  10488 index row versions were removed.
32261 index pages have been deleted, 16643 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "chk_invariant_1_apps2" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "uq_policy_id_context2" now contains 5015730 row versions in
178953 pages
DETAIL:  5807 index row versions were removed.
23793 index pages have been deleted, 13898 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_applications_created_at" now contains 17803283 row
versions in 365479 pages
DETAIL:  10488 index row versions were removed.
19020 index pages have been deleted, 18938 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "applications": found 0 removable, 471321 nonremovable row versions
in 30772 out of 1278613 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4177172255
There were 1490846 unused item pointers.
Skipped 0 pages due to buffer pins, 1247841 frozen pages.
0 pages are entirely empty.
CPU: user: 816.65 s, system: 55.37 s, elapsed: 1031.96 s.
INFO:  aggressively vacuuming "pg_toast.pg_toast_35599271"
INFO:  "pg_toast_35599271": found 0 removable, 2 nonremovable row versions
in 1 out of 2328 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4178183844
There were 2 unused item pointers.
Skipped 0 pages due to buffer pins, 2327 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

What catches my eye: scanning indexes smaller than 3.1GB is fast, larger
ones are slow. For example:
idx_applications2_policy_id is 3131 MB took just 5 seconds (DETAIL:  CPU:
user: 2.99 s, system: 1.65 s, elapsed: 5.32 s)
but idx_applications2_deleted_at with 3264 MB took 1 minute 22 seconds
(DETAIL:  CPU: user: 67.93 s, system: 3.41 s, elapsed: 82.75 s)

In addition to this, I collected statistics during the
idx_applications2_enabled scan:
# strace -c -p 26792
strace: Process 26792 attached
^Cstrace: Process 26792 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.87    2.931694           4    711882           read
  4.61    0.148573           3     55598     19503 futex
  3.01    0.097138           1     70933           lseek
  0.87    0.027926           7      4261           write
  0.44    0.014348          23       623           fdatasync
  0.20    0.006372           5      1245           kill
  0.00    0.000081           5        17           open
  0.00    0.000042           4        12           select
  0.00    0.000038           3        12           close
  0.00    0.000000           0         1           rt_sigreturn
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         2           getrusage
------ ----------- ----------- --------- --------- ----------------

--
Mikhael

On Mon, 20 Feb 2023 at 13:29, Mikhail Balayan <mv.bala...@gmail.com> wrote:

> >> Can you run amcheck's bt_index_check() routine against some of the
>> >> indexes you've shown? There is perhaps some chance that index
>> >> corruption exists and causes VACUUM to take a very long time to delete
>> >> index pages. This is pretty much a wild guess, though.
>
>
> Unfortunately I can't, we haven't enabled this extension yet. And since
> this is a production, I'm not ready to turn it on right away. But I can say
> that this theory is unlikely, since this problem occurs on different
> sites. Here's an example of an output from another site where table size is
> 46.5 GB (again, data only), indexes 107GB, toast 62MB:
>
> automatic aggressive vacuum of table
> "appdbname2.appschemaname.applications": index scans: 1
> pages: 0 removed, 6091646 remain, 0 skipped due to pins, 6086395 skipped
> frozen
> tuples: 2344 removed, 35295654 remain, 0 are dead but not yet removable,
> oldest xmin: 213412878
> buffer usage: 251980554 hits, 14462331 misses, 18844 dirtied
> avg read rate: 12.018 MB/s, avg write rate: 0.016 MB/s
> system usage: CPU: user: 7734.43 s, system: 178.98 s, elapsed: 9401.36 s
>
>
> Here again we see that there are 5251 blocks that need to be cleaned
> (6091646 - 6086395), buffer usage is 266461729 blocks or ~ 2 TB and
> processing time is 2.5h+.
>
> >> It's possible that VACUUM had to wait a long time for a cleanup lock
>> on one individual heap page here
>
> If such a scenario is possible, it makes sense to add information about
> the blocking waiting time to the output. Something like:
>      system usage: CPU: user: 7734.43 s, system: 178.98 s, lock_wait:
> 1234.56 s, elapsed: 9401.36 s
>
>
> Mikhail
>
> On Sat, 18 Feb 2023 at 05:35, Peter Geoghegan <p...@bowt.ie> wrote:
>
>> On Thu, Feb 16, 2023 at 5:40 PM Mikhail Balayan <mv.bala...@gmail.com>
>> wrote:
>> >> >> Do you have any non-btree indexes on the table? Can you show us the
>> details of the
>> >> >> table, including all of its indexes? In other words, can you show
>> "\d applications" output from psql?
>> >
>> > Only btree indexes. Please find the full table schema below:
>>
>> It's possible that VACUUM had to wait a long time for a cleanup lock
>> on one individual heap page here, which could have added a long delay.
>> But...that doesn't seem particularly likely.
>>
>> Can you run amcheck's bt_index_check() routine against some of the
>> indexes you've shown? There is perhaps some chance that index
>> corruption exists and causes VACUUM to take a very long time to delete
>> index pages. This is pretty much a wild guess, though.
>>
>> --
>> Peter Geoghegan
>>
>

Reply via email to