On 12/6/23 09:56, Amit Kapila wrote:
> On Tue, Dec 5, 2023 at 10:23 PM Tomas Vondra
> <[email protected]> wrote:
>>
>> On 12/5/23 13:17, Amit Kapila wrote:
>>
>>> (b) for transactional
>>> cases, we see overhead due to traversing all the top-level txns and
>>> check the hash table for each one to find whether change is
>>> transactional.
>>>
>>
>> Not really, no. As I explained in my preceding e-mail, this check makes
>> almost no difference - I did expect it to matter, but it doesn't. And I
>> was a bit disappointed the global hash table didn't move the needle.
>>
>> Most of the time is spent in
>>
>> 78.81% 0.00% postgres postgres [.] DecodeCommit (inlined)
>> |
>> ---DecodeCommit (inlined)
>> |
>> |--72.65%--SnapBuildCommitTxn
>> | |
>> | --72.61%--SnapBuildBuildSnapshot
>> | |
>> | --72.09%--pg_qsort
>> | |
>> | |--66.24%--pg_qsort
>> | | |
>>
>> And there's almost no difference between master and build with sequence
>> decoding - see the attached diff-alter-sequence.perf, comparing the two
>> branches (perf diff -c delta-abs).
>>
>
> I think in this the commit time predominates which hides the overhead.
> We didn't investigate in detail if that can be improved but if we see
> a similar case of abort [1], it shows the overhead of
> ReorderBufferSequenceIsTransactional(). I understand that aborts won't
> be frequent and it is sort of unrealistic test but still helps to show
> that there is overhead in ReorderBufferSequenceIsTransactional(). Now,
> I am not sure if we can ignore that case because theoretically, the
> overhead can increase based on the number of top-level transactions.
>
> [1]:
> https://www.postgresql.org/message-id/TY3PR01MB9889D457278B254CA87D1325F581A%40TY3PR01MB9889.jpnprd01.prod.outlook.com
>
But those profiles were with the "old" patch, with one hash table per
top-level transaction. I see nothing like that with the patch [1] that
replaces that with a single global hash table. With that patch, the
ReorderBufferSequenceIsTransactional() took ~0.5% in any tests I did.
What did have bigger impact is this:
46.12% 1.47% postgres [.] pg_logical_slot_get_changes_guts
|
|--45.12%--pg_logical_slot_get_changes_guts
| |
| |--42.34%--LogicalDecodingProcessRecord
| | |
| | |--12.82%--xact_decode
| | | |
| | | |--9.46%--DecodeAbort (inlined)
| | | | |
| | | | |--8.44%--ReorderBufferCleanupTXN
| | | | | |
| | | | | |--3.25%--ReorderBufferSequenceCleanup (in)
| | | | | | |
| | | | | | |--1.59%--hash_seq_search
| | | | | | |
| | | | | | |--0.80%--hash_search_with_hash_value
| | | | | | |
| | | | | | --0.59%--hash_search
| | | | | | hash_bytes
I guess that could be optimized, but it's also a direct consequence of
the huge number of aborts for transactions that create relfilenode. For
any other workload this will be negligible.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company