Re: Add LSN <-> time conversion functionality

2024-05-30 Thread Andrey M. Borodin
Hi everyone!

Me, Bharath, and Ilya are on patch review session at the PGConf.dev :) Maybe we 
got everything wrong, please consider that we are just doing training on 
reviewing patches.


=== Purpose of the patch ===
Currently, we have checkpoint_timeout and max_wal size to know when we need a 
checkpoint. This patch brings a capability to freeze page not only by internal 
state of the system, but also by wall clock time.
To do so we need an infrastructure which will tell when page was modified.

The patch in this thread is doing exactly this: in-memory information to map 
LSNs with wall clock time. Mapping is maintained by bacgroundwriter.

=== Questions ===
1. The patch does not handle server restart. All pages will need freeze after 
any crash?
2. Some benchmarks to proof the patch does not have CPU footprint.

=== Nits ===
"Timeline" term is already taken.
The patch needs rebase due to some header changes.
Tests fail on Windows.
The patch lacks tests.
Some docs would be nice, but the feature is for developers.
Mapping is protected for multithreaded access by walstats LWlock and might have 
tuplestore_putvalues() under that lock. That might be a little dangerous, if 
tuplestore will be on-disk for some reason (should not happen).


Overall, the patch is a base for good feature which would help to do freeze 
right in time. Thanks!


Best regards, Bharath, Andrey, Ilya.



Re: Injection points: preloading and runtime arguments

2024-05-21 Thread Andrey M. Borodin



> On 21 May 2024, at 06:31, Michael Paquier  wrote:
> 
> So I agree that 0002 ought to call injection_init_shmem() when calling
> injection_points_preload(), but it also seems to me that the test is
> missing the fact that it should heat the backend cache to avoid the
> allocations in the critical sections.
> 
> Note that I disagree with taking a shortcut in the backend-side
> injection point code where we would bypass CritSectionCount or
> allowInCritSection.  These states should stay consistent for the sake
> of the callbacks registered so as these can rely on the same stack and
> conditions as the code where they are called.

Currently I'm working on the test using this
$creator->query_until(qr/start/, q(
\echo start
select injection_points_wakeup('');
select test_create_multixact();
));

I'm fine if instead of injection_points_wakeup('') I'll have to use select 
injection_points_preload('point name');.

Thanks!


Best regards, Andrey Borodin.





Re: libpq compression (part 3)

2024-05-20 Thread Andrey M. Borodin




> On 20 May 2024, at 23:37, Robert Haas  wrote:
> 
>  But if that's a practical
> attack, preventing compression prior to the authentication exchange
> probably isn't good enough: the user could also try to guess what
> queries are being sent on behalf of other users through the same
> pooled connection, or they could try to use the bits of the query that
> they can control to guess what the other bits of the query that they
> can't see look like.

All these attacks can be practically exploited in a controlled environment.
That's why previous incarnation of this patchset [0] contained a way to reset 
compression context. And Odyssey AFAIR did it (Dan, coauthor of that patch, 
implemented the compression in Odyssey).
But attacking authentication is much more straightforward and viable.

> On 20 May 2024, at 23:37, Robert Haas  wrote:
> 
> But, does this mean that we should just refuse to offer compression as
> a feature?

No, absolutely, we need the feature.

> I guess I don't understand why TLS removed
> support for encryption entirely instead of disclaiming its use in some
> appropriate way.

I think, the scope of TLS is too broad. HTTPS in turn has a compression. But 
AFAIK it never compress headers.
IMO we should try to avoid compressing authentication information.


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/38/3499/



Re: libpq compression (part 3)

2024-05-20 Thread Andrey M. Borodin



> On 20 May 2024, at 22:48, Robert Haas  wrote:
> 
> On Mon, May 20, 2024 at 1:23 PM Jacob Champion
>  wrote:
>> On Mon, May 20, 2024 at 10:01 AM Robert Haas  wrote:
>>> I really hope that you can't poke big enough holes to kill the feature
>>> entirely, though. Because that sounds sad.
>> 
>> Even if there are holes, I don't think the situation's going to be bad
>> enough to tank everything; otherwise no one would be able to use
>> decompression on the Internet. :D And I expect the authors of the
>> newer compression methods to have thought about these things [1].
>> 
>> I hesitate to ask as part of the same email, but what were the plans
>> for compression in combination with transport encryption? (Especially
>> if you plan to compress the authentication exchange, since mixing your
>> LDAP password into the compression context seems like it might be a
>> bad idea if you don't want to leak it.)
> 
> So, the data would be compressed first, with framing around that, and
> then transport encryption would happen afterwards. I don't see how
> that would leak your password, but I have a feeling that might be a
> sign that I'm about to learn some unpleasant truths.

Compression defeats encryption. That's why it's not in TLS anymore.
The thing is compression codecs use data self correlation. And if you mix 
secret data with user's data, user might guess how correlated they are.


Best regards, Andrey Borodin.



Re: Injection points: preloading and runtime arguments

2024-05-20 Thread Andrey M. Borodin


> On 20 May 2024, at 17:01, Andrey M. Borodin  wrote:

Ugh, accidentally send without attaching the patch itself. Sorry for the noise.


Best regards, Andrey Borodin.


0001-Add-multixact-CV-sleep.patch
Description: Binary data




Re: Injection points: preloading and runtime arguments

2024-05-20 Thread Andrey M. Borodin
Hi!

> On 20 May 2024, at 08:18, Michael Paquier  wrote:

Both features look useful to me.
I've tried to rebase my test of CV sleep during multixact generation[0]. I used 
it like this:

INJECTION_POINT_PRELOAD("GetNewMultiXactId-done");
multi = GetNewMultiXactId(nmembers, ); // starts critsection
INJECTION_POINT("GetNewMultiXactId-done");

And it fails like this:

2024-05-20 16:50:40.430 +05 [21830] 001_multixact.pl LOG:  statement: select 
test_create_multixact();
TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), 
File: "mcxt.c", Line: 1185, PID: 21830
0   postgres0x000101452ed0 ExceptionalCondition 
+ 220
1   postgres0x0001014a6050 MemoryContextAlloc + 
208
2   postgres0x0001011c3bf0 
dsm_create_descriptor + 72
3   postgres0x0001011c3ef4 dsm_attach + 400
4   postgres0x0001014990d8 dsa_attach + 24
5   postgres0x0001011c716c init_dsm_registry + 
240
6   postgres0x0001011c6e60 GetNamedDSMSegment + 
456
7   injection_points.dylib  0x000101c871f8 injection_init_shmem 
+ 60
8   injection_points.dylib  0x000101c86f1c injection_wait + 64
9   postgres0x00010148e228 
InjectionPointRunInternal + 376
10  postgres0x00010148e0a4 InjectionPointRun + 
32
11  postgres0x000100cab798 
MultiXactIdCreateFromMembers + 344
12  postgres0x000100cab604 MultiXactIdCreate + 
312

Am I doing something wrong? Seems like extension have to know too that it is 
preloaded.


Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/0925F9A9-4D53-4B27-A87E-3D83A757B0E0%40yandex-team.ru



Sort functions with specialized comparators

2024-05-18 Thread Andrey M. Borodin
Hi!

In a thread about sorting comparators[0] Andres noted that we have 
infrastructure to help compiler optimize sorting. PFA attached PoC 
implementation. I've checked that it indeed works on the benchmark from that 
thread.

postgres=# CREATE TABLE arrays_to_sort AS
   SELECT array_shuffle(a) arr
   FROM
   (SELECT ARRAY(SELECT generate_series(1, 100)) a),
   generate_series(1, 10);

postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
Time: 990.199 ms
postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
Time: 696.156 ms

The benefit seems to be on the order of magnitude with 30% speedup.

There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber, Oid etc. 
But this sorting routines never show up in perf top or something like that.

Seems like in most cases we do not spend much time in sorting. But 
specialization does not cost us much too, only some CPU cycles of a compiler. I 
think we can further improve speedup by converting inline comparator to value 
extractor: more compilers will see what is actually going on. But I have no 
proofs for this reasoning.

What do you think?


Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59


v1-0001-Use-specialized-sort-facilities.patch
Description: Binary data


Re: allow sorted builds for btree_gist

2024-05-18 Thread Andrey M. Borodin



> On 18 May 2024, at 15:22, Tomas Vondra  wrote:
> 
> Let's continue working on that patch/thread, I'll take a look in the
> next CF.

Cool! I'd be happy to review the patch before CF when Bernd or Christoph will 
address current issues.


Best regards, Andrey Borodin.



Re: allow sorted builds for btree_gist

2024-05-18 Thread Andrey M. Borodin



> On 18 May 2024, at 00:41, Tomas Vondra  wrote:
> 
> if the opclass supports sorted
> builds, because then we could parallelize the sort.

Hi Tomas!

Yup, I'd also be glad to see this feature. PostGIS folks are using their 
geometry (sortsupport was developed for this) with object id (this disables 
sort build).

It was committed once [0], but then reverted, vardata opclasses were 
implemented wrong. Now it's on CF[1], Bernd is actively responding in the 
thread, but currently patch lacks tests.

Thanks for raising this!


Best regards, Andrey Borodin.

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9f984ba6d23dc
[1] https://commitfest.postgresql.org/48/3686/



Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Andrey M. Borodin



> On 17 May 2024, at 16:39, Robert Haas  wrote:
> 
> I think Andrey Borodin's nearby suggestion of having a separate CfM
> for each section of the CommitFest does a good job revealing just how
> bad the current situation is. I agree with him: that would actually
> work. Asking somebody, for a one-month period, to be responsible for
> shepherding one-tenth or one-twentieth of the entries in the
> CommitFest would be a reasonable amount of work for somebody. But we
> will not find 10 or 20 highly motivated, well-qualified volunteers
> every other month to do that work;

Why do you think so? Let’s just try to find more CFMs for July.
When I felt that I’m overwhelmed, I asked for help and Alexander Alekseev 
promptly agreed. That helped a lot.
If I was in that position again, I would just ask 10 times on a 1st day :)

> it's a struggle to find one or two
> highly motivated, well-qualified CommitFest managers, let alone ten or
> twenty.

Because we are looking for one person to do a job for 10.

> So I think the right interpretation of his comment is that
> managing the CommitFest has become about an order of magnitude more
> difficult than what it needs to be for the task to be done well.

Let’s scale the process. Reduce responsibility area of a CFM, define it clearer.
And maybe even explicitly ask CFM to summarize patch status of each entry at 
least once a CF.


Can I do a small poll among those who is on this thread? Would you volunteer to 
summarize a status of 20 patches in July’s CF? 5 each week or so. One per day.


Best regards, Andrey Borodin.



Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Andrey M. Borodin



> On 16 May 2024, at 23:30, Robert Haas  wrote:
> 

I think we just need 10x CFMs. Let’s have a CFM for each CF section. I’d 
happily take "Replication and Recovery” or “System Administration” for July. 
“Miscellaneous” or “Performance” look monstrous.
I feel I can easily track ~20 threads (besides threads I’m interested in). But 
it’s too tedious to spread attention among ~200 items.


Best regards, Andrey Borodin.




Pre-Commitfest Party on StHighload conf

2024-05-15 Thread Andrey M. Borodin
Hi hackers!

StHighload conference will be held on June 24-25[0]. I’m planning to do 
“Pre-Commitfest Party” there.

The idea is to help promote patches among potential reviewers. And start 
working with the very beginning of PG18 development cycle.
Good patch review of a valuable feature is a great addition to a CV, and we 
will advertise this fact among conference attendees.

If you are the patch author, can be around on conference dates and willing to 
present your patch - please contact me or just fill the registration form [1].

Postgres Professional will organize the event, provide us ~1h of a stage time 
and unlimited backstage discussion in their tent. I’ll serve as a moderator, 
and maybe present something myself.
If your work is not on Commitfest yet, but you are planning to finish a 
prototype by the end of the June - feel free to register anyway.
If you do not have a ticket to StHighload - we have some speaker entrance 
tickets.
At the moment we have 4 potential patch authors ready to present.

Please contact me with any questions regarding the event. Thanks!


Best regards, Andrey Borodin.

[0] https://highload.ru/spb/2024/
[1] https://forms.yandex.ru/u/6634e043c417f3cae70775a6/



Re: Weird test mixup

2024-05-11 Thread Andrey M. Borodin



> On 10 May 2024, at 06:04, Michael Paquier  wrote:
> 
> Attached is an updated patch for now

Can you, please, add some more comments regarding purpose of private data?
I somewhat lost understanding of the discussion for a week or so. And I hoped 
to grasp the idea of private_data from resulting code. But I cannot do so from 
current patch version...

I see that you store condition in private_data. So "private" means that this is 
a data specific to extension, do I understand it right?

As long as I started anyway, I also want to ask some more stupid questions:
1. Where is the border between responsibility of an extension and the core 
part? I mean can we define in simple words what functionality must be in 
extension?
2. If we have some concurrency issues, why can't we just protect everything 
with one giant LWLock\SpinLock. We have some locking model instead of 
serializing access from enter until exit.

Most probably, this was discussed somewhere, but I could not find it.

Thanks!


Best regards, Andrey Borodin.



Re: UUID v7

2024-05-08 Thread Andrey M. Borodin



> On 3 May 2024, at 11:18, Andrey M. Borodin  wrote:
> 
> RFC 9562 is not in AUTH48-Done state, it was approved by authors and editor, 
> and now should be published.

It's RFC now.
https://datatracker.ietf.org/doc/rfc9562/


Best regards, Andrey Borodin.



Re: UUID v7

2024-05-04 Thread Andrey M. Borodin



> On 3 May 2024, at 11:18, Andrey M. Borodin  wrote:
> 

Here's the documentation from ClickHouse [0] for their implementation. It's 
identical to provided patch in this thread, with few notable exceptions:

1. Counter is 42 bits, not 18. The counter have no guard bits, every bit is 
initialized with random number on time ticks.
2. By default counter is shared between threads. Alternative function 
generateUUIDv7ThreadMonotonic() provides thread-local counter.

Thanks!


Best regards, Andrey Borodin.

[0] 
https://clickhouse.com/docs/en/sql-reference/functions/uuid-functions#generateUUIDv7



Re: UUID v7

2024-05-03 Thread Andrey M. Borodin


> On 13 Apr 2024, at 11:58, Andrey M. Borodin  wrote:
> 
> New UUID is assigned RFC number 9562, it was aproved by RFC editors and is 
> now in AUTH48 state.

RFC 9562 is not in AUTH48-Done state, it was approved by authors and editor, 
and now should be published.


Best regards, Andrey Borodin.



Re: Idea Feedback: psql \h misses -> Offers Links?

2024-05-02 Thread Andrey M. Borodin


> On 17 Apr 2024, at 22:47, Kirk Wolak  wrote:
> 
> Thoughts?

Today we had a hacking session with Nik and Kirk. We produced a patch to assess 
how these links might look like.

Also we needed a url_encode() and found none in a codebase. It would be nice to 
have this as an SQL-callable function.

Thanks!


Best regards, Andrey Borodin.



v1-0001-Add-URLs-to-h-in-psql-when-there-is-no-match.patch
Description: Binary data


Re: Weird test mixup

2024-05-02 Thread Andrey M. Borodin



> On 2 May 2024, at 13:43, Michael Paquier  wrote:
> 
> A detach is not a wakeup.

Oh, now I see. Sorry for the noise.

Detaching local injection point of other backend seems to be useless and can be 
forbidden.
As far as I understand, your patch is already doing this in
+   if (!injection_point_allowed(name))
+   elog(ERROR, "cannot detach injection point \"%s\" not allowed 
to run",
+name);
+

As far as I understand this will effectively forbid calling 
injection_points_detach() for local injection point of other backend. Do I get 
it right?


Best regards, Andrey Borodin.



Re: Weird test mixup

2024-05-02 Thread Andrey M. Borodin



> On 2 May 2024, at 12:27, Michael Paquier  wrote:
> 
> On Wed, May 01, 2024 at 04:12:14PM -0700, Noah Misch wrote:
>> While writing an injection point test, I encountered a variant of the race
>> condition that f4083c4 fixed.  It had three sessions and this sequence of
>> events:
>> 
>> s1: local-attach to POINT
>> s2: enter InjectionPointRun(POINT), yield CPU just before 
>> injection_callback()
>> s3: detach POINT, deleting the InjectionPointCondition record
>> s2: wake up and run POINT as though it had been non-local
> 
> Fun.  One thing I would ask is why it makes sense to be able to detach
> a local point from a different session than the one who defined it as
> local.  Shouldn't the operation of s3 be restricted rather than
> authorized as a safety measure, instead?

That seems to prevent meaningful use case. If we want exactly one session to be 
waiting just before some specific point, the only way to achieve this is to 
create local injection point. But the session must be resumable from another 
session.
Without this local waiting injection points are meaningless.


Best regards, Andrey Borodin.



Re: New committers: Melanie Plageman, Richard Guo

2024-04-27 Thread Andrey M. Borodin



> On 26 Apr 2024, at 16:54, Jonathan S. Katz  wrote:
> 
> The Core Team would like to extend our congratulations to Melanie Plageman 
> and Richard Guo, who have accepted invitations to become our newest 
> PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Congratulations!


Best regards, Andrey Borodin.



Re: broken reading on standby (PostgreSQL 16.2)

2024-04-25 Thread Andrey M. Borodin



> On 25 Apr 2024, at 12:06, Pavel Stehule  wrote:
> 
> Unfortunately, I have not direct access to backup, so I am not able to test 
> it. But VACUUM FREEZE DISABLE_PAGE_SKIPPING on master didn't help
> 

If Primary considers all freezable tuples frozen, but a standby does not, 
"disable page skipping" won't change anything. Primary will not emit WAL record 
to freeze tuples again.


Best regards, Andrey Borodin.



Re: broken reading on standby (PostgreSQL 16.2)

2024-04-25 Thread Andrey M. Borodin



> On 25 Apr 2024, at 11:12, Pavel Stehule  wrote:
> 
> yesterday, I had to fix strange issue on standby server 

It’s not just broken reading, if this standby is promoted in HA cluster - this 
would lead to data loss.
Recently I’ve observed some lost heap updates ofter OOM-ing cluster on 14.11. 
This might be unrelated most probably, but I’ll post a link here, just in case 
[0]. In February and March we had 3 clusters with similar problem, and this is 
unusually big number for us in just 2 months.

Can you check LSN of blocks with corrupted tuples with pageinpsect on primary 
and on standby? I suspect they are frozen on primary, but have usual xmin on 
standby.


Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/flat/67EADE8F-AEA6-4B73-8E38-A69E5D48BAFE%40yandex-team.ru#1266dd8b898ba02686c2911e0a50ab47



Re: POC: make mxidoff 64 bits

2024-04-23 Thread Andrey M. Borodin



> On 23 Apr 2024, at 11:23, Maxim Orlov  wrote:
> 
> Make multixact offsets 64 bit.

-   ereport(ERROR,
-   (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-errmsg("multixact \"members\" limit exceeded"),
Personally, I'd be happy with this! We had some incidents where the only 
mitigation was vacuum settings tweaking.

BTW as a side note... I see lot's of casts to (unsigned long long), can't we 
just cast to MultiXactOffset?


Best regards, Andrey Borodin.



Re: plenty code is confused about function level static

2024-04-18 Thread Andrey M. Borodin



> On 18 Apr 2024, at 02:39, Andres Freund  wrote:
> 
> There are lots of places that could benefit from adding 'static
> const'.

+1 for helping compiler.
GCC has a -Wsuggest-attribute=const, we can count these warnings and threat 
increase as an error :)


Best regards, Andrey Borodin.



Re: "backend process" confused with "server process"

2024-04-15 Thread Andrey M. Borodin



> On 15 Apr 2024, at 14:01, jian he  wrote:
> 
> "is not a PostgreSQL server process" is the same thing as "not a
> PostgreSQL backend process”?

As far as I understand, backend is something attached to frontend.
There might be infrastructure processes like syslogger or stats collector, 
client can signal it too.


Best regards, Andrey Borodin.



Re: UUID v7

2024-04-13 Thread Andrey M. Borodin



> On 12 Mar 2024, at 20:41, Jelte Fennema-Nio  wrote:
> 
> if e.g.
> the RFC got approved 2 weeks after Postgres its feature freeze

Jelte, you seem to be the visionary! I would consider participating in 
lotteries or betting.
New UUID is assigned RFC number 9562, it was aproved by RFC editors and is now 
in AUTH48 state. This means after final approval by authors RFC will be 
imminently publicised. Most probably, this will happen circa 2 weeks after 
feature freeze :)


Best regards, Andrey Borodin.

[0] https://www.rfc-editor.org/auth48/rfc9562



Re: Issue with the PRNG used by Postgres

2024-04-10 Thread Andrey M. Borodin



> On 10 Apr 2024, at 21:48, Parag Paul  wrote:
> 
> Yes, the probability of this happening is astronomical, but in production 
> with 128 core servers with 7000 max_connections, with petabyte scale data, 
> this did repro 2 times in the last month. We had to move to a local approach 
> to manager our ratelimiting counters. 

FWIW we observed such failure on this [0] LWLock two times too. Both cases were 
recent (February).
We have ~15k clusters with 8MTPS, so it’s kind of infrequent, but not 
astronomic. We decided to remove that lock.


Best regards, Andrey Borodin.

[0] 
https://github.com/munakoiso/logerrors/pull/25/files#diff-f8903c463a191f399b3e84c815ed6dc60adbbfc0fb0b2db490be1e58dc692146L85



Re: post-freeze damage control

2024-04-09 Thread Andrey M. Borodin



> On 9 Apr 2024, at 18:45, Alvaro Herrera  wrote:
> 
> Maybe we should explicitly advise users to not delete that WAL from
> their archives, until pg_combinebackup is hammered a bit more.

As a backup tool maintainer, I always reference to out-of-the box Postgres 
tools as some bulletproof alternative.
I really would like to stick to this reputation and not discredit these tools.


Best regards, Andrey Borodin.



Re: Commitfest Manager for March

2024-04-09 Thread Andrey M. Borodin



> On 1 Apr 2024, at 09:05, Andrey M. Borodin  wrote:
> 
> As of April 1 there are 97 committed patches.

Hello everyone!

March Commitfest is closed.
~40 CF entries were committed since April 1st. Despite some drawbacks discussed 
in nearby threads, its still huge amount of work and significant progress for 
the project. Thanks to everyone involved!

The number is approximate, because in some cases I could not clearly determine 
status. I pinged authors to do so.

26 entries are RWF\Rejected\Withdrawn. Michael noted that I moved to next CF 
some entries that wait for the author more then couple of weeks. I'm going to 
revise WoA items in 2024-07 and RwF some of them to reduce CF bloat. If authors 
are still interested in continuing work of returned items, they are free to 
provide an answer and to change status back to Needs Review.

I've removed all "target version = 17" attribute and switched statuses of a lot 
of entries. I'm afraid there might be some errors. If I determined status of 
your patch erroneously, please accept my apologise and correct the error.

Thanks!


Best regards, Andrey Borodin.



Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-04-09 Thread Andrey M. Borodin



> On 8 Apr 2024, at 08:17, Bharath Rupireddy 
>  wrote:

Hi Bharath!

As far as I understand CF entry [0] is committed? I understand that there are 
some open followups, but I just want to determine correct CF item status...

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4060/



Re: SET ROLE documentation improvement

2024-04-09 Thread Andrey M. Borodin



> On 24 Mar 2024, at 23:34, Nathan Bossart  wrote:
> 
> Committed that part.

Hi Nathan and Yurii!

Can I ask you please to help me with determining status of CF item [0]. Is it 
committed or there's something to move to next CF?

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4572/



Re: CI and test improvements

2024-04-08 Thread Andrey M. Borodin



> On 19 Feb 2024, at 11:33, Peter Eisentraut  wrote:
> 
> Ok, I didn't see that my feedback had been addressed.  I have committed this 
> patch.

Justin, Peter, I can't determine actual status of the CF entry [0]. May I ask 
someone of you to move patch to next CF or close as committed?
Thanks!


Best regards, Andrey Borodin.
[0] https://commitfest.postgresql.org/47/3709/



Re: PostgreSQL 17 Release Management Team & Feature Freeze

2024-04-08 Thread Andrey M. Borodin



> On 8 Apr 2024, at 17:26, Melanie Plageman  wrote:
> 
> What if we pick the actual feature freeze time randomly? That is,
> starting on March 15th (or whenever but more than a week before), each
> night someone from RMT generates a random number between $current_day
> and April 8th. If the number matches $current_day, that day at
> midnight is the feature freeze.

But this implies that actual date is not publicly known before feature freeze 
is in effect. Do I understand idea correctly?


Best regards, Andrey Borodin.



Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-04-08 Thread Andrey M. Borodin



> On 27 Sep 2023, at 16:06, tender wang  wrote:
> 
>Do you have any comments or suggestions on this issue? Thanks.
Hi Tender,

there are some review comments in the thread, that you might be interested in.
I'll mark this [0] entry "Waiting on Author" and move to next CF.

Thanks!


Best regards, Andrey Borodin.

[0]https://commitfest.postgresql.org/47/4549/



Re: Weird test mixup

2024-04-08 Thread Andrey M. Borodin



> On 8 Apr 2024, at 11:55, Michael Paquier  wrote:
> 
>  the point of the test is to make sure that the local
> cleanup happens
Uh, I did not understand this. Because commit message was about stabiilzizing 
tests, not extending coverage.
Also, should we drop function wait_pid() at the end of a test?
Given that tweaks with are nothing new, I think patch looks good.


Best regards, Andrey Borodin.



Re: GenBKI emits useless open;close for catalogs without rows

2024-04-08 Thread Andrey M. Borodin



> On 22 Sep 2023, at 18:50, Matthias van de Meent 
>  wrote:

Hi Matthias!

This is kind reminder that this thread is waiting for your response.
CF entry [0] is in "Waiting on Author", I'll move it to July CF.

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4544/



Re: Logging parallel worker draught

2024-04-08 Thread Andrey M. Borodin



> On 29 Feb 2024, at 11:24, Benoit Lobréau  wrote:
> 
> Yes, thanks for the proposal, I'll work on it on report here.

Hi Benoit!

This is kind reminder that this thread is waiting for your response.
CF entry [0] is in "Waiting on Author", I'll move it to July CF.

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4291/



Re: Weird test mixup

2024-04-08 Thread Andrey M. Borodin



> On 8 Apr 2024, at 10:33, Michael Paquier  wrote:
> 
> Thoughts?

As an alternative we can make local injection points mutually exclusive.


Best regards, Andrey Borodin.




Re: MultiXact\SLRU buffers configuration

2024-04-07 Thread Andrey M. Borodin



> On 7 Apr 2024, at 21:41, Alvaro Herrera  wrote:
> 
> Well, it would be nice to have *some* test, but as you say it is way
> more complex than the thing being tested, and it zooms in on the
> functioning of the multixact creation in insane quantum-physics ways ...
> to the point that you can no longer trust that multixact works the same
> way with the test than without it.  So what I did is manually run other
> tests (pgbench) to verify that the corner case in multixact creation is
> being handled correctly, and pushed the patch after a few corrections,
> in particular so that it would follow the CV sleeping protocol a little
> more closely to what the CV documentation suggests, which is not at all
> what the patch did.  I also fixed a few comments that had been neglected
> and changed the name and description of the CV in the docs.

That's excellent! Thank you!

> Now, maybe we can still add the test later, but it needs a rebase.

Sure. 'wait' injection points are there already, so I'll produce patch with 
"prepared" injection points and re-implement test on top of that. I'll put it 
on July CF.


Best regards, Andrey Borodin.



Re: MultiXact\SLRU buffers configuration

2024-04-07 Thread Andrey M. Borodin



> On 6 Apr 2024, at 14:24, Andrey M. Borodin  wrote:
> 
> What do you think?

OK, I'll follow this plan.
As long as most parts of this thread were committed, I'll mark CF item as 
"committed".
Thanks to everyone involved!
See you in a followup thread about sleeping on CV.


Best regards, Andrey Borodin.



Re: XLog size reductions: Reduced XLog record header size for PG17

2024-04-06 Thread Andrey M. Borodin



> On 3 Jan 2024, at 15:15, Pavel Borisov  wrote:
> 
> Hi and Happy New Year!
> 
> I've looked through the patches and the change seems quite small and 
> justified. But at the second round, some doubt arises on whether this long 
> patchset indeed introduces enough performance gain? I may be wrong, but it 
> saves only several bytes and the performance gain would be only in some 
> specific artificial workload.  Did you do some measurements? Do we have 
> several percent performance-wise?
> 
> Kind regards,
> Pavel Borisov

Hi Matthias!

This is a kind reminder that the thread is waiting for your reply. Are you 
interesting in CF entry [0]?

Thanks!


Best regards, Andrey Borodin.
[0] https://commitfest.postgresql.org/47/4386/



Re: Change prefetch and read strategies to use range in pg_prewarm ... and raise a question about posix_fadvise WILLNEED

2024-04-06 Thread Andrey M. Borodin



> On 15 Mar 2024, at 17:12, Nazir Bilal Yavuz  wrote:
> 
> I did not have the time to check other things you mentioned but I
> tested the read performance. The table size is 5.5GB, I did 20 runs in
> total.

Hi Nazir!

Do you plan to review anything else? Or do you think it worth to look at by 
someone else? Or is the patch Ready for Committer? If so, please swap CF entry 
[0] to status accordingly, currently it's "Waiting on Author".


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4763/



Re: MultiXact\SLRU buffers configuration

2024-04-06 Thread Andrey M. Borodin



> On 29 Feb 2024, at 06:59, Kyotaro Horiguchi  wrote:
> 
> At Sat, 3 Feb 2024 22:32:45 +0500, "Andrey M. Borodin"  
> wrote in 
>> Here's the test draft. This test reliably reproduces sleep on CV when 
>> waiting next multixact to be filled into "members" SLRU.
> 
> By the way, I raised a question about using multiple CVs
> simultaneously [1]. That is, I suspect that the current CV
> implementation doesn't allow us to use multiple condition variables at
> the same time, because all CVs use the same PCPROC member cvWaitLink
> to accommodate different waiter sets. If this assumption is correct,
> we should resolve the issue before spreading more uses of CVs.

Alvaro, Kyotaro, what's our plan for this?
It seems to late to deal with this pg_usleep(1000L) for PG17.
I propose following course of action
1. Close this long-standing CF item
2. Start new thread with CV-sleep patch aimed at PG18
3. Create new entry in July CF

What do you think?


Best regards, Andrey Borodin.



Re: Weird test mixup

2024-04-05 Thread Andrey M. Borodin



> On 5 Apr 2024, at 07:19, Michael Paquier  wrote:
> 
> It's been a couple of weeks since this has been sent, and this did not
> get any reviews.  I'd still be happy with the simplicity of a single
> injection_points_local() that can be used to link all the injection
> points created in a single process to it, discarding them once the
> process exists with a shmem exit callback.

OK, makes sense.
I find name of the function "injection_points_local()" strange, because there 
is no verb in the name. How about "injection_points_set_local"?

>  And I don't really see an
> argument to tweak the backend-side routines, as well.
>  Comments and/or
> objections?

I'm not sure if we should refactor anything here, but InjectionPointSharedState 
has singular name, plural wait_counts and singular condition.
InjectionPointSharedState is already an array of injection points, maybe let's 
add there optional pid instead of inventing separate array of pids?

Can we set global point to 'notice', but same local to 'wait'? Looks like now 
we can't, but allowing to do so would make code simpler.

Besides this opportunity to simplify stuff, both patches looks good to me.


Best regards, Andrey Borodin.



Re: CSN snapshots in hot standby

2024-04-05 Thread Andrey M. Borodin



> On 5 Apr 2024, at 02:08, Kirill Reshke  wrote:
> 
> maybe we need some hooks here? Or maybe, we can take CSN here from extension 
> somehow.

I really like the idea of CSN-provider-as-extension.
But it's very important to move on with CSN, at least on standby, to make CSN 
actually happen some day.
So, from my perspective, having LSN-as-CSN is already huge step forward.


Best regards, Andrey Borodin.



Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Andrey M. Borodin



> On 5 Apr 2024, at 05:03, Leung, Anthony  wrote:
> 
> Adding tap test for pg_signal_autovacuum using injection points as a separate 
> patch. I also made a minor change on the original patch.

The test looks good, but:
1. remove references to passcheck :)
2. detach injection point when it's not needed anymore

Thanks!


Best regards, Andrey Borodin.



Re: UUID v7

2024-04-04 Thread Andrey M. Borodin



> On 4 Apr 2024, at 18:45, Peter Eisentraut  wrote:
> 
> On 26.03.24 18:26, Andrey M. Borodin wrote:
>>> Also, you are initializing 4 bits (I think?) to zero to guard against 
>>> counter rollovers (so it's really just an 8 bit counter?).  But nothing 
>>> checks against such rollovers, so I don't understand the use of that.
>> No, there's only one guard rollover bit.
>> Here: uuid->data[6] = (uuid->data[6] & 0xf7);
>> Bits that are called "guard bits" do not guard anything, they just ensure 
>> counter capacity when it is initialized.
> 
> Uh, I guess I don't understand this at all.  I tried to dig up some 
> information about this, but didn't find anything.  What exactly is the 
> mechanism of these "counter rollover guards"?  If they don't guard anything, 
> what are they supposed to accomplish?
> 

My understanding of guard bits is the following: on every UUID generation, when 
time is advancing, counter bits are initialized with random numbers, except 
guard bits. Guard bits are always initialized with zeroes.

Let's consider we have a 1-byte counter with 4 guard bits and 4 normal bits.
If we generate some UUIDs at the very same millisecond we might have following 
counter values:

0C <--- lower nibble is initialized with random 4 bits C.
0D
0E
0F
10
11
12

If we have no these guard bits we might get random numbers that are immifiately 
at the end of a range of allowed values:

FE <--- first UUID at given millisecond
FF
00 <--- rollover to next millisecond
01


If we have 1 guard bit and 7 normal bits we get at worst 128 values before 
rollover to next millisecond.
If we have 2 guard bits and 6 normal bits this guaranty is extended to 192.
3/5 will guaranty capacity of 224.
But usefulness of every next guard bits decreases, so I think there is a point 
in only one.

That's my understanding of guard bits in the counter. Please correct me if I'm 
wrong.


At this point we can skip the counter\microseconds entirely, just fill 
everything after unix_ts_ms with randomness. It's still a valid UUIDv7, 
exhibiting much more data locality than UUIDv4. We can adjust this sortability 
measures later.


Best regards, Andrey Borodin.



Re: [PATCH] Modify pg_ctl to detect presence of geek user

2024-04-03 Thread Andrey M. Borodin



> On 3 Apr 2024, at 18:17, Panda Developpeur  
> wrote:
> 
> Thank you for considering my contribution.

Looks interesting!

+   usleep(50);

Don't we need to make system 500ms faster instead? Let's change it to

+   usleep(-50);

Thanks!


Best regards, Andrey Borodin.



Re: Allow non-superuser to cancel superuser tasks.

2024-04-02 Thread Andrey M. Borodin



> On 2 Apr 2024, at 01:21, Nathan Bossart  wrote:
> 
> I haven't looked too closely, but I'm pretty skeptical that the test suite
> in your patch would be stable.  Unfortunately, I don't have any better
> ideas at the moment besides not adding a test for this new role.

We can add tests just like [0] with injection points.
I mean replace that "sleep 1" with something like 
"$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of particular table, 
but I think it's doable.
Also I do not like that test is changing system-wide autovac settings, AFAIR 
these settings can be set for particular table.

Thanks!


Best regards, Andrey Borodin.

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eeefd4280f6



Re: Commitfest Manager for March

2024-04-01 Thread Andrey M. Borodin



> On 20 Mar 2024, at 18:13, wenhui qiu  wrote:
> 
>   Could you take a look at the patch 
> (https://commitfest.postgresql.org/47/4284/),How about your opinion

The patch is currently in the "Ready for Committer" state. It's up to the 
committer to decide which one to pick. There are 22 proposed patches and many 
committers are dealing with the consequences of what has already been committed 
(build farm failures, open issues, post-commit notifications, etc.).
If I was a committer, I wouldn't commit a new join kind at the door of feature 
freeze. If I was the author of this patch, I'd consider attracting more 
reviewers and testers to increase chances of the patch in July CF.
Thanks for your work!


Best regards, Andrey Borodin.

PS. Plz post the answer below quoted text. That’s common practice here.



Re: Commitfest Manager for March

2024-04-01 Thread Andrey M. Borodin



> On 1 Mar 2024, at 17:29, Daniel Gustafsson  wrote:
> 
> It is now March 1 in all timezones, so I have switched 202403 to In Progress
> and 202307 to Open.  There are a total of 331 patches registered with 286 of
> those in an open state, 24 of those have been around for 10 CF's or more.


As of April 1 there are 97 committed patches. Incredible amount of work! Thanks 
to all committers, patch authors and reviewers.
Still there are 205 open patches, 19 of them are there for 10+ CFs. I 
considered reducing this number by rejecting couple of my patches. But that 
author-part of my brain resisted in an offlist conversation.

I think it makes sense to close this commitfest only after Feature Freeze on 
April 8, 2024 at 0:00 AoE.
What do you think?


Best regards, Andrey Borodin.



Re: [PATCH] kNN for btree

2024-03-31 Thread Andrey M. Borodin



> On 15 Jan 2024, at 13:11, Anton A. Melnikov  wrote:
> 
> If there are any ideas pro and contra would be glad to discuss them.

Hi, Anton!

This is kind of ancient thread. I've marked CF entry [0] as "Needs review" and 
you as an author (seems like you are going to be a point of correspondence on 
this feature).

At this point it's obvious that the feature won't make it to 17, so let's move 
to July CF. Given 7 years of history in this thread, you might want to start a 
new one with a summarisation of this thread. This will attract more reviewers, 
either way they have to dig on their own.

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4871/



Re: logicalrep_worker_launch -- counting/checking the worker limits

2024-03-31 Thread Andrey M. Borodin



> On 15 Aug 2023, at 07:38, Peter Smith  wrote:
> 
> A rebase was needed due to a recent push [1].
> 
> PSA v3.


> On 14 Jan 2024, at 10:43, vignesh C  wrote:
> 
> I have changed the status of the patch to "Waiting on Author" as
> Amit's queries at [1] have not been verified and concluded. Please
> feel free to address them and change the status back again.

Hi Peter!

Are you still interested in this thread? If so - please post an answer to 
Amit's question.
If you are not interested - please Withdraw a CF entry [0].

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4499/



Re: Wrong results with grouping sets

2024-03-31 Thread Andrey M. Borodin



> On 11 Jan 2024, at 20:10, vignesh C  wrote:
> 
> I have changed the status of the patch to "Waiting on Author" as Tom
> Lane's comments have not yet been addressed, feel free to address them
> and update the commitfest entry accordingly.

This CF entry seems to be a fix for actually unexpected behaviour. But seems 
like we need another fix.
Richard, Alena, what do you think? Should we mark CF entry [0] "RwF" or leave 
it to wait for better fix?

Best regards, Andrey Borodin.


[0] https://commitfest.postgresql.org/47/4583/



Re: [PATCH] pgbench log file headers

2024-03-30 Thread Andrey M. Borodin



> On 6 Jan 2024, at 20:20, vignesh C  wrote:
> 
> One of the test has failed in CFBOT at[1] with:

Hi Adam,

This is a kind reminder that CF entry [0] is waiting for an update. Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4660/




Re: generic plans and "initial" pruning

2024-03-30 Thread Andrey M. Borodin



> On 6 Dec 2023, at 23:52, Robert Haas  wrote:
> 
>  I hope that it's at least somewhat useful.
> 


> On 5 Jan 2024, at 15:46, vignesh C  wrote:
> 
> There is a leak reported 

Hi Amit,

this is a kind reminder that some feedback on your patch[0] is waiting for your 
reply.
Thank you for your work!

Best regards, Andrey Borodin.


[0] https://commitfest.postgresql.org/47/3478/



Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

2024-03-30 Thread Andrey M. Borodin



> On 30 Mar 2024, at 11:39, Karl O. Pinc  wrote:
> 
> Well, Friday has come and gone and I've not gotten to this.
> I'll see if I can spend time tomorrow.

No worries, Karl! I just wanted to know if anyone is interested in this thread, 
and, now is obvious that you are. Thanks for your work!


Best regards, Andrey Borodin.



Re: Use virtual tuple slot for Unique node

2024-03-28 Thread Andrey M. Borodin



> On 29 Oct 2023, at 21:30, Denis Smirnov  wrote:
> 
> I have taken a look at this discussion, at the code and I am confused how we 
> choose tuple table slot (TTS) type in PG. 

After offline discussion with Denis, we decided to withdraw this patch from CF 
for now. If anyone is willing to revive working on this, please register a new 
entry in next commitfest.
Thanks!


Best regards, Andrey Borodin.



Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

2024-03-28 Thread Andrey M. Borodin



> On 1 Dec 2023, at 19:00, Karl O. Pinc  wrote:
> 
>  I won't be able to submit
> new patches based on reviews for 2 weeks.

Hi everyone!

Is there any work going on? Maybe is someone interested in moving this forward?

Thanks!


Best regards, Andrey Borodin.



Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2024-03-28 Thread Andrey M. Borodin



> On 8 Aug 2023, at 12:31, John Naylor  wrote:
> 
> > > Also the shared counter is the cause of the slowdown, but not the reason 
> > > for the numeric limit.
> >
> > Isn't it both? typedef Oid is unsigned int = 2^32, and according to 
> > GetNewOidWithIndex() logic if we exhaust the whole OID space it will hang 
> > indefinitely which has the same semantics as "being impossible"/permanent 
> > hang (?)
> 
> Looking again, I'm thinking the OID type size is more relevant for the first 
> paragraph, and the shared/global aspect is more relevant for the second.
> 
> The last issue is how to separate the notes at the bottom, since there are 
> now two topics.

Jakub, do you have plans to address this feedback? Is the CF entry still 
relevant?

Thanks!


Best regards, Andrey Borodin.



Re: Support logical replication of DDLs

2024-03-28 Thread Andrey M. Borodin



> On 18 Jul 2023, at 12:09, Zhijie Hou (Fujitsu)  wrote:
> 
> Here is the POC patch(0004) for the second approach

Hi everyone!

This thread is registered on CF [0] but is not active since 2023. Is anyone 
working on this? I understand that this is a nice feature. Should we move it to 
next CF or withdraw CF entry?

Thanks!


[0] https://commitfest.postgresql.org/47/3595/



Re: BUG: deadlock between autovacuum worker and client backend during removal of orphan temp tables with sequences

2024-03-28 Thread Andrey M. Borodin


> On 28 Mar 2024, at 10:51, Akshat Jaimini  wrote:
> 
> I am currently trying to review the submitted patch

Great, thank you!

> but I am not able to apply it to the master branch. 

Please find attached rebased version on current HEAD. For some reason CFbot did 
not notify about that rebases is needed.


Best regards, Andrey Borodin.


v2-0001-Avoid-deadlock-and-concurrency-during-orphan-temp.patch
Description: Binary data


Re: UUID v7

2024-03-26 Thread Andrey M. Borodin
Sorry for this long reply. I was looking on refactoring around 
pg_strong_random() and could not decide what to do. Finally, I decided to post 
at least something.

> On 22 Mar 2024, at 19:15, Peter Eisentraut  wrote:
> 
> I have been studying the uuidv() function.
> 
> I find this code extremely hard to follow.
> 
> We don't need to copy all that documentation from the RFC 4122bis document.  
> People can read that themselves.  What I would like to see is easy to find 
> information what from there we are implementing.  Like,
> 
> - UUID version 7
> - fixed-length dedicated counter
> - counter is 18 bits
> - 4 bits are initialized as zero

I've removed table taken from RFC.

> That's more or less all I would need to know what is going on.
> 
> That said, I don't understand why you say it's an 18 bit counter, when you 
> overwrite 6 bits with variant and version.  Then it's just a 12 bit counter?  
> Which is the size of the rand_a field, so that kind of makes sense.  But 12 
> bits is the recommended minimum, and (in this patch) we don't use 
> sub-millisecond timestamp precision, so we should probably use more than the 
> minimum?


No, we use 4 bits in data[6], 8 bits in data[7], and 6 bits data[8]. It's 18 
total. Essentially, we use both partial bytes and one whole byte between.
There was a bug - we used 1 extra byte of random numbers that was not 
necessary, I think that's what lead you to think that we use 12-bit counter.

> Also, you are initializing 4 bits (I think?) to zero to guard against counter 
> rollovers (so it's really just an 8 bit counter?).  But nothing checks 
> against such rollovers, so I don't understand the use of that.

No, there's only one guard rollover bit.
Here: uuid->data[6] = (uuid->data[6] & 0xf7);
Bits that are called "guard bits" do not guard anything, they just ensure 
counter capacity when it is initialized.
Rollover is carried into time tick here: 
++sequence_counter;
if (sequence_counter > 0x3)
{
/* We only have 18-bit counter */
sequence_counter = 0;
previous_timestamp++;
}

I think we might use 10 bits of microseconds and have 8 bits of a counter. 
Effect of a counter won't change much. But I'm not sure if this is allowed per 
RFC.
If time source is coarse-grained it still acts like a random initializer. And 
when it is precise - time is "natural" source of entropy.


> The code code be organized better.  In the not-increment_counter case, you 
> could use two separate pg_strong_random calls: One to initialize rand_b, 
> starting at >data[8], and one to initialize the counter. Then the 
> former could be shared between the two branches, and the code to assign the 
> sequence_counter to the uuid fields could also be shared.

Call to pg_strong_random() is very expensive in builds without ssl (and even 
with ssl too). If we could ammortize random numbers in small buffers - that 
would save a lot of time (see v8-0002-Buffer-random-numbers.patch upthread). 
Or, perhaps, we can ignore cost of two pg_string_random() calls.

> 
> I would also prefer if the normal case (not-increment_counter) were the first 
> if branch.

Done.

> Some other notes on your patch:
> 
> - Your rebase duplicated the documentation of uuid_extract_timestamp and 
> uuid_extract_version.
> 
> - PostgreSQL code uses uint64 etc. instead of uint64_t etc.
> 
> - It seems the added includes
> 
> #include "access/xlog.h"
> #include "utils/builtins.h"
> #include "utils/datetime.h"
> 
> are not needed.
> 
> - The static variables sequence_counter and previous_timestamp could be kept 
> inside the uuidv7() function.

Fixed.

Thanks!


Best regards, Andrey Borodin.


v24-0001-Implement-UUID-v7.patch
Description: Binary data


Re: [PATCH] Add sortsupport for range types and btree_gist

2024-03-22 Thread Andrey M. Borodin



> On 22 Mar 2024, at 18:20, Bernd Helmle  wrote:
> 
> Here is a rebased version of the patch.

FWIW it would be nice at least port tests from commit that I referenced 
upthread.
Nowadays we have injection points, so these tests can be much more stable.

Sorry for bringing up this so late.


Best regards, Andrey Borodin.



Re: UUID v7

2024-03-22 Thread Andrey M. Borodin



> On 21 Mar 2024, at 20:21, Jelte Fennema-Nio  wrote:
> 
> On Wed, 20 Mar 2024 at 19:08, Andrey M. Borodin  wrote:
>> Timer-based bits contribute to global sortability. But the real timers we 
>> have are not even millisecond adjusted. We can hope for ~few ms variation in 
>> one datacenter or in presence of atomic clocks.
> 
> I think the main benefit of using microseconds would not be
> sortability between servers, but sortability between backends. 

Oh, that’s an interesting practical feature!
Se, essentially counter is a theoretical guaranty of sortability in one 
backend, while microseconds are practical sortability between backends.

> However, I don't really think it is incredibly important to get the
> "perfect" approach to filling in rand_a/rand_b right now. As long as
> we don't document what we do, we can choose to change the method
> without breaking backwards compatibility. Because either approach
> results in valid UUIDv7s.

Makes sense to me. I think both methods would be much better than UUIDv4 for 
practical reasons. And even not using extra bits at all (fill them with random 
numbers) would work for 0.999 cases.


Best regards, Andrey Borodin.



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

2024-03-21 Thread Andrey M. Borodin



> On 21 Mar 2024, at 18:54, Peter Geoghegan  wrote:
>  Do the posting lists that are corrupt
> (reported on elsewhere) also have duplicate TIDs?

I do not have access now, but AFAIR yes.
Thanks for pointers!

BTW there were also some errors in logs like
ERROR:  index "tablename" contains unexpected zero page at block 1265985 HINT:
and even
MultiXactId 34043703 has not been created yet -- apparent wraparound
"right sibling's left-link doesn't match: right sibling 4 of target 2 with 
leafblkno 2 and scanblkno 2 spuriously links to non-target 1 on level 0 of 
index "indexname"

Multixact problem was fixed by vacuum freeze, other indexes were repacked.


> On 21 Mar 2024, at 20:21, Matthias van de Meent 
>  wrote:
> 
> Would you happen to have a PostgreSQL version number (or commit hash)
> to help debugging? Has it always had that PG version, or has the
> version been upgraded?

Vanilla 14.11 (14.10 when created).

> Considering the age of this thread, and thus potential for v14 pre-.4:
> Did this cluster use REINDEX (concurrently) for the relevant indexes?


As now I see, chances are my case is not related to the original thread.

Thanks!


Best regards, Andrey Borodin.



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

2024-03-21 Thread Andrey M. Borodin



> On 29 Jun 2022, at 17:43, Robins Tharakan  wrote:


Sorry to bump ancient thread, I have some observations that might or might not 
be relevant.
Recently we noticed a corruption on one of clusters. The corruption at hand is 
not in system catalog, but in user indexes.
The cluster was correctly configured: checksums, fsync, FPI etc.
The cluster never was restored from a backup. It’s a single-node cluster, so it 
was not ever promoted, pg_rewind-ed etc. VM had never been rebooted.

But, the cluster had been experiencing 10 OOMs a day. There were no torn pages, 
no checsum erros at log at all. Yet, B-tree indexes became corrupted.


Sorry for this wall of text, I’m posing everything as-is in case if there is 
some useful information.

$ /etc/cron.yandex/pg_corruption_check.py --index
2024-03-01 11:54:05,075 ERROR : Corrupted index: 96009 
table1_table1message_table1_team_identity_06a95642 XX002 ERROR: posting list 
contains misplaced TID in index 
"table1_table1message_table1_team_identity_06a95642" DETAIL: Index tid=(267,34) 
posting list offset=137 page lsn=31B/62159608.
2024-03-01 11:54:05,100 ERROR : Corrupted index: 96008 
table1_table1message_organization_id_66c18ed2 XX002 ERROR: posting list 
contains misplaced TID in index "table1_table1message_organization_id_66c18ed2" 
DETAIL: Index tid=(267,34) posting list offset=137 page lsn=31B/62158BC8.
2024-03-01 11:54:05,355 ERROR : Corrupted index: 95804 
table2_aler_channel_81aeec_idx XX002 ERROR: posting list contains misplaced TID 
in index "table2_aler_channel_81aeec_idx" DETAIL: Index tid=(336,7) posting 
list offset=182 page lsn=314/9B794248.
2024-03-01 11:54:05,716 ERROR : Corrupted index: 95816 
table2_table3_channel_id_91a1912f XX002 ERROR: posting list contains misplaced 
TID in index "table2_table3_channel_id_91a1912f" DETAIL: Index tid=(384,2) 
posting list offset=72 page lsn=317/3F14F390.
2024-03-01 11:54:06,068 ERROR : Corrupted index: 95815 
table2_table3_channel_filter_id_6706c8b6 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_channel_filter_id_6706c8b6" DETAIL: Index 
tid=(380,2) posting list offset=72 page lsn=317/3F0D8E30.
2024-03-01 11:54:06,302 ERROR : Corrupted index: 95824 
table2_table3_root_alert_group_id_f327f122 XX002 ERROR: item order invariant 
violated for index "table2_table3_root_alert_group_id_f327f122" DETAIL: Lower 
index tid=(368,204) (points to heap tid=(48901,2)) higher index tid=(368,205) 
(points to heap tid=(48901,2)) page lsn=319/3C234588.
2024-03-01 11:54:06,538 ERROR : Corrupted index: 95810 
table2_table3_acknowledged_by_user_id_dd6723dc XX002 ERROR: posting list 
contains misplaced TID in index 
"table2_table3_acknowledged_by_user_id_dd6723dc" DETAIL: Index tid=(380,69) 
posting list offset=35 page lsn=317/C14E2D50.
2024-03-01 11:54:06,775 ERROR : Corrupted index: 95825 
table2_table3_silenced_by_user_id_40a833a1 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_silenced_by_user_id_40a833a1" DETAIL: 
Index tid=(371,11) posting list offset=144 page lsn=318/61171918.
2024-03-01 11:54:07,009 ERROR : Corrupted index: 95829 
table2_table3_wiped_by_id_4326ff61 XX002 ERROR: item order invariant violated 
for index "table2_table3_wiped_by_id_4326ff61" DETAIL: Lower index tid=(373,97) 
(points to heap tid=(48901,2)) higher index tid=(373,98) (points to heap 
tid=(48901,2)) page lsn=318/61172788.
2024-03-01 11:54:07,245 ERROR : Corrupted index: 95823 
table2_table3_resolved_by_user_id_463cdf3d XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_resolved_by_user_id_463cdf3d" DETAIL: 
Index tid=(375,89) posting list offset=144 page lsn=319/3C1DCFC8.
2024-03-01 11:54:07,479 ERROR : Corrupted index: 95819 
table2_table3_maintenance_uuid_9a7b8529_like XX002 ERROR: item order invariant 
violated for index "table2_table3_maintenance_uuid_9a7b8529_like" DETAIL: Lower 
index tid=(372,4) (points to heap tid=(48901,2)) higher index tid=(372,5) 
(points to heap tid=(48901,2)) page lsn=317/C1A210A8.
2024-03-01 11:54:07,717 ERROR : Corrupted index: 95827 
table2_table3_table1_message_id_58a31784_like XX002 ERROR: posting list 
contains misplaced TID in index "table2_table3_table1_message_id_58a31784_like" 
DETAIL: Index tid=(373,89) posting list offset=144 page lsn=319/3C3EE660.
2024-03-01 11:54:08,162 ERROR : Corrupted index: 96066 
webhooks_webhookresponse_webhook_id_db49ebcd XX002 ERROR: item order invariant 
violated for index "webhooks_webhookresponse_webhook_id_db49ebcd" DETAIL: Lower 
index tid=(522,24) (points to heap tid=(73981,1)) higher index tid=(522,25) 
(points to heap tid=(73981,1)) page lsn=31B/E522B640.
2024-03-01 11:54:08,646 ERROR : Corrupted index: 95822 
table2_table3_resolved_by_alert_id_bbdf0a83 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_resolved_by_alert_id_bbdf0a83" DETAIL: 
Index tid=(618,2) posting list offset=150 page lsn=317/C1DE74B8.
2024-03-01 11:54:08,873 ERROR : Corrupted index: 95427 

Re: UUID v7

2024-03-20 Thread Andrey M. Borodin


> On 19 Mar 2024, at 13:55, Peter Eisentraut  wrote:
> 
> On 16.03.24 18:43, Andrey M. Borodin wrote:
>>> On 15 Mar 2024, at 14:47, Aleksander Alekseev  
>>> wrote:
>>> 
>>> +1 to the idea. I doubt that anyone will miss it.
>> PFA v22.
>> Changes:
>> 1. Squashed all editorialisation by Jelte
>> 2. Fixed my erroneous comments on using Method 2 (we are using method 1 
>> instead)
>> 3. Remove all traces of uuid_extract_variant()
> 
> I have committed a subset of this for now, namely the additions of 
> uuid_extract_timestamp() and uuid_extract_version().  These seemed mature and 
> agreed upon.  You can rebase the rest of your patch on top of that.

Great! Thank you! PFA v23 with rebase on HEAD.

> I have started a separate discussion to learn about the precision we can 
> expect from gettimeofday().

Even in presence of real microsecond-enabled and portable timer using 
microseconds does not seem to me an optimal way of utilising UUID bits.

Timer-based bits contribute to global sortability. But the real timers we have 
are not even millisecond adjusted. We can hope for ~few ms variation in one 
datacenter or in presence of atomic clocks.

Time-based bits contribute to global uniqueness, but certainly they are not 
that effective as counter bits.

Time-based bits do not provide local sortability guarantees: some UUIDs might 
get same microseconds or be affected by leap backwards.

I think that microseconds are good only for hardware-specific solutions, not 
for something that runs on variety of platforms, OSes, devices.


Best regards, Andrey Borodin.


v23-0001-Implement-UUID-v7.patch
Description: Binary data


Re: Weird test mixup

2024-03-17 Thread Andrey M. Borodin



> On 18 Mar 2024, at 06:04, Michael Paquier  wrote:
> 
> new function call injection_points_local() that can
> be added on top of a SQL test to make it concurrent-safe.

Maybe consider function injection_points_attach_local(‘point name’) instead of 
static switch?
Or even injection_points_attach_global(‘point name’), while function 
injection_points_attach(‘point name’) will be global? This would favour writing 
concurrent test by default…


Best regards, Andrey Borodin.



Re: UUID v7

2024-03-16 Thread Andrey M. Borodin


> On 15 Mar 2024, at 14:47, Aleksander Alekseev  
> wrote:
> 
> +1 to the idea. I doubt that anyone will miss it.

PFA v22.

Changes:
1. Squashed all editorialisation by Jelte
2. Fixed my erroneous comments on using Method 2 (we are using method 1 instead)
3. Remove all traces of uuid_extract_variant()

Thanks!


Best regards, Andrey Borodin.


v22-0001-Implement-UUID-v7.patch
Description: Binary data


Re: UUID v7

2024-03-14 Thread Andrey M. Borodin



> On 14 Mar 2024, at 20:10, Peter Eisentraut  wrote:
> 
> I think the behavior of uuid_extract_var(iant) is wrong.  The code
> takes just two bits to return, but the draft document is quite clear
> that the variant is 4 bits (see Table 1).
 Well, it was correct only for implemented variant. I've made version that 
 implements full table 1 from section 4.1.
>>> I think we are still interpreting this differently.  I think 
>>> uuid_extract_variant should just return whatever is in those four bits. 
>>> Your function comment says "Can return only 0, 0b10, 0b110 and 0b111.", 
>>> which I don't think it is correct.  It should return 0 through 15.
>> We will return "do not care" bits. This bits can confuse someone. E.g. for 
>> varaint 0b10 we can return 8, 9, 10 and 11 randomly. Is it OK? BTW for some 
>> reason document lists number 1-15, but your are correct that range is 0-15.
> 
> I agree it's confusing.  Before I studied the RFC 4122bis project, I didn't 
> even know about variant vs. version.  I think overall people will find this 
> more confusing than useful.  If you just want to know, "is this UUID of the 
> kind specified in RFC 4122", you can query it with uuid_extract_version(x) IS 
> NOT NULL.  So maybe we don't need the _extract_variant function?

I think it's the best possible solution. The variant has no value besides 
detecting if a version can be extracted.


Best regards, Andrey Borodin.



Re: UUID v7

2024-03-14 Thread Andrey M. Borodin



> On 14 Mar 2024, at 16:07, Peter Eisentraut  wrote:
> 
> On 10.03.24 13:59, Andrey M. Borodin wrote:
>>> The functions uuid_extract_ver and uuid_extract_var could be named
>>> uuid_extract_version and uuid_extract_variant.  Otherwise, it's hard
>>> to tell them apart, with only one letter different.
>> Renamed.
> 
> Another related comment: Throughout your patch, swap the order of 
> uuid_extract_variant and uuid_extract_version.  First, this makes more sense 
> because version is subordinate to variant, and also it makes it alphabetical.
I will do it soon.
> 
>>> I think the behavior of uuid_extract_var(iant) is wrong.  The code
>>> takes just two bits to return, but the draft document is quite clear
>>> that the variant is 4 bits (see Table 1).
>> Well, it was correct only for implemented variant. I've made version that 
>> implements full table 1 from section 4.1.
> 
> I think we are still interpreting this differently.  I think 
> uuid_extract_variant should just return whatever is in those four bits. Your 
> function comment says "Can return only 0, 0b10, 0b110 and 0b111.", which I 
> don't think it is correct.  It should return 0 through 15.
We will return "do not care" bits. This bits can confuse someone. E.g. for 
varaint 0b10 we can return 8, 9, 10 and 11 randomly. Is it OK? BTW for some 
reason document lists number 1-15, but your are correct that range is 0-15.

> 
>>> I would have expected that, since gettimeofday() provides microsecond
>>> precision, we'd put the extra precision into "rand_a" as per Section 6.2 
>>> method 3.
>> I had chosen method 2 over method 3 as most portable. Can we be sure how 
>> many bits (after reading milliseconds) are there across different OSes?
> 
> I think this should have been researched.  If we don't know how many bits we 
> have, how do we know we have enough for milliseconds?  I think we should at 
> least have some kind of idea, if we are going to have this conversation.

Bits for milliseconds are strictly defined by the document: there are always 48 
bits, independently from clock resolution.
But I don't think it's main problem for Method 3. Method 1 actually guarantees 
strictly increasing order of UUIDs generated by single backend. Method 3 can 
generate a lot of unsorted data in case of time leaping backward.

BTW Kyzer (in an off-list discussion) and Sergey confirmed that implemented 
method from the patch actually is Method 1.


Best regards, Andrey Borodin.



Re: Transaction timeout

2024-03-12 Thread Andrey M. Borodin


> On 13 Mar 2024, at 05:23, Alexander Korotkov  wrote:
> 
> On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin  
> wrote:
>>> On 11 Mar 2024, at 16:18, Alexander Korotkov  wrote:
>>> 
>>> I think if checking psql stderr is problematic, checking just logs is
>>> fine.  Could we wait for the relevant log messages one by one with
>>> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
>> 
>> PFA version with $node->wait_for_log()
> 
> I've slightly revised the patch.  I'm going to push it if no objections.

One small note: log_offset was updated, but was not used.

Thanks!


Best regards, Andrey Borodin.


v6-0001-Add-TAP-tests-for-timeouts.patch
Description: Binary data


Re: Transaction timeout

2024-03-12 Thread Andrey M. Borodin


> On 11 Mar 2024, at 16:18, Alexander Korotkov  wrote:
> 
> I think if checking psql stderr is problematic, checking just logs is
> fine.  Could we wait for the relevant log messages one by one with
> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?

PFA version with $node->wait_for_log()

Best regards, Andrey Borodin.


v4-0001-Add-timeouts-TAP-tests.patch
Description: Binary data


Re: UUID v7

2024-03-12 Thread Andrey M. Borodin



> On 12 Mar 2024, at 10:53, Michael Paquier  wrote:
> 
> It does not strike me as a good idea to rush an implementation without
> a specification officially approved because there is always a risk of
> shipping something that's non-compliant into core.  But perhaps I am
> missing something on the RFC side?

Upthread one of document’s authors commented:

> On 14 Feb 2023, at 19:13, Kyzer Davis (kydavis)  wrote:
> 
> The point is 99% of the work since adoption by the IETF has been ironing out 
> RFC4122's problems and nothing major related to UUIDv6/7/8 which are all in a 
> very good state.

And also


> On 22 Jan 2024, at 09:22, Nikolay Samokhvalov  wrote:
> 
> And many libraries are already including implementation of UUIDv7 – here are 
> some examples:
> 
> - https://www.npmjs.com/package/uuidv7
> - https://crates.io/crates/uuidv7
> - https://github.com/google/uuid/pull/139

So at least reviewing patch and agreeing on chosen methods and constants makes 
sense.


Best regards, Andrey Borodin.



Re: UUID v7

2024-03-11 Thread Andrey M. Borodin



> On 11 Mar 2024, at 20:56, Jelte Fennema-Nio  wrote:
> 
> Attached a few comment fixes/improvements and a pgindent run (patch 0002-0004)

Thanks!

> Now with the added comments, one thing pops out to me: The comments
> mention that we use "Monotonic Random", but when I read the spec that
> explicitly recommends against using an increment of 1 when using
> monotonic random. I feel like if we use an increment of 1, we're
> better off going for the "Fixed-Length Dedicated Counter Bits" method
> (i.e. change the code to start the counter at 0). See patch 0005 for
> an example of that change.
> 
> I'm also wondering if we really want to use the extra rand_b bits for
> this. The spec says we MAY, but it does remove the amount of
> randomness in our UUIDs.

Method 1 is a just a Method 2 with specifically picked constants.
But I'll have to use some hand-wavy wordings...

UUID consists of these 128 bits:
a. Mandatory 2 var and 4 ver bits.
b. Flexible but strongly recommended 48 bits unix_ts_ms. These bits contribute 
to global sortability of values generated at frequency less than 1KHz.
c. Counter bits:
c1. Initialised with 0 on any time tick.
c2. Initialised with randomness.
c3*. bit width of a counter step (*not counted in 128 bit capacity, can be 
non-integral)
d. Randomness bits.

Method 1 is when c2=0. My implementation of method 2 uses c1=1, c2=17

Consider all UUIDs generated at any given milliseconds. Probability of a 
collision of two UUIDs generated at frequency less than 1KHz is p = 2^-(c2+d)
Capacity of a counter has expected value of c = 2^(c1)*2^(c2-1)/2^c3
To guess next UUID you can correctly pick one of u = 2^(d+c3)

First, observe that c3 contributes unguessability at exactly same scale as 
decreases counter capacity. There is no difference between using bits in d 
directly, or in c3. There is no point in non-zero c3. Every bit that could be 
given to c3 can equally be given to d.

Second, observe that c2 bits contribute to both collision protection and 
counter capacity! And when the time ticks, c2 also contribute to 
unguessability! So, technically, we should consider using all available bits as 
c2 bits.

How many c1 bits do we need? I've chosen one - to prevent occasional counter 
capacity reduction.

If c1 = 1, we can distribute 73 bits between c2 and d. I've chosen c2 = 17 and 
d = 56 as an arbitrary compromise between capacity of one backend per ms and 
prevention of global collision.
This compromise is mostly dictated by maximum frequency of UUID generation by 
one backend, I've chosen 200MHz as a sane value.


This compromise is much easier when you do not have 74 spare bits, this crazy 
amount of information forgives almost any mistake. Imagine you have to 
distribute 10 bits between c2 and d. And you try to prevent collision between 
10 independent devices which need capacity to generate IDs with frequency of 
10KHz each and keep sortability. You would have something like c1=1, c2=3,d=6.

Sorry for this long and vague explanation, if it still seems too uncertain we 
can have a chat or something like that. I don't think this number picking stuff 
deserve to be commented, because it still is quite close to random. RFC gives 
us too much freedom of choice.

Thanks!


Best regards, Andrey Borodin.



Re: Transaction timeout

2024-03-11 Thread Andrey M. Borodin


> On 7 Mar 2024, at 00:55, Alexander Korotkov  wrote:
> 
> On Wed, Mar 6, 2024 at 10:22 AM Andrey M. Borodin  
> wrote:
>>> On 25 Feb 2024, at 21:50, Alexander Korotkov  wrote:
>>> 
>>> Thank you for the patches.  I've pushed the 0001 patch to avoid
>>> further failures on buildfarm.  Let 0004 wait till injections points
>>> by Mechael are committed.
>> 
>> Thanks!
>> 
>> All prerequisites are committed. I propose something in a line with this 
>> patch.
> 
> Thank you.  I took a look at the patch.  Should we also check the
> relevant message after the timeout is fired?  We could check it in
> psql stderr or log for that.

PFA version which checks log output.
But I could not come up with a proper use of BackgroundPsql->query_until() to 
check outputs. And there are multiple possible errors.

We can copy test from src/bin/psql/t/001_basic.pl:

# test behavior and output on server crash
my ($ret, $out, $err) = $node->psql('postgres',
"SELECT 'before' AS running;\n"
. "SELECT pg_terminate_backend(pg_backend_pid());\n"
. "SELECT 'AFTER' AS not_running;\n");

is($ret, 2, 'server crash: psql exit code');
like($out, qr/before/, 'server crash: output before crash');
ok($out !~ qr/AFTER/, 'server crash: no output after crash');
is( $err,
'psql::2: FATAL: terminating connection due to administrator command
psql::2: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql::2: error: connection to server was lost',
'server crash: error message’);

But I do not see much value in this.
What do you think?


Best regards, Andrey Borodin.



v3-0001-Add-timeouts-TAP-tests.patch
Description: Binary data


Re: Amcheck verification of GiST and GIN

2024-03-11 Thread Andrey M. Borodin



> On 20 Jan 2024, at 07:46, vignesh C  wrote:
> 
> I have changed the status of the commitfest entry to "Waiting on
> Author" as there was no follow-up on Alexander's queries. Feel free to
> address them and change the commitfest entry accordingly.

Thanks Vignesh!

At the moment it’s obvious that this change will not be in 17, but I have plans 
to continue work on this. So I’ll move this item to July CF.


Best regards, Andrey Borodin.



Re: UUID v7

2024-03-10 Thread Andrey M. Borodin


> On 10 Mar 2024, at 17:59, Andrey M. Borodin  wrote:
> 
> I tried to "make docs", but it gives me gazilion of errors... Is there an 
> easy way to see resulting HTML?

Oops, CFbot expectedly found a problem...
Sorry for the noise, this version, I hope, will pass all the tests.
Thanks!


Best regards, Andrey Borodin.


v19-0001-Implement-UUID-v7.patch
Description: Binary data


Re: UUID v7

2024-03-10 Thread Andrey M. Borodin
Hi Peter,

thank you for so thoughtful review.

> On 6 Mar 2024, at 12:13, Peter Eisentraut  wrote:
> 
> I have various comments on this patch:
> 
> 
> - doc/src/sgml/func.sgml
> 
> The documentation of the new functions should be broken up a bit.
> It's all one paragraph now.  At least make it several paragraphs, or
> possibly tables or something else.
I've split functions to generate UUIDs from functions to extract stuff.

> 
> Avoid listing the functions twice: Once before the description and
> then again in the description.  That's just going to get out of date.
> The first listing is not necessary, I think.

Fixed.

> The return values in the documentation should use the public-facing
> type names, like "timestamp with time zone" and "smallint".

Fixed.

> The descriptions of the UUID generation functions use handwavy
> language in their descriptions, like "It provides much better data
> locality" or "unacceptable security or business intelligence
> implications", which isn't useful.  Either we cut that all out and
> just say, it creates a UUIDv7, done, look elsewhere for more
> information, or we provide some more concretely useful details.

I've removed all that stuff entirely.

> We shouldn't label a link as "IETF standard" when it's actually a
> draft.

Fixed.

Well, all my modifications of documentation are kind of blind... I tried to 
"make docs", but it gives me gazilion of errors... Is there an easy way to see 
resulting HTML?


> - src/include/catalog/pg_proc.dat
> 
> The description of uuidv4 should be "generate UUID version 4", so that
> it parallels uuidv7.

Fixed.

> The description of uuid_extract_time says 'extract timestamp from UUID
> version 7', the implementation is not limited to version 7.

Fixed.

> I think uuid_extract_time should be named uuid_extract_timestamp,
> because it extracts a timestamp, not a time.

Renamed.

> The functions uuid_extract_ver and uuid_extract_var could be named
> uuid_extract_version and uuid_extract_variant.  Otherwise, it's hard
> to tell them apart, with only one letter different.

Renamed.

> - src/test/regress/sql/uuid.sql
> 
> Why are the tests using the input format '{...}', which is not the
> standard one?

Fixed.

> - src/backend/utils/adt/uuid.c
> 
> All this new code should have more comments.  There is a lot of bit
> twiddling going on, and I suppose one is expected to follow along in
> the RFC?  At least each function should have a header comment, so one
> doesn't have to check in pg_proc.dat what it's supposed to do.

I've added some header comment. One big comment is attached to v7, I tried to 
take parts mostly from RFC. Yet there are a lot of my additions that now need 
review...

> I'm suspicious that these functions all appear to return null for
> erroneous input, rather than raising errors.  I think at least some
> explanation for this should be recorded somewhere.

The input is not erroneous per se.
But the fact that
# select 1/0;
ERROR:  division by zero
makes me consider throwing an error. There was some argumentation upthread for 
not throwing error though, but now I cannot find it... maybe I accepted this 
behaviour as more user-friendly.

> I think the behavior of uuid_extract_var(iant) is wrong.  The code
> takes just two bits to return, but the draft document is quite clear
> that the variant is 4 bits (see Table 1).

Well, it was correct only for implemented variant. I've made version that 
implements full table 1 from section 4.1.

> The uuidv7 function could really use a header comment that explains
> the choices that were made.  The RFC draft provides various options
> that implementations could use; we should describe which ones we
> chose.

Done.

> 
> I would have expected that, since gettimeofday() provides microsecond
> precision, we'd put the extra precision into "rand_a" as per Section 6.2 
> method 3.

I had chosen method 2 over method 3 as most portable. Can we be sure how many 
bits (after reading milliseconds) are there across different OSes? Even if we 
put extra 10 bits of timestamp, we cannot extract safely them.
These bits could promote inter-backend stortability. E.i. when many backends 
generate data fast - this data is still somewhat ordered even within 1ms. But I 
think benefits of this sortability are outweighed by portability(unknown real 
resolution), simplicity(we don't store microseconds, thus do not try to extract 
them).
All this arguments are weak, but if one method would be strictly better than 
another - there would be only one method.

> 
> You use some kind of counter, but could you explain which method that
> counter implements?
I described counter in uuidv7() header.

> 
> I don't see any acknowledgment of issues relating to concurrency or
> restarts.  Like, how do we prevent duplicates being generated by
> concurrent sessions or between restarts?  Maybe the counter or random
> stuff does that, but it's not explained.

I think restart takes more than 1ms, so this is covered with time 

Re: CF entries for 17 to be reviewed

2024-03-08 Thread Andrey M. Borodin



> On 6 Mar 2024, at 18:49, Andrey M. Borodin  wrote:
> 
> Here are statuses for "Refactoring" section:

I've made a pass through "Replication and Recovery" and "SQL commands" sections.
"SQL commands" section seems to me most interesting stuff on the commitfest 
(but I'm far from inspecting every thing thread yet). But reviewing patches 
from this section is not a work for one CF definitely. Be prepared that this is 
a long run, with many iterations and occasional architectural pivots. Typically 
reviewers work on these items for much more than one month.

Replication and Recovery
* Synchronizing slots from primary to standby
Titanic work. A lot of stuff already pushed, v108 is now in the discussion. 
ISTM that entry barrier of jumping into discussion with something useful is 
really high.
* CREATE SUBSCRIPTION ... SERVER
The discussion stopped in January. Authors posted new version recently.
* speed up a logical replication setup (pg_createsubscriber)
Newest version posted recently, but it fails CFbot's tests. Pinged authors.
* Have pg_basebackup write "dbname" in "primary_conninfo"?
Some feedback and descussin provided. Switched to WoA.

SQL Commands
* Add SPLIT PARTITION/MERGE PARTITIONS commands
Cool new commands, very useful for sharding. CF item was RfC recently, need 
review update after rebase.
* Add CANONICAL option to xmlserialize
Vignesh C and Chapman Flack provided some feedback back in October 2023, 
but the patch still needs review.
* Incremental View Maintenance
This is a super cool feature. IMO at this point is too late for 17, but you 
should consider reviewing this not because it's easy, but because it's hard. 
It's real rocket science. Fine-grained 11-step patchset, which can change a lot 
of workloads if committed. CFbot finds some failures, but this should not stop 
anyone frome reviewing in this case.
* Implement row pattern recognition feature
SQL:2016 feature, carefully split into 8 steps. Needs review, probably 
review in a long run. The feature seems big. 3 reviewers are working on this, 
but no recent input for some months.
* COPY TO json
Active thread with many different authors proposing different patches. I 
could not summarize it, asked CF entry author for help.
* Add new error_action COPY ON_ERROR "log"
There's an active discussion in the thread.
* add COPY option RECECT_LIMIT
While the patch seems much similar to previous, there's no discussion in 
this thread...
 
Thanks!


Best regards, Andrey Borodin.



Re: Emitting JSON to file using COPY TO

2024-03-08 Thread Andrey M. Borodin
Hello everyone!

Thanks for working on this, really nice feature!

> On 9 Jan 2024, at 01:40, Joe Conway  wrote:
> 
> Thanks -- will have a look

Joe, recently folks proposed a lot of patches in this thread that seem like 
diverted from original way of implementation.
As an author of CF entry [0] can you please comment on which patch version 
needs review?

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4716/



Re: Commitfest Manager for March

2024-03-08 Thread Andrey M. Borodin



> On 4 Mar 2024, at 17:09, Aleksander Alekseev  wrote:
> 
> If you need any help please let me know.

Aleksander, I would greatly appreciate if you join me in managing CF. Together 
we can move more stuff :)
Currently, I'm going through "SQL Commands". And so far I had not come to 
"Performance" and "Server Features" at all... So if you can handle updating 
statuses of that sections - that would be great.


Best regards, Andrey Borodin.



Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-03-08 Thread Andrey M. Borodin



> On 26 Jan 2024, at 23:36, Dmitry Koval  wrote:
> 
> 

The CF entry was in Ready for Committer state no so long ago.
Stephane, you might want to review recent version after it was rebased on 
current HEAD. CFbot's test passed successfully.

Thanks!


Best regards, Andrey Borodin.



Re: speed up a logical replica setup

2024-03-08 Thread Andrey M. Borodin



> On 8 Mar 2024, at 12:03, Shlok Kyal  wrote:
> 
> 

I haven't digged into the thread, but recent version fails some CFbot's tests.

http://commitfest.cputube.org/euler-taveira.html
https://cirrus-ci.com/task/4833499115421696
==29928==ERROR: AddressSanitizer: heap-use-after-free on address 0x61a01458 
at pc 0x7f3b29fdedce bp 0x7ffe68fcf1c0 sp 0x7ffe68fcf1b8

Thanks!


Best regards, Andrey Borodin.



Re: 035_standby_logical_decoding unbounded hang

2024-03-07 Thread Andrey M. Borodin



> On 20 Feb 2024, at 04:09, Noah Misch  wrote:
> 

I’m not sure if it is connected, but so far many patches in CFbot keep hanging 
in this test. For example [0].

I haven’t done root cause analysis yet, but hangs may be related to this 
thread. Maybe someone more familiar with similar issues could take a look there?

Thanks!


Best regards, Andrey Borodin.

[0] https://cirrus-ci.com/task/5911176840740864?logs=check_world#L292 



Re: CF entries for 17 to be reviewed

2024-03-06 Thread Andrey M. Borodin



> On 4 Mar 2024, at 14:51, Andrey M. Borodin  wrote:
> 
> I’ve read other small sections.

Here are statuses for "Refactoring" section:
* New [relation] options engine
Relatively heavy refactoring. Author keeps interest to the patch for 
some years now. As I understood the main problem is that big refactoring cannot 
be split into incremental steps. Definitely worth reviewing, but I think not 
for 17 already...
* Confine vacuum skip logic to lazy_scan_skip
There was a discussion at the end of 2023, but no recent review 
activity. Author actively improves the patchset.
* Change prefetch and read strategies to use range in pg_prewarm
Some discussion is happening. Changed to WoA to reflect actual status.
* Potential issue in ecpg-informix decimal converting functions
On Daniel's TODO list.
* BitmapHeapScan table AM violation removal (and use streaming read API)
Active discussion with reviewers is going on.
* Streaming read sequential and TID range scan
Seems like discussion on this patch is going on in nearby threads. In 
this thread I observe only improved patch versions posted.

All in all "Refactoring" section seemed to me more complex and demanding 
in-depth knowledge. It's difficult to judge why new approaches are an 
improvement. So for newcomer reviewers I'd recommend to look to other sections.

Thanks.


Best regards, Andrey Borodin.



Re: Transaction timeout

2024-03-06 Thread Andrey M. Borodin


> On 25 Feb 2024, at 21:50, Alexander Korotkov  wrote:
> 
> Thank you for the patches.  I've pushed the 0001 patch to avoid
> further failures on buildfarm.  Let 0004 wait till injections points
> by Mechael are committed.

Thanks!

All prerequisites are committed. I propose something in a line with this patch.


Best regards, Andrey Borodin.


v2-0001-Add-timeouts-TAP-tests.patch
Description: Binary data


Re: Tidy fill hstv array (src/backend/access/heap/pruneheap.c)

2024-03-04 Thread Andrey M. Borodin



> On 4 Mar 2024, at 16:47, Ranier Vilela  wrote:
> 
> Does filling a memory area, one by one, with branches, need strong evidence 
> to prove that it is better than filling a memory area, all at once, without 
> branches? 

I apologise for being too quick to decide to mark the patch RwF. Wold you mind 
if restore patch as "Needs review" so we can have more feedback?


Best regards, Andrey Borodin.



Re: CF entries for 17 to be reviewed

2024-03-04 Thread Andrey M. Borodin



> On 4 Mar 2024, at 13:42, Andrey M. Borodin  wrote:
> 
> Here’s my take on “Miscellaneous” section.

I’ve read other small sections.

Monitoring & Control
* Logging parallel worker draught
The patchset on improving loggin os resource starvation when parallel 
workers are spawned. Some major refactoring proposed. I've switched to WoA.
* System username in pg_stat_activity
There's active discussion on extending or not pg_stat_activity.

Testing
* Add basic tests for the low-level backup method
Michael Paquier provided feedback, so I switched to WoA.

System Administration
* recovery modules
There was a very active discussion, but after April 2023 it stalled, 
and only some rebases are there. Maybe a fresh look could revive the thread.
* Possibility to disable `ALTER SYSTEM`
The discussion seems active, but inconclusive.
* Add checkpoint/redo LSNs to recovery errors.
Michael Paquier provided feedback, so I switched to WoA.

Security
* add not_before and not_after timestamps to sslinfo extension and pg_stat_ssl
Most recent version provided by Daniel Gustafsson, but the thread 
stalled in September 2023. Maybe Jacob or some other reviewer could refresh it, 
IMO this might land in 17.

Thanks!


Best regards, Andrey Borodin.



Re: CF entries for 17 to be reviewed

2024-03-04 Thread Andrey M. Borodin



> On 3 Mar 2024, at 01:19, Melanie Plageman  wrote:
> 
>  I'm not
> sure if the ones that do have a target version = 17 are actually all
> the patches targeting 17.

Yes, for me it’s only a hint where to bump things up. I will extend scope on 
other versions when I fill finish a pass though entries with version 17.


Here’s my take on “Miscellaneous” section.


* Permute underscore separated components of columns before fuzzy matching
The patch received some review, but not for latest version. I pinged 
the reviewer for an update.
* Add pg_wait_for_lockers() function
Citus-related patch, but may be of use to other distributed systems. 
This patch worth attention at least because author replied to himself 10+ 
times. Interesting addition, some feedback from Andres and Laurenz.
* Improve the log message output of basic_archive when 
basic_archive.archive_directory parameter is not set
Relatively simple change to improve user-friendliness. Daniel 
Gustafsson expressed interest recently.
* Fix log_line_prefix to display the transaction id (%x) for statements not in 
a transaction block
Reasonable log improvement, code is simple but in a tricky place. There 
was some feedback, I've asked if respondent can be a reviewer.
* Add Index-level REINDEX with multiple jobs
An addition to DBA toolset. Some unaddressed feedback, pinged authors.
* Add LSN <-> time conversion facility
There's ongoing discussion between Melanie and Tomas. Relatively 
heavyweight patchset, but given current solid technical level of the discussion 
this might land into 17. Take your chance to chime-in with review! :)
* date_trunc function in interval version
Some time tricks. There are review notes by Tomas. I pinged authors.
* Adding comments to help understand psql hidden queries
A couple of patches for psql --echo-hidden. Seems useful and simple. No 
reviews at all though. I moved the patch to "Clients" to reflect actual patch 
purpose and lighten generic “Miscellaneous".
* Improving EXPLAIN's display of SubPlan nodes
Some EXPLAIN changes, Alexander Alekseev was looking into this. I've 
asked him if he would be the reviewer.
* Should we remove -Wdeclaration-after-statement?
Not really a patch, kind of an opinion poll. The result is now kind of 
-BigNumber, I see no chances for this to get into 17, but maybe in future.
* Add to_regtypemod() SQL function
Cool nice function, some reviewers approved the patch. I've took a 
glance on the patch, seems nice, switched to "Ready for Committer". Some 
unrelated changes to genbki.pl, but according to thread it was needed for 
something.
* un-revert MAINTAIN privilege and pg_maintain predefined role
The work seems to be going on.
* Checkpoint extension hook
The patch is not provided yet. I've pinged the thread.

Stay tuned, I hope everyone interested in reviewing will find themself a cool 
interesting patch or two.


Best regards, Andrey Borodin.



Re: Comments on Custom RMGRs

2024-03-04 Thread Andrey M. Borodin



> On 29 Feb 2024, at 19:47, Danil Anisimow  wrote:
> 
> Answering your questions might take some time as I want to write a sample 
> patch for pg_stat_statements and make some tests.
> What do you think about putting the patch to commitfest as it closing in a 
> few hours?

I’ve switched the patch to “Waiting on Author” to indicate that currently patch 
is not available yet. Please, flip it back when it’s available for review.

Thanks!


Best regards, Andrey Borodin.



Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-03 Thread Andrey M. Borodin



> On 17 Feb 2024, at 00:38, Tom Lane  wrote:
> 
> Here's a rebase over 9f1337639 --- no code changes, but this affects
> some of the new or changed expected outputs from that commit.

Aleksander, as long as your was reviewing this previously, I’ve added you to 
reviewers of this CF entry [0]. Please, ping me or remove yourself, it it’s 
actually not a case.

BTW, as long as there’s a new version and some input from Tom, would you be 
willing to post fleshier review?

Thanks for working on this!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4782/



Re: date_trunc function in interval version

2024-03-03 Thread Andrey M. Borodin



> On 18 Feb 2024, at 05:29, Tomas Vondra  wrote:
> 
> I'm not very familiar with date_bin(), but is this issue inherent or
> could we maybe fix date_bin() to handle DST better?
> 
> In particular, isn't part of the problem that date_bin() is defined only
> for timestamp and not for timestamptz? Also, date_trunc() allows to
> specify a timezone, but date_bin() does not.
> 
> 
> In any case, the patch needs to add the new stuff to the SGML docs (to
> doc/src/sgml/func.sgml), which now documents the date_trunc(text,...)
> variant only.

Hi Przemysław,

Please address above notes.
I’ve flipped CF entry [0] to “Waiting on author”, feel free to switch it back.

Thank you!


Best regards, Andrey Borodin.
[0] https://commitfest.postgresql.org/47/4761/



Re: Add Index-level REINDEX with multiple jobs

2024-03-03 Thread Andrey M. Borodin


> On 6 Feb 2024, at 11:21, Michael Paquier  wrote:
> 
> The problem may be actually trickier than that, no?  Could there be
> other factors to take into account for their classification, like
> their sizes (typically, we'd want to process the biggest one first, I
> guess)?

Maxim, what do you think about it?


Best regards, Andrey Borodin.




Re: Tidy fill hstv array (src/backend/access/heap/pruneheap.c)

2024-03-03 Thread Andrey M. Borodin



> On 14 Jan 2024, at 18:55, John Naylor  wrote:
> 
> On Sat, Jan 13, 2024 at 9:36 PM Ranier Vilela  wrote:
>> 
>> Em ter., 9 de jan. de 2024 às 06:31, John Naylor  
>> escreveu:
> 
>>> This just moves an operation from one place to the other, while
>>> obliterating the explanatory comment, so I don't see an advantage.
>> 
>> Well, I think that is precisely the case for using memset.
>> The way initialization is currently done is much slower and harmful to the 
>> branch.
>> Of course, the gain should be small, but it is fully justified for switching 
>> to memset.
> 
> We haven't seen any evidence or reasoning for that. Simple
> rules-of-thumb are not enough.
> 

Hi Ranier,

I’ll mark CF entry [0] as “Returned with Feedback”. Feel free to reopen item in 
this CF or submit to the next, if you want to continue working on this.

I took a glance into the patch, and I would agree that setting field nonzero 
values with memset() is somewhat unusual. Please provide stronger evidence to 
do so.

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4734/



Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-03-03 Thread Andrey M. Borodin



> On 12 Jan 2024, at 05:51, jian he  wrote:
> 
> another big  difference compare to HEAD:

Hi Jian,

thanks for looking into this. Would you be willing to review the next version 
of the patch?
As long as you are looking into this, you might be interested in registering 
yourself in a CF entry as a reviewer. [0]

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4606/



Re: Permute underscore separated components of columns before fuzzy matching

2024-03-03 Thread Andrey M. Borodin



> On 23 Jan 2024, at 09:42, Arne Roland  wrote:
> 
> <0001-fuzzy_underscore_permutation_v5.patch>

Mikhail, there’s a new patch version. May I ask you to review it?


Best regards, Andrey Borodin.



Re: PostgreSQL Contributors Updates

2024-03-03 Thread Andrey M. Borodin



> On 3 Mar 2024, at 20:57, Joe Conway  wrote:
> 
> New PostgreSQL Contributors:
> 
> * Bertrand Drouvot
> * Gabriele Bartolini
> * Richard Guo
> 
> New PostgreSQL Major Contributors:
> 
> * Alexander Lakhin
> * Daniel Gustafsson
> * Dean Rasheed
> * John Naylor
> * Melanie Plageman
> * Nathan Bossart

Congratulations! And thank you for your work on Postgres!


Best regards, Andrey Borodin.




Re: Injection points: some tools to wait and wake

2024-03-03 Thread Andrey M. Borodin



> On 4 Mar 2024, at 06:44, Michael Paquier  wrote:
> so I have applied it 

Great! Thank you! A really useful stuff for an asynchronous testing!


> On 4 Mar 2024, at 09:17, Jelte Fennema-Nio  wrote:
> 
> this code is included in src/test/modules. It sounds like that
> location will make it somewhat hard to install.

+1. I joined the thread too late to ask why it’s not in core. But, it seems to 
me that separating logic even further is not necessary, given build option 
—with-injection-points off by default.

> If that's indeed the
> case, would it be possible to move it to contrib instead?

There’s no point in shipping this to users, especially with the build without 
injection points compiled.


Best regards, Andrey Borodin.



  1   2   3   >