Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-09 Thread Michail Nikolaev
Hello, Matthias and others! I realized the new horizon was applied only during the validation phase (once the index is marked as ready). Now it is applied while the index is not yet marked as valid. Updated version attached. -- > I think the best way for this to work

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-07 Thread Michail Nikolaev
Hi again! I made an error in the `GlobalVisHorizonKindForRel` logic, and it was caught by a new test. Fixed version attached. > From 9a8ea366f6d2d144979e825c4ac0bdd2937bf7c1 Mon Sep 17 00:00:00 2001 From: nkey Date: Tue, 7 May 2024 22:10:56 +0200 Subject: [PATCH v3] WIP: fix d9d076222f5b "VACUUM:

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-07 Thread Michail Nikolaev
Hello, Matthias and others! Updated WIP attached. The changes are: * renaming - it now feels better to me * a more reliable approach in `GlobalVisHorizonKindForRel`, making sure we have not missed `rd_safeindexconcurrentlybuilding` by calling `RelationGetIndexList` if required * an optimization to avoid

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-05 Thread Michail Nikolaev
Hello, Matthias! I just realized there is a much simpler and safer way to deal with the problem. So, d9d076222f5b94a85e0e318339cfc44b8f26022d (1) had a bug because the scan was not protected by a snapshot. At the same time, we want this snapshot to affect not all relations, but only a subset

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-04 Thread Michail Nikolaev
Hello, Matthias! > We can just release the current snapshot, and get a new one, right? I > mean, we don't actually use the transaction for much else than > visibility during the first scan, and I don't think there is a need > for an actual transaction ID until we're ready to mark the index entry
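For illustration, releasing and retaking the snapshot between batches of the first scan might look roughly like this - a sketch only, built from the real snapmgr.c API, not the actual patch:

    /* Sketch: periodically swap the snapshot used by the first R/CIC scan,
     * so this backend stops holding back the global xmin horizon. */
    PopActiveSnapshot();
    UnregisterSnapshot(snapshot);
    InvalidateCatalogSnapshot();    /* avoid pinning the horizon via the catalog snapshot */
    snapshot = RegisterSnapshot(GetTransactionSnapshot());
    PushActiveSnapshot(snapshot);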

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-03-07 Thread Michail Nikolaev
Hello! > I'm not a fan of this approach. Changing visibility and cleanup > semantics to only benefit R/CIC sounds like a pain to work with in > essentially all visibility-related code. I'd much rather have to deal > with another index AM, even if it takes more time: the changes in > semantics

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-02-21 Thread Michail Nikolaev
Hi! > How do you suppose this would work differently from a long-lived > normal snapshot, which is how it works right now? The difference is the ability to take a new visibility snapshot periodically during the second phase, rechecking the visibility of each tuple against the "reference" snapshot

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-02-21 Thread Michail Nikolaev
One more idea is to simply forbid HOT pruning while the second phase is running. It is currently not possible anyway because of the snapshot being held. Possible enhancements: * we may apply the restriction only to particular tables * we may apply the restriction only to some of the tables (those not yet scanned by R/CIC); a sketch of this variant follows.
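A sketch of the per-relation variant (the helper name is hypothetical; heap_page_prune_opt() is the real opportunistic-pruning entry point):

    /* Hypothetical guard at the top of heap_page_prune_opt(): skip HOT
     * pruning for relations that an R/CIC has not finished scanning yet. */
    if (RelationHasUnfinishedConcurrentIndexBuild(relation))
        return;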

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-02-20 Thread Michail Nikolaev
Hello! > I think the best way for this to work would be an index method that > exclusively stores TIDs, and of which we can quickly determine new > tuples, too. I was thinking about something like GIN's format, but > using (generation number, tid) instead of ([colno, colvalue], tid) as > key data
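The key layout described in the quote could be as small as this (an illustrative sketch; the struct name is made up):

    /* A TID-only auxiliary index entry: no column values at all, just a
     * generation counter plus the heap TID, so tuples newer than a given
     * generation are cheap to enumerate. */
    typedef struct AuxIndexKey
    {
        uint32          generation; /* bumped per R/CIC scan round */
        ItemPointerData tid;        /* heap tuple identifier */
    } AuxIndexKey;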

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-02-01 Thread Michail Nikolaev
> > > I just realised there is one issue with this design: We can't cheaply > > > reset the snapshot during the second table scan: > > > It is critically important that the second scan of R/CIC uses an index > > > contents summary (made with index_bulk_delete) that was created while > > > the

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-01-09 Thread Michail Nikolaev
Hello, Melanie! Sorry to interrupt you, just a quick question. > Correct, but there are changes being discussed where we would freeze > tuples during pruning as well [0], which would invalidate that > implementation detail. And, if I had to choose between improved > opportunistic freezing and

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-01-04 Thread Michail Nikolaev
Hello! > Correct, but there are changes being discussed where we would freeze > tuples during pruning as well [0], which would invalidate that > implementation detail. And, if I had to choose between improved > opportunistic freezing and improved R/CIC, I'd probably choose > improved freezing

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-25 Thread Michail Nikolaev
Hello! It seems like the idea of the "old" snapshot is still a valid one. > Should this deal with any potential XID wraparound, too? As far as I understand, in our case we are not affected by this in any way. Vacuum on our table is not possible because of locking, so nothing may be frozen (see

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-21 Thread Michail Nikolaev
Hello. I realized my last idea is invalid (because tuples are frozen using a dynamically calculated horizon) - so, don't waste your time on it :) I need to think a little bit more here. Thanks, Mikhail.

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-20 Thread Michail Nikolaev
> Yes, good catch. > Assuming we have somehow prevented vac_truncate_clog from occurring > during CIC, we can leave frozen and potentially frozen > (xmin

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-20 Thread Michail Nikolaev
Hello! > How would this deal with tuples not visible to the old snapshot? > Presumably we can assume they're newer than that snapshot (the old > snapshot didn't have it, but the new one does, so it's committed after > the old snapshot, making them newer), so that backend must have > inserted it

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-20 Thread Michail Nikolaev
Hello! > Also, it feels like we may apply this to both phases (the first and the second > scans). > The original patch (1) helped only the second one (after the call > to set_indexsafe_procflags). Oops, I was wrong here. The original version of the patch also applied to both phases. > Note

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-17 Thread Michail Nikolaev
Hello! > I've thought about alternative solutions, too: how about getting a new > snapshot every so often? > We don't really care about the liveness of the already-scanned data; the > snapshots used for RIC > are used only during the scan. C/RIC's relation's lock level means vacuum > can't run

Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2023-12-15 Thread Michail Nikolaev
Hello, hackers! I am thinking about revisiting (1) ({CREATE INDEX, REINDEX} CONCURRENTLY improvements) in some lighter way. Yes, a serious bug (2) was caused by this optimization, and it is now reverted. But what about a safer idea in that direction: 1) add a new horizon which ignores PROC_IN_SAFE_IC
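A sketch of idea (1), as it might look inside a ComputeXidHorizons()-style loop over the proc array - PROC_IN_SAFE_IC is the real status flag set by "safe" concurrent index builds, but the extra horizon field is hypothetical:

    /* Accumulate the usual horizon, plus one that ignores backends
     * running a "safe" CREATE/REINDEX INDEX CONCURRENTLY. */
    h->data_oldest_nonremovable =
        TransactionIdOlder(h->data_oldest_nonremovable, xmin);
    if (!(statusFlags & PROC_IN_SAFE_IC))
        h->no_safe_ic_oldest_nonremovable =     /* hypothetical extra horizon */
            TransactionIdOlder(h->no_safe_ic_oldest_nonremovable, xmin);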

Re: Replace known_assigned_xids_lck by memory barrier

2023-09-05 Thread Michail Nikolaev
Thanks, everyone, for the help!

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-16 Thread Michail Nikolaev
Hello, good question! Thanks for your edits. As an answer: we probably need to change "If we know that we're holding ProcArrayLock exclusively, we don't need the read barrier." to "If we're removing an xid, we don't need the read barrier because only the startup process can remove and add xids to

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-16 Thread Michail Nikolaev
Hello! An updated version (with read barriers) is attached. > One remaining question I have is whether it is okay if we see an updated value > for one of the head/tail variables but not the other. It looks like the > tail variable is only updated with ProcArrayLock held exclusively, which > IIUC

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-15 Thread Michail Nikolaev
Hello, Nathan. > What sort of benefits do you see from this patch? It might be worthwhile > in itself to remove spinlocks when possible, but IME it's much easier to > justify such changes when there is a tangible benefit we can point to. Oh, it is not an easy question :) The answer, probably,

Replace known_assigned_xids_lck by memory barrier

2023-03-19 Thread Michail Nikolaev
%40mail.gmail.com#cc4827dee902978f93278732435e8521 From d968645551412abdb3fac6b8514c3d6746e8ac3d Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sat, 18 Mar 2023 21:28:00 +0300 Subject: [PATCH v2] Removes known_assigned_xids_lck, using the write memory barrier as suggested earlier. --- src

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-02-04 Thread Michail Nikolaev
Hello, Andres. > Not sure I like TransactionIdRetreatSafely() as a name. Maybe > TransactionIdRetreatClamped() is better? I think it is better to just replace TransactionIdRetreatedBy. It is not used anymore after `v3-0001-WIP-Fix-corruption-due-to-vacuum_defer_cleanup_ag.patch` - so, it is

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-22 Thread Michail Nikolaev
Hello. I have registered it as a patch in the commitfest: https://commitfest.postgresql.org/42/4138/ Best regards, Michail.

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-07 Thread Michail Nikolaev
Hello. A few things I have got so far: 1) It is not required to ORDER BY random() to reproduce the issue - it can be done using queries like: BEGIN; SELECT omg.* FROM something_is_wrong_here AS omg ORDER BY value -- change is here LIMIT 1 FOR UPDATE;

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-06 Thread Michail Nikolaev
Hello! Thanks for checking the issue! > FWIW, I find that the "Assert(ItemIdIsNormal(lp));" at the top of > heap_lock_tuple() is the first thing that fails on my assert-enabled > build. Yes, the same for me: TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 4292, PID:

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-05 Thread Michail Nikolaev
Hello, Andres. I apologize for the direct ping, but I think your snapshot scalability work in PG14 could be related to the issue. The TransactionIdRetreatedBy implementation looks correct... But with txid_current=212195 I see errors like "could not access status of transaction 58643736"... So,

BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-05 Thread Michail Nikolaev
Hello, hackers. It seems like PG 14 works incorrectly with vacuum_defer_cleanup_age (or just with rows that were not cleaned up, I am not sure) and SELECT FOR UPDATE + UPDATE. I am not certain, but hot_standby_feedback is probably able to cause the same issues. Steps to reproduce: 1) Start Postgres like this:
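The original start command is truncated in this listing; a hypothetical equivalent setup (the value is illustrative, not the one from the original repro) would be:

    -- Assumption: any sufficiently large vacuum_defer_cleanup_age exposes the
    -- problem; the exact value from the original reproduction is not preserved.
    ALTER SYSTEM SET vacuum_defer_cleanup_age = 1000000;
    SELECT pg_reload_conf();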

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2023-01-03 Thread Michail Nikolaev
> Does that by any chance mean you are using a non-community version of > Postgres which has some other changes? It is a managed Postgres service in a public cloud. Usually, such providers apply some custom minor patches. The only one I know about concerns forbidding the cancellation of queries while

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2023-01-03 Thread Michail Nikolaev
Hello, Amit. > The point which is not completely clear from your description is the > timing of missing records. In one of your previous emails, you seem to > have indicated that the data missed from Table B is from the time when > the initial sync for Table B was in-progress, right? Also, from

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-28 Thread Michail Nikolaev
Hello. > None of these entries are from the point mentioned by you [1] > yesterday where you didn't find the corresponding data in the > subscriber. How did you identify that the entries corresponding to > that timing were missing? Some of them are before the interval, some after... But the source

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-27 Thread Michail Nikolaev
Hello, Amit! > IUC, this is the time when only table B's initial sync was > in-progress. Table A's initial sync was finished by that time and for > Table C, it is yet not started. Yes, that is correct. C was started too, but unsuccessfully (it was restarted afterwards, see below). > During the time of > the

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-26 Thread Michail Nikolaev
Hello again. Just a small fix for: > 2022-12-14 09:21:25.705 to > 2022-12-14 09:49:20.664 (after synchronization start, but before finish). The correct values are: 2022-12-14 09:49:31.340 and 2022-12-14 09:49:41.683. So, it looks like we lost about 10s of WAL for one of the tables.

Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-26 Thread Michail Nikolaev
Hello. Just a small story about a small data loss on logical replication. We were logically replicating a 4 TB database from > PostgreSQL 12.12 (Ubuntu 12.12-201-yandex.49163.d86383ed5b) on > x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, > 64-bit to > PostgreSQL

Re: Slow standby snapshot

2022-11-29 Thread Michail Nikolaev
Hello, Tom. > Since we're running out of time in the current commitfest, > I went ahead and changed that, and made the cosmetic fixes > I wanted, and pushed. Great, thanks! The one small thing I was thinking of adding in KnownAssignedXidsCompress is an assertion like Assert(MyBackendType ==

Re: Slow standby snapshot

2022-11-22 Thread Michail Nikolaev
b9805 [2]: https://www.postgresql.org/message-id/flat/CANtu0oiPoSdQsjRd6Red5WMHi1E83d2%2B-bM9J6dtWR3c5Tap9g%40mail.gmail.com#cc4827dee902978f93278732435e8521 -- Michail Nikolaev From 926403598e9506edfa32c9534be9a4e48f756002 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Wed, 23 No

Re: Slow standby snapshot

2022-11-20 Thread Michail Nikolaev
Oh, it seems like it is not my day :) The image is fixed again.

Re: Slow standby snapshot

2022-11-20 Thread Michail Nikolaev
Oops, wrong image; this is the correct one. But these are 1-run tests, so they show only a basic correlation,

Re: Slow standby snapshot

2022-11-20 Thread Michail Nikolaev
t-only for pgbench). If such an approach looks committable, I could do more careful performance testing to find the best value for WASTED_SNAPSHOT_WORK_LIMIT_TO_COMPRESS. [1]: https://gist.github.com/michail-nikolaev/e1dfc70bdd7cfd1b902523dbb3db2f28 -- Michail Nikolaev Index: src/backend/storage/ip

Re: Slow standby snapshot

2022-11-16 Thread Michail Nikolaev
xcept non-startup processes (approach with offsets to skip gaps while building snapshot). [1]: https://gist.github.com/michail-nikolaev/e1dfc70bdd7cfd1b902523dbb3db2f28 [2]: https://www.postgresql.org/message-id/flat/CANtu0ogzo4MsR7My9%2BNhu3to5%3Dy7G9zSzUbxfWYOn9W5FfHjTA%40mail.gmail.com#341a3c3b033f69b26

Re: Slow standby snapshot

2022-07-29 Thread Michail Nikolaev
Hello. Thanks to everyone for the review. > It seems to me storing the index itself is simpler and maybe faster by > the cycles to perform addition. Yes, the first version used 1 byte for the offset, with a maximum value of 255. Agreed, it looks like there is no sense in storing offsets now. > A simple patch

Re: Slow standby snapshot

2022-07-19 Thread Michail Nikolaev
Hello, Andrey. > I've looked into v5. Thanks! The patch is updated according to your remarks. Best regards, Michail. From 1301a262dea7f541c11092851e82f10932150ee3 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Tue, 19 Jul 2022 23:50:53 +0300 Subject: [PATCH v6] Curren

Re: Slow standby snapshot

2022-07-02 Thread Michail Nikolaev
Hello, Simon. Sorry for contacting you directly, but you know the subject better than anyone else. It is related to your work from 2010 - replacing KnownAssignedXidsHash with the KnownAssignedXids array. I have added an additional optimization to the data structure you implemented. Initially, it was

Any sense to get rid of known_assigned_xids_lck?

2022-06-13 Thread Michail Nikolaev
Hello, hackers. While working on (1), I noticed that in commit 2871b4618af1acc85665eec0912c48f8341504c4 (2) from 2010 Simon Riggs was thinking about using a memory barrier for KnownAssignedXids access instead of a spinlock. > We could dispense with the spinlock if we were to > create suitable memory

Re: CPU time for pg_stat_statement

2022-06-08 Thread Michail Nikolaev
Hello, Tom. >> This is a pretty broad claim to make on the basis of one undocumented >> test case on one unmentioned platform. > I'll try to use pg_stat_kcache to check the difference between wall-clock > and CPU time for my case. In my case, I see a pretty high correlation between pg_stat_kcache and

Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Tom. > This is a pretty broad claim to make on the basis of one undocumented > test case on one unmentioned platform. I'll try to use pg_stat_kcache to check the difference between wall-clock and CPU time for my case. > On what grounds do you claim getrusage will be better? One thing we > can be

Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Thomas. > This might be interesting: > https://github.com/powa-team/pg_stat_kcache Oh, nice - it looks like it could help me reduce CPU usage and test my assumption (using exec_user_time and exec_system_time). BTW, do you know why the extension is not in standard contrib (it looks mature)?

CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, hackers. Today I was doing some aggregates over pg_stat_statements in order to find the types of queries consuming most of the CPU. The aggregates were made over two pg_stat_statements snapshots taken 30 seconds apart. The sum(total_time) had the biggest value for a very frequent query with about 10ms
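That kind of aggregation can be sketched as a snapshot diff (an illustrative sketch, assuming pre-PG13 column names, where total_time is wall-clock time):

    -- Take a snapshot, wait ~30s, then diff against the live view.
    CREATE TEMP TABLE pgss_snap AS
        SELECT userid, dbid, queryid, query, calls, total_time
        FROM pg_stat_statements;
    SELECT pg_sleep(30);
    SELECT s2.query,
           s2.calls - s1.calls           AS calls,
           s2.total_time - s1.total_time AS total_ms
    FROM pg_stat_statements s2
    JOIN pgss_snap s1 USING (userid, dbid, queryid)
    ORDER BY total_ms DESC
    LIMIT 10;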

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-31 Thread Michail Nikolaev
Hello, David. Thanks for your review! > As a specific recommendation here - submit patches with a complete commit > message. > Tweak it for each new version so that any prior discussion that informed the > general design of > the patch is reflected in the commit message. Yes, agreed. Applied

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-31 Thread Michail Nikolaev
Hello, Peter. > The simple answer is: I don't know. I could probably come up with a > better answer than that, but it would take real effort, and time. I remember you had an idea about using the LP_REDIRECT bit in btree indexes as some kind of “recently dead” flag (1). Is this idea still in

Re: Slow standby snapshot

2022-03-31 Thread Michail Nikolaev
Hello. Just an updated commit message. Thanks, Michail. From 934d649a18c66f8b448463e57375865b33ce53e7 Mon Sep 17 00:00:00 2001 From: nkey Date: Fri, 1 Apr 2022 02:17:55 +0300 Subject: [PATCH v5] Optimize KnownAssignedXidsGetAndSetXmin by maintaining offset to next valid xid. MIME-Version: 1.0

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
UPD: > I was thinking it is safe to have additional hint bits > on the primary, but it seems not. Oh, sorry for the mistake - it is about the standby, of course. > BTW, I am wondering if it is possible > to achieve the same situation by pg_rewind and standby promotion… It looks like it is impossible,

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
Hello, Peter. Thanks for your review! > I doubt that the patch's use of pg_memory_barrier() in places like > _bt_killitems() is correct. There is no way to know for sure if this > novel new lockless algorithm is correct or not, since it isn't > explained anywhere. The memory barrier is used

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
Hello, Greg. > I'm seeing a recovery test failure. Not sure if this represents an > actual bug or just a test that needs to be adjusted for the new > behaviour. Thanks for notifying me. It is a failure of a test added in the patch. It is a little hard to make it stable (because it depends on

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-24 Thread Michail Nikolaev
Hello, Peter. >> * Add code to _bt_killitems() that detects if it has generated an FPI, >> just to set some LP_DEAD bits. >> * Instead of avoiding the FPI when this happens, proactively call >> _bt_simpledel_pass() just before _bt_killitems() returns. Accept the >> immediate cost of setting an

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-22 Thread Michail Nikolaev
ci.com/build/5599876384817152), so I am moving it back to "Ready for Committer". Best regards, Michail. From 9ecb33a54971cfa1c766ed9d129c6abb44e39f98 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sat, 15 Jan 2022 16:21:51 +0300 Subject: [PATCH v10 1/3] code --- src/backend/access/

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-22 Thread Michail Nikolaev
Hello, Peter. Thanks for your comments. > There is one FPI per checkpoint for any leaf page that is modified > during that checkpoint. The difference between having that happen once > or twice per leaf page and having that happen many more times per leaf > page could be very large. Yes, I am

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-21 Thread Michail Nikolaev
Hello, Peter. > * Instead of avoiding the FPI when this happens, proactively call > _bt_simpledel_pass() just before _bt_killitems() returns. Accept the > immediate cost of setting an LP_DEAD bit, just like today, but avoid > repeated FPIs. Hm, I am not sure here. AFAIK, the current implementation does not

Re: Slow standby snapshot

2022-02-20 Thread Michail Nikolaev
Hello, Andrey. Thanks for your efforts. > Patch on barrier seems too complicated to me right now. I’d propose to focus > on the KnownAssignedXidsNext patch: it’s clean, simple and effective. I'll extract it into a separate patch later. > I’ve rebased the patch so that it does not depend on

Re: BufferAlloc: don't take two simultaneous locks

2022-02-06 Thread Michail Nikolaev
Hello, Yura. One additional moment: > 1332: Assert((oldFlags & (BM_PIN_COUNT_WAITER | BM_IO_IN_PROGRESS)) == 0); > 1333: CLEAR_BUFFERTAG(buf->tag); > 1334: buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK); > 1335: UnlockBufHdr(buf, buf_state); I think there is no sense in unlocking the buffer

Re: BufferAlloc: don't take two simultaneous locks

2022-01-30 Thread Michail Nikolaev
Hello, Yura. The test results look promising. But it seems like the naming and the dynahash API change are a little confusing. 1) I think it is better to split the main part and the atomic nentries optimization into separate commits. 2) Also, it would be nice to fix the hash_update_hash_key bug :) 3) Do we

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-25 Thread Michail Nikolaev
Hello, Julien. > I rebased the patchset in attached v9. Please double check that I didn't miss > anything in the rebase. Thanks a lot for your help. > I will let you mark the patch as Ready for Committer once you validate that > the > rebase was ok. Yes, the rebase looks good. Best regards,

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-23 Thread Michail Nikolaev
back to "Ready for Committer" once it passes tests. Best regards, Michail. From a46315fd96b5432241ab6c67c37493ef41d7dc73 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sun, 23 Jan 2022 20:47:51 +0300 Subject: [PATCH v8 2/3] test --- src/test/recovery/Makefile

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-15 Thread Michail Nikolaev
> I'm switching this patch to Waiting on Author. I have tested it multiple times on my GitHub repo; it seems to be stable now. Switching back to Ready for Committer. Best regards. Michail. From 9372bac9b56d27cf993e9d1fa66127c86b51f25c Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sat, 15 Ja

Re: Windows vs recovery tests

2022-01-12 Thread Michail Nikolaev
Hello. It could also be related - https://www.postgresql.org/message-id/flat/20220112112425.pgzymqcgdy62e7m3%40jrouhaud#097b54a539ac3091ca4e4ed8ce9ab89c (both Windows and Linux cases). Best regards, Michail.

Re: Stream replication test fails of cfbot/windows server 2019

2022-01-12 Thread Michail Nikolaev
Hello. It looks like logical replication is also affected: [08:26:35.599] # poll_query_until timed out executing this query: [08:26:35.599] # SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's'); [08:26:35.599] # expecting this output: [08:26:35.599] # t [08:26:35.599] #

Re: Slow standby snapshot

2021-11-22 Thread Michail Nikolaev
Hello, Andrey. > Write barrier must be issued after write, not before. > Don't we need to issue read barrier too? The write barrier is issued after the changes to the KnownAssignedXidsNext and KnownAssignedXidsValid arrays and before the update of headKnownAssignedXids, so it seems to be correct.
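For reference, the ordering described above is the classic publish pattern - a simplified sketch using the real barrier macros and array names, with the surrounding code omitted:

    /* Writer (startup process): fill the slots, then publish the head. */
    KnownAssignedXids[head] = xid;
    KnownAssignedXidsValid[head] = true;
    pg_write_barrier();                          /* contents before head */
    procArray->headKnownAssignedXids = head + 1;

    /* Reader (backend building a snapshot): head before contents. */
    head = procArray->headKnownAssignedXids;
    pg_read_barrier();
    /* ... scan slots in [tail, head) ... */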

Re: Slow standby snapshot

2021-11-21 Thread Michail Nikolaev
e and some additional investigation had been done. So, I think I’ll re-add the patch to the commitfest app. Thanks, Michail From 94cd2fbf37b5f0b824e0f9a9bc23f762a8bb19b5 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sun, 21 Nov 2021 21:23:02 +0300 Subject: [PATCH v3 1/2] memory barrier instead

Re: Slow standby snapshot

2021-11-09 Thread Michail Nikolaev
ot lean on the compiler here because of `volatile` args. Also, I have added some comments. Best regards, Michail. From 1d55c6fae8cc160eadd705da0d70d9e7fb5bc00f Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Wed, 10 Nov 2021 00:02:18 +0300 Subject: [PATCH v2] known assignment xid next

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
s feature in the commitfest app works in a different way :) Best regards, Michail. From 02b0dd27944c37007d8a92905a14e6b3e8e50fa8 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Tue, 9 Nov 2021 21:43:58 +0300 Subject: [PATCH v6 2/3] test --- src/test/recovery/Makefile| 1 + .../r

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
Woo-hoo :) > Attached is a proposal for a minor addition that would make sense to me, add > it if you think it's appropriate. Yes, I'll add it to the patch. > I think I've said enough, changing the status to "ready for committer" :-) Thanks a lot for your help and attention! Best regards,

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
I have changed the approach, so it is better to start from this email: https://www.postgresql.org/message-id/flat/CANtu0ohHu1r1xQfTzEJuxeaOMYncG7xRxUQWdH%3DcMXZSf%2Bnzvg%40mail.gmail.com#4c81a4d623d8152f5e8889e97e750eec

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-05 Thread Michail Nikolaev
be” LSN-related logic to the test. Thanks a lot, Michail. From f8a87a2329e81b55b484547dd50edfd97a722ad2 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Fri, 5 Nov 2021 19:28:12 +0300 Subject: [PATCH v5 3/3] doc --- src/backend/access/nbtree/README | 35 ++-- sr

Re: Slow standby snapshot

2021-10-02 Thread Michail Nikolaev
Hello, Andres. Could you please clarify how best to deal with the situation? Judging by your previous letter, I think there was some misunderstanding regarding the latest patch version (but I am not sure), because as far as I understand, the provided optimization (lazily calculated optional offset

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-09-29 Thread Michail Nikolaev
nt bits itself. It is tested on a different standby, see > is(hints_num($node_standby_2), qq(10), 'index hint bits already set on second standby 2'); Also, I added checks for BTP_LP_SAFE_ON_STANDBY to make sure everything in the test goes according to the scenario. Thanks a lot, Michail. From cfb45d1a9cbf30be6098b2

Re: Slow standby snapshot

2021-08-09 Thread Michail Nikolaev
Hello, Andres. Thanks for the feedback again. > The problem is that we don't want to add a lot of work to > KnownAssignedXidsAdd/Remove, because very often nobody will build a snapshot > for that moment and building a sorted, gap-free, linear array of xids isn't > cheap. In my experience it's more

Re: Slow standby snapshot

2021-08-03 Thread Michail Nikolaev
Hello, Andres. Thanks for your feedback. >> Maybe use a hashtable of running transactions? It will be slightly faster >> when adding\removing single transactions. But much worse when doing >> KnownAssignedXidsRemove(). > Why would it be worse for KnownAssignedXidsRemove()? Were you intending to

Re: Slow standby snapshot

2021-08-02 Thread Michail Nikolaev
Hello. > I have tried such an approach but it looks like it is not effective, > probably because of CPU caching issues. That was a mistake on my part. I have repeated the approach and got good results with a small and non-invasive patch. The main idea is a simple optimistic optimization - store the offset to
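A sketch of that optimistic skip, as it could look in the KnownAssignedXidsGetAndSetXmin() loop (simplified; the lazy maintenance of the offsets is omitted):

    /* KnownAssignedXidsNext[i] holds an offset (>= 1) to the next slot that
     * may still be valid, letting readers jump over compressed-out gaps. */
    for (i = tail; i < head; i += KnownAssignedXidsNext[i])
    {
        if (KnownAssignedXidsValid[i])
            xarray[count++] = KnownAssignedXids[i];
    }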

Re: Logical replication error "no record found" /* shouldn't happen */

2021-07-24 Thread Michail Nikolaev
Hello. I saw this error multiple times while trying to replicate a 2-3 TB server (version 11 to version 12). I was unable to find any explanation for this error. Thanks, Michail.

Re: Slow standby snapshot

2021-07-11 Thread Michail Nikolaev
Hello, Kirill. > Also, maybe it is better to reduce the invasiveness by using a simpler > approach. For example, use the first bit to mark an xid as valid > and the last 7 bits (128 values) as an optimistic offset to the next > valid xid (jump by 127 steps in the worst case). > What do you

Re: Slow standby snapshot

2021-06-13 Thread Michail Nikolaev
Hello. > I recently ran into a problem in one of our production PostgreSQL clusters. > I noticed lock contention on the ProcArray lock on a standby, which causes WAL > replay lag growth. Yes, I saw the same issue on my production cluster. > 1) set max_connections to a big number, like 10 I

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-13 Thread Michail Nikolaev
Hello. Added a check for standby promotion with a long transaction to the test (code and docs are unchanged). Thanks, Michail. From c5e1053805c537b50b0922151bcf127754500adb Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Fri, 14 May 2021 00:32:30 +0300 Subject: [PATCH v3 3/4] test

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-12 Thread Michail Nikolaev
e." Fixed. Updated version in attach. Thanks a lot, Michail. From 004b2dea9b700d890147b840573bb5b796c1f96a Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Wed, 12 May 2021 22:56:18 +0300 Subject: [PATCH v2 4/4] doc --- src/backend/access/nbtree/README | 35 ++-

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-10 Thread Michail Nikolaev
Hello, Antonin. > Sorry, I missed the fact that your example can be executed inside a BEGIN - END > block, in which case minRecoveryPoint won't advance after each command. No, the block is not executed as a single transaction; all commands are separate transactions (see below). > Actually I think

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-07 Thread Michail Nikolaev
Hello, Antonin. > I'm trying to review the patch, but not sure if I understand this problem, > please see my comment below. Thanks a lot for your attention. I strongly recommend looking at version N3 (1), because it is a much more elegant, easy, and reliable solution :) But the

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-10 Thread Michail Nikolaev
Hello, Peter. If you are interested, a possible patch (based on masking FPIs during replay) was sent with some additional explanation and graphics to (1). At the moment, I am unable to find any "incorrectness" in it. Thanks again for your comments. Michail. [1]

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-02 Thread Michail Nikolaev
Hello, Peter. > AFAICT that's not true, at least not in any practical sense. See the > comment in the middle of MarkBufferDirtyHint() that begins with "If we > must not write WAL, due to a relfilenode-specific...", and see the > "Checksums" section at the end of src/backend/storage/page/README.

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-01 Thread Michail Nikolaev
Hello, Peter. Thanks a lot for your comments. Here are some of my thoughts related to the “masked bits” solution and your comments: > During recovery, we will probably always have to consider the > possibility that LP_DEAD bits that get set on the primary may be > received by a replica through

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-30 Thread Michail Nikolaev
Hello, Peter. > Yeah, it would help a lot. But those bits are precious. So it makes > sense to think about what to do with both of them in index AMs at the > same time. Otherwise we risk missing some important opportunity. Hm. I was trying to "expand the scope" as you said and got an idea...

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-28 Thread Michail Nikolaev
Hello, Peter. > I wonder if it would help to not actually use the LP_DEAD bit for > this. Instead, you could use the currently-unused-in-indexes > LP_REDIRECT bit. Hm… Sounds very promising - an additional bit is a lot in this situation. > Whether or not "recently dead" means "dead to my >

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-27 Thread Michail Nikolaev
Hello, hackers. Sorry for necroposting, but in case someone is interested: I hope the patch is now ready; it is available in the other thread (1). Thanks, Michail. [1] https://www.postgresql.org/message-id/flat/CANtu0oiP18H31dSaEzn0B0rW6tA_q1G7%3D9Y92%2BUS_WHGOoQevg%40mail.gmail.com

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-27 Thread Michail Nikolaev
Hello, hackers. I think I was able to fix the issue related to minRecoveryPoint and crash recovery. To make sure the standby will be consistent after crash recovery, we need to take the current value of minRecoveryPoint into account while setting LP_DEAD hints (almost the same way as it is done for
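Roughly, the shape of such a check might be as follows - a loose sketch only; the getter and the exact comparison are assumptions, not the actual patch:

    /* Set the hint only if everything it depends on is already covered by
     * minRecoveryPoint, so crash recovery cannot stop short of replaying it.
     * GetCurrentMinRecoveryPoint() is a hypothetical accessor. */
    if (BufferGetLSNAtomic(buf) <= GetCurrentMinRecoveryPoint())
        ItemIdMarkDead(itemid);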

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-21 Thread Michail Nikolaev
Hello, everyone. Oh, I just realized that it seems I was too naive in allowing the standby to set LP_DEAD bits this way. There is a possible consistency problem in the case of a low minRecoveryPoint value (because hint bits do not move the page LSN forward). Something like this: LSN=10 STANDBY INSERTS

[PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-18 Thread Michail Nikolaev
Hello, hackers. [ABSTRACT] Executing queries on a hot standby is one of the most popular ways to scale application workload. Most modern Postgres installations have two standby nodes for high availability. So, utilizing the replicas' CPU seems to be a reasonable idea. At the

Re: Why latestRemovedXid|cuteoff_xid are always sent?

2021-01-08 Thread Michail Nikolaev
Hello, Peter. Thanks for your explanation. One of the reasons I was asking is an idea to use the same technique in the "LP_DEAD index hint bits on standby" WIP patch to reduce the amount of additional WAL. Now I am sure such an optimization should work correctly. Thanks, Michail.

Why latestRemovedXid|cuteoff_xid are always sent?

2021-01-02 Thread Michail Nikolaev
Hello, hackers. While working on some stuff, I realized I do not understand why latestRemovedXid|cuteoff_xid (in different types of WAL records) is sent every time it appears on the primary side. latestRemovedXid|cuteoff_xid is used to call ResolveRecoveryConflictWithSnapshot and cancel conflicting

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-08-02 Thread Michail Nikolaev
Hello, Peter. > Attached is a revised version of your patch Thanks for your work; the patch is looking better now. Michail.

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-07-10 Thread Michail Nikolaev
Hello, Peter. Thanks for the update. Yes, it is the right decision. I only spotted that bug while working on a faster scan using hint bits on replicas [1], so it is unlikely to be hit in production at the moment. Thanks, Michail. [1]:
