[PATCH] Add CHECK_FOR_INTERRUPTS in scram_SaltedPassword loop.

2023-11-22 Thread Bowen Shi
Hi, hackers

When the scram_iterations value is set too large, the backend can hang for
a long time, and we can't use Ctrl+C to cancel the query because the loop
doesn't process signal interrupts.

Adding CHECK_FOR_INTERRUPTS within the loop of scram_SaltedPassword, so that
any signals received during this period are handled, may be a good choice.
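
For context, an abridged sketch of the loop in question, based on
src/common/scram-common.c (declarations and error handling are trimmed, so
this is illustrative rather than a verbatim excerpt). Each iteration performs
a full HMAC computation, which is why a huge iteration count keeps the
backend busy here without reaching a cancellation point:

    /* Subsequent iterations: Ui = HMAC(password, Ui-1); result ^= Ui */
    for (i = 2; i <= iterations; i++)
    {
#ifndef FRONTEND
        CHECK_FOR_INTERRUPTS();     /* the proposed cancellation point */
#endif
        if (pg_hmac_init(hmac_ctx, (uint8 *) password, password_len) < 0 ||
            pg_hmac_update(hmac_ctx, (uint8 *) Ui_prev, key_length) < 0 ||
            pg_hmac_final(hmac_ctx, Ui, key_length) < 0)
            return -1;              /* real code also frees the HMAC context */
        for (j = 0; j < key_length; j++)
            result[j] ^= Ui[j];
        memcpy(Ui_prev, Ui, key_length);
    }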

I wrote a patch to solve this problem. What are your suggestions?

Regards
Bowen Shi


0001-Add-CHECK_FOR_INTERRUPTS-in-scram_SaltedPassword-loo.patch
Description: Binary data


Re: [HACKERS] make async slave to wait for lsn to be replayed

2023-11-22 Thread Bowen Shi
Hi,

I tried applying the patch on the latest code and found some conflicts.
Which PG version did you rebase against?

Regards
Bowen Shi


Re: [PATCH] Add CHECK_FOR_INTERRUPTS in scram_SaltedPassword loop.

2023-11-22 Thread Bowen Shi
> I don't think it would be useful to limit this at an arbitrary point,
> iteration count can be set per password and if someone wants a specific
> password to be super-hard to brute force then why should we limit that?
I agree with that. Maybe some users do want a super-hard password.
RFC 7677 and RFC 5802 don't specify a maximum number of iterations.

> If we want to add CHECK_FOR_INTERRUPTS inside the loop I think a brief
> comment would be appropriate.

This has been done in patch v2, which is ready for review.

Regards
Bowen Shi
From 89c4de0a814d5343c54d9e8501986892fbb4b33e Mon Sep 17 00:00:00 2001
From: bovenshi 
Date: Wed, 22 Nov 2023 19:30:56 +0800
Subject: [PATCH] Add CHECK_FOR_INTERRUPTS in scram_SaltedPassword loop.

When the scram_iterations value is set too large, the backend would hang for
a long time. Add CHECK_FOR_INTERRUPTS within the loop of scram_SaltedPassword
to handle any signals received during this period.
---
 src/common/scram-common.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/common/scram-common.c b/src/common/scram-common.c
index ef997ef..bdf40e8 100644
--- a/src/common/scram-common.c
+++ b/src/common/scram-common.c
@@ -15,6 +15,7 @@
  */
 #ifndef FRONTEND
 #include "postgres.h"
+#include "miscadmin.h"
 #else
 #include "postgres_fe.h"
 #endif
@@ -73,6 +74,13 @@ scram_SaltedPassword(const char *password,
 	/* Subsequent iterations */
 	for (i = 2; i <= iterations; i++)
 	{
+		/*
+		 * Allowing this to be interrupted is necessary when scram_iterations
+		 * is set to a large value. However, this only works in the backend.
+		 */
+#ifndef FRONTEND
+		CHECK_FOR_INTERRUPTS();
+#endif
 		if (pg_hmac_init(hmac_ctx, (uint8 *) password, password_len) < 0 ||
 			pg_hmac_update(hmac_ctx, (uint8 *) Ui_prev, key_length) < 0 ||
 			pg_hmac_final(hmac_ctx, Ui, key_length) < 0)
-- 
2.9.3



Re: [PATCH] Add CHECK_FOR_INTERRUPTS in scram_SaltedPassword loop.

2023-11-26 Thread Bowen Shi
> I think that we should backpatch that down to v16 at least where the
> GUC has been introduced as it's more like a nuisance if one sets the
> GUC to an incorrect value, and I'd like to apply the patch this way.

Agreed.

The patch has been submitted at https://commitfest.postgresql.org/46/4671/

Regards
Bowen Shi


Optimize walsender handling invalid messages of 'drop publication'

2023-02-27 Thread Bowen Shi
Hi all,

Before PG 14, the walsender process has to handle all invalidation
messages in a single XLOG record (PG 14 introduced a dedicated XLOG
type for them: XLOG_XACT_INVALIDATIONS). This can cause problems,
which have been discussed in a previous mail:
https://www.postgresql.org/message-id/flat/CAM_vCufO3eeRZ_O04z9reiE%2BB644%2BRgczbAVo9C5%2BoHV9S7%2B-g%40mail.gmail.com#981e65567784e0aefa4474cc3fd840f6

This patch solves the problem. It has three parts:
1. pgoutput no longer performs useless cache invalidations;
2. Add a relid->relfilenode hash map to avoid hash seq searches (a rough
sketch follows below);
3. A test case, which needs two or three minutes to finish.

The patch is based on the HEAD of branch REL_13_STABLE. It also works
for PG 10~12.
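
For part 2, here is a minimal sketch of the kind of relid -> relfilenode
map described above, using PostgreSQL's dynahash (the names are illustrative
and not necessarily those used in the patch):

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct RelidFilenodeEntry
    {
        Oid     relid;          /* hash key: relation OID */
        Oid     relfilenode;    /* cached relfilenode for this relation */
    } RelidFilenodeEntry;

    static HTAB *relid_filenode_map = NULL;

    static void
    relid_filenode_map_init(void)
    {
        HASHCTL ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(RelidFilenodeEntry);
        relid_filenode_map = hash_create("pgoutput relid->relfilenode map",
                                         128, &ctl, HASH_ELEM | HASH_BLOBS);
    }

With such a map, an invalidation can be resolved by a single hash_search()
lookup on the relation OID instead of a hash_seq_search() over every cached
entry.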

Thanks.
Bowenshi


0001-Optimize-invalidating-cache-in-pgoutput-and-relfilen.patch
Description: Binary data


Comparing two double values method

2023-10-10 Thread Bowen Shi
Hi hackers,

I noticed that in the `check_GUC_init` function, there is a direct
comparison using the != operator for two double values, which seems
problematic.

I wrote the attached patch to fix it.
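
For illustration, the usual alternative to a raw != on doubles is a
scale-aware tolerance comparison along these lines (hypothetical code, since
the attached patch is not shown; note the follow-up below, where the direct
comparison in check_GUC_init turned out to be correct after all):

    #include <float.h>
    #include <math.h>
    #include <stdbool.h>

    /* Hypothetical sketch of a scale-aware equality test for doubles. */
    static bool
    double_nearly_equal(double a, double b)
    {
        double diff = fabs(a - b);
        double scale = fmax(fabs(a), fabs(b));

        /* Treat values within a few ULPs of each other as equal. */
        return diff <= scale * DBL_EPSILON * 4.0;
    }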

--
Regards,
Bowen Shi


v1-0001-Fix-double-value-compare-problem.patch
Description: Binary data


Re: Comparing two double values method

2023-10-10 Thread Bowen Shi
You're right, I made a mistake.

Thanks for your explanation.




Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-10-15 Thread Bowen Shi
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

It looks good to me.

I have reviewed the code and tested the patch with a basic check-world test
and a pgbench test (mentioned in
https://www.postgresql.org/message-id/flat/ZQtzcH2lvo8leXEr%40paquier.xyz#cc5ed83e0edc0b9a1c1305f08ff7a335).

Another reviewer has also approved it, so I am changing the status to RFC.

The new status of this patch is: Ready for Committer


Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-09-20 Thread Bowen Shi
Thanks for the patch.

I reran the test in
https://www.postgresql.org/message-id/flat/ZQtzcH2lvo8leXEr%40paquier.xyz#cc5ed83e0edc0b9a1c1305f08ff7a335 .
We can discuss all the problems in this thread.

First I encountered the error "FATAL:  could not find recovery.signal or
standby.signal when recovering with backup_label"; then I deleted the
backup_label file and started the instance successfully.

> Deleting a backup_label from a fresh base backup can easily lead to data
> corruption, as the startup process would pick up as LSN to start
> recovery from the control file rather than the backup_label file.
> This would happen if a checkpoint updates the redo LSN in the control
> file while a backup happens and the control file is copied after the
> checkpoint, for instance. If one wishes to deploy a new primary from
> a base backup, recovery.signal is the way to go, making sure that the
> new primary is bumped into a new timeline once recovery finishes, on
> top of making sure that the startup process starts recovery from a
> position where the cluster would be able to achieve a consistent
> state.

ereport(FATAL,
        (errmsg("could not find redo location referenced by checkpoint record"),
         errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
                 "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
                 "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",
                 DataDir, DataDir, DataDir)));

There are two similar error messages in xlogrecovery.c. Maybe we can
modify them so that their wording is consistent.
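
For reference, the other message in question, reconstructed here from the
FATAL and HINT text quoted elsewhere in this thread (the actual source may
differ slightly):

    ereport(FATAL,
            (errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
             errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
                     DataDir, DataDir)));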

--
Bowen Shi


On Thu, 21 Sept 2023 at 11:01, Michael Paquier  wrote:
>
> On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote:
> > 1) simply start server from a base backup
> >
> > FATAL:  could not find recovery.signal or standby.signal when recovering
> > with backup_label
> >
> > HINT:  If you are restoring from a backup, touch
> > "/media/david/disk1/pg_backup1/recovery.signal" or
> > "/media/david/disk1/pg_backup1/standby.signal" and add required recovery
> > options.
>
> Note the difference when --write-recovery-conf is specified, where a
> standby.signal is created with a primary_conninfo in
> postgresql.auto.conf.  So, yes, that's expected by default with the
> patch.
>
> > 2) touch a recovery.signal file and then try to start the server, the
> > following error was encountered:
> >
> > FATAL:  must specify restore_command when standby mode is not enabled
>
> Yes, that's also something expected in the scope of the v1 posted.
> The idea behind this restriction is that specifying recovery.signal is
> equivalent to asking for archive recovery, but not specifying
> restore_command is equivalent to not providing any options to be able to
> recover.  See validateRecoveryParameters() and note that this
> restriction exists since the beginning of times, introduced in commit
> 66ec2db.  I tend to agree that there is something to be said about
> self-contained backups taken from pg_basebackup, though, as these
> would fail if no restore_command is specified, and this restriction was
> in place before Postgres introduced replication and easier ways to
> have base backups.  As a whole, I think that there is a good argument
> in favor of removing this restriction for the case where archive
> recovery is requested if users have all their WAL in pg_wal/ to be
> able to recover up to a consistent point, keeping these GUC
> restrictions if requesting a standby (not recovery.signal, only
> standby.signal).
>
> > 3) touch a standby.signal file, then the server successfully started,
> > however, it operates in standby mode, whereas the intended behavior was for
> > it to function as a primary server.
>
> standby.signal implies that the server will start in standby mode.  If
> one wants to deploy a new primary, that would imply a timeline jump at
> the end of recovery, you would need to specify recovery.signal
> instead.
>
> We need more discussions and more opinions, but the discussion has
> stalled for a few months now.  In case, I am adding Thomas Munro in CC
> who has mentioned to me at PGcon that he was interested in this patch
> (this thread's problem is not directly related to the fact that the
> checkpointer now runs in crash recovery, though).
>
> For now, I am attaching a rebased v2.
> --
> Michael




Re: Optimize walsender handling invalid messages of 'drop publication'

2023-06-26 Thread Bowen Shi
Hi all,

This issue has been pending for several months without any response, and
the problem still exists in the latest minor versions of PG 12 and PG 13.

I believe that the fix in this patch is helpful.

The patch has been submitted at
https://commitfest.postgresql.org/43/4393/ . Anyone who is interested
in this issue can help with the review.

Regards!
BowenShi

On Mon, 27 Feb 2023 at 16:12, Bowen Shi  wrote:
>
> Hi all,
>
> Before PG 14, the walsender process has to handle all invalidation
> messages in a single XLOG record (PG 14 introduced a dedicated XLOG
> type for them: XLOG_XACT_INVALIDATIONS). This can cause problems,
> which have been discussed in a previous mail:
> https://www.postgresql.org/message-id/flat/CAM_vCufO3eeRZ_O04z9reiE%2BB644%2BRgczbAVo9C5%2BoHV9S7%2B-g%40mail.gmail.com#981e65567784e0aefa4474cc3fd840f6
>
> This patch solves the problem. It has three parts:
> 1. pgoutput no longer performs useless cache invalidations;
> 2. Add a relid->relfilenode hash map to avoid hash seq searches;
> 3. A test case, which needs two or three minutes to finish.
>
> The patch is based on the HEAD of branch REL_13_STABLE. It also works
> for PG 10~12.
>
> Thanks.
> Bowenshi