Re: Do we want a hashset type?

2023-06-18 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select hashset_contains('{1,2}'::int4hashset,NULL::int);
> should return null?

Hmm, that's a good philosophical question.

I notice Tomas Vondra in the initial commit opted for allowing NULL inputs,
treating them as empty sets, e.g. in int4hashset_add() we create a
new hashset if the first argument is NULL.

I guess the easiest perhaps most consistent NULL-handling strategy
would be to just mark all relevant functions STRICT except for the agg ones
since we probably want to allow skipping over rows with NULL values
without the entire result becoming NULL.

But if we're not just going the STRICT route, then I think it's a bit more 
tricky,
since you could argue the hashset_contains() example should return FALSE
since the set doesn't contain the NULL value, but OTOH, since we don't
store NULL values, we don't know if has ever been added, hence a NULL
result would perhaps make more sense.

I think I lean on thinking that if we want to be "NULL-friendly", like we
currently are in hashset_add(), it would probably be most user-friendly
to be consistent and let all functions return non-null return values in
all cases where it is not unreasonable.

Since we're essentially designing a set-theoretic system, I think we should
aim for the logical "soundness" property of it and think about how we can
verify that it is.

Thoughts?

/Joel




Re: Synchronizing slots from primary to standby

2023-06-18 Thread Drouvot, Bertrand

Hi,

On 6/16/23 11:56 AM, Amit Kapila wrote:

On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand
 wrote:


Please find attached V5 (a rebase of V4 posted up-thread).

In addition to the "rebasing" work, the TAP test adds a test about conflict 
handling (logical slot invalidation)
relying on the work done in the "Minimal logical decoding on standby" patch 
series.

I did not look more at the patch (than what's was needed for the rebase) but 
plan to do so.



Are you still planning to continue working on this? 


Yes, I think it would be great to have such a feature in core.


Some miscellaneous
comments while going through this patch are as follows?


Thanks! I'll look at them and will try to come back to you by
mid of next week.

Also I think we need to handle the case of invalidated replication slot(s): 
should
we drop/recreate it/them? (as the main goal is to have sync slot(s) on the 
standby).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [BUG] recovery of prepared transactions during promotion can fail

2023-06-18 Thread Kyotaro Horiguchi
At Mon, 19 Jun 2023 14:24:44 +0900, Michael Paquier  wrote 
in 
> On Fri, Jun 16, 2023 at 04:27:40PM +0200, Julian Markwort wrote:
> > Note that it is important that the PREPARE entry is in the WAL file
> > that PostgreSQL is writing to prior to the inital crash.
> > This has happened repeatedly in production already with a customer
> > that uses prepared transactions quite frequently.  I assume that
> > this has happened for others too, but the circumstances of the crash
> > and the cause are very dubious, and troubleshooting it is pretty
> > difficult.
> 
> I guess that this is a possibility yes.  I have not heard directly
> about such a report, but perhaps that's just because few people use
> 2PC.

+1

> > This behaviour has - apparently unintentionally - been fixed in PG
> > 15 and upwards (see commit 811051c ), as part of a general
> > restructure and reorganization of this portion of xlog.c (see commit
> > 6df1543 ).
> > 
> > Furthermore, it seems this behaviour does not appear in PG 12 and
> > older, due to another possible bug: In PG 13 and newer, the
> > XLogReaderState is reset in XLogBeginRead() before reading WAL in
> > XlogReadTwoPhaseData() in twophase.c .
> > In the older releases (PG <= 12), this reset is not done, so the
> > requested LSN containing the prepared transaction can (by happy
> > coincidence) be read from in-memory buffers, and PostgreSQL
> > consequently manages to come up just fine (as the WAL has already
> > been read into buffers prior to the .partial rename).  If the older
> > releases also where to properly reset the XLogReaderState, they
> > would also fail to find the LSN on disk, and hence PostgreSQL would
> > crash again.
> 
> That's debatable, but I think that I would let v12 and v11 be as they
> are.  v11 is going to be end-of-life soon and we did not have any
> complains on this matter as far as I know, so there is a risk of
> breaking something upon its last release.  (Got some, Err..
> experiences with that in the past).  On REL_11_STABLE, note for
> example the slight difference with the handling of
> recovery_end_command, where we rely on InRecovery rather than
> ArchiveRecoveryRequested.  REL_12_STABLE is in a more consistent shape
> than v11 regarding that.

Agree about 11, it's no use patching. About 12, I slightly prefer
applying this but I'm fine without it since no actual problem are
seen.


> > I've attached patches for PG 14 and PG 13 that mimic the change in
> > PG15 (commit 811051c ) and reorder the crucial events, placing the
> > recovery of prepared transactions *before* renaming the file. 
> 
> Yes, I think that's OK.  I would like to add two things to your
> proposal for all the existing branches.
> - Addition of a comment where RecoverPreparedTransactions() is called
> at the end of recovery to tell that we'd better do that before working
> on the last partial segment of the old timeline.
> - Enforce the use of archiving in 009_twophase.pl.

Both look good to me.

> > My humble opinion is that this fix should be backpatched to PG 14
> > and PG 13.  It's debatable whether the fix needs to be brought back
> > to 12 and older also, as those do not exhibit this issue, but the 
> > order of renaming is still wrong.
> 
> Yeah, I'd rather wait for somebody to complain about that.  And v11 is
> not worth taking risks with at this time of the year, IMHO.

I don't have a complaint as the whole.

> With your fix included, the patch for REL_14_STABLE would be like the
> attached.  Is that OK for you?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Allow pg_archivecleanup to remove backup history files

2023-06-18 Thread Michael Paquier
On Mon, Jun 19, 2023 at 11:24:29AM +0900, torikoshia wrote:
> Thanks, now I understand what you meant.

If I may ask, why is the refactoring of 0003 done after the feature in
0002?  Shouldn't the order be reversed?  That would make for a cleaner
git history.
--
Michael


signature.asc
Description: PGP signature


Re: [BUG] recovery of prepared transactions during promotion can fail

2023-06-18 Thread Kyotaro Horiguchi
Thanks for the report, reproducer and the patches.

At Fri, 16 Jun 2023 16:27:40 +0200, Julian Markwort 
 wrote in 
> - prepare a transaction
> - crash postgresql
> - create standby.signal file
> - start postgresql, wait for recovery to finish
> - promote
..
> The promotion will fail with a FATAL error, stating that "requested WAL 
> segment .* has already been removed".
> The FATAL error causes the startup process to exit, so postmaster shuts down 
> again.
> 
> Here's an exemplary log output, maybe this helps people to find this issue 
> when they search for it online:

> LOG:  redo done at 0/15D89B8
> LOG:  last completed transaction was at log time 2023-06-16 13:09:53.71118+02
> LOG:  selected new timeline ID: 2
> LOG:  archive recovery complete
> FATAL:  requested WAL segment pg_wal/00010001 has already 
> been removed
> LOG:  startup process (PID 1650358) exited with exit code 1

Reproduced here.

> The cause of this failure is an oversight (rather obvious in hindsight):
> The renaming of the WAL file (that was last written to before the crash 
> happened) to .partial is done *before* PostgreSQL
> might have to read this very file to recover prepared transactions from it.
> The relevant function calls here are durable_rename() and 
> RecoverPreparedTransactions() in xlog.c .
> This behaviour has - apparently unintentionally - been fixed in PG 15 and 
> upwards (see commit 811051c ), as part of a
> general restructure and reorganization of this portion of xlog.c (see commit 
> 6df1543 ).

I think so, the reordering might have done for some other reasons, though.

> Furthermore, it seems this behaviour does not appear in PG 12 and older, due 
> to another possible bug:

...

> In PG 13 and newer, the XLogReaderState is reset in XLogBeginRead()
> before reading WAL in XlogReadTwoPhaseData() in twophase.c .

I arraived at the same conclusion.

> In the older releases (PG <= 12), this reset is not done, so the requested 
> LSN containing the prepared transaction can
> (by happy coincidence) be read from in-memory buffers, and PostgreSQL 
> consequently manages to come up just fine (as the
> WAL has already been read into buffers prior to the .partial rename).
> If the older releases also where to properly reset the XLogReaderState, they 
> would also fail to find the LSN on disk, and
> hence PostgreSQL would crash again.

>From the perspective of loading WAL for prepared transactions, the
current code in those versions seems fine. Although I suspect Windows
may not like to rename currently-open segments, it's likely acceptable
as the current test set operates without issue.. (I didn't tested this.)

> I've attached patches for PG 14 and PG 13 that mimic the change in PG15 
> (commit 811051c ) and reorder the crucial events,
> placing the recovery of prepared transactions *before* renaming the file.

It appears to move the correct part of the code to the proper
location, modifying the steps to align with later versions.

> I've also attached recovery test scripts for PG >= 12 and PG <= 11 that can 
> be used to verify that promote after recovery
> with prepared transactions works.

It effectively detects the bug, though it can't be directly used in
the tree as-is. I'm unsure whether we need this in the tree, though.

> My humble opinion is that this fix should be backpatched to PG 14 and PG 13.

I agree with you.

> It's debatable whether the fix needs to be brought back to 12 and older also, 
> as those do not exhibit this issue, but the
> order of renaming is still wrong.
> I'm not sure if there could be cases where the in-memory buffers of the 
> walreader are too small to cover a whole WAL
> file.
> There could also be other issues from operations that require reading WAL 
> that happen after the .partial rename, I
> haven't checked in depth what else happens in the affected codepath.
> Please let me know if you think this should also be fixed in PG 12 and 
> earlier, so I can produce the patches for those
> versions as well.

There's no immediate need to change the versions. However, I would
prefer to backpatch them to the older versions for the following
reasons.

1. Applying this eases future backpatching in this area, if any.

2. I have reservations about renaming possibly-open WAL segments.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BUG] recovery of prepared transactions during promotion can fail

2023-06-18 Thread Michael Paquier
On Fri, Jun 16, 2023 at 04:27:40PM +0200, Julian Markwort wrote:
> I've discovered a serious bug that leads to a server crash upon
> promoting an instance that crashed previously and did recovery in
> standby mode.

Reproduced here, for the versions mentioned.

> The bug is present in PostgreSQL versions 13 and 14 (and in earlier
> versions, though it doesn't manifest itself so catastrophically).
> The circumstances to trigger the bug are as follows:
> - postgresql is configured for hot_standby, archiving, and prepared 
> transactions
> - prepare a transaction
> - crash postgresql
> - create standby.signal file
> - start postgresql, wait for recovery to finish
> - promote

hot_standby allows one to run queries on a standby running recovery,
so it seems to me that it does not really matter.  Enabling archiving
is the critical piece.  The nodes set in the TAP test 009_twophase.pl
don't use any kind of archiving.  But once it is enabled on London the
first promotion command of the test fails the same way as you report.
 # Setup london node
 my $node_london = get_new_node("london");
-$node_london->init(allows_streaming => 1);
+# Archiving is used to force tests with .partial segment creations
+# done at the end of recovery.
+$node_london->init(allows_streaming => 1, has_archiving => 1);

Enabling the archiving does not impact any of the tests, as we don't
use restore_command during recovery and only rely on streaming.

> The cause of this failure is an oversight (rather obvious in
> hindsight): The renaming of the WAL file (that was last written to
> before the crash happened) to .partial is done *before* PostgreSQL
> might have to read this very file to recover prepared transactions
> from it.  The relevant function calls here are durable_rename() and
> RecoverPreparedTransactions() in xlog.c.
> 
> Note that it is important that the PREPARE entry is in the WAL file
> that PostgreSQL is writing to prior to the inital crash.
> This has happened repeatedly in production already with a customer
> that uses prepared transactions quite frequently.  I assume that
> this has happened for others too, but the circumstances of the crash
> and the cause are very dubious, and troubleshooting it is pretty
> difficult.

I guess that this is a possibility yes.  I have not heard directly
about such a report, but perhaps that's just because few people use
2PC.

> This behaviour has - apparently unintentionally - been fixed in PG
> 15 and upwards (see commit 811051c ), as part of a general
> restructure and reorganization of this portion of xlog.c (see commit
> 6df1543 ).
> 
> Furthermore, it seems this behaviour does not appear in PG 12 and
> older, due to another possible bug: In PG 13 and newer, the
> XLogReaderState is reset in XLogBeginRead() before reading WAL in
> XlogReadTwoPhaseData() in twophase.c .
> In the older releases (PG <= 12), this reset is not done, so the
> requested LSN containing the prepared transaction can (by happy
> coincidence) be read from in-memory buffers, and PostgreSQL
> consequently manages to come up just fine (as the WAL has already
> been read into buffers prior to the .partial rename).  If the older
> releases also where to properly reset the XLogReaderState, they
> would also fail to find the LSN on disk, and hence PostgreSQL would
> crash again.

That's debatable, but I think that I would let v12 and v11 be as they
are.  v11 is going to be end-of-life soon and we did not have any
complains on this matter as far as I know, so there is a risk of
breaking something upon its last release.  (Got some, Err..
experiences with that in the past).  On REL_11_STABLE, note for
example the slight difference with the handling of
recovery_end_command, where we rely on InRecovery rather than
ArchiveRecoveryRequested.  REL_12_STABLE is in a more consistent shape
than v11 regarding that.

> I've attached patches for PG 14 and PG 13 that mimic the change in
> PG15 (commit 811051c ) and reorder the crucial events, placing the
> recovery of prepared transactions *before* renaming the file. 

Yes, I think that's OK.  I would like to add two things to your
proposal for all the existing branches.
- Addition of a comment where RecoverPreparedTransactions() is called
at the end of recovery to tell that we'd better do that before working
on the last partial segment of the old timeline.
- Enforce the use of archiving in 009_twophase.pl.

> My humble opinion is that this fix should be backpatched to PG 14
> and PG 13.  It's debatable whether the fix needs to be brought back
> to 12 and older also, as those do not exhibit this issue, but the 
> order of renaming is still wrong.

Yeah, I'd rather wait for somebody to complain about that.  And v11 is
not worth taking risks with at this time of the year, IMHO.

With your fix included, the patch for REL_14_STABLE would be like the
attached.  Is that OK for you?
--
Michael
From 84c6c4cd7612847326c7121f3b20a31cc846ef92 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mo

Re: Support logical replication of DDLs

2023-06-18 Thread shveta malik
As per suggestion by Amit, reviewed two more formats to be used for
DDL's WAL-logging purpose, analysis below:

NodeToString:
I do not think it is a good idea to use NodeToString in DDL Rep for
reasons below:
1) It consists of too much internal and not-needed information.
2) Too large to be logged in WAL. Optimization of this will be a
mammoth task because a) we need to filter out information, not all the
info will be useful to the subscriber; b) it is not a simple key based
search and replace/remove. Modifying strings is always difficult.
3) It's not compatible across major versions. If we want to use Node
information in any format, we need to ensure that the output can be
read in a different major version of PostgreSQL.

Sql-ddl-to-json-schema given in [1]:
This was suggested by Swada-san in one of the discussions. Since it is
json format and thus essentially has to be key:value pairs like the
current implementation. The difference here is that there is no
"format string" maintained with each json object. And thus the
awareness on how to construct the DDL (i.e. exact DDL-synatx) needs to
be present at the receiver side. In our case, we maintain the "fmt"
string using which the receiver can re-construct DDL statements
without knowing PostgreSQL's DDL syntax. The "fmt" string tells us the
order of elements/keys and also representation of values (string,
identifier etc) while the JSON data created by sql-ddl-to-json-schema
looks more generic; the receiver can construct a DDL statement in any
form. It would be more useful for example when the receiver is not a
PostgreSQL server. And thus does not seem a suitable choice for the
logical replication use-case in hand.

[1]: https://www.npmjs.com/package/sql-ddl-to-json-schema

thanks
Shveta




Re: Subscription statistics are not dropped at DROP SUBSCRIPTION in some cases

2023-06-18 Thread Amit Kapila
On Mon, Jun 19, 2023 at 6:50 AM Masahiko Sawada  wrote:
>
> On Sat, Jun 17, 2023 at 6:45 PM Amit Kapila  wrote:
> >
> > On Tue, May 16, 2023 at 8:00 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Thu, May 11, 2023 at 5:12 PM Masahiko Sawada  
> > > wrote:
> > > >
> > >
> > > After thinking more about it, I realized that this is not a problem
> > > specific to HEAD. ISTM the problem is that by commit 7b64e4b3, we drop
> > > the stats entry of subscription that is not associated with a
> > > replication slot for apply worker, but we missed the case where the
> > > subscription is not associated with both replication slots for apply
> > > and tablesync. So IIUC we should backpatch it down to 15.
> > >
> >
> > I agree that it should be backpatched to 15.
> >
> > > Since in pg15, since we don't create the subscription stats at CREATE
> > > SUBSCRIPTION time but do when the first error is reported,
> > >
> >
> > AFAICS, the call to pgstat_create_subscription() is present in
> > CreateSubscription() in 15 as well, so, I don't get your point.
>
> IIUC in 15, pgstat_create_subscription() doesn't create the stats
> entry. See commit e0b01429590.
>

Thanks for the clarification. Your changes looks good to me though I
haven't tested it.

-- 
With Regards,
Amit Kapila.




Re: path->param_info only set for lateral?

2023-06-18 Thread Tom Lane
James Coleman  writes:
> Over in "Parallelize correlated subqueries that execute within each
> worker" [1} Richard Guo found a bug in the current version of my patch
> in that thread. While debugging that issue I've been wondering why
> Path's param_info field seems to be NULL unless there is a LATERAL
> reference even though there may be non-lateral outer params
> referenced.

Per pathnodes.h:

 * "param_info", if not NULL, links to a ParamPathInfo that identifies outer
 * relation(s) that provide parameter values to each scan of this path.
 * That means this path can only be joined to those rels by means of nestloop
 * joins with this path on the inside.  ...

We're only interested in this for params that are coming from other
relations of the same query level, so that they affect join order and
join algorithm choices.  Params coming down from outer query levels
are much like EXTERN params to the planner: they are pseudoconstants
for any one execution of the current query level.

This isn't just LATERAL stuff; it's also intentionally-generated
nestloop-with-inner-indexscan-cases.  But it's not outer-level Params.
Even though those are also PARAM_EXEC Params, they are fundamentally
different animals for the planner's purposes.

regards, tom lane




Re: Deleting prepared statements from libpq.

2023-06-18 Thread jian he
now it works.

/src/test/modules/libpq_pipeline/libpq_pipeline.c
>
> /* Now that it's closed we should get an error when describing */
> res = PQdescribePortal(conn, "cursor_one");
> if (PQresultStatus(res) != PGRES_FATAL_ERROR)
> pg_fatal("expected COMMAND_OK, got %s", PQresStatus(PQresultStatus(res)));
should it be "if (PQresultStatus(res) == PGRES_FATAL_ERROR)" ?

Similarly the following line should also change?

> res = PQdescribePrepared(conn, "select_one");
> if (PQresultStatus(res) != PGRES_FATAL_ERROR)
> pg_fatal("expected FATAL_ERROR, got %s", PQresStatus(PQresultStatus(res)));



typo, unnecessary "portal." in the following sentence?
   "portalName can be "" or NULL to reference the unnamed portal, it is
fine if no portal exists with this name. portal. On success, a PGresult
with status PGRES_COMMAND_OK is returned."

"Also, although there is no libpq function for deleting a prepared
statement, the SQL DEALLOCATE statement can be used for that purpose."
Now the PQclosePrepared has the same use as DEALLOCATE, maybe the above
sentence should be changed?


path->param_info only set for lateral?

2023-06-18 Thread James Coleman
Hello,

Over in "Parallelize correlated subqueries that execute within each
worker" [1} Richard Guo found a bug in the current version of my patch
in that thread. While debugging that issue I've been wondering why
Path's param_info field seems to be NULL unless there is a LATERAL
reference even though there may be non-lateral outer params
referenced.

Consider the query:
select * from pg_description t1 where objoid in
(select objoid from pg_description t2 where t2.description =
t1.description);

The subquery's rel has a baserestrictinfo containing an OpExpr
comparing a Var (t2.description) to a Param of type PARAM_EXEC
(t1.description). But the generated SeqScan path doesn't have its
param_info field set, which means PATH_REQ_OUTER returns NULL also
despite there being an obvious param referencing a required outer
relid. Looking at create_seqscan_path we see that param_info is
initialized with:

get_baserel_parampathinfo(root, rel, required_outer)

where required_outer is passed in from set_plain_rel_pathlist as
rel->lateral_relids. And get_baserel_parampathinfo always returns NULL
if required_outer is empty, so obviously with this query (no lateral
reference) we're not going to get any ParamPathInfo added to the path
or the rel.

Is there a reason why we don't track the required relids providing the
PARAM_EXEC params in this case?

Thanks,
James Coleman

1: 
https://www.postgresql.org/message-id/CAMbWs4_evjcMzN8Gw78bHfhfo2FKJThqhEjRJRmoMZx%3DNXcJ7w%40mail.gmail.com




Re: Allow pg_archivecleanup to remove backup history files

2023-06-18 Thread torikoshia

On 2023-06-16 11:22, Kyotaro Horiguchi wrote:

At Thu, 15 Jun 2023 21:38:28 +0900, torikoshia
 wrote in

On 2023-06-15 15:20, Kyotaro Horiguchi wrote:
Thanks for your review!
> + printf(_(" -x, --strip-extension=EXT strip this extention before
> identifying files fo clean up\n"));
> + printf(_(" -?, --help show this help, then exit\n"));
> After this change, some of these lines corss the boundary of the 80
> columns width.  (is that policy viable noadays? I am usually working
> using terminal windows with such a width..) It's somewhat unrelated to
> this patch, but a help line a few lines further down also exceeds the
> width. We could shorten it by removing the "/mnt/server" portion, but
> I'm not sure if it's worth doing.

I also highlight 80th column according to the wiki[1].
Since usage() in other files like pg_rewind.c and initdb.c also
exceeded the 80th column, I thought that was something like a guide.


I think the page is suggesting about program code, not the messages
that binaries print.


Thanks, now I understand what you meant.


ASAICS the main section of the "pg_rewind --help" fits within 80
columns. However, "initdb --help" does output a few lines exceeding
the 80-column limit. Judging by the surrounding lines, I believe we're
still aiming to maintain these the given width. I think we need to fix
initdb in that regard.


Hmm, it seems some other commands also exceeds 80 columns:

  pg_amcheck:
  --skip=OPTION   do NOT check "all-frozen" or 
"all-visible" blocks
  --startblock=BLOCK  begin checking table(s) at the given 
block number
  --endblock=BLOCKcheck table(s) only up to the given 
block number


  --no-synchronized-snapshots  do not use synchronized snapshots in 
parallel jobs


  pg_isready:
  -t, --timeout=SECS   seconds to wait when attempting connection, 0 
disables (default: 3)


  pg_receivewal:
  --create-slot  create a new replication slot (for the slot's 
name see --slot)
  --drop-slotdrop the replication slot (for the slot's name 
see --slot)


If you don't mind, I'm going to create another thread about this point.
I'll also discuss below line since it's unrelated to current thread
as you pointed out:

| pg_archivecleanup /mnt/server/archiverdir 
00010010.0020.backup



Attached patch fixes the number of columns per row exceeding 80 by
changing to use getopt_long.


On 2023-06-16 11:30, Kyotaro Horiguchi wrote:

At Fri, 16 Jun 2023 11:22:31 +0900 (JST), Kyotaro Horiguchi
 wrote in

ASAICS the main section of the "pg_rewind --help" fits within 80
columns. However, "initdb --help" does output a few lines exceeding
the 80-column limit. Judging by the surrounding lines, I believe we're
still aiming to maintain these the given width. I think we need to fix
initdb in that regard.


Mmm, the message was introduced in 2012 by 8a02339e9b. I haven't
noticed this until now...

regards.


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATIONFrom 220bbd866158fd69a3a4affe73136f4699353ecd Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 19 Jun 2023 09:46:41 +0900
Subject: [PATCH v8 1/3] Introduce pg_archivecleanup into getopt_long

This patch is a preliminary step to add an easy-to-understand option
to delete backup history files, but it also adds long options to
the existing options.
---
 doc/src/sgml/ref/pgarchivecleanup.sgml|  5 -
 src/bin/pg_archivecleanup/pg_archivecleanup.c | 22 +--
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/ref/pgarchivecleanup.sgml b/doc/src/sgml/ref/pgarchivecleanup.sgml
index 635e7c7685..09991c2fcd 100644
--- a/doc/src/sgml/ref/pgarchivecleanup.sgml
+++ b/doc/src/sgml/ref/pgarchivecleanup.sgml
@@ -95,6 +95,7 @@ pg_archivecleanup:  removing file "archive/00010037000E"
 
  
   -d
+  --debug
   

 Print lots of debug logging output on stderr.
@@ -104,6 +105,7 @@ pg_archivecleanup:  removing file "archive/00010037000E"
 
  
   -n
+  --dry-run
   

 Print the names of the files that would have been removed on stdout (performs a dry run).
@@ -122,7 +124,8 @@ pg_archivecleanup:  removing file "archive/00010037000E"
  
 
  
-  -x extension
+  -x extension
+  --strip-extension=extension
   

 Provide an extension
diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c
index 7726d05149..fc0dca9856 100644
--- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
+++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
@@ -17,7 +17,7 @@
 
 #include "access/xlog_internal.h"
 #include "common/logging.h"
-#include "pg_getopt.h"
+#include "getopt_long.h"
 
 const char *progname;
 
@@ -252,11 +252,13 @@ usage(void)
 	printf(_("Usage:\n"));
 	printf(_("  %s [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE\n"), progname);
 	printf(_("\nOptions:\n

Re: Subscription statistics are not dropped at DROP SUBSCRIPTION in some cases

2023-06-18 Thread Masahiko Sawada
On Sat, Jun 17, 2023 at 6:45 PM Amit Kapila  wrote:
>
> On Tue, May 16, 2023 at 8:00 PM Masahiko Sawada  wrote:
> >
> > On Thu, May 11, 2023 at 5:12 PM Masahiko Sawada  
> > wrote:
> > >
> >
> > After thinking more about it, I realized that this is not a problem
> > specific to HEAD. ISTM the problem is that by commit 7b64e4b3, we drop
> > the stats entry of subscription that is not associated with a
> > replication slot for apply worker, but we missed the case where the
> > subscription is not associated with both replication slots for apply
> > and tablesync. So IIUC we should backpatch it down to 15.
> >
>
> I agree that it should be backpatched to 15.
>
> > Since in pg15, since we don't create the subscription stats at CREATE
> > SUBSCRIPTION time but do when the first error is reported,
> >
>
> AFAICS, the call to pgstat_create_subscription() is present in
> CreateSubscription() in 15 as well, so, I don't get your point.

IIUC in 15, pgstat_create_subscription() doesn't create the stats
entry. See commit e0b01429590.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Tom Lane
Michael Paquier  writes:
> Another thing that I was wondering, though..  Do you think that there
> would be an argument in being stricter in the hstore code regarding
> the handling of multi-byte characters with some checks based on
> IS_HIGHBIT_SET() when parsing the keys and values?

What have you got in mind?  We should already have validated encoding
correctness before the text ever gets to hstore_in, and I'm not clear
what additional checks would be useful.

regards, tom lane




Re: Do we want a hashset type?

2023-06-18 Thread jian he
On Sat, Jun 17, 2023 at 8:38 AM Joel Jacobson  wrote:
>
> On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote:
> > I realise int4hashset_hash() is broken,
> > since two int4hashset's that are considered equal,
> > can by coincidence get different hashes:
> ...
> > Do we have any ideas on how to fix this without sacrificing performance?
>
> The problem was due to hashset_hash() function accumulating the hashes
> of individual elements in a non-commutative manner. As a consequence, the
> final hash value was sensitive to the order in which elements were inserted
> into the hashset. This behavior led to inconsistencies, as logically
> equivalent sets (i.e., sets with the same elements but in different orders)
> produced different hash values.
>
> Solved by modifying the hashset_hash() function to use a commutative operation
> when combining the hashes of individual elements. This change ensures that the
> final hash value is independent of the element insertion order, and logically
> equivalent sets produce the same hash.
>
> An somewhat unfortunate side-effect of this fix, is that we can no longer
> visually sort the hashset output format, since it's not lexicographically 
> sorted.
> I think this is an acceptable trade-off for a hashset type,
> since the only alternative I see would be to sort the elements,
> but then it wouldn't be a hashset, but a treeset, which different
> Big-O complexity.
>
> New patch is attached, which will henceforth always be a complete patch,
> to avoid the hassle of having to assemble incremental patches.
>
> /Joel


select hashset_contains('{1,2}'::int4hashset,NULL::int);
should return null?
-
SELECT attname
,pc.relname
,CASE attstorage
WHEN 'p' THEN 'plain'
WHEN 'e' THEN 'external'
WHEN 'm' THEN 'main'
WHEN 'x' THEN 'extended'
END AS storage
FROMpg_attribute pa
joinpg_classpc  on pc.oid = pa.attrelid
where   attnum > 0  and pa.attstorage   = 'e';

In my system catalog, it seems only the hashset type storage =
'external'. most is extended.
I am not sure the consequence of switch from external to extended.

select hashset_hash('{-1,1}')   as a1
,hashset_hash('{1,-2}') as a2
,hashset_hash('{-3,1}') as a3
,hashset_hash('{4,1}')  as a4;
returns:
 a1  |a2 | a3 | a4
-+---++
 -1735582196 | 998516167 | 1337000903 | 1305426029
(1 row)

values {a1,a2,a3,a4} should be monotone increasing, based on the
function int4hashset_cmp, but now it's not.
so the following queries failed.

--should return only one row.
select hashset_cmp('{2,1}','{3,1}')
union
select hashset_cmp('{3,1}','{4,1}')
union
select hashset_cmp('{1,3}','{4,1}');

select hashset_cmp('{9,10,11}','{10,9,-11}') =
hashset_cmp('{9,10,11}','{10,9,-1}'); --should be true
select '{2,1}'::int4hashset > '{7}'::int4hashset; --should be false.
based on int array comparison,.
-
I comment out following lines in hashset-api.c somewhere between {810,829}

// if (a->hash < b->hash)
// PG_RETURN_INT32(-1);
// else if (a->hash > b->hash)
// PG_RETURN_INT32(1);

// if (a->nelements < b->nelements)
// PG_RETURN_INT32(-1);
// else if (a->nelements > b->nelements)
// PG_RETURN_INT32(1);

// Assert(a->nelements == b->nelements);

So hashset_cmp will directly compare int array. the above queries works.

{int4hashset_equals,int4hashset_neq} two special cases of hashset_cmp.
maybe we can just  wrap it just like int4hashset_le?

now store 10 element int4hashset need 99 bytes, similar one dimension
bigint array with length 10, occupy 101 byte

in int4hashset_send, newly add struct attributes/member {load_factor
growth_factor ncollisions hash} also need send to buf?




Re: Deleting prepared statements from libpq.

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 01:03:57PM +0200, Jelte Fennema wrote:
> Sorry about that. I attached a new patch that allows linking to the
> new functions (I forgot to add the functions to exports.txt). This new
> patch also adds some basic tests for these new functions.

I am okay with the arguments about pgbouncer and psycopg2.  The
symmetry with the portal description routines makes this proposal easy
to think about.

-   PGQUERY_CLOSE
+   PGQUERY_CLOSE   /* Close Statoment or Portal */

s/Statoment/Statement/.

+ * Available options for close_type are
+ *  'S' to close a prepared statement; or
+ *  'P' to close a portal.
+ * Returns 1 on success and 0 on failure.
+ */
+static int
+PQsendClose(PGconn *conn, char close_type, const char *close_target)

Could it be better for this code path to issue an error if using a
non-supported close_type rather than sending it?  Okay, you are
consistent with desc_type and PQsendDescribe(), just asking if it is
worth doing something about.

+  
+   
+Submits a request to obtain information about the specified
+prepared statement, and waits for completion.
+
PQclosePrepared() does not request for a description.

+Submits a request to close the the specified
+portal, and waits for completion.
s/the the/the/.

+ allows an application to release
+resources related to a portal previously created portal. If it was a
The end of this sentence looks a bit weird.

Perhaps there should be some tests for the two async routines, as
well?
--
Michael


signature.asc
Description: PGP signature


Re: Deleting prepared statements from libpq.

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 09:23:22PM +0800, jian he wrote:
> previously I cannot link it. with
> v2-0001-Support-sending-Close-messages-from-libpq.patch. now I can
> compile it, link it, but then run time error.
> same c program in the first email.
> when I run it ./a.out, then error:
> ./a.out: symbol lookup error: ./a.out: undefined symbol: PQsendClosePrepared

If you still have problems, it seems to me that one mistake is in not
updating LD_LIBRARY_PATH.  It should point to a version of libpq
compiled with the patch, or -lpq will not be able to resolve correctly
when compiling your test program.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 12:38:12PM -0400, Tom Lane wrote:
> FWIW, I think the status quo is fine.  Having hstore do something that
> is neither its historical behavior nor aligned with the core parser
> doesn't seem like a great idea.

Okay.  Fine by me.

> I don't buy this argument that
> somebody might be depending on the handling of \v in particular.  It's
> not any stronger than the argument that they might be depending on,
> say, recognizing no-break space (0xA0) in LATIN1, which the old code
> did (probably, depending on platform) and scanner_isspace will not.

Another thing that I was wondering, though..  Do you think that there
would be an argument in being stricter in the hstore code regarding
the handling of multi-byte characters with some checks based on
IS_HIGHBIT_SET() when parsing the keys and values?
--
Michael


signature.asc
Description: PGP signature


Re: Do we want a hashset type?

2023-06-18 Thread Joel Jacobson
On Sun, Jun 18, 2023, at 18:45, Andrew Dunstan wrote:
> . It might be worth sending a version number with the send function 
> (c.f. jsonb_send / jsonb_recv). That way would would not be tied forever 
> to some wire representation.

Great idea; implemented.

> . I think there are some important set operations missing: most notably 
> intersection, slightly less importantly asymmetric and symmetric 
> difference. I have no idea how easy these would be to add, but even for 
> your stated use I should have thought set intersection would be useful 
> ("Who is a member of both this set of friends and that set of friends?").

Another great idea; implemented.

> . While supporting int4 only is OK for now, I think we would at least 
> want to support int8, and probably UUID since a number of systems I know 
> of use that as an object identifier.

I agree that's probably the most logical thing to focus on next. I'm on it.

New patch attached.

/Joel

hashset-0.0.1-75bf3ab.patch
Description: Binary data


Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-06-18 Thread Joe Conway

On 6/12/23 17:28, Joe Conway wrote:

On 6/12/23 10:44, Joe Conway wrote:

1/ how do we fix the misbehavior reported due to libperl in existing
stable branches





I was mostly trying to concentrate on #1, but 2 & 3 are worthy of
discussion.


Hmm, browsing through the perl source I came across a reference to this
(from https://perldoc.perl.org/perllocale):

---
PERL_SKIP_LOCALE_INIT

  This environment variable, available starting in Perl v5.20, if set
(to any value), tells Perl to not use the rest of the environment
variables to initialize with. Instead, Perl uses whatever the current
locale settings are. This is particularly useful in embedded
environments, see "Using embedded Perl with POSIX locales" in perlembed.
---

Seems we ought to be using that.


Turns out that that does nothing useful as far as I can tell.

So I am back to proposing the attached against pg16beta1, to be 
backpatched to pg11.


Since much of the discussion happened on pgsql-bugs, the background 
summary for hackers is this:


When plperl is first loaded, the init function eventually works its way 
to calling Perl_init_i18nl10n(). In versions of perl >= 5.20, that ends 
up at S_emulate_setlocale() which does a series of uselocale() calls. 
For reference, RHEL 7 is perl 5.16.3 while RHEL 9 is perl 5.32.1. Older 
versions of perl do not have this behavior.


The problem with uselocale() is that it changes the active locale away 
from the default global locale. Subsequent uses of setlocale() affect 
the global locale, but if that is not the active locale, it does not 
control the results of locale dependent functions such as localeconv(), 
which is what we depend on in PGLC_localeconv().


The result is illustrated in this example:
8<
psql test
psql (16beta1)
Type "help" for help.

test=# show lc_monetary;
 lc_monetary
-
 en_GB.UTF-8
(1 row)

test=# SELECT 12.34::money AS price;
 price

 £12.34
(1 row)

test=# \q
8<
psql test
psql (16beta1)
Type "help" for help.

test=# load 'plperl';
LOAD
test=# show lc_monetary;
 lc_monetary
-
 en_GB.UTF-8
(1 row)

test=# SELECT 12.34::money AS price;
 price

 $12.34
(1 row)
8<

Notice that merely loading plperl makes the currency symbol wrong.

I have proposed a targeted fix that I believe is safe to backpatch -- 
attached.


IIUC, Tom was +1, but Heikki was looking for a more general solution.

My issue with the more general solution is that it will likely be too 
invasive to backpatch, and at the moment at least, there are no other 
confirmed bugs related to all of this (even if the current code is more 
fragile than we would prefer).


I would like to commit this to all supported branches in the next few 
days, unless there are other suggestions or objections.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Ensure the global locale gets used by localeconv()

A loaded library, such as libperl, may call uselocale() underneath us.
This can result in localeconv() grabbing the wrong locale for
numeric and monetary symbols and formatting. Fix that by resetting
to the global locale determined by setlocale(). Backpatch to all
supported versions.

Author: Joe Conway
Reviewed-By: Tom Lane
Reported by: Guido Brugnara
Discussion: https://postgr.es/m/flat/17946-3e84cb577e9551c3%40postgresql.org
Backpatch-through: 11

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16..9dba161 100644
*** a/src/backend/utils/adt/pg_locale.c
--- b/src/backend/utils/adt/pg_locale.c
*** PGLC_localeconv(void)
*** 505,510 
--- 505,517 
  	}
  
  	/*
+ 	 * Ensure the global locale will be used by localeconv().
+ 	 * This is necessary, for example, if another loaded library
+ 	 * such as libperl has done uselocale() underneath us.
+ 	 */
+ 	uselocale(LC_GLOBAL_LOCALE);
+ 
+ 	/*
  	 * This is tricky because we really don't want to risk throwing error
  	 * while the locale is set to other than our usual settings.  Therefore,
  	 * the process is: collect the usual settings, set locale to special
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 7af1ccb..5d80e77 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*** setDecimalLocale(void)
*** 3628,3633 
--- 3628,3639 
  {
  	struct lconv *extlconv;
  
+ 	/*
+ 	 * Ensure the global locale will be used by localeconv().
+ 	 * This is necessary, for example, if another loaded library
+ 	 * has done uselocale() underneath us.
+ 	 */
+ 	uselocale(LC_GLOBAL_LOCALE);
  	extlconv = localeconv();
  
  	/* Don't accept an empty decimal_point string */


Re: Use generation context to speed up tuplesorts

2023-06-18 Thread Tomas Vondra
Hi Ronan,

We briefly chatted about the glibc-tuning part of this thread at pgcon,
so I wonder if you're still planning to pursue that. If you do, I
suggest we start a fresh thread, so that it's not mixed with the already
committed improvements of generation context.

I wonder what's the situation with the generation context improvements
already pushed - does setting the glibc thresholds still help, and if
yes how much?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Do we want a hashset type?

2023-06-18 Thread Andrew Dunstan



On 2023-06-16 Fr 20:38, Joel Jacobson wrote:


New patch is attached, which will henceforth always be a complete patch,
to avoid the hassle of having to assemble incremental patches.



Cool, thanks.


A couple of random thoughts:


. It might be worth sending a version number with the send function 
(c.f. jsonb_send / jsonb_recv). That way would would not be tied forever 
to some wire representation.


. I think there are some important set operations missing: most notably 
intersection, slightly less importantly asymmetric and symmetric 
difference. I have no idea how easy these would be to add, but even for 
your stated use I should have thought set intersection would be useful 
("Who is a member of both this set of friends and that set of friends?").


. While supporting int4 only is OK for now, I think we would at least 
want to support int8, and probably UUID since a number of systems I know 
of use that as an object identifier.



cheers


andrew


--

Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Tom Lane
Michael Paquier  writes:
> At the end, no need to do that.  I have been able to hack the
> attached, that shows the difference of treatment for \v when running
> in macOS.  Evan, what do you think?

FWIW, I think the status quo is fine.  Having hstore do something that
is neither its historical behavior nor aligned with the core parser
doesn't seem like a great idea.  I don't buy this argument that
somebody might be depending on the handling of \v in particular.  It's
not any stronger than the argument that they might be depending on,
say, recognizing no-break space (0xA0) in LATIN1, which the old code
did (probably, depending on platform) and scanner_isspace will not.

If anything, the answer for these concerns is that d522b05c8
should not have been back-patched.  But I'm okay with where we are.

regards, tom lane




Re: Deleting prepared statements from libpq.

2023-06-18 Thread jian he
On Sun, Jun 18, 2023 at 7:04 PM Jelte Fennema  wrote:
>
> On Sat, 17 Jun 2023 at 15:34, jian he  wrote:
> > I failed to link it. I don't know why.
>
> Sorry about that. I attached a new patch that allows linking to the
> new functions (I forgot to add the functions to exports.txt). This new
> patch also adds some basic tests for these new functions.

previously I cannot link it. with
v2-0001-Support-sending-Close-messages-from-libpq.patch. now I can
compile it, link it, but then run time error.
same c program in the first email.
when I run it ./a.out, then error:
./a.out: symbol lookup error: ./a.out: undefined symbol: PQsendClosePrepared




Re: Deleting prepared statements from libpq.

2023-06-18 Thread Jelte Fennema
On Sat, 17 Jun 2023 at 15:34, jian he  wrote:
> I failed to link it. I don't know why.

Sorry about that. I attached a new patch that allows linking to the
new functions (I forgot to add the functions to exports.txt). This new
patch also adds some basic tests for these new functions.


v2-0001-Support-sending-Close-messages-from-libpq.patch
Description: Binary data


Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 10:50:16AM +0900, Michael Paquier wrote:
> The difference between scanner_isspace() and array_isspace() is that
> the former matches with what scan.l stores as rules for whitespace
> characters, but the latter works on values.  For hstore, we want the
> latter, with something that works on values.  To keep the change
> locale to hstore, I think that we should just introduce an
> hstore_isspace() which is a copy of array_isspace.  That's a
> duplication, sure, but I think that we may want to think harder about
> \v in the flex scanner, and that's just a few extra lines for 
> something that has not changed in 13 years for arrays.  That's also
> easier to think about for stable branches.  If you can send a patch,
> that helps a lot, for sure!

At the end, no need to do that.  I have been able to hack the
attached, that shows the difference of treatment for \v when running
in macOS.  Evan, what do you think?
--
Michael
From 7ede07940011d08f8c0cf05d57b3782f367c0adf Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Sun, 18 Jun 2023 17:29:37 +0900
Subject: [PATCH] More fixes for hstore and locales

This is not really complete without the treatment of \v like array
values.
---
 contrib/hstore/expected/hstore_utf8.out | 31 +
 contrib/hstore/hstore_io.c  | 30 
 contrib/hstore/sql/hstore_utf8.sql  |  7 ++
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/contrib/hstore/expected/hstore_utf8.out b/contrib/hstore/expected/hstore_utf8.out
index 4405824413..bbc885a181 100644
--- a/contrib/hstore/expected/hstore_utf8.out
+++ b/contrib/hstore/expected/hstore_utf8.out
@@ -34,3 +34,34 @@ SELECT 'keyąfoo=>valueą'::hstore;
  "keyąfoo"=>"valueą"
 (1 row)
 
+-- More patterns that may depend on isspace() and locales, all discarded.
+SELECT E'key\u000A=>value\u000A'::hstore; -- \n
+ hstore 
+
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u0009=>value\u0009'::hstore; -- \t
+ hstore 
+
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000D=>value\u000D'::hstore; -- \r
+ hstore 
+
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000B=>value\u000B'::hstore; -- \v
+ hstore 
+
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000C=>value\u000C'::hstore; -- \f
+ hstore 
+
+ "key"=>"value"
+(1 row)
+
diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 999ddad76d..f33b241d54 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -13,7 +13,6 @@
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "nodes/miscnodes.h"
-#include "parser/scansup.h"
 #include "utils/builtins.h"
 #include "utils/json.h"
 #include "utils/jsonb.h"
@@ -41,6 +40,7 @@ typedef struct
 	int			plen;
 } HSParser;
 
+static bool hstore_isspace(char ch);
 static bool hstoreCheckKeyLength(size_t len, HSParser *state);
 static bool hstoreCheckValLength(size_t len, HSParser *state);
 
@@ -82,6 +82,26 @@ prseof(HSParser *state)
 	return false;
 }
 
+/*
+ * hstore_isspace() --- a non-locale-dependent isspace()
+ *
+ * We used to use isspace() for parsing hstore values, but that has
+ * undesirable results: a hstore value might be silently interpreted
+ * differently depending on the locale setting.  Now we just hard-wire
+ * the traditional ASCII definition of isspace().
+ */
+static bool
+hstore_isspace(char ch)
+{
+	if (ch == ' ' ||
+		ch == '\t' ||
+		ch == '\n' ||
+		ch == '\r' ||
+		ch == '\v' ||
+		ch == '\f')
+		return true;
+	return false;
+}
 
 #define GV_WAITVAL 0
 #define GV_INVAL 1
@@ -119,7 +139,7 @@ get_val(HSParser *state, bool ignoreeq, bool *escaped)
 			{
 st = GV_WAITESCIN;
 			}
-			else if (!scanner_isspace((unsigned char) *(state->ptr)))
+			else if (!hstore_isspace((unsigned char) *(state->ptr)))
 			{
 *(state->cur) = *(state->ptr);
 state->cur++;
@@ -142,7 +162,7 @@ get_val(HSParser *state, bool ignoreeq, bool *escaped)
 state->ptr--;
 return true;
 			}
-			else if (scanner_isspace((unsigned char) *(state->ptr)))
+			else if (hstore_isspace((unsigned char) *(state->ptr)))
 			{
 return true;
 			}
@@ -256,7 +276,7 @@ parse_hstore(HSParser *state)
 			{
 PRSEOF;
 			}
-			else if (!scanner_isspace((unsigned char) *(state->ptr)))
+			else if (!hstore_isspace((unsigned char) *(state->ptr)))
 			{
 PRSSYNTAXERROR;
 			}
@@ -310,7 +330,7 @@ parse_hstore(HSParser *state)
 			{
 return true;
 			}
-			else if (!scanner_isspace((unsigned char) *(state->ptr)))
+			else if (!hstore_isspace((unsigned char) *(state->ptr)))
 			{
 PRSSYNTAXERROR;
 			}
diff --git a/contrib/hstore/sql/hstore_utf8.sql b/contrib/hstore/sql/hstore_utf8.sql
index face878324..38c9481ee6 100644
--- a/contrib/hstore/sql/hstore_utf8.sql
+++ b/contrib/hstore/sql/hstore_utf8.sql
@@ -17,3 +17,10 @@ SELECT E'key\u0105=>value\u0105'::hstore;
 SELECT 'keyą=>valueą'::hstore;
 SELECT 'ą=>ą'::hstore;