Re: How to attach partition with primary key

2024-06-18 Thread Alvaro Herrera
On 2024-Jun-18, Philipp Faster wrote:

> I've done a bad job describing my issue in the first place: I left out a
> key definition statement that I thought didn't play any role in the issue:
> another unique index on the same columns as the PK.

I see.  That unique index seems quite useless. Why not just drop it?
That would also solve your problem.

> Interesting thing is that if I define PK before the unique index and then
> drop it and redefine after the unique index, then the code still doesn't
> work. This behaviour smells like a bug on the PostgreSQL side...

Yeah, it sounds like we have a bug in the index-matching code, though
it depends on having a useless, duplicative index.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Selbst das größte Genie würde nicht weit kommen, wenn es
alles seinem eigenen Innern verdanken wollte." (Johann Wolfgang von Goethe)
   Ni aún el genio más grande llegaría muy lejos si
quisiera sacarlo todo de su propio interior.




Re: How to attach partition with primary key

2024-06-18 Thread Alvaro Herrera
On 2024-Jun-18, Philipp Faster wrote:

> As I understand, PostgreSQL refuses to use the existing primary key for
> some reason and tries to create its own as a child of the
> "Transactions" table's primary key.

Yeah.  Your case sounds like the primary key in the partitioned table
has some slight definition difference from the stock one, which makes
the one you create in the partition not an exact match.  I'd suggest
looking at what "pg_dump -s" emits as the definition of the table with a
magnifying glass, to search for such differences.
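
For example, a quick way to put the definitions side by side (a sketch;
the table names are hypothetical):

  SELECT schemaname, tablename, indexname, indexdef
    FROM pg_indexes
   WHERE tablename IN ('transactions', 'transactions_2024_06')
   ORDER BY tablename, indexname;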

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: UPDATE with multiple WHERE conditions

2024-06-13 Thread Alvaro Herrera
On 2024-Jun-12, David G. Johnston wrote:

> On Wed, Jun 12, 2024 at 2:28 PM Rich Shepard wrote:
> 
> > I have a table with 3492 rows. I want to update a boolean column from
> > 'false' to 'true' for 295 rows based on the value of another column.
> 
> I'll often just use a spreadsheet to build the 295 update commands and
> copy-paste them into psql or whatnot.

A closely related technique: if you have a query that generates the
UPDATE commands you need, you can run it under \gexec in psql, and
they'll be executed.
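
For example, a sketch with hypothetical table and column names:

  SELECT format('UPDATE clients SET active = true WHERE client_id = %s;', client_id)
    FROM clients_to_update
  \gexec

psql then runs each row of the SELECT's output as an SQL command.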

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Al principio era UNIX, y UNIX habló y dijo: "Hello world\n".
No dijo "Hello New Jersey\n", ni "Hello USA\n".




Re: Backup failure Postgres

2024-05-23 Thread Alvaro Herrera
On 2024-May-23, Jethish Jethish wrote:

> I have tried increasing max_standby_streaming_delay, but I'm facing
> lag issues on the replica server.
> 
> When I increase max_standby_streaming_delay, even if a query runs for 2
> minutes I'm facing lag issues for 2 minutes.

You could use a separate replica for backups, with a larger max delay.
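
For example, on the standby dedicated to backups, something like this
(a sketch; -1 means "never cancel standby queries", which is only
acceptable if you don't care how far that particular standby lags):

  ALTER SYSTEM SET max_standby_streaming_delay = -1;
  SELECT pg_reload_conf();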

> Data size is 3TB

I think pg_dump is not an appropriate tool for backups on an instance
this size.  Have you considered using pgbarman or such, instead?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Missed compiler optimization issue in function select_rtable_names_for_explain

2024-05-22 Thread Alvaro Herrera
On 2024-May-22, XChy wrote:

> Hi everyone,
> 
> I'm a compiler developer working on detecting missed optimization in
> real-world applications. Recently, we found that LLVM missed a dead store
> elimination optimization in the PostgreSQL code
> (https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/ruleutils.c#L3794)
> in the master branch.

I think if we want to improve how this code is seen by the compiler by
modifying it, we should just remove the NULL/NIL assignments.  It's a
pretty arbitrary (and rather small) subset of fields being initialized,
fields which very obviously have been zeroed by memset already, so I
don't see any value in the explicit assignments.

The criteria may be different in cases where some struct has all of its
members initialized to something, because at least you can use that kind
of thing to find places you need to modify when adding new struct
members.  Not so in select_rtable_names_for_explain.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I can see support will not be a problem.  10 out of 10."(Simon Wittber)
  (http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)




Re: Using ALTER TABLE DETACH PARTITION CONCURRENTLY inside a procedure

2024-05-14 Thread Alvaro Herrera
On 2024-May-14, Dirschel, Steve wrote:

> But when I try and run the command inside the procedure it throws this error:
> 
> STATE: 25001
> MESSAGE: ALTER TABLE ... DETACH CONCURRENTLY cannot run inside a transaction 
> block
> CONTEXT: SQL statement "alter table t2.test1 detach partition 
> t2.test1_gentime_20240511 concurrently"
> PL/pgSQL function part.partition_maintenance() line 323 at EXECUTE

Yeah, ouch.

> The documentation states:
> 
> CONCURRENTLY cannot be run in a transaction block and is not allowed if the 
> partitioned table contains a default partition.

Right.

> Is there an option to call that CONCURRENTLY inside a procedure as I describe?

Not at the moment.  The issue is that CONCURRENTLY needs to commit a
transaction internally and start a new one, and to ensure that works
correctly we check that it's being executed as a "top-level command",
which rules out procedures.  It may be possible to relax that
restriction when run inside procedures, given that procedures need
transaction control of their own anyway so we could arrange for the
right things to happen; but this is hypothetical and I don't know that
anybody has attempted to implement that.
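
The workaround is to run it as a top-level statement, outside any
procedure or explicit transaction; for instance, directly from psql,
using the names from your report:

  ALTER TABLE t2.test1
    DETACH PARTITION t2.test1_gentime_20240511 CONCURRENTLY;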

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Clarification Needed on Postgresql License Requirement for Hybrid Environment Cluster Configuration

2024-05-06 Thread Alvaro Herrera
Hello,

On 2024-May-06, Prasanna Chavan wrote:

> I hope this email finds you well. I am reaching out to seek
> clarification on a matter regarding the configuration of a cluster in
> a hybrid environment.
> Specifically, I am wondering whether we require a PostgreSQL license
> for setting up a cluster in such an environment. Could you kindly
> provide some guidance or clarification on this matter?

Yes, you require a license.  However, luckily for you, the license is
provided for any purpose, without fee, and without a written agreement.
The COPYRIGHT file in the source distribution contains the details.
You can use the software standalone or in cluster configuration, as you
see fit.  In short, you don't need to worry about it.

However, this doesn't necessarily apply to other software packages
derived from PostgreSQL itself, or those that work in conjunction with
it, such as Postgres extensions, admin interfaces, GUIs, monitoring
packages, etc.  You'll have to seek clarification on each such software
package that you intend to use.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)




Re: Failing streaming replication on PostgreSQL 14

2024-04-15 Thread Alvaro Herrera
On 2024-Apr-15, Nicolas Seinlet wrote:

> I'm using the Ubuntu/cyphered ZFS/PostgreSQL combination. I'm using
> Ubuntu LTS (20.04, 22.04) and the provided ZFS/PostgreSQL with LTS
> (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).

What exactly is "cyphered ZFS"?  Can you reproduce the problem with some
other filesystem?  If it's something very unusual, it might well be a
bug in the filesystem.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Re: Tracing libpq client: Only with PQtrace()?

2024-04-09 Thread Alvaro Herrera
On 2024-Apr-09, Greg Sabino Mullane wrote:

> On Tue, Apr 9, 2024 at 8:51 AM Sebastien Flaesch wrote:
> 
> > Is the PQtrace() API the only way to enable libpq client tracing?
> >
> > I thought about some environment variable of client configuration
> > setting...
> 
> That's generally the job of the client, or more specifically, the driver
> providing the interface between your program and the libpq API. Unless you
> are writing at the low level yourself, in which case, you would need to
> have your program detect an env and call PQtrace() as needed.
> 
> See for example
> https://metacpan.org/pod/DBD::Pg#parse_trace_flag-and-parse_trace_flags

Another approach might be to use Wireshark, which has a decoder for the
Postgres FE/BE protocol.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)




Re: Tracing libpq client: Only with PQtrace()?

2024-04-09 Thread Alvaro Herrera
Hello,

On 2024-Apr-09, Sebastien Flaesch wrote:

> Is the PQtrace() API the only way to enable libpq client tracing?

Yes.

Regards

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: pg_dumpall with flag --no-role-passwords omits roles comments as well

2024-03-21 Thread Alvaro Herrera
On 2024-Mar-21, Daniel Gustafsson wrote:

> On 21 Mar 2024, at 13:28, Alvaro Herrera wrote:
> 
> > I very much doubt that they realized that comments were going to be
> > omitted.  But clearly it's just a mistake, and easily fixed.
> 
> It sure looks like a search/replace kind of bug.  I had just typed up the
> exact same patch, with the addition of a comment on why pg_authid is used,
> and was about to hit send when your email came =)  Are you committing it,
> or do you want me to take care of it?

Hah :-)  Please do.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Here's a general engineering tip: if the non-fun part is too complex for you
to figure out, that might indicate the fun part is too ambitious." (John Naylor)
https://postgr.es/m/CAFBsxsG4OWHBbSDM%3DsSeXrQGOtkPiOEOuME4yD7Ce41NtaAD9g%40mail.gmail.com




Re: pg_dumpall with flag --no-role-passwords omits roles comments as well

2024-03-21 Thread Alvaro Herrera
On 2024-Mar-21, Daniel Gustafsson wrote:

> Comments on roles are stored against the pg_authid catalog relation which is
> the catalog used for dumping roles, but when using --no-role-passwords we
> instead switch to using the pg_roles catalog relation.  Since comments are
> dumped for the relations which are dumped, this means that the comments on
> roles are omitted when --no-role-passwords is used.
> 
> It's not clear whether that was intentional or not, I'm failing to find the
> thread where it was discussed on a quick mailing list search.

Here it is:
https://www.postgresql.org/message-id/flat/CAEP4nAz9V4H41_4ESJd1Gf0v%3DdevkqO1%3Dpo91jUw-GJSx8Hxqg%40mail.gmail.com

I very much doubt that they realized that comments were going to be
omitted.  But clearly it's just a mistake, and easily fixed.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
From baa4bd18f751cde68c5637c4cb8065cf94e92c1c Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Thu, 21 Mar 2024 13:27:37 +0100
Subject: [PATCH v1] fix dump of role comments with --no-role-passwords

---
 src/bin/pg_dump/pg_dumpall.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 491311fe79..72c30fc66d 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -776,21 +776,21 @@ dumpRoles(PGconn *conn)
 		  "rolcreaterole, rolcreatedb, "
 		  "rolcanlogin, rolconnlimit, rolpassword, "
 		  "rolvaliduntil, rolreplication, rolbypassrls, "
-		  "pg_catalog.shobj_description(oid, '%s') as rolcomment, "
+		  "pg_catalog.shobj_description(oid, 'pg_authid') as rolcomment, "
 		  "rolname = current_user AS is_current_user "
 		  "FROM %s "
 		  "WHERE rolname !~ '^pg_' "
-		  "ORDER BY 2", role_catalog, role_catalog);
+		  "ORDER BY 2", role_catalog);
 	else if (server_version >= 90500)
 		printfPQExpBuffer(buf,
 		  "SELECT oid, rolname, rolsuper, rolinherit, "
 		  "rolcreaterole, rolcreatedb, "
 		  "rolcanlogin, rolconnlimit, rolpassword, "
 		  "rolvaliduntil, rolreplication, rolbypassrls, "
-		  "pg_catalog.shobj_description(oid, '%s') as rolcomment, "
+		  "pg_catalog.shobj_description(oid, 'pg_authid') as rolcomment, "
 		  "rolname = current_user AS is_current_user "
 		  "FROM %s "
-		  "ORDER BY 2", role_catalog, role_catalog);
+		  "ORDER BY 2", role_catalog);
 	else
 		printfPQExpBuffer(buf,
 		  "SELECT oid, rolname, rolsuper, rolinherit, "
@@ -798,10 +798,10 @@ dumpRoles(PGconn *conn)
 		  "rolcanlogin, rolconnlimit, rolpassword, "
 		  "rolvaliduntil, rolreplication, "
 		  "false as rolbypassrls, "
-		  "pg_catalog.shobj_description(oid, '%s') as rolcomment, "
+		  "pg_catalog.shobj_description(oid, 'pg_authid') as rolcomment, "
 		  "rolname = current_user AS is_current_user "
 		  "FROM %s "
-		  "ORDER BY 2", role_catalog, role_catalog);
+		  "ORDER BY 2", role_catalog);
 
 	res = executeQuery(conn, buf->data);
 
-- 
2.39.2



Re: Question about PostgreSQL upgrade from version 12 to version 15

2024-03-21 Thread Alvaro Herrera
On 2024-Mar-21, Joseph Kennedy wrote:

> I'm planning to upgrade my PostgreSQL database from version 12 to
> version 15 using pg_upgrade. After completing the upgrade process, I'm
> curious to know whether it's necessary to reindex the database.
>
> Could anyone please clarify whether reindexing is required after
> completing the upgrade process from PostgreSQL version 12 to version
> 15 using pg_upgrade?

A reindex(*) is necessary for indexes on textual columns(**), and only
if you're also changing the underlying OS version(***) such that the
collation behavior changes.  If you're keeping the database on the same
OS version, there's no need to reindex anything.

(*) More than reindex, actually: you may need to refresh materialized
views and consider carefully any partition bounds you may have, if any
partition keys include textual columns.  Even worse: if you have FDWs on
a Postgres server that queries a table from another Postgres server with
different collation libraries, it could bite you there too.

(**) textual column in this case means anything that is affected by
collation changes; typically that's things like varchar, text, citext,
etc, for which a collation other than "C" is explicit or implied.  You
don't need to do anything for indexes on numbers, dates, geometries,
etc, nor for textual columns where the index is defined with the C
collation.

(***) the underlying C library changes collation rules rather frequently
(especially glibc), because the developers of said libraries don't
consider that this has any important, permanent impact (but it does
impact indexes for Postgres).  Most such changes are innocuous, but from
time to time they make changes that wreak havoc.  If you're using ICU
collations with your Postgres 12 databases, you may also be affected if
you upgrade from one ICU version to another.


Joe Conway gave a very good presentation on this topic recently:
https://www.postgresql.eu/events/fosdem2024/schedule/session/5127-collation-challenges-sorting-it-out/
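
If you do turn out to need a rebuild, a sketch of the options (index
and database names are hypothetical):

  -- selectively, one affected index at a time, without blocking writes:
  REINDEX INDEX CONCURRENTLY my_text_idx;

  -- or the blunt instrument, rebuilding everything in the database:
  REINDEX DATABASE mydb;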

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Por suerte hoy explotó el califont porque si no me habría muerto
 de aburrido"  (Papelucho)




Re: Is it possible to keep track of SELECTs?

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Dominique Devienne wrote:

> So is it possible to track the last time a SELECT was performed on some
> TABLE?

Perhaps you could use the pgAudit module for this purpose.
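
A minimal sketch, assuming pgaudit is installed and already loaded via
shared_preload_libraries (which requires a restart):

  CREATE EXTENSION pgaudit;
  ALTER SYSTEM SET pgaudit.log = 'read';  -- log SELECT and COPY statements
  SELECT pg_reload_conf();

Each logged statement carries a timestamp, so "last SELECT on some
table" becomes a matter of searching the log.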

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Non-Stored Generated Columns

2024-02-29 Thread Alvaro Herrera
On 2024-Feb-29, Dominique Devienne wrote:

> Honestly, I'm not sure why supporting the non-stored variant of generated
> columns is so controversial...

I don't think there's anything controversial about virtual generated
columns, really ... it's just that it's tricky to implement and we don't
have it yet.  I have interest in working on something related that might
allow us to implement virtual generated columns later, but it's not for
the near future.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."   (Nathaniel Smith)
  https://mail.gnu.org/archive/html/monotone-devel/2007-01/msg00080.html




Re: Cannot COPY xmin?

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Dominique Devienne wrote:

> Hi. I just tried adding the xmin pseudo-column to my COPY,
> and I get an error.

I suggest you find some other way to achieve whatever it is you think
you'll achieve with this.  It's not reliable and you'll regret it
eventually.  Maybe take a step back and let us know why you want to copy
xmin in the first place?

> Do I need to switch to a `COPY (select xmin, cols... from tab) ...` to
> access xmin?

If your ultimate objective is to shoot your feet, then yes, this will
work.
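
That is, a sketch (table name hypothetical):

  COPY (SELECT xmin, * FROM tab) TO STDOUT;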

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Could not read from file "pg_subtrans/00F5" at offset 122880: Success.

2023-11-29 Thread Alvaro Herrera
On 2023-Nov-28, Kyotaro Horiguchi wrote:

> By the way, just out of curiosity, but errno should not be zero at the
> time the message above was output, yet "%m" is showing "success",
> which implies errno = 0 in Linux. How can that happen?

If the file is exactly of the length given then seek will work, and read
will not set errno but return a length of 0 bytes.  So
SlruPhysicalReadPage sets SLRU_READ_FAILED and then in pg11 the %m in
SlruReportIOError expands to "Success".

The code in master is completely different (it uses pg_pread rather than
seek + read): it does test for errno and reports accordingly.

So, nothing to do here.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: LibPQ: PQresultMemorySize as proxy to transfered bytes

2023-11-22 Thread Alvaro Herrera
On 2023-Jun-28, Dominique Devienne wrote:

> And if there's a better proxy to programmatically know the network
> traffic exchanged on the connection's socket, that's cross-platform?
> Obviously
> libpq itself knows, but I don't see any way to access that info.
> 
> Perhaps tracing might? But will that incur overhead?

Maybe you can use PQtrace()[1], yeah.  It will indicate, for each message
transmitted, its size in bytes.  Naturally there will be some overhead
in writing the trace file.  Also, the format of the output file has some
issues[2] that may make it difficult to machine-parse.  But it might be
good enough for you ... or you might find yourself motivated to fix
these problems.

[1] https://www.postgresql.org/docs/current/libpq-control.html#LIBPQ-PQTRACE
[2] https://www.postgr.es/m/CAFCRh-8OPoe%3D0j9v4wth7qU-x4jvjVc8DoOLV66qdnC9Do0ymw%40mail.gmail.com

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Re: New addition to the merge sql standard

2023-11-16 Thread Alvaro Herrera
On 2023-Nov-16, Nick DeCoursin wrote:

> In my opinion, it would be better for merge to offer the functionality to
> simply ignore the rows that cause unique violation exceptions instead of
> tanking the whole query.

"ignore" may not be what you want, though.  Perhaps the fact that insert
(coming from the NOT MATCHED clause) fails (== conflicts with a tuple
concurrently inserted in a unique or exclusion constraint) should
transform the row operation into a MATCHED case, so it'd fire the other
clauses in the overall MERGE operation.  Then you could add a WHEN
MATCHED DO NOTHING case which does the ignoring that you want; or just
let them be handled by WHEN MATCHED UPDATE or whatever.  But you may
need some way to distinguish rows that appeared concurrently from rows
that were there all along.

Regarding the SQL standard, I hope what you're saying is merely not
documented by them.  If it indeed isn't, it may be possible to get them
to accept some new behavior, and then I'm sure we'd consider
implementing it.  If your suggestion goes against what they already
have, I'm afraid you'd be doomed.  So the next question is, how do other
implementations handle this case you're talking about?  SQL Server, DB2
and Oracle being the relevant ones.

Assuming the idea is good and there are no conflicts, then maybe it's
just lack of round tuits.

Happen to have some?

I vaguely recall thinking about this, and noticing that implementing
something of this sort would require messing around with the ExecInsert
interface.  It'd probably require splitting it in pieces, similar to how
ExecUpdate was split.

There are some comments in the code about possible "live-locks" where
merge would be eternally confused between inserting a new row which it
then wants to delete; or something like that.  For sure we would need to
understand the concurrent behavior of this new feature very clearly.


An interesting point is that our inserts *wait* to see whether the
concurrent insertion commits or aborts, when a unique constraint is
involved.  I'm not sure you want to have MERGE blocking on concurrent
inserts.  This is all assuming READ COMMITTED semantics; on REPEATABLE
READ or higher, I think you're just screwed, because of course MERGE is
not going to get a snapshot that sees the rows inserted by transactions
that started after.

You'd need to explore all this very carefully.
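
For completeness: the closest existing tool is not MERGE but plain
INSERT ... ON CONFLICT DO NOTHING, which already ignores rows that
would violate a unique constraint.  A sketch with hypothetical tables:

  INSERT INTO target (id, val)
  SELECT id, val FROM source
  ON CONFLICT (id) DO NOTHING;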

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: procedure string constant is parsed at procedure create time.

2023-11-07 Thread Alvaro Herrera
On 2023-Nov-07, jian he wrote:

> 2.this will have errors.
> CREATE or replace PROCEDURE template_create() LANGUAGE SQL AS $proc$
> DROP TABLE if exists test cascade;
> CREATE TABLE test(id int4range,valid_at tsrange,name text);
> INSERT INTO test VALUES (NULL, tsrange('2018-01-01',
> '2019-01-01'), 'null key');
> $proc$;

This is because of check_function_bodies.  You can get this procedure
created if you set it to 0.

However, it still won't run:

55493 16.0 1480547=# call template_create();
ERROR:  relation "test" does not exist
LÍNEA 4: INSERT INTO test VALUES (NULL, tsrange('2018-01-01',
                     ^
CONSULTA:  
DROP TABLE if exists test cascade;
CREATE TABLE test(id int4range,valid_at tsrange,name text);
INSERT INTO test VALUES (NULL, tsrange('2018-01-01',
'2019-01-01'), 'null key');

CONTEXTO:  SQL function "template_create" during startup
Duración: 0,278 ms

The problem seems to be that the procedure uses a single catalog
snapshot during execution, so the INSERT doesn't see the table that
CREATE has made.  If you create the table beforehand, then DROP and
CREATE will both work, but the INSERT will fail claiming that the table
that existed before the DROP no longer exists:

55493 16.0 1480547=# CREATE TABLE test(id int4range,valid_at tsrange,name text);
CREATE TABLE
Duración: 1,251 ms
55493 16.0 1480547=# select 'test'::regclass::oid;
  oid   
--------
 210427
(1 fila)

Duración: 0,461 ms
55493 16.0 1480547=# call template_create();
ERROR:  relation with OID 210427 does not exist
CONTEXTO:  SQL function "template_create" statement 3
Duración: 2,207 ms

Note we no longer get the "LINE 4" context item, but we do get
"statement 3", so we know it's still the insert.

(Of course, the table returns to life because the DROP is part of the
transaction that gets aborted by the failure).
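
A workaround is to write the procedure in PL/pgSQL instead: it plans
each statement only when it is first executed, so the INSERT sees the
table that the preceding CREATE made.  A sketch:

  CREATE OR REPLACE PROCEDURE template_create() LANGUAGE plpgsql AS $proc$
  BEGIN
    DROP TABLE IF EXISTS test CASCADE;
    CREATE TABLE test(id int4range, valid_at tsrange, name text);
    INSERT INTO test VALUES (NULL, tsrange('2018-01-01', '2019-01-01'), 'null key');
  END
  $proc$;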

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: PostgreSQL inheritance vs https://wiki.postgresql.org/wiki/Don%27t_Do_This

2023-10-23 Thread Alvaro Herrera
On 2023-Oct-23, Achilleas Mantzios wrote:

> I find PostgreSQL inheritance a great feature. The caveats have been the
> same for a long time, nothing has changed in that regard, but as you say,
> the implementation/limitations exist in native table partitioning as well.

For partitioning, many of the limitations have actually been fixed.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas/ desprovistas, por cierto
 de blandos atenuantes"  (Patricio Vogel)




Re: PostgreSQL inheritance vs https://wiki.postgresql.org/wiki/Don%27t_Do_This

2023-10-23 Thread Alvaro Herrera
On 2023-Oct-23, Achilleas Mantzios - cloud wrote:

> I believe this text is false on too many accounts. So, what's the consensus
> about Inheritance in PostgreSQL, I am going to give a talk on it in November
> and I wouldn't like to advertise/promote/teach something that the community
> has decided to abandon or drop. Actually I proposed several topics and they
> chose this one (Inheritance).

There are several things wrong with table inheritance; see [1] in our
docs, in addition to the things that the wiki page notes.  I think it's
good advice to stay clear of it, and if you're giving a talk, do not
talk about inheritance except to mention its drawbacks.  Regarding your
talk, I would strongly suggest to ask them to choose another topic from
your list.

[1] https://www.postgresql.org/docs/16/ddl-inherit.html#DDL-INHERIT-CAVEATS

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)




Re: Parsing libpq PQtrace files

2023-10-17 Thread Alvaro Herrera
On 2023-Oct-17, Dominique Devienne wrote:

> But then on Parse, the query text, which also contains embedded
> newlines but also embedded double-quotes, for column aliases, or table
> names, or whatever, even why not a single char literal like '"' (i.e.
> single-quote, double-quote, single-quote), seems to be double-quoted
> only. So how am I supposed to know when the SQL ends?

Yeah, I've never heard of anybody trying to machine-parse these lines,
so if those points are not addressed in the trace output, that probably
explains why.  It would probably be a good idea to change that.  This
stuff is relatively new, and most likely has thus far only been used by
visually inspecting the contents of the trace file (at least that's how
*I* have used them), and only with very simple dummy queries.

So, patches welcome, I guess.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)




Re: Can not drop partition if exist foreign keys

2023-10-11 Thread Alvaro Herrera
On 2023-Oct-11, Олег Самойлов wrote:

> There are two common practices to drop a partition from a partitioned
> table: just drop it, or detach then drop.  But a simple drop doesn't work
> if a foreign key exists.  Example script attached.

Yeah.  Detach it first, then you should be able to drop it.

> psql:test.sql:15: ERROR:  cannot drop table parent_0 because other objects depend on it
> DETAIL:  constraint children_id_fkey on table children depends on table parent_0
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.
> 
> Looked like a bug.

We tried to make DROP work, but we didn't find a way.  Patches welcome.

-- 
Álvaro Herrera




Re: Multiple inserts with two levels of foreign keys

2023-10-05 Thread Alvaro Herrera
On 2023-Oct-04, Dow Drake wrote:

> I want to insert a farm record, then insert two crops associated with that
> farm, then insert two deliveries for each of the two crops, so that in
> the end my tables look like this:

If I understand you correctly, for each table you want one CTE with the
data you want to insert, and another CTE with the data actually
inserted, which can be matched later.  Something like this should work:

with newfarms (name) as (values ('Happy Valley Farm')),
     insertedfarms (id, name) as (insert into farms (name)
                                    select newfarms.name
                                      from newfarms
                                  returning id, name),
     newcrops (farm, name) as (values ('Happy Valley Farm', 'corn'),
                                      ('Happy Valley Farm', 'wheat')),
     insertedcrops as (insert into crops (farm_id, name)
                         select (select insertedfarms.id
                                   from insertedfarms
                                  where insertedfarms.name = newcrops.farm),
                                newcrops.name
                           from newcrops
                       returning id, farm_id, name),
     newdeliveries (farm, name, ticket) as (values ('Happy Valley Farm', 'corn', '3124'),
                                                   ('Happy Valley Farm', 'wheat', '3127'),
                                                   ('Happy Valley Farm', 'corn', '3133'),
                                                   ('Happy Valley Farm', 'wheat', '3140')),
     inserteddeliveries as (insert into deliveries (crop_id, ticket)
                              select (select ics.id
                                        from insertedfarms ifs
                                        join insertedcrops ics on (ifs.id = ics.farm_id)
                                       where ifs.name = newdeliveries.farm
                                         and ics.name = newdeliveries.name),
                                     ticket
                                from newdeliveries
                            returning *)
select * from inserteddeliveries;


-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
 Are you not unsure you want to delete Firefox?
   [Not unsure] [Not not unsure][Cancel]
   http://smylers.hates-software.com/2008/01/03/566e45b2.html




Re: Calculating Days/Time(Are Loops Neccessary?)

2023-09-20 Thread Alvaro Herrera
On 2023-Sep-19, Anthony Apollis wrote:

> I have inherited this code; the problem is it is overly complicated, I
> believe. The package is going to run once a month and this code runs in a
> loop. How can this loop be running and checking data up until the last
> day, if it only runs once a month?

I didn't stop to understand your problem fully, but if you need the set
of calendar days in one month, you can use the generate_series()
function around some arithmetic that gives you the start of the month
plus when it ends, something like this:

with onedate(theday) as (values (date '2023-02-03'))
select g::date
  from onedate,
       generate_series(date_trunc('month', onedate.theday),
                       date_trunc('month', onedate.theday) + interval '1 month' - interval '1 day',
                       '1 day') g;

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El sabio habla porque tiene algo que decir;
el tonto, porque tiene que decir algo" (Platon).




Re: rollback to savepoint issue

2023-09-05 Thread Alvaro Herrera
On 2023-Sep-04, Erik Wienhold wrote:

> On 04/09/2023 16:56 CEST David G. Johnston wrote:
> 
> > On Monday, September 4, 2023, Erik Wienhold wrote:
> >
> > > On 04/09/2023 11:51 CEST Lorusso Domenico wrote:
> > >
> > > > The original code is embedded in a function, but the problem is the same:
> > >
> > > Transaction control is not possible in functions. Only in procedures (CALL)
> > > and DO blocks.
> >
> > Then explain why the original savepoint command wasn't a syntax, or runtime,
> > error?
> 
> I don't need to because CREATE {FUNCTION | PROCEDURE} already fails because of
> ROLLBACK TO SAVEPOINT.  And without a function to execute there can't be any
> runtime error because of SAVEPOINT.  My point was about transaction control in
> plpgsql in general.

You can create the function nonetheless if you set check_function_bodies
to OFF beforehand.  Then it's a "syntax" error detected at runtime.

55432 17devel 1274266=# create function quarrel() returns void language plpgsql as $$ begin savepoint foo; insert into foo default values; rollback to savepoint foo; end $$;
ERROR:  syntax error at or near "to"
LINE 1: ...int foo; insert into foo default values; rollback to savepoi...
                                                             ^

55432 17devel 1274266=# set check_function_bodies to 0;
SET

55432 17devel 1274266=# create function quarrel() returns void language plpgsql as $$ begin savepoint foo; insert into foo default values; rollback to savepoint foo; end $$;
CREATE FUNCTION

55432 17devel 1274266=# select quarrel();
ERROR:  syntax error at or near "to"
LINE 1: ...int foo; insert into foo default values; rollback to savepoi...
                                                             ^
QUERY:  begin savepoint foo; insert into foo default values; rollback to savepoint foo; end
CONTEXT:  compilation of PL/pgSQL function "quarrel" near line 1


But it gets worse.  If you create a procedure (no longer a function)
with a SAVEPOINT and ROLLBACK TO SAVEPOINT, it'll complain about the
ROLLBACK TO command, but not about SAVEPOINT; and if you remove that,
then it'll fail at runtime saying that SAVEPOINT is unsupported.

So it's not super consistent about how it reports these various problems.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El sudor es la mejor cura para un pensamiento enfermo" (Bardia)




Re: Schema renaming cascade

2023-08-17 Thread Alvaro Herrera
On 2023-Aug-17, Lorusso Domenico wrote:

> Hello guys,
> I need to rename a schema, including each reference to it (also for
> functions)

Maybe you should consider removing schema name references in function
source code, and instead refer to the tables (and other functions, etc)
by their unqualified names, and at the same time attach a "SET
search_path" clause to the function.  That way, when you need to rename
the schema you don't need to change the function source code, just
update its "SET search_path" clause.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Completely Removing PostgreSQL

2023-07-31 Thread Alvaro Herrera
On 2023-Jul-31, Amn Ojee Uw wrote:

> In my Debian 12, I have removed the following apps from my system by using
> the following commands:

> dpkg -l | grep postgres
> rc  postgresql-12             12.15-1.pgdg120+1  amd64  The World's Most Advanced Open Source Relational Database
> rc  postgresql-15             15.3-1.pgdg120+1   amd64  The World's Most Advanced Open Source Relational Database
> rc  postgresql-client-common  250.pgdg120+1      all    manager for multiple PostgreSQL client versions
> rc  postgresql-common         250.pgdg120+1      all    PostgreSQL database-cluster manager
> rc  postgresql-pljava-common  1.6.4-2.pgdg120+1  all    Java procedural language for PostgreSQL, config files

This is lack of Debian knowledge, not Postgres' fault in any way.  You
just need to "apt purge" these packages, as "rc" status for a package in
apt means "desired state=remove, status=conf-files", i.e. the residual is
just the config files, which apt doesn't remove in case you want to
reinstall the package later.  You would have known this if you had
looked at the dpkg -l header lines:

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version        Architecture   Description
...
rc

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: \d don't print all the tables

2023-07-24 Thread Alvaro Herrera
On 2023-Jul-24, Ron wrote:

> Add namespace_a and namespace_b to your search_path.  Then it will work.
> 
> Off the top of my head:
> SET search_path = namespace_a, namespace_b, public;

Actually it won't, because the table in the earliest schema "shadows"
any other tables of the same name in later schemas, so they won't show
in \d either.
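
To see the shadowed tables anyway, qualify the name or the pattern
(table name hypothetical):

  \dt *.mytable
  \d namespace_b.mytable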

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"XML!" Exclaimed C++.  "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.
https://burningbird.net/the-parable-of-the-languages/




Re: suggestion about time based partitioning and hibernate

2023-07-19 Thread Alvaro Herrera
On 2023-Jul-18, Luca Ferrari wrote:

> Dear all,
> I'm looking for ideas here, and it could be someone already stepped
> into declarative partitioning of an existing database where Hibernate
> (a Java ORM) handles the tables.
> The situation is as follows:
> 
> create table foo( id primary key, a_date date, ... );
> 
> Now, the trivial way to partition this would be on a range based on
> a_date, so that the primary key of the tables shifts from id to (id,
> a_date).

Hmm.  If you can make partitioning give you some benefit without having
to change the model, then by all means explore it.  But if you're forced
to change the model (in this case, by adding a column to your primary
key), that is going to cause you lots of pain, and some queries might
become slower rather than faster.  Therefore I suggest to avoid doing
that.  Either look at some other partitioning scheme that doesn't
involve adding columns to the primary key, or disregard partitioning for
this table entirely.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: [Beginner Question]What's the use of ResTarget?

2023-07-09 Thread Alvaro Herrera
On 2023-Jul-02, Wen Yi wrote:

> Hi community
> When I read the Internals document (41.1. The Query Tree),
> the introduction of 'the result relation' confuses me.

There are "result relations" in commands that modify a relation, such as
insert or update.  The result relation is where the new tuples are
inserted, or where the modified tuples are.


> in my original understanding, I guess the use of ResTarget is to store
> the columns of the result, like:
> 
> SELECT row_1, row_2 FROM table;

Yeah.  Those aren't necessarily in the result relation, because (as the
doc says) a SELECT does not even *have* a result relation.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL.  This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php




Re: 15 pg_upgrade with -j

2023-05-24 Thread Alvaro Herrera
On 2023-May-23, Ron wrote:

> We'd never hardlink.  Eliminates the ability to return to the old system if
> something goes wrong.

If you'd never hardlink, then you should run your test without the -k
option.  Otherwise, the timings are meaningless.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-08 Thread Alvaro Herrera
On 2023-May-07, Thomas Munro wrote:

> Did you previously run this same workload on versions < 15 and never
> see any problem?  15 gained a new feature CREATE DATABASE ...
> STRATEGY=WAL_LOG, which is also the default.  I wonder if there is a
> bug somewhere near that, though I have no specific idea.  If you
> explicitly added STRATEGY=FILE_COPY to your CREATE DATABASE commands,
> you'll get the traditional behaviour.

Maybe it would be sensible to make STRATEGY=FILE_COPY the default
again, for branch 15, before today's release.
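
In the meantime, the explicit form Thomas mentions looks like this
(database names hypothetical):

  CREATE DATABASE newdb TEMPLATE olddb STRATEGY = FILE_COPY;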

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Re: Convert pg_constraint.conkey array to same-order array of column names

2023-03-24 Thread Alvaro Herrera
On 2023-Mar-23, Dominique Devienne wrote:

> We have a query returning 1 row per constraint column,
> which until recently we didn't realize wasn't preserving order of the
> columns.
> 
> A colleague fixed that, with something like below:
> 
> SELECT ...
> FROM pg_catalog.pg_constraint cnstr
> ...
> CROSS JOIN LATERAL UNNEST(cnstr.conkey) WITH ORDINALITY AS cols(value, rank)
>  ORDER BY cols.rank
> 
> But I'm wondering about getting 1 row per constraint instead,
> and fetching an array of column names.

IIRC you can use array_agg(unnest ORDER BY ordinality), grouping by
constraint OID or name.
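
A sketch of the aggregated form (the table name is hypothetical):

  SELECT con.conname,
         (SELECT array_agg(att.attname ORDER BY k.ord)
            FROM unnest(con.conkey) WITH ORDINALITY AS k(attnum, ord)
            JOIN pg_attribute att
              ON att.attrelid = con.conrelid AND att.attnum = k.attnum) AS columns
    FROM pg_constraint con
   WHERE con.conrelid = 'mytable'::regclass;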

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)




Re: Losing my latin on Ordering...

2023-02-14 Thread Alvaro Herrera
On 2023-Feb-14, Dominique Devienne wrote:

> Honestly, who expects the same prefix to sort differently based on what
> comes after, in left-to-right languages?

Look, we don't define the collation rules.  We just grab the collation
rules defined by experts in collations.  In this case the experts have
advised the glibc developers to write collations this way; but even if
you went further and looked at the ICU libraries, you would find that
they have pretty much the same definition.

> How does one even find out what the (capricious?) rules for sorting in a
> given collation are?

I suggest to look at a telephone book carefully sometime (provided you
can find one ... apparently nobody wants them anymore.)

> So the "C" collation is fine with general UTF-8 encoding?
> I.e. it will be codepoint ordered OK?

Sure, just make sure to use the definition of C that uses UTF-8 encoding
(I think it's typically called C.UTF-8).

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: How could I elog the tupleTableSlot to the fronted terminal?

2023-01-30 Thread Alvaro Herrera
On 2023-Jan-30, jack...@gmail.com wrote:

> For example, I use "insert into t values(1)", and I'll get a tupleTableSlot.
> 
> And now I want to get the real data, that's 1, and then use the elog() func
> to print it. Could you give me some code to realize that? Furthermore,
> what if the data type is text or another type? What do I need to change?

Maybe have a look at the 'debugtup()' function.  It doesn't do exactly
what you want, but it may inspire you to write the code you need.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
 (George Orwell's The Lord of the Rings)




Re: https://wiki.postgresql.org/wiki/Working_with_Git link one link cannot open, another link is unrelated.

2023-01-17 Thread Alvaro Herrera
On 2022-Dec-29, Melih Mutlu wrote:

> also in this link: https://wiki.postgresql.org/wiki/Working_with_Git
> >
> >> See the documentation and tutorials at http://git.or.cz/ for a more
> >> detailed Git introduction. For a more detailed lesson, check out
> >> http://progit.org and maybe get a hardcopy to help support the site.
> >>
> > the first link not working, the second link, is not that related to git.
>
> This is just bad. Needs to be updated properly.

I have fixed the outdated links now.  There are several other things in
those instructions that seem dated -- perhaps not factually wrong, but
they look odd in 2023.

(Hmm, one thing that does seem wrong is that we no longer offer
individual repos under git.postgresql.org; AFAIR we tell people to rely
on external providers such as github/gitlab for that.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)




Re: gexec from command prompt?

2023-01-12 Thread Alvaro Herrera
On 2023-Jan-12, Ron wrote:

> Postgresql 12.11
> 
> This might be more of a bash question, or it might be a psql vs engine 
> problem.
> 
> I want to run this query using psql from a bash prompt:
> select format('SELECT ''%s'', MIN(part_date) FROM %s;', table_name, table_name)
> from dba.table_structure
> order by table_name\gexec

Yeah, what I use in these cases is something like

echo "select format('SELECT ''%s'', MIN(part_date) FROM %s;', table_name, 
table_name)
from dba.table_structure
order by table_name \gexec" | psql -f-


-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)




Re: PostgreSQL extension for processing Graph queries (Apache AGE)

2022-11-30 Thread Alvaro Herrera
On 2022-Nov-29, Young Seung Andrew Ko wrote:

> Hello PostgreSQL users,
> 
> https://github.com/apache/age
> Apache AGE is an Apache 2-licensed open source PostgreSQL extension for
> storing Graph data.
> 
> The current version of Apache AGE is to enable PostgreSQL users to use
> Neo4j's openCypher-based graph queries in unison with existing relational
> tables

Can you show some examples of this feature in action?  What sort of data
would I use it for, how would I query it?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE output doubt

2022-11-17 Thread Alvaro Herrera
On 2022-Nov-17, Alvaro Herrera wrote:

> On 2022-Oct-20, Luca Ferrari wrote:
> 
> > I was expecting an output tag like "MERGE 0" since both branches have
> > "do nothing", so no tuples should be updated at all on the target
> > table.
> 
> Hmm, yeah, it should report 0 tuples, according to the documentation.
> AFAICS this patch fixes it, will push shortly.

Pushed, thanks for reporting.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."  (Brian Kernighan)




Re: MERGE output doubt

2022-11-17 Thread Alvaro Herrera
On 2022-Oct-20, Luca Ferrari wrote:

> Now, if I use  a do nothing merge:
> 
> testdb=> MERGE INTO average_scores a
> USING ( SELECT name, avg( score ) AS avg_score
>           FROM scores GROUP BY name ) s
> ON s.name = a.name
> WHEN MATCHED THEN
>    do nothing
> WHEN NOT MATCHED THEN
>    do nothing;
> MERGE 2
> 
> I was expecting an output tag like "MERGE 0" since both branches have
> "do nothing", so no tuples should be updated at all on the target
> table.

Hmm, yeah, it should report 0 tuples, according to the documentation.
AFAICS this patch fixes it, will push shortly.

> Moreover, if I truncate the target table and execute again the merge
> query, I got the result of 0:
> 
> testdb=> truncate average_scores ;
> TRUNCATE TABLE
> testdb=> MERGE INTO average_scores a
> USING ( SELECT name, avg( score ) AS avg_score
>           FROM scores GROUP BY name ) s
> ON s.name = a.name
> WHEN MATCHED THEN        -- matched case
>    do nothing
> WHEN NOT MATCHED THEN
>    do nothing;
> MERGE 0
> 
> What am I missing here?

Hmm, is this not what you were expecting?  Or are you saying that it
should have been affected by the same bug?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
From 4282eadc0af3061dc53a5bc1ffcdd51b03cc28c4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Thu, 17 Nov 2022 11:58:42 +0100
Subject: [PATCH] fix MERGE tuple count with DO NOTHING

---
 src/backend/executor/nodeModifyTable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d8fd3cfdbe..fe2b8e4bac 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2805,7 +2805,7 @@ lmerge_matched:
 		{
 			case TM_Ok:
 				/* all good; perform final actions */
-				if (canSetTag)
+				if (canSetTag && commandType != CMD_NOTHING)
 					(estate->es_processed)++;
 
 				break;
-- 
2.30.2



Re: possible bug

2022-10-24 Thread Alvaro Herrera
On 2022-Oct-21, Tom Lane wrote:

> "David G. Johnston"  writes:
> > On Fri, Oct 21, 2022 at 4:52 PM Ravi Krishna  wrote:
> >> on a diff note, is the word memoize inspired from Perl Module memoize
> >> which use to do the same thing.
> 
> > It is a general functional programming concept - not sure on the history
> > but probably academic and thus Perl and others picked it up "from the
> > source".
> 
> Looks to me like you suggested our use of the terminology:
> 
> https://www.postgresql.org/message-id/flat/CAKFQuwZQmCNyS_Vv2Jf3TNe7wRTiptWNs7xkgU%3DAEdqthkQe9A%40mail.gmail.com#bbcd739c97e28b17ef2e111be8cf214b

The word itself has been used in the PG lists before.  For example,
Greg Stark used it in 2008 when discussing WITH RECURSIVE
https://postgr.es/m/47c151f1.8050...@enterprisedb.com
There are a few other hits over the years.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can."   (Ken Rockwell)




Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Alvaro Herrera
On 2022-Oct-19, Dominique Devienne wrote:

> Anybody has an answer to my question regarding how substr() works on
> bytea values?  I.e. is it "pushed down" / optimized enough that it
> avoids reading the whole N-byte value, to then pass it to substr(),
> which then returns an M-byte value (where M < N)?
>
> If TOAST stores 2,000 chunks, and those chunks' PKs are numbers,
> simple arithmetic should be able to select only the chunks of
> interest, thus incurring only the necessary IO for the selected
> range, no?

That's exactly what I was trying to say.  If there's no compression, we
don't read prior chunks.  (This is valid for bytea, at least; for
textual types we have to worry about multibyte characters, which are
again a potential source of confusion regarding the exact location you
want to seek.)

This can be seen in detoast_attr_slice() in
src/backend/access/common/detoast.c, though there are way too many^W^W^W
multiple layers of indirection if you start from bytea_substr() in
varlena.c.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Alvaro Herrera
On 2022-Oct-19, Dominique Devienne wrote:

> Upfront, I have to state that I'm not keen on lo, because of security
> considerations. We store blobs in many different schemas, and users
> can access some schemas, and not others. So the fact the lo table is
> unique for the whole database would allow users to see blobs from any
> schema, as I understand it. Right? OTOH, lo has random access, which I
> also need...

Generally speaking, bytea sucks for random access, because if a TOAST
item is compressed, it has to be always read from the beginning in order
to decompress correctly.  However, if you set
ALTER TABLE tab ALTER COLUMN byt SET STORAGE EXTERNAL
then compression is not used, and random access becomes fast.

https://www.postgresql.org/docs/15/sql-altertable.html
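
With that in place, a slice read fetches only the TOAST chunks it
needs; a sketch (table and column names hypothetical):

  SELECT substr(byt, 1000000, 4096) FROM tab WHERE id = 1;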

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No hay hombre que no aspire a la plenitud, es decir,
la suma de experiencias de que un hombre es capaz"




Re: Weird planner issue on a standby

2022-10-11 Thread Alvaro Herrera
On 2022-Oct-11, Tom Lane wrote:

> Are there any tables in this query where extremal values of the join
> key are likely to be in recently-added or recently-dead rows?  Does
> VACUUM'ing on the primary help?

I remember having a hypothesis, upon getting a report of this exact
problem on a customer system once, that it could be due to killtuple not
propagating to standbys except by FPIs.  I do not remember if we proved
that true or not.  I do not remember observing that tables were being
read, however.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: fully qualified domain names and .pgpass

2022-10-04 Thread Alvaro Herrera
On 2022-Oct-04, Ron wrote:

> Sometimes (both interactively and via script) I access a remote Pg server
> via just the bare host name "foobar", and other times via the FQDN
> "foobar.example.com".
> 
> I've only been able to get this to work by having two lines in the .pgpass 
> file:

Maybe it would be simpler to do this using a service definition in the
~/.pg_service.conf file.
https://www.postgresql.org/docs/current/libpq-pgservice.html
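
A sketch of such an entry (values made up):

  # ~/.pg_service.conf
  [foobar]
  host=foobar.example.com
  dbname=mydb
  user=me

Then psql "service=foobar" (or a service=foobar connection string from
any libpq client) resolves the same host, and .pgpass needs only one
line.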

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: is there still a memory leak with hash joins in PG 12.11 ?

2022-09-28 Thread Alvaro Herrera
On 2022-Sep-28, Zwettler Markus (OIZ) wrote:

> I found this blog post talking about a memory leak with hash joins due to
> a larger work_mem.
> https://gist.github.com/luhn/2b35a9b31255e3a6a2e6a06d1213dfc9

Oh dear, is that what passes for a blog post these days?

> Does anyone know if this problem is still there in PG 12.11 ?

It wasn't there in 9.5, so it seems safe to believe it isn't in 12.11.

> We are running on PG 12.11.
> We noticed ever-growing memory usage.
> 
> work_mem = 50M
> there are some regularly hash joins.

I think it's more likely that this is a new, different bug (assuming it
is a bug at all).  But you'd have to provide more details in order to
track it down.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)




Re: Missing query plan for auto_explain.

2022-08-30 Thread Alvaro Herrera
On 2022-Aug-30, Matheus Martin wrote:

> Good idea on using an actual prepared statement but unfortunately it didn't
> produce any different result.

I should have also mentioned trying the EXPLAIN EXECUTE six times and
see if the last one produces a different plan.  That's when it switches
from planning every time to planning with generic arguments, as I
recall.

> Could you please elaborate a bit on your advice concerning
> ExecutorEnd/PortalCleanup? I am afraid it doesn't mean much to me.

That wasn't advice actually, just a note that the code might be doing
that thing wrong, causing auto_explain to miss it.  This is an unproven
hypothesis that is likely to be all wrong.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I'm impressed how quickly you are fixing this obscure issue. I came from 
MS SQL and it would be hard for me to put into words how much of a better job
you all are doing on [PostgreSQL]."
 Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg0.php




Re: Missing query plan for auto_explain.

2022-08-30 Thread Alvaro Herrera
On 2022-Aug-30, Matheus Martin wrote:

> Our Postgres recently started reporting considerably different
> execution times for the same query. When executed from our JDBC
> application the Postgres logs report an average execution time of 1500
> ms but when the query is manually executed through `psql` it doesn't
> take longer than 50 ms.

I don't know why the plan is not saved by auto_explain (maybe we're
missing ExecutorEnd calls somewhere?  that would be strange), but one
frequent reason for queries to show different plan in JDBC than psql is
the use of prepared statements.  Did you try using "PREPARE
yourquery(...)" and then EXPLAIN EXECUTE(...)?  Sometimes that helps to
recreate the original problem.
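
A sketch, with a made-up query:

  PREPARE q(int) AS SELECT * FROM orders WHERE customer_id = $1;
  EXPLAIN ANALYZE EXECUTE q(42);

Repeating the EXPLAIN EXECUTE is also interesting: after the fifth
execution the plancache may switch to a generic plan, which is often
the plan the JDBC path ends up with.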


(Apparently, ExecutorEnd is called from PortalCleanup; what happens with
the portal for an extended-protocol query?)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Support for dates before 4713 BC

2022-08-22 Thread Alvaro Herrera
Hello Stefan, Alexander,

On 2022-Aug-22, stefan eichert wrote:

> I can also fully support what Alex has written. I am an archaeologist at
> the Natural History Museum Vienna and PostgreSQL is a perfect Open Source
> software and we really love working with it for our archaeological and
> (pre)historical projects.
> [...]

This made me remember the stuff genealogists use -- a complex datatype
that encodes not just some number on a specific calendar, but also a
"quality" marker (precise, calculated, estimated) as well as a calendar
identifier (Gregorian, Islamic, Hebrew etc).  For example, GRAMPS has
this:
https://github.com/gramps-project/gramps/blob/master/gramps/gen/lib/date.py#L531
I'm not familiar with the details, but skimming through that source
might give you some ideas.

Judging from your description and Alexander's, it sounds like you should
dedicate some time to developing a specification for a complex datatype
to use with archaeological dates or what have you.  I suppose you would
want to store the type of date it is, its precision, the value itself, a
degree of fuzziness, and maybe some other properties that you know
better than I do.  Then get someone to implement an open source custom
datatype based on that spec, which everybody then uses.

Using "regular" calendars is unlikely to be a good match.

-- 
Álvaro Herrera




Re: Password reset link / 'less' does not exit in psql version 13.4

2022-07-25 Thread Alvaro Herrera
On 2022-Jul-25, Michael J. Baars wrote:

> No, it's psql. Setting PAGER to "more -e" solved the problem, but I never
> had to before. There are no other variables affecting this behavior, so it
> must be psql's internal default pager command that has changed.

Perhaps the settings are in the environment variable LESS, which sets
options for less(1).

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: lifetime of the old CTID

2022-07-06 Thread Alvaro Herrera
On 2022-Jul-06, Matthias Apitz wrote:

> This gives in the DB layer a CURSOR of say 100.000 rows of the
> 3.000.000 in the table. Now the application fetches row by row and see
> if something should be done with the row. If so, the DB layer must
> LOCK the row for update. It does so using the CTID.

This is a bad idea, for reasons already explained.  The CTID is not for
user consumption.  If it breaks, as it does for you here, it's your
fault for using it.  Your rows need to have a key that you can use.  The
CTID is not it.

> Of course there is a key in the row (d01gsi, the signature of the
> book), but this is not uniqu and can't be used to lock exactly this
> row for update.

Well, you need something.  It is beginning to sound like your database
model is wrong, because it lacks sufficient keys.
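If nothing natural exists, a surrogate key is cheap to add and gives you
something stable to lock on (a sketch; table and column names invented):

    ALTER TABLE books ADD COLUMN id bigint GENERATED ALWAYS AS IDENTITY UNIQUE;

    -- then, instead of the CTID:
    SELECT * FROM books WHERE id = 42 FOR UPDATE;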

> Interestingly, I tested today morning how long the new CTID can be
> seen with currtid2(). I did 10 or more updates of a row and the then
> new CTID could always be seen with the old CTID from the moment before
> the 10 updates. I even found no way to get this tuple broken. Only
> deletion of the row helped to make currtid2() loosing the relation.
> This is with a 14.1 server. Why the 13.1 behaves different?

There are implementation details that you shouldn't concern yourself
with.  Quite likely, the reason it stayed unchanged has nothing to do
with the server version, and is instead related to other things
happening in the server at the same time.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I suspect most samba developers are already technically insane...
Of course, since many of them are Australians, you can't tell." (L. Torvalds)




Re: ADD COLUMN ts tsvector GENERATED too slow

2022-07-06 Thread Alvaro Herrera
On 2022-Jul-06, Florents Tselai wrote:

> Actually, I monitored my disk usage and it was **definitely** working as 
> It had already eaten up an additional 30% of my disk capacity.

Adding a column like this requires creating a second copy of the table,
copying all the contents from the old table (plus the new column) into
the new one, then recreating all indexes from scratch on the new copy of
the table.  If you have a lot of indexes, this can be significant.
Also, AFAIU all data has to be uncompressed on read, then compressed
back on write.

Note: the 80 GB from pg_relation_size() does *not* include the size of
TOAST data.  You're copying a lot of additional data.  See
pg_table_size().
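For instance, to see both numbers side by side (a sketch, with a
hypothetical table name):

    SELECT pg_size_pretty(pg_relation_size('tbl')) AS heap_only,
           pg_size_pretty(pg_table_size('tbl'))    AS with_toast;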

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Hay quien adquiere la mala costumbre de ser infeliz" (M. A. Evans)




Re: parallel-processing multiple similar query tasks - any example?

2022-04-28 Thread Alvaro Herrera
On 2022-Apr-28, Shaozhong SHI wrote:

> Expand and explain please.

No, thanks.

-- 
Álvaro Herrera




Re: parallel-processing multiple similar query tasks - any example?

2022-04-28 Thread Alvaro Herrera
On 2022-Apr-28, Shaozhong SHI wrote:

> Why sleep(1)?

It is sleeping to show that they are running concurrently.  If it runs
five sleeps of one second each and the whole command lasts one second,
then all sleeps ran in parallel.  Had the whole command taken five
seconds, you would know that the queries ran serially.

> It should be all active - doing work concurrently.

They are all active simultaneously.  You just need to supply your own
query, without any sleeps.

-- 
Álvaro Herrera




Re: parallel-processing multiple similar query tasks - any example?

2022-04-28 Thread Alvaro Herrera
On 2022-Apr-28, Shaozhong SHI wrote:

> multiple similar query tasks are as follows:
> 
> select * from a_table where country ='UK'
> select * from a_table where country='France'
> and so on
> 
> How best to parallel-processing such types of multiple similar query tasks?
> 
> Any example available?

for country in UK France Germany Ireland; do
  echo "select pg_sleep(1); select * from a_table where country = 
'${country//\'/''/}'"
done | \
  xargs -d"\n" -P10 -n1 psql -X -c

Note the ${country/} stuff is a bash-ism.

-- 
Álvaro Herrera




Re: "create function... depends on extension..." not supported. Why?

2022-04-27 Thread Alvaro Herrera
On 2022-Apr-26, Tom Lane wrote:

> I suppose that "DEPENDS ON EXTENSION" was modeled after the commands
> to control extension membership, which likewise exist only in ALTER
> form because CREATE's behavior for that is hard-wired.  If you wanted
> to hand-wave a lot, you could maybe claim that ownership and extension
> membership/dependency are similar kinds of relationships and so it
> makes sense that the command structures for manipulating them are
> similar.  But TBH that would probably be reverse-engineering an
> explanation.  I think that "we didn't bother" is more nearly the
> situation.

IIRC Abhijit and I discussed this point offline, and our conclusion was
that DEPENDS ON EXTENSION was not of general enough usefulness to
warrant support for it in CREATE FUNCTION -- keeping in mind that we
would need support for it in CREATE INDEX, CREATE TRIGGER, CREATE
MATERIALIZED VIEW as well.

It's probably not that much code.  The effort of writing the code wasn't
the barrier we were optimizing for, but rather for getting the whole
*idea* accepted.

If this feature is suddenly so useful as to *require* support in the
various CREATE commands, that is most satisfying to me and I volunteer
to reviewing patches that implement it.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: support for DIN SPEC 91379 encoding

2022-03-28 Thread Alvaro Herrera
On 2022-Mar-28, Peter J. Holzer wrote:

> On 2022-03-27 14:06:25 -0400, Tom Lane wrote:

> > We follow that spec, so depending on what DIN 91379 *actually* says,
> > we might have additional reasons not to be in compliance.  I don't
> > read German unfortunately.
> 
> It defines minimal character set that IT systems which process personal
> and company names in the EU must accept. Basically Latin, Greek and
> Cyrillic letters, digits and some symbols and interpunctation.

Yeah, I had a look at the list of allowed characters and it's a
reasonably simple set.  The most complex you can find is stuff like

LATIN CAPITAL LETTER R WITH COMBINING RING BELOW AND COMBINING MACRON
LATIN CAPITAL LETTER K WITH COMBINING DOUBLE MACRON BELOW AND LATIN SMALL 
LETTER H

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)




Re: support for DIN SPEC 91379 encoding

2022-03-27 Thread Alvaro Herrera
On 2022-Mar-27, Ralf Schuchardt wrote:

> where did you read, that this DIN SPEC 91379 norm is incompatible with UTF-8?
> 
> In the document „String.Latin+ 1.2: eine kommentierte und erweiterte
> Fassung der DIN SPEC 91379. Inklusive einer umfangreichen Liste häufig
> gestellter Fragen. Herausgegeben von der Fachgruppe String.Latin“
> linked here https://www.xoev.de/downloads-2316#StringLatin it is said,
> that the spec is a strict subset of unicode (E.1.6), and it is also
> mentioned in E.1.4, that in UTF-8 all unicode characters can be
> encoded. Therefore UTF-8 can be used to encode all DIN SPEC 91379
> characters.

So the remaining question is whether DIN SPEC 91379 requires an
implementation to support character U+0000 (NUL).  If it does, then
PostgreSQL is not conformant, because that character is the only one in
Unicode that we don't support.  If U+0000 is not required, then
PostgreSQL is okay.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: PG12: Any drawback of increasing wal_keep_segments

2022-03-24 Thread Alvaro Herrera
On 2022-Mar-22, Shukla, Pranjal wrote:

> Team,
> Are there any disadvantages of increasing the “wal_keep_segments” to a
> higher number say, 500? Will it have any impact on performance of
> streaming replication, on primary or secondary servers?

No.  It just means WAL will occupy more disk space.  I've seen people go
even as high as 5000 with no issues.
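For instance, with the default 16 MB segment size, wal_keep_segments =
500 means about 500 * 16 MB = 8 GB of extra pg_wal space kept around.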

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: alter function/procedure depends on extension

2022-02-17 Thread Alvaro Herrera
On 2022-Feb-17, Bryn Llewellyn wrote:

> qu. 1. What is the use-case for "alter function/procedure depends on 
> extension"?

Suppose you have an extension that implements change auditing for
tables, which works by creating triggers on the audited tables.  You
enable auditing for specific tables by calling one of the extension's
function; this function creates a trigger on the table.

The design idea was that if you drop the auditing extension, then those
triggers should be gone too.
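For example, the extension's setup function could mark each trigger it
creates like this (names are made up):

    ALTER TRIGGER audit_trg ON audited_table
        DEPENDS ON EXTENSION my_audit_extension;

Then dropping the extension automatically drops the trigger along with it.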

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I'm always right, but sometimes I'm more right than other times."
  (Linus Torvalds)




Re: Can we go beyond the standard to make Postgres radically better?

2022-02-15 Thread Alvaro Herrera
On 2022-Feb-13, Guyren Howe wrote:

> I’m back to just having no earthly idea why anyone who finds relations
> to be a productive tool for building a model would think that SQL
> being the only means to do that is Okay.

There are aspects other than technical reasons alone why some things
live on while "better" things barely survive without thriving, or even
die.  For example, the fact that there are multiple companies
furthering the development of the SQL language means that there's a
group of engineers working to introduce improvements to the language
after extensive discussions and exposure to requests from users.  If
Postgres decided to "fork" and go solitary with its own query language,
it would by itself have to produce all the language improvements, or be
left behind by the other products.  And it takes a lot of effort to
produce those improvements.  Have you looked at how SQL changed from one
version to the next?

Another aspect is inertia.  The amount of software products that rely on
SQL is just too high.  Suppose you came up with a technically-better
language that has all the current capabilities of SQL.  Do you think a
majority of those products would immediately switch to the new language?
My guess is no, they wouldn't, because the safe bet is that SQL will
continue to work in 10, 20 years, while there is no certainty at all
that your new language would.  So by ditching SQL, Postgres would no
longer be a database of choice for those products.

So, while SQL may not be the greatest possible relational language
possible, there are very good reasons for it to continue to be the
language of choice.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Undetected Deadlock

2022-01-25 Thread Alvaro Herrera
On 2022-Jan-25, Michael Harris wrote:

> We've recently updated our application to PG 14.1, and in the test instance we
> have started to see some alarming undetected deadlocks.

This is indeed suspicious / worrisome / curious.

What version were you using previously?

I reformatted the result sets:

> An example of what we have seen is (the all-empty columns -- page, tuple,
> virtualxid, transactionid, classid, objid, objsubid -- omitted):
> 
>  locktype | database |  relation  | virtualtransaction |   pid   |        mode         | granted | fastpath |           waitstart           |      relation
> ----------+----------+------------+--------------------+---------+---------------------+---------+----------+-------------------------------+---------------------
>  relation |   529986 | 1842228045 | 165/1941408        | 2130531 | AccessShareLock     | f       | f        | 2022-01-19 00:32:32.626152+01 | st.ctr_table_efr_oa
>  relation |   529986 | 1842228045 | 75/2193719         | 2128603 | AccessExclusiveLock | t       | f        |                               | st.ctr_table_efr_oa
>  relation |   529986 | 1842231489 | 75/2193719         | 2128603 | AccessExclusiveLock | f       | f        | 2022-01-19 00:32:32.924694+01 | st.tpd_oa
>  relation |   529986 | 1842231489 | 165/1941408        | 2130531 | AccessShareLock     | t       | f        |                               | st.tpd_oa
> 
> So:
>   pid 2130531 waits for an AccessShareLock on relation 1842228045, blocked by pid 2128603 which holds an AccessExclusiveLock
>   pid 2128603 waits for an AccessExclusiveLock on relation 1842231489, blocked by pid 2130531 which holds an AccessShareLock
> 
> The queries being executed by these backends are:
> 
>    pid   |          query_start          |         state_change          | wait_event_type | wait_event | state  |                            query
> ---------+-------------------------------+-------------------------------+-----------------+------------+--------+--------------------------------------------------------------
>  2128603 | 2022-01-19 00:32:32.924413+01 | 2022-01-19 00:32:32.924413+01 | Lock            | relation   | active | DROP TABLE st.tpd_oa_18929
>  2130531 | 2022-01-19 00:32:32.625706+01 | 2022-01-19 00:32:32.625708+01 | Lock            | relation   | active | DELETE FROM st.ctr_table_efr_oa_19010 WHERE ropid = 44788868
> (2 rows)

I know of no cases in which we fail to detect a deadlock.  Perhaps you
have indeed hit a bug.

> Note that there were a lot of other processes also waiting on relation
> 1842231489 - could that be confusing the deadlock detection routine?

It shouldn't.

> I am also confused about the locks which are being taken out by the
> DELETE query.

Maybe the lock is already taken before the DELETE is run; do you have
any triggers, rules, constraints, or anything?  If you have seen this
several times already, maybe a way to investigate deeper is an
exhaustive log capture of everything that these transactions do, from
the point they begin until they become blocked (log_statement=all).

Perhaps you need to involve other concurrent transactions in order to
cause the problem.

-- 
Álvaro Herrera   

Re: DROP OWNED BY fails with #53200: ERROR: out of shared memory

2022-01-10 Thread Alvaro Herrera
On 2022-Jan-10, Dominique Devienne wrote:

> Given max_locks_per_transaction * (max_connections + max_prepared_transactions)
> from https://www.postgresql.org/docs/current/runtime-config-locks.html,
> and max_conn being 100, that's not many locks.

6400 locks, to be precise.  So if your schemas have on average 10 tables
each with 3 indexes per table, you could drop at most 160 schemas in one
go (but only if you're lucky.)

> Given there's only 64 locks per conn by default, how can this work with
> over 100 tables?
> I'm confused... --DD

That value indicates the maximum number of locks that can be taken
across all sessions at a time.  You can have a single session take that
number of locks, or all sessions take 64 locks each.

If you really have many more relations that need to be dropped, you
could try to issue "DROP SCHEMA...CASCADE" for each schema to drop.
It's a lot less convenient than DROP OWNED BY, but it doesn't require to
take as many locks simultaneously.
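To estimate how many schemas fit in one transaction, you could first
count the lockable relations per schema (a sketch; each table, index,
sequence and so on needs one lock when dropped):

    SELECT n.nspname, count(*) AS rels_to_lock
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
     GROUP BY 1
     ORDER BY 2 DESC;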

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Los trabajadores menos efectivos son sistematicamente llevados al lugar
donde pueden hacer el menor daño posible: gerencia."  (El principio Dilbert)




Re: Match 2 words and more

2021-11-27 Thread Alvaro Herrera
On 2021-Nov-28, Shaozhong SHI wrote:

> this is supposed to find those to have 2 words and more.
> 
> select name FROM a_table where "STREET_NAME" ~ '^[[:alpha:]+ ]+[:alpha:]+$';
> 
> But, it finds only one word as well.

How about something like this?

'^([[:<:]][[:alpha:]]+[[:>:]]( |$)){2}$'

You have:
- the ^ is a constraint that matches start of string
- you have a ( ... ){2}$ construct which means "match exactly twice" and
  then match end-of-string
- Inside the parens of that construct, you match:
  - [[:<:]] which means start-of-word
  - [[:alpha:]]+ which means "a non-empty set of alphabetical chars"
  - [[:>:]] which means end-of-word
  - ( |$) for "either a space or end-of-string"

You can perhaps simplify by removing the [[:<:]] and [[:>:]]
constraints, so '^([[:alpha:]]+( |$)){2}$'

To mean "between two and four", change the {2} to {2,4}.  If you want
"two or more", try {2,}.

You could change the ( |$) to ([[:space:]]+|$) in order to accept more
than one space between words, or combinations of spaces and tabs and
newlines and so on.

With a decent set of data, you could probably notice some other problems
in this regexp, but at least it should be a decent start.
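You can try candidate patterns against a few sample strings directly,
e.g. (a sketch; only the one-word string should come out false):

    SELECT s, s ~ '^([[:alpha:]]+( |$)){2,}$' AS two_plus_words
      FROM (VALUES ('High'), ('High Street'), ('Old High Street')) v(s);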

> It appears that regex is not robust.

Nah.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: Merge into does not work

2021-11-26 Thread Alvaro Herrera
On 2021-Nov-26, Adrian Klaver wrote:

> On 11/26/21 11:44, Alvaro Herrera wrote:
> > On 2021-Nov-26, Shaozhong SHI wrote:
> > 
> > I am using the MERGE patch I posted here[1], on top of Postgres 15.
> > 
> > https://postgr.es/m/20252245.byerxxac444d@alvherre.pgsql
> 
> A patch that as AFAIK is not even committed to what is at best an alpha
> version would in my opinion not qualify as ready much less supported. I look
> forward to seeing it make it in, but I would hardly recommend it for general
> use.

All true.  I'm recruiting testers.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)




Re: Merge into does not work

2021-11-26 Thread Alvaro Herrera
On 2021-Nov-26, Shaozhong SHI wrote:

> MERGE INTO Stock USING Buy ON Stock.item_id = Buy.item_id
>  WHEN MATCHED THEN UPDATE SET balance = balance + Buy.volume
>  WHEN NOT MATCHED THEN INSERT VALUES (Buy.item_id, Buy.volume);

It does work for me:

55479 15devel 680346=# MERGE INTO Stock USING Buy ON Stock.item_id = Buy.item_id
 WHEN MATCHED THEN UPDATE SET balance = balance + Buy.volume
 WHEN NOT MATCHED THEN INSERT VALUES (Buy.item_id, Buy.volume);
MERGE 2
Time: 3.879 ms
55479 15devel 680346=# select * from stock;
 item_id │ balance 
─────────┼─────────
      20 │    1900
      10 │    3200
      30 │     300
(3 rows)

> I am using Postgres 9.6.

I am using the MERGE patch I posted here[1], on top of Postgres 15.

https://postgr.es/m/20252245.byerxxac444d@alvherre.pgsql

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)




Re: PostgreSQL 14: pg_dump / pg_restore error: could not write to the communication channel: Broken pipe

2021-10-15 Thread Alvaro Herrera
On 2021-Oct-15, Nick Renders wrote:

> Hello,
> 
> I have been trying to import a Postgres 11 database into Postgres 14, but
> the pg_restore command exits with the following message:
> 
>   pg_restore: error: could not write to the communication channel: Broken
> pipe
> 
> The command I sent looks like this:
> 
>   /Library/PostgreSQL/14/bin/pg_restore -h localhost -p 48100 -U postgres 
> -w
> -C -d template1 -j 24 -Fd /Volumes/Migration/dbname --verbose
> 
> It seems that the multiple jobs parameter is the cause. If I specify "-j 1",
> the command works without problems. If I specify "-j 2" or higher, I get the
> above error after a few seconds.

Hi,

Yeah, pg_dump in parallel mode uses a pipe to communicate between leader
and workers; the error you see is what happens when a write to the pipe
fails.  It sounds to me like something in the operating system is
preventing the pipes from working properly.  I don't know anything about
macOS so I can't help you with that.  I can tell you however that this
error has not been reported previously.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)




Re: Postgres PANIC when it could not open file in pg_logical/snapshots directory

2021-10-04 Thread Alvaro Herrera
On 2021-Jun-22, Mike Yeap wrote:

> I have a Postgres version 11.11 configured with both physical replication
> slots (for repmgr) as well as some logical replication slots (for AWS
> Database Migration Service (DMS)). This morning, the server went panic with
> the following messages found in the log file:
> 
> 2021-06-22 04:56:35.314 +08 [PID=19457 application="[unknown]"
> user_name=dms database=** host(port)=**(48360)] PANIC:  could not open file
> "pg_logical/snapshots/969-FD606138.snap": Operation not permitted

Hmm, isn't this strange?  open(2) is not documented to return EPERM,
which is what this error string maps to.

You should definitely not mess with the PG data dir; if you do, it's
your problem when things break.  In this case you may not be messing
with the data dir yourself, but perhaps your kernel is buggy or you have
some security module that prevents the operation from working properly,
or something.
 
-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: datfrozenxid not dropping after vacuum

2021-09-02 Thread Alvaro Herrera
On 2021-Sep-01, Matthew Tice wrote:

> Hi Alvaro, thanks for the quick reply.

Hi. Glad to hear that your problem is now behind you.

> I'm scheduled to do my patching maintenance at the end of this month -
> but at this point I don't think I'm going to make it.
> 
> Other than patching, is there a work around?

Hm, in my previous reply I had written a suggestion to vacuum
pg_database in the offending database after deleting the
pg_internal.init file, but evidently I edited it out before sending.
(Probably because I wasn't sure if you need to delete file, connect,
vacuum, or rather connect, delete file, vacuum.)

> For example, in #2 above:
> >The fix for 2) is simpler,
> >simply always remove both the shared and local init files.
> 
> I'm not familiar with the differences between 'shared' and 'local'
> init files (I'd imagine I referenced a 'local' file in my original
> post)?

The global file is in the global/ subdirectory of the data directory,
and the "local" ones are each in the corresponding database directory:

cd $PGDATA
$ find . -name pg_internal.init
./base/12758/pg_internal.init
./base/46212/pg_internal.init
./global/pg_internal.init

$ psql -c "select oid, datname from pg_database"
  oid  |  datname   
-------+------------
 12757 | postgres
 12758 | alvherre
     1 | template1
 12756 | template0
 46212 | regression
(5 rows)

So in the above there are cache files for databases regression and
alvherre, plus the global one.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)




Re: datfrozenxid not dropping after vacuum

2021-09-01 Thread Alvaro Herrera
On 2021-Sep-01, Matthew Tice wrote:

[ problem table is pg_database ]

> My primary, read/write database is Postgresql 10.4 (CentOS 7) while my
> standby databases have been patched to 10.17.

Hmm, I think there was a bug in the early 10.x versions where advancing
the xid age of shared tables would not work correctly for some reason ...
Ah yes, this was fixed in 10.5, a mere three years ago:

Author: Andres Freund 
Branch: master Release: REL_11_BR [a54e1f158] 2018-06-12 11:13:21 -0700
Branch: REL_10_STABLE Release: REL_10_5 [2ce64caaf] 2018-06-12 11:13:21 -0700
Branch: REL9_6_STABLE Release: REL9_6_10 [6a46aba1c] 2018-06-12 11:13:21 -0700
Branch: REL9_5_STABLE Release: REL9_5_14 [14b3ec6f3] 2018-06-12 11:13:21 -0700
Branch: REL9_4_STABLE Release: REL9_4_19 [817f9f9a8] 2018-06-12 11:13:22 -0700
Branch: REL9_3_STABLE Release: REL9_3_24 [9b9b622b2] 2018-06-12 11:13:22 -0700

Fix bugs in vacuum of shared rels, by keeping their relcache entries 
current.

When vacuum processes a relation it uses the corresponding relcache
entry's relfrozenxid / relminmxid as a cutoff for when to remove
tuples etc. Unfortunately for nailed relations (i.e. critical system
catalogs) bugs could frequently lead to the corresponding relcache
entry being stale.

This set of bugs could cause actual data corruption as vacuum would
potentially not remove the correct row versions, potentially reviving
them at a later point.  After 699bf7d05c some corruptions in this vein
were prevented, but the additional error checks could also trigger
spuriously. Examples of such errors are:
  ERROR: found xmin ... from before relfrozenxid ...
and
  ERROR: found multixact ... from before relminmxid ...
To be caused by this bug the errors have to occur on system catalog
tables.

The two bugs are:

1) Invalidations for nailed relations were ignored, based on the
   theory that the relcache entry for such tables doesn't
   change. Which is largely true, except for fields like relfrozenxid
   etc.  This means that changes to relations vacuumed in other
   sessions weren't picked up by already existing sessions.  Luckily
   autovacuum doesn't have particularly longrunning sessions.

2) For shared *and* nailed relations, the shared relcache init file
   was never invalidated while running.  That means that for such
   tables (e.g. pg_authid, pg_database) it's not just already existing
   sessions that are affected, but even new connections are as well.
   That explains why the reports usually were about pg_authid et. al.

To fix 1), revalidate the rd_rel portion of a relcache entry when
invalid. This implies a bit of extra complexity to deal with
bootstrapping, but it's not too bad.  The fix for 2) is simpler,
simply always remove both the shared and local init files.

Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion:
https://postgr.es/m/20180525203736.crkbg36muzxrj...@alap3.anarazel.de

https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bpgg+_gdmxe25tvuy4s...@mail.gmail.com

https://postgr.es/m/cakmfjucqbuodrfxpdx39wha3vjyxwerg_zdvxzncr6+5wog...@mail.gmail.com

https://postgr.es/m/cagewt-ujgpmlq09gxcufmzazsgjc98vxhefbf-tppb0fb13...@mail.gmail.com
Backpatch: 9.3-


-- 
Álvaro Herrera   39°49'30"S 73°17'W
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)




Re: Can we get rid of repeated queries from pg_dump?

2021-08-29 Thread Alvaro Herrera
On 2021-Aug-28, Tom Lane wrote:

> Here is a second patch, quite independent of the first one, that
> gets rid of some other repetitive queries.

Another pointlessly repetitive query is in getTriggers, which we run
once per table to be dumped containing triggers.  We could reduce that
by running it in bulk for many relations at a time.  I suppose it's
normally not hurtful, but as we grow the number of partitions we allow
it's going to become a problem.

No patch from me for now — if someone wants to volunteer one, it looks
simple enough ...

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Getting pg_stat_database data takes significant time

2021-08-12 Thread Alvaro Herrera
On 2021-Aug-11, hubert depesz lubaczewski wrote:

> On Wed, Aug 11, 2021 at 10:16:13AM -0400, Alvaro Herrera wrote:
> > 1. this depends on reading the stats file; that's done once per
> > transaction.  So if you run the query twice in a transaction, the second
> > time will take less time.  You can know how much time is spent reading
> > that file by subtracting both times.
> 
> Yeah. I noticed. Looks like loading the stats file is the culprit. But
> does that mean that the whole stats file has to be read at once? I just
> need stats on db, not on relations?

As I recall there is one file per database containing everything
pertaining to that database, and you cannot read it partially.

Maybe you can use stats_temp_directory to put these files in faster
or less busy storage -- a RAM disk perhaps?

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793=4647152)




Re: Getting pg_stat_database data takes significant time

2021-08-11 Thread Alvaro Herrera
Two things,

1. this depends on reading the stats file; that's done once per
transaction.  So if you run the query twice in a transaction, the second
time will take less time.  You can know how much time is spent reading
that file by subtracting both times.
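For instance, in psql with \timing on (the second SELECT is served from
the snapshot taken by the first):

    BEGIN;
    SELECT count(*) FROM pg_stat_database;  -- pays the stats-file read
    SELECT count(*) FROM pg_stat_database;  -- doesn't; compare the timings
    COMMIT;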

2. EXPLAIN (VERBOSE) will tell you which functions are being called by
the query.  One of those loops across all live backends.  Is that
significant?  You could measure by creating an identical view but
omitting pg_stat_db_numbackends.  Does it take the same time as the
view?  If not, then you know that looping around all live backends is
slow.

If the problem is (1) then you could have less tables, so that the file
is smaller and thus faster to read, but I don't think you'll like that
answer; and if the problem is (2) then you could reduce max_connections,
but I don't think you'll like that either.

I suspect there's not much you can do, other than patch the monitoring
system to not read that view as often.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"




Re: PostgreSQL reference coffee mug

2021-07-28 Thread Alvaro Herrera
On 2021-Jul-28, Adrian Klaver wrote:

> On 7/28/21 8:01 AM, Georg H. wrote:

> To add to above:
> 
> \set\s allone -> alone

Actually that's wrong, because \s prints history, not variables.  This
needs to read
  \set   alone ...

I'm *not* going to get into this, because I know I'm capable of spending
days on it.  I would definitely look into using different font sizes and
colors and make it much denser, for one thing.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: RDS Proxy war stories?

2021-07-27 Thread Alvaro Herrera
On 2021-Jul-27, Quinn David Weaver wrote:

> Hi,
> 
> Does anyone here have experience to share regarding Amazon's RDS
> Proxy? Subjective or quantitative, positive or negative, anything you
> have is great.

I know one instance of its use.  The RDS instance it serves is very
large (it used the largest instance that was available in RDS, but the
user managed to shrink it to smaller ones as we optimized things) and
the concurrency is pretty high.  It worked well.  No stability
complaints.

-- 
Álvaro Herrera




Re: ERROR: cannot freeze committed xmax

2021-07-15 Thread Alvaro Herrera
One thing I forgot is that these XIDs are fairly old, perhaps dating
back to when this database was freshly initdb'd if there has been no XID
wraparound.  In that case you were probably running a version much older
than 10.14 when they were written.  Do you happen to know when you
initdb'd this, with what version, and when you upgraded it to 10.14?
That may help search the commit log for bugfixes that might explain the
bug.  I just remembered this one as my favorite candidate:

Author: Alvaro Herrera 
Branch: master Release: REL_11_BR [d2599ecfc] 2018-05-04 18:24:45 -0300
Branch: REL_10_STABLE Release: REL_10_4 [e1d634758] 2018-05-04 18:23:58 -0300
Branch: REL9_6_STABLE Release: REL9_6_9 [3a11485a5] 2018-05-04 18:23:30 -0300

Don't mark pages all-visible spuriously

Dan Wood diagnosed a long-standing problem that pages containing tuples
that are locked by multixacts containing live lockers may spuriously end
up as candidates for getting their all-visible flag set.  This has the
long-term effect that multixacts remain unfrozen; this may previously
pass undetected, but since commit XYZ it would be reported as
  "ERROR: found multixact 134100944 from before relminmxid 192042633"
because when a later vacuum tries to freeze the page it detects that a
multixact that should have gotten frozen, wasn't.

Dan proposed a (correct) patch that simply sets a variable to its
correct value, after a bogus initialization.  But, per discussion, it
seems better coding to avoid the bogus initializations altogether, since
they could give rise to more bugs later.  Therefore this fix rewrites
the logic a little bit to avoid depending on the bogus initializations.

This bug was part of a family introduced in 9.6 by commit a892234f830e;
later, commit 38e9f90a227d fixed most of them, but this one was
unnoticed.

Authors: Dan Wood, Pavan Deolasee, Álvaro Herrera
Reviewed-by: Masahiko Sawada, Pavan Deolasee, Álvaro Herrera
Discussion: 
https://postgr.es/m/84ebac55-f06d-4fbe-a3f3-8bda093ce...@amazon.com


-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)




Re: ERROR: cannot freeze committed xmax

2021-07-15 Thread Alvaro Herrera
Hi Sasha

On 2021-Jul-14, Sasha Aliashkevich wrote:

>  lp |  ctid   | xmin | xmax | xmax_is_lock | xmax_committed | xmax_rolled_back | xmax_multixact 
> ----+---------+------+------+--------------+----------------+------------------+----------------
>  19 | (75,21) |  571 |  572 | f            | f              | t                | f
>  21 | (75,21) |  572 |    0 | f            | f              | t                | f

Yeah, row 19 here is dead and should have been removed by vacuum (same
with the ones in pg_depend).  I don't know why it wasn't.  If you query
the tables
SELECT * FROM pg_proc WHERE ctid = '(75,19)'
are they visible?  They shouldn't be.  If they are visible, maybe try to
UPDATE them, again by ctid; perhaps that will kill them for good.  But
since the txid_status() returns committed, that likely won't work.

Maybe if you try to VACUUM by setting vacuum_freeze_min_age to a high
value (so that it doesn't try to freeze, but only remove them) they are
removed?
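Something like this, perhaps (an untested sketch; 1000000000 is the
maximum the setting allows):

    SET vacuum_freeze_min_age = 1000000000;
    VACUUM VERBOSE pg_catalog.pg_proc;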

(I didn't verify that the values you &-mask the infomask with are
correct.)

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: ALTER TABLE ... DETACH PARTITION just sitting there

2021-06-28 Thread Alvaro Herrera
On 2021-Jun-28, Ron wrote:

> We've got a table partitioned by month range (FOR VALUES FROM ('2011-07-01')
> TO (2011-08-01')), and I've been detaching partitions from oldest to newest,
> one at a time. Whenever it's failed due to a FK constraint (and there are
> many of them!), I dropped the "same month" partition from TABLE_B, and then
> returned and dropped the partition from TABLE_A.
> 
> But now, after 17 dropped partitions it's just sitting there on "ALTER TABLE
> table_a DROP PARTITION table_a_p2011_07;"  I'm the only user on this test
> instance, and validated that nothing else is blocking me.

Did you look in pg_locks for ungranted locks?
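For instance (a sketch):

    SELECT locktype, relation::regclass AS rel, mode, pid
      FROM pg_locks
     WHERE NOT granted;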

> Are the FK validations what's causing the apparent "hang"?  (EXPLAIN ALTER
> TABLE... does not work.)

Sure, it is possible.  Do you have any FKs that are missing indexes in
the referencing side?


-- 
Álvaro Herrera   Valdivia, Chile
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"




Re: Partitioned Table Index Column Order

2021-06-23 Thread Alvaro Herrera
On 2021-Jun-23, Rumpi Gravenstein wrote:

> As a best practice is it better to create the primary key starting or
> ending with the partition column?

It is not relevant from the partitioning point of view.  Other factors
can be used to decide the column order.
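For example (hypothetical table), both orders satisfy the rule that the
primary key must contain the partition key column, so the typical lookup
pattern can decide:

    CREATE TABLE events (
        customer_id bigint      NOT NULL,
        created_at  timestamptz NOT NULL,
        payload     text,
        PRIMARY KEY (customer_id, created_at)  -- or (created_at, customer_id)
    ) PARTITION BY RANGE (created_at);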

-- 
Álvaro Herrera   Valdivia, Chile




Re: DB size

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, luis.robe...@siscobra.com.br wrote:

>   SELECT sum(pg_total_relation_size(relid)), 
>  pg_size_pretty(sum(pg_total_relation_size(relid)))
> FROM pg_catalog.pg_stat_all_tables 
> 
> 
> sum |pg_size_pretty|
> +--+
> 518549716992|483 GB|
> 
> 
> SELECT pg_database_size('mydb'),
>  pg_size_pretty(pg_database_size('mydb'))
> 
> 
> pg_database_size|pg_size_pretty|
> +--+
> 869150909087|809 GB|
> 
> There are three databases in the cluster, apart from 'mydb' (one of
> them is the 'postgres' database). These other two are about 8MB each.

I would guess that there are leftover files because of those crashes you
mentioned.  You can probably look for files in the database subdir in
the data directory that do not appear in the pg_class.relfilenode
listing for the database.
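A starting point for that comparison, run in the big database (a
sketch):

    -- files PostgreSQL knows about; anything else under base/<db-oid>
    -- (apart from temp files and extra segments like .1, _fsm, _vm of
    -- these same relations) is a candidate leftover
    SELECT pg_relation_filepath(c.oid::regclass) AS known_file
      FROM pg_class c
     WHERE c.relfilenode <> 0
     ORDER BY 1;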

-- 
Álvaro Herrera39°49'30"S 73°17'W
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"




Re: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode

2021-04-19 Thread Alvaro Herrera
On 2021-Apr-19, Reid Thompson wrote:

> Thanks - I found that, which seems to fix the error handling right? Or
> does it actually correct the cause of the segfault also?

Uh, what segfault?  You didn't mention one.  Yes, it fixes the error
handling, so when the system runs out of disk space, that's correctly
reported instead of continuing.

... Ah, I see now that you mentioned that the DB goes in recovery mode
in the subject line.  That's exactly why I was looking at that problem
last year.  What I saw is that the hash-join spill-to-disk phase runs
out of disk, so the disk file is corrupt; later the hash-join reads that
data back in memory, but because it is incomplete, it follows a broken
pointer somewhere and causes a crash.

(In our customer case it was actually a bit more complicated: they had
*two* sessions running the same large hash-join query, and one of them
filled up disk first, then the other also did that; some time later one
of them raised an ERROR freeing up disk space, which allowed the other
to continue until it tried to read hash-join data back and crashed).

So, yes, the fix will avoid the crash by the fact that once you run out
of disk space, the hash-join will be aborted and nothing will try to
read broken data.

You'll probably have to rewrite your query to avoid eating 2TB of disk
space.

-- 
Álvaro Herrera   Valdivia, Chile




Re: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode

2021-04-19 Thread Alvaro Herrera
On 2021-Apr-19, Reid Thompson wrote:

> Hi I'm looking for some guidance related to the subject line issue.
> PostgreSQL 11.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 
> (Red Hat 4.8.5-39), 64-bit
> 128GB RAM
> pgsql_tmp is on a FS with 2+TB free

This bug report looks familiar.  I think it was fixed in the below commit and
that you'd benefit from running an up-to-date version (11.11).

Author: Thomas Munro 
Branch: master [7897e3bb9] 2020-06-16 16:59:07 +1200
Branch: REL_13_STABLE Release: REL_13_0 [3e0b08c40] 2020-06-16 17:00:06 +1200
Branch: REL_12_STABLE Release: REL_12_4 [28ee12669] 2020-06-16 17:00:21 +1200
Branch: REL_11_STABLE Release: REL_11_9 [9c14d6024] 2020-06-16 17:00:37 +1200
Branch: REL_10_STABLE Release: REL_10_14 [95647a1c7] 2020-06-16 17:00:53 +1200
Branch: REL9_6_STABLE Release: REL9_6_19 [02b71f06b] 2020-06-16 17:01:07 +1200
Branch: REL9_5_STABLE Release: REL9_5_23 [89020a92f] 2020-06-16 17:01:22 +1200

Fix buffile.c error handling.

Convert buffile.c error handling to use ereport.  This fixes cases where
I/O errors were indistinguishable from EOF or not reported.  Also remove
"%m" from error messages where errno would be bogus.  While we're
modifying those strings, add block numbers and short read byte counts
where appropriate.

Back-patch to all supported releases.

Reported-by: Amit Khandekar 
Reviewed-by: Melanie Plageman 
Reviewed-by: Alvaro Herrera 
Reviewed-by: Robert Haas 
Reviewed-by: Ibrar Ahmed 
Reviewed-by: Michael Paquier 
Discussion: 
https://postgr.es/m/CA%2BhUKGJE04G%3D8TLK0DLypT_27D9dR8F1RQgNp0jK6qR0tZGWOw%40mail.gmail.com


-- 
Álvaro Herrera39°49'30"S 73°17'W
EnterpriseDBhttps://www.enterprisedb.com




Re: Upgrading from 11 to 13

2021-04-01 Thread Alvaro Herrera
On 2021-Apr-02, Koen De Groote wrote:

> I seem to recall that going from 11 to 12, a certain configuration file was
> removed and the keys are now expected to be set in the regular
> configuration file? The logic being there should only ever be 1
> configuration file.
> 
> I can't find it, but at the same time I don't recall what it's called. I
> believe it has to do with streaming replication?
> 
> Is this a thing or am I imagining stuff?

recovery.conf to postgresql.conf?  Yes, you're right.  (There are more
reasons beyond "just one config file", though.)

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."  (Brian Kernighan)




Re: unexpected character used as group separator by to_char

2021-03-10 Thread Alvaro Herrera
On 2021-Mar-10, Gavan Schneider wrote:

> On 10 Mar 2021, at 16:24, Alvaro Herrera wrote:
> 
> > That space (0xe280af) is U+202F, which appears to be used for French and
> > Mongolian languages (exclusively?).  It is quite possible that in the
> > future some other language will end up using some different whitespace
> > character, possibly breaking any code you write today -- the use of
> > U+202F appears to be quite recent.
> > 
> Drifting off topic a little. That's a proper code point for things that will
> benefit from the whitespace but should still stay together.
> Also it’s not that new, added in 1999 — https://codepoints.net/U+202F

I probably got misled on this whole thing by these change proposals.
https://www.unicode.org/L2/L2019/19116-clarify-nnbsp.pdf
https://www.unicode.org/L2/L2020/20008-core-text.pdf
Apparently prior to this, they (?) had been using/recommending
THIN SPACE U+2009 as separator, which is not non-breaking.

Anyway, it reinforces my point that it's not impossible that some other
locale definition could use U+2009 when printing numbers, or even some
other kind of spacing entity in non-Latin languages etc.  So I think
that for truly robust handling you should separate the thing you use for
display from the thing you use to talk to the database.

> And the thin space is part of the international standard for breaking up
> large numbers (from 1948), specifically no dots or commas should be used in
> this role. The dot or comma is only to be used for the decimal point!

Interesting -- I didn't know this.

-- 
Álvaro Herrera   Valdivia, Chile
"This is a foot just waiting to be shot"(Andrew Dunstan)




Re: unexpected character used as group separator by to_char

2021-03-09 Thread Alvaro Herrera
On 2021-Mar-10, Vincent Veyron wrote:

> On Tue, 09 Mar 2021 16:22:07 -0500
> Tom Lane  wrote:

> > (I suppose you could also use regexp_replace to convert random forms
> > of whitespace to plain ASCII space.)
> 
> No dice. 'G' formatting looks like a whitespace, but is different (it
> appears to be slightly narrower when displayed in html, too)  :

That space (0xe280af) is U+202F, which appears to be used for French and
Mongolian languages (exclusively?).  It is quite possible that in the
future some other language will end up using some different whitespace
character, possibly breaking any code you write today -- the use of
U+202F appears to be quite recent.

Maybe it'd be better to have the query return two fields: one destined
for printing, the other you can use to feed the database back.
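Something along these lines, perhaps (a sketch):

    SELECT to_char(n, '9G999G999D99') AS display,  -- locale's separators
           n                          AS raw       -- safe to feed back
      FROM (VALUES (1234567.89)) v(n);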


-- 
Álvaro Herrera   Valdivia, Chile
"Porque Kim no hacía nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)




Re: libpq pipelineing

2021-03-05 Thread Alvaro Herrera
Hello Samuel,

On 2020-Jun-27, Samuel Williams wrote:

> I found some discussion in the past relating to batch processing which
> appears to support some kind of pipelining:
> 
> https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

I just noticed this old thread of yours.  I've been working on getting
the work you linked to, polished a little bit and completed for
PostgreSQL 14.  If you'd like to give it a try, your input would be
very useful to me.  You can find the patch here (applies on the current
master branch): https://postgr.es/m/20210306003559.GA1375@alvherre.pgsql
(If you need help building, please ping me on private email).

The patched libpq version is compatible with older servers.

Thanks

-- 
Álvaro Herrera   Valdivia, Chile




Re: Locks in creating a partition in CREATE TABLE vs ALTER TABLE

2021-03-02 Thread Alvaro Herrera
On 2021-Mar-02, Asaf Flescher wrote:

> I'm not sure if this is a bug or I'm missing something regarding how
> partitioning is supposed to work but I've noticed (in Postgres 12.6) that
> if I have a partitioned table, and then try to add a partition to it via
> CREATE TABLE ... PARTITION OF, the statement will grab an AccessExclusive
> lock on the partitioned table. Meanwhile, if I create that same table
> normally, then attach it to the partitioned table via ALTER table - no
> AccessExclusive lock.

It's a new feature in Postgres 12 actually -- we went great lengths to
be able to do ALTER TABLE .. ATTACH PARTITION without a full
AccessExclusive lock.  However, it's just not possible to do the same
for CREATE TABLE .. PARTITION OF.

If you try the same in Postgres 11, you'll notice that both use an
AccessExclusive lock.
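Schematically (names invented):

    -- option A: takes an AccessExclusive lock on the parent
    CREATE TABLE meas_y2021 PARTITION OF meas
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');

    -- option B, Postgres 12+: the ATTACH only takes a
    -- ShareUpdateExclusive lock on the parent
    CREATE TABLE meas_y2021 (LIKE meas INCLUDING ALL);
    ALTER TABLE meas ATTACH PARTITION meas_y2021
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');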

-- 
Álvaro Herrera   Valdivia, Chile
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.




Re: Localizing stored functions by replacing placeholders in their body

2021-03-02 Thread Alvaro Herrera
On 2021-Mar-02, Alexander Farber wrote:

> CREATE OR REPLACE FUNCTION localize_hello()
> RETURNS text AS
> $func$
> SELECT '$(hello)';
> $func$ LANGUAGE sql IMMUTABLE;

I'm not sure this is a great approach to in-database translations: you
have one function per string, which is cumbersome, bloated and probably
slow.  I would suggest having a function that takes a string and returns
its translation, which is obtained from a couple of tables: one where
the original strings are stored and another which stores the
translations for each string into each language.

(You can have the target language be a property of the database, or a
custom GUC setting that the application sets at the start, or just
passed as an argument to the translate() function from somewhere.  Or
maybe each database always has exactly one language.  Whatever suits you
best.)

So the functions that your application calls return strings by doing
stuff like
  SELECT translate('one UFO came and stole one bike');
and they'll get whatever is right for them.  The functions only need to
worry about calling translate() in all the right places; they don't need
to individually worry about fetching the translation etc.

Note that in that design, the original string appears in two places: the
function source code, and the original-strings table.  You could go one
step further and have the function store a code (UUID?) for the string;
then if a message has a typo, you're just one UPDATE away from fixing it
instead of an ALTER FUNCTION.  And also, it's easy to change all
translations together if your UFOs are actually ordinary burglars.

Exercise for the reader: what if your strings have format specifiers?
  "%d UFOs came and stole %d bikes"

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: PostgreSQL URI

2021-02-26 Thread Alvaro Herrera
On 2021-Feb-26, Paul Förster wrote:

> Hi Tom,
> 
> > On 26. Feb, 2021, at 15:51, Tom Lane  wrote:
> > 
> > +1.  I think you could lose the outer brackets in hostspec in
> > this formulation, ie given that hostspec is already bracketed
> > above, it should be enough to write
> > 
> >hostspec is [host][:port][,...]
> 
> if you remove the outer brackets of host spec, then that means that
> only the port may be repeated. The repeat is always meant to refer to
> its immediate preceding argument. The outer brackets make sure that it
> refers to either of both host *and* port.

I think an easier fix is to move the repeat to the main line, i.e., make
it "hostspec[,...]" and then hostspec is only [host][:port].

> > Also, the paramspec is under-bracketed today.  Should be
> > more like
> > 
> >paramspec is param=value[&...]

True.

-- 
Álvaro Herrera   Valdivia, Chile
"Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)




Re: PostgreSQL URI

2021-02-26 Thread Alvaro Herrera
On 2021-Feb-25, Paul Förster wrote:

> So, my suggestion is:
> 
> postgresql://[user[:password]@][[host][:port]][,...][/dbname][?param1=value1&...]
> 
> Still, I think that it's an improvement, because it makes clear that not only 
> the port, but also the host may be repeated.

I wonder if we shouldn't instead try to break it up in parts that can be
explained or described separately.  This many brackets makes it pretty
hard to read.

We could say something like

postgresql://[userspec@][hostspec][/dbname][?paramspec]

where
  userspec is user[:password]
  hostspec is [[host][:port]][,...]
  paramspec is param1=value1&...

which makes it easier to focus on each part separately, and we can
provide more verbose explanations or examples where needed.  (Now that I
broke it up, the original line became very clear to me, but when I saw
it in isolation it was not.  You need to count brackets carefully to be
able to read it.)
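As an example of one URI using every part (all values invented):

    postgresql://app:s3cret@db1.example.com:5432,db2.example.com:5432/appdb?connect_timeout=10&target_session_attrs=read-write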

-- 
Álvaro Herrera   Valdivia, Chile




Re: How to post to this mailing list from a web based interface

2021-02-15 Thread Alvaro Herrera
On 2021-Feb-15, Adrian Klaver wrote:

> On 2/15/21 9:24 AM, Thomas Guyot wrote:
> 
> > The "download mbox" option doesn't work, I get asked for a user/password
> > every time (I would've downloaded archives for the lats two months to
> > get continuation on most threads).
> 
> The user/password is in the message in the prompt.

Sadly, some browsers choose not to display that message -- apparently
the message itself has been used somehow by certain sites to attack
something or other.

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: How to post to this mailing list from a web based interface

2021-01-28 Thread Alvaro Herrera
On 2021-Jan-28, Ravi Krishna wrote:

> > Everyone is free to use whatever he/she wants. For me a web-based MUA
> > would be the worst thing ever.
> 
> Oh well.  I have created a seperate email account for this to keep the 
> clutter out.

In any half-decent email program, you can tag incoming email in such a
way that it doesn't clutter your main "inbox" view, while still having
such tagged messages if you specifically request them.  Just saying.

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: How to post to this mailing list from a web based interface

2021-01-28 Thread Alvaro Herrera
On 2021-Jan-28, Ravi Krishna wrote:

> I am planning to switch to a web based tool to read this mailing list.

That's great.

> While reading is easy via web, how do I post a reply from web.

Yeah, "how" indeed.

> I recollect there use to be a website from where one can reply from web.

The community does not maintain such a service.

There used to be a Gmane archive of this list that you could use to
post.  It seems it's still online at postgresql-archive.org.  They have a
"Reply" button and it says it requires an account, but given SPF and
DMARC and other restrictions on email generation, it seems pretty
uncertain that emails posted that way would work correctly.  I think we
would even reject such emails if they reached our mailing list servers.

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: Error messages on duplicate schema names

2021-01-19 Thread Alvaro Herrera
On 2021-Jan-15, Michael Paquier wrote:

> On Wed, Jan 06, 2021 at 07:15:24PM +0200, Andrus wrote:
> > Should duplicate schema names be accepted, or should their usage throw better
> > error messages.
> 
> This means that we are one call of CommandCounterIncrement() short for
> such queries, and similar safeguards already exist in this area for
> GRANT/REVOKE.  The spot where I would put this new CCI is at the end
> of SetDefaultACL(), like in the attached with a test added to
> privileges.sql.
> 
> Any thoughts from others?

Looks to match what we do in ExecGrant_Relation() et al, so +1.

I guess you could make the case that the CCI call should be in the
callers where we actually loop (SetDefaultACLsInSchemas,
RemoveRoleFromObjectACL), but it's hard to get excited about the added
clutter.

-- 
Álvaro Herrera   Valdivia, Chile




Re: Do we need a way to moderate mailing lists?

2021-01-16 Thread Alvaro Herrera
On 2021-Jan-16, Bruce Momjian wrote:

> On Sat, Jan 16, 2021 at 02:50:58PM -0300, Álvaro Herrera wrote:
> > On 2021-Jan-16, Hemil Ruparel wrote:
> > 
> > > Okay. I will not reply to them. Enough mental cycles wasted
> > 
> > One way you could help, is by learning what top-posting is, learning not
> > to do it, and teaching others the same.  Same with not quoting entire
> > messages on reply.
> 
> That "quoting entire messages on reply" is something I see far too often
> here.  I have been meaning to mention this problem.  Thousands of people
> are reading postings here, so it pays to take time to trim down what
> others have to view.

Yes.  Gmail, by hiding the quoted part of the message, has taught people
that it's okay to leave the whole thing in place.  For most of the rest
of the world, it's an annoyance.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php




Re: Do we need a way to moderate mailing lists?

2021-01-16 Thread Alvaro Herrera
On 2021-Jan-16, Hemil Ruparel wrote:

> Okay. I will not reply to them. Enough mental cycles wasted

One way you could help, is by learning what top-posting is, learning not
to do it, and teaching others the same.  Same with not quoting entire
messages on reply.

-- 
Álvaro Herrera   Valdivia, Chile




Re: Deleting takes days, should I add some index?

2020-11-27 Thread Alvaro Herrera
On 2020-Nov-27, Alexander Farber wrote:

> Referenced by:
> TABLE "words_chat" CONSTRAINT "words_chat_gid_fkey" FOREIGN KEY (gid)
> REFERENCES words_games(gid) ON DELETE CASCADE
> TABLE "words_moves" CONSTRAINT "words_moves_gid_fkey" FOREIGN KEY (gid)
> REFERENCES words_games(gid) ON DELETE CASCADE
> TABLE "words_scores" CONSTRAINT "words_scores_gid_fkey" FOREIGN KEY
> (gid) REFERENCES words_games(gid) ON DELETE CASCADE

Make sure you have indexes on the gid columns of these tables.  Delete
needs to scan them in order to find the rows that are cascaded to.
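For example, if they don't exist yet:

    CREATE INDEX ON words_chat (gid);
    CREATE INDEX ON words_moves (gid);
    CREATE INDEX ON words_scores (gid);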

> So I ctrl-c (surprisingly not a single record was deleted; I was expecting
> at least some to be gone)

Ctrl-C aborts the transaction, so even though the rows are marked
deleted, they are so by an aborted transaction.  Therefore they're
alive.




Re: Potential BRIN Index Corruption

2020-11-25 Thread Alvaro Herrera
On 2020-Nov-26, Huan Ruan wrote:

> Hi All
> 
> We cannot work out a reproducible case but we have a copy of the offending
> database. I was hoping to know

I think the way to debug this would be to see what WAL records have been
emitted for the index, using pageinspect to find the problem index
tuple.

Use 'select ctid from large_table_with_623m_records where ...' to
pinpoint the unindexed tuple's page number; see when (in LSN terms) that
tuple was written; inspect the WAL surroundings looking for updates (or
lack thereof) to the BRIN index.  Use pageinspect to examine raw BRIN
data.
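Roughly like this (index name is a placeholder; needs the pageinspect
extension):

    CREATE EXTENSION IF NOT EXISTS pageinspect;

    -- page 0 is the BRIN metapage; range summaries live in "regular" pages
    SELECT brin_page_type(get_raw_page('the_brin_index', 0));

    SELECT * FROM brin_page_items(get_raw_page('the_brin_index', 2),
                                  'the_brin_index');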




