Re: Automatic aggressive vacuum on almost frozen table takes too long

2023-02-20 Thread Peter Geoghegan
On Mon, Feb 20, 2023 at 9:43 PM Mikhail Balayan  wrote:
> What catches my eye: scanning indexes smaller than 3.1GB is fast, larger ones 
> are slow. For example:
> idx_applications2_policy_id is 3131 MB took just 5 seconds (DETAIL:  CPU: 
> user: 2.99 s, system: 1.65 s, elapsed: 5.32 s)
> but idx_applications2_deleted_at with 3264 MB took 1 minute 22 seconds 
> (DETAIL:  CPU: user: 67.93 s, system: 3.41 s, elapsed: 82.75 s)

I think that I know what this is.

If you delete many index pages during VACUUM, and those pages are all
full of duplicate values, the deletion operation can sometimes be
slower due to the need to relocate a downlink to each to-be-deleted
leaf page. When there are thousands of matches, you'll start to notice
O(n^2) behavior due to the way in which the B-Tree VACUUM code must
grovel through the parent level, which is full of duplicate keys.

If you were on Postgres 12+, then this wouldn't happen, because the
heap TID is treated as a part of the key space there, affecting sort
order. The implementation would immediately relocate the matching
parent downlink using a unique key (unique because heap TID would act
as a unique-ifier on that version). And if you were on 14+, things in
this area would be much better still.
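
To make the O(n^2) claim concrete, here is a toy Python model of the two
search strategies (my own sketch for illustration, not the actual nbtree
code): with an all-duplicate parent level, locating the downlink for each
to-be-deleted leaf page is a linear scan through the duplicates, while the
v12+ (key, heap TID) ordering makes every entry unique, so each downlink
is found by binary search.

```python
import bisect

def comparisons_pre12(n):
    """All n downlinks share one key: each lookup is a linear scan."""
    entries = list(range(n))      # stand-ins for to-be-deleted leaf pages
    count = 0
    # Delete in descending order so each lookup scans the surviving
    # duplicates (the worst case the parent-level grovelling can hit).
    for target in reversed(range(n)):
        for e in entries:
            count += 1
            if e == target:       # only identity distinguishes duplicates
                entries.remove(e)
                break
    return count

def comparisons_v12(n):
    """Heap TID tiebreaker makes keys unique: binary search per lookup."""
    entries = list(range(n))
    count = 0
    for target in reversed(range(n)):
        i = bisect.bisect_left(entries, target)
        count += len(entries).bit_length()  # ~log2(len) comparisons
        del entries[i]
    return count
```

For n = 1000 deleted pages the duplicate-key strategy does roughly fifty
times as many comparisons as the tiebreaker strategy, and the gap widens
quadratically with n.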

-- 
Peter Geoghegan




Re: Automatic aggressive vacuum on almost frozen table takes too long

2023-02-20 Thread Mikhail Balayan
Perhaps the output of a manually executed VACUUM FREEZE will give some more
information:
Table size:
appdbname3=# \dt+ appschemaname.applications
                             List of relations
    Schema     |     Name     | Type  |     Owner     |   Size   | Description
---------------+--------------+-------+---------------+----------+-------------
 appschemaname | applications | table | appschemaname | 10012 MB |

Each index size:
  tablename   |                         indexname                          | index_size | unique
--------------+------------------------------------------------------------+------------+--------
 applications | chk_invariant_1_apps2                                      | 8192 bytes | Y
 applications | applications2_pkey                                         | 1645 MB    | Y
 applications | idx_applications2_agent_id                                 | 2982 MB    | N
 applications | idx_applications2_context_id                               | 2708 MB    | N
 applications | idx_applications2_context_tenant_id                        | 1560 MB    | N
 applications | idx_applications2_context_type                             | 4963 MB    | N
 applications | idx_applications2_deleted_at                               | 3264 MB    | N
 applications | idx_applications2_enabled                                  | 4975 MB    | N
 applications | idx_applications2_euc_index                                | 1684 MB    | N
 applications | idx_applications2_policy_id                                | 3131 MB    | N
 applications | idx_applications2_policy_type                              | 16 GB      | N
 applications | idx_applications2_root_policy_id                           | 1574 MB    | N
 applications | idx_applications2_status                                   | 5391 MB    | N
 applications | idx_applications2_status_by_policy_fields_context_tenant_i | 13 MB      | N
 applications | idx_applications2_status_by_policy_fields_tenant_id        | 15 MB      | N
 applications | idx_applications2_tenant_id                                | 1670 MB    | N
 applications | idx_applications_created_at                                | 2855 MB    | N
 applications | tmp_idx_applications2_updated_at2                          | 3505 MB    | N
 applications | uq_policy_id_context2                                      | 1398 MB    | Y

And vacuum freeze output:
appdbname3=# VACUUM (FREEZE, VERBOSE) appschemaname.applications;
INFO:  aggressively vacuuming "appschemaname.applications"
INFO:  scanned index "applications2_pkey" to remove 10488 row versions
DETAIL:  CPU: user: 2.45 s, system: 1.07 s, elapsed: 5.16 s
INFO:  scanned index "tmp_idx_applications2_updated_at2" to remove 10488 row versions
DETAIL:  CPU: user: 2.53 s, system: 2.86 s, elapsed: 11.56 s
INFO:  scanned index "idx_applications2_agent_id" to remove 10488 row versions
DETAIL:  CPU: user: 3.16 s, system: 2.36 s, elapsed: 11.04 s
INFO:  scanned index "idx_applications2_context_id" to remove 10488 row versions
DETAIL:  CPU: user: 2.22 s, system: 1.95 s, elapsed: 8.06 s
INFO:  scanned index "idx_applications2_context_tenant_id" to remove 10488 row versions
DETAIL:  CPU: user: 1.86 s, system: 1.16 s, elapsed: 5.10 s
INFO:  scanned index "idx_applications2_context_type" to remove 10488 row versions
DETAIL:  CPU: user: 190.09 s, system: 6.48 s, elapsed: 220.88 s
INFO:  scanned index "idx_applications2_deleted_at" to remove 10488 row versions
DETAIL:  CPU: user: 67.93 s, system: 3.41 s, elapsed: 82.75 s
INFO:  scanned index "idx_applications2_enabled" to remove 10488 row versions
DETAIL:  CPU: user: 78.22 s, system: 4.74 s, elapsed: 96.96 s
INFO:  scanned index "idx_applications2_policy_id" to remove 10488 row versions
DETAIL:  CPU: user: 2.99 s, system: 1.65 s, elapsed: 5.32 s
INFO:  scanned index "idx_applications2_policy_type" to remove 10488 row versions
DETAIL:  CPU: user: 300.45 s, system: 16.05 s, elapsed: 365.80 s
INFO:  scanned index "idx_applications2_root_policy_id" to remove 10488 row versions
DETAIL:  CPU: user: 2.13 s, system: 1.20 s, elapsed: 6.16 s
INFO:  scanned index "idx_applications2_status" to remove 10488 row versions
DETAIL:  CPU: user: 153.79 s, system: 6.57 s, elapsed: 185.76 s
INFO:  scanned index "idx_applications2_status_by_policy_fields_context_tenant_i" to remove 10488 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.01 s, elapsed: 0.06 s
INFO:  scanned index "idx_applications2_status_by_policy_fields_tenant_id" to remove 10488 row versions
DETAIL:  CPU: user: 0.03 s, system: 0.00 s, elapsed: 0.05 s
INFO:  scanned index "idx_applications2_tenant_id" to remove 10488 row versions
DETAIL:  CPU: user: 2.11 s, system: 1.23 s, elapsed: 6.19 s
INFO:  scanned index "idx_applications2_euc_index" to remove 10488 row versions
DETAIL:  CPU: user: 2.05 s, system: 1.27 s, elapsed: 6.05 s
INFO:  scanned index "chk_invariant_1_apps2" to remove 10488 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  scanned index "uq_policy_id_context2" to remove 10488 row versions
DETAIL:  CPU: user: 1.24 s, system: 1.00 s, elapsed: 4.40 s
INFO:  scanned index "idx_applications_created_at" to remove 10488 row versions
DETAIL:  CPU: user: 2.32 s, system: 1.97 s, elapsed: 8.46 s
INFO:  "applications": removed 10488 row versions in 7580 pages
DETAIL:  CPU: user: 0.86 s, system: 0.08 s, elapsed: 1.03 s
INFO:  index "applicati

Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread Adrian Klaver

On 2/20/23 19:22, p...@pfortin.com wrote:

On Mon, 20 Feb 2023 15:24:23 -0800 Adrian Klaver wrote:





It just dawned on me you might be doing all of this through the
pgAdmin4 GUI.


Sorry for any confusion...  I get it now...

A team member uses pgAdmin4 to load separate table(s) into his DB; then
creates dump files (one per table) of those _individual_ tables which are
uploaded to me.


Given that you are using pg_restore, the dump file they are creating is 
done with a custom format, e.g. -Fc.


This means when you do:

 pg_restore --host "localhost" --port "5432" --username "postgres"
  --no-password --dbname "myname" --create --clean --verbose "dumpfile"

the --create in combination with --clean is going to DROP DATABASE and 
recreate it.




I maintain a complete set of tables in my DB. pgAdmin4 is never used
here; the restore is done with a simple bash script which inserts the
dbname and dumpfile name into the command. Those restore tables should
only be created with DROP DATABASE _off_.


As I said before, the solution is going to come from your end. Do not 
include the --create in your pg_restore script.
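
For example, dropping both --create and --clean from the command quoted
earlier in the thread gives a restore that only loads objects into the
existing database (a sketch adapted from the original poster's command;
host, port, user, and file names are theirs, shown only for illustration):

```shell
# Restore into the existing database "myname" without issuing
# DROP DATABASE or CREATE DATABASE.
pg_restore --host "localhost" --port "5432" --username "postgres" \
  --no-password --dbname "myname" --verbose "dumpfile"
```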




All that should happen with my DB is to add these tables (99.9% of the
time, they are totally new to me).

A DROP DATABASE from the one-table per dumpfile creator is UNwanted.


Again that is your doing and you are the one that can stop it.



Looks like I really did dodge a bullet...


Again the default is to not include those options.


Glad to know pgAdmin4 has those switches.

Thanks again!!



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread pf
On Mon, 20 Feb 2023 15:24:23 -0800 Adrian Klaver wrote:

>On 2/20/23 11:36, p...@pfortin.com wrote:
>> On Mon, 20 Feb 2023 11:06:34 -0800 Adrian Klaver wrote:
>>   
>>> On 2/20/23 10:27, p...@pfortin.com wrote:  
 [Still a newbie; but learning fast...]

 Hi,  
>
>> 
>> Notwithstanding the man page, my take is that the DROP DATABASE statement
>> needs to be eliminated at pg_dump creation by pgAdmin4.  Taking this to
>> that mailing list.  
>
>It just dawned on me you might be doing all of this through the 
>pgAdmin4 GUI.

Sorry for any confusion...  I get it now...

A team member uses pgAdmin4 to load separate table(s) into his DB; then
creates dump files (one per table) of those _individual_ tables which are
uploaded to me. 

I maintain a complete set of tables in my DB. pgAdmin4 is never used
here; the restore is done with a simple bash script which inserts the
dbname and dumpfile name into the command. Those restore tables should
only be created with DROP DATABASE _off_.

All that should happen with my DB is to add these tables (99.9% of the
time, they are totally new to me).  

A DROP DATABASE from the one-table per dumpfile creator is UNwanted.

Looks like I really did dodge a bullet...  

>Again the default is to not include those options.

Glad to know pgAdmin4 has those switches.

Thanks again!!

>> 
>> Thanks Tom & Adrian!
>>   
 Was my 134 table[1] myname DB saved because it was open?  
>> 
>> Tom:  Yup.
>>   
 If the dump file
 contains the above statements, how can I be absolutely certain I won't
 lose the DB?  
>> 
>> Tom:
>> Reading the manual is advisable.  --create --clean specifies exactly
>> that the target database is to be dropped and recreated.
>> 
>>  regards, tom lane
>>   
 I'm obviously quite paranoid now...  
>>>
>>> You will lose the database if you do as the docs specify for -C:
>>>
>>> "
>>> -C
>>>
>>> ...
>>>
>>> When this option is used, the database named with -d is used only to
>>> issue the initial DROP DATABASE and CREATE DATABASE commands. All data
>>> is restored into the database name that appears in the archive.
>>> "
>>>
>>>
>>> It will then be recreated with whatever information is in "dumpfile". If
>>> that is the same data or new data you want then you are fine. Otherwise
>>> you will need to be more specific about what you are trying to achieve.
>>>
>>>  

 [1] 3 types of tables:  ~40%=8.5M rows; ~40%=33M rows; ~20%=varying sizes

 Thanks,
 Pierre

  
>>>  
>> 
>>   
>




Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Bryn Llewellyn
>> b...@yugabyte.com wrote:
>> 
>> I’ve no idea how I might have found this without human help.
> 
> x...@thebuild.com wrote:
> 
> That sounds like an excellent documentation patch!

Well, it’s already documented clearly enough. The question is how to find 
it, especially if you don’t know whether the feature you’re looking for even 
exists. The psql doc would print out at about thirty pages at a normal font 
size, so reading it from top to bottom would be quite a task.



Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Christophe Pettus



> On Feb 20, 2023, at 17:54, Bryn Llewellyn  wrote:
> 
> 
> I’ve no idea how I might have found this without human help.

That sounds like an excellent documentation patch!



Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Bryn Llewellyn
>> b...@yugabyte.com wrote:
>> 
>> It seems a bit odd that psql has no syntax to ask for this in its 
>> interactive mode. 
> 
> dan...@manitou-mail.org wrote:
> 
> Backslash-semicolon is the syntax.

Thanks, Daniel. Yes, that works. And the server’s SQL statement log confirms 
this.

I’ve no idea how I might have found this without human help. (Neither generic 
Google search nor using the PG docs own search got me anywhere.) But now I know 
that what I asked about is possible and I’ve seen it work, I trust that I’ll 
remember the trick.



Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread Adrian Klaver

On 2/20/23 11:36, p...@pfortin.com wrote:

On Mon, 20 Feb 2023 11:06:34 -0800 Adrian Klaver wrote:


On 2/20/23 10:27, p...@pfortin.com wrote:

[Still a newbie; but learning fast...]

Hi,




Notwithstanding the man page, my take is that the DROP DATABASE statement
needs to be eliminated at pg_dump creation by pgAdmin4.  Taking this to
that mailing list.


It just dawned on me you might be doing all of this through the 
pgAdmin4 GUI.


In which case from most recent documentation:

https://www.pgadmin.org/docs/pgadmin4/6.20/index.html

Backup Dialog:

https://www.pgadmin.org/docs/pgadmin4/6.20/backup_dialog.html

Options tab:
"
Move the switch next to Include CREATE DATABASE statement towards right 
position to include a command in the backup that creates a new database 
when restoring the backup.


Move the switch next to Include DROP DATABASE statement towards right 
position to include a command in the backup that will drop any existing 
database object with the same name before recreating the object during a 
backup.

"

So the default is not to include those options.

For Restore dialog:

https://www.pgadmin.org/docs/pgadmin4/6.20/restore_dialog.html

Options tab(for custom format):

"
Move the switch next to Include CREATE DATABASE statement towards right 
position to include a command that creates a new database before 
performing the restore.


Move the switch next to Clean before restore towards right position to 
drop each existing database object (and data) before restoring.

"

Again the default is to not include those options.




Thanks Tom & Adrian!


Was my 134 table[1] myname DB saved because it was open?


Tom:  Yup.


If the dump file
contains the above statements, how can I be absolutely certain I won't
lose the DB?


Tom:
Reading the manual is advisable.  --create --clean specifies exactly
that the target database is to be dropped and recreated.

regards, tom lane


I'm obviously quite paranoid now...


You will lose the database if you do as the docs specify for -C:

"
-C

...

When this option is used, the database named with -d is used only to
issue the initial DROP DATABASE and CREATE DATABASE commands. All data
is restored into the database name that appears in the archive.
"


It will then be recreated with whatever information is in "dumpfile". If
that is the same data or new data you want then you are fine. Otherwise
you will need to be more specific about what you are trying to achieve.




[1] 3 types of tables:  ~40%=8.5M rows; ~40%=33M rows; ~20%=varying sizes

Thanks,
Pierre

   







--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Daniel Verite
Bryn Llewellyn wrote:

> 2023-02-20 12:42:44.993 PST [2540504] d0$u0@d0 LOG:  0: statement:
> insert into s.t(v) values(17); insert into s.t(v) values(42);
> 2023-02-20 12:42:44.993 PST [2540504] d0$u0@d0 LOCATION:  exec_simple_query,
> postgres.c:971
> 
> It seems a bit odd that psql has no syntax to ask for this in its
> interactive mode. 

Backslash-semicolon is the syntax. Quoted from the doc:



\;

Backslash-semicolon is not a meta-command in the same way as the
preceding commands; rather, it simply causes a semicolon to be added to the
query buffer without any further processing.

Normally, psql will dispatch an SQL command to the server as soon as it
reaches the command-ending semicolon, even if more input remains on the
current line. Thus for example entering

select 1; select 2; select 3;

will result in the three SQL commands being individually sent to the
server, with each one's results being displayed before continuing to the next
command. However, a semicolon entered as \; will not trigger command
processing, so that the command before it and the one after are effectively
combined and sent to the server in one request. So for example

select 1\; select 2\; select 3;

results in sending the three SQL commands to the server in a single
request, when the non-backslashed semicolon is reached. The server executes
such a request as a single transaction, unless there are explicit
BEGIN/COMMIT commands included in the string to divide it into multiple
transactions. (See Section 55.2.2.1 for more details about how the server
handles multi-query strings.)




Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Bryn Llewellyn
> x...@thebuild.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> 2. If I send over "begin" and then "insert into s.t(v) values(42)", then (so 
>> far) a second session will not see the effect of my SQL's. It sees this only 
>> when I send over "commit". (If I send over "rollback" instead of "commit", 
>> then other sessions never know what I did.)
> 
> This may or may not be true. If the second session currently has a 
> transaction open in REPEATABLE READ or SERIALIZABLE mode, it *won't* see the 
> effects of that statement, since it took its snapshot at the start of the 
> transaction (to be technical, at the first statement in that transaction), 
> and holds it until commit time. However, a transaction in READ COMMITTED mode 
> *will* see the results after the statement completes.
> 
>> I can't see that a client-side "autocommit off" mode like psql supports 
>> brings me anything of value.
> 
> There's general agreement on that point.
> 
> https://www.cybertec-postgresql.com/en/disabling-autocommit-in-postgresql-can-damage-your-health/

Thanks, Christophe. Yes, I sacrificed correctness for brevity. I should have 
stipulated that observations made from a second concurrent session are to be 
done using a singleton "select" in its own txn—i.e. outside of an explicitly 
started txn (whether this is started by hand or using a client's implementation 
of "autocommit off"). Thanks, too, for the xref to the Cybertec post by Laurenz 
Albe.

And thanks, David, for your separate tip about using « psql -c ». I tried it 
and watched the server log. Sure enough, I saw this:

2023-02-20 12:42:44.993 PST [2540504] d0$u0@d0 LOG:  0: statement: insert 
into s.t(v) values(17); insert into s.t(v) values(42);
2023-02-20 12:42:44.993 PST [2540504] d0$u0@d0 LOCATION:  exec_simple_query, 
postgres.c:971

It seems a bit odd that psql has no syntax to ask for this in its interactive 
mode. But, yes, it doesn't actually matter because I can get the same semantics 
by starting a txn myself.



Re: pg_reload_conf()

2023-02-20 Thread Marc Millas
Thanks Adrian,

The fact that the changes are visible only to new backends, and NOT to
the backend which made the changes, was what I had missed.
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Feb 20, 2023 at 6:01 PM Adrian Klaver 
wrote:

> On 2/20/23 08:57, Marc Millas wrote:
> > select  setting, boot_val, reset_val from pg_settings where
> > name='log_connections';
>
> # select  setting, boot_val, reset_val, context from pg_settings where
> name='log_connections';
>   setting | boot_val | reset_val |  context
> -+--+---+---
>   on  | off  | on| superuser-backend
>
>
> https://www.postgresql.org/docs/15/view-pg-settings.html
>
> There are several possible values of context. In order of decreasing
> difficulty of changing the setting, they are:
>
>
> ...
>
> superuser-backend
>
>  Changes to these settings can be made in postgresql.conf without
> restarting the server. They can also be set for a particular session in
> the connection request packet (for example, via libpq's PGOPTIONS
> environment variable), but only if the connecting user is a superuser or
> has been granted the appropriate SET privilege. However, these settings
> never change in a session after it is started. If you change them in
> postgresql.conf, send a SIGHUP signal to the postmaster to cause it to
> re-read postgresql.conf. The new values will only affect
> subsequently-launched sessions.
>
> ...
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread David G. Johnston
On Mon, Feb 20, 2023 at 12:57 PM Bryn Llewellyn  wrote:

> 3. Chapter 55 also has a section "Multiple Statements In A Simple Query".
> But this feature seems to do no more semantically beyond implicitly
> achieving what I could do by surrounding several statements explicitly with
> "begin; ... commit;". There is, though, the non-semantic aspect of
> round-trip reduction. It seems that psql doesn't expose doing many
> statements in a simple query. (At least, that's what I see in the server
> log when I put several statements on a single line, with psql's autocommit
> on.)
>

IIRC psql -c 'insert into tbl values (1); select * from tbl;' uses the
simple query protocol with multiple statements.

Then, 55.2.2.1:

> When a simple Query message contains more than one SQL statement
(separated by semicolons), those statements are executed as a single
transaction

So in effect bundling within the Simple Query Protocol has a similar effect
to wrapping all statements in a begin/commit pair.

David J.


Re: Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Christophe Pettus



> On Feb 20, 2023, at 11:57, Bryn Llewellyn  wrote:
> 2. If I send over "begin" and then "insert into s.t(v) values(42)", then (so 
> far) a second session will not see the effect of my SQL's. It sees this only 
> when I send over "commit". (If I send over "rollback" instead of "commit", 
> then other sessions never know what I did.)

This may or may not be true.  If the second session currently has a transaction 
open in REPEATABLE READ or SERIALIZABLE mode, it *won't* see the effects of 
that statement, since it took its snapshot at the start of the transaction (to 
be technical, at the first statement in that transaction), and holds it until 
commit time.  However, a transaction in READ COMMITTED mode *will* see the 
results after the statement completes.
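
The visibility rule Christophe describes can be sketched as a toy Python
model (my own illustration, not PostgreSQL internals): READ COMMITTED takes
a fresh snapshot for every statement, while REPEATABLE READ keeps the
snapshot taken at its first statement until commit.

```python
committed = set()  # rows committed so far, shared by all sessions

class Session:
    def __init__(self, isolation):
        self.isolation = isolation
        self.snapshot = None  # taken lazily at the first statement

    def select(self):
        # READ COMMITTED: new snapshot per statement.
        # REPEATABLE READ: reuse the snapshot from the first statement.
        if self.isolation == "READ COMMITTED" or self.snapshot is None:
            self.snapshot = frozenset(committed)
        return self.snapshot

# Both sessions run their first statement before session 1 commits.
rr = Session("REPEATABLE READ")
rc = Session("READ COMMITTED")
assert rr.select() == frozenset()
assert rc.select() == frozenset()

committed.add(42)  # session 1 commits an insert

assert 42 not in rr.select()  # still sees its original snapshot
assert 42 in rc.select()      # sees the newly committed row
```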

> I can't see that a client-side "autocommit off" mode like psql supports 
> brings me anything of value.

There's general agreement on that point.


https://www.cybertec-postgresql.com/en/disabling-autocommit-in-postgresql-can-damage-your-health/



Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread Adrian Klaver

On 2/20/23 11:36, p...@pfortin.com wrote:

On Mon, 20 Feb 2023 11:06:34 -0800 Adrian Klaver wrote:


On 2/20/23 10:27, p...@pfortin.com wrote:

[Still a newbie; but learning fast...]

Hi,

A remote team member is helping out by dumping some of his tables via
pgAdmin4 on Windows.  My DB is on Linux.

The other day, I restored his first file with:
pg_restore --host "localhost" --port "5432" --username "postgres"
--no-password --dbname "myname" --create --clean --verbose "dumpfile"

when I saw this:

pg_restore: dropping DATABASE myname
Command was: DROP DATABASE myname;

pg_restore: error: could not execute query:
  ERROR:  cannot drop the currently open database

Digging into the pg_dump'ed files, I see:

CREATE DATABASE myname WITH TEMPLATE = template0 ENCODING = 'UTF8'
LOCALE_PROVIDER = libc LOCALE = 'English_United States.1252';
DROP DATABASE myname;
^


I thought the --clean applied to the table being restored.  The man page
reads:

-c  |||
--clean VVV
  Clean (drop) database objects before recreating them. (Unless
  --if-exists is used, this might generate some harmless error
  messages, if any objects were not present in the destination
vs =>database.)

so I took that to mean table; not the entire DB.


1) If you clean(drop) all the objects in a database you have effectively 
got to the same point as dropping the database.





Notwithstanding the man page, my take is that the DROP DATABASE statement
needs to be eliminated at pg_dump creation by pgAdmin4.  Taking this to
that mailing list.


This is not on pgAdmin4. If the dump is done with the custom format, the 
-c and -C can be applied on the pg_restore end per:


https://www.postgresql.org/docs/current/app-pgdump.html

-c

-C

This option is ignored when emitting an archive (non-text) output file. 
For the archive formats, you can specify the option when you call 
pg_restore.



So this:

pg_restore --host "localhost" --port "5432" --username "postgres"
  --no-password --dbname "myname" --create --clean --verbose "dumpfile"

is on you not pgAdmin4.

Spend some time in the pg_dump and pg_restore docs, there is a lot going 
on in there.


Thanks Tom & Adrian!



--
Adrian Klaver
adrian.kla...@aklaver.com





Thanks! Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

2023-02-20 Thread Bryn Llewellyn
> b...@yugabyte.com wrote:
> 
> ...it's not clear who actually implements the opening "start transaction" 
> and the closing "commit" around every submitted SQL statement when 
> autocommit is "on". Is this done in client-side code (maybe implying 
> three round trips per intended SQL statement)? Or is it done server-side?
 
 rjuju...@gmail.com wrote
 
 It's always done on the client side, postgres itself doesn't know about 
 this feature.
>>> 
>>> x...@thebuild.com wrote:
>>> 
>>> It's done server-side.  Note that what really happens is that, when a 
>>> statement begins execution and there is no open transaction, a snapshot is 
>>> taken and then released when the statement finishes (just as happens in 
>>> READ COMMITTED mode).  No piece of code literally injects a BEGIN and a 
>>> COMMIT statement to make it happen.
>> 
>> x...@thebuild.com wrote:
>> 
>> PostgreSQL has no idea that mode even exists: it either sees statements 
>> without transactions, which run in their own transaction, or BEGIN / COMMIT 
>> statements.  Because client stacks have traditionally loved to provide their 
>> own transaction semantics, they might inject BEGIN and COMMIT statements, 
>> but that's not something PostgreSQL sees.
>> 
>> (And I have never liked the term "autocommit mode"; it really doesn't reveal 
>> much about what is going on.
> 
> david.g.johns...@gmail.com wrote:
> 
> To rephrase the other responses, the client-defined setting has no inherent 
> relationship to the concept of a PostgreSQL session.  How the client uses 
> that setting is internal to the client and whatever abstraction(s) it 
> provides the programmer.
> 
> So far as the server is concerned it is always auto (implicit) begin, and 
> also auto (implicit) commit - absent receiving a BEGIN SQL Command in which 
> case it disables implicit commit and (more or less) waits for a COMMIT or 
> ROLLBACK before ending the transaction that it implicitly started.

Thanks, all, for these replies. In summary, I think that it all makes sense to 
me now.

I've written a bit more, here, for my own benefit. Do please speak up if I got 
it wrong.

Christophe also said “I may have misinterpreted your question...” No worries. I 
didn’t have the proper vocabulary to ask a clear question. I might just as well 
have said “Tell me about autocommit—in all of the different domains where 
there's something to say.” Anyway... it seems clear from the fact that nobody 
mentioned a doc reference that there is no useful account in the PG docs of 
what's been explained in this thread. Pity.

I think that I can summarize what seems to be the consensus expressed by 
David's and Christophe's replies with reference to a thought experiment. 
Imagine that I studied up the material described here:

Chapter 55. Frontend/Backend Protocol
https://www.postgresql.org/docs/current/protocol.html

and that I wrote a program using the low-level language of my choice to send 
and receive TCP/IP messages to/from my PostgreSQL server. Assume that I limit 
myself to so-called Simple Queries and that I don't use "prepare" and "execute".

You've told me that, at this level of communication, there is no "autocommit" 
mode to set. Rather, things just behave in a certain way, like you've described.

1. If I send over just "insert into s.t(v) values(17)" then a second session 
will see the effect of my "insert" immediately. (I don't need a trailing 
semicolon; but I'm allowed to have one if I like to.) Moreover, If I set « 
log_statement = 'all' » in my postgresql.conf and tail the log file, all I see 
is my bare insert statement. I don't see "begin" before it and "commit" after 
it. I conclude, therefore, that the defined semantics of sending over a single 
SQL statement are to start a transaction under the covers, to execute the 
statement, and then to commit it under the covers. It's tempting to say that 
the effect of my statement is automatically committed—or to say that PostgreSQL 
natively implements automatic commit. But I s'pose that it's enough simply to 
describe what happens without inventing any terminology.

2. If I send over "begin" and then "insert into s.t(v) values(42)", then (so 
far) a second session will not see the effect of my SQL's. It sees this only 
when I send over "commit". (If I send over "rollback" instead of "commit", then 
other sessions never know what I did.)

3. Chapter 55 also has a section "Multiple Statements In A Simple Query". But 
this feature seems to do no more semantically beyond implicitly achieving what 
I could do by surrounding several statements explicitly with "begin; ... 
commit;". There is, though, the non-semantic aspect of round-trip reduction. It 
seems that psql doesn't expose doing many statements in a simple query. (At 
least, that's what I see in the server log when I put several statements on a 
single line, with psql's autocommit on.)

4. If I use psql and set autocommit to off, I see that it knows w
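
The behavior in points 1 and 2 can be sketched as a toy Python model (my
own illustration, not server code): without an explicit BEGIN, each
statement runs in its own implicitly started and implicitly committed
transaction; after BEGIN, effects stay invisible to other sessions until
COMMIT, or are discarded by ROLLBACK.

```python
class Server:
    def __init__(self):
        self.visible = []    # what other sessions can see
        self.pending = None  # open explicit transaction, if any

    def execute(self, stmt):
        if stmt == "begin":
            self.pending = []
        elif stmt == "commit":
            self.visible.extend(self.pending)
            self.pending = None
        elif stmt == "rollback":
            self.pending = None  # discard uncommitted work
        elif self.pending is None:
            self.visible.append(stmt)  # implicit begin + implicit commit
        else:
            self.pending.append(stmt)  # buffered until commit

s = Server()
s.execute("insert 17")               # point 1: visible at once
assert s.visible == ["insert 17"]

s.execute("begin")                   # point 2: explicit transaction
s.execute("insert 42")
assert "insert 42" not in s.visible  # not yet visible to others
s.execute("commit")
assert "insert 42" in s.visible
```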

Is Autovacuum running?

2023-02-20 Thread Brad White
I'm concerned that Autovacuum may not be running based on the results of
this query.

SELECT relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables;
gives 211 rows like this...

 relname                    | last_vacuum | last_autovacuum
----------------------------+-------------+-----------------
 BusinessIncidentCategories | null        | null
 Valid Use                  | null        | null
 Serial Pool Part Types     | null        | null
 BusinessIncidentLog        | null        | null
 Rate Categories            | null        | null

I don't see any process with 'auto' or 'vacuum' in the name in TaskManager.
I don't see anything similar set up in Services to run in the background.
I do see pg_ctl running for each instance of the server running, 9.4 and 14.
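
One server-side way to check is to query pg_stat_activity (a sketch; the
backend_type column exists in PostgreSQL 10 and later, so this works on the
14 instance but not on 9.4; note also that on Windows every backend,
including the autovacuum launcher, appears simply as postgres.exe in Task
Manager):

```sql
-- Is the autovacuum launcher process alive?
SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type = 'autovacuum launcher';

-- Are any autovacuum workers currently running?
SELECT pid, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';
```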

The settings look ok as far as I can tell.

SELECT name, setting FROM pg_settings WHERE name='autovacuum';
name   | setting
autovacuum | on

SELECT name, setting FROM pg_settings WHERE name='track_counts';
name | setting
track_counts | on

SELECT relname, reloptions FROM pg_class;
These are not turned OFF, and I assume the default is ON.
 relname               | reloptions
-----------------------+------------
 pg_statistic          | null
 pg_type               | null
 Activity Codes        | null
 Activity Codes_ID_seq | null
 ..

SELECT * from pg_settings where category like 'Autovacuum';

autovacuum = on  (sighup, bool; default on)
    Starts the autovacuum subprocess.
autovacuum_analyze_scale_factor = 0.1  (sighup, real, range 0..100; default 0.1)
    Number of tuple inserts, updates, or deletes prior to analyze as a
    fraction of reltuples.
autovacuum_analyze_threshold = 50  (sighup, integer, range 0..2.15E+09; default 50)
    Minimum number of tuple inserts, updates, or deletes prior to analyze.
autovacuum_freeze_max_age = 2E+08  (postmaster, integer, range 1E+08..2E+09; default 2E+08)
    Age at which to autovacuum a table to prevent transaction ID wraparound.
autovacuum_max_workers = 3  (postmaster, integer, range 1..8388607; default 3)
    Sets the maximum number of simultaneously running autovacuum worker
    processes.
autovacuum_multixact_freeze_max_age = 4E+08  (postmaster, integer, range 1000..2E+09; default 4E+08)
    Multixact age at which to autovacuum a table to prevent multixact
    wraparound.
autovacuum_naptime = 60 s  (sighup, integer, range 1..2147483; default 60)
    Time to sleep between autovacuum runs.
autovacuum_vacuum_cost_delay = 20 ms  (sighup, integer, range -1..100; default 20)
    Vacuum cost delay in milliseconds, for autovacuum.
autovacuum_vacuum_cost_limit = -1  (sighup, integer; default -1)
    Vacuum cost amount available before napping, for autovacuum.
autovacuum_vacuum_scale_factor = 0.2  (sighup, real, range 0..100; default 0.2)
    Number of tuple updates or deletes prior to vacuum as a fraction of
    reltuples.
autovacuum_vacuum_threshold = 50  (sighup, integer, range 0..2.15E+09; default 50)
    Minimum number of tuple updates or deletes prior to vacuum.
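
For what it's worth, the trigger condition those two vacuum settings encode
can be sketched like this (a hedged illustration of the documented formula,
not PostgreSQL's actual code; the helper name is made up):

```python
# Sketch of the autovacuum trigger formula: a table becomes a vacuum
# candidate once its dead tuples exceed
#   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
def autovacuum_due(dead_tuples, reltuples,
                   threshold=50, scale_factor=0.2):
    """Return True if autovacuum would consider this table due for vacuum."""
    return dead_tuples > threshold + scale_factor * reltuples

# With the defaults shown above, a 1,000,000-row table needs
# more than 200,050 dead tuples before autovacuum picks it up.
print(autovacuum_due(200_051, 1_000_000))  # True
print(autovacuum_due(200_050, 1_000_000))  # False
```

So on a mostly-insert or rarely-updated table it is normal for
last_autovacuum to stay null for a long time.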


Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread pf
On Mon, 20 Feb 2023 11:06:34 -0800 Adrian Klaver wrote:

>On 2/20/23 10:27, p...@pfortin.com wrote:
>> [Still a newbie; but learning fast...]
>> 
>> Hi,
>> 
>> A remote team member is helping out by dumping some of his tables via
>> pgAdmin4 on Windows.  My DB is on Linux.
>> 
>> The other day, I restored his first file with:
>>pg_restore --host "localhost" --port "5432" --username "postgres"
>>--no-password --dbname "myname" --create --clean --verbose "dumpfile"
>> 
>> when I saw this:
>> 
>>pg_restore: dropping DATABASE myname
>>Command was: DROP DATABASE myname;
>> 
>>pg_restore: error: could not execute query:
>>  ERROR:  cannot drop the currently open database
>> 
>> Digging into the pg_dump'ed files, I see:
>> 
>> CREATE DATABASE myname WITH TEMPLATE = template0 ENCODING = 'UTF8'
>> LOCALE_PROVIDER = libc LOCALE = 'English_United States.1252';
>> DROP DATABASE myname;
>> ^

I thought the --clean applied to the table being restored.  The man page
reads:

   -c
   --clean
 Clean (drop) database objects before recreating them. (Unless
 --if-exists is used, this might generate some harmless error
 messages, if any objects were not present in the destination
 database.)

so I took that to mean table; not the entire DB.

Notwithstanding the man page, my take is that the DROP DATABASE statement
needs to be eliminated at pg_dump creation by pgAdmin4.  Taking this to
that mailing list.

Thanks Tom & Adrian!

>> Was my 134 table[1] myname DB saved because it was open?  

Tom:  Yup.

>> If the dump file
>> contains the above statements, how can I be absolutely certain I won't
>> lose the DB?  

Tom:
Reading the manual is advisable.  --create --clean specifies exactly
that the target database is to be dropped and recreated.

regards, tom lane

>> I'm obviously quite paranoid now...  
>
>You will lose the database if you do as the docs specify for -C:
>
>"
>-C
>
>...
>
>When this option is used, the database named with -d is used only to 
>issue the initial DROP DATABASE and CREATE DATABASE commands. All data 
>is restored into the database name that appears in the archive.
>"
>
>
>It will then be recreated with whatever information is in "dumpfile". If 
>that is the same data or new data you want then you are fine. Otherwise 
>you will need to be more specific about what you are trying to achieve.
>
>
>> 
>> [1] 3 types of tables:  ~40%=8.5M rows; ~40%=33M rows; ~20%=varying sizes
>> 
>> Thanks,
>> Pierre
>> 
>>   
>




Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread Adrian Klaver

On 2/20/23 10:27, p...@pfortin.com wrote:

[Still a newbie; but learning fast...]

Hi,

A remote team member is helping out by dumping some of his tables via
pgAdmin4 on Windows.  My DB is on Linux.

The other day, I restored his first file with:
   pg_restore --host "localhost" --port "5432" --username "postgres"
   --no-password --dbname "myname" --create --clean --verbose "dumpfile"

when I saw this:

   pg_restore: dropping DATABASE myname
   Command was: DROP DATABASE myname;

   pg_restore: error: could not execute query:
 ERROR:  cannot drop the currently open database

Digging into the pg_dump'ed files, I see:

CREATE DATABASE myname WITH TEMPLATE = template0 ENCODING = 'UTF8'
LOCALE_PROVIDER = libc LOCALE = 'English_United States.1252';
DROP DATABASE myname;
^

Was my 134 table[1] myname DB saved because it was open?  If the dump file
contains the above statements, how can I be absolutely certain I won't
lose the DB?  I'm obviously quite paranoid now...


You will lose the database if you do as the docs specify for -C:

"
-C

...

When this option is used, the database named with -d is used only to 
issue the initial DROP DATABASE and CREATE DATABASE commands. All data 
is restored into the database name that appears in the archive.

"


It will then be recreated with whatever information is in "dumpfile". If 
that is the same data or new data you want then you are fine. Otherwise 
you will need to be more specific about what you are trying to achieve.
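
If the goal is to load his tables without any risk to the live database, one
cautious approach (a sketch only; the scratch database name is made up) is to
drop --clean/--create and restore into a throwaway database first:

```shell
# Sketch: restore into a fresh scratch database so the existing
# "myname" database is never subject to DROP DATABASE.
createdb --host localhost --port 5432 --username postgres myname_check
pg_restore --host localhost --port 5432 --username postgres \
  --no-password --dbname myname_check --no-owner --verbose dumpfile
```

Once the contents check out, the data can be moved or the real restore run
deliberately.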





[1] 3 types of tables:  ~40%=8.5M rows; ~40%=33M rows; ~20%=varying sizes

Thanks,
Pierre




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread Tom Lane
p...@pfortin.com writes:
> The other day, I restored his first file with:
>   pg_restore --host "localhost" --port "5432" --username "postgres"
>   --no-password --dbname "myname" --create --clean --verbose "dumpfile"

> Was my 134 table[1] myname DB saved because it was open?

Yup.

> If the dump file
> contains the above statements, how can I be absolutely certain I won't
> lose the DB?

Reading the manual is advisable.  --create --clean specifies exactly
that the target database is to be dropped and recreated.

regards, tom lane




pg_dump'ed file contains "DROP DATABASE"

2023-02-20 Thread pf
[Still a newbie; but learning fast...]

Hi,

A remote team member is helping out by dumping some of his tables via
pgAdmin4 on Windows.  My DB is on Linux.

The other day, I restored his first file with:
  pg_restore --host "localhost" --port "5432" --username "postgres"
  --no-password --dbname "myname" --create --clean --verbose "dumpfile"

when I saw this:

  pg_restore: dropping DATABASE myname
  Command was: DROP DATABASE myname;

  pg_restore: error: could not execute query: 
ERROR:  cannot drop the currently open database

Digging into the pg_dump'ed files, I see:

CREATE DATABASE myname WITH TEMPLATE = template0 ENCODING = 'UTF8'
LOCALE_PROVIDER = libc LOCALE = 'English_United States.1252'; 
DROP DATABASE myname;
^

Was my 134 table[1] myname DB saved because it was open?  If the dump file
contains the above statements, how can I be absolutely certain I won't
lose the DB?  I'm obviously quite paranoid now...

[1] 3 types of tables:  ~40%=8.5M rows; ~40%=33M rows; ~20%=varying sizes

Thanks,
Pierre 




Re: pg_reload_conf()

2023-02-20 Thread David G. Johnston
On Mon, Feb 20, 2023 at 9:57 AM Marc Millas  wrote:

>
> select pg_reload_conf();
> t
>
> So, I am perplexed: what is pg_reload_conf() doing or not doing?
>

It is sending a signal and stating success that said signal was sent.

David J.


Re: pg_reload_conf()

2023-02-20 Thread Adrian Klaver

On 2/20/23 08:57, Marc Millas wrote:
select  setting, boot_val, reset_val from pg_settings where 
name='log_connections';


# select  setting, boot_val, reset_val, context from pg_settings where 
name='log_connections';

 setting | boot_val | reset_val |  context
-+--+---+---
 on  | off  | on| superuser-backend


https://www.postgresql.org/docs/15/view-pg-settings.html

There are several possible values of context. In order of decreasing 
difficulty of changing the setting, they are:



...

superuser-backend

Changes to these settings can be made in postgresql.conf without 
restarting the server. They can also be set for a particular session in 
the connection request packet (for example, via libpq's PGOPTIONS 
environment variable), but only if the connecting user is a superuser or 
has been granted the appropriate SET privilege. However, these settings 
never change in a session after it is started. If you change them in 
postgresql.conf, send a SIGHUP signal to the postmaster to cause it to 
re-read postgresql.conf. The new values will only affect 
subsequently-launched sessions.


...
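
A way to cross-check this from SQL after the reload is the pg_file_settings
view (a sketch; it shows what the configuration files contain, independently
of the current session's value):

```sql
-- Shows the value read from postgresql.conf/postgresql.auto.conf for this
-- GUC, and whether a reload could apply it.
SELECT name, setting, applied, error
FROM pg_file_settings
WHERE name = 'log_connections';
```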
--
Adrian Klaver
adrian.kla...@aklaver.com





pg_reload_conf()

2023-02-20 Thread Marc Millas
Hi,



maybe I am missing something...

with superuser rights, on a postgres 14 and postgres 15 version:
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off
alter system set log_connections to 'on';
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off
... strange
select pg_reload_conf();
t
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off

very strange.

cat postgresql.auto.conf
log_connections = 'on'

obviously, if I stop and restart the DB,
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
on off on

So, I am perplexed: what is pg_reload_conf() doing or not doing?

thanks


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

2023-02-20 Thread Kirk Wolak
On Sun, Feb 19, 2023 at 4:18 PM Tom Lane  wrote:

> Kirk Wolak  writes:
> >   I have some converted code that uses this syntax.
>
> Seems kinda dumb, but ...
>
> >   The solution is to remove the ORDER BY NULL.  [since that is not
> > sortable, should it be ignored?]
> >   This does NOT SHOW UP with 1 million rows.
>
> I don't see it at all.  Comparing your two test queries on released
> branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x.
> (In HEAD there's only about 13% penalty.)  I wonder what PG version
> you are testing.
>
> regards, tom lane
>
Tom,
  I put V14.5 in the subject line (I could have made it clearer).
  As Pavel confirmed, this is already addressed in newer versions.

  Also, would it make sense to have EXPLAIN output the Postgres version?
I think that might be useful, because it tends to be the common next
question.

Thanks,

Kirk Out!
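
For anyone wanting to reproduce the comparison on their own version, a
minimal sketch (table name and row count are arbitrary):

```sql
CREATE TEMP TABLE t AS SELECT g AS id FROM generate_series(1, 1000000) g;

EXPLAIN ANALYZE SELECT ROW_NUMBER() OVER ()              FROM t;
EXPLAIN ANALYZE SELECT ROW_NUMBER() OVER (ORDER BY NULL) FROM t;
```

On 14.5 the second form was dramatically slower here; on newer branches the
gap is reported to be much smaller.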


Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

2023-02-20 Thread Tom Lane
David Rowley  writes:
> On Mon, 20 Feb 2023 at 13:17, Tom Lane  wrote:
>> We could teach createplan.c to generate a WindowAgg plan node
>> that omits the useless column from ordNumCols/ordColIdx/etc, but I'm not
>> sure if that'd save much in itself.

> I wonder what the comment had in mind when it said it doesn't seem
> worth it.  Doing an if (IsA(tle->expr, Const)) continue; seems pretty
> simple and low-cost.

The right way to do this would be to pass forward information
about which window sort columns had been deemed redundant during
make_pathkeys_for_sortclauses: there are a lot more ways for that
to happen than "it's a constant".

The recent addition of make_pathkeys_for_sortclauses_extended would
make it pretty easy for make_pathkeys_for_window to obtain a reduced list
of SortGroupClauses, and then in theory you could match that against the
original lists to identify which sort columns are redundant (though
repeated keys might make that tricky; might be better to provide another
more-direct output representation).  Pass the information forward in
WindowAggPath, et voila.

I remain doubtful that it's worth the trouble though.

> I've not looked at what the comment mentions
> about RANGE OFFSET.  Assuming we'd need to not remove any ORDER BY
> clauses when the WindowClause is doing that.

There's some Asserts in nodeWindowAgg.c about ordNumCols being
positive, probably related.

regards, tom lane




Re: can't get psql authentication against Active Directory working

2023-02-20 Thread Stephen Frost
Greetings,

* Tomas Pospisek (t...@sourcepole.ch) wrote:
> so I'm trying to authenticate psql (on Windows) -> postgres (on Linux) via
> Active Directory.

Looks like you're trying to do a bit more than that as you're using
GSSAPI (not SSPI, which would be the more typical method on Windows..)
and you're asking for an encrypted connection.  Is there a reason you're
going down this particular route..?

> psql (Linux) -> postgres (Linux) with authentication against Active
> Directory does work.

That's good.  One thing to know about Linux vs. Windows when it comes to
Kerberos (and GSSAPI/SSPI) is that the default service name is different
between them.  See the libpq parameter krbsrvname documentation here:

https://www.postgresql.org/docs/current/libpq-connect.html
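
If the service name is the mismatch, the hedged tweak would be a krbsrvname
line in the service file (values below mirror the poster's file; the
krbsrvname value is an assumption about how the SPN is registered in AD):

```ini
[the_db]
host=dbserver.example.lan
port=5432
user=u...@example.lan
gssencmode=require
# libpq's GSSAPI default service name is lowercase "postgres"; Windows/SSPI
# conventions often use "POSTGRES", so the case must match the AD SPN.
krbsrvname=postgres
```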

> However the same with psql.exe on Windows does not. I get:
> 
> D:\>C:\OSGeo4W\bin\psql.exe service=the_db
> psql: error: connection to server at "dbserver.example.lan
> (192.168.4.104), port 5432 failed: could not initiate GSSAPI
> security context: No credentials were supplied, or the credentials
> were unavailable or inaccessible: Internal credentials cache error

That's a bit of an odd message to get if krbsrvname is the issue,
though.

> psql.exe from the OSGeo4W QGIS Installer *does* include GSS support. (I have
> tried with a different psql.exe without GSS support and it would tell me
> that it does not support GSS).

Providing the specific messages would possibly be helpful here..

> The .pg_service.conf file in the users $HOME directory looks like this:
> 
> [the_db]
> host=dbserver.example.lan
> port=5432
> user=u...@example.lan
> gssencmode=require
> 
> This same pg_service.conf does work for psql (Linux).
> 
> Also getting a Kerberos ticket for the service on Windows does work:
> 
> D:\> klist get postgres/dbserver.example@example.lan
> [...]
> This will list the ticket

Is this using MIT klist or Windows klist though?

> But when using psql.exe it will not get a ticket for the service nor will it
> apparently use the existing service ticket.

As mentioned above, GSSAPI and SSPI aren't the same thing and what I
suspect you're seeing here is that the Windows klist is using SSPI but
the psql you have is built with GSSAPI.  There used to be a Kerberos For
Windows utility that would make the bridge between those two, as I
recall, put out by MIT but I'm not sure what the current state of it is.
It hasn't been needed for a very long time, as SSPI generally works fine,
though we don't yet have SSPI support for encryption.

> I have tried to trace psql.exe with Windows' Process Monitor and I can't see
> it accessing any keytab file (I'm not sure whether a keytab file exists at
> all under Windows, or whether psql.exe instead needs to access some Windows
> service). I see that psql.exe will open and close a TCP connection to
> dbserver.example.lan, however as far as I can see that connection is
> completely irrelevant for the acquisition of a Kerberos ticket for the
> service since that is a business purely between psql.exe and Active
> Directory or respectively between psql.exe and the credentials cache. And
> there is no other TCP connection being opened to anywhere from psql.exe.
> 
> What I find suspicious about the error above is "... the credentials were
> unavailable or inaccessible: Internal credentials cache error", since that
> looks like either psql.exe can't access the (inexisting) keytab file, or it
> can't access Window's Kerberos service.

psql wouldn't be accessing an actual keytab, it would be trying to
access a credential cache and it is failing on that, though if it was
just non-existent, I'd expect an error along those lines.  Accessing the
Windows kerberos service would require either using SSPI, which psql can
be built to do, or using a bridge tool like KfW as mentioned above.

> Also, I see that psql.exe is trying to access a ccapiserver.exe which does
> not exist. Should psql.exe be able to access that ccapiserver.exe file? That
> means is the OSGeo4W QGIS installer, that also installs all things necessary
> for psql missing that ccapiserver.exe executable?

That's part of KfW and it's the Kerberos libraries, not directly psql,
that's trying to access it.  If you want to get this working, probably
the next step would be to look into KfW.

> * has anybody ever succeeded in authenticating with psql.exe against Active
> Directory?

Yes, many times, though more typically using SSPI support in psql
instead of GSSAPI and using TLS/SSL for encryption.

> * can you maybe tell me what's wrong from the error message above?
> * how can I proceed from here?

Either switch to using SSPI and TLS/SSL, or you could give KfW a try.
If you'd like encryption support with SSPI, that's a bit of a project.

> PS: Any way to make GSS more talkative? At this moment all that I can get as
> logs is the above "computer says no".

Well, you're hitting the same issue over and over, it looks li

Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

2023-02-20 Thread David Rowley
On Mon, 20 Feb 2023 at 13:17, Tom Lane  wrote:
> I suspect most of the remaining performance discrepancy is just triggered
> by having to pass the extra always-NULL column forward through the various
> plan steps.  We could teach createplan.c to generate a WindowAgg plan node
> that omits the useless column from ordNumCols/ordColIdx/etc, but I'm not
> sure if that'd save much in itself.  (The comment in create_windowagg_plan
> shows we already thought about that and decided it wasn't worth the
> trouble.)

I wonder what the comment had in mind when it said it doesn't seem
worth it.  Doing an if (IsA(tle->expr, Const)) continue; seems pretty
simple and low-cost.  I've not looked at what the comment mentions
about RANGE OFFSET.  Assuming we'd need to not remove any ORDER BY
clauses when the WindowClause is doing that.

> Getting rid of the useless targetlist column altogether would
> be way more invasive, and I'm not inclined to try.

Yeah, that would likely add more complexity than it would be worth.

David