Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread David G. Johnston
On Sunday, January 15, 2023, jian he  wrote:

>
>
> In the following example, I cannot see any bloat (via the pgstattuple
> extension: dead_tuple_count > 0). I am wondering where the bloat is.
>
> -- do the update.
> update tbt set a = 10 + a where a < 20;
> REFRESH MATERIALIZED VIEW tbtmv;
> SELECT * FROM pgstattuple('tbtmv'); -- no dead tuples counted.
>

You didn’t specify concurrently so the merge method you quoted is not being
used.
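For reference, a minimal sketch of the two refresh paths (the unique index name here is illustrative; CONCURRENTLY requires some unique index on the matview):

```sql
-- Plain refresh: builds a brand-new heap and swaps it in, so
-- pgstattuple reports zero dead tuples on the matview afterward.
REFRESH MATERIALIZED VIEW tbtmv;

-- Concurrent refresh: needs a unique index on the matview, then
-- performs the set-based DELETE/INSERT merge described in the docs;
-- this is the path that can leave dead tuples (bloat) behind.
CREATE UNIQUE INDEX tbtmv_a_idx ON tbtmv (a);
REFRESH MATERIALIZED VIEW CONCURRENTLY tbtmv;
```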

David J.


Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread jian he
On Mon, Jan 16, 2023 at 10:28 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sunday, January 15, 2023, jian he  wrote:
>
>>
>>
>> Hi,
>> Why does the materialized view itself bloat? If there is no refresh, then
>> no bloat, right? If it is refreshed, will the set-based delete operation
>> make the materialized view bloat?
>> I also found the same question online.
>> https://dba.stackexchange.com/questions/219079/bloat-on-materialized-views
>> Unfortunately nobody answered...
>>
>>
> The definition of bloat is a deleted row.  Bloat can be reduced by
> subsequent row insertions.
>
> David J.
>
>
Hi.
In the following example, I cannot see any bloat (via the pgstattuple
extension: dead_tuple_count > 0). I am wondering where the bloat is.

BEGIN;
create table tbt (a int) with (fillfactor=40, autovacuum_enabled=off);
insert into tbt  select g from generate_series(1,2000) g;
create materialized view tbtmv as select * from tbt;
commit;

-- do the update.
update tbt set a = 10 + a where a < 20;
REFRESH MATERIALIZED VIEW tbtmv;
SELECT * FROM pgstattuple('tbtmv'); -- no dead tuples counted.

--try delete.
delete  from tbt  where a < 50;
REFRESH MATERIALIZED view tbtmv;
SELECT * FROM pgstattuple('tbtmv'); -- still no dead tuples.
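For what it's worth, in this example the dead tuples should be visible on the base table rather than on the matview: autovacuum is disabled on tbt, while the non-concurrent refresh rewrites tbtmv from scratch. A sketch, assuming the pgstattuple extension is installed:

```sql
-- The UPDATE/DELETE above left their dead tuples in the base table:
SELECT dead_tuple_count FROM pgstattuple('tbt');   -- should be > 0

-- The matview was rewritten wholesale by REFRESH, so it shows none:
SELECT dead_tuple_count FROM pgstattuple('tbtmv'); -- 0
```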


Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread David G. Johnston
On Sunday, January 15, 2023, jian he  wrote:

>
>
> Hi,
> Why does the materialized view itself bloat? If there is no refresh, then
> no bloat, right? If it is refreshed, will the set-based delete operation
> make the materialized view bloat?
> I also found the same question online. https://dba.stackexchange.com/
> questions/219079/bloat-on-materialized-views Unfortunately nobody
> answered...
>
>
The definition of bloat is a deleted row.  Bloat can be reduced by
subsequent row insertions.
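A minimal sketch of that on an ordinary table (the table name is illustrative; assumes the pgstattuple extension is available):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;
CREATE TABLE bloat_demo (a int) WITH (autovacuum_enabled = off);
INSERT INTO bloat_demo SELECT g FROM generate_series(1, 1000) g;

DELETE FROM bloat_demo WHERE a <= 500;
SELECT dead_tuple_count FROM pgstattuple('bloat_demo');  -- deleted rows = bloat

-- Once the dead rows are pruned or vacuumed away, new insertions can
-- reuse the freed space instead of growing the file:
VACUUM bloat_demo;
INSERT INTO bloat_demo SELECT g FROM generate_series(1, 500) g;
SELECT table_len, dead_tuple_count FROM pgstattuple('bloat_demo');
```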

David J.


Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread jian he
On Sat, Jan 14, 2023 at 11:49 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Friday, January 13, 2023, jian he  wrote:
>
>>
>> Once we have the diff table, we perform set-based DELETE and INSERT
>>> operations against the materialized view, and discard both temporary
>>> tables.
>>>
>>
>> Here, the temporary tables being "discarded" means the temporary tables are
>> dropped and their space is reclaimed immediately?
>> Or are the temporary tables dropped and the space reclaimed by
>> another mechanism?
>>
>> To simplify: does refreshing a materialized view make the database bloat?
>>
>>
> The materialized view itself bloats.  The temp tables are removed
> immediately (the “drop table” docs don’t say this explicitly, though it is
> the most logical behavior and is implied by the fact that it takes an
> access exclusive lock).
>
> David J.
>
>
Hi,
Why does the materialized view itself bloat? If there is no refresh, then no
bloat, right? If it is refreshed, will the set-based delete operation make
the materialized view bloat?
I also found the same question online.
https://dba.stackexchange.com/questions/219079/bloat-on-materialized-views
Unfortunately nobody answered...


Re: pg_upgrade 13.6 to 15.1? [Solved: what can go wrong, will...]

2023-01-15 Thread pf
On Mon, 16 Jan 2023 09:16:27 +1100 Gavan Schneider wrote:

>On 16 Jan 2023, at 8:59, p...@pfortin.com wrote:
>
>> encodings for database "template1" do not match: old "UTF8", new
>> "SQL_ASCII" Failure, exiting
>>  
>Suggest the old DB using UTF8 is the better practice, and the new DB should
>do likewise.

I was surprised to see pg_upgrade even suggest that... 

>> "template1" is not a DB I've ever messed with; so this will require that
>> I fire up the old version and change the encoding somehow?
>>  
>This is created at initdb and mostly you don’t need/want to mess with it

I should have checked "initdb --help" first; but now I've used:
  initdb -E UTF8 /mnt/work/var/lib/pgsql/data

and got burned (2nd time) by one file with root:root ownership; fixed
with:
  chown postgres:postgres /mnt/work/var/lib/pgsql/data13/base/24597/35874

I have no clue how a single file would have root ownership; but found
this a few days ago, and forgot to fix it in both copies of the DB.  Of
course, this put me on a new path of [recoverable] disaster...  ;/
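In hindsight, a pre-flight ownership check might have caught the stray root-owned file before pg_upgrade passed the point of no return. A sketch, using the same paths as above:

```shell
# List anything in the old data directory not owned by postgres
find /mnt/work/var/lib/pgsql/data13 ! -user postgres -ls

# If anything turns up, fix it in one pass (run as root)
chown -R postgres:postgres /mnt/work/var/lib/pgsql/data13
```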

pg_upgrade aborted on it; but only after getting to a point of "no
return". The documentation alludes to checking everything before
proceeding; but it's the story of my life to find the unexpected...

[postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin
-d /mnt/work/var/lib/pgsql/data13 -D /mnt/work/var/lib/pgsql/data --link
-U postgres 
Performing Consistency Checks -
[snip]
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
--
[snip]
Adding ".old" suffix to old global/pg_control   ok

If you want to start the old cluster, you will need to remove
the ".old" suffix from
/mnt/work/var/lib/pgsql/data13/global/pg_control.old. Because "link" mode
was used, the old cluster cannot be safely started once the new cluster
has been started.

Linking user relation files 
  /mnt/work/var/lib/pgsql/data13/base/24597/35874   
error while creating link for relation "public.vr_snapshot_2022_01_01"
("/mnt/work/var/lib/pgsql/data13/base/24597/35874" to
"/mnt/work/var/lib/pgsql/data/base/24597/35874"): Operation not permitted
Failure, exiting

OK...  starting over...   
  rm -rf data
  initdb -E UTF8 /mnt/work/var/lib/pgsql/data
  /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin -d
  /mnt/work/var/lib/pgsql/data13 -D /mnt/work/var/lib/pgsql/data --link -U
  postgres 
Performing Consistency Checks -
Checking cluster versions   ok
pg_controldata: fatal: could not open file
"/mnt/work/var/lib/pgsql/data13/global/pg_control" for reading: No such
file or directory

The source cluster lacks cluster state information:
Failure, exiting

Sigh...  this should "fix" it:
  mv /mnt/work/var/lib/pgsql/data13/global/pg_control.old
  /mnt/work/var/lib/pgsql/data13/global/pg_control

Yup...  Success!!  Upgrade done.  Again, sorry for the noise; but hope
the above helps with other issues that can go wrong during an upgrade...

Thanks Tom, Gavan, et al!!

Pierre


>> Is this likely to repeat for my actual databases?
>>  
>AFAICT the least work option is to redo the initdb for the new v15.1 database. 
>There is a lot of pain (and potential data corruption) to be had trying to 
>reconfigure the old one before it can be moved.
>
>Personally, UTF8 is the way to go. It will handle everything in the old 
>database and whatever the future brings to the new one. I can see no advantage 
>in pure ASCII when there is the potential for the real world to be 
>contributing text. And there could well be non-ASCII characters lurking in 
>the old DB, especially since someone set it up to receive them.
>
>Regards
>
>Gavan Schneider
>——
>Gavan Schneider, Sodwalls, NSW, Australia
>Explanations exist; they have existed for all time; there is always a 
>well-known solution to every human problem — neat, plausible, and wrong.
>— H. L. Mencken, 1920
>
>
>




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Tom Lane
p...@pfortin.com writes:
> encodings for database "template1" do not match:  old "UTF8", new
> "SQL_ASCII" Failure, exiting

So you need to do the initdb under the same locale setting you
used for the old DB.  Looking into its LC_XXX settings should
refresh your memory on what that was.
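One way to recover those settings from the old cluster (a sketch; run this against the still-working 13.x server, or use pg_controldata on its data directory):

```sql
-- Per-database encoding and locale in the old cluster:
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database;
```

initdb can then be given matching -E and --locale options.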

regards, tom lane




Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

2023-01-15 Thread Ron
We regularly use "history" databases.  Put them on slow media, and only take 
a backup when data is added to them (monthly, quarterly, etc).


On 1/15/23 15:57, HECTOR INGERTO wrote:


> But you cannot and should not rely on snapshots alone

That’s only for non-atomic (multiple-pool) snapshots, isn’t it?

If I need to rely only on ZFS (automated) snapshots, then the best option 
would be to have two DBs, each in its own pool: one HDD DB and one SSD DB. 
Then the backend code would have to know which DB holds the requested data.


*De: *Magnus Hagander 
*Enviado: *domingo, 15 de enero de 2023 20:36
*Para: *HECTOR INGERTO 
*CC: *pgsql-gene...@postgresql.org
*Asunto: *Re: Are ZFS snapshots unsafe when PGSQL is spreading through 
multiple zpools?


On Sun, Jan 15, 2023 at 8:18 PM HECTOR INGERTO  wrote:

Hello everybody,

I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup +
hotspare method.

From man zfs-snapshot: “Snapshots are taken atomically, so that all
snapshots correspond to the same moment in time.” So if a PSQL
instance is started from a zfs snapshot, it will start to replay the
WAL from the last checkpoint, in the same way it would do in a crash
or power loss scenario. So from my knowledge, ZFS snapshots can be
used to rollback to a previous point in time. Also, sending those
snapshots to other computers will allow you to have hotspares and
remote backups. If I’m wrong here, I would appreciate being told about
it because I’m basing the whole question on this premise.

On the other hand, we have the tablespace PGSQL feature, which is
great because it allows “unimportant” big data to be written into
cheap HDD and frequently used data into fast NVMe.

So far, so good. The problem is when both ideas are merged. Then,
snapshots from different pools are NOT atomical, snapshot on the HDD
pool isn’t going to be done at the same exact time as the one on the
SSD pool, and I don’t know enough about PGSQL internals to know how
dangerous this is. So here is where I would like to ask for your help
with the following questions:

First of all, what kind of problem can this lead to? Are we talking
about potential whole DB corruption or only the loss of a few of the
latest transactions?

Silent data corruption. *not* just losing your latest transaction.

In second place, if I’m initializing a corrupted PGSQL instance
because ZFS snapshots are from different pools and slightly different
times, am I going to notice it somehow or is it going to fail silently?

Silent. You might notice at the application level. Might.

In third and last place, is there some way to quantify the amount of
risk taken when snapshotting a PGSQL instance spread across two (or
more) different pools?

"Don't do it".

If you can't get atomic snapshots, don't do it, period.

You can use them together with a regular online backup. That is 
pg_start_backup() //  // pg_stop_backup() 
together with log archiving. That's a perfectly valid method. But you 
cannot and should not rely on snapshots alone.


--

 Magnus Hagander
 Me: https://www.hagander.net/ 

 Work: https://www.redpill-linpro.com/ 





--
Born in Arizona, moved to Babylonia.

Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Gavan Schneider
On 16 Jan 2023, at 8:59, p...@pfortin.com wrote:

> encodings for database "template1" do not match: old "UTF8", new
> "SQL_ASCII" Failure, exiting
>
Suggest the old DB using UTF8 is the better practice, and the new DB should do 
likewise.

> "template1" is not a DB I've ever messed with; so this will require that
> I fire up the old version and change the encoding somehow?
>
This is created at initdb and mostly you don’t need/want to mess with it

> Is this likely to repeat for my actual databases?
>
AFAICT the least work option is to redo the initdb for the new v15.1 database. 
There is a lot of pain (and potential data corruption) to be had trying to 
reconfigure the old one before it can be moved.

Personally, UTF8 is the way to go. It will handle everything in the old 
database and whatever the future brings to the new one. I can see no advantage 
in pure ASCII when there is the potential for the real world to be 
contributing text. And there could well be non-ASCII characters lurking in the 
old DB, especially since someone set it up to receive them.

Regards

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Alan Hodgson
On Sun, 2023-01-15 at 16:59 -0500, p...@pfortin.com wrote:
> 
> 
> encodings for database "template1" do not match:  old "UTF8", new
> "SQL_ASCII" Failure, exiting

You almost certainly don't want your new database to use SQL_ASCII.
Init the new cluster with -E UTF8.
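A sketch of the corresponding invocation (the locale value here is illustrative; match it to what the old cluster reports):

```shell
initdb -E UTF8 --locale=en_US.UTF-8 /mnt/work/var/lib/pgsql/data
```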



RE: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

2023-01-15 Thread HECTOR INGERTO

> But you cannot and should not rely on snapshots alone

That’s only for non-atomic (multiple-pool) snapshots, isn’t it?

If I need to rely only on ZFS (automated) snapshots, then the best option would 
be to have two DBs, each in its own pool: one HDD DB and one SSD DB. Then the 
backend code would have to know which DB holds the requested data.

De: Magnus Hagander
Enviado: domingo, 15 de enero de 2023 20:36
Para: HECTOR INGERTO
CC: pgsql-gene...@postgresql.org
Asunto: Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple 
zpools?



On Sun, Jan 15, 2023 at 8:18 PM HECTOR INGERTO 
mailto:hector_...@hotmail.com>> wrote:
Hello everybody,

I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup + hotspare 
method.

From man zfs-snapshot: “Snapshots are taken atomically, so that all snapshots 
correspond to the same moment in time.” So if a PSQL instance is started from 
a zfs snapshot, it will start to replay the WAL from the last checkpoint, in 
the same way it would do in a crash or power loss scenario. So from my 
knowledge, ZFS snapshots can be used to rollback to a previous point in time. 
Also, sending those snapshots to other computers will allow you to have 
hotspares and remote backups. If I’m wrong here, I would appreciate being told 
about it because I’m basing the whole question on this premise.

On the other hand, we have the tablespace PGSQL feature, which is great because 
it allows “unimportant” big data to be written into cheap HDD and frequently 
used data into fast NVMe.

So far, so good. The problem is when both ideas are merged. Then, snapshots 
from different pools are NOT atomical, snapshot on the HDD pool isn’t going to 
be done at the same exact time as the one on the SSD pool, and I don’t know 
enough about PGSQL internals to know how dangerous this is. So here is where I 
would like to ask for your help with the following questions:

First of all, what kind of problem can this lead to? Are we talking about 
potential whole DB corruption or only the loss of a few of the latest 
transactions?

Silent data corruption. *not* just losing your latest transaction.


In second place, if I’m initializing a corrupted PGSQL instance because ZFS 
snapshots are from different pools and slightly different times, am I going to 
notice it somehow or is it going to fail silently?

Silent. You might notice at the application level. Might.


In third and last place, is there some way to quantify the amount of risk taken 
when snapshotting a PGSQL instance spread across two (or more) different pools?


"Don't do it".

If you can't get atomic snapshots, don't do it, period.

You can use them together with a regular online backup. That is 
pg_start_backup() //  // pg_stop_backup() together 
with log archiving. That's a perfectly valid method. But you cannot and should 
not rely on snapshots alone.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

2023-01-15 Thread Magnus Hagander
On Sun, Jan 15, 2023 at 10:57 PM HECTOR INGERTO 
wrote:

>
>
> > But you cannot and should not rely on snapshots alone
>
>
>
> That’s only for non-atomic (multiple-pool) snapshots, isn’t it?
>

Right. For single-filesystem installs it should be fine. Just make sure it
has both the data and the WAL directories in the same one.
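A quick sanity check that everything really sits on one filesystem (a sketch; the paths are illustrative):

```shell
# The data directory and pg_wal should report the same filesystem
df /var/lib/pgsql/data /var/lib/pgsql/data/pg_wal

# Any symlinks here are tablespaces on other filesystems --
# those would make a single ZFS snapshot non-atomic
ls -l /var/lib/pgsql/data/pg_tblspc
```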




> If I need to rely only on ZFS (automated) snapshots, then the best option
> would be to have two DBs, each in its own pool: one HDD DB and one SSD DB.
> Then the backend code would have to know which DB holds the requested data.
>

 You could. I wouldn't -- I would set it up to use proper backups instead,
maybe using snapshots as the infrastructure. That gives you other
advantages as well, like being able to do PITR. It's a little more to set
up, but I'd say it's worth it.

//Magnus


Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
On Sun, 15 Jan 2023 16:38:08 -0500 p...@pfortin.com wrote:

>On Sun, 15 Jan 2023 15:59:20 -0500 Tom Lane wrote:
>
>>p...@pfortin.com writes:  
>>> On Sun, 15 Jan 2023 14:47:35 -0500 Tom Lane wrote:
 I think you misunderstand how this is supposed to work.  The -D
 argument should point at an *empty* data directory that has been
 freshly initialized with the new version's initdb.  pg_upgrade then
 transfers data into that from the old database (-d argument).
>>  
>>> I was hoping to avoid the hours worth of copying to the NVMe SSD. 
>>> The instructions refer to upgrading with --link; would that save the copy
>>> time?
>>
>>Yes, but to use --link you must have both data directories on the
>>same filesystem, so this is still the wrong thing.
>>
>>Try something like  
>
>My understanding:
>>mv /mnt/work/var/lib/pgsql/data /mnt/work/var/lib/pgsql/data13  
>- renames the DB
>
>>initdb /mnt/work/var/lib/pgsql/data  
>- creates new DB
>
>>pg_upgrade ... -d /mnt/work/var/lib/pgsql/data13 -D 
>> /mnt/work/var/lib/pgsql/data --link ...  
>- if this only creates hard links; then this should do what I want.  
>  My big concern was due to the DB being about 65% of /mnt/work; so doing
>  it on the same file system absolutely requires hard links vs copying...
>
>Looks like this is what I was trying to be certain of...   Thanks!!
>Pierre

Sigh...  I thought all was good...  This was not expected and is not
discussed in the pg_upgrade instructions:

[postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin
-d /mnt/work/var/lib/pgsql/data13 -D /mnt/work/var/lib/pgsql/data --link
-U postgres Performing Consistency Checks -
Checking cluster versions   ok
Checking database user is the install user  ok
Checking database connection settings   ok
Checking for prepared transactions  ok
Checking for system-defined composite types in user tables  ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for user-defined encoding conversions  ok
Checking for user-defined postfix operators ok
Checking for incompatible polymorphic functions ok
Creating dump of global objects ok
Creating dump of database schemas   
ok

encodings for database "template1" do not match:  old "UTF8", new
"SQL_ASCII" Failure, exiting

"template1" is not a DB I've ever messed with; so this will require that
I fire up the old version and change the encoding somehow?  

Is this likely to repeat for my actual databases?  

Sorry if this is noise...

>
>>  regards, tom lane
>>
>>
>>  
>
>




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
On Sun, 15 Jan 2023 15:59:20 -0500 Tom Lane wrote:

>p...@pfortin.com writes:
>> On Sun, 15 Jan 2023 14:47:35 -0500 Tom Lane wrote:  
>>> I think you misunderstand how this is supposed to work.  The -D
>>> argument should point at an *empty* data directory that has been
>>> freshly initialized with the new version's initdb.  pg_upgrade then
>>> transfers data into that from the old database (-d argument).  
>
>> I was hoping to avoid the hours worth of copying to the NVMe SSD. 
>> The instructions refer to upgrading with --link; would that save the copy
>> time?  
>
>Yes, but to use --link you must have both data directories on the
>same filesystem, so this is still the wrong thing.
>
>Try something like

My understanding:
>mv /mnt/work/var/lib/pgsql/data /mnt/work/var/lib/pgsql/data13
- renames the DB

>initdb /mnt/work/var/lib/pgsql/data
- creates new DB

>pg_upgrade ... -d /mnt/work/var/lib/pgsql/data13 -D 
> /mnt/work/var/lib/pgsql/data --link ...
- if this only creates hard links; then this should do what I want.  
  My big concern was due to the DB being about 65% of /mnt/work; so doing
  it on the same file system absolutely requires hard links vs copying...

Looks like this is what I was trying to be certain of...   Thanks!!
Pierre

>   regards, tom lane
>
>
>




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
On Sun, 15 Jan 2023 13:00:58 -0800 Adrian Klaver wrote:

>On 1/15/23 12:41, p...@pfortin.com wrote:
>> On Sun, 15 Jan 2023 12:23:10 -0800 Adrian Klaver wrote:
>>   
>>> On 1/15/23 11:27, p...@pfortin.com wrote:  
 Hi,

 I'm fairly new to postgres; but have databases with about 2TB of data.

 Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
 [postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
 -d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
 -s /tmp -U postgres  
>>>  
 Nothing I read implies the need to upgrade to 14.x first...  Right?  
>>>
>>> In addition to Tom Lane's comments I would recommend reading this:
>>>
>>> https://www.postgresql.org/docs/current/pgupgrade.html
>>>
>>> multiple times. There is a lot going on there and it will take a couple
>>> of reads at least to begin to understand it all.  
>> 
>> Yup...  that's what I've been working from...  See my reply to Tom re
>> --link...  
>
>1) Working from and understanding are two different things. For instance 
>further on in the docs there is:
>
>--clone
>
> Use efficient file cloning (also known as “reflinks” on some 
>systems) instead of copying files to the new cluster. This can result in 
>near-instantaneous copying of the data files, giving the speed 
>advantages of -k/--link while leaving the old cluster untouched.
>
> File cloning is only supported on some operating systems and file 
>systems. If it is selected but not supported, the pg_upgrade run will 
>error. At present, it is supported on Linux (kernel 4.5 or later) with 
>Btrfs and XFS (on file systems created with reflink support), and on 
 ^

Unless there's an update to pg_upgrade that's still undocumented, this is
not an option, since like so many Linux users my file system is ext4.

>macOS with APFS.
>
>2) From the docs:
>
>Run pg_upgrade
>
>Always run the pg_upgrade binary of the new server, not the old one. 
>pg_upgrade requires the specification of the old and new cluster's data 
>and executable (bin) directories. You can also specify user and port 
>values, and whether you want the data files linked or cloned instead of 
>the default copy behavior.
>
>
>3) Again, read the docs multiple times there is a lot to understand.

Agreed. But they could be a little clearer...  :)

>> Thanks!
>>   

 Thanks
 Pierre


  
>>>  
>> 
>>   
>




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Tom Lane
Adrian Klaver  writes:
> --clone

I think --clone is probably contraindicated here, given that Pierre
already made a copy of the data.  If I understand how that works,
it'll just wind up making another whole copy, but in a time-extended
manner as the tables are modified.  Over the long run there would
still be two copies of the DB on the new disk, which doesn't seem
like what he wants.

regards, tom lane




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Adrian Klaver

On 1/15/23 12:41, p...@pfortin.com wrote:

On Sun, 15 Jan 2023 12:23:10 -0800 Adrian Klaver wrote:


On 1/15/23 11:27, p...@pfortin.com wrote:

Hi,

I'm fairly new to postgres; but have databases with about 2TB of data.

Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
[postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
-d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
-s /tmp -U postgres



Nothing I read implies the need to upgrade to 14.x first...  Right?


In addition to Tom Lane's comments I would recommend reading this:

https://www.postgresql.org/docs/current/pgupgrade.html

multiple times. There is a lot going on there and it will take a couple
of reads at least to begin to understand it all.


Yup...  that's what I've been working from...  See my reply to Tom re
--link...


1) Working from and understanding are two different things. For instance 
further on in the docs there is:


--clone

Use efficient file cloning (also known as “reflinks” on some 
systems) instead of copying files to the new cluster. This can result in 
near-instantaneous copying of the data files, giving the speed 
advantages of -k/--link while leaving the old cluster untouched.


File cloning is only supported on some operating systems and file 
systems. If it is selected but not supported, the pg_upgrade run will 
error. At present, it is supported on Linux (kernel 4.5 or later) with 
Btrfs and XFS (on file systems created with reflink support), and on 
macOS with APFS.


2) From the docs:

Run pg_upgrade

Always run the pg_upgrade binary of the new server, not the old one. 
pg_upgrade requires the specification of the old and new cluster's data 
and executable (bin) directories. You can also specify user and port 
values, and whether you want the data files linked or cloned instead of 
the default copy behavior.



3) Again, read the docs multiple times there is a lot to understand.


Thanks!



Thanks
Pierre


   







--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Tom Lane
p...@pfortin.com writes:
> On Sun, 15 Jan 2023 14:47:35 -0500 Tom Lane wrote:
>> I think you misunderstand how this is supposed to work.  The -D
>> argument should point at an *empty* data directory that has been
>> freshly initialized with the new version's initdb.  pg_upgrade then
>> transfers data into that from the old database (-d argument).

> I was hoping to avoid the hours worth of copying to the NVMe SSD. 
> The instructions refer to upgrading with --link; would that save the copy
> time?

Yes, but to use --link you must have both data directories on the
same filesystem, so this is still the wrong thing.

Try something like

mv /mnt/work/var/lib/pgsql/data /mnt/work/var/lib/pgsql/data13
initdb /mnt/work/var/lib/pgsql/data
pg_upgrade ... -d /mnt/work/var/lib/pgsql/data13 -D 
/mnt/work/var/lib/pgsql/data --link ...

regards, tom lane




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
On Sun, 15 Jan 2023 12:23:10 -0800 Adrian Klaver wrote:

>On 1/15/23 11:27, p...@pfortin.com wrote:
>> Hi,
>> 
>> I'm fairly new to postgres; but have databases with about 2TB of data.
>> 
>> Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
>> [postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
>>-d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
>>-s /tmp -U postgres  
>
>> Nothing I read implies the need to upgrade to 14.x first...  Right?  
>
>In addition to Tom Lane's comments I would recommend reading this:
>
>https://www.postgresql.org/docs/current/pgupgrade.html
>
>multiple times. There is a lot going on there and it will take a couple 
>of reads at least to begin to understand it all.

Yup...  that's what I've been working from...  See my reply to Tom re
--link...
Thanks!

>> 
>> Thanks
>> Pierre
>> 
>> 
>>   
>




Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
On Sun, 15 Jan 2023 14:47:35 -0500 Tom Lane wrote:

>p...@pfortin.com writes:
>> Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
>> [postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
>>   -d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
>>   -s /tmp -U postgres  
>
>> This utility can only upgrade to PostgreSQL version 15.   <=<<< ??  
>
>This indicates that it thinks the new data directory (-D) is the
>wrong version.  Perhaps the error message could be clearer about that.
>
>> Due to the database size, I have the working copy on a 4TB NVMe SSD
>> mounted at /mnt/work/var/lib/{pgadmin,pgsql} and an identical copy on an
>> 18TB platter at /mnt/db/var/lib/{pgadmin,pgsql}.  Both copies are
>> currently at 13.6.
>
>I think you misunderstand how this is supposed to work.  The -D
>argument should point at an *empty* data directory that has been
>freshly initialized with the new version's initdb.  pg_upgrade then
>transfers data into that from the old database (-d argument).

I was hoping to avoid the hours worth of copying to the NVMe SSD. 
The instructions refer to upgrading with --link; would that save the copy
time? I have an identical copy of the DB, so could recover if necessary
or just go the initdb route.  I wasn't clear on the exact syntax for
including --link... if that's an option I can use, do I eliminate one of
the -d or -D parameters?

Thanks!
>   regards, tom lane





Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Adrian Klaver

On 1/15/23 11:27, p...@pfortin.com wrote:

Hi,

I'm fairly new to postgres; but have databases with about 2TB of data.

Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
[postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
   -d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
   -s /tmp -U postgres



Nothing I read implies the need to upgrade to 14.x first...  Right?


In addition to Tom Lane's comments I would recommend reading this:

https://www.postgresql.org/docs/current/pgupgrade.html

multiple times. There is a lot going on there and it will take a couple 
of reads at least to begin to understand it all.




Thanks
Pierre





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_upgrade 13.6 to 15.1?

2023-01-15 Thread Tom Lane
p...@pfortin.com writes:
> Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
> [postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
>   -d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
>   -s /tmp -U postgres

> This utility can only upgrade to PostgreSQL version 15.   <=<<< ??

This indicates that it thinks the new data directory (-D) is the
wrong version.  Perhaps the error message could be clearer about that.

> Due to the database size, I have the working copy on a 4TB NVMe SSD
> mounted at /mnt/work/var/lib/{pgadmin,pgsql} and an identical copy on an
> 18TB platter at /mnt/db/var/lib/{pgadmin,pgsql}.  Both copies are
> currently at 13.6.  

I think you misunderstand how this is supposed to work.  The -D
argument should point at an *empty* data directory that has been
freshly initialized with the new version's initdb.  pg_upgrade then
transfers data into that from the old database (-d argument).

regards, tom lane




Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

2023-01-15 Thread Magnus Hagander
On Sun, Jan 15, 2023 at 8:18 PM HECTOR INGERTO 
wrote:

> Hello everybody,
>
>
>
> I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup +
> hotspare method.
>
>
>
> From man zfs-snapshot: “Snapshots are taken atomically, so that all
> snapshots correspond to the same moment in time.” So if a PSQL instance is
> started from a zfs snapshot, it will start to replay the WAL from the last
> checkpoint, in the same way it would do in a crash or power loss scenario. So
> from my knowledge, ZFS snapshots can be used to rollback to a previous
> point in time. Also, sending those snapshots to other computers will allow
> you to have hotspares and remote backups. If I’m wrong here, I would
> appreciate being told about it because I’m basing the whole question on
> this premise.
>
>
>
> On the other hand, we have the tablespace PGSQL feature, which is great
> because it allows “unimportant” big data to be written into cheap HDD and
> frequently used data into fast NVMe.
>
>
>
> So far, so good. The problem is when both ideas are merged. Then,
> snapshots from different pools are NOT atomic, snapshot on the HDD pool
> isn’t going to be done at the same exact time as the one on the SSD pool,
> and I don’t know enough about PGSQL internals to know how dangerous this
> is. So here is where I would like to ask for your help with the following
> questions:
>
>
>
> First of all, what kind of problem can this lead to? Are we talking about
> potential whole DB corruption or only the loss of a few of the latest
> transactions?
>

Silent data corruption. *not* just losing your latest transaction.



> In second place, if I’m initializing a corrupted PGSQL instance because
> ZFS snapshots are from different pools and slightly different times, am I
> going to notice it somehow or is it going to fail silently?
>

Silent. You might notice at the application level. Might.



> In third and last place, is there some way to quantify the amount of risk
> taken when snapshotting a PGSQL instance spread across two (or more)
> different pools?
>
>
>
"Don't do it".

If you can't get atomic snapshots, don't do it, period.

You can use them together with a regular online backup. That is
pg_start_backup() // take the snapshots // pg_stop_backup()
together with log archiving. That's a perfectly valid method. But you
cannot and should not rely on snapshots alone.
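A rough sketch of that procedure, assuming WAL archiving is already configured (archive_mode/archive_command) and using hypothetical pool names; on PostgreSQL 15+ the functions were renamed to pg_backup_start()/pg_backup_stop():

```shell
# Non-exclusive base backup: start and stop must happen in the SAME
# session, so the snapshots are taken from within psql via \! escapes.
psql -U postgres <<'EOF'
SELECT pg_start_backup('zfs-snap', false, false);
-- take the snapshots of every pool holding data or tablespaces
-- while the backup session is open:
\! zfs snapshot nvmepool/pgdata@backup
\! zfs snapshot hddpool/pgtblspc@backup
-- stop the backup; save the returned backup label and tablespace
-- map contents alongside the snapshots:
SELECT * FROM pg_stop_backup(false, true);
EOF
```

Restoring from such snapshots then works like restoring any base backup: drop in the saved backup_label, point restore_command at the WAL archive, and let recovery replay past the snapshot window, which is what makes the non-atomicity across pools harmless.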

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


pg_upgrade 13.6 to 15.1?

2023-01-15 Thread pf
Hi,

I'm fairly new to postgres, but have databases with about 2TB of data.

Trying to upgrade from 13.6 to 15.1, pg_upgrade complains with:
[postgres@pf ~]$ /usr/bin/pg_upgrade -b /usr/local/pgsql/bin -B /usr/bin \
  -d /mnt/db/var/lib/pgsql/data -D /mnt/work/var/lib/pgsql/data \
  -s /tmp -U postgres
Performing Consistency Checks
-
Checking cluster versions   
This utility can only upgrade to PostgreSQL version 15.   <=<<< ??
Failure, exiting
[postgres@pf ~]$ /usr/bin/pg_upgrade -V
pg_upgrade (PostgreSQL) 15.1
[postgres@pf ~]$ /usr/local/pgsql/bin/pg_upgrade -V
pg_upgrade (PostgreSQL) 13.6

This is on Mageia Linux (mga9/cauldron) where the package manager prevents
the installation of both PG13 and PG15; so I have PG15 officially
installed and PG13 manually installed in /usr/local.

Due to the database size, I have the working copy on a 4TB NVMe SSD
mounted at /mnt/work/var/lib/{pgadmin,pgsql} and an identical copy on an
18TB platter at /mnt/db/var/lib/{pgadmin,pgsql}.  Both copies are
currently at 13.6.  

Rather than mess with the config locations, I use symlinks to point to
the database, which are currently:
/var/lib/pgadmin -> /mnt/work/var/lib/pgadmin/
/var/lib/pgsql -> /mnt/work/var/lib/pgsql/

Nothing I read implies the need to upgrade to 14.x first...  Right?

Thanks
Pierre





Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

2023-01-15 Thread HECTOR INGERTO
Hello everybody,

I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup + hotspare 
method.

From man zfs-snapshot: “Snapshots are taken atomically, so that all snapshots 
correspond to the same moment in time.” So if a PSQL instance is started from 
a zfs snapshot, it will start to replay the WAL from the last checkpoint, in 
the same way it would do in a crash or power loss scenario. So from my 
knowledge, ZFS snapshots can be used to rollback to a previous point in time. 
Also, sending those snapshots to other computers will allow you to have 
hotspares and remote backups. If I’m wrong here, I would appreciate being told 
about it because I’m basing the whole question on this premise.

On the other hand, we have the tablespace PGSQL feature, which is great because 
it allows “unimportant” big data to be written into cheap HDD and frequently 
used data into fast NVMe.

So far, so good. The problem is when both ideas are merged. Then, snapshots 
from different pools are NOT atomic, snapshot on the HDD pool isn’t going to 
be done at the same exact time as the one on the SSD pool, and I don’t know 
enough about PGSQL internals to know how dangerous this is. So here is where I 
would like to ask for your help with the following questions:

First of all, what kind of problem can this lead to? Are we talking about 
potential whole DB corruption or only the loss of a few of the latest 
transactions?

In second place, if I’m initializing a corrupted PGSQL instance because ZFS 
snapshots are from different pools and slightly different times, am I going to 
notice it somehow or is it going to fail silently?

In third and last place, is there some way to quantify the amount of risk taken 
when snapshotting a PGSQL instance spread across two (or more) different pools?

Thanks for your time,


Héctor