What should I read?

2020-10-23 Thread W.P.

Hi there,

I am (still) using Postgres 9.5.

On my new devices (OrangePi), the default installed version is 11.9. This also 
seems to be a reasonable choice for my Intel laptop(s), as 11.9 offers logical 
replication (as Depesz explained to me at some point), which is needed to 
replicate data between Intel64(32) and ARM64 platforms.


What should I read to understand the differences (e.g. clusters etc.), and to 
learn how to configure the database and migrate?


I tried for about an hour to create a database on the OPi but failed. There 
are too many differences to do it without good reading material.



Laurent.





Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
On Fri, Oct 23, 2020 at 2:35 PM Tom Lane  wrote:
>
> "David G. Johnston"  writes:
> > On Fri, Oct 23, 2020 at 8:47 AM Nikhil Benesch 
> > wrote:
> >> Is there another option I'm missing? Would there be interest in
> >> extending split_part so that negative indices count from the end, as
> >> in:
> >>  split_part('foo bar baz', ' ', -1) -> 'baz'
>
> > I'm torn here because this would be the first usage of this concept in
> > PostgreSQL (I think).
>
> We already have some JSON functions that act like that, not to mention
> the left() and right() string functions, so I don't see that much of an
> argument against extending split_part to do it.

Oh, I didn't realize left and right already worked this way. That
makes this design much more compelling, at least to me.
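
For anyone else following along, a quick illustration of the existing
behavior as I understand it:

    -- a negative count in left()/right() counts from the opposite end
    select left('foo bar baz', -4);    -- 'foo bar' (drops the last 4 chars)
    select right('foo bar baz', -4);   -- 'bar baz' (drops the first 4 chars)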

If the consensus is that this extension is OK, I'd be happy to prepare a patch.




Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
On Fri, Oct 23, 2020 at 2:21 PM David G. Johnston
 wrote:

> I'm torn here because this would be the first usage of this concept in
> PostgreSQL (I think).

Yeah, I also have some qualms about this design in the context of Postgres,
particularly because Postgres allows arrays to begin at negative indices.
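
For example (just to illustrate the source of my qualms, not the proposed
split_part behavior):

    -- an array whose declared lower bound is -2, so index -1 names a real
    -- element rather than "first from the end"
    select ('[-2:0]={10,20,30}'::int[])[-1];   -- 20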

> Tangentially, I noticed that we have a "starts_with" function but no
> corresponding "end_with".

Ah, interesting. On the other hand, there are both "left" and "right",
"lpad" and "rpad", and "ltrim" and "rtrim". And at least ends_with has the
fairly elegant alternative of "s LIKE '%suffix'".

> It's been a while but there used to be a systemic inertia working against
> adding minor useful functions such as these.
>
> With the new documentation layout I would at least consider updating the
> description for the normal functions with an example on how to formulate
> an expression that works contra-normally, and in the case where there does
> exist such a specialized function, naming it.

Supposing you go this route, which of the options would you envision
mentioning as the converse of split_part?




Re: split_part for the last element

2020-10-23 Thread Tom Lane
"David G. Johnston"  writes:
> On Fri, Oct 23, 2020 at 8:47 AM Nikhil Benesch 
> wrote:
>> Is there another option I'm missing? Would there be interest in
> >> extending split_part so that negative indices count from the end, as
>> in:
>>  split_part('foo bar baz', ' ', -1) -> 'baz'

> I'm torn here because this would be the first usage of this concept in
> PostgreSQL (I think).

We already have some JSON functions that act like that, not to mention
the left() and right() string functions, so I don't see that much of an
argument against extending split_part to do it.
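
For instance, if memory serves, the json/jsonb element accessor already
accepts negative indices that count from the end:

    select '["a","b","c"]'::jsonb -> -1;   -- "c"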

> Tangentially, I noticed that we have a "starts_with" function but no
> corresponding "end_with".
> It's been a while but there used to be a systemic inertia working
> against adding minor useful functions such as these.

Part of the reason for that bias is that these one-off functions tend
not to be very well thought out or complete :-(.  The point that
could be raised here is why we have split_part but not any corresponding
regex-based splitter.
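The closest thing today, as far as I know, is going through
regexp_split_to_array and indexing the result:

    select (regexp_split_to_array('foo  bar   baz', '\s+'))[2];   -- 'bar'

as opposed to some hypothetical regexp_split_part.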

regards, tom lane




Re: split_part for the last element

2020-10-23 Thread David G. Johnston
On Fri, Oct 23, 2020 at 8:47 AM Nikhil Benesch 
wrote:

> Is there another option I'm missing? Would there be interest in
> extending split_part so that negative indices count from the end, as
> in:
>
> split_part('foo bar baz', ' ', -1) -> 'baz'
>

Some thoughts:

I'm torn here because this would be the first usage of this concept in
PostgreSQL (I think).

Tangentially, I noticed that we have a "starts_with" function but no
corresponding "end_with".

It's been a while but there used to be a systemic inertia working
against adding minor useful functions such as these.

With the new documentation layout I would at least consider updating the
descriptions of the normal functions with an example of how to formulate an
expression that works contra-normally and, where such a specialized function
does exist, naming it.

David J.


Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
Right, that's option 2 in my original mail. There are several
deficiencies with that idiom:

  * It is non-obvious. Sure, it might make sense to you and me, but it
takes someone just learning SQL a minute to reason through why it
works. They're also unlikely to invent the trick on their own.
  * It is inefficient. When the strings are large, reversing them is a
silly waste of compute.

On Fri, Oct 23, 2020 at 12:03 PM PALAYRET Jacques
 wrote:
>
> Hello,
>
>   reverse(split_part(reverse('foo bar baz'), ' ', 1))   -> 'baz'
>
> Regards
>
> - Original Message -
> From: "Nikhil Benesch" 
> To: pgsql-general@lists.postgresql.org
> Sent: Friday, October 23, 2020 17:47:16
> Subject: split_part for the last element
>
> Hi,
>
> Suppose I need to split a string on a delimiter and select one of the
> resulting components. If I want a specific component counting from the
> start, that's easy:
>
> split_part('foo bar baz', ' ', 1) -> 'foo'
>
> But if I want the last component, I have several less-than-ideal options:
>
> 1. (string_to_array('foo bar baz', ' '))[cardinality(string_to_array('foo bar baz', ' '))]
> 2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
> 3. (regexp_match('foo bar baz', '\S*$'))[1]
>
> Option 1 is probably the most understandable, especially if you are
> willing to introduce a temporary parts array:
>
> select parts[cardinality(parts)] from string_to_array('foo bar baz', ' ') parts
>
> But if the strings are long, this needlessly builds an array just to
> throw it away. Option 2 has similar efficiency problems and is just
> kind of silly. Option 3 is probably the best, but it's still a good
> bit more complicated than a simple split_part invocation.
>
> Is there another option I'm missing? Would there be interest in
> extending split_part so that negative indices count from the end, as
> in:
>
> split_part('foo bar baz', ' ', -1) -> 'baz'
>
> Or adding a split_part_end function in which positive indices count
> from the end:
>
> split_part_end('foo bar baz', ' ', 1) -> 'baz'
>
> I'd be happy to prepare a patch if so.
>
> Cheers,
> Nikhil
>
>




Re: split_part for the last element

2020-10-23 Thread PALAYRET Jacques
Hello,

  reverse(split_part(reverse('foo bar baz'), ' ', 1))   -> 'baz'

Regards

- Original Message -
From: "Nikhil Benesch" 
To: pgsql-general@lists.postgresql.org
Sent: Friday, October 23, 2020 17:47:16
Subject: split_part for the last element

Hi,

Suppose I need to split a string on a delimiter and select one of the
resulting components. If I want a specific component counting from the
start, that's easy:

split_part('foo bar baz', ' ', 1) -> 'foo'

But if I want the last component, I have several less-than-ideal options:

1. (string_to_array('foo bar baz', ' '))[cardinality(string_to_array('foo bar baz', ' '))]
2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
3. (regexp_match('foo bar baz', '\S*$'))[1]

Option 1 is probably the most understandable, especially if you are
willing to introduce a temporary parts array:

select parts[cardinality(parts)] from string_to_array('foo bar baz', ' ') parts

But if the strings are long, this needlessly builds an array just to
throw it away. Option 2 has similar efficiency problems and is just
kind of silly. Option 3 is probably the best, but it's still a good
bit more complicated than a simple split_part invocation.

Is there another option I'm missing? Would there be interest in
extending split_part so that negative indices count from the end, as
in:

split_part('foo bar baz', ' ', -1) -> 'baz'

Or adding a split_part_end function in which positive indices count
from the end:

split_part_end('foo bar baz', ' ', 1) -> 'baz'

I'd be happy to prepare a patch if so.

Cheers,
Nikhil




split_part for the last element

2020-10-23 Thread Nikhil Benesch
Hi,

Suppose I need to split a string on a delimiter and select one of the
resulting components. If I want a specific component counting from the
start, that's easy:

split_part('foo bar baz', ' ', 1) -> 'foo'

But if I want the last component, I have several less-than-ideal options:

1. (string_to_array('foo bar baz', ' '))[cardinality(string_to_array('foo bar baz', ' '))]
2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
3. (regexp_match('foo bar baz', '\S*$'))[1]

Option 1 is probably the most understandable, especially if you are
willing to introduce a temporary parts array:

select parts[cardinality(parts)] from string_to_array('foo bar baz', ' ') parts

But if the strings are long, this needlessly builds an array just to
throw it away. Option 2 has similar efficiency problems and is just
kind of silly. Option 3 is probably the best, but it's still a good
bit more complicated than a simple split_part invocation.

Is there another option I'm missing? Would there be interest in
extending split_part so that negative indices count from the end, as
in:

split_part('foo bar baz', ' ', -1) -> 'baz'

Or adding a split_part_end function in which positive indices count
from the end:

split_part_end('foo bar baz', ' ', 1) -> 'baz'

I'd be happy to prepare a patch if so.

Cheers,
Nikhil




Re: Conditional column filtering with pglogical replication

2020-10-23 Thread PALAYRET Jacques



The " row_filter " filters the rows and I don’t want to filter the row but only 
one (or several) column(s). 

Actually, I want a column value on the provider to be filtered on subcribers 
when replicating. 




For example, on the provider:

  a  |  b   | c  |  d
-----+------+----+-----
 123 | 45.6 | 15 | abc
 111 | 55.5 | 66 | def

and on the subscriber:

  a  |  b   | c  |  d
-----+------+----+-----
 123 | 45.6 | 15 | abc
 111 |      | 66 | def

-> b is NULL on the subscriber, but the other columns are replicated.


De: "Fabrízio de Royes Mello"  
À: "PALAYRET Jacques"  
Cc: "Postgres General"  
Envoyé: Vendredi 23 Octobre 2020 16:23:19 
Objet: Re: Conditional column filtering with pglogical replication 

On Fri, Oct 23, 2020 at 10:35 AM PALAYRET Jacques  wrote:
> 
> With PgLogical extension, I have tested the possibility of column filtering 
> (columns) and row filtering (row_filter). 
> But is there a way to do a conditional column filtering ? 
> I mean a way to filter a column based on a predicate, with pglogical (so the 
> filtered values won’t arrive on subscribers). 
> For example, with a replicated table t(a integer as Primary Key, b numeric, c 
> integer, d text), I would like to filter the column b when c between 10 and 
> 20. 
> 

Hello, 

According to the documentation [1] the row_filter is a normal PostgreSQL 
expression with the same limitations as CHECK constraints, so you can do 
something like: row_filter := 'c between 10 and 20 and b = ?' 

Regards, 

[1] https://github.com/2ndQuadrant/pglogical#row-filtering

--
Fabrízio de Royes Mello
Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training



Re: postgres materialized view refresh performance

2020-10-23 Thread Philip Semanchuk



> On Oct 23, 2020, at 9:52 AM, Ravi Krishna  wrote:
> 
>> My understanding is that when CONCURRENTLY is specified, Postgres implements 
>> the refresh as a series of INSERT, UPDATE,
>> and DELETE statements on the existing view. So the answer to your question 
>> is no, Postgres doesn’t create another table and
>> then swap it.
> 
> The INSERTs/UPDATEs/DELETEs happen only for the difference.  PG first creates
> a new temp table and then compares it with the MV and detects the differences.
> That is why, for CONCURRENTLY, a unique index is required on the MV.

Yes, thank you, that’s what I understand too but I expressed it very poorly. 





Re: Conditional column filtering with pglogical replication

2020-10-23 Thread Fabrízio de Royes Mello
On Fri, Oct 23, 2020 at 10:35 AM PALAYRET Jacques <
jacques.palay...@meteo.fr> wrote:
>
> With PgLogical extension, I have tested the possibility of column
> filtering (columns) and row filtering (row_filter).
> But is there a way to do a conditional column filtering ?
> I mean a way to filter a column based on a predicate, with pglogical (so
> the filtered values won’t arrive on subscribers).
> For example, with a replicated table t(a integer as Primary Key, b
> numeric, c integer, d text), I would like to filter the column b when c
> between 10 and 20.
>

Hello,

According to the documentation [1] the row_filter is a normal PostgreSQL
expression with the same limitations as CHECK constraints, so you can do
something like: row_filter := 'c between 10 and 20 and b = ?'
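
Combined with the existing column filter, roughly something like this
(an untested sketch, using the parameter names from the pglogical 2 README):

    -- replicate only columns a, c and d, and only rows matching the filter
    SELECT pglogical.replication_set_add_table(
        set_name := 'default',
        relation := 't',
        synchronize_data := true,
        columns := ARRAY['a','c','d'],
        row_filter := 'c between 10 and 20'
    );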

Regards,

[1] https://github.com/2ndQuadrant/pglogical#row-filtering

--
   Fabrízio de Royes Mello
   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training


Re: postgres materialized view refresh performance

2020-10-23 Thread Ravi Krishna
> My understanding is that when CONCURRENTLY is specified, Postgres implements 
> the refresh as a series of INSERT, UPDATE,
> and DELETE statements on the existing view. So the answer to your question is 
> no, Postgres doesn’t create another table and
> then swap it.

The INSERTs/UPDATEs/DELETEs happen only for the difference.  PG first creates a
new temp table and then compares it with the MV and detects the differences.
That is why, for CONCURRENTLY, a unique index is required on the MV.
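
Roughly, using the names from the original post:

    -- the unique index (plain columns, no WHERE clause) is what lets
    -- CONCURRENTLY diff the temp table against the existing matview
    CREATE UNIQUE INDEX mv_c5_uidx ON mv (c5);
    REFRESH MATERIALIZED VIEW CONCURRENTLY mv;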



Re: Hot backup in PostgreSQL

2020-10-23 Thread Stephen Frost
Greetings,

* Mark Johnson (remi9...@gmail.com) wrote:
> User managed backups in PostgreSQL work very similarly to what you know from
> Oracle.  You first place the cluster in backup mode, then copy the database
> files, and lastly take the cluster out of backup mode.  The first and last
> steps are done using functions pg_start_backup('label',false,false) and
> pg_stop_backup(false, false). [1].

Just to be clear for the archives, the above is *not* sufficient to have
a consistent and valid backup; you must also ensure that archive_command
(or some other system) is configured to capture all of the WAL produced
from the start to the end of the backup, and you need to make sure that
a backup_label file is created in the backup (using the results of
pg_stop_backup).  Some of this is discussed in the 'low level API' part
of the documentation, as linked before:

https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP
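
For the archives, the non-exclusive sequence looks roughly like this (a
sketch only; the label is arbitrary, WAL archiving must already be set up,
and the labelfile returned by pg_stop_backup has to be written into the
backup as backup_label):

    -- the session that runs pg_start_backup must stay connected until pg_stop_backup
    SELECT pg_start_backup('mybackup', false, false);
    -- ... copy the data directory with your tool of choice ...
    SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true);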

Though it's really best, as mentioned, to use an existing tool that's
been written to ensure all of this is done correctly and not to try and
build your own.

Thanks,

Stephen

> If you use a utility supplied with PostgreSQL such as pg_basebackup, it
> does these steps for you.  If you are using a specific non-PostgreSQL
> utility (i.e., Dell Networker or IBM Tivoli) see their documentation for
> specifics.
> 
> [1]
> https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP.
> 
> 
> On Thu, Oct 22, 2020 at 11:14 AM Adam Brusselback 
> wrote:
> 
> > > how to do "hot backup" (copying files) while database running?
> > As others have shown, there are ways to do this with PG's internal tooling
> > (pg_basebackup).
> >
> > However, I would highly recommend you use an external backup tool like
> > pgbackrest [1] to save yourself the pain of implementing things incorrectly
> > and ending up with non-viable backups when you need them most. I'm not
> > affiliated with them at all, but have just used pgbackrest in production
> > for years now with great results.  It takes care of PITR, and manages
> > backup retention (and associated WAL retention). Those can be a bit of a
> > pain to do manually otherwise.
> >
> > Just my $0.02, hope it helps!
> >
> > 1. https://pgbackrest.org/
> >




Conditional column filtering with pglogical replication

2020-10-23 Thread PALAYRET Jacques
Hello, 

With PgLogical extension, I have tested the possibility of column filtering 
(columns) and row filtering (row_filter). 
But is there a way to do conditional column filtering? 
I mean a way to filter a column based on a predicate, with pglogical (so the 
filtered values won’t arrive on subscribers). 
For example, with a replicated table t(a integer as Primary Key, b numeric, c 
integer, d text), I would like to filter the column b when c between 10 and 20. 

Regards 
- Météo-France - 
PALAYRET JACQUES 
DCSC/MBD 
jacques.palay...@meteo.fr 
Landline: +33 561078319 


Re: postgres materialized view refresh performance

2020-10-23 Thread Philip Semanchuk



> On Oct 22, 2020, at 3:53 PM, Ayub M  wrote:
> 
> There is a table t which is used in a mview mv; this is the only table in the
> mview definition.
> 
> create table t (c1 int, ..., c10 int);
> 
> -- there is a pk on, say, the c1 column
> create materialized view mv as select c1, c2, ..., c10 from t;
> 
> -- there is a unique index on, say, c5, and a bunch of other indexes on the mview.
> The reason an mview is used instead of table t directly is that the table gets
> truncated and reloaded every couple of hours, and we don't want users to see an
> empty table at any point in time; that's why an mview is being used.
> 
> Using "refresh materialized view concurrently", this mview is being used by 
> APIs and end users.
> 
> Couple of questions I have -

Hi Ayub,
I’m not an expert on the subject; I hope you’ll get an answer from someone who 
is. :-) Until then, my answers might help.

>   • Whenever a concurrent mview refresh happens, does PG create another
> set of table and indexes and swap it with the original? If not, does it
> update the existing data?


My understanding is that when CONCURRENTLY is specified, Postgres implements 
the refresh as a series of INSERT, UPDATE, and DELETE statements on the 
existing view. So the answer to your question is no, Postgres doesn’t create 
another table and then swap it.


>   • The mview sometimes gets refreshed in a couple of minutes and sometimes
> it takes hours. When it runs longer, there are no locks and no resource
> shortage, and the number of records in the base table is 6M (7.5 GB), which is
> not huge, so why does it take so long to refresh the mview?

Does the run time correlate with the number of changes being made?


>   • Does mview need vacuum/analyze/reindex?


My understanding is that when CONCURRENTLY is specified, yes it does need 
vacuuming, because of the aforementioned implementation of REFRESH as a series 
of INSERT, UPDATE, and DELETE statements. 

In our situation, we have large views that are refreshed once per week. We want 
to ensure that the view is in the best possible shape for users, so we create 
the view with autovacuum_enabled = false and then run an explicit 
vacuum/analyze step immediately after the refresh rather than leaving it to 
chance.
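
Roughly (object names invented for illustration):

    CREATE MATERIALIZED VIEW mv WITH (autovacuum_enabled = false) AS
        SELECT c1, c2 FROM t;
    -- ... then, immediately after each weekly refresh:
    REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
    VACUUM (ANALYZE) mv;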

Cheers
Philip