Re: A simple question about text fields

2021-06-17 Thread Laurenz Albe
On Fri, 2021-06-18 at 10:28 +1000, Gavan Schneider wrote:
> On 18 Jun 2021, at 9:34, David G. Johnston wrote:
> > On Thursday, June 17, 2021, Gavan Schneider  
> > wrote:
> > 
> > > My approach is to define such fields as ‘text’ and set a constraint using
> > > char_length(). This allows PG to do the business with the text in native
> > > form, and only imposes the cost of any length check when the field is
> > > updated… best of both worlds.
> > > 
> > 
> > Those are basically the same world…your alternative probably is strictly
> > worse than varchar(n) because of its novel way of implementing the same
> > functionality.
> 
> Not sure if this is strictly true. Novelty per se is not always worse. :)

True in general, but not in this case.

There is no advantage to a "text" column with a check constraint on the length,
that is, no added functionality.

And it is worse for these reasons:

- the performance will be worse (big reason)

- the length limit is less obvious if you look at the table definition
  (small reason)
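
For illustration, a minimal sketch of the two definitions under discussion
(table and column names are invented):

CREATE TABLE t_varchar (
    name varchar(100)   -- the limit is part of the type and shows up in \d
);

CREATE TABLE t_text (
    name text,
    CONSTRAINT name_len CHECK (char_length(name) <= 100)   -- same limit, as a separate constraint
);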

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Temporal tables as part of main release

2021-06-17 Thread David G. Johnston
On Thursday, June 17, 2021, Anand Sowmithiran  wrote:

> I am looking to use the temporal tables feature for keeping track of
> changes to my table data. As of now, there is a pgxn.org-provided
> extension through which we could leverage this functionality, but *when
> will temporal tables become part of the main PostgreSQL release* [and not
> as an extension]? Even the PostgreSQL 14 beta release notes do not
> mention it.
>

This email seems to provide a good starting point for reading up on the
past work on the feature.

https://www.postgresql.org/message-id/877c5179-d2f6-e222-717f-cc3970ae0d5b%40illuminatedcomputing.com

It's already too late for v14, since we are already in beta.

For a feature this complex, it's probably best not to have expectations of
that nature.

And, since it's not in the commitfest app, and the last emails I found are
over a year old, it doesn't seem to be a priority for anyone at this time.

https://commitfest.postgresql.org/33/

David J.


Temporal tables as part of main release

2021-06-17 Thread Anand Sowmithiran
I am looking to use the temporal tables feature for keeping track of
changes to my table data. As of now, there is a pgxn.org-provided
extension through which we could leverage this functionality, but *when
will temporal tables become part of the main PostgreSQL release* [and not
as an extension]? Even the PostgreSQL 14 beta release notes do not
mention it.

thanks,
Anand.


Re: A simple question about text fields

2021-06-17 Thread Gavan Schneider
On 18 Jun 2021, at 9:34, David G. Johnston wrote:

> On Thursday, June 17, 2021, Gavan Schneider 
> wrote:
>
>>
>> My approach is to define such fields as ‘text’ and set a constraint using
>> char_length(). This allows PG to do the business with the text in native
>> form, and only imposes the cost of any length check when the field is
>> updated… best of both worlds.
>>
>
> Those are basically the same world…your alternative probably is strictly
> worse than varchar(n) because of its novel way of implementing the same
> functionality.
>
Not sure if this is strictly true. Novelty per se is not always worse. :)
The design advantage is that all text fields are defined the same way, with no
built-in length.
When it becomes apparent that a length constraint is needed, it can be added for
the relevant field(s), e.g., when the system does not impose proper restraint at
the input stage.

> For most text fields any good constraint is going to be done in the form of a
> regular expression, one that at minimum prevents non-printable characters
> (linefeed and carriage return being obvious examples).
>
Agree. If the business rules need some additional restrictions they can go here
as well.
It is not so obvious that newlines, etc. are unwelcome. Consider the need for
human-readable text in comment fields (or other annotation where readability is
enhanced by such layout). There is always the consideration of SQL injection
(but it’s mostly too late if we’re leaving this to a constraint) and other
toxic string sequences. But this is all business logic. The database just needs
to handle the ‘text’ and we need to design the constraints around the content.

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920




Re: A simple question about text fields

2021-06-17 Thread David G. Johnston
On Thursday, June 17, 2021, Gavan Schneider 
wrote:

>
> My approach is to define such fields as ‘text’ and set a constraint using
> char_length(). This allows PG to do the business with the text in native
> form, and only imposes the cost of any length check when the field is
> updated… best of both worlds.
>

Those are basically the same world…your alternative probably is strictly
worse than varchar(n) because of its novel way of implementing the same
functionality.

For most text fields any good constraint is going to be done in the form of a
regular expression, one that at minimum prevents non-printable characters
(linefeed and carriage return being obvious examples).
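
A hedged sketch of such a constraint (table name, column name, and the exact
character class are invented, not a recommendation):

CREATE TABLE t_text (
    name text,
    CONSTRAINT name_len   CHECK (char_length(name) <= 100),
    CONSTRAINT name_clean CHECK (name !~ '[[:cntrl:]]')   -- rejects linefeed, carriage return, etc.
);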

David J.


Re: A simple question about text fields

2021-06-17 Thread Gavan Schneider
On 17 Jun 2021, at 1:08, Tom Lane wrote:

> Martin Mueller  writes:
>
>> Are there performance issues with the choice of 'text' vs. varchar and some 
>> character limit?  For instance, if I have a table with ten million records 
>> and text fields that may range in length from 15 to 150, can I expect a 
>> measurable improvement in response time for using varchar(150) or will text  
>>   do just or nearly as well.
>
>  There is no situation where varchar outperforms text in Postgres.
>  If you need to apply a length constraint for application semantic
>  reasons, do so ... otherwise, text is the native type.  It's
>  useful to think of varchar as being a domain over text, though
>  for various reasons it's not implemented quite that way.
>
This reminds me of my days converting from MySQL to PostgreSQL. MySQL, along with
other databases, seemed to have a strong preference for setting a length on
character strings. And all this was before the advent of UTF-8 encoding, which
has made the concept of string ‘length’ very messy.

Database guru and SQL author Joe Celko asserts in his ‘SQL for Smarties’ that
if he finds a text field without a length limit he will input the Heart Sutra
(presumably in ASCII :) to demonstrate the design error. (Of course he is
ignoring the potential for this input to help the database achieve inner
consistency. :) But taking Joe’s central point, there do seem to be grounds
for restricting user-input text fields to a reasonable length according to the
business need… if only to limit the damage of a cat sitting on the keyboard.

My approach is to define such fields as ‘text’ and set a constraint using 
char_length(). This allows PG to do the business with the text in native form, 
and only imposes the cost of any length check when the field is updated… best 
of both worlds.

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920




Re: Treating float arrays as vectors?

2021-06-17 Thread Paul Jungwirth

On 6/17/21 7:13 AM, Celia McInnis wrote:
> I would love it if there was a vector data type in postgresql along with
> such vector operations as addition, subtraction, scalar multiplication,
> cross product, dot product, normalization, length and various sorts of
> vector distances.


I wrote an extension to define vector-based functions that take & return 
numeric arrays:


https://github.com/pjungwir/floatvec

It only has basic arithmetic, but matrix math functions could be added.
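
For anyone who wants a feel for the idea without installing an extension, here
is a minimal plain-SQL sketch (this is not floatvec's API; the function name is
invented) of element-wise addition over float8 arrays:

CREATE FUNCTION vec_add_sketch(a float8[], b float8[]) RETURNS float8[]
LANGUAGE sql IMMUTABLE AS $$
    SELECT array_agg(x + y ORDER BY i)
    FROM unnest(a, b) WITH ORDINALITY AS t(x, y, i)
$$;

SELECT vec_add_sketch(ARRAY[1,2,3]::float8[], ARRAY[10,20,30]::float8[]);
-- {11,22,33}

A C extension avoids the per-element overhead of this approach, which is the
point of doing it in C in the first place.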

If you want aggregate functions instead I wrote another extension for that:

https://github.com/pjungwir/aggs_for_vecs

I haven't touched either in a while, but if you find they have problems 
on modern versions of Postgres, let me know and I should be able to get 
them updated quickly.


I experimented a bit with an AVX implementation, and it showed a 
worthwhile performance improvement, but I never got around to adding it 
to all functions, so it was just a proof-of-concept. I could dust that
off if you're interested. But since the extension is in C, things are
already pretty fast.


Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: Streaming replication: PANIC on tertiary when secondary promoted

2021-06-17 Thread Alexey Bashtanov

On 16/06/2021 20:31, Alexey Bashtanov wrote:

I had it "latest" as well.
I'll try to reproduce it again tomorrow. replica -v -d 
"dbname=postgres port=5432" -U postgres


I cannot quite reproduce it artificially.

One more piece of detail: in the chain
serverA->serverB->serverC->serverD, when serverB was promoted and serverC
crashed, I had wal_compression enabled on serverA only.


Best regards, Alex




Re: compute_query_id

2021-06-17 Thread Julien Rouhaud
On Thu, Jun 17, 2021 at 08:57:02PM +0530, Vijaykumar Jain wrote:
>
> test=# show log_line_prefix;
>         log_line_prefix
> ---------------------------------
>  [timestamp=%t] [query_id=%Q] :
> (1 row)
> 
> test=# show compute_query_id;
>  compute_query_id
> --
>  on
> (1 row)
> 
> test=# show log_statement;
>  log_statement
> ---
>  all
> (1 row)
> 
> * corrects myself
> . ok now it works, when i set log_min_duration_statement =0 to log
> all statements.
> 
> test=# show log_min_duration_statement;
>  log_min_duration_statement
> ----------------------------
>  0
> (1 row)

Yes, unfortunately log_statement is not compatible with compute_query_id.
This is documented at
https://www.postgresql.org/docs/devel/runtime-config-logging.html#GUC-LOG-LINE-PREFIX:

> The %Q escape always reports a zero identifier for lines output by
> log_statement because log_statement generates output before an identifier can
> be calculated, including invalid statements for which an identifier cannot be
> calculated.
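
For reference, a minimal sketch of the settings that make %Q useful in practice
(run as superuser; the prefix string is simply the one from the session above):

ALTER SYSTEM SET compute_query_id = on;
ALTER SYSTEM SET log_line_prefix = '[timestamp=%t] [query_id=%Q] : ';
-- %Q is always 0 in log_statement output, so log completed statements instead:
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();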




Re: compute_query_id

2021-06-17 Thread Vijaykumar Jain
On Thu, 17 Jun 2021 at 20:20, Julien Rouhaud  wrote:
>
> On Thu, Jun 17, 2021 at 08:09:54PM +0530, Vijaykumar Jain wrote:
> > how is the compute_query_id actually calculated?
>
> > why does it show 0 in logs for random sql queries.
> > log_line_prefix = '%Q :'
> > 0 :LOG:  statement: select * from pg_stat_activity;
>
> It means that you haven't enabled it:
>
> 2021-06-17 22:46:16.231 CST [11246] queryid=0 LOG:  duration: 4.971 ms  
> statement: select * from pg_stat_activity ;
> 2021-06-17 22:46:25.383 CST [11246] queryid=0 LOG:  duration: 0.284 ms  
> statement: set compute_query_id = on;
> 2021-06-17 22:46:28.744 CST [11246] queryid=941978042436931562 LOG:  
> duration: 1.725 ms  statement: select * from pg_stat_activity ;
>

 psql test
psql (14beta1)
Type "help" for help.

test=# show log_line_prefix;
        log_line_prefix
---------------------------------
 [timestamp=%t] [query_id=%Q] :
(1 row)

test=# show compute_query_id;
 compute_query_id
--
 on
(1 row)

test=# show log_statement;
 log_statement
---
 all
(1 row)

test=# select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
-[ RECORD 1 ]---------------------------------------------------------------------------
query_id | -4293879703199833131
query    | select query_id, query, pid from pg_stat_activity where pid = pg_backend_pid();
pid      | 2640


from the logs, i get the id for some queries, but not all
[timestamp=2021-06-17 20:37:59 IST] [query_id=488416992746849793]
:ERROR:  relation "t" already exists
[timestamp=2021-06-17 20:37:59 IST] [query_id=488416992746849793]
:STATEMENT:  create table t(id int);
[timestamp=2021-06-17 20:38:22 IST] [query_id=0] :LOG:  statement:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:38:29 IST] [query_id=0] :LOG:  statement:
execute qq(1);
[timestamp=2021-06-17 20:38:29 IST] [query_id=0] :DETAIL:  prepare:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:38:32 IST] [query_id=0] :LOG:  statement:
execute qq(2);
[timestamp=2021-06-17 20:38:32 IST] [query_id=0] :DETAIL:  prepare:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:39:25 IST] [query_id=0] :LOG:  statement:
select query_id, query, pid from pg_stat_activity where pid = 0;
[timestamp=2021-06-17 20:40:36 IST] [query_id=0] :LOG:  statement:
select count(1) from t;
[timestamp=2021-06-17 20:40:47 IST] [query_id=0] :LOG:  statement:
select count(1) from t where id < 100;

test=# explain (analyze,verbose) select * from t where id <
floor((random() * 100)::int);
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=0.00..3.10 rows=3 width=4) (actual time=0.009..0.014 rows=3 loops=1)

 Query Identifier: 1051405225525186795
 Planning Time: 0.090 ms
 Execution Time: 0.030 ms
(13 rows)

test=# select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
       query_id       |                                      query                                       | pid
----------------------+----------------------------------------------------------------------------------+------
 -4293879703199833131 | select query_id, query, pid from pg_stat_activity where pid = pg_backend_pid(); | 2671
(1 row)

but in the logs:
[timestamp=2021-06-17 20:46:47 IST] [query_id=0] :LOG:  statement:
explain select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
[timestamp=2021-06-17 20:46:54 IST] [query_id=0] :LOG:  statement:
explain select query_id, query, pid from pg_stat_activity where pid >
100;
[timestamp=2021-06-17 20:46:58 IST] [query_id=0] :LOG:  statement:
explain analyze select query_id, query, pid from pg_stat_activity
where pid > 100;
[timestamp=2021-06-17 20:47:25 IST] [query_id=0] :LOG:  statement:
explain analyze select * from t where id <  floor((random() *
100)::int);
[timestamp=2021-06-17 20:48:16 IST] [query_id=0] :LOG:  statement:
explain (analyze,verbose) select * from t where id <  floor((random()
* 100)::int);
[timestamp=2021-06-17 20:48:38 IST] [query_id=0] :LOG:  statement:
select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();

not sure if i am missing something obvious?

> I'm not sure that I understand that question.  The pid will identify a 
> backend,
> and that backend can execute 0, 1 or a lot of different queries.  The query_id
> will uniquely identify statements after some normalization and removing the
> constant parts (so for instance "select 1;" and "Select 2  ;" will have the
> same identifier).  Having only that information in the log can be useful on 
> its
> own, but you usually get way more benefit using additional modules like
> pg_stat_statements.

Thanks, that helps, but I still do not see them in logs.


* corrects myself
. ok now it works, when i set log_min_duration_statement =0 to log
all statements.

Re: compute_query_id

2021-06-17 Thread Julien Rouhaud
On Thu, Jun 17, 2021 at 08:09:54PM +0530, Vijaykumar Jain wrote:
> how is the compute_query_id actually calculated?

It's the exact same implementation that was extracted from pg_stat_statements.
You have some implementation details at
https://www.postgresql.org/docs/current/pgstatstatements.html.

> why does it show 0 in logs for random sql queries.
> log_line_prefix = '%Q :'
> 0 :LOG:  statement: select * from pg_stat_activity;

It means that you haven't enabled it:

2021-06-17 22:46:16.231 CST [11246] queryid=0 LOG:  duration: 4.971 ms  
statement: select * from pg_stat_activity ;
2021-06-17 22:46:25.383 CST [11246] queryid=0 LOG:  duration: 0.284 ms  
statement: set compute_query_id = on;
2021-06-17 22:46:28.744 CST [11246] queryid=941978042436931562 LOG:  duration: 
1.725 ms  statement: select * from pg_stat_activity ;

> i mean pid was already doing the job of identifying the query and its
> children, even in the logs, but i know the pid will get recycled.

I'm not sure that I understand that question.  The pid will identify a backend,
and that backend can execute 0, 1 or a lot of different queries.  The query_id
will uniquely identify statements after some normalization and removing the
constant parts (so for instance "select 1;" and "Select 2  ;" will have the
same identifier).  Having only that information in the log can be useful on its
own, but you usually get way more benefit using additional modules like
pg_stat_statements.
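
As a small illustration of that normalization, assuming pg_stat_statements is
in shared_preload_libraries and the extension has been created (output
paraphrased):

SELECT 1;
Select 2  ;

SELECT queryid, calls, query
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 5;
-- among the results there should be a single entry covering both statements,
-- with calls = 2 and a query text normalized to something like "SELECT $1",
-- since both statements jumble to the same queryid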




compute_query_id

2021-06-17 Thread Vijaykumar Jain
hi,

I noticed this new param compute_query_id in the pg14 beta.
it is interesting as I have long wanted to identify a query with a unique
id, like we have for http requests etc., so that we can trace the query all
the way to shards via FDW.

but i cannot see the ids in the logs even after setting compute_query_id on.
i also read the compute_query_id documentation for the same.
how is the compute_query_id actually calculated?
why does it show 0 in logs for random sql queries.
log_line_prefix = '%Q :'
0 :LOG:  statement: select * from pg_stat_activity;

i mean pid was already doing the job of identifying the query and its
children, even in the logs, but i know the pid will get recycled.
queryjumble.c


tl;dr:
how is compute_query_id different from pid for identifying a running query?
can it be passed on to FDW queries, for tracing etc.?

am i getting its use case totally wrong? :)


-- 
Thanks,
Vijay
Mumbai, India


Treating float arrays as vectors?

2021-06-17 Thread Celia McInnis
Hi:

I would love it if there was a vector data type in postgresql along with
such vector operations as addition, subtraction, scalar multiplication,
cross product, dot product, normalization, length and various sorts of
vector distances. So far I have been feeding my float arrays to plpython3u
to do these kinds of operations. Is there a better way I can get this
functionality?

Celia McInnis


Re: Listen and notify in psql process

2021-06-17 Thread Torsten Förtsch
On Thu, Jun 17, 2021 at 1:04 PM Sakshi Jain 
wrote:

> How to listen from within a psql process and get the payloads?
>
> Does PostgreSQL have any such mechanism where, in a session, a process sends
> a "listen <channel>" SQL command and then gets a message if someone in the
> other session issued a "notify <channel>"?
>
> Please provide an example of how to do this.
>
> I am looking for LISTEN to return something when a NOTIFY has been issued.
>

Here is an example in Perl. Basically you issue the LISTEN command. Then
you sit and wait for the file descriptor to become ready. You do that outside
of a transaction. Once the descriptor is ready you call your driver's
version of the notification-read function until it comes back empty. That is
when you go back and wait for the next arrival.

$dbh->do("LISTEN channel");
my $sel = IO::Select->new;
$sel->add($dbh->{pg_socket});
while ($sel->can_read) {
while (my $notify = $dbh->pg_notifies) {
my ($name, $pid, $payload) = @$notify;
do_something($payload);
}
}
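
And since the question mentions psql specifically, the same round trip works
interactively (channel name and payload are invented); psql reports pending
notifications once it finishes running a command:

-- session 1
LISTEN my_channel;

-- session 2
NOTIFY my_channel, 'hello';
-- equivalently: SELECT pg_notify('my_channel', 'hello');

-- back in session 1, run anything (even SELECT 1;) and psql prints something like:
-- Asynchronous notification "my_channel" with payload "hello" received from
-- server process with PID 12345.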



views on partitioned tables

2021-06-17 Thread Holger Vornholt
Hello,

we are using several partitioned tables. We regularly encounter the problem
that we would like to help analysts (and ourselves) with joining these tables
by building predefined views.
Is it possible to handle the partitions when querying the views if the tables
are hidden in subqueries? Are there other recommendations on how to achieve
this? Below is an example. Every table involved has around 90 partitions.

Some of my experiments so far:

  *   Parameters to force a partition choice when querying the view, by using
current_setting('set_partition.abfahrt_tag_plan') in several places (a sketch
of this variant follows the example view below). This has a terrible impact on
downstream processes or when multiple partitions are needed.
  *   Materialized views. Additional storage costs and refresh management.
  *   Adding unnecessary join clauses to suggest using partitions when
querying from outside.

-- ereignis_sv_soll (partition key tag_plan) and vereinigung_sv_soll
-- (partition key abfahrt_tag_plan) are partitioned.
CREATE VIEW public.vereinigung_sv_soll_expanded AS
WITH vereinigung_sv_soll_deduplicated AS (
    -- keep one vereinigung_key per identical set of fahrt/ereignis combinations
    SELECT max(vereinigung_key) AS vereinigung_key
    FROM (
        SELECT vereinigung_key,
               string_agg(vereinigung.fahrtid ||
                          vereinigung.abfahrt_ereignisid ||
                          vereinigung.ankunft_ereignisid, ' | '
                          ORDER BY vereinigung.fahrtid) AS vereinigung_eine_zeile
        FROM vereinigung_sv_soll vereinigung
        GROUP BY vereinigung_key
    ) temp
    GROUP BY vereinigung_eine_zeile
)
SELECT
    vereinigung_sv_soll.abfahrt_tag_plan,
    vereinigung_sv_soll,
    abfahrt.start_tag,
    abfahrt,
    ankunft.evanr AS ankunft_evanr,
    ankunft
FROM vereinigung_sv_soll
    INNER JOIN ereignis_sv_soll abfahrt
        ON vereinigung_sv_soll.abfahrt_ereignisid = abfahrt.ereignisid
    INNER JOIN ereignis_sv_soll ankunft
        ON vereinigung_sv_soll.ankunft_ereignisid = ankunft.ereignisid
    INNER JOIN vereinigung_sv_soll_deduplicated
        ON vereinigung_sv_soll.vereinigung_key = vereinigung_sv_soll_deduplicated.vereinigung_key;
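
For concreteness, a hedged sketch of the current_setting() variant from the
first bullet (the GUC name is taken from above; the view name, the ::date cast
and the example value are assumptions):

CREATE VIEW public.vereinigung_sv_soll_pinned AS
SELECT v.*
FROM vereinigung_sv_soll v
WHERE v.abfahrt_tag_plan =
      current_setting('set_partition.abfahrt_tag_plan')::date;  -- assumed partition-key type

-- usage: pin the session to one partition before querying
SET set_partition.abfahrt_tag_plan = '2021-06-17';
SELECT count(*) FROM public.vereinigung_sv_soll_pinned;

Because current_setting() is stable, pruning to a single partition should still
be possible at executor startup, but as noted above this breaks down as soon as
several partitions are needed in one query.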

Kind regards,

Holger Vornholt
Reisendeninformation (T.RS)
Deutsche Bahn AG
Hahnstr. 40, 60528 Frankfurt a. Main
Chat | Call | +49152 37557535






Re: Listen and notify in psql process

2021-06-17 Thread Ravi Krishna
https://www.postgresql.org/docs/current/sql-notify.html

https://www.postgresql.org/docs/13/sql-listen.html


  

Fwd: Listen and notify in psql process

2021-06-17 Thread Sakshi Jain
Hi Team,

How to listen from within a psql process and get the payloads?

Does PostgreSQL have any such mechanism where, in a session, a process sends a
"listen <channel>" SQL command and then gets a message if someone in the other
session issued a "notify <channel>"?

Please provide an example of how to do this.

I am looking for LISTEN to return something when a NOTIFY has been issued.

-- 
*Thanks & Regards*
Sakshi Jain
New Delhi
8882433263


