Having issue with SSL.

2023-05-24 Thread Randy Needham
host - Windows Server 2022
postgresql - 14.8
pgAdmin 4 - 7.1
openssl - 3.1.0

So I have generated a key and csr file to be sent to a CA cert issuer InCommon. 
 I generated via openssl with the following command.

openssl.exe req -newkey rsa:2048 -nodes -keyout postgresql.key -out 
postgresql.csr

Downloaded the PKCS#7, PEM encoded version of the cert to use.  The following 
is the changes I did to postgresql.conf.  The x.x.x.x is the actual IP of the 
Server.

listen_addresses = 'x.x.x.x' 

ssl = on
#ssl_ca_file = ''
ssl_cert_file = './certs/postgresql.cer'
#ssl_crl_file = ''
#ssl_crl_dir = ''
ssl_key_file = './certs/postgresql.key'
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on
#ssl_ecdh_curve = 'prime256v1'
#ssl_min_protocol_version = 'TLSv1.2'
#ssl_max_protocol_version = ''
#ssl_dh_params_file = ''
#ssl_passphrase_command = ''
#ssl_passphrase_command_supports_reload = off

Here is the current setup of pg_hba.conf with real IP's being x.x.x.x


# "local" is for Unix domain socket connections only
#local   all all scram-sha-256
# IPv4 local connections:
#hostall all 127.0.0.1/32scram-sha-256
#hostall all x.x.x.x/32   scram-sha-256
hostall all x.x.x.x/32scram-sha-256
hostssl all all 127.0.0.1/32scram-sha-256
hostssl all all x.x.x.x/32   scram-sha-256
hostssl all all x.x.x.x/32scram-sha-256
# IPv6 local connections:
# hostall all ::1/128 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication all scram-sha-256
hostreplication all 127.0.0.1/32scram-sha-256
# hostreplication all ::1/128 scram-sha-256

The problem I am running into is my remote client can't connect via SSL to 
postgrsql.  I am able to from the server itself.  This is using pgAdmin 4 and 
making ssl mode as required.  Also ran psql.exe on the server to show that SSL 
was in fact working on the server. "SSL connection (protocol: TLSv1.3, cipher: 
TLS_AES_256_GCM_SHA384, bits: 256, compression: off)".  In the logs it is 
showing this when I try to connect via my remote client.

[2672] LOG:  could not accept SSL connection: An existing connection was 
forcibly closed by the remote host.

The error from pgAdmin 4 on the remote client is this.

connection failed: server closed the connection unexpectedly This probably 
means the server terminated abnormally before or while processing the request.  
SSL SYSCALL error: Connection reset by peer (0x2746/100054)

I have been trying to find a solution with no luck.  I am hoping that I might 
be missing something simple and someone will be able to see it.




Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it

2023-05-24 Thread Pavel Horal
Hello PostgreSQL community,

I am trying to improve performance of using similarity based queries on a
large datasets and I would like to confirm my understanding of how GIN
indexes work and how pg_trgm uses them.

If I understand it correctly, using GIN index is always a two step process:
first, potential matches are searched in the index; then as a second step
tuples are actually fetched and rechecked if they really match the query.
This two step process can lead to degraded performance when the index scan
matches too many elements that then are read from disk only to be dropped
as non-matching during the recheck phase. *Is that understanding correct?*

Now to the issue... pg_trgm's similarity search can use similarity operator
% to search for "similar" documents. Concept of "similarity" is based on a
similarity of trigram array extracted from the query string and trigram
arrays of searched values. This concept is quite tricky in a sense that
just by matching trigrams in GIN index PostgreSQL can not tell if the final
value will match or not as it does not know how many trigrams overall are
there in that value...

Consider following example:

CREATE TABLE test (id SERIAL, value TEXT);
CREATE INDEX test_idx ON test USING GIN (value gin_trgm_ops);
INSERT INTO test (value) SELECT 'lorem ipsum ' || id || repeat('foo bar',
CAST(random() * 100 AS INT)) FROM generate_series(1, 10) source(id);

SET pg_trgm.similarity_threshold TO 0.5;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem';
 QUERY PLAN

 Bitmap Heap Scan on test  (cost=2024.08..2062.86 rows=10 width=374)
(actual time=2261.727..2261.728 rows=0 loops=1)
   Recheck Cond: (value % 'lorem'::text)
   Rows Removed by Index Recheck: 10
   Heap Blocks: exact=5025
   Buffers: shared hit=5636
   ->  Bitmap Index Scan on test_idx  (cost=0.00..2024.08 rows=10 width=0)
(actual time=19.242..19.242 rows=10 loops=1)
 Index Cond: (value % 'lorem'::text)
 Buffers: shared hit=611
 Planning:
   Buffers: shared hit=1
 Planning Time: 2.417 ms
 Execution Time: 2261.765 ms
(12 rows)


If I understand this correctly the *index scan really matches all 10
items that are then read from disk* only *to be discarded during the
recheck*. So 2 seconds of doing all that work to return zero results (and I
was lucky in my example to only have shared buffer hits, so no real disk
I/O).

*Is my understanding correct that this happens only because pg_trgm is not
able to actually determine if the matched item from the index search is
actually much much longer than the query?* Is there any way how the
performance can be improved in this case? I thought that I can store number
of trigrams in the index, but that is not being used by the query planner:

CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops,
array_length(show_trgm(value), 1));

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem' AND
array_length(show_trgm(value), 1) < array_length(show_trgm('lorem'), 1) /
0.5;
QUERY PLAN
---
 Bitmap Heap Scan on test  (cost=56.08..94.96 rows=3 width=376) (actual
time=2273.225..2273.226 rows=0 loops=1)
   Recheck Cond: (value % 'lorem'::text)
   Rows Removed by Index Recheck: 10
   Filter: ((array_length(show_trgm(value), 1))::numeric <
12.)
   Heap Blocks: exact=5025
   Buffers: shared hit=5134
   ->  Bitmap Index Scan on test_idx3  (cost=0.00..56.08 rows=10 width=0)
(actual time=15.945..15.946 rows=10 loops=1)
 Index Cond: (value % 'lorem'::text)
 Buffers: shared hit=109
 Planning:
   Buffers: shared hit=3
 Planning Time: 2.394 ms
 Execution Time: 2273.256 ms
(13 rows)


Thank you for any sort of insight into this.

Regards,
Pavel


Re: 15 pg_upgrade with -j

2023-05-24 Thread Alvaro Herrera
On 2023-May-23, Ron wrote:

> We'd never hardlink.  Eliminates the ability to return to the old system if
> something goes wrong.

If you'd never hardlink, then you should run your test without the -k
option.  Otherwise, the timings are meaningless.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




Re: confused by int2vector typdelim

2023-05-24 Thread Tom Lane
jian he  writes:
> should I expect the typdelim be a white space? Since '1 2'::int2vector
> works, '1,2'::int2vector does not work.

typdelim applies to the type's associated array type, that is
int2vector[].

regression=# select '{1 2,3 4 5}'::int2vector[];
   int2vector
-
 {"1 2","3 4 5"}
(1 row)

regards, tom lane




confused by int2vector typdelim

2023-05-24 Thread jian he
hi.
select
pt.typtype
, pt.typcategory
, pt.typdelim
, pt.typelem
, pt1.typname   as elem_type
,pt.typsubscript
,pt.typname
frompg_type pt  join pg_type pt1 on pt.typelem = pt1.oid
where   pt.typname = 'int2vector';

returns
 typtype | typcategory | typdelim | typelem | elem_type |  typsubscript
  |  typname
-+-+-+--+---+-+
 b| A  | ,|  21  | int2 |
array_subscript_handler | int2vector
(1 row)

from manual:

> typdelim char
> Character that separates two values of this type when parsing array input.
> Note that the delimiter is associated with the array element data type, not
> the array data type.


should I expect the typdelim be a white space? Since '1 2'::int2vector
works, '1,2'::int2vector does not work.