Re: DB migration : Sybase to Postgres

2023-05-25 Thread Ron
I don't remember, to be honest.  ALTER TABLE ... VALIDATE CONSTRAINT was 
/really fast/, though.


Having a supporting index (which Pg does /not/ automatically create) is 
vital, of course.
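For example (a sketch with illustrative table and column names -- Postgres indexes the referenced key through its primary-key or unique constraint, but creates nothing on the referencing side):

```sql
-- Index the referencing column of the child table yourself;
-- the FK constraint alone does not create it.
CREATE INDEX ON child (parent_id);
```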


On 5/25/23 23:16, Sengottaiyan T wrote:

Thanks Ron.

While enabling the constraint (validating it) after the initial data 
import, will it scan all of the table data, or will validation apply 
only to rows inserted after that point?


On Thu, May 25, 2023 at 6:40 PM Ron  wrote:

If I read your email correctly, I see two options
- Apply FK constraints *after* initial data load.
- Load parent data before child data

(You can create the FKs ahead of time, but use the NOT VALID clause;
then, after data is loaded do ALTER TABLE ... VALIDATE CONSTRAINT.)

On 5/25/23 06:30, Sengottaiyan T wrote:

Is there an option to set novalidate constraints in postgres? In my
source Sybase DB, table structures are not defined properly (only
primary keys exist and no foreign key) - I'm making necessary changes
on target Postgres DB (created tables, identifying relationship
between table columns with respective team). After creating proper
structure on target, the next step is to load data - I'm sure there
will be a lot of errors during initial data load (no parent record
found). How to handle it?

On Tue, Mar 28, 2023 at 3:22 PM Sengottaiyan T 
wrote:

Team,

Need some help regarding Open Source tools for DB migration (from
SYBASE to PostgreSQL) with real time CDC. Along with this, is
there any possibility to migrate other objects (like Triggers,
Stored Procedures and Functions) to PostgreSQL?

Thanks,
Senko



-- 
Born in Arizona, moved to Babylonia.




--
Born in Arizona, moved to Babylonia.

Re: DB migration : Sybase to Postgres

2023-05-25 Thread Sengottaiyan T
Thanks Ron.

While enabling the constraint (validating it) after the initial data import,
will it scan all of the table data, or will validation apply only to rows
inserted after that point?

On Thu, May 25, 2023 at 6:40 PM Ron  wrote:

> If I read your email correctly, I see two options
> - Apply FK constraints *after* initial data load.
> - Load parent data before child data
>
> (You can create the FKs ahead of time, but use the NOT VALID clause; then,
> after data is loaded do ALTER TABLE ... VALIDATE CONSTRAINT.)
>
> On 5/25/23 06:30, Sengottaiyan T wrote:
>
> Is there an option to set novalidate constraints in postgres? In my source
> Sybase DB, table structures are not defined properly (only primary keys
> exist and no foreign key) - I'm making necessary changes on target Postgres
> DB (created tables, identifying relationship between table columns with
> respective team). After creating proper structure on target, the next step
> is to load data - I'm sure there will be a lot of errors during initial
> data load (no parent record found). How to handle it?
>
> On Tue, Mar 28, 2023 at 3:22 PM Sengottaiyan T 
> wrote:
>
>> Team,
>>
>> Need some help regarding Open Source tools for DB migration (from SYBASE
>> to PostgreSQL) with real time CDC. Along with this, is there any
>> possibility to migrate other objects (like Triggers, Stored Procedures and
>> Functions) to PostgreSQL?
>>
>> Thanks,
>> Senko
>>
>
> --
> Born in Arizona, moved to Babylonia.
>


PostgreSQL GSSAPI Windows AD

2023-05-25 Thread Jean-Philippe Chenel
Hi,

I've recently upgraded from PostgreSQL 9.6 to 14, and from Ubuntu 16.04 to 22.04.
I've completed all the installation steps required for PostgreSQL to use GSSAPI 
authentication against a Windows domain.

Something is going wrong and I don't know why.
When I change the mapped user's password from "postgres" to anything else, the 
connection stops working.

Log of postgres:
Unspecified GSS failure.  Minor code may provide more information: Request 
ticket server postgres/ubuntu.ad.corp@ad.corp.com not found in keytab 
(ticket kvno 3)

Here is the ktpass command (Windows AD):

working:
ktpass -out postgres.keytab -princ postgres/ubuntu.ad.corp@ad.corp.com 
-mapUser AD\pgsql_ubuntu -pass postgres -mapOp add -crypto AES256-SHA1 -ptype 
KRB5_NT_PRINCIPAL

not working:
ktpass -out postgres.keytab -princ postgres/ubuntu.ad.corp@ad.corp.com 
-mapUser AD\pgsql_ubuntu -pass other_password -mapOp add -crypto AES256-SHA1 
-ptype KRB5_NT_PRINCIPAL

I put the keytab on the postgres server, the keytab file is referenced in the 
postgresql.conf file.

Here is the full procedure:

  1.  Create user in AD for postgresql mapping (pgsql_ubuntu), always valid, 
support AES256
  2.  Create another user for connection testing
  3.  run ktpass command
  4.  put the keytab file on the pg server in /etc/postgresql, chown to 
postgres and chmod 600
  5.  postgresql.conf krb_server_keyfile = '/etc/postgresql/postgres.keytab'
  6.  pg_hba is configured to connect over gss
  7.  ubuntu server (postgres) is added to domain with this command:
sudo realm join server.ad.corp.com -U Administrateur
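
One way to narrow this down (a sketch; klist and kvno are the MIT Kerberos client tools, and the principal is the one from the log above) is to compare key version numbers:

```shell
# kvno(s) stored in the keytab file on the Postgres server:
klist -k -e /etc/postgresql/postgres.keytab

# kvno the KDC currently issues for the service principal
# (run from a host with a valid Kerberos ticket):
kvno postgres/ubuntu.ad.corp@ad.corp.com
```

If the two numbers differ, a likely explanation is a stale keytab: Active Directory bumps the account's kvno whenever its password is set (including by ktpass itself), so a keytab generated before a password change no longer matches and must be regenerated.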


I don't know why it works when the password is "postgres" and why I can't 
change it.

With best regards,


Re: Having issue with SSL.

2023-05-25 Thread Randy Needham
Some new information.  I was able to connect to PostgreSQL via SSL from a 
machine in the same subnet as the server.  Beyond port 5432, are there any other 
ports that need to be opened to access PostgreSQL via SSL?  There is a firewall 
between the server and the clients that need access, but we have only port 
5432 opened.
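
For what it's worth, a single open port should be enough: the TLS handshake happens inside the normal port-5432 connection, not on a separate port. A couple of illustrative client-side checks (the host name is a placeholder; `-starttls postgres` needs OpenSSL 1.1.1 or later):

```shell
# Does the TLS handshake complete from the client side of the firewall?
openssl s_client -starttls postgres -connect db.example.com:5432

# Or let libpq negotiate it and report the connection details:
psql "host=db.example.com port=5432 sslmode=require user=postgres" -c "\conninfo"
```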

From: Randy Needham 
Sent: Wednesday, May 24, 2023 2:02 PM
To: pgsql-general@lists.postgresql.org 
Subject: Having issue with SSL.

host - Windows Server 2022
postgresql - 14.8
pgAdmin 4 - 7.1
openssl - 3.1.0

So I have generated a key and CSR file to be sent to a CA cert issuer, InCommon. 
 I generated them via openssl with the following command.

openssl.exe req -newkey rsa:2048 -nodes -keyout postgresql.key -out 
postgresql.csr

Downloaded the PKCS#7, PEM-encoded version of the cert to use.  The following 
are the changes I made to postgresql.conf.  The x.x.x.x is the actual IP of the 
server.

listen_addresses = 'x.x.x.x' 

ssl = on
#ssl_ca_file = ''
ssl_cert_file = './certs/postgresql.cer'
#ssl_crl_file = ''
#ssl_crl_dir = ''
ssl_key_file = './certs/postgresql.key'
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on
#ssl_ecdh_curve = 'prime256v1'
#ssl_min_protocol_version = 'TLSv1.2'
#ssl_max_protocol_version = ''
#ssl_dh_params_file = ''
#ssl_passphrase_command = ''
#ssl_passphrase_command_supports_reload = off

Here is the current setup of pg_hba.conf with real IP's being x.x.x.x


# "local" is for Unix domain socket connections only
#local   all             all                                     scram-sha-256
# IPv4 local connections:
#host    all             all             127.0.0.1/32            scram-sha-256
#host    all             all             x.x.x.x/32              scram-sha-256
host     all             all             x.x.x.x/32              scram-sha-256
hostssl  all             all             127.0.0.1/32            scram-sha-256
hostssl  all             all             x.x.x.x/32              scram-sha-256
hostssl  all             all             x.x.x.x/32              scram-sha-256
# IPv6 local connections:
#host    all             all             ::1/128                 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local    replication     all                                     scram-sha-256
host     replication     all             127.0.0.1/32            scram-sha-256
#host    replication     all             ::1/128                 scram-sha-256

The problem I am running into is that my remote client can't connect via SSL to 
PostgreSQL, though I am able to from the server itself.  This is using pgAdmin 4 
with SSL mode set to "require".  I also ran psql.exe on the server to confirm 
that SSL was in fact working there: "SSL connection (protocol: TLSv1.3, cipher: 
TLS_AES_256_GCM_SHA384, bits: 256, compression: off)".  The logs show this when 
I try to connect from my remote client.

[2672] LOG:  could not accept SSL connection: An existing connection was 
forcibly closed by the remote host.

The error from pgAdmin 4 on the remote client is this.

connection failed: server closed the connection unexpectedly This probably 
means the server terminated abnormally before or while processing the request.  
SSL SYSCALL error: Connection reset by peer (0x2746/100054)

I have been trying to find a solution with no luck.  I am hoping that I might 
be missing something simple and someone will be able to see it.




Re: CREATE TEMPORARY TABLE LIKE

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 15:06 CEST Jim Vanns  wrote:
>
> When basing a temporary table of a source table, are triggers included
> by default? I have this statement;
>
> CREATE TEMPORARY TABLE dev_main (
> LIKE prod_main
> INCLUDING ALL
> EXCLUDING INDEXES
> EXCLUDING CONSTRAINTS
> ) ON COMMIT DELETE ROWS;
>
> And wondering if there is a trigger (row-based after) on prod_main
> it'll fire also on dev_main? I can't find anything in the
> documentation that suggests either way nor can I see an explicit
> EXCLUDING option to be sure triggers aren't copied.

You can check if triggers exist with psql:

\d dev_main

or by checking catalog pg_trigger:

select * from pg_trigger where tgrelid = 'dev_main'::regclass;

But no.  Triggers are not included when creating tables like that.
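
A quick way to see this in a scratch session (a sketch; prod_main and its trigger here are stand-ins for the real ones):

```sql
CREATE TABLE prod_main (id int);
CREATE FUNCTION noop() RETURNS trigger LANGUAGE plpgsql
  AS $$ BEGIN RETURN NEW; END $$;
CREATE TRIGGER prod_trg AFTER INSERT ON prod_main
  FOR EACH ROW EXECUTE FUNCTION noop();

CREATE TEMPORARY TABLE dev_main (LIKE prod_main INCLUDING ALL)
  ON COMMIT DELETE ROWS;

-- Returns no rows: the LIKE clause has no INCLUDING TRIGGERS option,
-- so no trigger is ever copied to the new table.
SELECT tgname FROM pg_trigger
 WHERE tgrelid = 'dev_main'::regclass AND NOT tgisinternal;
```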

--
Erik




Re: CREATE TEMPORARY TABLE LIKE

2023-05-25 Thread Jim Vanns
I just wrote a little test case... it appears not. Triggers aren't
fired in the temporary table.

Jim

On Thu, 25 May 2023 at 14:06, Jim Vanns  wrote:
>
> Hi everyone,
>
> When basing a temporary table of a source table, are triggers included
> by default? I have this statement;
>
> CREATE TEMPORARY TABLE dev_main (
> LIKE prod_main
> INCLUDING ALL
> EXCLUDING INDEXES
> EXCLUDING CONSTRAINTS
> ) ON COMMIT DELETE ROWS;
>
> And wondering if there is a trigger (row-based after) on prod_main
> it'll fire also on dev_main? I can't find anything in the
> documentation that suggests either way nor can I see an explicit
> EXCLUDING option to be sure triggers aren't copied.
>
> Does anyone know?
>
> Cheers,
>
> Jim
>
> --
> Jim Vanns
> Principal Production Engineer
> Industrial Light & Magic, London



-- 
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London




Re: DB migration : Sybase to Postgres

2023-05-25 Thread Ron

If I read your email correctly, I see two options:
- Apply FK constraints *after* the initial data load.
- Load parent data before child data.

(You can create the FKs ahead of time, but use the NOT VALID clause; then, 
after data is loaded do ALTER TABLE ... VALIDATE CONSTRAINT.)
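
A minimal sketch of that sequence (table, column, and constraint names are illustrative):

```sql
-- 1. Create the FK up front, but skip checking existing rows:
ALTER TABLE child
  ADD CONSTRAINT child_parent_fk
  FOREIGN KEY (parent_id) REFERENCES parent (id) NOT VALID;

-- 2. Bulk-load the data (new rows are still checked against the FK).

-- 3. Validate existing rows in one pass; this takes a lighter lock
--    than adding an already-validated constraint from scratch:
ALTER TABLE child VALIDATE CONSTRAINT child_parent_fk;
```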


On 5/25/23 06:30, Sengottaiyan T wrote:
Is there an option to set novalidate constraints in postgres? In my source 
Sybase DB, table structures are not defined properly (only primary keys 
exist and no foreign key) - I'm making necessary changes on target 
Postgres DB (created tables, identifying relationship between table 
columns with respective team). After creating proper structure on target, 
the next step is to load data - I'm sure there will be a lot of errors 
during initial data load (no parent record found). How to handle it?


On Tue, Mar 28, 2023 at 3:22 PM Sengottaiyan T  wrote:

Team,

Need some help regarding Open Source tools for DB migration (from
SYBASE to PostgreSQL) with real time CDC. Along with this, is there
any possibility to migrate other objects (like Triggers, Stored
Procedures and Functions) to PostgreSQL?

Thanks,
Senko



--
Born in Arizona, moved to Babylonia.

CREATE TEMPORARY TABLE LIKE

2023-05-25 Thread Jim Vanns
Hi everyone,

When basing a temporary table off a source table, are triggers included
by default? I have this statement:

CREATE TEMPORARY TABLE dev_main (
LIKE prod_main
INCLUDING ALL
EXCLUDING INDEXES
EXCLUDING CONSTRAINTS
) ON COMMIT DELETE ROWS;

And I'm wondering, if there is a trigger (row-based, AFTER) on prod_main,
will it also fire on dev_main? I can't find anything in the documentation
that says either way, nor can I see an explicit EXCLUDING option to be
sure triggers aren't copied.

Does anyone know?

Cheers,

Jim

-- 
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 12:08 CEST Laura Smith  
> wrote:
>
> > Looks like an encoding issue and a mismatch between database encoding and
> > client encoding. You can check both with:
> >
> > SHOW server_encoding;
> > SHOW client_encoding;
> >
> > Then either set the client encoding or use COPY's encoding option to match
> > the database encoding (I assume utf8 in this example):
> >
> > SET client_encoding = 'utf8';
> > COPY (...) TO /tmp/bar.csv DELIMITER ',' CSV HEADER ENCODING 'utf8';
>
> Hi Erik,
>
> Looks like you could well be right about encoding:
>
> postgres=# SHOW server_encoding;
>  server_encoding
> -
>  UTF8
> (1 row)
>
> postgres=# SHOW client_encoding;
>  client_encoding
> -
>  SQL_ASCII
> (1 row)
>
> I will try your suggestion...

The client encoding is not the problem here.  Using SQL_ASCII effectively uses
the server encoding.  SQL_ASCII basically means uninterpreted bytes/characters.

From https://www.postgresql.org/docs/15/multibyte.html#id-1.6.11.5.7:

"If the client character set is defined as SQL_ASCII, encoding conversion is
 disabled, regardless of the server's character set. (However, if the server's
 character set is not SQL_ASCII, the server will still check that incoming data
 is valid for that encoding; so the net effect is as though the client character
 set were the same as the server's.) Just as for the server, use of SQL_ASCII is
 unwise unless you are working with all-ASCII data."

--
Erik




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 13:26 CEST Peter J. Holzer  wrote:
>
> On 2023-05-25 07:14:40 +, Laura Smith wrote:
> > I'm currently doing a CSV export using COPY:
> >
> > COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' 
> > CSV HEADER;
> >
> > This works great apart from accents are not preserved in the output,
> > for example é gets converted to random characters, e.g. √© or similar.
>
> How do you check the output?
>
> If a single character is turned into 2 or 3 characters the issue is
> usually that the program which produces the output (in the case of COPY
> I think that would be the PostgreSQL server, not the client) produces
> UTF-8, but the program consuming it expects an 8-bit character set
> (typically windows-1252). See if you can tell that program that the file
> is in UTF-8.
>
> > How can I preserve accents ?
>
> They probably already are preserved.

You're right.  The bytes are probably interpreted as Mac OS Roman:

$ echo é | iconv -f macintosh
é
$ echo -n é | xxd
00000000: c3a9                                     ..
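
The same check in Python, if that's handier than iconv (mac_roman is Python's codec name for Mac OS Roman):

```python
# UTF-8 encodes "é" as the two bytes 0xC3 0xA9; reading those bytes
# back under Mac OS Roman yields the "√©" seen in the original report.
utf8_bytes = "é".encode("utf-8")
print(utf8_bytes.hex())                # c3a9
print(utf8_bytes.decode("mac_roman"))  # √©
```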

--
Erik




Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it

2023-05-25 Thread Tomas Vondra
On 5/24/23 22:28, Pavel Horal wrote:
> Hello PostgreSQL community,
> 
> I am trying to improve performance of using similarity based queries on
> a large datasets and I would like to confirm my understanding of how GIN
> indexes work and how pg_trgm uses them.
> 
> If I understand it correctly, using GIN index is always a two step
> process: first, potential matches are searched in the index; then as a
> second step tuples are actually fetched and rechecked if they really
> match the query. This two step process can lead to degraded performance
> when the index scan matches too many elements that then are read from
> disk only to be dropped as non-matching during the recheck phase. *Is
> that understanding correct?*
> 

Correct. GIN gives you "candidate" rows, but the recheck may still
eliminate those.

> Now to the issue... pg_trgm's similarity search can use similarity
> operator % to search for "similar" documents. Concept of "similarity" is
> based on a similarity of trigram array extracted from the query string
> and trigram arrays of searched values. This concept is quite tricky in a
> sense that just by matching trigrams in GIN index PostgreSQL can not
> tell if the final value will match or not as it does not know how many
> trigrams overall are there in that value... 
> 
> Consider following example:
> 
> CREATE TABLE test (id SERIAL, value TEXT);
> CREATE INDEX test_idx ON test USING GIN (value gin_trgm_ops);
> INSERT INTO test (value) SELECT 'lorem ipsum ' || id || repeat('foo
> bar', CAST(random() * 100 AS INT)) FROM generate_series(1, 10)
> source(id);
> 
> SET pg_trgm.similarity_threshold TO 0.5;
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem';
>                                                          QUERY PLAN
> 
>  Bitmap Heap Scan on test  (cost=2024.08..2062.86 rows=10 width=374)
> (actual time=2261.727..2261.728 rows=0 loops=1)
>    Recheck Cond: (value % 'lorem'::text)
>    Rows Removed by Index Recheck: 10
>    Heap Blocks: exact=5025
>    Buffers: shared hit=5636
>    ->  Bitmap Index Scan on test_idx  (cost=0.00..2024.08 rows=10
> width=0) (actual time=19.242..19.242 rows=10 loops=1)
>          Index Cond: (value % 'lorem'::text)
>          Buffers: shared hit=611
>  Planning:
>    Buffers: shared hit=1
>  Planning Time: 2.417 ms
>  Execution Time: 2261.765 ms
> (12 rows)
> 
> 
> If I understand this correctly the *index scan really matches all 10
> items that are then read from disk* only *to be discarded during the
> recheck*. So 2 seconds of doing all that work to return zero results
> (and I was lucky in my example to only have shared buffer hits, so no
> real disk I/O).
> 
> *Is my understanding correct that this happens only because pg_trgm is
> not able to actually determine if the matched item from the index search
> is actually much much longer than the query?* 

Yes, I think that's mostly correct understanding. The trouble here is
that the posting list for "lorem" is very long - it contains TID for
pretty much every row in the table. We can't calculate similarity at
that point from trigrams alone, so we have to fetch the rows.

> Is there any way how the
> performance can be improved in this case? I thought that I can store
> number of trigrams in the index, but that is not being used by the query
> planner:
> 
> CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops,
> array_length(show_trgm(value), 1));
> 
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem' AND
> array_length(show_trgm(value), 1) < array_length(show_trgm('lorem'), 1)
> / 0.5;
>                                                         QUERY PLAN
> ---
>  Bitmap Heap Scan on test  (cost=56.08..94.96 rows=3 width=376) (actual
> time=2273.225..2273.226 rows=0 loops=1)
>    Recheck Cond: (value % 'lorem'::text)
>    Rows Removed by Index Recheck: 10
>    Filter: ((array_length(show_trgm(value), 1))::numeric <
> 12.)
>    Heap Blocks: exact=5025
>    Buffers: shared hit=5134
>    ->  Bitmap Index Scan on test_idx3  (cost=0.00..56.08 rows=10
> width=0) (actual time=15.945..15.946 rows=10 loops=1)
>          Index Cond: (value % 'lorem'::text)
>          Buffers: shared hit=109
>  Planning:
>    Buffers: shared hit=3
>  Planning Time: 2.394 ms
>  Execution Time: 2273.256 ms
> (13 rows)
> 
> 
> Thank you for any sort of insight into this.

I don't think indexing the number of trigrams like this can help, and
I'm not sure how to improve this (at least for the built-in GIN). It
seems similarity searches are bound to be proportional to the most
frequent trigram in the query.

I wonder if the "newer" GIN variants like RUM [1] could improve this,
but I don't think it has trgm opclasses.


regards

Re: DB migration : Sybase to Postgres

2023-05-25 Thread Sengottaiyan T
Is there an option to create NOVALIDATE-style constraints in Postgres? In my
source Sybase DB, table structures are not defined properly (only primary keys
exist, no foreign keys). I'm making the necessary changes on the target Postgres
DB (creating tables and identifying relationships between table columns with the
respective team). After creating the proper structure on the target, the next
step is to load data. I'm sure there will be a lot of errors during the initial
data load (no parent record found). How should I handle them?

On Tue, Mar 28, 2023 at 3:22 PM Sengottaiyan T  wrote:

> Team,
>
> Need some help regarding Open Source tools for DB migration (from SYBASE
> to PostgreSQL) with real time CDC. Along with this, is there any
> possibility to migrate other objects (like Triggers, Stored Procedures and
> Functions) to PostgreSQL?
>
> Thanks,
> Senko
>


Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Peter J. Holzer
On 2023-05-25 07:14:40 +, Laura Smith wrote:
> I'm currently doing a CSV export using COPY:
> 
> COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' CSV 
> HEADER;
> 
> 
> This works great apart from accents are not preserved in the output,
> for example é gets converted to random characters, e.g. √© or similar.

How do you check the output?

If a single character is turned into 2 or 3 characters the issue is
usually that the program which produces the output (in the case of COPY
I think that would be the PostgreSQL server, not the client) produces
UTF-8, but the program consuming it expects an 8-bit character set
(typically windows-1252). See if you can tell that program that the file
is in UTF-8.
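
A sketch of that failure mode in Python (codec names as Python spells them; nothing here is specific to COPY):

```python
# "é" leaves the server as two UTF-8 bytes. A consumer that assumes an
# 8-bit charset such as windows-1252 shows two characters instead of
# one, but the underlying bytes -- and thus the accent -- are intact.
data = "é".encode("utf-8")
print(data)                         # b'\xc3\xa9'
print(data.decode("windows-1252"))  # Ã© (mojibake on display only)
print(data.decode("utf-8"))         # é  (nothing was lost)
```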

> How can I preserve accents ?

They probably already are preserved.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Laura Smith


> Looks like an encoding issue and a mismatch between database encoding and 
> client
> encoding. You can check both with:
> 
> SHOW server_encoding;
> SHOW client_encoding;
> 
> Then either set the client encoding or use COPY's encoding option to match the
> database encoding (I assume utf8 in this example):
> 
> SET client_encoding = 'utf8';
> COPY (...) TO /tmp/bar.csv DELIMITER ',' CSV HEADER ENCODING 'utf8';
> 
> --
> Erik

Hi Erik,

Looks like you could well be right about encoding:

postgres=# SHOW server_encoding;
 server_encoding 
-
 UTF8
(1 row)

postgres=# SHOW client_encoding;
 client_encoding 
-
 SQL_ASCII
(1 row)

I will try your suggestion...





Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 09:14 CEST Laura Smith  
> wrote:
>
> I'm currently doing a CSV export using COPY:
>
> COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' CSV 
> HEADER;
>
> This works great apart from accents are not preserved in the output, for
> example é gets converted to random characters, e.g. √© or similar.
>
> How can I preserve accents ?

Looks like an encoding issue and a mismatch between database encoding and client
encoding.  You can check both with:

SHOW server_encoding;
SHOW client_encoding;

Then either set the client encoding or use COPY's encoding option to match the
database encoding (I assume utf8 in this example):

SET client_encoding = 'utf8';
COPY (...) TO /tmp/bar.csv DELIMITER ',' CSV HEADER ENCODING 'utf8';

--
Erik




Maintaining accents with "COPY" ?

2023-05-25 Thread Laura Smith
Hi

I'm currently doing a CSV export using COPY:

COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' CSV 
HEADER;


This works great apart from the fact that accents are not preserved in the 
output; for example, é gets converted to seemingly random characters, e.g. √© 
or similar.


How can I preserve accents ?


Thanks !


Laura