Re: Apparent bug in WAL summarizer process (hung state)

2024-06-27 Thread Israel Barth Rubio
> Yeah, this is a bug. It seems that the WAL summarizer process, when
> restarted, wants to restart from wherever it was previously
> summarizing WAL, which is correct if that WAL is still around, but if
> summarize_wal has been turned off in the meanwhile, it might not be
> correct. Here's a patch to fix that.

Thanks for checking this!

> Thanks. New version attached.

And besides that, thanks for the patch, of course!

I compiled Postgres locally with your patch. I attempted to break it several
times, both manually and through a shell script.

No success on that -- which in this case is actually success :)
The WAL summarizer seems able to always resume from a valid point,
so `pg_basebackup` isn't failing anymore.


Re: Apparent bug in WAL summarizer process (hung state)

2024-06-24 Thread Israel Barth Rubio
I'm attaching the files which I missed in the original email.

>
19:34:17.437626 epoll_wait(5, [], 1, 8161) = 0 <8.171542>
19:34:25.610176 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.000334>
19:34:25.611012 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = 
-1 ENOENT (No such file or directory) <0.000323>
19:34:25.611657 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL 
SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.000406>
19:34:25.612327 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:25.612"..., 
132) = 132 <0.000203>
19:34:25.612873 epoll_wait(5, [], 1, 1) = 0 <10.010950>
19:34:35.623942 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.12>
19:34:35.624120 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = 
-1 ENOENT (No such file or directory) <0.18>
19:34:35.624191 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL 
SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.07>
19:34:35.624237 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:35.624"..., 
132) = 132 <0.52>
19:34:35.624341 epoll_wait(5, [], 1, 1) = 0 <10.010941>
19:34:45.635408 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.16>
19:34:45.635602 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = 
-1 ENOENT (No such file or directory) <0.62>
19:34:45.635765 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL 
SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.09>
19:34:45.635830 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:45.635"..., 
132) = 132 <0.000495>
19:34:45.636390 epoll_wait(5, [], 1, 1) = 0 <10.008785>
19:34:55.645305 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.16>
19:34:55.645520 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = 
-1 ENOENT (No such file or directory) <0.32>
19:34:55.645645 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL 
SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.27>
19:34:55.646214 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:55.645"..., 
132) = 132 <0.000136>
19:34:55.646445 epoll_wait(5, [], 1, 1) = 0 <10.011731>
19:35:05.658366 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.18>
19:35:05.658591 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = 
-1 ENOENT (No such file or directory) <0.28>
19:35:05.658689 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL 
SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.10>
19:35:05.658750 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:35:05.658"..., 
132) = 132 <0.000521>
19:35:05.659335 epoll_wait(5,  
#0  0x7fb51c50e1da in epoll_wait (epfd=5, events=0x1409fd8, maxevents=1, 
timeout=timeout@entry=1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00922bad in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7ffe5492aba0, cur_timeout=, set=0x1409f70) at 
storage/ipc/latch.c:1570
#2  WaitEventSetWait (set=0x1409f70, timeout=1, 
occurred_events=0x7ffe5492aba0, nevents=1, wait_event_info=) at 
storage/ipc/latch.c:1516
#3  0x00920e65 in WaitLatch (latch=, 
wakeEvents=, timeout=, wait_event_info=150994953) 
at storage/ipc/latch.c:538
#4  0x008aa7e4 in WalSummarizerMain (startup_data=, 
startup_data_len=) at postmaster/walsummarizer.c:317
#5  0x008a46a0 in postmaster_child_launch (client_sock=0x0, 
startup_data_len=0, startup_data=0x0, child_type=B_WAL_SUMMARIZER) at 
postmaster/launch_backend.c:265
#6  postmaster_child_launch (client_sock=0x0, startup_data_len=0, 
startup_data=0x0, child_type=B_WAL_SUMMARIZER) at 
postmaster/launch_backend.c:226
#7  StartChildProcess (type=type@entry=B_WAL_SUMMARIZER) at 
postmaster/postmaster.c:3928
#8  0x008a4dcf in MaybeStartWalSummarizer () at 
postmaster/postmaster.c:4075
#9  MaybeStartWalSummarizer () at postmaster/postmaster.c:4070
#10 0x008a7f8f in ServerLoop () at postmaster/postmaster.c:1746
#11 0x0089f5e0 in PostmasterMain (argc=3, argv=0x14097b0) at 
postmaster/postmaster.c:1372
#12 0x005381aa in main (argc=3, argv=0x14097b0) at main/main.c:197
No symbol "full" in current context.
#0  0x7fb51c50e1da in epoll_wait (epfd=5, events=0x1409fd8, maxevents=1, 
timeout=timeout@entry=1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_ret = 
#1  0x00922bad in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7ffe5492aba0, cur_timeout=, set=0x1409f70) at 
storage/ipc/latch.c:1570
returned_events = 0
rc = 
cur_event = 
cur_epoll_event = 
returned_events = 
rc = 
cur_event = 
cur_epoll_event = 
__func__ = { }
__errno_location = 
#2  WaitEventSetWait (set=0x1409f70, timeout=1, 
occurred_events=0x7ffe5492aba0, nevents=1, wait_event_info=) at 
storage/ipc/latch.c:1516
rc = 
returned_events = 0
start_time = {ticks = 49310570173263}
cur_time = {ticks = }
cur_timeout = 
#3  

Apparent bug in WAL summarizer process (hung state)

2024-06-24 Thread Israel Barth Rubio
Hello,

Hope you are doing well.

I've been playing a bit with the incremental backup feature which might
come as
part of the 17 release, and I think I hit a possible bug in the WAL
summarizer
process.

The issue that I face refers to the summarizer process getting into a hung
state.
When the issue is triggered, it keeps in an infinite loop trying to process
a WAL
file that no longer exists.  It apparently comes up only when I perform
changes to
`wal_summarize` GUC and reload Postgres, while there is some load in
Postgres
which makes it recycle WAL files.

I'm running Postgres 17 in a Rockylinux 9 VM. In order to have less WAL
files
available in `pg_wal` and make it easier to reproduce the issue, I'm using
a low
value for `max_wal_size` ('100MB'). You can find below the steps that I
took to
reproduce this problem, assuming this small `max_wal_size`, and
`summarize_wal`
initially enabled:

```bash
# Assume we initially have max_wal_size = '100MB' and summarize_wal = on

# Create a table of ~ 100MB
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Take a full backup
pg_basebackup -X none -c fast -P -D full_backup_1

# Recreate a table of ~ 100MB
psql -c "DROP TABLE test"
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Take an incremental backup
pg_basebackup -X none -c fast -P -D incremental_backup_1 --incremental
full_backup_1/backup_manifest

# Disable summarize_wal
psql -c "ALTER SYSTEM SET summarize_wal TO off"
psql -c "SELECT pg_reload_conf()"

# Recreate a table of ~ 100MB
psql -c "DROP TABLE test"
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Re-enable sumarize_wal
psql -c "ALTER SYSTEM SET summarize_wal TO on"
psql -c "SELECT pg_reload_conf()"

# Take a full backup
pg_basebackup -X none -c fast -P -D full_backup_2

# Take an incremental backup
pg_basebackup -X none -c fast -P -D incremental_backup_2 --incremental
full_backup_2/backup_manifest
```

I'm able to reproduce the issue most of the time when running these steps
manually. It's harder to reproduce if I attempt to run those commands as a
bash script.

This is the sample output of a run of those commands:

```console

(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test AS
SELECT generate_series(1, 300)"SELECT 300(barman)
[postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D
full_backup_1NOTICE:  WAL archiving is not enabled; you must ensure
that all required WAL segments are copied through other means to
complete the backup331785/331785 kB (100%), 1/1 tablespace(barman)
[postgres@barmandevhost ~]$ psql -c "DROP TABLE test"DROP
TABLE(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test
AS SELECT generate_series(1, 300)"SELECT 300(barman)
[postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D
incremental_backup_1 --incremental
full_backup_1/backup_manifestNOTICE:  WAL archiving is not enabled;
you must ensure that all required WAL segments are copied through
other means to complete the backup111263/331720 kB (33%), 1/1
tablespace(barman) [postgres@barmandevhost ~]$ psql -c "ALTER SYSTEM
SET summarize_wal TO off"ALTER SYSTEM(barman) [postgres@barmandevhost
~]$ psql -c "SELECT pg_reload_conf()" pg_reload_conf
t(1 row)
(barman) [postgres@barmandevhost ~]$ psql -c "DROP TABLE test"DROP
TABLE(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test
AS SELECT generate_series(1, 300)"SELECT 300(barman)
[postgres@barmandevhost ~]$ psql -c "ALTER SYSTEM SET summarize_wal TO
on"ALTER SYSTEM(barman) [postgres@barmandevhost ~]$ psql -c "SELECT
pg_reload_conf()" pg_reload_conf t(1 row)
(barman) [postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P
-D full_backup_2NOTICE:  WAL archiving is not enabled; you must ensure
that all required WAL segments are copied through other means to
complete the backup331734/331734 kB (100%), 1/1 tablespace(barman)
[postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D
incremental_backup_2 --incremental
full_backup_2/backup_manifestWARNING:  still waiting for WAL
summarization through 2/C128 after 10 secondsDETAIL:
Summarization has reached 2/B3D8 on disk and 2/B3D8 in
memory.WARNING:  still waiting for WAL summarization through
2/C128 after 20 secondsDETAIL:  Summarization has reached
2/B3D8 on disk and 2/B3D8 in memory.WARNING:  still waiting
for WAL summarization through 2/C128 after 30 secondsDETAIL:
Summarization has reached 2/B3D8 on disk and 2/B3D8 in
memory.WARNING:  still waiting for WAL summarization through
2/C128 after 40 secondsDETAIL:  Summarization has reached
2/B3D8 on disk and 2/B3D8 in memory.WARNING:  still waiting
for WAL summarization through 2/C128 after 50 secondsDETAIL:
Summarization has reached 2/B3D8 on disk and 2/B3D8 in
memory.WARNING:  still waiting for WAL summarization through
2/C128 after 60 secondsDETAIL:  Summarization has reached
2/B3D8 on disk and 

Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-25 Thread Israel Barth Rubio
Hello Jacob,

> I'm not sure how helpful it is to assign "blame" here. I think the
> requested improvement is reasonable -- it should be possible to
> override the default for a particular connection, without having to
> pick a junk value that you hope doesn't match up with an actual file
> on the disk.

Right, I agree we can look for improvements. "blame" was likely
not the best word to express myself in that message.

> sslmode=disable isn't used in either of our proposals, though. Unless
> I'm missing what you mean?

Sorry about the noise, I misread the code snippet shared earlier
(sslmode x sslcertmode). I just took a closer read at the previously
mentioned patch about sslcertmode and it seems a bit
more elegant way of achieving something similar to what has
been proposed here.

Best regards,
Israel.

Em qua., 25 de jan. de 2023 às 14:09, Jacob Champion <
jchamp...@timescale.com> escreveu:

> On Wed, Jan 25, 2023 at 7:47 AM Israel Barth Rubio
>  wrote:
> > I imagine more people might have already hit a similar situation too.
> While the
> > workaround can seem a bit weird, in my very humble opinion the
> user/client is
> > somehow still the one to blame in this case as it is providing the
> "wrong" file in
> > a path that is checked by libpq. With that in mind I would be inclined
> to say it is
> > an acceptable workaround.
>
> I'm not sure how helpful it is to assign "blame" here. I think the
> requested improvement is reasonable -- it should be possible to
> override the default for a particular connection, without having to
> pick a junk value that you hope doesn't match up with an actual file
> on the disk.
>
> > Although both patches achieve a similar goal regarding not sending the
> > client certificate there is still a slight but in my opinion important
> difference
> > between them: sslmode=disable will also disable channel encryption. It
> > may or may not be acceptable depending on how the connection is between
> > your client and the server.
>
> sslmode=disable isn't used in either of our proposals, though. Unless
> I'm missing what you mean?
>
> --Jacob
>


Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-25 Thread Israel Barth Rubio
Hello Jim/Jacob,

> > I do not think it is worth it to change the current behavior of
> PostgreSQL
> > in that sense.
>
> Well, I am not suggesting to change the current behavior of PostgreSQL in
> that matter. Quite the contrary, I find this feature very convenient,
> specially when you need to deal with many different clusters. What I am
> proposing is rather the possibility to disable it on demand :) I mean,
> in case I do not want libpq to try to authenticate using the certificates
> in `~/.postgresql`.
>
> > PostgreSQL looks for the cert and key under `~/.postgresql` as a
> facility.
> > These files do not exist by default, so if PostgreSQL finds something in
> > there it assumes you want to use it.
>
> Yes. I'm just trying to find an elegant way to disable this assumption
> on demand.

Right, I do understand your proposal. I was just thinking out loud and
wondering about the broad audience of such a mode in the sslmode
argument.

Something else that came to my mind is that sslmode itself seems more
like an argument covering the client expectations regarding the connection
to the server, I mean, if it expects channel encryption and/or validation
of the
server identity.

I wonder if we are willing to add some functionality around the expectations
regarding the client certificate, if it wouldn't make more sense to be
controlled
through something like the clientcert option of pg_hba? If so, the downside
of
that is the fact that the client would still send the certificate even if
it would not
be used at all by the server. Again, just thinking out loud about what your
goal
is and possible ways of accomplishing that:)

> > I do realize that this patch is a big ask, since probably nobody except
> > me "needs it" :D
>
> I'd imagine other people have run into it too; it's just a matter of
> how palatable the workarounds were to them. :)

I imagine more people might have already hit a similar situation too. While
the
workaround can seem a bit weird, in my very humble opinion the user/client
is
somehow still the one to blame in this case as it is providing the "wrong"
file in
a path that is checked by libpq. With that in mind I would be inclined to
say it is
an acceptable workaround.

> > I think the sslcertmode=disable option that I introduced in [1]
solves this issue too;
>
> Well, I see there is indeed a significant overlap between our patches -
> but yours has a much more comprehensive approach! If I got it right,
> the new slcertmode=disable would indeed cancel the existing certs in
> '~/.postgresql/ in case they exist. Right?
>
> +if (conn->sslcertmode[0] == 'd') /* disable */
> +{
> +/* don't send a client cert even if we have one */
> +have_cert = false;
> +}
> +else if (fnbuf[0] == '\0')
>
> My idea was rather to use the existing sslmode with a new option
> "no-clientcert" that does actually the same:
>
>  /* sslmode no-clientcert */
>  if (conn->sslmode[0] == 'n')
>  {
>  fnbuf[0] = '\0';
>  }
>
>  ...
>
>  if (fnbuf[0] == '\0')
>  {
>  /* no home directory, proceed without a client cert */
>  have_cert = false;
>  }
>
> I wish I had found your patchset some months ago. Now I hate myself
> for the duplication of efforts :D

Although both patches achieve a similar goal regarding not sending the
client certificate there is still a slight but in my opinion important
difference
between them: sslmode=disable will also disable channel encryption. It
may or may not be acceptable depending on how the connection is between
your client and the server.

Kind regards,
Israel.


Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-19 Thread Israel Barth Rubio
Hello Jim,

> Hi Jelte, thanks for the message. You're right, an invalid cert path
> does solve the issue - I even use it for tests. Although it solves the
> authentication issue it still looks in my eyes like a non intuitive
> workaround/hack. Perhaps a new sslmode isn't the right place for this
> "feature"? Thanks again for the suggestion!

I do not think it is worth it to change the current behavior of PostgreSQL
in that sense.

PostgreSQL looks for the cert and key under `~/.postgresql` as a facility.
These files do not exist by default, so if PostgreSQL finds something in
there it assumes you want to use it.

I also think it is correct in the sense of choosing the certificate over
a password based authentication when it finds a certificate as the cert
based would provide you with stronger checks.

I believe that using libpq services would be a better approach if you
want to connect to several PostgreSQL clusters from the very same
source machine. That way you would specify whatever is specific to each
target cluster in a centralized configuration file and just reference each
target cluster by its service name in the connection string. It would
require that you move the SSL cert and key from `~/.postgresql` to somewhere
else and specify `sslcert` and `sslkey` in the expected service in the
`~/.pg_service.conf` file.

More info about that can be found at:

https://www.postgresql.org/docs/current/libpq-pgservice.html

Best regards,
Israel.

>


Re: Add support for DEFAULT specification in COPY FROM

2022-12-02 Thread Israel Barth Rubio
Hello all,

I'm submitting a new version of the patch. Instead of changing signature
of several functions in order to use the defaults parameter, it is now
storing
that in the cstate structure, which is already passed to all functions that
were previously modified.

Best regards,
Israel.

Em sex., 7 de out. de 2022 às 17:54, Israel Barth Rubio <
barthisr...@gmail.com> escreveu:

> Hello Zhihong,
>
> > For the last question, please take a look at:
> >
> > #define MemSetAligned(start, val, len) \
> >
> > which is called by palloc0().
>
> Oh, I totally missed that. Thanks for the heads up!
>
> I'm attaching the new patch version, which contains both the fix
> to the problem reported by Andres, and removes this useless
> MemSet call.
>
> Best regards,
> Israel.
>


v6-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands

2022-11-25 Thread Israel Barth Rubio
> Attached is a draft patch along the lines I speculated about above.
> It breaks backwards compatibility in that PreventInTransactionBlock
> commands will now be rejected if they're a non-first command in a
> pipeline.  I think that's okay, and arguably desirable, for HEAD
> but I'm pretty uncomfortable about back-patching it.

I attempted to run these using HEAD, and it fails:

parse: create temporary table t1 (a int) on commit drop
bind
execute
parse: analyze t1
bind
execute
parse: select * from t1
bind
execute
sync

It then works fine after applying your patch!

Just for some context, this was brought by Peter E. based on an issue
reported by a customer. They are using PostgreSQL 11, and the issue
was observed after upgrading to PostgreSQL 11.17, which includes the
commit 9e3e1ac458abcda5aa03fa2a136e6fa492d58bd6. As a workaround
they downgraded the binaries to 11.16.

It would be great if we can back-patch this to all supported versions,
as the issue itself is currently affecting them all.

Regards,
Israel.


Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Zhihong,

> For the last question, please take a look at:
>
> #define MemSetAligned(start, val, len) \
>
> which is called by palloc0().

Oh, I totally missed that. Thanks for the heads up!

I'm attaching the new patch version, which contains both the fix
to the problem reported by Andres, and removes this useless
MemSet call.

Best regards,
Israel.


v5-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Andres,

> cfbot shows that tests started failing with this version:
>
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/39/3822
> A full backtrace is at
https://api.cirrus-ci.com/v1/task/5354378189078528/logs/cores.log

Thanks for pointing this out. I had initially missed this as my local runs
of *make check*
were working fine, sorry!

I'm attaching a new version of the patch, containing the memory context
switches.

Regards,
Israel.


v4-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Zhihong,

> +   /* attribute is NOT to be copied from input */
>
> I think saying `is NOT copied from input` should suffice.
>
> +   /* fieldno is 0-index and attnum is 1-index */
>
> 0-index -> 0-indexed

I have applied both suggestions, thanks! I'll submit a 4th version
of the patch soon.

> +   defaults = (bool *) palloc0(num_phys_attrs * sizeof(bool));
> +   MemSet(defaults, false, num_phys_attrs * sizeof(bool));
>
> Is the MemSet() call necessary ?

I would say it is, so it initializes the array with all flags set to false.
Later, if it detects attributes that should evaluate their default
expression,
it would set the flag to true.

Am I missing something?

Regards,
Israel.

>


Re: Add support for DEFAULT specification in COPY FROM

2022-09-26 Thread Israel Barth Rubio
Hello Andrew,

> . There needs to be a check that this is being used with COPY FROM, and
> the restriction needs to be stated in the docs and tested for. c.f.
> FORCE NULL.
>
> . There needs to be support for this in psql's tab_complete.c, and
> appropriate tests added
>
> . There needs to be support for it in contrib/file_fdw/file_fdw.c, and a
> test added
>
> . The tests should include psql's \copy as well as sql COPY
>
> . I'm not sure we need a separate regression test file for this.
> Probably these tests can go at the end of src/test/regress/sql/copy2.sql.

Thanks for your review! I have applied the suggested changes, and I'm
submitting the new patch version.

Kind regards,
Israel.


v3-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-08-18 Thread Israel Barth Rubio
Hello Ilmari,

Thanks for checking it, too. I can study to implement these changes
to include a way of overriding the behavior for the given columns.

Regards,
Israel.

Em qui., 18 de ago. de 2022 às 06:56, Dagfinn Ilmari Mannsåker <
ilm...@ilmari.org> escreveu:

> Andrew Dunstan  writes:
>
> > On 2022-08-16 Tu 14:12, Israel Barth Rubio wrote:
> >> Hello all,
> >>
> >> With the current implementation of COPY FROM in PostgreSQL we are
> >> able to load the DEFAULT value/expression of a column if the column
> >> is absent in the list of specified columns. We are not able to
> >> explicitly ask that PostgreSQL uses the DEFAULT value/expression in a
> >> column that is being fetched from the input file, though.
> >>
> >> This patch adds support for handling DEFAULT values in COPY FROM. It
> >> works similarly to NULL in COPY FROM: whenever the marker that was
> >> set for DEFAULT value/expression is read from the input stream, it
> >> will evaluate the DEFAULT value/expression of the corresponding
> >> column.
> […]
> > Interesting, and probably useful. I've only had a brief look, but it's
> > important that the default marker not be quoted in CSV mode (c.f. NULL)
> > -f it is it should be taken as a literal rather than a special value.
>
> For the NULL marker that can be overridden for individual columns with
> the FORCE(_NOT)_NULL option. This feature should have a similar
> FORCE(_NOT)_DEFAULT option to allow the DEFAULT marker to be ignored, or
> recognised even when quoted, respectively.
>
> - ilmari
>


Re: Add support for DEFAULT specification in COPY FROM

2022-08-18 Thread Israel Barth Rubio
Hello,

Thanks for your review. I submitted the patch to the next commit fest
(https://commitfest.postgresql.org/39/3822/).

Regards,
Israel.

Em qua., 17 de ago. de 2022 às 18:56, Andrew Dunstan 
escreveu:

>
> On 2022-08-17 We 17:12, Israel Barth Rubio wrote:
> >
> >
> > Does that address your concerns?
> >
> > I am attaching the new patch, containing the above test in the regress
> > suite.
>
>
> Thanks, yes, that all looks sane.
>
>
> Please add this to the next CommitFest if you haven't already done so.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


Re: Add support for DEFAULT specification in COPY FROM

2022-08-17 Thread Israel Barth Rubio
Hello Andrew,

Thanks for reviewing this patch.

It is worth noting that DEFAULT will only take place if explicitly
specified, meaning there is
no default value for the option DEFAULT. The usage of \D in the tests was
only a suggestion.
Also, NULL marker will be an unquoted empty string by default in CSV mode.

In any case I have manually tested it and the behavior is compliant to what
we see in NULL
if it is defined to use \N both in text and CSV modes.

- NULL as \N:

postgres=# CREATE TEMP TABLE copy_null (id integer primary key, value text);
CREATE TABLE
postgres=# copy copy_null from stdin with (format text, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \N
>> 2 \\N
>> 3 "\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
 id | value
+---
  1 |
  2 | \N
  3 | "N"
(3 rows)

postgres=# TRUNCATE copy_null ;
TRUNCATE TABLE
postgres=# copy copy_null from stdin with (format csv, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\N
>> 2,\\N
>> 3,"\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
 id | value
+---
  1 |
  2 | \\N
  3 | \N
(3 rows)

- DEFAULT as \D:

postgres=# CREATE TEMP TABLE copy_default (id integer primary key, value
text default 'test');
CREATE TABLE
postgres=# copy copy_default from stdin with (format text, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \D
>> 2 \\D
>> 3 "\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
 id | value
+---
  1 | test
  2 | \D
  3 | "D"
(3 rows)

postgres=# TRUNCATE copy_default ;
TRUNCATE TABLE
postgres=# copy copy_default from stdin with (format csv, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
 id | value
+---
  1 | test
  2 | \\D
  3 | \D
(3 rows)

If you do not specify DEFAULT in COPY FROM, it will have no default value
for
that option. So, if you try to load \D in CSV mode, then it will load the
literal value:

postgres=# CREATE TEMP TABLE copy (id integer primary key, value text
default 'test');
CREATE TABLE
postgres=# copy copy from stdin with (format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy ;
 id | value
+---
  1 | \D
  2 | \\D
  3 | \D
(3 rows)


Does that address your concerns?

I am attaching the new patch, containing the above test in the regress
suite.

Best regards,
Israel.

Em ter., 16 de ago. de 2022 às 17:27, Andrew Dunstan 
escreveu:

>
> On 2022-08-16 Tu 14:12, Israel Barth Rubio wrote:
> > Hello all,
> >
> > With the current implementation of COPY FROM in PostgreSQL we are able to
> > load the DEFAULT value/expression of a column if the column is absent
> > in the
> > list of specified columns. We are not able to explicitly ask that
> > PostgreSQL uses
> > the DEFAULT value/expression in a column that is being fetched from
> > the input
> > file, though.
> >
> > This patch adds support for handling DEFAULT values in COPY FROM. It
> > works
> > similarly to NULL in COPY FROM: whenever the marker that was set for
> > DEFAULT
> > value/expression is read from the input stream, it will evaluate the
> > DEFAULT
> > value/expression of the corresponding column.
> >
> > I'm currently working as a support engineer, and both me and some
> > customers had
> > already faced a situation where we missed an implementation like this
> > in COPY
> > FROM, and had to work around that by using an input file where the
> > column which
> > has a DEFAULT value/expression was removed.
> >
> > That does not solve all issues though, as it might be the case that we
> > just want a
> > DEFAULT value to take place if no other value was set for the column
> > in the input
> > file, meaning we would like to have a column in the input file that
> > sometimes assume
> > the DEFAULT value/expression, and sometimes assume an actual given value.
> >
> > The implementation was performed about one month ago and included all
> > regression
> > tests regarding the changes that were introduced. It was just rebased
> > on top of the
> > master branch before submitting this patch, and all tests are still
> > succeeding.
> >
> 

Add support for DEFAULT specification in COPY FROM

2022-08-16 Thread Israel Barth Rubio
Hello all,

With the current implementation of COPY FROM in PostgreSQL we are able to
load the DEFAULT value/expression of a column if the column is absent in the
list of specified columns. We are not able to explicitly ask that
PostgreSQL uses
the DEFAULT value/expression in a column that is being fetched from the
input
file, though.

This patch adds support for handling DEFAULT values in COPY FROM. It works
similarly to NULL in COPY FROM: whenever the marker that was set for DEFAULT
value/expression is read from the input stream, it will evaluate the DEFAULT
value/expression of the corresponding column.

I'm currently working as a support engineer, and both me and some customers
had
already faced a situation where we missed an implementation like this in
COPY
FROM, and had to work around that by using an input file where the column
which
has a DEFAULT value/expression was removed.

That does not solve all issues though, as it might be the case that we just
want a
DEFAULT value to take place if no other value was set for the column in the
input
file, meaning we would like to have a column in the input file that
sometimes assume
the DEFAULT value/expression, and sometimes assume an actual given value.

The implementation was performed about one month ago and included all
regression
tests regarding the changes that were introduced. It was just rebased on
top of the
master branch before submitting this patch, and all tests are still
succeeding.

The implementation takes advantage of the logic that was already
implemented to
handle DEFAULT values for missing columns in COPY FROM. I just modified it
to
make it available the DEFAULT values/expressions for all columns instead of
only
for the ones that were missing in the specification. I had to change the
variables
accordingly, so it would index the correct positions in the new array of
DEFAULT
values/expressions.

Besides that, I also copied and pasted most of the checks that are
performed for the
NULL feature of COPY FROM, as the DEFAULT behaves somehow similarly.

Best regards,
Israel.


v1-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data