Re: Error with Insert from View with ON Conflict

2021-11-04 Thread Alex Magnum
Steve, Peter,
thanks.

The query below works. The view is used to generate the stats and does not use
the table http_ping_uptime_stats. So my interpretation was that I needed to
alias the view so that the UPDATE would know where the value was coming from,
as both have the same name.
That obviously did not work. I also tried giving the fields returned in the
view different names, e.g. checks2, uptime2, etc., so that there wouldn't be a
conflict, but SET checks = V.checks2 or checks = checks2 also did not work.

All works now as intended. Thanks for the hint!

Alex

INSERT INTO http_ping_uptime_stats
SELECT * FROM view_http_ping_uptime_stats AS V
WHERE month = date_trunc('month', now())
ON CONFLICT (url, ip, month) DO UPDATE
  SET last_update = excluded.last_update,
      checks = excluded.checks,
      uptime = excluded.uptime,
      errors = excluded.errors;

On Thu, Nov 4, 2021 at 8:54 AM Peter Geoghegan  wrote:

> On Wed, Nov 3, 2021 at 2:18 PM Steve Baldwin 
> wrote:
> > I'm pretty sure the 'alias' for the '.. on conflict do update ..' needs
> to be 'excluded' (i.e. checks = excluded.checks, ...). Check the docs.
>
> That's right. The excluded.* pseudo-table isn't exactly the same
> thing as the target table -- it is a tuple that has the same "shape",
> representing what the implementation tried (and failed) to insert
> into the table. I have to imagine that Alex wants to reference that,
> because that's the standard idiomatic approach with ON CONFLICT. And
> because the only alternative interpretation is that Alex intends to
> update those columns using their current values (not new values),
> which won't really change anything -- that seems unlikely to have been
> the intent.
>
> --
> Peter Geoghegan
>


Streaming replication versus Logical replication

2021-11-04 Thread Alanoly Andrews
We are currently running some long-running SELECT queries on the standby
database in a streaming replication pair. Some of these queries can run for an
hour or more. To avoid errors related to data no longer being available on the
standby due to vacuuming of old data on the primary database, we have set the
following parameters: max_standby_archive_delay = -1,
max_standby_streaming_delay = -1, hot_standby_feedback = on. With these set,
the long queries are able to run to completion, but there is table bloat on
both the primary and the replicated databases, leading to throughput delays on
the primary production database.
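
(For reference, and not part of the original message: a minimal sketch of
applying these standby-side settings with ALTER SYSTEM; the parameter names are
the ones above, and editing postgresql.conf directly works just as well.)

-- Run on the standby; all three parameters take effect on a reload.
ALTER SYSTEM SET max_standby_archive_delay = -1;
ALTER SYSTEM SET max_standby_streaming_delay = -1;
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();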

Will this issue exist if we use "logical replication" instead? With the above
three parameters set back to their defaults, will the replicated database get
overwritten when vacuuming runs on the primary and removes old data? If it does
not, will there be table bloat on the primary database? What is the mechanism
by which data changes on the "publisher" are propagated to the "subscriber"?
What happens when the subscriber database has a long-running query?

Thanks.

Alanoly Andrews.






Re: to_date() and to_timestamp() with negative years

2021-11-04 Thread Bryn Llewellyn
> adrian.kla...@aklaver.com wrote:
> 
> Not sure how this can be handled in a totally predictable way given the 
> unpredictable ways in which datetime strings are formatted?
> 
> The only thing I can say is that it points out that when working with
> datetimes, settling on a standard format is your best defense against
> unpredictable results.

Thank you very much, again, for your help with my seemingly endless nitpicking 
questions on this matter, Adrian. Here's the most favorable conclusion that I 
can draw:

1. A precedent has been set by The SQL Standard folks together with the 
PostgreSQL implementation and other implementations like Oracle Database. All 
this stretches back a long time—to more than four decades ago.

2. This has brought us in PG to the rules that "Table 9.26. Template Patterns 
for Date/Time Formatting", "Table 9.27. Template Pattern Modifiers for 
Date/Time Formatting", and "Usage notes for date/time formatting" set out and 
interpret.

3. The rules are hard to understand and the PG doc gives insufficient detail to 
allow the outcomes in corner cases like you just showed us to be predicted 
confidently. Some users seek to understand the rules by reading PG's source 
code.

4. Certainly, the rules set a significant parsing challenge. You hint that they
might even prevent a complete practical solution from being implemented.

5. None of this matters when the app designer has the freedom to define how
date-time values will be provided, as text values, by user interfaces or
external systems. In these cases, the complexity can be controlled by edict and
correct solutions can be painlessly implemented and tested. Not a day goes by
that I don't have to enter a date value at a UI. And in every case, a UI gadget
constrains my input and makes its individual fields available to the programmer
without the need for parsing—so there's reason to be optimistic. The programmer
can easily build the text representation of the date-time value to conform to
the simple rules that the overall application design specified (see the sketch
after point 6 below).

6. In some cases text data that's to be converted arrives in a format that 
cannot be influenced. And it might contain locutions like we've been discussing 
("zero" meaning "1 BC", unfavorable paradigms for separating tokens, and the 
like). In these cases, the diligent programmer might, just, be able to use the 
full arsenal of available tools to implement a scheme that faultlessly parses 
the input. But the likelihood of bugs seems to be pretty big.
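
(A sketch, not from the thread, of the kind of fixed format mentioned in point
5, applied with an explicit template; the format strings and sample values are
illustrative assumptions.)

SELECT to_timestamp('2021-11-04 18:30:00', 'YYYY-MM-DD HH24:MI:SS');
SELECT to_date('2021-11-04', 'YYYY-MM-DD');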

I'll say "case closed, now" — from my side, at least.

Re: Streaming replication versus Logical replication

2021-11-04 Thread Ninad Shah
Yes, it is going to resolve the issue, because a streaming replica is
completely a slave of the primary (with a few exceptions). Even the VACUUM
operation gets replicated from the master, which is not the case with logical
replication. In logical replication, only data from selected tables gets
replicated. In terms of database administration, they are different entities.

If the subscriber has long-running queries, then unlike in streaming
replication, they do not affect synchronisation operations.
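
(To illustrate, and not from the original reply: a minimal sketch of setting up
logical replication for a chosen set of tables. The table, connection, and
publication/subscription names below are illustrative assumptions.)

-- On the publisher:
CREATE PUBLICATION my_pub FOR TABLE my_table;

-- On the subscriber:
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=primary.example.com dbname=mydb user=repuser'
    PUBLICATION my_pub;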


Regards,
Ninad Shah


On Thu, 4 Nov 2021 at 21:16, Alanoly Andrews  wrote:

> We are currently running some long-running SELECT queries on the
> replication database in a streaming replication pair. Some of these queries
> can run for 1 hour or more. To avoid errors related to "data no more being
> available" on the replication due to vacuuming of old data on the primary
> database, we have set the following parameters: max_standby_archive_delay
> = -1, max_standby_streaming_delay = -1, hot_standby_feedback = on. With
> these set, the long queries are able to run to completion, but there is
> table bloat on both the primary and the replicated databases, leading to
> throughput delay on the primary production database.
>
> Will this issue exist if we use "logical replication" instead? With the
> above three parameters set back to normal, will the replicated database get
> overwritten when vacuuming runs on the primary and removes old data? If it
> does not, will there be table bloat on the primary database? What is the
> mechanism by which data changes on the "publisher" are propagated to the
> "subscriber"? What happens when the subscriber database has an long-running
> query?
>
> Thanks.
>
> Alanoly Andrews.
>
>
>


Re: ZFS filesystem - supported ?

2021-11-04 Thread Benedict Holland
Right... which is why you do pg_basebackup infrequently. It also captures
WALs while conducting the backup. The trade-off is that if you have a huge
number of WAL files, then running a restore can be quite time-consuming.
There isn't really a clear win here, though. You either take a long-running
backup that will restore everything to that exact moment in time, or you
accept a restore that could take an extremely long time in the case where
you just keep WALs and do a base backup monthly. The worst-case scenario is
that you need to restore to an hour before a scheduled base backup, so you
get to replay every WAL file up to that point. Snapshots are fine. We also
just pg_dump and tar+bz2 the result. You can even do that on a remote
machine. It is probably a crude solution, but it works.

But again, the OP should probably be paying for a support service and not
relying on a message board for something like this. I guarantee that people
who run Postgres in production know what to do for this case and have a
selection of best practices, because they have to; alternatively, you hire a
professional DBA to step in and tell you what they have found works best.

I absolutely loved the ZFS response, because in the benchmark comparing ZFS
and ext4 the "clear winner" performed nearly the same as the alternative. Now
you get to deal with a non-default file system for a gain that should not be
noticeable unless you are really trying to use every single last cycle on
your computer. If you are, I highly recommend getting the paid support and
hiring a crack team of DBAs who can install your servers using complex
distributions.

Thanks,
~Ben

On Mon, Nov 1, 2021 at 10:43 PM Mladen Gogala 
wrote:

>
> On 11/1/21 17:58, Stephen Frost wrote:
> > Well, at least one alternative to performing these snapshots would be to
> > use a tool like pg_basebackup or pgbackrest to perform the backups
> > instead.
>
> File system based backups are much slower than snapshots. The
> feasibility of a file-based utility like pg_basebackup depends on the size
> of the database and the quality of the infrastructure. However, if
> opting for snapshot-based solutions, I would advise something like Pure
> or NetApp, which use special hardware to accelerate the process and have
> tools to back up snapshots, like SnapVault (NetApp).
>
> Also, when using file-level utilities, I would go full commercial.
> Commercial backup utilities are usually optimized for speed, support
> deduplication, and maintain a backup catalog, which can come in handy if
> there are regulatory requirements about preserving your backups (HIPAA, SOX).
>
>
> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
> https://dbwhisperer.wordpress.com
>
>
>
>


RE: Streaming replication versus Logical replication

2021-11-04 Thread Alanoly Andrews
Thanks, Ninad, for the response.
So, am I to understand that when there is a long-running query on the 
subscriber, the vacuumed data updates from the publisher are held over on the 
subscriber until the query completes? If so, where and how are they held over, 
and what does it mean in terms of disk space (either on the publisher or on the 
subscriber)?

Regards.
Alanoly Andrews.

From: Ninad Shah [mailto:nshah.postg...@gmail.com]
Sent: Thursday, November 4, 2021 2:20 PM
To: Alanoly Andrews 
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Streaming replication versus Logical replication


Yes, it is going to resolve the issue, because a streaming replica is completely
a slave of the primary (with a few exceptions). Even the VACUUM operation gets
replicated from the master, which is not the case with logical replication. In
logical replication, only data from selected tables gets replicated. In terms of
database administration, they are different entities.

If the subscriber has long-running queries, then unlike in streaming
replication, they do not affect synchronisation operations.


Regards,
Ninad Shah


On Thu, 4 Nov 2021 at 21:16, Alanoly Andrews  wrote:
We are currently running some long-running SELECT queries on the replication 
database in a streaming replication pair. Some of these queries can run for 1 
hour or more. To avoid errors related to "data no more being available" on the 
replication due to vacuuming of old data on the primary database, we have set 
the following parameters: max_standby_archive_delay = -1, 
max_standby_streaming_delay = -1, hot_standby_feedback = on. With these set, 
the long queries are able to run to completion, but there is table bloat on 
both the primary and the replicated databases, leading to throughput delay on 
the primary production database.

Will this issue exist if we use "logical replication" instead? With the above 
three parameters set back to normal, will the replicated database get 
overwritten when vacuuming runs on the primary and removes old data? If it does 
not, will there be table bloat on the primary database? What is the mechanism 
by which data changes on the "publisher" are propagated to the "subscriber"? 
What happens when the subscriber database has a long-running query?

Thanks.

Alanoly Andrews.






Re: Streaming replication versus Logical replication

2021-11-04 Thread Christophe Pettus



> On Nov 4, 2021, at 12:16, Alanoly Andrews  wrote:
> 
> Thanks, Ninad, for the response.
> So, am I to understand that when there is a long-running query on the 
> subscriber, the vacuumed data updates from the publisher are held over on the 
> subscriber until the query completes? If so, where and how are they held 
> over, and what does it mean in terms of disk space (either on the publisher 
> or on the subscriber)?

That's not quite correct.

Vacuum operations happen independently on the publisher and subscriber in 
logical replication.  Unlike binary replication, changes caused by vacuuming on 
the publisher are not sent over to the subscriber.
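
(A related sketch, not from the original reply: on the publisher, WAL is
retained for the subscription's replication slot until the subscriber confirms
it has received the changes, and the amount currently retained per slot can be
checked with a query like this.)

SELECT slot_name,
       slot_type,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
FROM pg_replication_slots;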



Re: to_date() and to_timestamp() with negative years

2021-11-04 Thread Adrian Klaver

On 11/4/21 10:29 AM, Bryn Llewellyn wrote:

adrian.kla...@aklaver.com wrote:

Not sure how this can be handled in a totally predictable way given 
the unpredictable ways in which datetime strings are formatted?


The only thing I can say is that it points out that when working with
datetimes, settling on a standard format is your best defense against
unpredictable results.


Thank you very much, again, for your help with my seemingly endless 
nitpicking questions on this matter, Adrian. Here's the most favorable 
conclusion that I can draw:




3. The rules are hard to understand and the PG doc gives insufficient 
detail to allow the outcomes in corner cases like you just showed us to 
be predicted confidently. Some users seek to understand the rules by 
reading PG's source code.


I would say that is because datetimes in string formats are often hard 
to understand as a result of folks inventing their own formats.


As an example a commit message of mine from some years ago:

"
Fix issue with date parsing of Javascript dates coming from browser
on Windows. This occurred in both Firefox and Chrome. The issue being
that the date had a timezone of Pacific Standard Time instead of PST 
like it does from a Linux machine. dateutils choked on that timezone and 
therefore the date was not parsed. The fix was to add fuzzy=True to the 
dateutils.parse(). This allows dateutils to skip over anything it does 
not understand and parse the rest. This works as the date does include 
the correct tz offset. For the record the date format of concern is-Tue 
Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)

"

Can't remember what version of Windows this was. The dates ended up in a
Postgres database via Python code in a Django app. This fix is one of the
reasons I really like the Python dateutil library. The solution:


from dateutil.parser import parse

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', 
fuzzy=True)

datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800))

To see what it is doing:

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', 
fuzzy_with_tokens=True)


(datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800)),
 (' ', ' ', ' ', ' (Pacific Standard Time)'))

where everything after the datetime are tokens it ignored.

I include this mainly as an illustration that data cleanup may be more
practical before it ever hits the database, and in a library that is
specialized for the task at hand.




4. Certainly, the rules set a significant parsing challenge. You hint
that they might even prevent a complete practical solution from being
implemented.


Yes, that would depend on someone coding an AI that can fully understand
people and what their intentions were from a string value. Given the
answers I get when asking people directly what they intended, I'm not
holding my breath.




5. None of this matters when the app designer has the freedom to define 
how date-time values will be provided, as text values, by user 
interfaces or external systems. In these cases, the complexity can be 
controlled by edict and correct solutions can be painlessly implemented 
and tested. Not a day goes by that I don't have to enter a date value at 
a UI. And in every case, a UI gadget constrains my input and makes its 
individual fields available to the programmer without the need for 
parsing—so there's reason to be optimistic. The programmer can easily 
build the text representation of the date-time value to conform to the 
simple rules that the overall application design specified.


Yep, constraining the imagination of the end user solves a lot of problems.



6. In some cases text data that's to be converted arrives in a format 
that cannot be influenced. And it might contain locutions like we've 
been discussing ("zero" meaning "1 BC", unfavorable paradigms for 
separating tokens, and the like). In these cases, the diligent 
programmer might, just, be able to use the full arsenal of available 
tools to implement a scheme that faultlessly parses the input. But the 
likelihood of bugs seems to be pretty big.


From my experience that pretty much defines all aspects of programming.


I'll say "case closed, now" — from my side, at least.



--
Adrian Klaver
adrian.kla...@aklaver.com




A self-contained example "expanded" C data type

2021-11-04 Thread Michel Pelletier
The docs for expanded data types are good, but for a working example you
have to go trolling through the array data type source code, which is
enlightening but a pretty heavy lift for me especially if I have to come
back to it after some time.

So I decided to distill what I could glean from the source into a simple,
self-contained, completely useless extension called pgexpanded:

https://github.com/michelp/pgexpanded

I'd like to ask: 1) does this seem like a useful project to other people? If
there is interest in me taking it further, I'll expand it with some more
documentation and pgtap tests. And if you think "yes" to 1, then 2) would
those who know a lot more about this subject than I do like to review the
code? I'll put on my finest asbestos.

Thanks,

-Michel


Re: A self-contained example "expanded" C data type

2021-11-04 Thread Pavel Stehule
Hi

On Fri, 5 Nov 2021 at 4:47, Michel Pelletier  wrote:

> The docs for expanded data types are good, but for a working example you
> have to go trolling through the array data type source code, which is
> enlightening but a pretty heavy lift for me especially if I have to come
> back to it after some time.
>
> So I decided to distill what I could glean from the source into a simple,
> self-contained, completely useless extension called pgexpanded:
>

I think this can be part of contrib.

Regards

Pavel


> https://github.com/michelp/pgexpanded
>
> I'd like to ask 1) does this seem like a useful project to other people?
> If there is interest in me taking it further I'll expand it with some more
> documentation and pgtap tests.  and if you think "yes" to 1, then 2) would
> those who know a lot more on this subject than I do would like to review
> the code? I'll put on my finest asbestos.
>
> Thanks,
>
> -Michel
>