Re: Execution plan does not use index

2020-11-09 Thread Pavel Stehule
On Tue, Nov 10, 2020 at 8:18 AM Peter Coppens wrote:

> Michael
>
> Many thanks for spending your time on this. Your alternative does not help
> unfortunately (see execution plan)
>
> Still a sequential scan on the complete table. I have tried many
> alternatives and somehow whenever I add a column that is not in the index
> (I64_01) the optimizer decides not to use the index. If I remove that
> column, the index is used. I guess it estimates that the extra indirection
> from index pages to the row pages is more costly than scanning the 168M
> records. Pretty sure it’s not, but I cannot explain it to the stubborn
> thing :)
>
> Btw, thanks for the >= tip (I was aware of it)
>
> Wkr,
>
> Peter
>
>
> Hash Join  (cost=683.93..7270857.46 rows=458127 width=20)
>   Hash Cond: (mv_inner.device_id = d.short_id)
>   Join Filter: ((mv_inner."timestamp" > ('2020-11-06 00:00:00'::timestamp
> without time zone - pg_timezone_names.utc_offset)) AND
> (mv_inner."timestamp" < ('2020-11-07 00:00:00'::timestamp without time zone
> - pg_timezone_names.utc_offset)))
>   ->  Seq Scan on measurement_value mv_inner  (cost=0.00..7166797.33
> rows=1287989 width=1006)
> Filter: (("timestamp" > '2020-11-05 00:00:00'::timestamp without
> time zone) AND ("timestamp" < '2020-11-08 00:00:00'::timestamp without time
> zone))
>

When you see a cast in a filter, you should check type equality in the
constraints. With some exceptions, Postgres uses indexes only when the
filtered value has the same type as the column type.

Maybe there is an inconsistency between timestamp with time zone and
timestamp without time zone.
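
A classic illustration of the type-mismatch effect, as a hypothetical
sketch (not the poster's schema):

create table t (id integer primary key);
-- the numeric literal forces the integer column to be cast to numeric,
-- so (given enough rows) the btree index on id cannot be used:
explain select * from t where id = 1.0;  -- Filter: ((id)::numeric = 1.0)
-- with a matching type the index qualifies:
explain select * from t where id = 1;    -- Index Cond: (id = 1)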

Regards

Pavel


>   ->  Hash  (cost=656.61..656.61 rows=2186 width=20)
> ->  Hash Join  (cost=77.87..656.61 rows=2186 width=20)
>   Hash Cond: (dc.timezone = pg_timezone_names.name)
>   ->  Hash Join  (cost=55.37..533.83 rows=615 width=18)
> Hash Cond: (dc.device_id = d.id)
> ->  Seq Scan on device_configuration dc
>  (cost=0.00..470.01 rows=615 width=30)
>   Filter: latest
> ->  Hash  (cost=46.83..46.83 rows=683 width=20)
>   ->  Seq Scan on device d  (cost=0.00..46.83
> rows=683 width=20)
>   ->  Hash  (cost=10.00..10.00 rows=1000 width=48)
> ->  Function Scan on pg_timezone_names
>  (cost=0.00..10.00 rows=1000 width=48)
>
>
>
> On 10 Nov 2020, at 01:15, Michael Lewis  wrote:
>
> On Mon, Nov 9, 2020 at 1:11 PM Peter Coppens 
> wrote:
>
>> Adding the tzn.utc_offset results in the fact that the execution plan no
>> longer considers to use the index on the measurement_value table. Is there
>> any way the SQL can be rewritten so that the index is used? Or any other
>> solution so that the query with the timezone offset returns in a comparable
>> time?
>>
>
> I am not aware of a best practice to handle this. Your where condition on
> mv.timestamp now depends on several joins to do a filtering that used to be
> a static range that can be scanned into the index as a first node in the
> plan. I have sometimes used a sub-query on a broader condition that allows
> the use of the index, and then fully reducing the set later. Something like
> this-
>
> select d.short_id,mv.timestamp,mv.I64_01
>   from device d, device_configuration dc, (
> select mv_inner.*
>   from measurement_value AS mv_inner
>   where mv_inner.timestamp > '2020-11-06'::timestamp - interval '1 day' and
> mv_inner.timestamp < '2020-11-07'::timestamp + interval '1 day'
> offset 0 /* to prevent in-lining the join to the outside set */
> ) mv, pg_timezone_names tzn
>   where mv.device_id=d.short_id and dc.device_id = d.id and
> dc.latest=true and dc.timezone=tzn.name and
> mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and
> mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset
>
> By the way, it seems a little odd to be exclusive on both the begin and
> end. I'd usually expect timestamp >= start_date and timestamp <
> end_date + interval '1 day' to fully capture a 24 hour period. Right now,
> you are excluding any data that happens to have a timestamp value with
> .00 seconds (midnight exactly).
>
>
>


Re: Execution plan does not use index

2020-11-09 Thread Peter Coppens
Michael

Many thanks for spending your time on this. Your alternative does not help 
unfortunately (see execution plan)

Still a sequential scan on the complete table. I have tried many alternatives 
and somehow whenever I add a column that is not in the index (I64_01) the 
optimizer decides not to use the index. If I remove that column, the index is 
used. I guess it estimates that the extra indirection from index pages to the 
row pages is more costly than scanning the 168M records. Pretty sure it’s not, 
but I cannot explain it to the stubborn thing :)

Btw, thanks for the >= tip (I was aware of it)

Wkr,

Peter


Hash Join  (cost=683.93..7270857.46 rows=458127 width=20)
  Hash Cond: (mv_inner.device_id = d.short_id)
  Join Filter: ((mv_inner."timestamp" > ('2020-11-06 00:00:00'::timestamp 
without time zone - pg_timezone_names.utc_offset)) AND (mv_inner."timestamp" < 
('2020-11-07 00:00:00'::timestamp without time zone - 
pg_timezone_names.utc_offset)))
  ->  Seq Scan on measurement_value mv_inner  (cost=0.00..7166797.33 
rows=1287989 width=1006)
Filter: (("timestamp" > '2020-11-05 00:00:00'::timestamp without time 
zone) AND ("timestamp" < '2020-11-08 00:00:00'::timestamp without time zone))
  ->  Hash  (cost=656.61..656.61 rows=2186 width=20)
->  Hash Join  (cost=77.87..656.61 rows=2186 width=20)
  Hash Cond: (dc.timezone = pg_timezone_names.name)
  ->  Hash Join  (cost=55.37..533.83 rows=615 width=18)
Hash Cond: (dc.device_id = d.id)
->  Seq Scan on device_configuration dc  (cost=0.00..470.01 
rows=615 width=30)
  Filter: latest
->  Hash  (cost=46.83..46.83 rows=683 width=20)
  ->  Seq Scan on device d  (cost=0.00..46.83 rows=683 
width=20)
  ->  Hash  (cost=10.00..10.00 rows=1000 width=48)
->  Function Scan on pg_timezone_names  (cost=0.00..10.00 
rows=1000 width=48)



> On 10 Nov 2020, at 01:15, Michael Lewis  wrote:
> 
> On Mon, Nov 9, 2020 at 1:11 PM Peter Coppens wrote:
> Adding the tzn.utc_offset results in the fact that the execution plan no 
> longer considers to use the index on the measurement_value table. Is there 
> any way the SQL can be rewritten so that the index is used? Or any other 
> solution so that the query with the timezone offset returns in a comparable 
> time?
> 
> I am not aware of a best practice to handle this. Your where condition on 
> mv.timestamp now depends on several joins to do a filtering that used to be a 
> static range that can be scanned into the index as a first node in the plan. 
> I have sometimes used a sub-query on a broader condition that allows the use 
> of the index, and then fully reducing the set later. Something like this-
> 
> select d.short_id,mv.timestamp,mv.I64_01
>   from device d, device_configuration dc, (
> select mv_inner.*
>   from measurement_value AS mv_inner
>   where mv_inner.timestamp > '2020-11-06'::timestamp - interval '1 day' and
> mv_inner.timestamp < '2020-11-07'::timestamp + interval '1 day'
> offset 0 /* to prevent in-lining the join to the outside set */
> ) mv, pg_timezone_names tzn
>   where mv.device_id=d.short_id and dc.device_id = d.id and
> dc.latest=true and dc.timezone=tzn.name and
> mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and 
> mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset
> 
> By the way, it seems a little odd to be exclusive on both the begin and end. 
> I'd usually expect timestamp >= start_date and timestamp < end_date + 
> interval '1 day' to fully capture a 24 hour period. Right now, you are 
> excluding any data that happens to have a timestamp value with .00 
> seconds (midnight exactly).



Re: How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread raf
On Tue, Nov 10, 2020 at 10:51:02AM +0530, mark armon <1994hej...@gmail.com> 
wrote:

> On Mon, Nov 9, 2020 at 9:44 PM David G. Johnston 
> wrote:
> 
> > On Mon, Nov 9, 2020 at 9:08 AM mark armon <1994hej...@gmail.com> wrote:
> >
> >> How to set up a schema default date (now) to '2020-01-01'?  Whatever
> >> timezone would be OK.
> >
> > What is a "schema default" (date or otherwise)?
>
> > David J.
> >
> like I create a schema: test, I want the default date to 2020-01-01, so
> when I do
> 
> select test.now;
> 
> the result is 2020-01-01

I would recommend that you create a stored
function/procedure that returns the "default" that you
want it to return, and execute a call to that, rather
than executing "select test.now". The statement "select
test.now" doesn't even seem to be valid sql. Bear in
mind that I have no idea what you are talking about so
my advice might not be helpful.

I have timestamped database backups and timestamped
schema update files so that my database load script
knows which schema updates to apply when loading an old
database backup (i.e. anything whose timestamp is later
than the timestamp of the backup), which seems like it
might be related to what you are after, but I don't
understand the idea of a "default" date. The "date" for
my schema is always the present so as to match the
corresponding software in its current state.

Perhaps you can explain in more detail what you are
after.

cheers,
raf





Re: How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 10:21 PM mark armon <1994hej...@gmail.com> wrote:

> like I create a schema: test, I want the default date to 2020-01-01, so
> when I do
>
> select test.now;
>
> the result is 2020-01-01
>
That is not presently a feature that PostgreSQL implements.

While you could probably leverage a custom configuration variable, it might
be less subtle to simply store the desired reference in a SQL function and
then call the function.
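
For example, a minimal sketch of such a function (the schema and function
name are illustrative):

create schema test;
create function test.now() returns date
  language sql immutable
  as $$ select date '2020-01-01' $$;

select test.now();  -- returns 2020-01-01

Note the parentheses: plain "select test.now" (without them) is not valid,
so callers have to use the function-call form.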

David J.


Re: How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread mark armon
like I create a schema: test, I want the default date to 2020-01-01, so
when I do

select test.now;

the result is 2020-01-01


On Mon, Nov 9, 2020 at 9:44 PM David G. Johnston 
wrote:

> On Mon, Nov 9, 2020 at 9:08 AM mark armon <1994hej...@gmail.com> wrote:
>
>>
>> How to set up a schema default date (now) to '2020-01-01'?  Whatever
>> timezone would be OK.
>>
>
> What is a "schema default" (date or otherwise)?
>
> David J.
>


Re: initdb --data-checksums

2020-11-09 Thread Michael Paquier
On Mon, Nov 09, 2020 at 06:03:43PM +0100, Paul Förster wrote:
> indeed, it is. Have a look at:
> 
> https://www.postgresql.org/docs/12/app-pgchecksums.html
> 
> Make sure the database is cleanly shut down before doing it.

This tool is really useful for enabling checksums after an upgrade with
pg_upgrade.  Please note that there is a --progress option, so you can
basically know how long it is going to take until completion.
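
For example (the cluster must be cleanly stopped first; the data directory
path is illustrative):

    $ pg_checksums --enable --progress -D /var/lib/postgresql/12/main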
--
Michael




Re: New "function tables" in V13 documentation

2020-11-09 Thread Merlin Moncure
On Sun, Nov 8, 2020 at 3:57 PM Thomas Kellerer  wrote:
>
> In case someone is interested: there is a little discussion going on on 
> Reddit whether the new format of presenting functions in V13 is a step 
> backwards:
>
>
> https://www.reddit.com/r/PostgreSQL/comments/jpi0rp/does_anyone_else_feel_like_the_v13_docs_are_a/

It's more than a little ironic that reddit's "old" format (still
visible via old.reddit.com) is objectively clearer and better along
exactly the same lines.

merlin




Re: Execution plan does not use index

2020-11-09 Thread Michael Lewis
On Mon, Nov 9, 2020 at 1:11 PM Peter Coppens 
wrote:

> Adding the tzn.utc_offset results in the fact that the execution plan no
> longer considers to use the index on the measurement_value table. Is there
> any way the SQL can be rewritten so that the index is used? Or any other
> solution so that the query with the timezone offset returns in a comparable
> time?
>

I am not aware of a best practice to handle this. Your where condition on
mv.timestamp now depends on several joins to do a filtering that used to be
a static range that can be scanned into the index as a first node in the
plan. I have sometimes used a sub-query on a broader condition that allows
the use of the index, and then fully reducing the set later. Something like
this-

select d.short_id,mv.timestamp,mv.I64_01
  from device d, device_configuration dc, (
select mv_inner.*
  from measurement_value AS mv_inner
  where mv_inner.timestamp > '2020-11-06'::timestamp - interval '1 day' and
mv_inner.timestamp < '2020-11-07'::timestamp + interval '1 day'
offset 0 /* to prevent in-lining the join to the outside set */
) mv, pg_timezone_names tzn
  where mv.device_id=d.short_id and dc.device_id = d.id and dc.latest=true
and dc.timezone=tzn.name and
mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and
mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset

By the way, it seems a little odd to be exclusive on both the begin and
end. I'd usually expect timestamp >= start_date and timestamp < end_date +
interval '1 day' to fully capture a 24 hour period. Right now, you are
excluding any data that happens to have a timestamp value with .00
seconds (midnight exactly).


Re: New "function tables" in V13 documentation

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 3:30 PM Ron  wrote:

> On 11/9/20 3:05 PM, David G. Johnston wrote:
>
> On Mon, Nov 9, 2020 at 2:01 PM Ron  wrote:
>
>> My suggestion is to add a "table of contents" at the top of non-trivial
>> sections that simply lists available functions by name (generally ignoring
>> argument variations) and a quick one line description of purpose.  Once a
>> person finds the name of the function that suits their needs they can then
>> reference the main table for details, warnings, and examples.
>>
>>
>> This is what TOCs are for, no?
>>
>>
> Are you criticizing my over-explanation of the phrase "table of contents"
> here?
>
>
> Why do you think that?
>
> I'm *agreeing* this is why TOCs were invented.
>
>
Because agreement without elaboration usually just merits a +1 so I read
more into the statement than you intended.

David J.


Re: New "function tables" in V13 documentation

2020-11-09 Thread Ron

On 11/9/20 3:05 PM, David G. Johnston wrote:
On Mon, Nov 9, 2020 at 2:01 PM Ron wrote:



My suggestion is to add a "table of contents" at the top of
non-trivial sections that simply lists available functions by name
(generally ignoring argument variations) and a quick one line
description of purpose.  Once a person finds the name of the function
that suits their needs they can then reference the main table for
details, warnings, and examples.


This is what TOCs are for, no?


Are you criticizing my over-explanation of the phrase "table of contents" 
here?


Why do you think that?

I'm *agreeing* this is why TOCs were invented.


--
Angular momentum makes the world go 'round.


Re: New "function tables" in V13 documentation

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 2:01 PM Ron  wrote:

> My suggestion is to add a "table of contents" at the top of non-trivial
> sections that simply lists available functions by name (generally ignoring
> argument variations) and a quick one line description of purpose.  Once a
> person finds the name of the function that suits their needs they can then
> reference the main table for details, warnings, and examples.
>
>
> This is what TOCs are for, no?
>
>
Are you criticizing my over-explanation of the phrase "table of contents"
here?

David J.


Re: New "function tables" in V13 documentation

2020-11-09 Thread Ron

On 11/9/20 2:47 PM, David G. Johnston wrote:
On Mon, Nov 9, 2020 at 1:41 PM Tom Lane wrote:


Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> On 2020-Nov-08, Adrian Klaver wrote:
>> Yeah, I would agree with the mobile first design comments. Then
again that
>> plague is hitting most sites these days. My 2 cents is it is a step
>> backwards. You can cover more ground quickly and digest it faster
in the old
>> format.

> The person who made that comment retracted later.

> If you have suggestion on how to improve the new format, I'm sure we can
> discuss that.  It seems pretty clear to me that we're not going back to
> the old format.

I think there's no question that the new format is better in any case
where a function needs more than a couple words of documentation.
I could see the argument for adopting a more compact format for tables
that contain no such functions.  I think you might find that the set of
such tables is nigh empty, though; even section 9.3 (mathematical
functions) has a lot of functions that need a sentence or two.  We used
to either omit important details for such functions or stick them in
footnotes, and neither of those options is very nice.


My observation is that the new format reduces one's ability to quickly 
skim the table to find out what is present since there is considerable 
extra information in one's eyes during that process that needs to be 
skimmed over.


My suggestion is to add a "table of contents" at the top of non-trivial 
sections that simply lists available functions by name (generally ignoring 
argument variations) and a quick one line description of purpose.  Once a 
person finds the name of the function that suits their needs they can then 
reference the main table for details, warnings, and examples.


This is what TOCs are for, no?

--
Angular momentum makes the world go 'round.


Re: New "function tables" in V13 documentation

2020-11-09 Thread Tom Lane
Josef Šimánek  writes:
> But it is not clear for me what exactly was the problem with the old
> format. Is there any discussion anyone can point me to to ensure I'll
> not just revive the old problems, but improve the overall situation?

The primary discussion threads for this change were

https://www.postgresql.org/message-id/flat/9326.1581457869%40sss.pgh.pa.us

https://www.postgresql.org/message-id/flat/8691.1586798003%40sss.pgh.pa.us

There are a lot of older threads about content (not layout) deficiencies
in our docs that were practically impossible to fix under the old format;
here's one:

https://www.postgresql.org/message-id/flat/158110996889.1089.4224139874633222837%40wrigleys.postgresql.org

regards, tom lane




Re: New "function tables" in V13 documentation

2020-11-09 Thread Adrian Klaver

On 11/9/20 12:35 PM, Alvaro Herrera wrote:

On 2020-Nov-09, Adrian Klaver wrote:


If you have suggestion on how to improve the new format, I'm sure we can
discuss that.  It seems pretty clear to me that we're not going back to
the old format.


Improve it by going back to old format. Not sure why that is not open to
discussion?


Because the old format had problems.


That reply is about as useful as the 'improvements'.


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: New "function tables" in V13 documentation

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 1:41 PM Tom Lane  wrote:

> Alvaro Herrera  writes:
> > On 2020-Nov-08, Adrian Klaver wrote:
> >> Yeah, I would agree with the mobile first design comments. Then again
> that
> >> plague is hitting most sites these days. My 2 cents is it is a step
> >> backwards. You can cover more ground quickly and digest it faster in
> the old
> >> format.
>
> > The person who made that comment retracted later.
>
> > If you have suggestion on how to improve the new format, I'm sure we can
> > discuss that.  It seems pretty clear to me that we're not going back to
> > the old format.
>
> I think there's no question that the new format is better in any case
> where a function needs more than a couple words of documentation.
> I could see the argument for adopting a more compact format for tables
> that contain no such functions.  I think you might find that the set of
> such tables is nigh empty, though; even section 9.3 (mathematical
> functions) has a lot of functions that need a sentence or two.  We used
> to either omit important details for such functions or stick them in
> footnotes, and neither of those options is very nice.
>

My observation is that the new format reduces one's ability to quickly skim
the table to find out what is present since there is considerable extra
information in one's eyes during that process that needs to be skimmed over.

My suggestion is to add a "table of contents" at the top of non-trivial
sections that simply lists available functions by name (generally ignoring
argument variations) and a quick one line description of purpose.  Once a
person finds the name of the function that suits their needs they can then
reference the main table for details, warnings, and examples.

David J.


Re: New "function tables" in V13 documentation

2020-11-09 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Nov-08, Adrian Klaver wrote:
>> Yeah, I would agree with the mobile first design comments. Then again that
>> plague is hitting most sites these days. My 2 cents is it is a step
>> backwards. You can cover more ground quickly and digest it faster in the old
>> format.

> The person who made that comment retracted later.

> If you have suggestion on how to improve the new format, I'm sure we can
> discuss that.  It seems pretty clear to me that we're not going back to
> the old format.

I think there's no question that the new format is better in any case
where a function needs more than a couple words of documentation.
I could see the argument for adopting a more compact format for tables
that contain no such functions.  I think you might find that the set of
such tables is nigh empty, though; even section 9.3 (mathematical
functions) has a lot of functions that need a sentence or two.  We used
to either omit important details for such functions or stick them in
footnotes, and neither of those options is very nice.

regards, tom lane




Re: New "function tables" in V13 documentation

2020-11-09 Thread Josef Šimánek
On Mon, Nov 9, 2020 at 9:35 PM Alvaro Herrera wrote:
>
> On 2020-Nov-09, Adrian Klaver wrote:
>
> > > If you have suggestion on how to improve the new format, I'm sure we can
> > > discuss that.  It seems pretty clear to me that we're not going back to
> > > the old format.
> >
> > Improve it by going back to old format. Not sure why that is not open to
> > discussion?
>
> Because the old format had problems.

The new format has problems as well. I was thinking about conditionally
reviving the old format (based on CSS media queries) for wide screens, and
using the current format on mobile-like screen widths.

But it is not clear to me what exactly the problem with the old format
was. Is there any discussion anyone can point me to, to ensure I'll not
just revive the old problems, but improve the overall situation?





Re: New "function tables" in V13 documentation

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 1:33 PM Adrian Klaver 
wrote:

> On 11/9/20 12:06 PM, Alvaro Herrera wrote:
>
> > If you have suggestion on how to improve the new format, I'm sure we can
> > discuss that.  It seems pretty clear to me that we're not going back to
> > the old format.
>
> Improve it by going back to old format. Not sure why that is not open to
> discussion?
>

More usefully, the current format and content changes are not going to be
"reverted" so "going back" is not really an option.  If you want to propose
a patch for moving forward to a new format that is similar to the old one
while retaining the content changes that would be a possible way forward.

David J.


Re: New "function tables" in V13 documentation

2020-11-09 Thread Alvaro Herrera
On 2020-Nov-09, Adrian Klaver wrote:

> > If you have suggestion on how to improve the new format, I'm sure we can
> > discuss that.  It seems pretty clear to me that we're not going back to
> > the old format.
> 
> Improve it by going back to old format. Not sure why that is not open to
> discussion?

Because the old format had problems.




Re: New "function tables" in V13 documentation

2020-11-09 Thread Adrian Klaver

On 11/9/20 12:06 PM, Alvaro Herrera wrote:

On 2020-Nov-08, Adrian Klaver wrote:


On 11/8/20 1:57 PM, Thomas Kellerer wrote:

In case someone is interested: there is a little discussion going on on
Reddit whether the new format of presenting functions in V13 is a step
backwards:

https://www.reddit.com/r/PostgreSQL/comments/jpi0rp/does_anyone_else_feel_like_the_v13_docs_are_a/


Yeah, I would agree with the mobile first design comments. Then again that
plague is hitting most sites these days. My 2 cents is it is a step
backwards. You can cover more ground quickly and digest it faster in the old
format.


The person who made that comment retracted later.

If you have suggestion on how to improve the new format, I'm sure we can
discuss that.  It seems pretty clear to me that we're not going back to
the old format.


Improve it by going back to old format. Not sure why that is not open to 
discussion?



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Foreign Data Wrapper Handler

2020-11-09 Thread Adrian Klaver

On 11/9/20 9:21 AM, Susan Hurst wrote:

Thanks for the detailed instructions, Laurenz!

"The foreign server encapsulates the connection string to access a remote
PostgreSQL database.  Define one per remote database you want to access."

Where do I define "one per remote database"? In pg_hba.conf?


https://www.postgresql.org/docs/12/sql-createserver.html



---

Susan E Hurst
Principal Consultant
Brookhurst Data LLC
Email: susan.hu...@brookhurstdata.com
Mobile: 314-486-3261





--
Adrian Klaver
adrian.kla...@aklaver.com




Execution plan does not use index

2020-11-09 Thread Peter Coppens
Hello,

Consider the following PostgreSQL 9.6.18 tables
- measurement_value: time series table  with a unique key on
(device_id,timestamp) columns and a number of columns with measurements.
Table contains a large number of rows (>150million)
- device table: with device properties (short_id joins to device_id in
measurement_value table)
- device_configuration table: extra device properties, including a timezone

This query works fine and quick (returns ±320K rows in 3.2 seconds when
explain is removed)

explain
  select d.short_id,mv.timestamp,mv.I64_01
  from device d, device_configuration dc, measurement_value mv,
pg_timezone_names tzn
  where mv.device_id=d.short_id and dc.device_id = d.id and dc.latest=true
and dc.timezone=tzn.name and
mv.timestamp > '2020-11-06'::timestamp and mv.timestamp <
'2020-11-07'::timestamp

==>
Hash Join  (cost=23.63..2156655.51 rows=1454320 width=20)
  Hash Cond: (dc.timezone = pg_timezone_names.name)
  ->  Nested Loop  (cost=1.13..2089933.38 rows=409070 width=34)
->  Nested Loop  (cost=0.56..7497.34 rows=615 width=18)
  ->  Index Scan using device_short_id_key on device d
 (cost=0.28..2423.90 rows=683 width=20)
  ->  Index Scan using device_configuration_device_latest_idx
on device_configuration dc  (cost=0.28..7.42 rows=1 width=30)
Index Cond: ((device_id = d.id) AND (latest = true))
Filter: latest

->  Index Scan using measurement_values_pkey on measurement_value
mv  (cost=0.57..3375.60 rows=1047 width=20)
  Index Cond: ((device_id = d.short_id) AND ("timestamp" > '2020-11-06
00:00:00'::timestamp without time zone) AND ("timestamp" < '2020-11-07
00:00:00'::timestamp without time zone))
  ->  Hash  (cost=10.00..10.00 rows=1000 width=32)
->  Function Scan on pg_timezone_names  (cost=0.00..10.00 rows=1000
width=32)


When adding the timezone offset to the where clause, the query becomes
extremely slow due to a full table scan of the measurement_value table
(±320K rows in 11 minutes)

(See  ->  Seq Scan on measurement_value mv  (cost=0.00..6308233.77
rows=169617977 width=20))

explain
  select d.short_id,mv.timestamp,mv.I64_01
  from device d, device_configuration dc, measurement_value mv,
pg_timezone_names tzn
  where mv.device_id=d.short_id and dc.device_id = d.id and dc.latest=true
and dc.timezone=tzn.name and
mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and
mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset

==>
Hash Join  (cost=683.93..18226567.32 rows=60331762 width=20)
  Hash Cond: (mv.device_id = d.short_id)
  Join Filter: ((mv."timestamp" > ('2020-11-06 00:00:00'::timestamp without
time zone - pg_timezone_names.utc_offset)) AND (mv."timestamp" <
('2020-11-07 00:00:00'::timestamp without time zone -
pg_timezone_names.utc_offset)))
  ->  Seq Scan on measurement_value mv  (cost=0.00..6308233.77
rows=169617977 width=20)
  ->  Hash  (cost=656.61..656.61 rows=2186 width=20)
->  Hash Join  (cost=77.87..656.61 rows=2186 width=20)
  Hash Cond: (dc.timezone = pg_timezone_names.name)
  ->  Hash Join  (cost=55.37..533.83 rows=615 width=18)
Hash Cond: (dc.device_id = d.id)
->  Seq Scan on device_configuration dc
 (cost=0.00..470.01 rows=615 width=30)
  Filter: latest
->  Hash  (cost=46.83..46.83 rows=683 width=20)
  ->  Seq Scan on device d  (cost=0.00..46.83
rows=683 width=20)
  ->  Hash  (cost=10.00..10.00 rows=1000 width=48)
->  Function Scan on pg_timezone_names
 (cost=0.00..10.00 rows=1000 width=48)


Adding the tzn.utc_offset results in the fact that the execution plan no
longer considers to use the index on the measurement_value table. Is there
any way the SQL can be rewritten so that the index is used? Or any other
solution so that the query with the timezone offset returns in a comparable
time?

Many thanks!

Peter


Re: New "function tables" in V13 documentation

2020-11-09 Thread Alvaro Herrera
On 2020-Nov-08, Adrian Klaver wrote:

> On 11/8/20 1:57 PM, Thomas Kellerer wrote:
> > In case someone is interested: there is a little discussion going on on
> > Reddit whether the new format of presenting functions in V13 is a step
> > backwards:
> > 
> > 
> > https://www.reddit.com/r/PostgreSQL/comments/jpi0rp/does_anyone_else_feel_like_the_v13_docs_are_a/
> 
> Yeah, I would agree with the mobile first design comments. Then again that
> plague is hitting most sites these days. My 2 cents is it is a step
> backwards. You can cover more ground quickly and digest it faster in the old
> format.

The person who made that comment retracted later.

If you have suggestion on how to improve the new format, I'm sure we can
discuss that.  It seems pretty clear to me that we're not going back to
the old format.




Re: New "function tables" in V13 documentation

2020-11-09 Thread Tony Shelver
On Mon, 9 Nov 2020 at 02:54, Adrian Klaver 
wrote:

> On 11/8/20 1:57 PM, Thomas Kellerer wrote:
> > In case someone is interested: there is a little discussion going on on
> > Reddit whether the new format of presenting functions in V13 is a step
> > backwards:
> >
> >
> >
> https://www.reddit.com/r/PostgreSQL/comments/jpi0rp/does_anyone_else_feel_like_the_v13_docs_are_a/
>
> Yeah, I would agree with the mobile first design comments. Then again
> that plague is hitting most sites these days. My 2 cents is it is a step
> backwards. You can cover more ground quickly and digest it faster in the
> old format.
>
> >
> >
> > Thomas
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
Agreed, old format much more readable.


Re: After vacuum application runs very slow ? is this common behavior ?

2020-11-09 Thread Sri Linux
Thank you for your response.

On Fri, Nov 6, 2020 at 11:14 PM Adrian Klaver 
wrote:

> On 11/6/20 8:20 PM, Sri Linux wrote:
> > Hi All,
> >
> > Our production database size is about 2TB and we had run into issues and
> > Postgres log did recommend running the vacuum in single-user mode. We
> > have successfully completed running a single user mode vacuum without
> > any options. .. services were online after vacuum. Application load time
> > and response times are very poor and don't load sometimes. Is this
> > common behavior after a vacuum? DO we need to consider any post
> > activities or give some time for the database to settle down?
>
> Given that there is no actual useful/hard data in the above, the chances
> of there being a relevant answer in return are doubtful.
>
> Also given your subsequent post the major version you are using is
> nearing 1 year past EOL and is 21 minor releases behind the last release
> in the series. At the very least you need to bring it up to the last
> minor release.
>
> >
> > Did full os restart for the DB server?
> >
> > Thanks
> > Sri.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Temporary tables usage in functions

2020-11-09 Thread Pavel Stehule
On Mon, Nov 9, 2020 at 6:19 PM Michael Lewis wrote:

> Also may I know if excessive use of temporary tables may cause locks?
>>>
>>
>> Usually there are no problems with locks, but there is a problem with
>> system table bloat. Creating and dropping temp tables is as expensive as
>> creating or dropping normal tables.
>>
>
> Dropping a real table requires scanning all of shared_buffers, right? I
> wouldn't think dropping a temp table requires that kind of heavy operation
> because it should be stored within separate temp_buffers.
>

Cleaning shared buffers probably is not a problem when shared buffers are
smaller than 20GB. The main problem is bloating of pg_class, pg_attribute,
and maybe pg_depend. VACUUM (autovacuum) runs only after the transaction
ends. So if somebody drops and recreates a temp table when some function
starts, and this function is called 1M times inside one transaction, then
there is brutal bloating of the system catalog. And bloated system tables
can cause a lot of other performance problems.



>
>> Unlike in other databases, arrays can often be used instead of temporary
>> tables. This is significantly more effective.
>>
>
> Can you expand on this point? What do you mean? Like using a values
> statement rather than temp table? I find that I often need to create a temp
> table and analyze it to allow the planner to make wise decisions.
>

This is correct usage of temp tables. When you need to ANALYZE some data,
there is no other possibility than using a temp table.
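
A sketch of that pattern (table and column names are illustrative):

create temp table matched_ids on commit drop as
  select id from big_table where flag;
analyze matched_ids;  -- gives the planner row counts and statistics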

But a lot of people are coming from the MS SQL world, where temporary
tables are used significantly more often - for passing a list of ids between
procedures, for creating a multi-row result, ...  The implementation of MS
SQL temp tables or table variables is very different, and the usage is much
more common - typically where we would use arrays in Postgres. Oracle is a
different example: there, global temporary tables are primary - again with
very low (zero) impact on the system catalog - and some patterns that work
well on Oracle don't work well in Postgres (under longer, higher load).

postgres=# \dt+ pg_attribute
                                          List of relations
┌────────────┬──────────────┬───────┬──────────┬─────────────┬───────────────┬────────┬─────────────┐
│   Schema   │     Name     │ Type  │  Owner   │ Persistence │ Access Method │  Size  │ Description │
╞════════════╪══════════════╪═══════╪══════════╪═════════════╪═══════════════╪════════╪═════════════╡
│ pg_catalog │ pg_attribute │ table │ postgres │ permanent   │ heap          │ 568 kB │             │
└────────────┴──────────────┴───────┴──────────┴─────────────┴───────────────┴────────┴─────────────┘
(1 row)

Timing is on.
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2043,087 ms (00:02,043)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2082,437 ms (00:02,082)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2124,664 ms (00:02,125)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2137,486 ms (00:02,137)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2188,999 ms (00:02,189)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2209,167 ms (00:02,209)
postgres=# do $$
begin
  for i in 1..1000
  loop
create temp table foo(a int);
drop table foo;
  end loop;
end;
$$;
DO
Time: 2344,531 ms (00:02,345)

postgres=# \dt+ pg_attribute
                                           List of relations
┌────────────┬──────────────┬───────┬──────────┬─────────────┬───────────────┬─────────┬─────────────┐
│   Schema   │     Name     │ Type  │  Owner   │ Persistence │ Access Method │  Size   │ Description │
╞════════════╪══════════════╪═══════╪══════════╪═════════════╪═══════════════╪═════════╪═════════════╡
│ pg_catalog │ pg_attribute │ table │ postgres │ permanent   │ heap          │ 1592 kB │             │
└────────────┴──────────────┴───────┴──────────┴─────────────┴───────────────┴─────────┴─────────────┘
(1 row)

So some patterns that are usual with temporary tables on Oracle or MS SQL
are bad for Postgres. This is an artificial example - the reality can be
worse due to long transactions that can block vacuum.

On the other hand, the advice for Oracle is also to use temporary tables
only when necessary and when it isn't possible to use a collection.

Regards

Pavel


Re: Foreign Data Wrapper Handler

2020-11-09 Thread Laurenz Albe
On Mon, 2020-11-09 at 11:21 -0600, Susan Hurst wrote:
> "The foreign server encapsulates the connection string to access a remote
> PostgreSQL database.  Define one per remote database you want to access."
> 
> Where do I define "one per remote database"? In pg_hba.conf?

No, in SQL:

   CREATE SERVER somename FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
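
For instance, a hypothetical sketch (host, database, and user names are
made up):

   CREATE SERVER stp_to_geo FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (host 'geo.example.com', port '5432', dbname 'geo');
   CREATE USER MAPPING FOR CURRENT_USER SERVER stp_to_geo
      OPTIONS (user 'susan', password 'secret');
   -- the local schema must already exist:
   IMPORT FOREIGN SCHEMA public FROM SERVER stp_to_geo INTO local_geo;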

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Foreign Data Wrapper Handler

2020-11-09 Thread Susan Hurst

Thanks for the detailed instructions, Laurenz!

"The foreign server encapsulates the connection string to access a 
remote
PostgreSQL database.  Define one per remote database you want to 
access."


Where do I define "one per remote database"?.in pg_hba.conf?

---

Susan E Hurst
Principal Consultant
Brookhurst Data LLC
Email: susan.hu...@brookhurstdata.com
Mobile: 314-486-3261

On 2020-11-09 11:12, Laurenz Albe wrote:

On Sun, 2020-11-08 at 13:09 -0600, Susan Hurst wrote:

The first pass thru https://www.postgresql.org/docs/12/fdwhandler.html
does not tell me what I think I need to know, but I will digest this
more thoroughly. Maybe I need to understand more of the lingo re:
foreign data wrappers. I do understand that all fdw names must be unique
within a database so if I want to connect to more than one foreign db, I
need a different name for each connection. I cannot name each fdw
postgres_fdw. I would like to name the fdws something like:
dbname_to_foreigndbname.

For example, here are 2 possible fdws:

create foreign data wrapper stp_to_geo;
create foreign data wrapper stp_to_metrics;

That syntax creates the fdw and I can create user mappings but I cannot
import any foreign schemas into my database. The error message says that
there is no handler for the fdw. That's where I'm stuck.

BTW, I did try using postgres_fdw as a handler...

create foreign data wrapper stp_to_geo handler postgres_fdw;

...but then I got these errors:
ERROR:  function postgres_fdw() does not exist
ERROR:  foreign-data wrapper "stp_to_geo" does not exist

Looks like I need to study a bit more.


This is how you would create a new foreign data wrapper object for PostgreSQL:

  CREATE FOREIGN DATA WRAPPER myfdw
 HANDLER public.postgres_fdw_handler
 VALIDATOR public.postgres_fdw_validator;

This assumes that you installed the extension "postgres_fdw" in schema "public".

But you normally don't have to create a new foreign data wrapper: the one named
"postgres_fdw" that is created by the extension is good enough.
The only reason would be to have a foreign data wrapper with non-default options,
but since there are no options for "postgres_fdw", that is moot.

So don't do that.

The hierarchy of objects is as follows:

- The foreign data wrapper encapsulates the code required to access the foreign
  data source.  You need only one per database; no need to create a new one.

- The foreign server encapsulates the connection string to access a remote
  PostgreSQL database.  Define one per remote database you want to access.

- The user mapping encapsulates the credentials for a user to access a
  foreign server.
  Create one per user and foreign server (or a single one for PUBLIC =
  everybody).

- The foreign table describes how a remote table is mapped locally.
  Define one per table that interests you.

Yours,
Laurenz Albe





Re: Temporary tables usage in functions

2020-11-09 Thread Michael Lewis
>
> Also may I know if excessive use of temporary tables may cause locks?
>>
>
> Usually there are no problems with locks, but there is a problem with
> system table bloat. Creating and dropping temp tables is as expensive as
> creating or dropping normal tables.
>

Dropping a real table requires scanning all of shared_buffers, right? I
wouldn't think dropping a temp table requires that kind of heavy operation
because it should be stored within separate temp_buffers.


> Unlike in other databases, arrays can often be used instead of temporary tables.
> This is significantly more effective.
>

Can you expand on this point? What do you mean? Like using a values
statement rather than temp table? I find that I often need to create a temp
table and analyze it to allow the planner to make wise decisions.

>


Re: Foreign Data Wrapper Handler

2020-11-09 Thread Laurenz Albe
On Sun, 2020-11-08 at 13:09 -0600, Susan Hurst wrote:
> The first pass thru https://www.postgresql.org/docs/12/fdwhandler.html 
> does not tell me what I think I need to know, but I will digest this 
> more thoroughly. Maybe I need to understand more of the lingo re: 
> foreign data wrappers. I do understand that all fdw names must be unique 
> within a database so if I want to connect to more than one foreign db, I 
> need a different name for each connection. I cannot name each fdw 
> postgres_fdw. I would like to name the fdws something like: 
> dbname_to_foreigndbname.
> 
> For example, here are 2 possible fdws:
> 
> create foreign data wrapper stp_to_geo;
> create foreign data wrapper stp_to_metrics;
> 
> That syntax creates the fdw and I can create user mappings but I cannot 
> import any foreign schemas into my database. The error message says that 
> there is no handler for the fdw. That's where I'm stuck.
> 
> BTW, I did try using postgres_fdw as a handler...
> 
> create foreign data wrapper stp_to_geo handler postgres_fdw;
> 
> ...but then I got these errors:
> ERROR:  function postgres_fdw() does not exist
> ERROR:  foreign-data wrapper "stp_to_geo" does not exist
> 
> Looks like I need to study a bit more.

This is how you would create a new foreign data wrapper object for PostgreSQL:

  CREATE FOREIGN DATA WRAPPER myfdw
 HANDLER public.postgres_fdw_handler
 VALIDATOR public.postgres_fdw_validator;

This assumes that you installed the extension "postgres_fdw" in schema "public".

But you normally don't have to create a new foreign data wrapper: the one named
"postgres_fdw" that is created by the extension is good enough.
The only reason would be to have a foreign data wrapper with non-default 
options,
but since there are no options for "postgres_fdw", that is moot.

So don't do that.

The hierarchy of objects is as follows:

- The foreign data wrapper encapsulates the code required to access the foreign
  data source.  You need only one per database; no need to create a new one.

- The foreign server encapsulates the connection string to access a remote
  PostgreSQL database.  Define one per remote database you want to access.

- The user mapping encapsulates the credentials for a user to access a foreign 
server.
  Create one per user and foreign server (or a single one for PUBLIC = 
everybody).

- The foreign table describes how a remote table is mapped locally.
  Define one per table that interests you.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: initdb --data-checksums

2020-11-09 Thread Paul Förster
Hi Matt,

> On 09. Nov, 2020, at 18:00, Matt Zagrabelny  wrote:
> 
> Hello,
> 
> I see the --data-checksums option for initdb. Is it possible to use 
> --data-checksums after the cluster has been initialized? I'm guessing "not", 
> but thought I'd ask.
> 
> I'm running v12 on Debian.
> 
> Thanks for any help!

indeed, it is. Have a look at:

https://www.postgresql.org/docs/12/app-pgchecksums.html

Make sure the database is cleanly shut down before doing it.

Cheers,
Paul



initdb --data-checksums

2020-11-09 Thread Matt Zagrabelny
Hello,

I see the --data-checksums option for initdb. Is it possible to use
--data-checksums after the cluster has been initialized? I'm guessing
"not", but thought I'd ask.

I'm running v12 on Debian.

Thanks for any help!

-m


Re: Database system was interrupted. Possible reasons for a database to suddenly stop accepting connections?

2020-11-09 Thread Laurenz Albe
On Mon, 2020-11-09 at 13:53 +, Buzenets, Yuriy (GE Renewable Energy, 
consultant) wrote:

If I delete all the noise from the log file, this remains:

> < 2020-10-29 11:51:59.345 PDT >STATEMENT:  SELECT NULL AS TABLE_CAT, 
> n.nspname AS TABLE_SCHEM, [...]

> < 2020-10-29 12:04:09.700 PDT >LOG:  database system was interrupted; last 
> known up at 2020-10-29 12:03:16 PDT
> < 2020-10-29 12:04:09.812 PDT >LOG:  database system was not properly shut 
> down; automatic recovery in progress
> < 2020-10-29 12:04:09.817 PDT >LOG:  redo starts at 43C/E887E1C8
> < 2020-10-29 12:04:09.916 PDT >LOG:  record with zero length at 43C/E8ED94D8
> < 2020-10-29 12:04:09.917 PDT >LOG:  redo done at 43C/E8ED94A8
> < 2020-10-29 12:04:09.917 PDT >LOG:  last completed transaction was at log 
> time 2020-10-29 12:03:46.422937-07
> < 2020-10-29 12:04:09.938 PDT >LOG:  MultiXact member wraparound protections 
> are now enabled
> < 2020-10-29 12:04:09.941 PDT >LOG:  autovacuum launcher started
> < 2020-10-29 12:04:09.941 PDT >LOG:  database system is ready to accept 
> connections

> < 2020-10-29 12:26:44.453 PDT >LOG:  received fast shutdown request
> < 2020-10-29 12:26:44.453 PDT >LOG:  aborting any active transactions
> < 2020-10-29 12:26:44.544 PDT >LOG:  shutting down
> < 2020-10-29 12:26:44.587 PDT >LOG:  database system is shut down

> < 2020-10-29 12:27:13.156 PDT >LOG:  database system was shut down at 
> 2020-10-29 12:26:44 PDT
> < 2020-10-29 12:27:13.178 PDT >LOG:  MultiXact member wraparound protections 
> are now enabled
> < 2020-10-29 12:27:13.179 PDT >LOG:  autovacuum launcher started
> < 2020-10-29 12:27:13.179 PDT >LOG:  database system is ready to accept 
> connections
> < 2020-10-29 15:02:42.588 PDT >LOG:  received fast shutdown request
> < 2020-10-29 15:02:42.588 PDT >LOG:  aborting any active transactions
> < 2020-10-29 15:02:42.712 PDT >LOG:  shutting down
> < 2020-10-29 15:02:42.732 PDT >LOG:  database system is shut down

> < 2020-10-29 15:02:43.711 PDT >LOG:  database system was shut down at 
> 2020-10-29 15:02:42 PDT
> < 2020-10-29 15:02:43.714 PDT >LOG:  MultiXact member wraparound protections 
> are now enabled
> < 2020-10-29 15:02:43.715 PDT >LOG:  autovacuum launcher started
> < 2020-10-29 15:02:43.716 PDT >LOG:  database system is ready to accept 
> connections

There was an operating system crash or a "kill -9" somewhere between 11:51 and
12:04 on Oct. 29.

PostgreSQL got restarted and recovered at 12:04.

Then there were two more clean shutdowns and restarts at 12:26 and 15:02.

Your problem is probably the first crash.  If you don't have any indication
that the machine crashed,
look into the kernel log - perhaps the out-of-memory killer struck (assuming
this is Linux).
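
For example (a sketch; log file locations vary by distribution):

  dmesg | grep -i "out of memory"
  grep -i "killed process" /var/log/messages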

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread David G. Johnston
On Mon, Nov 9, 2020 at 9:08 AM mark armon <1994hej...@gmail.com> wrote:

>
> How to set up a schema default date (now) to '2020-01-01'?  Whatever
> timezone would be OK.
>

What is a "schema default" (date or otherwise)?

David J.


How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread mark armon
How to set up a schema default date (now) to '2020-01-01'?  Whatever
timezone would be OK.


Re: database aliasing options ?

2020-11-09 Thread Tom Lane
David Gauthier  writes:
> Our IT dept has configured our PG DB as a "High Availability" database.  It
> has a primary and backup server (disks too).  Normally both are running but
> if one goes down, the other is still available for use, effectively keeping
> the DB up while the failed server is being repaired.

> My question has to do with creating and using a DB alias for DB connection
> code. Of course the idea would be to toggle which server the alias is
> pointing to based on the state of the servers.  The idea is to manage which
> server is being used OUTSIDE of the DB connection code by changing the
> alias. The code always uses the alias.

Some people do this by changing the DNS entry for the server.

Another thought is that maybe you don't have to do anything dynamic,
if you set up the clients with host lists not just single hostnames
(that is, "-h host1,host2" or the equivalent).  That should work for
any libpq-based client.
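
For example, a libpq multi-host connection string might look like this
(host names are illustrative):

  postgresql://db1.example.com:5432,db2.example.com:5432/mydb?target_session_attrs=read-write

With target_session_attrs=read-write (available since v10), libpq skips a
server that only accepts read-only sessions, such as a standby.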

regards, tom lane




database aliasing options ?

2020-11-09 Thread David Gauthier
Hi:

version 11.5 on linux.

Our IT dept has configured our PG DB as a "High Availability" database.  It
has a primary and backup server (disks too).  Normally both are running but
if one goes down, the other is still available for use, effectively keeping
the DB up while the failed server is being repaired.

My question has to do with creating and using a DB alias for DB connection
code. Of course the idea would be to toggle which server the alias is
pointing to based on the state of the servers.  The idea is to manage which
server is being used OUTSIDE of the DB connection code by changing the
alias. The code always uses the alias.

Years ago, working for a different company, they had DB aliases, so I'm
assuming this is still possible (please tell if if it's not !).  If this is
possible, where should I point our IT guys as far as instructions on how to
enable this or, if there are multiple options, what are a couple of the
simplest/best?  Also, what options are there for the automated toggling of
which server the alias is pointing to if/when the one it's pointing to
comes down ?

Thanks for any replies !


Re: Different result behavior when using ANY(ARRAY(SELECT)) and IN (SELECT)

2020-11-09 Thread Tom Lane
Davide Jensen  writes:
> I'm encountering some problems in understanding the behavior of a query
> that uses an IN operator, the query i'm executing is the following:

> SELECT * FROM   (
>   SELECT _id,
>  ROW_NUMBER() OVER () AS _rownumber

I think your problem is in the above, not anything later in the query.
ROW_NUMBER without any ordering specification is indeterminate.
If the query gets parallelized, it's no surprise that the selected rows
will be processed in varying order from one try to the next.  Your
second phrasing of the query seems to be non-parallelizable, but the
row_number() result is still pretty indeterminate; it just doesn't
happen to have changed within your test run.
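
A deterministic variant would order inside the window, e.g. using the
query's own sort column:

  ROW_NUMBER() OVER (ORDER BY _code) AS _rownumber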

What is it you are expecting to get out of including a fictional
row number in the query result, anyway?

regards, tom lane




Re: Database system was interrupted. Possible reasons for a database to suddenly stop accepting connections?

2020-11-09 Thread Adrian Klaver

On 11/9/20 5:53 AM, Buzenets, Yuriy (GE Renewable Energy, consultant) wrote:
Some time ago the database at my work suddenly stopped accepting 
connections. In the logs there was a message “the database system was 
interrupted; last known up at 2020-10-29 12:03:16 PDT”, followed by a 
lot of “the database system is starting up” messages. It seems like the 
database tried to recover from whatever failure occurred but recovery 
was not successful until the database process was shut down and 
restarted. The latest error message before “system was interrupted” 
occurs 13 minutes before so it’s not clear to us what could cause this 
issue and why the database process had to be restarted. If anyone could 
help us understand what might’ve been the cause, we would greatly 
appreciate it. We’re using PostgreSQL 9.4 on CentOS 6.8, we know it 
reached EOL and plan to upgrade to a newer version soon. I’ve also 
attached the log file from the time the database system was unavailable 
if it helps to figure out the cause of the outage


There are a lot of 'terminating connection due to administrator command'
lines. How many connections were/are being made to this server?




Thanks,

Yuriy Buzenets




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: RAISE INFO in function

2020-11-09 Thread Pavel Stehule
On Mon, Nov 9, 2020 at 2:46 PM Yambu wrote:

> Hi
>
> May i know if RAISE INFO impacts performance significantly in a function?
>
> Should i comment them out once i'm done using/debugging ?
>

It depends on more factors - but the expressions in RAISE statements are
evaluated every time, even when the message itself is later filtered out.
Inside loops, or when a function is evaluated for every row of a query, the
overhead can be significant:

postgres=# do $$
begin
  for i in 1..100000 loop
    raise debug '%', i;
  end loop;
end;
$$;
DO
Time: 35,164 ms
postgres=# do $$
begin
  -- plpgsql cannot ignore empty blocks, so this loop is still evaluated
  -- 100K times
  for i in 1..100000 loop
    --raise debug '%', i + 1;
  end loop;
end;
$$;
DO
Time: 2,535 ms
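
If the messages must stay in the code, one way to keep them cheap in
production is to guard them with a custom setting (a sketch; the GUC name
"myapp.debug" is made up):

do $$
begin
  if current_setting('myapp.debug', true) = 'on' then
    raise info 'expensive diagnostics here';
  end if;
end;
$$;

The guard itself is still evaluated on every call, but the RAISE argument
expressions are skipped when the setting is off or unset.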

Regards

Pavel


> regards
>


Database system was interrupted. Possible reasons for a database to suddenly stop accepting connections?

2020-11-09 Thread Buzenets, Yuriy (GE Renewable Energy, consultant)
Some time ago the database at my work suddenly stopped accepting connections. 
In the logs there was a message "the database system was interrupted; last 
known up at 2020-10-29 12:03:16 PDT", followed by a lot of "the database system 
is starting up" messages. It seems like the database tried to recover from 
whatever failure occurred but recovery was not successful until the database 
process was shut down and restarted. The latest error message before "system 
was interrupted" occurs 13 minutes before so it's not clear to us what could 
cause this issue and why the database process had to be restarted. If anyone 
could help us understand what might've been the cause, we would greatly 
appreciate it. We're using PostgreSQL 9.4 on CentOS 6.8, we know it reached EOL 
and plan to upgrade to a newer version soon. I've also attached the log file 
from the time the database system was unavailable if it helps to figure out the 
cause of the outage

Thanks,
Yuriy Buzenets




RAISE INFO in function

2020-11-09 Thread Yambu
Hi

May i know if RAISE INFO impacts performance significantly in a function?

Should i comment them out once i'm done using/debugging ?

regards


Re: Temporary tables usage in functions

2020-11-09 Thread Pavel Stehule
Hi

On Mon, Nov 9, 2020 at 1:07 PM Yambu wrote:

> Hi
>
> May I know if a temporary table is dropped at the end of a function?
>

PostgreSQL temporary tables can be dropped at the end of the transaction or
at the end of the session.


>
> Also may I know if excessive use of temporary tables may cause locks?
>

Usually there are no problems with locks, but there is a problem with
system table bloat. Creating and dropping temp tables is as expensive as
creating or dropping normal tables.

Unlike in other databases, arrays can often be used instead of temporary
tables. This is significantly more effective.
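
A minimal sketch of the array approach (table and values are hypothetical):

select t.*
  from my_table t
  join unnest(array[101, 102, 103]) as ids(id) on t.id = ids.id;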

Regards

Pavel


> regards
>


Re: Temporary tables usage in functions

2020-11-09 Thread Josef Šimánek
On Mon, Nov 9, 2020 at 1:07 PM Yambu wrote:
>
> Hi
>
> May I know if a temporary table is dropped at the end of a function?

Check 
https://www.postgresql.org/docs/12/sql-createtable.html#SQL-CREATETABLE-TEMPORARY,
especially the "ON COMMIT" part.

>
> Also may I know if excessive use of temporary tables may cause locks?
>
> regards




Temporary tables usage in functions

2020-11-09 Thread Yambu
Hi

May I know if a temporary table is dropped at the end of a function?

Also may I know if excessive use of temporary tables may cause locks?

regards


Different result behavior when using ANY(ARRAY(SELECT)) and IN (SELECT)

2020-11-09 Thread Davide Jensen
Hi everyone,
I'm encountering some problems in understanding the behavior of a query
that uses an IN operator. The query I'm executing is the following:

SELECT * FROM (
  SELECT _id,
 ROW_NUMBER() OVER () AS _rownumber
  FROM (
SELECT "Id" _id,
   "IdClass"::regclass _idclass,
   "CurrentId" _currentid,
   "User" _user,
   "BeginDate" _begindate,
   "EndDate" _enddate,
   "Status" _status,
   "Code" _code,
   "Description" _description
FROM "Hardware" _hardware
WHERE "IdClass" = ANY (ARRAY[...])
AND "Id" IN (
SELECT "Id"
FROM "Hardware"
WHERE "Status" = 'A'
AND "IdClass" NOT IN ('"NetEquipment"'::regclass,
  '"Battery"'::regclass,
  '"DistribBoard"'::regclass,
  '"Termination"'::regclass,
  '"SysEquipment"'::regclass))
AND "IdClass" = ANY (ARRAY[...])
AND "Status" = 'A'
ORDER BY _code ASC) _rdinner) _rdouther
WHERE _id = 64297639;

I've excluded the content of the arrays because it would have been too much
useless text.
When executing this query I'm expecting a single result with the row number
of the record and its Id, but instead I'm obtaining different results at
every execution. Here are the results of 3 sequential executions:
1) 64297639;2490
   64297639;20867
2) 64297639;2484
   64297639;14683
3) 64297639;2511
   64297639;20844
After researching a bit on the internet I've found people that were
encountering some issues with the usage of the IN operator and solved the
problem by switching it to an ANY(ARRAY()), so I've changed the following
part of the query:
AND  "Id" IN (
   (
  SELECT "Id"
  FROM   "Hardware"
  WHERE  "Status" = 'A'
  AND"IdClass" NOT IN
('"NetEquipment"'::regclass,

 '"Battery"'::regclass,

 '"DistribBoard"'::regclass,

 '"Termination"'::regclass,

 '"SysEquipment"'::regclass) ))

to this one:

AND  "Id" = ANY ( ARRAY (
  SELECT "Id"
  FROM   "Hardware"
  WHERE  "Status" = 'A'
  AND"IdClass" NOT IN
('"NetEquipment"'::regclass,

 '"Battery"'::regclass,

 '"DistribBoard"'::regclass,

 '"Termination"'::regclass,

 '"SysEquipment"'::regclass) ))
In this case the result of the whole query is what I'm expecting and it's
consistent at every execution:
64297639;2211
To try to understand the difference between the two queries I've tried
using EXPLAIN ANALYZE and this is the result:

* Query with IN operator:

"Subquery Scan on _rdouther  (cost=9285.49..11035.69 rows=1 width=16)
(actual time=26.841..41.850 rows=2 loops=1)"
"  Filter: (_rdouther._id = 64297639)"
"  Rows Removed by Filter: 27138"
"  ->  WindowAgg  (cost=9285.49..10889.84 rows=11668 width=16) (actual
time=25.156..40.675 rows=27140 loops=1)"
"->  Subquery Scan on _rdinner  (cost=9285.49..10743.99 rows=11668
width=8) (actual time=25.153..34.801 rows=27140 loops=1)"
"  ->  Gather Merge  (cost=9285.49..10627.31 rows=11668
width=815) (actual time=25.152..32.900 rows=27140 loops=1)"
"Workers Planned: 1"
"Workers Launched: 1"
"->  Sort  (cost=8285.48..8314.65 rows=11668 width=815)
(actual time=18.988..19.671 rows=13570 loops=2)"
"  Sort Key: _hardware."Code""
"  Sort Method: quicksort  Memory: 1717kB"
"  ->  Hash Semi Join  (cost=1408.19..3307.79
rows=11668 width=815) (actual time=8.489..15.665 rows=13570 loops=2)"
"Hash Cond: (_hardware."Id" =
"Hardware"."Id")"
"->  Append  (cost=0.00..1739.09 rows=11673
width=45) (actual time=0.009..4.489 rows=13570 loops=2)"
"  ->  Parallel Seq Scan on "Hardware"
_hardware  (cost=0.00..0.00 rows=1 width=226) (actual time=0.000..0.000
rows=0 loops=2)"
" 

Backup Restore from other node after switchover/failover

2020-11-09 Thread Dirk Krautschick
Hi,

haven’t tested it yet but maybe I can get a quick answer here.
We have discussed the following scenario.

A few nodes form a streaming replication cluster, all in sync, with backups
taken only from one dedicated node. Now the node that is responsible for the
backups goes down. I have a full backup taken from that crashed node some
time ago, but now this node is gone and I have to take care of a restore for
some reason.

Am I able to make a full restore with that last backup from the offline node
to any other still existing node, recovering the archived WAL from the newly
selected node, if I have activated archive_mode as "always"?

Or is it strictly necessary to create a new full backup from the node that
becomes responsible for the backups after the failover, or maybe after a
switchover, too?

Thanks and best regards

Dirk