Re: ALTER TABLE .. SET STATISTICS

2018-08-05 Thread Alvaro Herrera
On 2018-Aug-05, Ron wrote:

> For columns of type bytea which store image data (PDFs, JPGs, etc) would it
> speed up the ANALYZE process to SET STATISTICS = 0?

Yes.  I'd suggest also looking at ALTER TABLE ... SET STORAGE to avoid
having Postgres try to compress those columns, which is useless waste of
CPU cycles, since those formats already compress.

> That way, default_statistics_target could be cranked higher -- giving better
> statistics for needed columns -- without polluting pg_statistics with
> unneeded data?

Right.

With the default being 100 I don't think there's a real need to crank it
higher server-wide, but it may be useful to crank it higher for specific
columns.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Who and How is responsible for released installations packages and 3rd party packs? (e.g. on https://yum.postgresql.org/9.6/redhat/rhel-7.3-x86_64/)

2018-08-05 Thread Devrim Gündüz

Hi,

On Thu, 2018-08-02 at 08:53 +0300, Alexandru Lazarev wrote:
> I saw on URL there are PG 9.6.8 and 9.6.9 - Are there maintained only
> latest 2 build releases?

Yes, older releases are deleted automagically.

Regards,
-- 
Devrim Gündüz
EnterpriseDB: https://www.enterprisedb.com
PostgreSQL Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR


signature.asc
Description: This is a digitally signed message part


Conditional JOINs to optimise expensive views

2018-08-05 Thread wouter-postgresql
Hello all,

I'm trying to write queries with conditional JOINs. The goal is to guard 
expensive views by checking some conditions first and not execute the view at 
all if any of the conditions fail. Think along the lines of REST, like Not 
Authorized or Not Modified.

My problem is that the Postgresql (9.5 and 9.6) query planner executes LEFT and 
LATERAL JOINed queries even when the JOIN conditions are false.

I don't understand why.

I could work around some of this by adding a (dirty-ish?) CASE statement or a 
slower and more complex CTE construct or even create functions for each of 
these queries, but I'd really like to know why the most simple solution doesn't 
work. I tried many variations, but in my use cases, Postgresql always executes 
JOINs if the conditions are not constants.

What I'm really looking for is a general pattern with good performance to guard 
expensive views.

Here is an example query (is_owner is false):

SELECT is_owner, is_newer, json 
FROM (
 SELECT id, owner = '053bffbc-c41e-dad4-853b-ea91fc42ea17' "is_owner"
  , modified >= created "is_newer" 
 FROM datasets 
 WHERE id = '056e4eed-ee63-2add-e981-0c86b8b6a66f'
) cond
LEFT JOIN view_dataset view
ON view.id = cond.id 
   AND cond.is_owner 
   AND cond.is_newer;

Which returns (correctly, but executing the slow JOINed view):
is_owner: false
is_newer: true
json: (null)


I don't want that query to execute the JOINed view, that is the whole point of 
those conditions, but it does.


Here is a working sql fiddle:

http://sqlfiddle.com/#!17/6882c6/5

And here is the stackoverflow thread:

https://dba.stackexchange.com/questions/211642/optimising-expensive-join-subquery-by-filtering-with-conditionals


(Note that in the fiddle example there is a  0.5 second delay in the view to 
simulate it being expensive, so it should be pretty clear if the query planner 
executes the view or not.)

Thanks for any insight!



Re: ALTER TABLE .. SET STATISTICS

2018-08-05 Thread Tom Lane
Ron  writes:
> For columns of type bytea which store image data (PDFs, JPGs, etc) would it 
> speed up the ANALYZE process to SET STATISTICS = 0?

Should do, though I'm not sure how much of an effect you'd see.

regards, tom lane



Re: Walsender waiting on SnapbuildSync

2018-08-05 Thread Andres Freund
On 2018-08-04 13:34:04 -0600, Brent Kerby wrote:
> On Postgres 10.3 (on AWS RDS), I am running logical decoding using the
> test_decoding output plugin, and every few minutes I am seeing pauses in
> the stream, unrelated to any large transactions. About once every hour or
> two, the pause is long enough that the database disconnects my client due
> to exceeding wal_sender_timeout (30 seconds -- the RDS default value);
> after reconnecting it is able to make progress again. My client is using
> the streaming replication protocol via pgjdbc (with a status interval of 1
> second). What I'm seeing is that during such a pause, the server is not
> sending any data to the client:
> 
> - pg_stat_replication.sent_lsn stops advancing
> - My client is blocking in a call to PGReplicationStream.read()
> - pg_stat_activity shows that the walsender process has a wait_event of
> 'SnapbuildSync'.
> 
> In this scenario, it makes sense that the client would be timed out: pgjdbc
> only sends feedback to the server at the beginning of a call to
> PGReplicationStream.read(), so if a single call blocks a long time, never
> receiving any data from the server, then the client would stop sending
> feedback to the server, causing timeout.
> 
> My question is why might the server be spending so much time waiting on
> SnapbuildSync? The docs describe this event as follows:
> 
> "IO / SnapbuildSync / Waiting for a serialized historical catalog snapshot
> to reach stable storage."

> I gather that this is related to statistics collection, but I'm not
> understanding why a walsender process would wait on such an event nor why
> it would take such a long time. Any ideas?

It's *not* related to statistics collection. It's fsyncing a few *small*
files to disk. On a first blush this seems to indicate that your storage
system is quite overloaded?

The relevant piece of code is:

/*
 * fsync the file before renaming so that even if we crash after this we
 * have either a fully valid file or nothing.
 *
 * TODO: Do the fsync() via checkpoints/restartpoints, doing it here has
 * some noticeable overhead since it's performed synchronously during
 * decoding?
 */
pgstat_report_wait_start(WAIT_EVENT_SNAPBUILD_SYNC);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;

CloseTransientFile(fd);
errno = save_errno;
ereport(ERROR,
(errcode_for_file_access(),
 errmsg("could not fsync file \"%s\": %m", 
tmppath)));
}
pgstat_report_wait_end();
CloseTransientFile(fd);

Greetings,

Andres Freund



Walsender waiting on SnapbuildSync

2018-08-05 Thread Brent Kerby
On Postgres 10.3 (on AWS RDS), I am running logical decoding using the
test_decoding output plugin, and every few minutes I am seeing pauses in
the stream, unrelated to any large transactions. About once every hour or
two, the pause is long enough that the database disconnects my client due
to exceeding wal_sender_timeout (30 seconds -- the RDS default value);
after reconnecting it is able to make progress again. My client is using
the streaming replication protocol via pgjdbc (with a status interval of 1
second). What I'm seeing is that during such a pause, the server is not
sending any data to the client:

- pg_stat_replication.sent_lsn stops advancing
- My client is blocking in a call to PGReplicationStream.read()
- pg_stat_activity shows that the walsender process has a wait_event of
'SnapbuildSync'.

In this scenario, it makes sense that the client would be timed out: pgjdbc
only sends feedback to the server at the beginning of a call to
PGReplicationStream.read(), so if a single call blocks a long time, never
receiving any data from the server, then the client would stop sending
feedback to the server, causing timeout.

My question is why might the server be spending so much time waiting on
SnapbuildSync? The docs describe this event as follows:

"IO / SnapbuildSync / Waiting for a serialized historical catalog snapshot
to reach stable storage."

I gather that this is related to statistics collection, but I'm not
understanding why a walsender process would wait on such an event nor why
it would take such a long time. Any ideas?

Another thing is that when these pauses occur they are always in between
transactions, i.e., after the client has received a COMMIT message but
before receiving the next BEGIN. And the transactions before and after are
generally normally-sized ones (at most a few kilobytes of WAL), so this
doesn't appear to be related to issues with large transactions that have
been discussed in the past.

- Brent

(Originally posted here:
https://stackoverflow.com/questions/51687322/postgres-walsender-waiting-on-snapbuildsync
)