[GENERAL] json_agg doesn't generate the smallest json possible?

2017-04-27 Thread Glen Huang
Hi all,

I'm trying to use json_agg to generate a JSON array from the rows (which 
contain some json columns) I selected. But it seems the returned JSON contains 
some extraneous whitespace characters. Is there an efficient way to get rid of them?

Also a quick question, if the final result is meant to be sent to a client, 
should I use json_agg or jsonb_agg?
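A quick experiment that may help narrow it down (a sketch only, assuming a 
hypothetical table t with a json column named doc): the json type stores the 
text of each document verbatim, whitespace included, while jsonb re-serializes 
from a normalized representation, so comparing the two aggregate forms shows 
where the extra characters come from.

   SELECT json_agg(doc) FROM t;          -- keeps each doc's original text, whitespace and all
   SELECT jsonb_agg(doc::jsonb) FROM t;  -- jsonb normalizes the stored representation first

Either way the result reaches the client as text, so for a client-bound payload 
the choice mostly comes down to whether you want that normalization.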

Thanks.

Glen



[GENERAL] Logical slot preservation after physical failover

2017-04-27 Thread James Sewell
Hi all,

I am working with the following topology:

   1. Physical replication between server A1 and A2
   2. Physical replication between server B1 and B2
   3. A1 and B1 are masters, and both have a virtual IP attached
   4. Logical replication between A1 and B1 using the virtual IP addresses
   5. On failure of (A1,B1) (A2,B2) will be promoted and the virtual IP
   address moved

I threw together a diagram to help visualize the
environment.

Is there any way to retain the items which are associated with a logical
slot when I fail from A1 to A2 (so the virtual address is attached to A2,
and A2 has been promoted)?
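For reference, the state in question is the per-slot bookkeeping that lives on
A1 and is not advanced on its physical standby; a quick way to see it (column
list as of 9.6, shown only to illustrate what would have to be recreated on A2):

   SELECT slot_name, plugin, slot_type, database,
          active, restart_lsn, confirmed_flush_lsn
     FROM pg_replication_slots
    WHERE slot_type = 'logical';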

I saw that there was some discussion about 'failover slots' and then
'logical replication on standbys' but it's unclear where this is going.

Not having this feature essentially means that logical replication is a toy
feature in enterprise environments (where HA clustering is the general rule),
as it always comes with a high possibility of data loss. This is compounded
by the fact that logical replication seems to send transactions in sequence
(i.e. xid 2 blocks xid 3 until it commits).

Any thoughts / hints / hacks?

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect



Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
P (+61) 2 8099 9000  W www.jirotech.com  F (+61) 2 8099 9099



Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Andres Freund
On 2017-04-28 01:29:14 +0200, Tomas Vondra wrote:
> I can confirm this observation. I bought the Intel 750 NVMe SSD last year,
> the device has 1GB DDR3 cache on it (power-loss protected), can do ~1GB/s of
> sustained O_DIRECT sequential writes. But when running pgbench, I can't push
> more than ~300MB/s of WAL to it, no matter what I do because of
> WALWriteLock.

Hm, interesting.  Even if you up wal_buffers to 128MB, use
synchronous_commit = off, and play with wal_writer_delay/flush_after?
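For anyone wanting to try the same knobs, a minimal sketch (values purely
illustrative, not recommendations; wal_buffers only takes effect after a restart):

   ALTER SYSTEM SET wal_buffers = '128MB';
   ALTER SYSTEM SET synchronous_commit = off;
   ALTER SYSTEM SET wal_writer_delay = '10ms';
   ALTER SYSTEM SET wal_writer_flush_after = '1MB';
   SELECT pg_reload_conf();   -- wal_buffers still needs a full restart to change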

- Andres




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Tomas Vondra

On 04/27/2017 07:35 PM, Andres Freund wrote:

On 2017-04-27 10:29:48 -0700, Joshua D. Drake wrote:

On 04/27/2017 09:34 AM, Andres Freund wrote:

On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:

On 04/27/2017 08:59 AM, Andres Freund wrote:




I would agree it isn't yet a widespread issue.


I'm not yet sure about that actually. I suspect a large
percentage of people with such workloads aren't lingering lots on
the lists.


That would probably be true. I was thinking of it more as: most new
users are in the cloud, and in the cloud it is going to be rare for a
user to be able to hit that level of writes (at least not without
spending LOTS of money).


You can get pretty decent NVMe SSD drives on several cloud
providers these days, without immediately bankrupting you. Sure, it's
instance storage, but with a decent replication and archival setup,
that's not necessarily an issue.

It's not that hard to get to the point where postgres can't keep up
with storage, at least for some workloads.



I can confirm this observation. I bought the Intel 750 NVMe SSD last 
year, the device has 1GB DDR3 cache on it (power-loss protected), can do 
~1GB/s of sustained O_DIRECT sequential writes. But when running 
pgbench, I can't push more than ~300MB/s of WAL to it, no matter what I 
do because of WALWriteLock.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Tomas Vondra

On 04/27/2017 06:34 PM, Andres Freund wrote:

On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:

On 04/27/2017 08:59 AM, Andres Freund wrote:



Ok, based on the, few, answers I've got so far, my experience is
indeed skewed. A number of the PG users I interacted with over
the last couple of years had WAL write rates somewhere in the range
of 500MB/s to 2.2GB/s (max I've seen). At that point WAL insertion
became a major bottleneck, even if storage was more than fast
enough to keep up. To address these we'd need some changes, but
the feedback so far suggests that it's not yet a widespread
issue...


I would agree it isn't yet a widespread issue.


I'm not yet sure about that actually. I suspect a large percentage
of people with such workloads aren't lingering lots on the lists.



To a certain extent, this is a self-fulfilling prophecy. If you know 
you'll have such a busy system, you probably do some research and 
testing first, before choosing the database. If we don't perform well 
enough, you pick something else. Which removes the data point.


Obviously, there are systems that start small and get busier and busier 
over time. And those are the ones we see.


cheers

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] Large data and slow queries

2017-04-27 Thread Samuel Williams
Citus is also now just an extension.

Sounds pretty awesome, I'll certainly consider your system if/when we
decide to make changes.

On 22 April 2017 at 08:41, Andrew Staller  wrote:
> Samuel,
>
> Short answer to your questions:  (1) TimescaleDB and CitusDB are focusing on
> solving different problems, and (2) TimescaleDB is an Apache 2-licensed
> extension to run in your Postgres database, not a fork or different system.
>
> Longer answer to your first question:
>
> From what we've read and the conversations we've had with Citus (awesome
> folks, btw), they're primarily solving a different problem -- OLAP queries
> and distributed transactions -- while we are focused on time-series data
> analysis.  As such, we haven't benchmarked against Citus and if we were to,
> it would require some unrealistic finagling in order to make it an
> apples-to-apples comparison.
>
> As an example, their partitioning is based on a single primary key, while
> all writes in their clustered version go to a cluster master (although their
> hosted version of Citus MX changes this a bit).  We perform two-dimensional
> partitioning using your primary key and time, with the partitions
> automatically created/closed on time intervals and data volume in order to
> keep the most recent partitioned chunks of data (and their B-trees) in
> memory to support fast ingest for time-series data. (***) We have only
> released our single-node version, but the clustering will allow you to write
> directly to any node.  Citus has also decided only to support the SQL
> queries they are optimized for, while we've made the decision to support
> "full SQL" (even if some queries might not be fast).
>
> *** We blogged yesterday about TimescaleDB's partitioning design choices in
> more depth, if you are interested:
> https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c
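> A hedged sketch of what that two-dimensional partitioning looks like from SQL,
> going by TimescaleDB's documented API at the time (the table and column names
> here are made up):
>
>    -- hypothetical table: metrics(time timestamptz, country_id int, value float8)
>    CREATE EXTENSION IF NOT EXISTS timescaledb;
>    SELECT create_hypertable('metrics', 'time', 'country_id', 4);
>    -- chunks are then created automatically along the time and country_id dimensions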
>
>
> On Thu, Apr 20, 2017 at 5:43 PM, Samuel Williams
>  wrote:
>>
>> Andrew, how would timescaledb compare to citus - and is timescaledb an
>> extension to postgres or is it an entirely separate system?
>>
>> On 21 April 2017 at 02:44, Andrew Staller  wrote:
>> > Awesome thread.
>> >
>> > Samuel,
>> >
>> > Just wanted you to be aware of the work we're doing at TimescaleDB
>> > (http://www.timescale.com/), a time-series database extension for
>> > PostgreSQL.
>> >
>> > Some of how we might help you:
>> > - automatic partitioning by space (primary key - like country_id, for
>> > instance) and time. This creates "chunks" of your data, right-sized by
>> > volume and time constraints (which makes inserts fast at much greater
>> > scale
>> > than Vanilla PostgreSQL - not sure if this is a problem for you)
>> > - this will also help if your queries are selective on time and
>> > country_id
>> > (per this example)
>> > - the partitioning by time allows you to DROP old chunks without the
>> > need
>> > for vacuums
>> >
>> > On Thu, Apr 20, 2017 at 8:30 AM, Vick Khera  wrote:
>> >>
>> >> I'm curious why you have so many partial indexes. Are you trying to
>> >> make
>> >> custom indexes per query? It seems to me you might want to consider
>> >> making
>> >> the indexes general, and remove the redundant ones (that have the same
>> >> prefix list of indexed fields).
>> >>
>> >> Secondly your table is 102Gb. Clearly there's a lot of data here. How
>> >> many
>> >> rows does that take? I would further suggest that you partition this
>> >> table
>> >> such that there are no more than about 10 million rows per partition
>> >> (I've
>> >> done this by using a id % 100 computation). Maybe in your case it makes
>> >> sense to partition it based on the "what" field, because it appears you
>> >> are
>> >> trying to do that with your partial indexes already.
>> >>
>> >> On Wed, Apr 19, 2017 at 10:11 PM, Samuel Williams
>> >>  wrote:
>> >>>
>> >>> Okay, so after changing longitude/latitude to float4, and
>> >>> re-organizing the table a bit, I got the query down from about 8
>> >>> minutes to 40 seconds.
>> >>>
>> >>> The details are in the gist comments:
>> >>> https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121
>> >>>
>> >>> Now, just need to get performance another 2 orders of magnitude
>> >>> better. Can we make the index a bit more coarse grained, perhaps
>> >>> convert long/lat to integers or something, use a hilbert index, use a
>> >>> postgis index... ideas?
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > TimescaleDB | Growth & Developer Evangelism
>> > c: 908.581.9509
>> >
>> > 335 Madison Ave.
>> > New York, NY 10017
>> > www.timescale.com
>> > https://github.com/timescale/timescaledb
>>
>>

[GENERAL] Perl Dev 2017 survey results

2017-04-27 Thread
It may interest some to see the Perl Developer 2017 Survey results. Of most 
interest to this crowd is question 20:

Q20 What do you use for databases?

MySQL is the most-used database with 46% (367), followed by PostgreSQL with 
30% (241) and SQLite with 15% (119). Oracle was the choice of 11% (89) of 
respondents.

796 respondents answered this question

Of course, I suspect many don't get to pick their DB despite their preference. 
Fortunately, I get to use Postgres at my current job. :) I had to use MySQL at 
my previous job. :-/

If you want to see all the results, you can find them at 
http://blog.builtinperl.com/post/perl-developer-survey-2017-results---part-1

Kevin




[GENERAL] Cannot spot the difference in two queries

2017-04-27 Thread Rob Sargent
I have three very similar functions, two of which are fast and one is 
slow, and I cannot explain the slowness of the third. All of them, given 
the correct arguments, should return the same "optimal set". I present 
one of the two fast versions and the slow one, in whole and in part. I'm 
using Postgres 9.6 on an Ubuntu 16.04 desktop (16G memory).


I'm confused because the bulk of the difference in the SQL is in the 
"with" clause, and those are equally quick and return the same dataset; 
to my eye the processing after that is the same SQL, but clearly the 
server does not see it that way. I'm missing something obvious, I'm sure, 
but whatever it is is causing a nasty Cartesian product and then sorting 
that out.


(And I get that mm.markerset_id is not needed in the where clause)

Any insights appreciated.
Thanks,
rjs

I have not included table defs, thinking that since they're the same for 
both queries ...


Fast function

   create or replace function seg.optimal_pvalue_set(people_name text,
   markers_name text, chr int)
   returns table (segment_id uuid, optval numeric, firstbase int) as
   $$
   declare
  mkset uuid;
  rcount int;
   begin
  select id into mkset from seg.markerset where name = markers_name
   and chrom = chr;
   --
  create temp table optmarkers on commit drop as
  with segset as (
  select s.id
 , s.chrom
 , s.markerset_id
 , s.startbase
 , s.endbase
 , ((s.events_equal + s.events_greater)/(1.0 *
   (s.events_less + s.events_equal + s.events_greater))) as pval
  from seg.segment s
   join seg.probandset i on s.probandset_id = i.id
   join (select people_id, array_agg(person_id) as persons
 from seg.people_member
 group by people_id) as pa on i.probands <@ pa.persons
   join seg.people o on pa.people_id = o.id
  where
   s.markerset_id = mkset
   and o.name = people_name
  )
  select m.id as mkrid
 , min(ss.pval) as optval
  from segset ss
   join seg.markerset_member mm on ss.markerset_id =
   mm.markerset_id
   join seg.marker m on mm.member_id = m.id
  where
   m.basepos between ss.startbase and ss.endbase
   and m.chrom = ss.chrom
   and mm.markerset_id = mkset   -- 'b474655c-80d2-47e7-bcb5-c65245195888'
  group by m.id;
   --
  get diagnostics rcount = ROW_COUNT;
  raise notice '% segments to optimize', rcount;
   --
  return query
  select s.id as segment_id, o.optval, min(m.basepos) as firstbase
  from optmarkers o
   join seg.marker m on o.mkrid = m.id
   join seg.markerset_member mm on m.id = mm.member_id
   join seg.segment s on mm.markerset_id = s.markerset_id
  where mm.markerset_id = mkset
and m.basepos between s.startbase and s.endbase
and ((s.events_equal + s.events_greater)/(1.0 *
   (s.events_less + s.events_equal + s.events_greater))) = o.optval
  group by s.id, o.optval
  order by firstbase;
   end;
   $$ language plpgsql;

   -- timing the CTE base
   --with, the first
  select count(s.id)
 -- s.id
 -- , s.chrom
 -- , s.markerset_id
 -- , s.startbase
 -- , s.endbase
 -- , ((s.events_equal + s.events_greater)/(1.0 *
 --   (s.events_less + s.events_equal + s.events_greater))) as pval
  from seg.segment s
   join seg.probandset i on s.probandset_id = i.id
   join (select people_id, array_agg(person_id) as persons
 from seg.people_member
 group by people_id) as pa on i.probands <@ pa.persons
   join seg.people o on pa.people_id = o.id
  where
   s.markerset_id = 'ed3b4817-1739-4727-9fac-35d1d63071ea'
   --mkset
   and o.name = '1151704'; -- people_name
 count
   ---
 30762
   (1 row)

   --explain analyze of full CTE portion (that which constructs the
   --temp table in prep for the final query)
   --manually supplying the needed ids and such.
                                   QUERY PLAN
   --------------------------------------------------------------------------
    GroupAggregate  (cost=15412.11..15413.24 rows=65 width=48) (actual time=5462.143..6042.901 rows=52196 loops=1)
      Group Key: m.id
      CTE segset
        ->  Nested Loop  (cost=4820.07..6024.99 rows=310 width=76) (actual time=11.854..449.869 rows=30762 loops=1)
              ->  Seq Scan on probandset i  (cost=0.00..21.27 rows=1 width=16) (actual time=0.007..0.286 rows=57 loops=1)

Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Andres Freund
On 2017-04-27 10:29:48 -0700, Joshua D. Drake wrote:
> On 04/27/2017 09:34 AM, Andres Freund wrote:
> > On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
> > > On 04/27/2017 08:59 AM, Andres Freund wrote:
> > > 
> 
> > > I would agree it isn't yet a widespread issue.
> > 
> > I'm not yet sure about that actually.  I suspect a large percentage of
> > people with such workloads aren't lingering lots on the lists.
> 
> That would probably be true. I was thinking of it more as: most new
> users are in the cloud, and in the cloud it is going to be rare for a
> user to be able to hit that level of writes (at least not without
> spending LOTS of money).

You can get pretty decent NVMe SSD drives on several cloud providers
these days, without immediately bankrupting you.  Sure, it's instance
storage, but with a decent replication and archival setup, that's not
necessarily an issue.

It's not that hard to get to the point where postgres can't keep up with
storage, at least for some workloads.

- Andres




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Joshua D. Drake

On 04/27/2017 09:34 AM, Andres Freund wrote:

On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:

On 04/27/2017 08:59 AM, Andres Freund wrote:




I would agree it isn't yet a widespread issue.


I'm not yet sure about that actually.  I suspect a large percentage of
people with such workloads aren't lingering lots on the lists.


That would probably be true. I was thinking of it more as: most new 
users are in the cloud, and in the cloud it is going to be rare for a 
user to be able to hit that level of writes (at least not without 
spending LOTS of money).






The only people that are likely going to see this are going to be on bare
metal. We should definitely plan on that issue for say 11.


"plan on that issue" - heh. We're talking about major engineering
projects here ;)


Sorry, wasn't trying to make light of the effort. :D





I do have a question though, where you have seen this issue is it with
synchronous_commit on or off?


Both. Whether that matters or not really depends on the workload. If you
have bulk writes, it doesn't really matter much.


Sure, o.k.

Thanks,

Andres



- Andres




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Andres Freund
On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
> On 04/27/2017 08:59 AM, Andres Freund wrote:
> 
> > 
> > Ok, based on the, few, answers I've got so far, my experience is indeed
> > skewed.  A number of the PG users I interacted with over the last couple
> > of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s
> > (max I've seen).  At that point WAL insertion became a major bottleneck,
> > even if storage was more than fast enough to keep up.  To address these
> > we'd need some changes, but the feedback so far suggests that it's not
> > yet a widespread issue...
> 
> I would agree it isn't yet a widespread issue.

I'm not yet sure about that actually.  I suspect a large percentage of
people with such workloads aren't lingering lots on the lists.


> The only people that are likely going to see this are going to be on bare
> metal. We should definitely plan on that issue for say 11.

"plan on that issue" - heh. We're talking about major engineering
projects here ;)


> I do have a question though, where you have seen this issue is it with
> synchronous_commit on or off?

Both. Whether that matters or not really depends on the workload. If you
have bulk writes, it doesn't really matter much.

- Andres




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Joshua D. Drake

On 04/27/2017 08:59 AM, Andres Freund wrote:



Ok, based on the, few, answers I've got so far, my experience is indeed
skewed.  A number of the PG users I interacted with over the last couple
of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s
(max I've seen).  At that point WAL insertion became a major bottleneck,
even if storage was more than fast enough to keep up.  To address these
we'd need some changes, but the feedback so far suggests that it's not
yet a widespread issue...


I would agree it isn't yet a widespread issue.

The only people that are likely going to see this are going to be on 
bare metal. We should definitely plan on that issue for say 11. I do 
have a question though, where you have seen this issue is it with 
synchronous_commit on or off?


Thanks,

JD



--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.




Re: [GENERAL] PostgreSQL upgrade best practices, please share your thoughts

2017-04-27 Thread Adrian Klaver

On 04/27/2017 08:41 AM, Daniel Westermann wrote:

Hi all,

currently I am preparing my session (PostgreSQL upgrade best practices)
for the Swiss PGDay 2017 (http://www.pgday.ch/2017/). What I plan to
come up with is:

- Support policy
- Release notes
- Why it is important to upgrade (or at least to patch to the latest
minor release)
- Minor version upgrades vs major version upgrades
- The importance of version specific directories
- What to take care of with tablespaces
- What to take care of with extensions
- Pain points with (missing) statistics after the upgrade
- Upgrades using pg_dump
- Upgrades using pg_dumpall for global objects and then pg_dump in parallel
- pg_upgrade (check mode, normal mode vs. link mode, upgrading with
streaming replication configurations)
- Getting support (mailing lists)
- PostgreSQL 10: Logical replication

What I would like to ask you:
What would you add to the above list?


That is an ambitious agenda and I sincerely wish you the best on pulling 
it together.


Not sure if these are already covered as sub-points

- Minor version upgrades vs major version upgrades
Make clear that the version numbering that distinguishes this is 
changing with version 10.


- Getting support (mailing lists)
Differentiate between the community version and versions with proprietary 
extensions.


A new point.

Postgres source
The importance of verifying where the different instances of Postgres 
came from. It is not uncommon to see cross-contamination issues because 
someone went from using a self-compiled version to one from a package, or 
pulled packages from different repos.
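A quick, low-risk check that can be run on each instance to see what build it 
is actually running (nothing beyond built-in introspection):

   SELECT version();      -- reports the version plus compiler/build details
   SHOW server_version;
   SHOW data_directory;   -- handy for spotting an instance pointing at an unexpected cluster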




Can you please share what issues you had when upgrading and maybe how
you solved it? I'd love to include as much information as possible.

Thanks in advance
Daniel




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.

2017-04-27 Thread Andres Freund
Hi,

On 2017-04-24 21:17:43 -0700, Andres Freund wrote:
> I've lately seen more and more installations where the generation of
> write-ahead-log (WAL) is one of the primary bottlenecks.  I'm curious
> whether that's primarily a "sampling error" of mine, or whether that's
> indeed more common.
> 
> The primary reason I'm curious is that I'm pondering a few potential
> optimizations, and would like to have some guidance which are more and
> which are less important.
> 
> Questions (answer as many you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?
> - What generates the bulk of WAL on your servers (9.5+ can use
>   pg_xlogdump --stats to compute that)?
> - Are you seeing WAL writes being a bottleneck?
> - What kind of backup methods are you using and is the WAL volume a
>   problem?
> - What kind of replication are you using and is the WAL volume a
>   problem?
> - What are your settings for wal_compression, max_wal_size (9.5+) /
>   checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
> - Could you quickly describe your workload?

Ok, based on the, few, answers I've got so far, my experience is indeed
skewed.  A number of the PG users I interacted with over the last couple
of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s
(max I've seen).  At that point WAL insertion became a major bottleneck,
even if storage was more than fast enough to keep up.  To address these
we'd need some changes, but the feedback so far suggests that it's not
yet a widespread issue...
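For anyone who wants to put a number on their own WAL volume before answering,
a rough sketch using the pre-10 function names (pg_current_xlog_location and
pg_xlog_location_diff; version 10 renames these to the *_wal_* forms):

   -- Run this twice, N seconds apart; the difference in wal_bytes_total
   -- divided by N is the WAL generation rate in bytes per second.
   SELECT now() AS sampled_at,
          pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS wal_bytes_total;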

- Andres




[GENERAL] PostgreSQL upgrade best practices, please share your thoughts

2017-04-27 Thread Daniel Westermann
Hi all, 

currently I am preparing my session (PostgreSQL upgrade best practices) for the 
Swiss PGDay 2017 (http://www.pgday.ch/2017/). What I plan to come up with is: 

- Support policy 
- Release notes 
- Why it is important to upgrade (or at least to patch to the latest minor 
release) 
- Minor version upgrades vs major version upgrades 
- The importance of version specific directories 
- What to take care of with tablespaces 
- What to take care of with extensions 
- Pain points with (missing) statistics after the upgrade 
- Upgrades using pg_dump 
- Upgrades using pg_dumpall for global objects and then pg_dump in parallel 
- pg_upgrade (check mode, normal mode vs. link mode, upgrading with streaming 
replication configurations) 
- Getting support (mailing lists) 
- PostgreSQL 10: Logical replication 

What I would like to ask you: 
What would you add to the above list? 
Can you please share what issues you had when upgrading and maybe how you 
solved it? I'd love to include as much information as possible. 

Thanks in advance 
Daniel 



Re: [GENERAL] Query which "never" terminates

2017-04-27 Thread Viktor Fougstedt

Hello,

and thanks for your reply.

SET from_collapse_limit=1; 

made no difference, I'm afraid.

I ran both VACUUM and ANALYZE before I posted, and there was no noticeable 
difference from either of them.


Regards,
/Viktor

> On 27 apr 2017, at 15:54, pinker  wrote:
> 
> Try changing the from_collapse_limit value first; for instance, run SET
> from_collapse_limit = 1; and then your query in the same session.
> Have you run ANALYZE lately?
> 
> 
> 




Re: [GENERAL] Postgres connection Monitor

2017-04-27 Thread Jan de Visser
On Thu, Apr 27, 2017 at 6:18 AM, basti  wrote:
> Hello, we have a Postgres database server and around it there are 8
> clients getting data from it.
>
> All servers are in the same datacenter, I think (traceroute shows only 4-5 hops).
>
> Now we plan to move the server to another hoster/datacenter.
>
> I have done simple tests with ping and traceroute.
> Is there a way to monitor "traffic times" in Postgres?
> How long would a query take to be answered?

You could write a tiny little C program which connects and issues a
PQping(), and measures that.
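Another low-effort option, as a rough sketch: psql's \timing reports
per-statement wall-clock time as seen from the client, so a trivial query
approximates the network round trip plus parse/execute overhead:

   \timing on
   SELECT 1;   -- the reported 'Time:' now includes the client-server round trip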




Re: [GENERAL] Query which "never" terminates

2017-04-27 Thread pinker
Try changing the from_collapse_limit value first; for instance, run SET
from_collapse_limit = 1; and then your query in the same session.
Have you run ANALYZE lately?







[GENERAL] Query which "never" terminates

2017-04-27 Thread Viktor Fougstedt
Dear sirs.

I have an issue which I am at a loss to solve. Because of the behaviour I am 
not certain that this is a "slow query" problem, so I am sending my question here.

My question regards a query using deeply nested subqueries. EXPLAIN gives a 
maximum cost of about 2.000 (details below), but the query runs "forever" (at 
least 3 hours). It takes 100% CPU on the machine while running. It is expected 
to produce 0 rows. I have autovacuum enabled, but have also manually VACUUMed 
and ANALYZEd the database.

I am using "PostgreSQL 9.4.9 on x86_64-apple-darwin14.5.0, compiled by Apple 
LLVM version 7.0.2 (clang-700.1.81), 64-bit)".

The query has worked as expected before, but stopped working. See below for 
the query, the EXPLAIN output, the definitions of all tables, the tables' 
sizes, and a gdb backtrace of the "hung" process. I apologize for the SQL, 
which is horribly ugly. It is in part automatically generated, and therefore 
looks like . I have tried to format it in as readable a form as possible.

I have manually executed the innermost subquery ("SELECT n.account, ..."), and 
it produces 1 row in negligible time. The next levels out ("SELECT s3.nid, ...", 
"SELECT s4.nid, ..." and "SELECT i.assignment, ...") all produce 0 rows and run 
fine.

I have run "CREATE TABLE tmptbl AS SELECT i.assignment, ...", running the 
topmost subquery and all its subqueries. They produce 0 rows in negligible 
time. If I replace the subquery with a select from that temporary table, 
"SELECT s6.nid AS orgunit, i.tags_point AS tags_point FROM tmptbl, ...", that 
query runs fine, producing 0 rows in notime.

It is only when I use the complete query that it hangs - the individual parts 
all run.

A "select * from pg_stat_activity where waiting='t'” while the query is running 
produces 0 rows.

I also think that locking problems are unlikely, since the query works if I 
split it in two and use a table to store
intermediate output (which is 0 rows, BTW). Unless an inner query can hang on a 
lock held by an outer one.

The EXPLAIN (since it does not terminate I cannot run EXPLAIN ANALYZE): 
https://explain.depesz.com/s/Ex2z

See below for all the ugly details.

Thank you in advance, for all and any help.
/Viktor


SELECT s6.nid AS orgunit, i.tags_point AS tags_point
  FROM (SELECT i.assignment AS assignment, i.tags_point AS tags_point
  FROM (SELECT s4.nid AS assignment, i.tags_point AS tags_point
  FROM (SELECT s3.nid AS person, i.tags_point AS tags_point
  FROM (SELECT n.account AS account, tp.point AS 
tags_point
  FROM "pdb4_ad"."account_name" n,
   UNNEST(ARRAY[0]) AS injected_points,
   tags_point tp
 WHERE injected_points = tp.point
   AND int8range(n._r_from, n._r_to) @> 
tp.rev
   AND (n._value = '__bootstrap__')
   AND tsrange(n._start, n._stop) @> tp.time
   ) i,
   tags_point AS tp,
   "pdb4_ad"."account" AS s1,
   "pdb4_ad"."identifies" AS s2,
   "pdb4_person"."person" AS s3
 WHERE i.tags_point = tp.point
   AND i.account = s1.nid
   AND s1.nid = s2."account"
   AND s2."person" = s3.nid
   AND INT8RANGE(s1._r_from, s1._r_to) @> tp.rev
   AND TSRANGE(s1._start, s1._stop) @> tp.time
   AND INT8RANGE(s2._r_from, s2._r_to) @> tp.rev
   AND TSRANGE(s2._start, s2._stop) @> tp.time
   AND INT8RANGE(s3._r_from, s3._r_to) @> tp.rev
   AND TSRANGE(s3._start, s3._stop) @> tp.time
   ) i,
   tags_point AS tp,
   "pdb4_person"."person" AS s1,
   "pdb4_org"."assignment" AS s2,
   "pdb4_org"."assignment_active" AS s3,
   "pdb4_org"."assignment" AS s4
WHERE i.tags_point = tp.point
  AND i.person = s1.nid
  AND s1.nid = s2."person"
  AND s2.nid = s3."assignment"
  AND s3."assignment" = s4.nid
  AND INT8RANGE(s1._r_from, s1._r_to) @> tp.rev
  AND TSRANGE(s1._start, s1._stop) @> tp.time
  AND INT8RANGE(s2._r_from, s2._r_to) @> tp.rev
  AND TSRANGE(s2._start, s2._stop) @> tp.time
  AND INT8RANGE(s3._r_from, s3._r_to) @> tp.rev
  AND TSRANGE(s3._start, s3._stop) @> tp.time
  AND INT8RANGE(s4._r_

[GENERAL] Postgres 9.6 Parallel Query

2017-04-27 Thread basti
Hello,

is there a recommendation for setting max_worker_processes and
max_parallel_workers_per_gather per CPU core or thread?
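There is no single rule, but as a hedged starting point (illustrative values
for an 8-core machine, not a recommendation; tune against your own workload):

   ALTER SYSTEM SET max_worker_processes = 8;             -- often set near the core count; needs a restart
   ALTER SYSTEM SET max_parallel_workers_per_gather = 2;  -- 9.6 ships with 0, i.e. parallel query disabled
   SELECT pg_reload_conf();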

Best regards




[GENERAL] Postgres connection Monitor

2017-04-27 Thread basti
Hello, we have a Postgres database server and around it there are 8
clients getting data from it.

All servers are in the same datacenter, I think (traceroute shows only 4-5 hops).

Now we plan to move the server to another hoster/datacenter.

I have done simple tests with ping and traceroute.
Is there a way to monitor "traffic times" in Postgres?
How long would a query take to be answered?



