Re: [GENERAL] Restoring tables with circular references dumped to separate files

2017-10-20 Thread Uwe
I would alter the tables and remove the foreign key constraints, restore the
data, and add the constraints back. If the data is consistent, adding the
foreign keys should work without error.
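
For illustration, the steps might look something like this (table and
constraint names here are made up, not taken from the thread):

    -- before restoring, drop the two constraints that point at each other
    ALTER TABLE table_a DROP CONSTRAINT table_a_b_id_fkey;
    ALTER TABLE table_b DROP CONSTRAINT table_b_a_id_fkey;

    -- restore both dump files, then put the constraints back
    ALTER TABLE table_a ADD CONSTRAINT table_a_b_id_fkey
        FOREIGN KEY (b_id) REFERENCES table_b (id);
    ALTER TABLE table_b ADD CONSTRAINT table_b_a_id_fkey
        FOREIGN KEY (a_id) REFERENCES table_a (id);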


On Friday, October 20, 2017 8:15:27 PM PDT doganmeh wrote:
> I was dumping each table to a separate file so I could pick and choose when
> restoring. However, it seems this was not a great idea, since two of my tables
> happened to reference each other via FOREIGN KEYs, and I am not able to
> restore them. Is there a way to do this without manually merging the dump
> files? Thanks for guidance in advance.
> 
> 
> 
> --
> Sent from:
> http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




[GENERAL] Restoring tables with circular references dumped to separate files

2017-10-20 Thread doganmeh
I was dumping each table to a separate file so I could pick and choose when
restoring. However, it seems this was not a great idea, since two of my tables
happened to reference each other via FOREIGN KEYs, and I am not able to
restore them. Is there a way to do this without manually merging the dump
files? Thanks for guidance in advance. 



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird performance difference

2017-10-20 Thread Tom Lane
Israel Brewster  writes:
>> Can you send "explain" (not explain analyze) for the production server?

> Not for the full query - it only just completed, after 70 minutes or so, and 
> I wasn't running under EXPLAIN ANALYZE. Running with a shorter date range of 
> only 7 days, as you suggest below: https://explain.depesz.com/s/r80j 
> 

First thing that jumps out from that is

> Foreign Scan on oag_schedules (cost=100.00..128.60 rows=620 width=108) 
> (actual time=3.576..477.524 rows=79,853 loops=1)

Being off by a factor of 100 at the scan level is never a good start for a
join plan.  Turn on use_remote_estimate (assuming these are postgres_fdw
tables).  Also try explicitly ANALYZE'ing the foreign tables.  I do not
believe auto-analyze will touch foreign tables ...
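
For example (a sketch; "localhost" is the server name shown in the \d output
elsewhere in this thread, and the option can also be set per foreign table
with ALTER FOREIGN TABLE, or with SET instead of ADD if it already exists):

    ALTER SERVER localhost OPTIONS (ADD use_remote_estimate 'true');
    ANALYZE oag_schedules;
    ANALYZE oag_batches;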

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird performance difference

2017-10-20 Thread Israel Brewster
On Oct 20, 2017, at 3:31 PM, Justin Pryzby  wrote:
> 
> On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote:
>> Summary: the following query takes around 12 seconds on my test machine. On 
>> my production machine, it's at half an hour and counting. What's going on?
>> 
>> which, when run on my test server, has this explain analyze output: 
>> https://explain.depesz.com/s/4piv . 
>> Around 12 second runtime, which isn't too bad (in the grand scheme of 
>> things), although there is probably room for improvement.
> 
> Are these casts to ::date really needed? (Alternately, do you have an
> index on column::date?)
> |WHERE outtime::date>='2017-01-01'
> |ON outtime::date BETWEEN oag_schedules.startdate
> |AND outtime::date BETWEEN oag_batches.eff_from

Possibly not, but outtime is a timestamp while startdate and eff_from are
simple date fields; I was being explicit :-) And no, I don't have an index on
column::date - that's certainly something I can try in order to optimize
things.
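
A sketch of what that could look like (not from the thread; the zone name is a
placeholder). Because outtime is timestamptz, a bare outtime::date cast is
only STABLE, so an expression index has to pin a time zone and the query must
repeat the identical expression:

    CREATE INDEX legdetail_outdate_idx
        ON legdetail (((outtime AT TIME ZONE 'America/Anchorage')::date));

Alternatively, dropping the cast in the WHERE clause (outtime >= '2017-01-01')
would let the existing outtime_idx be considered.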

> The problem is clearly here:
> 
> Merge Join (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual 
> time=1,153.883..9,812.434 ROWS=3,420,235 loops=1)
>    Merge Cond: (((oag_schedules.flightnum)::text = (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = (legdetail.legfrom)::text))
>Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND 
> ((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, 
> 'infinity'::date)) AND (date_part('isodow'::text, 
> ((legdetail.outtime)::date)::timestamp without time zone) = ANY 
> ((oag_schedules.frequency)::double precision[])))
>ROWS REMOVED BY JOIN FILTER: 6822878
> 
> Can you send "explain" (not explain analyze) for the production server?

Not for the full query - it only just completed, after 70 minutes or so, and I 
wasn't running under EXPLAIN ANALYZE. Running with a shorter date range of only 
7 days, as you suggest below: https://explain.depesz.com/s/r80j 


I notice that this is a COMPLETELY different query plan from what I got on my
test machine, but I'm not sure if that's a good thing or not. The end result
obviously isn't good.

> 
> And \d for those tables.

Sure:

flightlogs=# \d legdetail
Table "public.legdetail"
   Column   |   Type   |   Modifiers

+--+
 id | integer  | not null default 
nextval('legdetail_id_seq'::regclass)
 logid  | integer  | 
 flightnum  | character varying(32)| 
 legfrom| character varying(6) | 
 legto  | character varying(6) | 
 pax| integer  | default 0
 cargo  | integer  | default 0
 legdate| date | default now()
 outtime| timestamp with time zone | not null
 ontime | timestamp with time zone | 
 offtime| timestamp with time zone | 
 intime | timestamp with time zone | 
 blocktime  | interval | 
 flighttime | interval | 
 mail   | integer  | default 0
 csom   | character(1) | 
 nonrevpax  | smallint | 
 legsic | character varying(128)   | 
 nonrevfrt  | integer  | 
 ti | boolean  | default false
Indexes:
"legdetail_pkey" PRIMARY KEY, btree (id)
"csom_idx" btree (csom)
"flightnum_idx" btree (flightnum)
"legdate_idx" btree (legdate)
"legfrom_idx" btree (legfrom)
"legto_idx" btree (legto)
"logid_idx" btree (logid)
"outtime_idx" btree (outtime)
Foreign-key constraints:
"logfk" FOREIGN KEY (logid) REFERENCES logs(id)
Triggers:
legdetail_audit AFTER UPDATE ON legdetail FOR EACH ROW WHEN (old.* <> 
new.*) EXECUTE PROCEDURE changeaudit()
update_hours_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH 
ROW EXECUTE PROCEDURE update_hours()
update_logtime_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR 
EACH ROW EXECUTE PROCEDURE update_logtime()
update_status_legs AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH 
ROW EXECUTE PROCEDURE acstatupdate_loc()

flightlogs=# \d oag_batches
 Foreign table "public.oag_batches"
  Column  |   Type   | Modifiers |   FDW Options
--+--+---+--
 batchid  | uuid | not null  | (column_name 'batchid')
 name | character varying(256)   | not null  | (column_name 'name')
 created  | timestamp with time zone |   | (column_name 'created')
 eff_from | date |   | (column_name 'eff_from')
 eff_to   | date |   | (column_name 'eff_to')
Server: localhost
FDW Options: (schema_name 'pub

Re: [GENERAL] Weird performance difference

2017-10-20 Thread Justin Pryzby
On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote:
> Summary: the following query takes around 12 seconds on my test machine. On 
> my production machine, it's at half an hour and counting. What's going on?
> 
> which, when run on my test server, has this explain analyze output: 
> https://explain.depesz.com/s/4piv . Around 
> 12 second runtime, which isn't too bad (in the grand scheme of things), 
> although there is probably room for improvement.

Are these casts to ::date really needed? (Alternately, do you have an
index on column::date?)
|WHERE outtime::date>='2017-01-01'
|ON outtime::date BETWEEN oag_schedules.startdate
|AND outtime::date BETWEEN oag_batches.eff_from

The problem is clearly here:

 Merge Join (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual 
time=1,153.883..9,812.434 ROWS=3,420,235 loops=1)
    Merge Cond: (((oag_schedules.flightnum)::text = (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = (legdetail.legfrom)::text))
Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND 
((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, 
'infinity'::date)) AND (date_part('isodow'::text, 
((legdetail.outtime)::date)::timestamp without time zone) = ANY 
((oag_schedules.frequency)::double precision[])))
ROWS REMOVED BY JOIN FILTER: 6822878

Can you send "explain" (not explain analyze) for the production server?

And \d for those tables.

And/or EXPLAIN ANALYZE for a query with shorter date range on production (to
confirm it has a similar problem in rowcount estimation).

You can try munging the query to move/change the "Join Filter" components of
the query to see which one is contributing most to the rowcount estimate being
off by a factor of 3e6.

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Weird performance difference

2017-10-20 Thread Israel Brewster
Summary: the following query takes around 12 seconds on my test machine. On my
production machine, it's at half an hour and counting. What's going on?

Details: As a first stab at getting some data I need, I've developed the
following SQL query:

SELECT
    legfrom,
    count(*) as totaldeps,
    count(*) filter (where depdly<='00:00:00'::interval) as d0,
    count(*) filter (where depdly<='00:05:00'::interval) as d5,
    count(*) filter (where depdly<='00:14:00'::interval) as d14,
    count(*) filter (where depdly<='00:30:00'::interval) as d30,
    count(*) filter (where legto=destination) as totalarriv,
    count(*) filter (where arrdly<='00:00:00'::interval AND legto=destination) as a0,
    count(*) filter (where arrdly<='00:05:00'::interval AND legto=destination) as a5,
    count(*) filter (where arrdly<='00:14:00'::interval AND legto=destination) as a14,
    count(*) filter (where arrdly<='00:30:00'::interval AND legto=destination) as a30,
    count(*) filter (where legto!=destination) as div
FROM
(SELECT distinct on (legdetail.flightnum, legfrom, outtime)
    legdetail.flightnum,
    legfrom,
    legto,
    outtime,
    intime,
    depart_time,
    arrival_time,
    destination,
    outtime::time-oag_schedules.depart_time as depdly,
    intime::time-oag_schedules.arrival_time as arrdly
FROM oag_batches
NATURAL JOIN oag_schedules
INNER JOIN legdetail
    ON outtime::date BETWEEN oag_schedules.startdate AND coalesce(oag_schedules.enddate,'infinity')
    AND outtime::date BETWEEN oag_batches.eff_from AND coalesce(oag_batches.eff_to,'infinity')
    AND extract(isodow from outtime::date)=ANY(frequency)
    AND oag_schedules.flightnum::text=legdetail.flightnum
    AND oag_schedules.origin=legdetail.legfrom
WHERE
    outtime::date>='2017-01-01'
    AND legdetail.flightnum~E'^[0-9]*$'
    AND legdetail.flightnum::integer < 8000
    AND csom='S'
ORDER BY legdetail.flightnum, legfrom, outtime, oag_batches.eff_from DESC) s1
GROUP BY legfrom
ORDER BY legfrom;

which, when run on my test server, has this explain analyze output:
https://explain.depesz.com/s/4piv. Around 12 second runtime, which isn't too
bad (in the grand scheme of things), although there is probably room for
improvement.

oag_batches and oag_schedules are postgres_fdw foreign tables in the same
database cluster (different DB, same host/server instance). oag_schedules has
about 80,000 rows, legdetail has about 60,000 rows, and oag_batches has 125
rows, so while we are dealing with a fair amount of data, it's nothing
extreme - I have tables with billions of rows that it handles just fine.

With the query comfortably running on my test machine, I went ahead and tried
it out on my production machine. And I waited. And waited. And waited some
more. As of the writing of this line, pg_stat_activity shows that the query
has been running for over 30 minutes. It also shows a status of "active", and
nothing under the wait_event column. A second entry in pg_stat_activity shows
the foreign data wrapper connection, with a state of "idle in transaction" and
again nothing under the wait_event column. Top shows the two processes using
around 62% and 45% CPU, with the rest of the machine being 96% idle, so they
are doing *something*, but not even enough to max out one of the cores, nor is
the machine as a whole maxed out. The PostgreSQL log shows plenty of activity
(and none of the users of the database are complaining), so apparently the
database as a whole is working fine - it's just this one query that's not
completing.

Further, iotop shows no significant disk access - read is a solid 0.00 B/s,
and write is only around 80 K/s, so it's not like it's waiting on disk i/o.

Both test server and production are running PostgreSQL 9.6, although my test
server is at 9.6.1 while production is at 9.6.5 (huh, production got ahead of
test. I'll have to fix that. Oh well), and I did a dump from production to
test using -C just before starting work on the query, so the datasets are
pretty closely matched. The production server is configured to take full(er)
advantage of the hardware than my test database, so if anything I would expect
it to be faster. Hardware differences obviously can have an effect on query
speed, but I wouldn't think they could explain a difference of 12 seconds
versus over half an hour (and counting). So what could explain this huge
discrepancy?

One difference I can think of is that my production database is replicating
(streaming replication), but since this is just a select, I can't think why
that would make a difference... any thoughts?
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

Re: [GENERAL] How to find out extension directory

2017-10-20 Thread Joe Conway
On 10/20/2017 02:10 PM, Tom Lane wrote:
> "David G. Johnston"  writes:
>> On Fri, Oct 20, 2017 at 1:12 PM, rakeshkumar464 
>> wrote:
>>> How do I know beforehand where the dir path is ?
> 
>> I think pg_config (
>> https://www.postgresql.org/docs/current/static/app-pgconfig.html ) is what
>> you are looking for.
> 
> specifically, you can assume it's the "extension" subdirectory of
> whatever SHAREDIR is.  "pg_config --sharedir" will tell you that.

And starting with pg 9.6 you can also do:

test=# select * from pg_config where name = 'SHAREDIR';
   name   |   setting
--+--
 SHAREDIR | /usr/local/pgsql-REL_10_STABLE/share
(1 row)
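
Combining that with the "extension" subdirectory point quoted above gives the
full path in one query (a sketch):

    SELECT setting || '/extension' AS extension_dir
      FROM pg_config
     WHERE name = 'SHAREDIR';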


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [GENERAL] How to find out extension directory

2017-10-20 Thread Tom Lane
"David G. Johnston"  writes:
> On Fri, Oct 20, 2017 at 1:12 PM, rakeshkumar464 
> wrote:
>> How do I know beforehand where the dir path is ?

> I think pg_config (
> https://www.postgresql.org/docs/current/static/app-pgconfig.html ) is what
> you are looking for.

specifically, you can assume it's the "extension" subdirectory of
whatever SHAREDIR is.  "pg_config --sharedir" will tell you that.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Logical replication - behavior of REFRESH PUBLICATION's copy_data option

2017-10-20 Thread Adam LaMore
Hi all,

I'm exploring the new PG10 logical replication feature and trying to
understand how ALTER SUBSCRIPTION ... REFRESH PUBLICATION works.

My planned publication will have over 100 tables, some of which are quite
large. If I add a table to the publication, I understand that I have to use
the above command on the subscriber to refresh its subscription. The
default value of the copy_data option is true, but I'm unclear if that will
(A) re-copy all of the data for all of the tables in the publication; (B)
copy the data for only the newly added table; or (C) some other behavior.
So does anyone know which one it is?

If (A) is true, then I assume I can avoid that by setting copy_data =
false. If I do that, how do I copy only the data for the newly added table
from the publisher to the subscriber? Perhaps an out-of-band dump and
restore of just that table? And should that be before or after the call to
"REFRESH PUBLICATION?

Thanks in advance,
Adam


Re: [GENERAL] How to find out extension directory

2017-10-20 Thread David G. Johnston
On Fri, Oct 20, 2017 at 1:12 PM, rakeshkumar464 
wrote:

> I am documenting on automating installation of pgaudit extension for
> containers.  On my laptop I see that the directory where the files
> pgaudit.control and pgaudit--1.2.sql needs to be present is
>
> /usr/share/postgresql/10/extension.
>
> How do I know beforehand where the dir path is ?
>

I think pg_config (
https://www.postgresql.org/docs/current/static/app-pgconfig.html ) is what
you are looking for.

David J.


Re: [GENERAL] How to find out extension directory

2017-10-20 Thread Melvin Davidson
On Fri, Oct 20, 2017 at 4:12 PM, rakeshkumar464 
wrote:

> I am documenting on automating installation of pgaudit extension for
> containers.  On my laptop I see that the directory where the files
> pgaudit.control and pgaudit--1.2.sql needs to be present is
>
> /usr/share/postgresql/10/extension.
>
> How do I know beforehand where the dir path is ?
>
>
>
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-
> f1843780.html
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

> How do I know beforehand where the dir path is ?

That would depend on the O/S, but it is not really necessary to know the
path, as PostgreSQL will know it.
Simply establish a connection to the database you want to create the
extension in and do

*CREATE EXTENSION extension_name;*

https://www.postgresql.org/docs/10/static/sql-createextension.html

-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


[GENERAL] How to find out extension directory

2017-10-20 Thread rakeshkumar464
I am documenting how to automate installation of the pgaudit extension for
containers.  On my laptop I see that the directory where the files
pgaudit.control and pgaudit--1.2.sql need to be present is

/usr/share/postgresql/10/extension.

How do I know beforehand where the dir path is ?




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tgrm index for word_similarity

2017-10-20 Thread Igal @ Lucee.org

On 10/19/2017 4:54 PM, Igal @ Lucee.org wrote:
I want to use Postgres for a fuzzy auto-suggest search field.  As the
user types their search phrase, Postgres will show a list of items that
fuzzy-match what they have typed so far, ordered by popularity
(ntile(20)) and distance, i.e. 1 - word_similarity().


I created a Materialized View with two columns: name text, popularity int.

My query at the moment is:

    SELECT name, popularity
    FROM   temp.items3_v
          ,(values ('some phrase'::text)) consts(input)
    WHERE  true
        -- be lenient as some names are 75 characters long and we want to
        -- match even on a few characters of input
        and word_similarity(input, name) > 0.01
    ORDER BY 2, input <<-> name

I tried to add a GIN trgm index on `name`:

    CREATE INDEX temp_items3_v_tgrm_item_name ON temp.items3_v USING 
GIN(name gin_trgm_ops);


But it is not used.

What index would be good for that kind of query?


I see that when I use LIKE or ILIKE the index is used, but I lose all of 
the "fuzzy" benefits by doing that.


Is there any type of INDEX, or even a way of building my own COLUMN of
trigrams, that can help speed up my word_similarity() results?  When used for
auto-suggest there are usually several queries from each user in a relatively
short period of time, so speed is important.
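
A sketch of one possible rewrite (not from the thread) that should let a
trigram index be used, assuming pg_trgm 1.3+ (PostgreSQL 9.6), where the <%
operator is the index-searchable form of a word_similarity() threshold test,
controlled by pg_trgm.word_similarity_threshold:

    SET pg_trgm.word_similarity_threshold = 0.01;

    SELECT name, popularity
    FROM   temp.items3_v
          ,(values ('some phrase'::text)) consts(input)
    WHERE  input <% name
    ORDER BY 2, input <<-> name;

Ordering by the <<-> distance itself can only be index-assisted by a GiST
index (gist_trgm_ops), not GIN.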


Thanks,


Igal



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Replication stops under certain circumstances

2017-10-20 Thread Kim Rose Carlsen
Hi

I have some strange issues with a postgresql read replica that seems to stop 
replicating under certain circumstances.

Whenever we have changes to our views, we have a script that drops all views
and reloads them from scratch with the new definitions. The reloading of the
views happens in a transaction to avoid confusion for everyone using the
database. When this update gets to the slave, it seems there is a chance for a
deadlock to occur that doesn't get detected.

As I was trying to reproduce this behavior, I ran into another weird situation
that I don't entirely understand. The symptom is the same (replication stops),
but it looks quite different. This example won't reproduce the issue reliably,
but after a few hours I get a slave that won't continue to replicate until I
restart it. The queries in the example won't make much sense, and I don't know
if they can be simplified further and still cause the "desired" effect.

Setup:
Launch a new RDS psql instance (9.6.2) on AWS (will be referred to as 
db-master) and create a read replica (will be referred to as db-slave). The 
following options are changed from AWS default:
  
max_standby_streaming_delay=-1
hot_standby_feedback=1


On the master create 2 dummy tables:
create table a  (id serial primary key);
create table b  (id serial primary key);

Setup thread 1 to do work on master:

while true; do psql -h db-master -U postgres db -c 'begin; drop view if exists 
view_a cascade; drop view if exists view_b; drop view if exists view_c; create 
view view_a as select * from a; create view view_b as select * from b; create 
view view_c as select * from view_a join view_b using (id); insert into a 
values (default); insert into b values (default); commit;'; done

Setup thread 2 to do work on Slave:
while true; do psql -h  db-slave -U postgres db -c 'begin; select * from view_c 
order by random() limit 10; select * from view_a order by random() limit 10;'; 
done

Setup thread 3 to do more work on slave:
while true; do psql -h  db-slave -U postgres db -c 'begin; select * from view_b 
order by random() limit 10; select * from view_a order by random() limit 10;'; 
done

Every now and then a deadlock is detected and one connection is aborted; this
works as expected. But after a while (several hours) it becomes impossible to
connect to the db on db-slave, and threads 2 and 3 stop producing output. When
trying to connect, the psql client just hangs. However, it is possible to
connect to the template1 database to get a look at what is going on.



template1=> select * from pg_stat_activity;
-[ RECORD 1 ]+
datid    | 16384
datname  | rdsadmin
pid  | 7891
usesysid | 10
usename  | rdsadmin
application_name | 
client_addr  | 
client_hostname  | 
client_port  | 
backend_start    | 
xact_start   | 
query_start  | 
state_change | 
wait_event_type  | 
wait_event   | 
state    | 
backend_xid  | 
backend_xmin | 
query    | 
-[ RECORD 2 ]+
datid    | 1
datname  | template1
pid  | 11949
usesysid | 16388
usename  | hiper
application_name | psql
client_addr  | 192.168.10.166
client_hostname  | 
client_port  | 41002
backend_start    | 2017-10-20 16:30:26.032745+02
xact_start   | 2017-10-20 16:30:34.306418+02
query_start  | 2017-10-20 16:30:34.306418+02
state_change | 2017-10-20 16:30:34.306421+02
wait_event_type  | 
wait_event   | 
state    | active
backend_xid  | 
backend_xmin | 26891
query    | select * from pg_stat_activity;





There are no active connections except rdsadmin from AWS.



template1=> select * from pg_locks;
  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------
 virtualxid |          |          |      |       | 3/929      |               |         |       |          | 3/929              |  9640 | ExclusiveLock   | t       | t
 relation   |    16390 |     2659 |      |       |            |               |         |       |          | 4/829              |  9639 | AccessShareLock | t       | t
 relation   |    16390 |     1249 |      |       |            |               |         |       |          | 4/829              |  9639 | AccessShareLock | t       | t
 virtualxid |          |          |      |       | 4/829      |               |         |       |          | 4/829              |  9639 | ExclusiveLock   | t       | t
 relation   |        1 |    11695 |      |       |            |               |         |       |          | 5/148              | 11949 | AccessShareLock | t

[GENERAL] Issues shutting down PostgreSQL 10 cleanly

2017-10-20 Thread Stephen Froehlich
I am in the midst of heavy, large write operations on a new database that is
currently around 23 billion rows.

My max_wal_size is 256 GB and my checkpoint_timeout is 12 hours. (Smaller 
values were hindering the writes.) Note: this is on Ubuntu 16.04 LTS.
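
For reference, those values correspond to settings along these lines (a sketch
using ALTER SYSTEM; the originals may simply be edits to postgresql.conf):

    ALTER SYSTEM SET max_wal_size = '256GB';
    ALTER SYSTEM SET checkpoint_timeout = '12h';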

I am having trouble shutting down postgres cleanly - how can I fix that?

A "/etc/init.d/postgresql restart" command gave the following log entries:

2017-10-20 09:17:55.069 MDT [16388] LOG:  received fast shutdown request
2017-10-20 09:17:55.069 MDT [16388] LOG:  aborting any active transactions
2017-10-20 09:17:55.069 MDT [20336] FATAL:  terminating autovacuum process due 
to administrator command
2017-10-20 09:17:55.069 MDT [20540] FATAL:  terminating autovacuum process due 
to administrator command
2017-10-20 09:17:55.069 MDT [20132] FATAL:  terminating autovacuum process due 
to administrator command
2017-10-20 09:17:55.070 MDT [1486] sfroehlich@mediacom_df_vd FATAL:  
terminating connection due to administrator command
2017-10-20 09:17:55.070 MDT [1488] sfroehlich@mediacom_df_vd FATAL:  
terminating connection due to administrator command
2017-10-20 09:17:55.070 MDT [1487] sfroehlich@mediacom_df_vd FATAL:  
terminating connection due to administrator command
2017-10-20 09:17:55.070 MDT [1489] sfroehlich@mediacom_df_vd FATAL:  
terminating connection due to administrator command
2017-10-20 09:17:55.070 MDT [1485] sfroehlich@mediacom_df_vd FATAL:  
terminating connection due to administrator command
2017-10-20 09:17:55.074 MDT [16388] LOG:  worker process: logical replication 
launcher (PID 16400) exited with exit code 1
2017-10-20 09:19:08.363 MDT [16395] LOG:  shutting down
2017-10-20 09:20:29.559 MDT [32352] LOG:  listening on IPv4 address "0.0.0.0", 
port 5432
2017-10-20 09:20:29.559 MDT [32352] LOG:  listening on IPv6 address "::", port 
5432
2017-10-20 09:20:29.560 MDT [32352] LOG:  listening on Unix socket 
"/var/run/postgresql/.s.PGSQL.5432"
2017-10-20 09:20:57.956 MDT [32454] LOG:  database system shutdown was 
interrupted; last known up at 2017-10-20 09:19:08 MDT
2017-10-20 09:21:03.598 MDT [32454] LOG:  database system was not properly shut 
down; automatic recovery in progress
2017-10-20 09:21:03.646 MDT [32454] LOG:  redo starts at 972/DB0B0838
pg_ctl: server did not start in time
2017-10-20 09:21:29.645 MDT [32352] LOG:  received smart shutdown request
2017-10-20 09:21:29.735 MDT [32558] LOG:  shutting down
2017-10-20 09:21:30.019 MDT [32352] LOG:  database system is shut down

Then single-user-mode restore gave the following output.

postgres@pisa:/usr/lib/postgresql/10/bin$ ./postgres --single -D 
/etc/postgresql/10/main/ -P -d 1
2017-10-20 09:24:59.669 MDT [33361] DEBUG:  mmap(35192307712) with MAP_HUGETLB 
failed, huge pages disabled: Cannot allocate memory
2017-10-20 09:25:00.077 MDT [33361] LOG:  database system was shut down in 
recovery at 2017-10-20 09:21:29 MDT
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  checkpoint record is at 97F/4F7E7168
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  redo record is at 972/DB0B0838; 
shutdown FALSE
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  next transaction ID: 0:1194772; 
next OID: 966656
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  next MultiXactId: 1; next 
MultiXactOffset: 0
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  oldest unfrozen transaction ID: 
548, in database 1
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  oldest MultiXactId: 1, in database 1
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  commit timestamp Xid oldest/newest: 
0/0
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  transaction ID wrap limit is 
2147484195, limited by database with OID 1
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  MultiXactId wrap limit is 
2147483648, limited by database with OID 1
2017-10-20 09:25:00.077 MDT [33361] DEBUG:  starting up replication slots
2017-10-20 09:25:00.077 MDT [33361] LOG:  database system was not properly shut 
down; automatic recovery in progress
2017-10-20 09:25:00.078 MDT [33361] DEBUG:  resetting unlogged relations: 
cleanup 1 init 0
2017-10-20 09:25:00.125 MDT [33361] LOG:  redo starts at 972/DB0B0838
2017-10-20 10:35:05.773 MDT [33361] LOG:  invalid record length at 
97F/4F7E71D8: wanted 24, got 0
2017-10-20 10:35:05.773 MDT [33361] LOG:  redo done at 97F/4F7E7168
2017-10-20 10:35:05.773 MDT [33361] LOG:  last completed transaction was at log 
time 2017-10-20 09:17:46.905639-06
2017-10-20 10:35:05.774 MDT [33361] DEBUG:  resetting unlogged relations: 
cleanup 0 init 1
2017-10-20 10:35:06.350 MDT [33361] DEBUG:  performing replication slot 
checkpoint
2017-10-20 11:25:10.991 MDT [33361] DEBUG:  MultiXactId wrap limit is 
2147483648, limited by database with OID 1
2017-10-20 11:25:10.991 MDT [33361] DEBUG:  MultiXact member stop limit is now 
4294914944 based on MultiXact 1
2017-10-20 11:25:11.035 MDT [33361] DEBUG:  loaded library "pgcrypto"

PostgreSQL stand-alone backend 10.0
backend> [Ctrl+D]

2017-10-20 11:30:31.268 MDT [33361] NOTICE:  shutting down
2017-10-20 11:30:31.270 MDT [

Re: [GENERAL] How do you decide what aggregates to add?

2017-10-20 Thread Tom Lane
Seamus Abshere  writes:
> Who decides if a seemingly-useful aggregate is added to Postgres?

There's no particularly well-defined process for it, but as far as
these items go:

> 1. I just discovered first()/last() as defined in the wiki [1], where
> it's noted that conversion from Access or Oracle is much easier with
> them.

Those are (a) undefined as to precise behavior, (b) redundant with
the first_value/last_value window functions, which *are* well defined
(not to mention SQL-standard), and (c) easy enough to make in SQL if
you want them despite (a) and (b).  So I don't really see them
getting over the hump.
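
For reference, the definition behind point (c) is only a few lines of SQL; a
sketch along the lines of the wiki page cited as [1]:

    CREATE FUNCTION first_agg(anyelement, anyelement) RETURNS anyelement
        LANGUAGE sql IMMUTABLE STRICT AS 'SELECT $1';

    CREATE AGGREGATE first (anyelement) (
        SFUNC = first_agg,
        STYPE = anyelement
    );

Point (a) is visible here: without an ORDER BY inside the aggregate call,
"first" is simply whichever row the executor happens to feed in first.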

> 2. We use our "homemade" jsonb_object_agg(jsonb) constantly, which is
> modeled off of (built-in) json_object_agg(name, value) and (built-in)
> jsonb_agg(expression). [2]

I dunno, the semantics of using jsonb_concat as an aggregate transfn
seem pretty squishy to me.  It's certainly much less well-defined
as to what you get than for either of the existing aggs you mention.
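
For concreteness, the kind of "homemade" aggregate being discussed is roughly
the following (a sketch, not necessarily the poster's exact definition):

    CREATE AGGREGATE jsonb_object_agg (jsonb) (
        SFUNC = jsonb_concat,
        STYPE = jsonb,
        INITCOND = '{}'
    );

The squishiness: jsonb_concat keeps the second operand's value for any
duplicate key, so the result depends on input order unless the caller pins
that down.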

The short answer really is that we spend a lot of sweat on making Postgres
extensible so that we don't have to put in (and then maintain forever)
every little special-purpose function somebody might want.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How do you decide what aggregates to add?

2017-10-20 Thread Seamus Abshere
hi,

Who decides if a seemingly-useful aggregate is added to Postgres? I
would like to advocate for a couple, but I worry that I'm
misunderstanding some community process that has decided _not_ to add
aggregates or something.

1. I just discovered first()/last() as defined in the wiki [1], where
it's noted that conversion from Access or Oracle is much easier with
them.
2. We use our "homemade" jsonb_object_agg(jsonb) constantly, which is
modeled off of (built-in) json_object_agg(name, value) and (built-in)
jsonb_agg(expression). [2]

Since building these into Postgres (though not fast C versions) is a
matter of a dozen lines of SQL, why haven't they been added already?
Seems like a great thing to brag about in release notes, etc.

Thanks for your thoughts,
Seamus

[1] https://wiki.postgresql.org/wiki/First/last_(aggregate)
[2] http://blog.faraday.io/how-to-aggregate-jsonb-in-postgres/

--
Seamus Abshere, SCEA
https://www.faraday.io
https://github.com/seamusabshere
https://linkedin.com/in/seamusabshere


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general